Object Detection - 2025-04
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2025-04-30 | V3LMA: Visual 3D-enhanced Language Model for Autonomous Driving | Jannik Lübberstedt et.al. | 2505.00156 | translate | read | null |
| 2025-04-30 | LLM-Empowered Embodied Agent for Memory-Augmented Task Planning in Household Robotics | Marc Glocker et.al. | 2504.21716 | translate | read | null |
| 2025-04-30 | Visual Text Processing: A Comprehensive Review and Unified Evaluation | Yan Shu et.al. | 2504.21682 | translate | read | null |
| 2025-04-29 | T2ID-CAS: Diffusion Model and Class Aware Sampling to Mitigate Class Imbalance in Neck Ultrasound Anatomical Landmark Detection | Manikanta Varaganti et.al. | 2504.21231 | translate | read | null |
| 2025-04-29 | FLIM-based Salient Object Detection Networks with Adaptive Decoders | Gilson Junior Soares et.al. | 2504.20872 | translate | read | null |
| 2025-04-29 | A Survey on Event-based Optical Marker Systems | Nafiseh Jabbari Tofighi et.al. | 2504.20736 | translate | read | null |
| 2025-04-29 | Purifying, Labeling, and Utilizing: A High-Quality Pipeline for Small Object Detection | Siwei Wang et.al. | 2504.20602 | translate | read | null |
| 2025-04-29 | Style-Adaptive Detection Transformer for Single-Source Domain Generalized Object Detection | Jianhong Han et.al. | 2504.20498 | translate | read | null |
| 2025-04-28 | More Clear, More Flexible, More Precise: A Comprehensive Oriented Object Detection benchmark for UAV | Kai Ye et.al. | 2504.20032 | translate | read | null |
| 2025-04-28 | Lossy Source Coding with Focal Loss | Alex Dytso et.al. | 2504.19913 | translate | read | null |
| 2025-04-28 | Neural network task specialization via domain constraining | Roman Malashin et.al. | 2504.19592 | translate | read | null |
| 2025-04-28 | GMAR: Gradient-Driven Multi-Head Attention Rollout for Vision Transformer Interpretability | Sehyeong Jo et.al. | 2504.19414 | translate | read | null |
| 2025-04-27 | Improving Small Drone Detection Through Multi-Scale Processing and Data Augmentation | Rayson Laroca et.al. | 2504.19347 | translate | read | null |
| 2025-04-27 | ODExAI: A Comprehensive Object Detection Explainable AI Evaluation | Loc Phuc Truong Nguyen et.al. | 2504.19249 | translate | read | null |
| 2025-04-27 | Boosting Single-domain Generalized Object Detection via Vision-Language Knowledge Interaction | Xiaoran Xu et.al. | 2504.19086 | translate | read | null |
| 2025-04-26 | Federated Learning-based Semantic Segmentation for Lane and Object Detection in Autonomous Driving | Gharbi Khamis Alshammari et.al. | 2504.18939 | translate | read | null |
| 2025-04-25 | Dream-Box: Object-wise Outlier Generation for Out-of-Distribution Detection | Brian K. S. Isaac-Medina et.al. | 2504.18746 | translate | read | null |
| 2025-04-25 | A Review of 3D Object Detection with Vision-Language Models | Ranjan Sapkota et.al. | 2504.18738 | translate | read | null |
| 2025-04-25 | Examining the Impact of Optical Aberrations to Image Classification and Object Detection Models | Patrick Müller et.al. | 2504.18510 | translate | read | null |
| 2025-04-25 | Iterative Event-based Motion Segmentation by Variational Contrast Maximization | Ryo Yamaki et.al. | 2504.18447 | translate | read | null |
| 2025-04-25 | A Multimodal Hybrid Late-Cascade Fusion Network for Enhanced 3D Object Detection | Carlo Sgaravatti et.al. | 2504.18419 | translate | read | null |
| 2025-04-25 | A comprehensive review of classifier probability calibration metrics | Richard Oliver Lane et.al. | 2504.18278 | translate | read | null |
| 2025-04-25 | LiDAR-Guided Monocular 3D Object Detection for Long-Range Railway Monitoring | Raul David Dominguez Sanchez et.al. | 2504.18203 | translate | read | null |
| 2025-04-25 | Multi-Grained Compositional Visual Clue Learning for Image Intent Recognition | Yin Tang et.al. | 2504.18201 | translate | read | null |
| 2025-04-25 | E-InMeMo: Enhanced Prompting for Visual In-Context Learning | Jiahao Zhang et.al. | 2504.18158 | translate | read | null |
| 2025-04-25 | MASF-YOLO: An Improved YOLOv11 Network for Small Object Detection on Drone View | Liugang Lu et.al. | 2504.18136 | translate | read | null |
| 2025-04-25 | Opportunistic Collaborative Planning with Large Vision Model Guided Control and Joint Query-Service Optimization | Jiayi Chen et.al. | 2504.18057 | translate | read | null |
| 2025-04-25 | Direct sampling method to retrieve small objects from two-dimensional limited-aperture scattered field data | Won-Kwang Park et.al. | 2504.18036 | translate | read | null |
| 2025-04-24 | DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks | Yinqi Li et.al. | 2504.17253 | translate | read | link |
| 2025-04-24 | Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation | Phillip Y. Lee et.al. | 2504.17207 | translate | read | null |
| 2025-04-24 | AUTHENTICATION: Identifying Rare Failure Modes in Autonomous Vehicle Perception Systems using Adversarially Guided Diffusion Models | Mohammad Zarei et.al. | 2504.17179 | translate | read | null |
| 2025-04-23 | Scene-Aware Location Modeling for Data Augmentation in Automotive Object Detection | Jens Petersen et.al. | 2504.17076 | translate | read | null |
| 2025-04-23 | Gaussian Splatting is an Effective Data Generator for 3D Object Detection | Farhad G. Zanjani et.al. | 2504.16740 | translate | read | null |
| 2025-04-23 | EHGCN: Hierarchical Euclidean-Hyperbolic Fusion via Motion-Aware GCN for Hybrid Event Stream Perception | Haosheng Chen et.al. | 2504.16616 | translate | read | null |
| 2025-04-23 | Beyond Anonymization: Object Scrubbing for Privacy-Preserving 2D and 3D Vision Tasks | Murat Bilgehan Ertan et.al. | 2504.16557 | translate | read | null |
| 2025-04-23 | Assessing the Feasibility of Internet-Sourced Video for Automatic Cattle Lameness Detection | Md Fahimuzzman Sohan et.al. | 2504.16404 | translate | read | null |
| 2025-04-23 | Revisiting Radar Camera Alignment by Contrastive Learning for 3D Object Detection | Linhua Kong et.al. | 2504.16368 | translate | read | null |
| 2025-04-22 | Vision Controlled Orthotic Hand Exoskeleton | Connor Blais et.al. | 2504.16319 | translate | read | null |
| 2025-04-22 | $π_{0.5}$: a Vision-Language-Action Model with Open-World Generalization | Physical Intelligence et.al. | 2504.16054 | translate | read | null |
| 2025-04-22 | SAGA: Semantic-Aware Gray color Augmentation for Visible-to-Thermal Domain Adaptation across Multi-View Drone and Ground-Based Vision Systems | Manjunath D et.al. | 2504.15728 | translate | read | null |
| 2025-04-22 | You Sense Only Once Beneath: Ultra-Light Real-Time Underwater Object Detection | Jun Dong et.al. | 2504.15694 | translate | read | null |
| 2025-04-22 | A Vision-Enabled Prosthetic Hand for Children with Upper Limb Disabilities | Md Abdul Baset Sarker et.al. | 2504.15654 | translate | read | null |
| 2025-04-21 | Context Aware Grounded Teacher for Source Free Object Detection | Tajamul Ashraf et.al. | 2504.15404 | translate | read | null |
| 2025-04-21 | SuoiAI: Building a Dataset for Aquatic Invertebrates in Vietnam | Tue Vo et.al. | 2504.15252 | translate | read | null |
| 2025-04-21 | An Efficient Aerial Image Detection with Variable Receptive Fields | Liu Wenbin et.al. | 2504.15165 | translate | read | null |
| 2025-04-19 | Balancing Privacy and Action Performance: A Penalty-Driven Approach to Image Anonymization | Nazia Aslam et.al. | 2504.14301 | translate | read | null |
| 2025-04-19 | Visual Consensus Prompting for Co-Salient Object Detection | Jie Wang et.al. | 2504.14254 | translate | read | link |
| 2025-04-18 | Feature Alignment and Representation Transfer in Knowledge Distillation for Large Language Models | Junjie Yang et.al. | 2504.13825 | translate | read | null |
| 2025-04-18 | Lightweight LiDAR-Camera 3D Dynamic Object Detection and Multi-Class Trajectory Prediction | Yushen He et.al. | 2504.13647 | translate | read | link |
| 2025-04-18 | DenSe-AdViT: A novel Vision Transformer for Dense SAR Object Detection | Yang Zhang et.al. | 2504.13638 | translate | read | null |
| 2025-04-18 | HMPE: HeatMap Embedding for Efficient Transformer-Based Small Object Detection | YangChen Zeng et.al. | 2504.13469 | translate | read | null |
| 2025-04-18 | Towards a Multi-Agent Vision-Language System for Zero-Shot Novel Hazardous Object Detection for Autonomous Driving Safety | Shashank Shriram et.al. | 2504.13399 | translate | read | link |
| 2025-04-17 | VLLFL: A Vision-Language Model Based Lightweight Federated Learning Framework for Smart Agriculture | Long Li et.al. | 2504.13365 | translate | read | null |
| 2025-04-17 | SAR Object Detection with Self-Supervised Pretraining and Curriculum-Aware Sampling | Yasin Almalioglu et.al. | 2504.13310 | translate | read | null |
| 2025-04-17 | Weak Cube R-CNN: Weakly Supervised 3D Detection using only 2D Bounding Boxes | Andreas Lau Hansen et.al. | 2504.13297 | translate | read | null |
| 2025-04-17 | RF-DETR Object Detection vs YOLOv12: A Study of Transformer-based and CNN-based Architectures for Single-Class and Multi-Class Greenfruit Detection in Complex Orchard Environments Under Label Ambiguity | Ranjan Sapkota et.al. | 2504.13099 | translate | read | null |
| 2025-04-17 | Self-Supervised Pre-training with Combined Datasets for 3D Perception in Autonomous Driving | Shumin Wang et.al. | 2504.12709 | translate | read | null |
| 2025-04-18 | RoPETR: Improving Temporal Camera-Only 3D Detection by Integrating Enhanced Rotary Position Embedding | Hang Ji et.al. | 2504.12643 | translate | read | null |
| 2025-04-16 | Towards a General-Purpose Zero-Shot Synthetic Low-Light Image and Video Pipeline | Joanne Lin et.al. | 2504.12169 | translate | read | null |
| 2025-04-16 | RADLER: Radar Object Detection Leveraging Semantic 3D City Models and Self-Supervised Radar-Image Learning | Yuan Luo et.al. | 2504.12167 | translate | read | null |
| 2025-04-16 | pix2pockets: Shot Suggestions in 8-Ball Pool from a Single Image in the Wild | Jonas Myhre Schiøtt et.al. | 2504.12045 | translate | read | null |
| 2025-04-16 | A Review of YOLOv12: Attention-Based Enhancements vs. Previous Versions | Rahima Khanam et.al. | 2504.11995 | translate | read | null |
| 2025-04-16 | Multimodal Spatio-temporal Graph Learning for Alignment-free RGBT Video Object Detection | Qishun Wang et.al. | 2504.11779 | translate | read | null |
| 2025-04-15 | Multi-level Cellular Automata for FLIM networks | Felipe Crispim Salvagnini et.al. | 2504.11406 | translate | read | null |
| 2025-04-15 | OpenTuringBench: An Open-Model-based Benchmark and Framework for Machine-Generated Text Detection and Attribution | Lucio La Cava et.al. | 2504.11369 | translate | read | null |
| 2025-04-15 | CFIS-YOLO: A Lightweight Multi-Scale Fusion Network for Edge-Deployable Wood Defect Detection | Jincheng Kang et.al. | 2504.11305 | translate | read | null |
| 2025-04-15 | TSAL: Few-shot Text Segmentation Based on Attribute Learning | Chenming Li et.al. | 2504.11164 | translate | read | null |
| 2025-04-15 | Flyweight FLIM Networks for Salient Object Detection in Biomedical Images | Leonardo M. Joao et.al. | 2504.11112 | translate | read | null |
| 2025-04-15 | S$^2$Teacher: Step-by-step Teacher for Sparsely Annotated Oriented Object Detection | Yu Lin et.al. | 2504.11111 | translate | read | null |
| 2025-04-15 | DRIFT open dataset: A drone-derived intelligence for traffic analysis in urban environmen | Hyejin Lee et.al. | 2504.11019 | translate | read | null |
| 2025-04-16 | GATE3D: Generalized Attention-based Task-synergized Estimation in 3D | Eunsoo Im et.al. | 2504.11014 | translate | read | null |
| 2025-04-15 | CDUPatch: Color-Driven Universal Adversarial Patch Attack for Dual-Modal Visible-Infrared Detectors | Jiahuan Long et.al. | 2504.10888 | translate | read | null |
| 2025-04-15 | Safe-Construct: Redefining Construction Safety Violation Recognition as 3D Multi-View Engagement Task | Aviral Chharia et.al. | 2504.10880 | translate | read | null |
| 2025-04-14 | DiffMOD: Progressive Diffusion Point Denoising for Moving Object Detection in Remote Sensing | Jinyue Zhang et.al. | 2504.10278 | translate | read | null |
| 2025-04-14 | Balancing Stability and Plasticity in Pretrained Detector: A Dual-Path Framework for Incremental Object Detection | Songze Li et.al. | 2504.10214 | translate | read | null |
| 2025-04-14 | WildLive: Near Real-time Visual Wildlife Tracking onboard UAVs | Nguyen Ngoc Dat et.al. | 2504.10165 | translate | read | null |
| 2025-04-14 | COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts | Jiansheng Li et.al. | 2504.10158 | translate | read | null |
| 2025-04-14 | SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting | Dongliang Luo et.al. | 2504.09966 | translate | read | null |
| 2025-04-14 | Small Object Detection with YOLO: A Performance Analysis Across Model Versions and Hardware | Muhammad Fasih Tariq et.al. | 2504.09900 | translate | read | null |
| 2025-04-14 | Density-based Object Detection in Crowded Scenes | Chenyang Zhao et.al. | 2504.09819 | translate | read | null |
| 2025-04-13 | Uncertainty Guided Refinement for Fine-Grained Salient Object Detection | Yao Yuan et.al. | 2504.09666 | translate | read | link |
| 2025-04-13 | Pillar-Voxel Fusion Network for 3D Object Detection in Airborne Hyperspectral Point Clouds | Yanze Jiang et.al. | 2504.09506 | translate | read | null |
| 2025-04-13 | Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation | Yongchao Feng et.al. | 2504.09480 | translate | read | null |
| 2025-04-11 | TinyCenterSpeed: Efficient Center-Based Object Detection for Autonomous Racing | Neil Reichlin et.al. | 2504.08655 | translate | read | null |
| 2025-04-11 | Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization | Jialu Li et.al. | 2504.08641 | translate | read | null |
| 2025-04-10 | Enhanced Cooperative Perception Through Asynchronous Vehicle to Infrastructure Framework with Delay Mitigation for Connected and Automated Vehicles | Nithish Kumar Saravanan et.al. | 2504.08172 | translate | read | null |
| 2025-04-10 | Multi-Task Learning with Multi-Annotation Triplet Loss for Improved Object Detection | Meilun Zhou et.al. | 2504.08054 | translate | read | null |
| 2025-04-10 | Detect Anything 3D in the Wild | Hanxue Zhang et.al. | 2504.07958 | translate | read | null |
| 2025-04-11 | Pychop: Emulating Low-Precision Arithmetic in Numerical Methods and Neural Networks | Erin Carson et.al. | 2504.07835 | translate | read | link |
| 2025-04-10 | P2Object: Single Point Supervised Object Detection and Instance Segmentation | Pengfei Chen et.al. | 2504.07813 | translate | read | null |
| 2025-04-10 | Nonlocal Retinex-Based Variational Model and its Deep Unfolding Twin for Low-Light Image Enhancement | Daniel Torres et.al. | 2504.07810 | translate | read | null |
| 2025-04-10 | Adaptive Detection of Fast Moving Celestial Objects Using a Mixture of Experts and Physical-Inspired Neural Network | Peng Jia et.al. | 2504.07777 | translate | read | null |
| 2025-04-10 | Prediction of Usage Probabilities of Shopping-Mall Corridors Using Heterogeneous Graph Neural Networks | Malik M Barakathullah et.al. | 2504.07645 | translate | read | null |
| 2025-04-10 | VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model | Haozhan Shen et.al. | 2504.07615 | translate | read | link |
| 2025-04-10 | RASMD: RGB And SWIR Multispectral Driving Dataset for Robust Perception in Adverse Conditions | Youngwan Jin et.al. | 2504.07603 | translate | read | null |
| 2025-04-10 | WS-DETR: Robust Water Surface Object Detection through Vision-Radar Fusion with Detection Transformer | Huilin Yin et.al. | 2504.07441 | translate | read | null |
| 2025-04-10 | Model Discrepancy Learning: Synthetic Faces Detection Based on Multi-Reconstruction | Qingchao Jiang et.al. | 2504.07382 | translate | read | link |
| 2025-04-09 | Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection | Ruoyu Chen et.al. | 2504.07060 | translate | read | null |
| 2025-04-09 | UAV Position Estimation using a LiDAR-based 3D Object Detection Method | Uthman Olawoye et.al. | 2504.07028 | translate | read | null |
| 2025-04-09 | Towards Efficient Roadside LiDAR Deployment: A Fast Surrogate Metric Based on Entropy-Guided Visibility | Yuze Jiang et.al. | 2504.06772 | translate | read | null |
| 2025-04-09 | Domain-Conditioned Scene Graphs for State-Grounded Task Planning | Jonas Herzog et.al. | 2504.06661 | translate | read | null |
| 2025-04-09 | Visually Similar Pair Alignment for Robust Cross-Domain Object Detection | Onkar Krishna et.al. | 2504.06607 | translate | read | null |
| 2025-04-08 | From Broadcast to Minimap: Achieving State-of-the-Art SoccerNet Game State Reconstruction | Vladimir Golovkin et.al. | 2504.06357 | translate | read | null |
| 2025-04-08 | Analyzing the Impact of Low-Rank Adaptation for Cross-Domain Few-Shot Object Detection in Aerial Images | Hicham Talaoubrid et.al. | 2504.06330 | translate | read | link |
| 2025-04-08 | Security Analysis of Thumbnail-Preserving Image Encryption and a New Framework | Dong Xie et.al. | 2504.06083 | translate | read | null |
| 2025-04-08 | Balancing long- and short-term dynamics for the modeling of saliency in videos | Theodor Wulff et.al. | 2504.05913 | translate | read | null |
| 2025-04-08 | PRIMEDrive-CoT: A Precognitive Chain-of-Thought Framework for Uncertainty-Aware Object Interaction in Driving Scene Scenario | Sriram Mandalika et.al. | 2504.05908 | translate | read | null |
| 2025-04-08 | Intrinsic Saliency Guided Trunk-Collateral Network for Unsupervised Video Object Segmentation | Xiangyu Zheng et.al. | 2504.05904 | translate | read | null |
| 2025-04-08 | KAN-SAM: Kolmogorov-Arnold Network Guided Segment Anything Model for RGB-T Salient Object Detection | Xingyuan Li et.al. | 2504.05878 | translate | read | null |
| 2025-04-08 | DefMamba: Deformable Visual State Space Model | Leiye Liu et.al. | 2504.05794 | translate | read | null |
| 2025-04-08 | Event-based Civil Infrastructure Visual Defect Detection: ev-CIVIL Dataset and Benchmark | Udayanga G. W. K. N. Gamage et.al. | 2504.05679 | translate | read | null |
| 2025-04-08 | POD: Predictive Object Detection with Single-Frame FMCW LiDAR Point Cloud | Yining Shi et.al. | 2504.05649 | translate | read | null |
| 2025-04-08 | AD-Det: Boosting Object Detection in UAV Images with Focused Small Objects and Balanced Tail Classes | Zhenteng Li et.al. | 2504.05601 | translate | read | null |
| 2025-04-07 | SSLFusion: Scale & Space Aligned Latent Fusion Model for Multimodal 3D Object Detection | Bonan Ding et.al. | 2504.05170 | translate | read | null |
| 2025-04-07 | Inland Waterway Object Detection in Multi-environment: Dataset and Approach | Shanshan Wang et.al. | 2504.04835 | translate | read | null |
| 2025-04-07 | Playing Non-Embedded Card-Based Games with Reinforcement Learning | Tianyang Wu et.al. | 2504.04783 | translate | read | link |
| 2025-04-07 | Feedback-Enhanced Hallucination-Resistant Vision-Language Model for Real-Time Scene Understanding | Zahir Alsulaimawi et.al. | 2504.04772 | translate | read | null |
| 2025-04-07 | Inverse++: Vision-Centric 3D Semantic Occupancy Prediction Assisted with 3D Object Detection | Zhenxing Ming et.al. | 2504.04732 | translate | read | null |
| 2025-04-06 | Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection | Jiancheng Pan et.al. | 2504.04517 | translate | read | link |
| 2025-04-06 | eKalibr-Stereo: Continuous-Time Spatiotemporal Calibration for Event-Based Stereo Visual Systems | Shuolong Chen et.al. | 2504.04451 | translate | read | link |
| 2025-04-05 | Autoregressive High-Order Finite Difference Modulo Imaging: High-Dynamic Range for Computer Vision Applications | Brayan Monroy et.al. | 2504.04228 | translate | read | null |
| 2025-04-05 | An Optimized Density-Based Lane Keeping System for A Cost-Efficient Autonomous Vehicle Platform: AurigaBot V1 | Farbod Younesi et.al. | 2504.04217 | translate | read | null |
| 2025-04-05 | Learning about the Physical World through Analytic Concepts | Jianhua Sun et.al. | 2504.04170 | translate | read | null |
| 2025-04-04 | VISTA-OCR: Towards generative and interactive end to end OCR models | Laziz Hamdi et.al. | 2504.03621 | translate | read | null |
| 2025-04-04 | PF3Det: A Prompted Foundation Feature Assisted Visual LiDAR 3D Detector | Kaidong Li et.al. | 2504.03563 | translate | read | null |
| 2025-04-04 | ZFusion: An Effective Fuser of Camera and 4D Radar for 3D Object Perception in Autonomous Driving | Sheng Yang et.al. | 2504.03438 | translate | read | null |
| 2025-04-04 | Infrared bubble recognition in the Milky Way and beyond using deep learning | Shimpei Nishimoto et.al. | 2504.03367 | translate | read | null |
| 2025-04-04 | Real-Time Roadway Obstacle Detection for Electric Scooters Using Deep Learning and Multi-Sensor Fusion | Zeyang Zheng et.al. | 2504.03171 | translate | read | null |
| 2025-04-04 | Finding the Reflection Point: Unpadding Images to Remove Data Augmentation Artifacts in Large Open Source Image Datasets for Machine Learning | Lucas Choi et.al. | 2504.03168 | translate | read | null |
| 2025-04-03 | Attention-Aware Multi-View Pedestrian Tracking | Reef Alturki et.al. | 2504.03047 | translate | read | null |
| 2025-04-03 | LiDAR-based Object Detection with Real-time Voice Specifications | Anurag Kulkarni et.al. | 2504.02920 | translate | read | null |
| 2025-04-03 | BOP Challenge 2024 on Model-Based and Model-Free 6D Object Pose Estimation | Van Nguyen Nguyen et.al. | 2504.02812 | translate | read | link |
| 2025-04-03 | Rip Current Segmentation: A Novel Benchmark and YOLOv8 Baseline Results | Andrei Dumitriu et.al. | 2504.02558 | translate | read | link |
| 2025-04-03 | Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision | Xiaofeng Han et.al. | 2504.02477 | translate | read | link |
| 2025-04-03 | CornerPoint3D: Look at the Nearest Corner Instead of the Center | Ruixiao Zhang et.al. | 2504.02464 | translate | read | null |
| 2025-04-03 | Hyperspectral Remote Sensing Images Salient Object Detection: The First Benchmark Dataset and Baseline | Peifu Liu et.al. | 2504.02416 | translate | read | null |
| 2025-04-03 | SemiISP/SemiIE: Semi-Supervised Image Signal Processor and Image Enhancement Leveraging One-to-Many Mapping sRGB-to-RAW | Masakazu Yoshimura et.al. | 2504.02345 | translate | read | null |
| 2025-04-03 | Improving Harmful Text Detection with Joint Retrieval and External Knowledge | Zidong Yu et.al. | 2504.02310 | translate | read | null |
| 2025-04-03 | LLM-Guided Evolution: An Autonomous Model Optimization for Object Detection | YiMing Yu et.al. | 2504.02280 | translate | read | null |
| 2025-04-02 | Cat-Eye Inspired Active-Passive-Composite Aperture-Shared Sub-Terahertz Meta-Imager for Non-Interactive Concealed Object Detection | Mingshuang Hu et.al. | 2504.01473 | translate | read | null |
| 2025-04-02 | CFMD: Dynamic Cross-layer Feature Fusion for Salient Object Detection | Jin Lian et.al. | 2504.01326 | translate | read | null |
| 2025-04-01 | Enabling Efficient Processing of Spiking Neural Networks with On-Chip Learning on Commodity Neuromorphic Processors for Edge AI Systems | Rachmad Vidya Wicaksana Putra et.al. | 2504.00957 | translate | read | null |
| 2025-04-01 | NeuRadar: Neural Radiance Fields for Automotive Radar Point Clouds | Mahan Rafidashti et.al. | 2504.00859 | translate | read | null |
| 2025-04-01 | AttentiveGRU: Recurrent Spatio-Temporal Modeling for Advanced Radar-Based BEV Object Detection | Loveneet Saini et.al. | 2504.00559 | translate | read | null |
| 2025-04-01 | Archival Faces: Detection of Faces in Digitized Historical Documents | Marek Vaško et.al. | 2504.00558 | translate | read | null |
| 2025-04-01 | High-Quality Pseudo-Label Generation Based on Visual Prompt Assisted Cloud Model Update | Xinrun Xu et.al. | 2504.00526 | translate | read | null |
| 2025-04-01 | Intrinsic-feature-guided 3D Object Detection | Wanjing Zhang et.al. | 2504.00382 | translate | read | null |
| 2025-04-01 | CamoSAM2: Motion-Appearance Induced Auto-Refining Prompts for Video Camouflaged Object Detection | Xin Zhang et.al. | 2504.00375 | translate | read | null |