Object Detection - 2025-06
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2025-06-30 | Continual Adaptation: Environment-Conditional Parameter Generation for Object Detection in Dynamic Scenarios | Deng Li et.al. | 2506.24063 | translate | read | null |
| 2025-06-30 | Visual Textualization for Image Prompted Object Detection | Yongjian Wu et.al. | 2506.23785 | translate | read | null |
| 2025-06-30 | PBCAT: Patch-based composite adversarial training against physically realizable attacks on object detection | Xiao Li et.al. | 2506.23581 | translate | read | null |
| 2025-06-30 | Event-based Tiny Object Detection: A Benchmark Dataset and Baseline | Nuo Chen et.al. | 2506.23575 | translate | read | null |
| 2025-06-30 | OcRFDet: Object-Centric Radiance Fields for Multi-View 3D Object Detection in Autonomous Driving | Mingqian Ji et.al. | 2506.23565 | translate | read | null |
| 2025-06-30 | From Sight to Insight: Unleashing Eye-Tracking in Weakly Supervised Video Salient Object Detection | Qi Qin et.al. | 2506.23519 | translate | read | null |
| 2025-06-30 | Improve Underwater Object Detection through YOLOv12 Architecture and Physics-informed Augmentation | Tinh Nguyen et.al. | 2506.23505 | translate | read | null |
| 2025-06-29 | Detecting What Matters: A Novel Approach for Out-of-Distribution 3D Object Detection in Autonomous Vehicles | Menna Taha et.al. | 2506.23426 | translate | read | null |
| 2025-06-29 | Layer Decomposition and Morphological Reconstruction for Task-Oriented Infrared Image Enhancement | Siyuan Chai et.al. | 2506.23353 | translate | read | null |
| 2025-06-29 | GeoProg3D: Compositional Visual Reasoning for City-Scale 3D Language Fields | Shunsuke Yasuki et.al. | 2506.23352 | translate | read | null |
| 2025-06-27 | Attention-disentangled Uniform Orthogonal Feature Space Optimization for Few-shot Object Detection | Taijin Zhao et.al. | 2506.22161 | translate | read | null |
| 2025-06-27 | Evaluating Pointing Gestures for Target Selection in Human-Robot Collaboration | Noora Sassali et.al. | 2506.22116 | translate | read | null |
| 2025-06-27 | CERBERUS: Crack Evaluation & Recognition Benchmark for Engineering Reliability & Urban Stability | Justin Reinman et.al. | 2506.21909 | translate | read | null |
| 2025-06-27 | Visual Content Detection in Educational Videos with Transfer Learning and Dataset Enrichment | Dipayan Biswas et.al. | 2506.21903 | translate | read | null |
| 2025-06-27 | Embodied Domain Adaptation for Object Detection | Xiangyu Shi et.al. | 2506.21860 | translate | read | null |
| 2025-06-26 | PhotonSplat: 3D Scene Reconstruction and Colorization from SPAD Sensors | Sai Sri Teja et.al. | 2506.21680 | translate | read | null |
| 2025-06-26 | Towards Reliable Detection of Empty Space: Conditional Marked Point Processes for Object Detection | Tobias J. Riedlinger et.al. | 2506.21486 | translate | read | null |
| 2025-06-26 | TITAN: Query-Token based Domain Adaptive Adversarial Learning | Tajamul Ashraf et.al. | 2506.21484 | translate | read | null |
| 2025-06-26 | A Comprehensive Dataset for Underground Miner Detection in Diverse Scenario | Cyrus Addy et.al. | 2506.21451 | translate | read | null |
| 2025-06-26 | DuET: Dual Incremental Object Detection via Exemplar-Free Task Arithmetic | Munish Monga et.al. | 2506.21260 | translate | read | null |
| 2025-06-26 | LASFNet: A Lightweight Attention-Guided Self-Modulation Feature Fusion Network for Multimodal Object Detection | Lei Hao et.al. | 2506.21018 | translate | read | null |
| 2025-06-26 | ThermalDiffusion: Visual-to-Thermal Image-to-Image Translation for Autonomous Navigation | Shruti Bansal et.al. | 2506.20969 | translate | read | null |
| 2025-06-25 | Lightweight Multi-Frame Integration for Robust YOLO Object Detection in Videos | Yitong Quan et.al. | 2506.20550 | translate | read | null |
| 2025-06-25 | Learning-based safety lifting monitoring system for cranes on construction sites | Hao Chen et.al. | 2506.20475 | translate | read | null |
| 2025-06-25 | Feature Hallucination for Self-supervised Action Recognition | Lei Wang et.al. | 2506.20342 | translate | read | null |
| 2025-06-25 | From Codicology to Code: A Comparative Study of Transformer and YOLO-based Detectors for Layout Analysis in Historical Documents | Sergio Torres Aguilar et.al. | 2506.20326 | translate | read | null |
| 2025-06-25 | TDiR: Transformer based Diffusion for Image Restoration Tasks | Abbas Anwar et.al. | 2506.20302 | translate | read | null |
| 2025-06-25 | Integrated optomechanical ultrasonic sensors with nano-Pascal-level sensitivity | Xuening Cao et.al. | 2506.20219 | translate | read | null |
| 2025-06-24 | A Survey of Multi-sensor Fusion Perception for Embodied AI: Background, Methods, Challenges and Prospects | Shulan Ruan et.al. | 2506.19769 | translate | read | null |
| 2025-06-24 | Semantic Scene Graph for Ultrasound Image Explanation and Scanning Guidance | Xuesong Li et.al. | 2506.19683 | translate | read | null |
| 2025-06-24 | Probabilistic modelling and safety assurance of an agriculture robot providing light-treatment | Mustafa Adam et.al. | 2506.19620 | translate | read | null |
| 2025-06-24 | USIS16K: High-Quality Dataset for Underwater Salient Instance Segmentation | Lin Hong et.al. | 2506.19472 | translate | read | null |
| 2025-06-23 | SpaNN: Detecting Multiple Adversarial Patches on CNNs by Spanning Saliency Thresholds | Mauricio Byrd Victorica et.al. | 2506.18591 | translate | read | null |
| 2025-06-23 | Improvement on LiDAR-Camera Calibration Using Square Targets | Zhongyuan Li et.al. | 2506.18294 | translate | read | null |
| 2025-06-23 | Learning Approach to Efficient Vision-based Active Tracking of a Flying Target by an Unmanned Aerial Vehicle | Jagadeswara PKV Pothuri et.al. | 2506.18264 | translate | read | null |
| 2025-06-23 | Ground tracking for improved landmine detection in a GPR system | Li Tang et.al. | 2506.18258 | translate | read | null |
| 2025-06-24 | Referring Expression Instance Retrieval and A Strong End-to-End Baseline | Xiangzhao Hao et.al. | 2506.18246 | translate | read | null |
| 2025-06-24 | Unfolding the Past: A Comprehensive Deep Learning Approach to Analyzing Incunabula Pages | Klaudia Ropel et.al. | 2506.18069 | translate | read | null |
| 2025-06-21 | YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception | Mengqi Lei et.al. | 2506.17733 | translate | read | link |
| 2025-06-21 | CSDN: A Context-Gated Self-Adaptive Detection Network for Real-Time Object Detection | Wei Haolin et.al. | 2506.17679 | translate | read | null |
| 2025-06-21 | DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving | Mihir Godbole et.al. | 2506.17590 | translate | read | null |
| 2025-06-20 | YASMOT: Yet another stereo image multi-object tracker | Ketil Malde et.al. | 2506.17186 | translate | read | link |
| 2025-06-20 | Class Agnostic Instance-level Descriptor for Visual Instance Search | Qi-Ying Sun et.al. | 2506.16745 | translate | read | null |
| 2025-06-20 | Cross-modal Offset-guided Dynamic Alignment and Fusion for Weakly Aligned UAV Object Detection | Liu Zongzhen et.al. | 2506.16737 | translate | read | null |
| 2025-06-19 | How Hard Is Snow? A Paired Domain Adaptation Dataset for Clear and Snowy Weather: CADC+ | Mei Qi Tang et.al. | 2506.16531 | translate | read | null |
| 2025-06-19 | Can AI Dream of Unseen Galaxies? Conditional Diffusion Model for Galaxy Morphology Augmentation | Chenrui Ma et.al. | 2506.16233 | translate | read | null |
| 2025-06-19 | VideoGAN-based Trajectory Proposal for Automated Vehicles | Annajoyce Mariani et.al. | 2506.16209 | translate | read | null |
| 2025-06-19 | BLADE: An Automated Framework for Classifying Light Curves from the Center for Near-Earth Object Studies (CNEOS) Fireball Database | Elizabeth A. Silber et.al. | 2506.16099 | translate | read | null |
| 2025-06-19 | Polyline Path Masked Attention for Vision Transformer | Zhongchen Zhao et.al. | 2506.15940 | translate | read | link |
| 2025-06-18 | PhantomHunter: Detecting Unseen Privately-Tuned LLM-Generated Text via Family-Aware Learning | Yuhui Shi et.al. | 2506.15683 | translate | read | null |
| 2025-06-18 | BoxFusion: Reconstruction-Free Open-Vocabulary 3D Object Detection via Real-Time Multi-View Box Fusion | Yuqing Lan et.al. | 2506.15610 | translate | read | null |
| 2025-06-18 | Retrospective Memory for Camouflaged Object Detection | Chenxi Zhang et.al. | 2506.15244 | translate | read | null |
| 2025-06-18 | Fiber Signal Denoising Algorithm using Hybrid Deep Learning Networks | Linlin Wang et.al. | 2506.15125 | translate | read | null |
| 2025-06-19 | Efficient Retail Video Annotation: A Robust Key Frame Generation Approach for Product and Customer Interaction Analysis | Varun Mannam et.al. | 2506.14854 | translate | read | null |
| 2025-06-18 | YOLOv11-RGBT: Towards a Comprehensive Single-Stage Multispectral Object Detection Framework | Dahang Wan et.al. | 2506.14696 | translate | read | null |
| 2025-06-17 | VisText-Mosquito: A Multimodal Dataset and Benchmark for AI-Based Mosquito Breeding Site Detection and Reasoning | Md. Adnanul Islam et.al. | 2506.14629 | translate | read | link |
| 2025-06-17 | GAMORA: A Gesture Articulated Meta Operative Robotic Arm for Hazardous Material Handling in Containment-Level Environments | Farha Abdul Wasay et.al. | 2506.14513 | translate | read | null |
| 2025-06-17 | Comparison of Two Methods for Stationary Incident Detection Based on Background Image | Deepak Ghimire et.al. | 2506.14256 | translate | read | null |
| 2025-06-16 | A Point Cloud Completion Approach for the Grasping of Partially Occluded Objects and Its Applications in Robotic Strawberry Harvesting | Ali Abouzeid et.al. | 2506.14066 | translate | read | link |
| 2025-06-16 | FindMeIfYouCan: Bringing Open Set metrics to *near*, *far* and *farther* Out-of-Distribution Object Detection | Daniel Montoya et.al. | 2506.14008 | translate | read | null |
| 2025-06-16 | How Real is CARLA's Dynamic Vision Sensor? A Study on the Sim-to-Real Gap in Traffic Object Detection | Kaiyuan Tan et.al. | 2506.13722 | translate | read | null |
| 2025-06-17 | Lecture Video Visual Objects (LVVO) Dataset: A Benchmark for Visual Object Detection in Educational Videos | Dipayan Biswas et.al. | 2506.13657 | translate | read | link |
| 2025-06-16 | UAV Object Detection and Positioning in a Mining Industrial Metaverse with Custom Geo-Referenced Data | Vasiliki Balaska et.al. | 2506.13505 | translate | read | null |
| 2025-06-16 | Sparse Convolutional Recurrent Learning for Efficient Event-based Neuromorphic Object Detection | Shenqi Wang et.al. | 2506.13440 | translate | read | null |
| 2025-06-16 | Cognitive Synergy Architecture: SEGO for Human-Centric Collaborative Robots | Jaehong Oh et.al. | 2506.13149 | translate | read | null |
| 2025-06-15 | MGDFIS: Multi-scale Global-detail Feature Integration Strategy for Small Object Detection | Yuxiang Wang et.al. | 2506.12697 | translate | read | null |
| 2025-06-14 | UniDet-D: A Unified Dynamic Spectral Attention Model for Object Detection under Adverse Weathers | Yuantao Wang et.al. | 2506.12324 | translate | read | null |
| 2025-06-14 | MatchPlant: An Open-Source Pipeline for UAV-Based Single-Plant Detection and Data Extraction | Worasit Sangjan et.al. | 2506.12295 | translate | read | link |
| 2025-06-13 | Vision-based Lifting of 2D Object Detections for Automated Driving | Hendrik Königshof et.al. | 2506.11839 | translate | read | null |
| 2025-06-13 | Teleoperated Driving: a New Challenge for 3D Object Detection in Compressed Point Clouds | Filippo Bragato et.al. | 2506.11804 | translate | read | null |
| 2025-06-13 | GPLQ: A General, Practical, and Lightning QAT Method for Vision Transformers | Guang Liang et.al. | 2506.11784 | translate | read | null |
| 2025-06-13 | On the Natural Robustness of Vision-Language Models Against Visual Perception Attacks in Autonomous Driving | Pedram MohajerAnsari et.al. | 2506.11472 | translate | read | null |
| 2025-06-12 | Teaching in adverse scenes: a statistically feedback-driven threshold and mask adjustment teacher-student framework for object detection in UAV images under adverse scenes | Hongyu Chen et.al. | 2506.11175 | translate | read | null |
| 2025-06-12 | Discrete Lorenz Attractors in 3D Sinusoidal Maps | Sishu Shankar Muni et.al. | 2506.10788 | translate | read | null |
| 2025-06-12 | Uncertainty-Masked Bernoulli Diffusion for Camouflaged Object Detection Refinement | Yuqi Shen et.al. | 2506.10712 | translate | read | null |
| 2025-06-12 | Semantic-decoupled Spatial Partition Guided Point-supervised Oriented Object Detection | Xinyuan Liu et.al. | 2506.10601 | translate | read | link |
| 2025-06-12 | Improving Medical Visual Representation Learning with Pathological-level Cross-Modal Alignment and Correlation Exploration | Jun Wang et.al. | 2506.10573 | translate | read | null |
| 2025-06-12 | FSATFusion: Frequency-Spatial Attention Transformer for Infrared and Visible Image Fusion | Tianpei Zhang et.al. | 2506.10366 | translate | read | link |
| 2025-06-11 | DySS: Dynamic Queries and State-Space Learning for Efficient 3D Object Detection from Multi-Camera Videos | Rajeev Yasarla et.al. | 2506.10242 | translate | read | null |
| 2025-06-11 | CEM-FBGTinyDet: Context-Enhanced Foreground Balance with Gradient Tuning for tiny Objects | Tao Liu et.al. | 2506.09897 | translate | read | null |
| 2025-06-11 | 3DGeoDet: General-purpose Geometry-aware Image-based 3D Object Detection | Yi Zhang et.al. | 2506.09541 | translate | read | null |
| 2025-06-11 | MSSDF: Modality-Shared Self-supervised Distillation for High-Resolution Multi-modal Remote Sensing Image Learning | Tong Wang et.al. | 2506.09327 | translate | read | null |
| 2025-06-10 | Efficient Edge Deployment of Quantized YOLOv4-Tiny for Aerial Emergency Object Detection on Raspberry Pi 5 | Sindhu Boddu et.al. | 2506.09300 | translate | read | null |
| 2025-06-10 | Lightweight Object Detection Using Quantized YOLOv4-Tiny for Emergency Response in Aerial Imagery | Sindhu Boddu et.al. | 2506.09299 | translate | read | null |
| 2025-06-10 | WD-DETR: Wavelet Denoising-Enhanced Real-Time Object Detection Transformer for Robot Perception with Event Cameras | Yangjie Cui et.al. | 2506.09098 | translate | read | null |
| 2025-06-11 | Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models | Xuanchi Ren et.al. | 2506.09042 | translate | read | link |
| 2025-06-10 | ADAM: Autonomous Discovery and Annotation Model using LLMs for Context-Aware Annotations | Amirreza Rouhi et.al. | 2506.08968 | translate | read | null |
| 2025-06-10 | Data Augmentation For Small Object using Fast AutoAugment | DaeEun Yoon et.al. | 2506.08956 | translate | read | null |
| 2025-06-11 | Gaussian2Scene: 3D Scene Representation Learning via Self-supervised Learning with 3D Gaussian Splatting | Keyi Liu et.al. | 2506.08777 | translate | read | null |
| 2025-06-09 | CrosswalkNet: An Optimized Deep Learning Framework for Pedestrian Crosswalk Detection in Aerial Images with High-Performance Computing | Zubin Bhuyan et.al. | 2506.07885 | translate | read | null |
| 2025-06-09 | SAM2Auto: Auto Annotation Using FLASH | Arash Rocky et.al. | 2506.07850 | translate | read | null |
| 2025-06-09 | Design and Evaluation of Deep Learning-Based Dual-Spectrum Image Fusion Methods | Beining Xu et.al. | 2506.07779 | translate | read | null |
| 2025-06-09 | SpikeSMOKE: Spiking Neural Networks for Monocular 3D Object Detection with Cross-Scale Gated Coding | Xuemei Chen et.al. | 2506.07737 | translate | read | null |
| 2025-06-09 | Domain Randomization for Object Detection in Manufacturing Applications using Synthetic Data: A Comprehensive Study | Xiaomeng Zhu et.al. | 2506.07539 | translate | read | null |
| 2025-06-09 | SpatialLM: Training Large Language Models for Structured Indoor Modeling | Yongsen Mao et.al. | 2506.07491 | translate | read | link |
| 2025-06-09 | Happiness Finder: Exploring the Role of AI in Enhancing Well-Being During Four-Leaf Clover Searches | Anna Yokokubo et.al. | 2506.07393 | translate | read | null |
| 2025-06-09 | Multiple Object Stitching for Unsupervised Representation Learning | Chengchao Shen et.al. | 2506.07364 | translate | read | link |
| 2025-06-09 | CBAM-STN-TPS-YOLO: Enhancing Agricultural Object Detection through Spatially Adaptive Attention Mechanisms | Satvik Praveen et.al. | 2506.07357 | translate | read | null |
| 2025-06-08 | UCOD-DPL: Unsupervised Camouflaged Object Detection via Dynamic Pseudo-label Learning | Weiqi Yan et.al. | 2506.07087 | translate | read | null |
| 2025-06-06 | Domain-RAG: Retrieval-Guided Compositional Image Generation for Cross-Domain Few-Shot Object Detection | Yu Li et.al. | 2506.05872 | translate | read | null |
| 2025-06-06 | Token Transforming: A Unified and Training-Free Token Compression Framework for Vision Transformer Acceleration | Fanhu Zeng et.al. | 2506.05709 | translate | read | null |
| 2025-06-06 | Integer Binary-Range Alignment Neuron for Spiking Neural Networks | Binghao Ye et.al. | 2506.05679 | translate | read | null |
| 2025-06-05 | CL-ISR: A Contrastive Learning and Implicit Stance Reasoning Framework for Misleading Text Detection on Social Media | Tianyi Huang et.al. | 2506.05107 | translate | read | null |
| 2025-06-05 | Synthetic Dataset Generation for Autonomous Mobile Robots Using 3D Gaussian Splatting for Vision Training | Aneesh Deogan et.al. | 2506.05092 | translate | read | null |
| 2025-06-06 | Bridging Annotation Gaps: Transferring Labels to Align Object Detection Datasets | Mikhail Kennerley et.al. | 2506.04737 | translate | read | null |
| 2025-06-05 | Gen-n-Val: Agentic Image Data Generation and Validation | Jing-En Huang et.al. | 2506.04676 | translate | read | null |
| 2025-06-05 | VoxDet: Rethinking 3D Semantic Occupancy Prediction as Dense Object Detection | Wuyang Li et.al. | 2506.04623 | translate | read | null |
| 2025-06-04 | FALO: Fast and Accurate LiDAR 3D Object Detection on Resource-Constrained Devices | Shizhong Han et.al. | 2506.04499 | translate | read | null |
| 2025-06-04 | Neural Object Detection for 4D STEM: High-Throughput Sub-Pixel Electron Diffraction Pattern Recognition | Arda Genc et.al. | 2506.04477 | translate | read | null |
| 2025-06-04 | Diffusion Domain Teacher: Diffusion Guided Domain Adaptive Object Detector | Boyong He et.al. | 2506.04211 | translate | read | link |
| 2025-06-04 | FSHNet: Fully Sparse Hybrid Network for 3D Object Detection | Shuai Liu et.al. | 2506.03714 | translate | read | null |
| 2025-06-04 | How PARTs assemble into wholes: Learning the relative composition of images | Melika Ayoughi et.al. | 2506.03682 | translate | read | null |
| 2025-06-05 | MambaNeXt-YOLO: A Hybrid State Space Model for Real-time Object Detection | Xiaochun Lei et.al. | 2506.03654 | translate | read | null |
| 2025-06-04 | DiagNet: Detecting Objects using Diagonal Constraints on Adjacency Matrix of Graph Neural Network | Chong Hyun Lee et.al. | 2506.03571 | translate | read | null |
| 2025-06-03 | SportMamba: Adaptive Non-Linear Multi-Object Tracking with State Space Models for Team Sports | Dheeraj Khanna et.al. | 2506.03335 | translate | read | null |
| 2025-06-03 | Simulate Any Radar: Attribute-Controllable Radar Simulation via Waveform Parameter Embedding | Weiqing Xiao et.al. | 2506.03134 | translate | read | link |
| 2025-06-03 | HACo-Det: A Study Towards Fine-Grained Machine-Generated Text Detection under Human-AI Coauthoring | Zhixiong Su et.al. | 2506.02959 | translate | read | null |
| 2025-06-03 | Towards Auto-Annotation from Annotation Guidelines: A Benchmark through 3D LiDAR Detection | Yechi Ma et.al. | 2506.02914 | translate | read | null |
| 2025-06-03 | A Dynamic Transformer Network for Vehicle Detection | Chunwei Tian et.al. | 2506.02765 | translate | read | null |
| 2025-06-03 | Open-PMC-18M: A High-Fidelity Large Scale Medical Dataset for Multimodal Representation Learning | Negin Baghbanzadeh et.al. | 2506.02738 | translate | read | null |
| 2025-06-03 | GeneA-SLAM2: Dynamic SLAM with AutoEncoder-Preprocessed Genetic Keypoints Resampling and Depth Variance-Guided Dynamic Region Removal | Shufan Qing et.al. | 2506.02736 | translate | read | link |
| 2025-06-03 | Sight Guide: A Wearable Assistive Perception and Navigation System for the Vision Assistance Race in the Cybathlon 2024 | Patrick Pfreundschuh et.al. | 2506.02676 | translate | read | null |
| 2025-06-03 | Probabilistic Online Event Downsampling | Andreu Girbau-Xalabarder et.al. | 2506.02547 | translate | read | null |
| 2025-06-03 | Efficient Test-time Adaptive Object Detection via Sensitivity-Guided Pruning | Kunyu Wang et.al. | 2506.02462 | translate | read | null |
| 2025-06-03 | Auto-Labeling Data for Object Detection | Brent A. Griffin et.al. | 2506.02359 | translate | read | null |
(<a href=../Object_Detection.md>back to Object Detection</a>)