Object Detection
| Publish Date | Title | Authors | arXiv ID | Code |
|---|---|---|---|---|
| 2025-12-18 | DenseBEV: Transforming BEV Grid Cells into 3D Objects | Marius Dähling et.al. | 2512.16818 | null |
| 2025-12-18 | FlowDet: Unifying Object Detection and Generative Transport Flows | Enis Baty et.al. | 2512.16771 | null |
| 2025-12-18 | YOLO11-4K: An Efficient Architecture for Real-Time Small Object Detection in 4K Panoramic Images | Huma Hafeez et.al. | 2512.16493 | null |
| 2025-12-18 | Autoencoder-based Denoising Defense against Adversarial Attacks on Object Detection | Min Geun Song et.al. | 2512.16123 | null |
| 2025-12-18 | Auto-Vocabulary 3D Object Detection | Haomeng Zhang et.al. | 2512.16077 | null |
| 2025-12-17 | From Words to Wavelengths: VLMs for Few-Shot Multispectral Object Detection | Manuel Nkegoum et.al. | 2512.15971 | null |
| 2025-12-13 | Two-Step Data Augmentation for Masked Face Detection and Recognition: Turning Fake Masks to Real | Yan Yang et.al. | 2512.15774 | null |
| 2025-12-17 | IMKD: Intensity-Aware Multi-Level Knowledge Distillation for Camera-Radar Fusion | Shashank Mishra et.al. | 2512.15581 | null |
| 2025-12-17 | Evaluation of deep learning architectures for wildlife object detection: A comparative study of ResNet and Inception | Malach Obisa Amonga et.al. | 2512.15480 | null |
| 2025-12-17 | Vision-based module for accurately reading linear scales in a laboratory | Parvesh Saini et.al. | 2512.15327 | null |
| 2025-12-17 | EPSM: A Novel Metric to Evaluate the Safety of Environmental Perception in Autonomous Driving | Jörg Gamerdinger et.al. | 2512.15195 | null |
| 2025-12-17 | Criticality Metrics for Relevance Classification in Safety Evaluation of Object Detection in Automated Driving | Jörg Gamerdinger et.al. | 2512.15181 | null |
| 2025-12-17 | Beyond Proximity: A Keypoint-Trajectory Framework for Classifying Affiliative and Agonistic Social Networks in Dairy Cattle | Sibi Parivendan et.al. | 2512.14998 | null |
| 2025-12-16 | TUMTraf EMOT: Event-Based Multi-Object Tracking Dataset and Baseline for Traffic Scenarios | Mengyu Li et.al. | 2512.14595 | null |
| 2025-12-16 | 4D-RaDiff: Latent Diffusion for 4D Radar Point Cloud Generation | Jimmie Kwok et.al. | 2512.14235 | null |
| 2025-12-16 | CIS-BA: Continuous Interaction Space Based Backdoor Attack for Object Detection in the Real-World | Shuxin Zhao et.al. | 2512.14158 | null |
| 2025-12-16 | Neurosymbolic Inference On Foundation Models For Remote Sensing Text-to-image Retrieval With Complex Queries | Emanuele Mezzi et.al. | 2512.14102 | null |
| 2025-12-16 | Deep Learning Perspective of Scene Understanding in Autonomous Robots | Afia Maham et.al. | 2512.14020 | null |
| 2025-12-16 | Real-Time Service Subscription and Adaptive Offloading Control in Vehicular Edge Computing | Chuanchao Gao et.al. | 2512.14002 | null |
| 2025-12-16 | FocalComm: Hard Instance-Aware Multi-Agent Perception | Dereje Shenkut et.al. | 2512.13982 | null |
| 2025-12-15 | Route-DETR: Pairwise Query Routing in Transformers for Object Detection | Ye Zhang et.al. | 2512.13876 | null |
| 2025-12-15 | VajraV1 – The most accurate Real Time Object Detector of the YOLO family | Naman Balbir Singh Makkar et.al. | 2512.13834 | null |
| 2025-12-15 | Near-Field Perception for Safety Enhancement of Autonomous Mobile Robots in Manufacturing Environments | Li-Wei Shih et.al. | 2512.13561 | null |
| 2025-12-15 | On the Ability of Deep Learning to Detect Signals with Unknown Parameters | Tom Anders et.al. | 2512.13542 | null |
| 2025-12-15 | Computer vision training dataset generation for robotic environments using Gaussian splatting | Patryk Niżeniec et.al. | 2512.13411 | null |
| 2025-12-15 | Diffusion-Based Restoration for Multi-Modal 3D Object Detection in Adverse Weather | Zhijian He et.al. | 2512.13107 | null |
| 2025-12-14 | Cross-Level Sensor Fusion with Object Lists via Transformer for 3D Object Detection | Xiangzhong Liu et.al. | 2512.12884 | null |
| 2025-12-13 | INDOOR-LiDAR: Bridging Simulation and Reality for Robot-Centric 360 degree Indoor LiDAR Perception – A Robot-Centric Hybrid Dataset | Haichuan Li et.al. | 2512.12377 | null |
| 2025-12-13 | WeDetect: Fast Open-Vocabulary Object Detection as Retrieval | Shenghao Fu et.al. | 2512.12309 | null |
| 2025-12-13 | Cognitive-YOLO: LLM-Driven Architecture Synthesis from First Principles of Data for Object Detection | Jiahao Zhao et.al. | 2512.12281 | null |
| 2025-12-13 | AI-Augmented Pollen Recognition in Optical and Holographic Microscopy for Veterinary Imaging | Swarn S. Warshaneyan et.al. | 2512.12101 | null |
| 2025-12-12 | TransBridge: Boost 3D Object Detection by Scene-Level Completion with Transformer Decoder | Qinghao Meng et.al. | 2512.11926 | null |
| 2025-12-12 | Depth-Copy-Paste: Multimodal and Depth-Aware Compositing for Robust Face Detection | Qiushi Guo et.al. | 2512.11683 | null |
| 2025-12-12 | DOS: Distilling Observable Softmaps of Zipfian Prototypes for Self-Supervised Point Representation | Mohamed Abdelsamad et.al. | 2512.11465 | null |
| 2025-12-12 | Assisted Refinement Network Based on Channel Information Interaction for Camouflaged and Salient Object Detection | Kuan Wang et.al. | 2512.11369 | null |
| 2025-12-12 | Reliable Detection of Minute Targets in High-Resolution Aerial Imagery across Temporal Shifts | Mohammad Sadegh Gholizadeh et.al. | 2512.11360 | null |
| 2025-12-11 | VGent: Visual Grounding via Modular Design for Disentangling Reasoning and Prediction | Weitai Kang et.al. | 2512.11099 | null |
| 2025-12-11 | Salient Object Detection in Complex Weather Conditions via Noise Indicators | Quan Chen et.al. | 2512.10592 | null |
| 2025-12-11 | Adaptive Dual-Weighted Gravitational Point Cloud Denoising Method | Ge Zhang et.al. | 2512.10386 | null |
| 2025-12-10 | ABBSPO: Adaptive Bounding Box Scaling and Symmetric Prior based Orientation Prediction for Detecting Aerial Image Objects | Woojin Lee et.al. | 2512.10031 | null |
| 2025-12-10 | NordFKB: a fine-grained benchmark dataset for geospatial AI in Norway | Sander Riisøen Jyhne et.al. | 2512.09913 | null |
| 2025-12-10 | Hands-on Evaluation of Visual Transformers for Object Recognition and Detection | Dimitrios N. Vlachogiannis et.al. | 2512.09579 | null |
| 2025-12-10 | MODA: The First Challenging Benchmark for Multispectral Object Detection in Aerial Images | Shuaihao Han et.al. | 2512.09489 | null |
| 2025-12-10 | A Hierarchical, Model-Based System for High-Performance Humanoid Soccer | Quanyou Wang et.al. | 2512.09431 | null |
| 2025-12-10 | Identifying Bias in Machine-generated Text Detection | Kevin Stowe et.al. | 2512.09292 | null |
| 2025-12-10 | ROI-Packing: Efficient Region-Based Compression for Machine Vision | Md Eimran Hossain Eimon et.al. | 2512.09258 | null |
| 2025-12-09 | Automated Pollen Recognition in Optical and Holographic Microscopy Images | Swarn Singh Warshaneyan et.al. | 2512.08589 | null |
| 2025-12-09 | SSCATeR: Sparse Scatter-Based Convolution Algorithm with Temporal Data Recycling for Real-Time 3D Object Detection in LiDAR Point Clouds | Alexander Dow et.al. | 2512.08557 | null |
| 2025-12-09 | Distilling Future Temporal Knowledge with Masked Feature Reconstruction for 3D Object Detection | Haowen Zheng et.al. | 2512.08247 | null |
| 2025-12-09 | SOP^2: Transfer Learning with Scene-Oriented Prompt Pool on 3D Object Detection | Ching-Hung Cheng et.al. | 2512.08223 | null |
| 2025-12-09 | Metasurfaces Enable Active-Like Passive Radar | Mingyi Li et.al. | 2512.08208 | null |
| 2025-11-27 | Semi-Supervised Contrastive Learning with Orthonormal Prototypes | Huanran Li et.al. | 2512.07880 | null |
| 2025-12-08 | An AI-Powered Autonomous Underwater System for Sea Exploration and Scientific Research | Hamad Almazrouei et.al. | 2512.07652 | null |
| 2025-12-08 | Towards Robust DeepFake Detection under Unstable Face Sequences: Adaptive Sparse Graph Embedding with Order-Free Representation and Explicit Laplacian Spectral Prior | Chih-Chung Hsu et.al. | 2512.07498 | null |
| 2025-12-08 | Enhancing Small Object Detection with YOLO: A Novel Framework for Improved Accuracy and Efficiency | Mahila Moghadami et.al. | 2512.07379 | null |
| 2025-12-08 | A graph generation pipeline for critical infrastructures based on heuristics, images and depth data | Mike Diessner et.al. | 2512.07269 | null |
| 2025-12-08 | DART: Leveraging Multi-Agent Disagreement for Tool Recruitment in Multimodal Reasoning | Nithin Sivakumaran et.al. | 2512.07132 | null |
| 2025-12-08 | DFIR-DETR: Frequency Domain Enhancement and Dynamic Feature Aggregation for Cross-Scene Small Object Detection | Bo Gao et.al. | 2512.07078 | null |
| 2025-12-07 | Large Language Models and Forensic Linguistics: Navigating Opportunities and Threats in the Age of Generative AI | George Mikros et.al. | 2512.06922 | null |
| 2025-12-07 | Spatial Retrieval Augmented Autonomous Driving | Xiaosong Jia et.al. | 2512.06865 | null |
| 2025-12-07 | CoT4Det: A Chain-of-Thought Framework for Perception-Oriented Vision-Language Tasks | Yu Qi et.al. | 2512.06663 | null |
| 2025-12-07 | TextMamba: Scene Text Detector with Mamba | Qiyan Zhao et.al. | 2512.06657 | null |
| 2025-12-06 | Neural expressiveness for beyond importance model compression | Angelos-Christos Maroudis et.al. | 2512.06440 | null |
| 2025-12-06 | Are AI-Generated Driving Videos Ready for Autonomous Driving? A Diagnostic Evaluation Framework | Xinhao Xiang et.al. | 2512.06376 | null |
| 2025-12-05 | OWL: Unsupervised 3D Object Detection by Occupancy Guided Warm-up and Large Model Priors Reasoning | Xusheng Guo et.al. | 2512.05698 | null |
| 2025-12-05 | LeAD-M3D: Leveraging Asymmetric Distillation for Real-time Monocular 3D Detection | Johannes Meier et.al. | 2512.05663 | null |
| 2025-12-05 | An Integrated System for WEEE Sorting Employing X-ray Imaging, AI-based Object Detection and Segmentation, and Delta Robot Manipulation | Panagiotis Giannikos et.al. | 2512.05599 | null |
| 2025-12-05 | Concept-based Explainable Data Mining with VLM for 3D Detection | Mai Tsujimoto et.al. | 2512.05482 | null |
| 2025-12-05 | Moving object detection from multi-depth images with an attention-enhanced CNN | Masato Shibukawa et.al. | 2512.05415 | null |
| 2025-12-05 | YOLO and SGBM Integration for Autonomous Tree Branch Detection and Depth Estimation in Radiata Pine Pruning Applications | Yida Lin et.al. | 2512.05412 | null |
| 2025-12-04 | GeoPE: A Unified Geometric Positional Embedding for Structured Tensors | Yupu Yao et.al. | 2512.04963 | null |
| 2025-12-04 | You Only Train Once (YOTO): A Retraining-Free Object Detection Framework | Priyanto Hidayatullah et.al. | 2512.04888 | null |
| 2025-12-04 | DuGI-MAE: Improving Infrared Mask Autoencoders via Dual-Domain Guidance | Yinghui Xing et.al. | 2512.04511 | null |
| 2025-12-04 | Dual-Stream Spectral Decoupling Distillation for Remote Sensing Object Detection | Xiangyi Gao et.al. | 2512.04413 | null |
| 2025-12-03 | Real-time Cricket Sorting By Sex | Juan Manuel Cantarero Angulo et.al. | 2512.04311 | null |
| 2025-12-03 | Fast & Efficient Normalizing Flows and Applications of Image Generative Models | Sandeep Nagar et.al. | 2512.04039 | null |
| 2025-12-03 | MKSNet: Advanced Small Object Detection in Remote Sensing Imagery with Multi-Kernel and Dual Attention Mechanisms | Jiahao Zhang et.al. | 2512.03640 | null |
| 2025-12-03 | Real-Time Control and Automation Framework for Acousto-Holographic Microscopy | Hasan Berkay Abdioğlu et.al. | 2512.03539 | null |
| 2025-12-03 | YOLOA: Real-Time Affordance Detection via LLM Adapter | Yuqi Ji et.al. | 2512.03418 | null |
| 2025-12-02 | GraphFusion3D: Dynamic Graph Attention Convolution with Adaptive Cross-Modal Transformer for 3D Object Detection | Md Sohag Mia et.al. | 2512.02991 | null |
| 2025-12-02 | BEVDilation: LiDAR-Centric Multi-Modal Fusion for 3D Object Detection | Guowen Zhang et.al. | 2512.02972 | null |
| 2025-12-02 | MRD: Multi-resolution Retrieval-Detection Fusion for High-Resolution Image Understanding | Fan Yang et.al. | 2512.02906 | null |
| 2025-12-02 | ALDI-ray: Adapting the ALDI Framework for Security X-ray Object Detection | Omid Reza Heidari et.al. | 2512.02696 | null |
| 2025-12-02 | SAM2Grasp: Resolve Multi-modal Grasping via Prompt-conditioned Temporal Action Prediction | Shengkai Wu et.al. | 2512.02609 | null |
| 2025-12-02 | GeoDiT: A Diffusion-based Vision-Language Model for Geospatial Understanding | Jiaqi Liu et.al. | 2512.02505 | null |
| 2025-12-02 | Temporal Dynamics Enhancer for Directly Trained Spiking Object Detectors | Fan Luo et.al. | 2512.02447 | null |
| 2025-12-01 | Physical ID-Transfer Attacks against Multi-Object Tracking via Adversarial Trajectory | Chenyi Wang et.al. | 2512.01934 | null |
| 2025-12-01 | SAM3-UNet: Simplified Adaptation of Segment Anything Model 3 | Xinyu Xiong et.al. | 2512.01789 | null |
| 2025-12-01 | Bridging the Scale Gap: Balanced Tiny and General Object Detection in Remote Sensing Imagery | Zhicheng Zhao et.al. | 2512.01665 | null |
| 2025-12-01 | ViT$^3$: Unlocking Test-Time Training in Vision | Dongchen Han et.al. | 2512.01643 | null |
| 2025-12-01 | OpenBox: Annotate Any Bounding Boxes in 3D | In-Jae Lee et.al. | 2512.01352 | null |
| 2025-12-01 | FOD-S2R: A FOD Dataset for Sim2Real Transfer Learning based Object Detection | Ashish Vashist et.al. | 2512.01315 | null |
| 2025-12-01 | Supervised Contrastive Machine Unlearning of Background Bias in Sonar Image Classification with Fine-Grained Explainable AI | Kamal Basha S et.al. | 2512.01291 | null |
| 2025-12-01 | VSRD++: Autolabeling for 3D Object Detection via Instance-Aware Volumetric Silhouette Rendering | Zihua Liu et.al. | 2512.01178 | null |
| 2025-12-01 | Real-Time On-the-Go Annotation Framework Using YOLO for Automated Dataset Generation | Mohamed Abdallah Salem et.al. | 2512.01165 | null |
| 2025-11-30 | Autonomous Grasping On Quadruped Robot With Task Level Interaction | Muhtadin et.al. | 2512.01052 | null |
| 2025-11-30 | Med-CMR: A Fine-Grained Benchmark Integrating Visual Evidence and Clinical Logic for Medical Complex Multimodal Reasoning | Haozhen Gong et.al. | 2512.00818 | null |
| 2025-11-30 | DEJIMA: A Novel Large-scale Japanese Dataset for Image Captioning and Visual Question Answering | Toshiki Katsube et.al. | 2512.00773 | null |
| 2025-11-29 | MM-DETR: An Efficient Multimodal Detection Transformer with Mamba-Driven Dual-Granularity Fusion and Frequency-Aware Modality Adapters | Jianhong Han et.al. | 2512.00363 | null |
| 2025-11-28 | Hybrid Synthetic Data Generation with Domain Randomization Enables Zero-Shot Vision-Based Part Inspection Under Extreme Class Imbalance | Ruo-Syuan Mei et.al. | 2512.00125 | null |
| 2025-11-25 | Diffusion-Based Synthetic Brightfield Microscopy Images for Enhanced Single Cell Detection | Mario de Jesus da Graca et.al. | 2512.00078 | null |
| 2025-11-24 | ProvRain: Rain-Adaptive Denoising and Vehicle Detection via MobileNet-UNet and Faster R-CNN | Aswinkumar Varathakumaran et.al. | 2512.00073 | null |
| 2025-11-23 | PEFT-DML: Parameter-Efficient Fine-Tuning Deep Metric Learning for Robust Multi-Modal 3D Object Detection in Autonomous Driving | Abdolazim Rezaei et.al. | 2512.00060 | null |
| 2025-11-28 | Object-Centric Data Synthesis for Category-level Object Detection | Vikhyat Agarwal et.al. | 2511.23450 | null |
| 2025-11-28 | Toward Automatic Safe Driving Instruction: A Large-Scale Vision Language Model Approach | Haruki Sakajo et.al. | 2511.23311 | null |
| 2025-11-28 | Synthetic Industrial Object Detection: GenAI vs. Feature-Based Methods | Jose Moises Araya-Martinez et.al. | 2511.23241 | null |
| 2025-11-28 | Zero-Shot Multi-Criteria Visual Quality Inspection for Semi-Controlled Industrial Environments via Real-Time 3D Digital Twin Simulation | Jose Moises Araya-Martinez et.al. | 2511.23214 | null |
| 2025-11-28 | Bharat Scene Text: A Novel Comprehensive Dataset and Benchmark for Indian Language Scene Text Understanding | Anik De et.al. | 2511.23071 | null |
| 2025-11-28 | Barcode and QR Code Object Detection: An Experimental Study on YOLOv8 Models | Kushagra Pandya et.al. | 2511.22937 | null |
| 2025-11-28 | DM$^3$T: Harmonizing Modalities via Diffusion for Multi-Object Tracking | Weiran Li et.al. | 2511.22896 | null |
| 2025-11-27 | DocVAL: Validated Chain-of-Thought Distillation for Grounded Document VQA | Ahmad Mohammadshirazi et.al. | 2511.22521 | null |
| 2025-11-27 | Small Object Detection for Birds with Swin Transformer | Da Huo et.al. | 2511.22310 | null |
| 2025-11-27 | Simplex-Optimized Hybrid Ensemble for Large Language Model Text Detection Under Generative Distribution Drift | Sepyan Purnama Kristanto et.al. | 2511.22153 | null |
| 2025-11-27 | Bistatic Passive Tracking via CSI Power | Zhongqin Wang et.al. | 2511.22144 | null |
| 2025-11-27 | SemOD: Semantic Enabled Object Detection Network under Various Weather Conditions | Aiyinsi Zuo et.al. | 2511.22142 | null |
| 2025-11-27 | PAGen: Phase-guided Amplitude Generation for Domain-adaptive Object Detection | Shuchen Du et.al. | 2511.22029 | null |
| 2025-11-22 | A Lightweight Approach to Detection of AI-Generated Texts Using Stylometric Features | Sergey K. Aityan et.al. | 2511.21744 | null |
| 2025-11-26 | Continual Error Correction on Low-Resource Devices | Kirill Paramonov et.al. | 2511.21652 | null |
| 2025-11-26 | CanKD: Cross-Attention-based Non-local operation for Feature-based Knowledge Distillation | Shizhe Sun et.al. | 2511.21503 | null |
| 2025-11-26 | Co-Training Vision Language Models for Remote Sensing Multi-task Learning | Qingyun Li et.al. | 2511.21272 | link |
| 2025-11-26 | OVOD-Agent: A Markov-Bandit Framework for Proactive Visual Reasoning and Self-Evolving Detection | Chujie Wang et.al. | 2511.21064 | null |
| 2025-11-26 | AerialMind: Towards Referring Multi-Object Tracking in UAV Scenarios | Chenglizhao Chen et.al. | 2511.21053 | null |
| 2025-11-26 | Wavefront-Constrained Passive Obscured Object Detection | Zhiwen Zheng et.al. | 2511.20991 | null |
| 2025-11-26 | RefOnce: Distilling References into a Prototype Memory for Referring Camouflaged Object Detection | Yu-Huan Wu et.al. | 2511.20989 | null |
| 2025-11-25 | Video Object Recognition in Mobile Edge Networks: Local Tracking or Edge Detection? | Kun Guo et.al. | 2511.20716 | null |
| 2025-11-25 | MedROV: Towards Real-Time Open-Vocabulary Detection Across Diverse Medical Imaging Modalities | Tooba Tehreem Sheikh et.al. | 2511.20650 | null |
| 2025-11-25 | Zoo3D: Zero-Shot 3D Object Detection at Scene Level | Andrey Lemeshko et.al. | 2511.20253 | null |
| 2025-11-25 | Intelligent Image Search Algorithms Fusing Visual Large Models | Kehan Wang et.al. | 2511.19920 | null |
| 2025-11-24 | Maritime Small Object Detection from UAVs using Deep Learning with Altitude-Aware Dynamic Tiling | Sakib Ahmed et.al. | 2511.19728 | null |
| 2025-11-24 | Studying Maps at Scale: A Digital Investigation of Cartography and the Evolution of Figuration | Remi Petitpierre et.al. | 2511.19538 | null |
| 2025-11-24 | SAM3-Adapter: Efficient Adaptation of Segment Anything 3 for Camouflage Object Segmentation, Shadow Detection, and Medical Image Segmentation | Tianrun Chen et.al. | 2511.19425 | null |
| 2025-11-24 | IDEAL-M3D: Instance Diversity-Enriched Active Learning for Monocular 3D Detection | Johannes Meier et.al. | 2511.19301 | null |
| 2025-11-24 | SpectraNet: FFT-assisted Deep Learning Classifier for Deepfake Face Detection | Nithira Jayarathne et.al. | 2511.19187 | null |
| 2025-11-24 | MambaRefine-YOLO: A Dual-Modality Small Object Detector for UAV Imagery | Shuyu Cao et.al. | 2511.19134 | null |
| 2025-11-24 | 3M-TI: High-Quality Mobile Thermal Imaging via Calibration-free Multi-Camera Cross-Modal Diffusion | Minchong Chen et.al. | 2511.19117 | null |
| 2025-11-24 | LLMAID: Identifying AI Capabilities in Android Apps with LLMs | Pei Liu et.al. | 2511.19059 | null |
| 2025-11-24 | LAA3D: A Benchmark of Detecting and Tracking Low-Altitude Aircraft in 3D Space | Hai Wu et.al. | 2511.19057 | null |
| 2025-11-24 | Enhancing Fast Radio Transient Detection with Mask R-CNN Image Segmentation | Sergio Belmonte Diaz et.al. | 2511.19014 | null |
| 2025-11-24 | Peregrine: One-Shot Fine-Tuning for FHE Inference of General Deep CNNs | Huaming Ling et.al. | 2511.18976 | null |
| 2025-11-24 | DualGazeNet: A Biologically Inspired Dual-Gaze Query Network for Salient Object Detection | Yu Zhang et.al. | 2511.18865 | null |
| 2025-11-24 | DetAny4D: Detect Anything 4D Temporally in a Streaming RGB Video | Jiawei Hou et.al. | 2511.18814 | null |
| 2025-11-24 | StereoDETR: Stereo-based Transformer for 3D Object Detection | Shiyi Mu et.al. | 2511.18788 | null |
| 2025-11-24 | DriveFlow: Rectified Flow Adaptation for Robust 3D Object Detection in Autonomous Driving | Hongbin Lin et.al. | 2511.18713 | null |
| 2025-11-24 | Dendritic Convolution for Noise Image Recognition | Jiarui Xue et.al. | 2511.18699 | null |
| 2025-11-24 | Multimodal Real-Time Anomaly Detection and Industrial Applications | Aman Verma et.al. | 2511.18698 | null |
| 2025-11-24 | Exploring Surround-View Fisheye Camera 3D Object Detection | Changcai Li et.al. | 2511.18695 | null |
| 2025-11-23 | UniFlow: Towards Zero-Shot LiDAR Scene Flow for Autonomous Vehicles via Cross-Domain Generalization | Siyi Li et.al. | 2511.18254 | null |
| 2025-11-22 | VK-Det: Visual Knowledge Guided Prototype Learning for Open-Vocabulary Aerial Object Detection | Jianhang Yao et.al. | 2511.18075 | null |
| 2025-11-22 | Diverse Instance Generation via Diffusion Models for Enhanced Few-Shot Object Detection in Remote Sensing Images | Yanxing Liu et.al. | 2511.18031 | null |
| 2025-11-22 | State and Scene Enhanced Prototypes for Weakly Supervised Open-Vocabulary Object Detection | Jiaying Zhou et.al. | 2511.18012 | null |
| 2025-11-21 | REXO: Indoor Multi-View Radar Object Detection via 3D Bounding Box Diffusion | Ryoma Yataka et.al. | 2511.17806 | null |
| 2025-11-21 | PUCP-Metrix: An Open-source and Comprehensive Toolkit for Linguistic Analysis of Spanish Texts | Javier Alonso Villegas Luis et.al. | 2511.17402 | null |
| 2025-11-04 | In-Context Adaptation of VLMs for Few-Shot Cell Detection in Optical Microscopy | Shreyan Ganguly et.al. | 2511.05565 | null |
| 2025-11-03 | Compressing Multi-Task Model for Autonomous Driving via Pruning and Knowledge Distillation | Jiayuan Wang et.al. | 2511.05557 | null |
| 2025-11-06 | NovisVQ: A Streaming Convolutional Neural Network for No-Reference Opinion-Unaware Frame Quality Assessment | Kylie Cancilla et.al. | 2511.04628 | null |
| 2025-11-06 | Evaluating the Impact of Weather-Induced Sensor Occlusion on BEVFusion for 3D Object Detection | Sanjay Kumar et.al. | 2511.04347 | null |
| 2025-11-06 | Comparative Study of CNN Architectures for Binary Classification of Horses and Motorcycles in the VOC 2008 Dataset | Muhammad Annas Shaikh et.al. | 2511.04344 | null |
| 2025-11-06 | Deep learning-based object detection of offshore platforms on Sentinel-1 Imagery and the impact of synthetic training data | Robin Spanier et.al. | 2511.04304 | null |
| 2025-11-06 | DMSORT: An efficient parallel maritime multi-object tracking architecture for unmanned vessel platforms | Shengyu Tang et.al. | 2511.04128 | null |
| 2025-11-05 | Desert Waste Detection and Classification Using Data-Based and Model-Based Enhanced YOLOv12 DL Model | Abdulmumin Sa’ad et.al. | 2511.03888 | null |
| 2025-11-05 | ISC-Perception: A Hybrid Computer Vision Dataset for Object Detection in Novel Steel Assembly | Miftahur Rahman et.al. | 2511.03098 | null |
| 2025-11-05 | A Computer Vision Based Proxy for Political Polarization in Religious Countries: A Turkiye Case Study | Liangze Ke et.al. | 2511.03088 | null |
| 2025-11-04 | Diffusion Models are Robust Pretrainers | Mika Yagoda et.al. | 2511.02793 | null |
| 2025-11-04 | DetectiumFire: A Comprehensive Multi-modal Dataset Bridging Vision and Language for Fire Understanding | Zixuan Liu et.al. | 2511.02495 | null |
| 2025-11-04 | Object Detection as an Optional Basis: A Graph Matching Network for Cross-View UAV Localization | Tao Liu et.al. | 2511.02489 | null |
| 2025-11-04 | Facial Expression Recognition System Using DNN Accelerator with Multi-threading on FPGA | Takuto Ando et.al. | 2511.02408 | null |
| 2025-11-04 | 3D Point Cloud Object Detection on Edge Devices for Split Computing | Taisuke Noguchi et.al. | 2511.02293 | null |
| 2025-11-04 | Autobiasing Event Cameras for Flickering Mitigation | Mehdi Sefidgar Dilmaghani et.al. | 2511.02180 | null |
| 2025-11-03 | UniLION: Towards Unified Autonomous Driving Model with Linear Group RNNs | Zhe Liu et.al. | 2511.01768 | null |
| 2025-11-03 | CGF-DETR: Cross-Gated Fusion DETR for Enhanced Pneumonia Detection in Chest X-rays | Yefeng Wu et.al. | 2511.01730 | null |
| 2025-11-03 | Contrast-Guided Cross-Modal Distillation for Thermal Object Detection | SiWoo Kim et.al. | 2511.01435 | null |
| 2025-11-03 | Eyes on Target: Gaze-Aware Object Detection in Egocentric Video | Vishakha Lall et.al. | 2511.01237 | null |
| 2025-11-03 | DEER: Disentangled Mixture of Experts with Instance-Adaptive Routing for Generalizable Machine-Generated Text Detection | Guoxin Ma et.al. | 2511.01192 | null |
| 2025-11-02 | Advancing Machine-Generated Text Detection from an Easy to Hard Supervision Perspective | Chenwang Wu et.al. | 2511.00988 | null |
| 2025-11-02 | A Hybrid YOLOv5-SSD IoT-Based Animal Detection System for Durian Plantation Protection | Anis Suttan Shahrir et.al. | 2511.00777 | null |
| 2025-10-28 | Which LiDAR scanning pattern is better for roadside perception: Repetitive or Non-repetitive? | Zhiqi Qi et.al. | 2511.00060 | null |
| 2025-10-31 | Gaussian Combined Distance: A Generic Metric for Object Detection | Ziqian Guan et.al. | 2510.27649 | null |
| 2025-10-31 | Parameterized Prompt for Incremental Object Detection | Zijia An et.al. | 2510.27316 | null |
| 2025-10-31 | C-LEAD: Contrastive Learning for Enhanced Adversarial Defense | Suklav Ghosh et.al. | 2510.27249 | null |
| 2025-10-31 | M^3Detection: Multi-Frame Multi-Level Feature Fusion for Multi-Modal 3D Object Detection with Camera and 4D Imaging Radar | Xiaozhi Li et.al. | 2510.27166 | null |
| 2025-10-31 | Generating Accurate and Detailed Captions for High-Resolution Images | Hankyeol Lee et.al. | 2510.27164 | null |
| 2025-10-31 | MLPerf Automotive | Radoyeh Shojaei et.al. | 2510.27065 | null |
| 2025-10-30 | Using Salient Object Detection to Identify Manipulative Cookie Banners that Circumvent GDPR | Riley Grossman et.al. | 2510.26967 | null |
| 2025-10-30 | Improving Classification of Occluded Objects through Scene Context | Courtney M. King et.al. | 2510.26681 | null |
| 2025-10-30 | All You Need for Object Detection: From Pixels, Points, and Prompts to Next-Gen Fusion and Multimodal LLMs/VLMs in Autonomous Vehicles | Sayed Pedram Haeri Boroujeni et.al. | 2510.26641 | null |
| 2025-10-30 | PT-DETR: Small Target Detection Based on Partially-Aware Detail Focus | Bingcong Huo et.al. | 2510.26630 | null |
| 2025-10-30 | Spiking Patches: Asynchronous, Sparse, and Efficient Tokens for Event Cameras | Christoffer Koo Øhrstrøm et.al. | 2510.26614 | null |
| 2025-10-30 | Detecting Unauthorized Vehicles using Deep Learning for Smart Cities: A Case Study on Bangladesh | Sudipto Das Sukanto et.al. | 2510.26154 | null |
| 2025-10-29 | Enhancing Underwater Object Detection through Spatio-Temporal Analysis and Spatial Attention Networks | Sai Likhith Karri et.al. | 2510.25797 | null |
| 2025-10-29 | Prototype-Driven Adaptation for Few-Shot Object Detection | Yushen Huang et.al. | 2510.25318 | null |
| 2025-10-29 | GaTector+: A Unified Head-free Framework for Gaze Object and Gaze Following Prediction | Yang Jin et.al. | 2510.25301 | null |
| 2025-10-29 | RT-DETRv4: Painlessly Furthering Real-Time Object Detection with Vision Foundation Models | Zijun Liao et.al. | 2510.25257 | null |
| 2025-10-29 | Test-Time Adaptive Object Detection with Foundation Model | Yingjie Gao et.al. | 2510.25175 | null |
| 2025-10-29 | DINO-YOLO: Self-Supervised Pre-training for Data-Efficient Object Detection in Civil Engineering Applications | Malaisree P et.al. | 2510.25140 | null |
| 2025-10-28 | Pixels to Signals: A Real-Time Framework for Traffic Demand Estimation | H Mhatre et.al. | 2510.24902 | null |
| 2025-10-28 | MIC-BEV: Multi-Infrastructure Camera Bird’s-Eye-View Transformer with Relation-Aware Fusion for 3D Object Detection | Yun Zhang et.al. | 2510.24688 | null |
| 2025-10-28 | A Critical Study towards the Detection of Parkinsons Disease using ML Technologies | Vivek Chetia et.al. | 2510.24456 | null |
| 2025-10-28 | Delving into Cascaded Instability: A Lipschitz Continuity View on Image Restoration and Object Detection Synergy | Qing Zhao et.al. | 2510.24232 | null |
| 2025-10-28 | Mars-Bench: A Benchmark for Evaluating Foundation Models for Mars Science Tasks | Mirali Purohit et.al. | 2510.24010 | null |
| 2025-10-27 | A U-Net and Transformer Pipeline for Multilingual Image Translation | Siddharth Sahay et.al. | 2510.23554 | null |
| 2025-10-27 | FRBNet: Revisiting Low-Light Vision through Frequency-Domain Radial Basis Network | Fangtong Sun et.al. | 2510.23444 | null |
| 2025-10-27 | One-Timestep is Enough: Achieving High-performance ANN-to-SNN Conversion via Scale-and-Fire Neurons | Qiuyang Chen et.al. | 2510.23383 | null |
| 2025-10-27 | Spoofing resilience for simple-detection quantum illumination LIDAR | Richard J. Murchie et.al. | 2510.23228 | null |
| 2025-10-27 | AG-Fusion: adaptive gated multimodal fusion for 3d object detection in complex scenes | Sixian Liu et.al. | 2510.23151 | null |
| 2025-10-27 | DQ3D: Depth-guided Query for Transformer-Based 3D Object Detection in Traffic Scenarios | Ziyu Wang et.al. | 2510.23144 | null |
| 2025-10-27 | M$^{3}$T2IBench: A Large-Scale Multi-Category, Multi-Instance, Multi-Relation Text-to-Image Benchmark | Huixuan Zhang et.al. | 2510.23020 | null |
| 2025-10-26 | A Comprehensive Dataset for Human vs. AI Generated Text Detection | Rajarshi Roy et.al. | 2510.22874 | null |
| 2025-10-26 | A Critical Study on Tea Leaf Disease Detection using Deep Learning Techniques | Nabajyoti Borah et.al. | 2510.22647 | null |
| 2025-10-25 | 3D Roadway Scene Object Detection with LIDARs in Snowfall Conditions | Ghazal Farhani et.al. | 2510.22436 | null |
| 2025-10-25 | TrajGATFormer: A Graph-Based Transformer Approach for Worker and Obstacle Trajectory Prediction in Off-site Construction Environments | Mohammed Alduais et.al. | 2510.22205 | null |
| 2025-10-21 | Comparative Analysis of Object Detection Algorithms for Surface Defect Detection | Arpan Maity et.al. | 2510.21811 | null |
| 2025-10-24 | On Thin Ice: Towards Explainable Conservation Monitoring via Attribution and Perturbations | Jiayi Zhou et.al. | 2510.21689 | null |
| 2025-10-24 | S3OD: Towards Generalizable Salient Object Detection with Synthetic Data | Orest Kupyn et.al. | 2510.21605 | null |
| 2025-10-24 | Scalpel: Automotive Deep Learning Framework Testing via Assembling Model Components | Yinglong Zou et.al. | 2510.21451 | null |
| 2025-10-24 | Unveiling the Spatial-temporal Effective Receptive Fields of Spiking Neural Networks | Jieyuan Zhang et.al. | 2510.21403 | null |
| 2025-10-24 | WhaleVAD-BPN: Improving Baleen Whale Call Detection with Boundary Proposal Networks and Post-processing Optimisation | Christiaan M. Geldenhuys et.al. | 2510.21280 | null |
| 2025-10-23 | BioDet: Boosting Industrial Object Detection with Image Preprocessing Strategies | Jiaqi Hu et.al. | 2510.21000 | null |
| 2025-10-23 | BUSTED at AraGenEval Shared Task: A Comparative Study of Transformer-Based Models for Arabic AI-Generated Text Detection | Ali Zain et.al. | 2510.20610 | null |
| 2025-10-23 | Synthetic Data for Robust Runway Detection | Estelle Chigot et.al. | 2510.20349 | null |
| 2025-10-23 | Physics-Guided Fusion for Robust 3D Tracking of Fast Moving Small Objects | Prithvi Raj Singh et.al. | 2510.20126 | null |
| 2025-10-22 | A Unified Detection Pipeline for Robust Object Detection in Fisheye-Based Traffic Surveillance | Neema Jakisa Owor et.al. | 2510.20016 | null |
| 2025-10-22 | Can You Trust What You See? Alpha Channel No-Box Attacks on Video Object Detection | Ariana Yi et.al. | 2510.19574 | null |
| 2025-10-22 | Machine Text Detectors are Membership Inference Attacks | Ryuto Koike et.al. | 2510.19492 | link |
| 2025-10-22 | Towards Single-Source Domain Generalized Object Detection via Causal Visual Prompts | Chen Li et.al. | 2510.19487 | null |
| 2025-10-22 | Space Object Detection using Multi-frame Temporal Trajectory Completion Method | Xiaoqing Lan et.al. | 2510.19220 | null |
| 2025-10-22 | SFGFusion: Surface Fitting Guided 3D Object Detection with 4D Radar and Camera Fusion | Xiaozhi Li et.al. | 2510.19215 | null |
| 2025-10-21 | Kinematic Analysis and Integration of Vision Algorithms for a Mobile Manipulator Employed Inside a Self-Driving Laboratory | Shifa Sulaiman et.al. | 2510.19081 | null |
| 2025-10-21 | GBlobs: Local LiDAR Geometry for Improved Sensor Placement Generalization | Dušan Malić et.al. | 2510.18539 | null |
| 2025-10-21 | DWaste: Greener AI for Waste Sorting using Mobile and Edge Devices | Suman Kunwar et.al. | 2510.18513 | null |
| 2025-10-21 | Beyond Single Images: Retrieval Self-Augmented Unsupervised Camouflaged Object Detection | Ji Du et.al. | 2510.18437 | null |
| 2025-10-21 | ScaleNet: Scaling up Pretrained Neural Networks with Incremental Parameters | Zhiwei Hao et.al. | 2510.18431 | null |
| 2025-10-21 | Beyond Frequency: Scoring-Driven Debiasing for Object Detection via Blueprint-Prompted Image Synthesis | Xinhao Cai et.al. | 2510.18229 | null |
| 2025-10-20 | Accelerating Vision Transformers with Adaptive Patch Sizes | Rohan Choudhury et.al. | 2510.18091 | link |
| 2025-10-20 | Big Data, Tiny Targets: An Exploratory Study in Machine Learning-enhanced Detection of Microplastic from Filters | Paul-Tiberiu Miclea et.al. | 2510.18089 | null |
| 2025-10-15 | MUSE: Model-based Uncertainty-aware Similarity Estimation for zero-shot 2D Object Detection and Segmentation | Sungmin Cho et.al. | 2510.17866 | null |
| 2025-10-20 | Towards 3D Objectness Learning in an Open World | Taichi Liu et.al. | 2510.17686 | null |
| 2025-10-20 | On-the-Fly OVD Adaptation with FLAME: Few-shot Localization via Active Marginal-Samples Exploration | Yehonathan Refael et.al. | 2510.17670 | null |
| 2025-10-20 | DETree: DEtecting Human-AI Collaborative Texts via Tree-Structured Hierarchical Representation Learning | Yongxin He et.al. | 2510.17489 | link |
| 2025-10-20 | Split-Fuse-Transport: Annotation-Free Saliency via Dual Clustering and Optimal Transport Alignment | Muhammad Umer Ramzan et.al. | 2510.17484 | null |
| 2025-10-20 | Monitoring Horses in Stalls: From Object to Event Detection | Dmitrii Galimzianov et.al. | 2510.17409 | null |
| 2025-10-20 | Machine Vision-Based Surgical Lighting System: Design and Implementation | Amir Gharghabi et.al. | 2510.17287 | null |
| 2025-10-20 | Investigating Adversarial Robustness against Preprocessing used in Blackbox Face Recognition | Roland Croft et.al. | 2510.17169 | null |
| 2025-10-20 | Towards a Generalizable Fusion Architecture for Multimodal Object Detection | Jad Berjawi et.al. | 2510.17078 | null |
| 2025-10-19 | ArmFormer: Lightweight Transformer Architecture for Real-Time Multi-Class Weapon Segmentation and Classification | Akhila Kambhatla et.al. | 2510.16854 | null |
| 2025-10-18 | Towards Intelligent Traffic Signaling in Dhaka City Based on Vehicle Detection and Congestion Optimization | Kazi Ababil Azam et.al. | 2510.16622 | null |
| 2025-10-18 | AI-Generated Text Detection in Low-Resource Languages: A Case Study on Urdu | Muhammad Ammar et.al. | 2510.16573 | null |
| 2025-10-18 | ReviewGuard: Enhancing Deficient Peer Review Detection via LLM-Driven Data Augmentation | Haoxuan Zhang et.al. | 2510.16549 | null |
| 2025-10-18 | OOS-DSD: Improving Out-of-stock Detection in Retail Images using Auxiliary Tasks | Franko Šikić et.al. | 2510.16508 | null |
| 2025-10-18 | Enhancing Rotated Object Detection via Anisotropic Gaussian Bounding Box and Bhattacharyya Distance | Chien Thai et.al. | 2510.16445 | null |
| 2025-10-17 | Seeing Through the Brain: New Insights from Decoding Visual Stimuli with fMRI | Zheng Huang et.al. | 2510.16196 | null |
| 2025-10-17 | ObjectTransforms for Uncertainty Quantification and Reduction in Vision-Based Perception for Autonomous Vehicles | Nishad Sahu et.al. | 2510.16118 | null |
| 2025-10-17 | StripRFNet: A Strip Receptive Field and Shape-Aware Network for Road Damage Detection | Jianhan Lin et.al. | 2510.16115 | null |
| 2025-10-17 | LightsOut: Diffusion-based Outpainting for Enhanced Lens Flare Removal | Shr-Ruei Tsai et.al. | 2510.15868 | link |
| 2025-10-17 | ReCon: Region-Controllable Data Augmentation with Rectification and Alignment for Object Detection | Haowei Zhu et.al. | 2510.15783 | null |
| 2025-10-17 | Valeo Near-Field: a novel dataset for pedestrian intent detection | Antonyo Musabini et.al. | 2510.15673 | null |
| 2025-10-17 | FreqPDE: Rethinking Positional Depth Embedding for Multi-View 3D Object Detection Transformers | Haisheng Su et.al. | 2510.15385 | null |
| 2025-10-17 | Symmetric Entropy-Constrained Video Coding for Machines | Yuxiao Sun et.al. | 2510.15347 | null |
| 2025-10-16 | MOBIUS: Big-to-Mobile Universal Instance Segmentation via Multi-modal Bottleneck Fusion and Calibrated Decoder Pruning | Mattia Segu et.al. | 2510.15026 | null |
| 2025-10-16 | EdgeNavMamba: Mamba Optimized Object Detection for Energy Efficient Edge Devices | Romina Aalishah et.al. | 2510.14946 | null |
| 2025-10-16 | VLA^2: Empowering Vision-Language-Action Models with an Agentic Framework for Unseen Concept Manipulation | Han Zhao et.al. | 2510.14902 | link |
| 2025-10-16 | CoT-PL: Visual Chain-of-Thought Reasoning Meets Pseudo-Labeling for Open-Vocabulary Object Detection | Hojun Choi et.al. | 2510.14792 | null |
| 2025-10-16 | Cross-Layer Feature Self-Attention Module for Multi-Scale Object Detection | Dingzhou Xie et.al. | 2510.14726 | null |
| 2025-10-16 | Structured Universal Adversarial Attacks on Object Detection for Video Sequences | Sven Jacob et.al. | 2510.14460 | null |
| 2025-10-16 | Beat Tracking as Object Detection | Jaehoon Ahn et.al. | 2510.14391 | null |
| 2025-10-15 | How Sampling Affects the Detectability of Machine-written texts: A Comprehensive Study | Matthieu Dubois et.al. | 2510.13681 | null |
| 2025-10-15 | A Modular Object Detection System for Humanoid Robots Using YOLO | Nicolas Pottier et.al. | 2510.13625 | null |
| 2025-10-15 | Fusion Meets Diverse Conditions: A High-diversity Benchmark and Baseline for UAV-based Multimodal Object Detection with Condition Cues | Chen Chen et.al. | 2510.13620 | null |
| 2025-10-15 | Automated document processing system for government agencies using DBNET++ and BART models | Aya Kaysan Bahjat et.al. | 2510.13303 | null |
| 2025-10-15 | LLM one-shot style transfer for Authorship Attribution and Verification | Pablo Miralles-González et.al. | 2510.13302 | null |
| 2025-10-15 | What “Not” to Detect: Negation-Aware VLMs via Structured Reasoning and Token Merging | Inha Kang et.al. | 2510.13232 | null |
| 2025-10-15 | An Analytical Framework to Enhance Autonomous Vehicle Perception for Smart Cities | Jalal Khan et.al. | 2510.13230 | null |
| 2025-10-14 | Detect Anything via Next Point Prediction | Qing Jiang et.al. | 2510.12798 | link |
| 2025-10-14 | StyleDecipher: Robust and Explainable Detection of LLM-Generated Texts with Stylistic Analysis | Siyuan Li et.al. | 2510.12608 | null |
| 2025-10-14 | WaterFlow: Explicit Physics-Prior Rectified Flow for Underwater Saliency Mask Generation | Runting Li et.al. | 2510.12605 | null |
| 2025-10-14 | When Personalization Tricks Detectors: The Feature-Inversion Trap in Machine-Generated Text Detection | Lang Gao et.al. | 2510.12476 | null |
| 2025-10-14 | The Impact of Synthetic Data on Object Detection Model Performance: A Comparative Analysis with Real-World Data | Muammer Bay et.al. | 2510.12208 | null |
| 2025-10-14 | SpikePool: Event-driven Spiking Transformer with Pooling Attention | Donghyun Lee et.al. | 2510.12102 | null |
| 2025-10-14 | APGNet: Adaptive Prior-Guided for Underwater Camouflaged Object Detection | Xinxin Huang et.al. | 2510.12056 | null |
| 2025-10-13 | NV3D: Leveraging Spatial Shape Through Normal Vector-based 3D Object Detection | Krittin Chaowakarn et.al. | 2510.11632 | null |
| 2025-10-13 | Enhancing Maritime Domain Awareness on Inland Waterways: A YOLO-Based Fusion of Satellite and AIS for Vessel Characterization | Geoffery Agorku et.al. | 2510.11449 | null |
| 2025-10-13 | A Modular AIoT Framework for Low-Latency Real-Time Robotic Teleoperation in Smart Cities | Shih-Chieh Sun et.al. | 2510.11421 | null |
| 2025-10-13 | REACT3D: Recovering Articulations for Interactive Physical 3D Scenes | Zhao Huang et.al. | 2510.11340 | null |
| 2025-10-13 | When Does Supervised Training Pay Off? The Hidden Economics of Object Detection in the Era of Vision-Language Models | Samer Al-Hamadani et.al. | 2510.11302 | null |
| 2025-10-13 | A Large-Language-Model Assisted Automated Scale Bar Detection and Extraction Framework for Scanning Electron Microscopic Images | Yuxuan Chen et.al. | 2510.11260 | null |
| 2025-10-13 | Source-Free Object Detection with Detection Transformer | Huizai Yao et.al. | 2510.11090 | null |
| 2025-10-13 | Slitless Spectroscopy Source Detection Using YOLO Deep Neural Network | Xiaohan Chen et.al. | 2510.10922 | null |
| 2025-10-12 | EGD-YOLO: A Lightweight Multimodal Framework for Robust Drone-Bird Discrimination via Ghost-Enhanced YOLOv8n and EMA Attention under Adverse Condition | Sudipto Sarkar et.al. | 2510.10765 | null |
| 2025-10-12 | MRS-YOLO Railroad Transmission Line Foreign Object Detection Based on Improved YOLO11 and Channel Pruning | Siyuan Liu et.al. | 2510.10553 | null |
| 2025-10-12 | Risk-Budgeted Control Framework for Balanced Performance and Safety in Autonomous Vehicles | Pei Yu Chang et.al. | 2510.10442 | null |
| 2025-10-11 | Ordinal Scale Traffic Congestion Classification with Multi-Modal Vision-Language and Motion Analysis | Yu-Hsuan Lin et.al. | 2510.10342 | null |
| 2025-10-11 | Bridging Perspectives: Foundation Model Guided BEV Maps for 3D Object Detection and Tracking | Markus Käppeler et.al. | 2510.10287 | null |
| 2025-10-11 | MRI Brain Tumor Detection with Computer Vision | Jack Krolik et.al. | 2510.10250 | null |
| 2025-10-11 | BurstDeflicker: A Benchmark Dataset for Flicker Removal in Dynamic Scenes | Lishen Qu et.al. | 2510.09996 | null |
| 2025-10-10 | SpectralCA: Bi-Directional Cross-Attention for Next-Generation UAV Hyperspectral Vision | D. V. Brovko et.al. | 2510.09912 | null |
| 2025-10-06 | Ultralytics YOLO Evolution: An Overview of YOLO26, YOLO11, YOLOv8 and YOLOv5 Object Detectors for Computer Vision and Pattern Recognition | Ranjan Sapkota et.al. | 2510.09653 | null |
| 2025-10-10 | FSP-DETR: Few-Shot Prototypical Parasitic Ova Detection | Shubham Trehan et.al. | 2510.09583 | null |
| 2025-10-10 | PRNet: Original Information Is All You Have | PeiHuang Zheng et.al. | 2510.09531 | null |
| 2025-10-10 | Utilizing dynamic sparsity on pretrained DETR | Reza Sedghi et.al. | 2510.09380 | null |
| 2025-10-10 | TARO: Toward Semantically Rich Open-World Object Detection | Yuchen Zhang et.al. | 2510.09173 | null |
| 2025-10-10 | SOS: Synthetic Object Segments Improve Detection, Segmentation, and Grounding | Weikai Huang et.al. | 2510.09110 | null |
| 2025-10-09 | Re-Identifying Kākā with AI-Automated Video Key Frame Extraction | Paula Maddigan et.al. | 2510.08775 | null |
| 2025-10-03 | Beyond CNNs: Efficient Fine-Tuning of Multi-Modal LLMs for Object Detection on Low-Data Regimes | Nirmal Elamon et.al. | 2510.08589 | null |
| 2025-10-09 | A Multimodal Depth-Aware Method For Embodied Reference Understanding | Fevziye Irem Eyiokur et.al. | 2510.08278 | null |
| 2025-10-09 | RayFusion: Ray Fusion Enhanced Collaborative Visual Perception | Shaohong Wang et.al. | 2510.08017 | null |
| 2025-10-09 | A Large-scale Dataset for Robust Complex Anime Scene Text Detection | Ziyi Dong et.al. | 2510.07951 | null |
| 2025-10-08 | Robust Measurement of Stellar Streams Around the Milky Way: Correcting Spatially Variable Observational Selection Effects in Optical Imaging Surveys | K. Boone et.al. | 2510.07511 | null |
| 2025-10-08 | A million-solar-mass object detected at cosmological distance using gravitational imaging | D. M. Powell et.al. | 2510.07382 | null |
| 2025-10-08 | Inconsistent Affective Reaction: Sentiment of Perception and Opinion in Urban Environments | Jingfei Huang et.al. | 2510.07359 | null |
| 2025-10-07 | Enhancing Maritime Object Detection in Real-Time with RT-DETR and Data Augmentation | Nader Nemati et.al. | 2510.07346 | null |
| 2025-10-08 | Explaining raw data complexity to improve satellite onboard processing | Adrien Dorise et.al. | 2510.06858 | null |
| 2025-10-08 | Extreme Amodal Face Detection | Changlin Song et.al. | 2510.06791 | null |
| 2025-10-08 | SDQM: Synthetic Data Quality Metric for Object Detection Dataset Evaluation | Ayush Zenith et.al. | 2510.06596 | link |
| 2025-10-08 | Adaptive Stain Normalization for Cross-Domain Medical Histology | Tianyue Xu et.al. | 2510.06592 | null |
| 2025-10-06 | General and Efficient Visual Goal-Conditioned Reinforcement Learning using Object-Agnostic Masks | Fahim Shahriar et.al. | 2510.06277 | null |
| 2025-10-06 | Comparative Analysis of YOLOv5, Faster R-CNN, SSD, and RetinaNet for Motorbike Detection in Kigali Autonomous Driving Context | Ngeyen Yinkfu et.al. | 2510.04912 | null |
| 2025-10-06 | CLEAR-IR: Clarity-Enhanced Active Reconstruction of Infrared Imagery | Nathan Shankar et.al. | 2510.04883 | null |
| 2025-10-06 | SPEGNet: Synergistic Perception-Guided Network for Camouflaged Object Detection | Baber Jan et.al. | 2510.04472 | link |
| 2025-10-04 | From Filters to VLMs: Benchmarking Defogging Methods through Object Detection and Segmentation Performance | Ardalan Aryashad et.al. | 2510.03906 | null |
| 2025-10-04 | Cross-View Open-Vocabulary Object Detection in Aerial Imagery | Jyoti Kini et.al. | 2510.03858 | null |
| 2025-10-04 | Person-Centric Annotations of LAION-400M: Auditing Bias and Its Transfer to Models | Leander Girrbach et.al. | 2510.03721 | null |
| 2025-10-04 | SAMSOD: Rethinking SAM Optimization for RGB-T Salient Object Detection | Zhengyi Liu et.al. | 2510.03689 | null |
| 2025-10-03 | ALHD: A Large-Scale and Multigenre Benchmark Dataset for Arabic LLM-Generated Text Detection | Ali Khairallah et.al. | 2510.03502 | null |
| 2025-10-03 | Visual Language Model as a Judge for Object Detection in Industrial Diagrams | Sanjukta Ghosh et.al. | 2510.03376 | null |
| 2025-10-03 | Neural Posterior Estimation with Autoregressive Tiling for Detecting Objects in Astronomical Images | Jeffrey Regier et.al. | 2510.03074 | null |
| 2025-10-03 | Align Your Query: Representation Alignment for Multimodality Medical Object Detection | Ara Seo et.al. | 2510.02789 | null |
| 2025-10-02 | Multimodal Large Language Model Framework for Safe and Interpretable Grid-Integrated EVs | Jean Douglas Carvalho et.al. | 2510.02592 | null |
| 2025-10-02 | Clink! Chop! Thud! – Learning Object Sounds from Real-World Interactions | Mengyu Yang et.al. | 2510.02313 | null |
| 2025-10-02 | kabr-tools: Automated Framework for Multi-Species Behavioral Monitoring | Jenna Kline et.al. | 2510.02030 | link |
| 2025-10-02 | Automated Defect Detection for Mass-Produced Electronic Components Based on YOLO Object Detection Models | Wei-Lung Mao et.al. | 2510.01914 | null |
| 2025-10-02 | Calibrating the Full Predictive Class Distribution of 3D Object Detectors for Autonomous Driving | Cornelius Schröder et.al. | 2510.01829 | null |
| 2025-10-01 | Breaking the Code: Security Assessment of AI Code Agents Through Systematic Jailbreaking Attacks | Shoumik Saha et.al. | 2510.01359 | null |
| 2025-10-01 | Span-level Detection of AI-generated Scientific Text via Contrastive Learning and Structural Calibration | Zhen Yin et.al. | 2510.00890 | null |
| 2025-10-01 | Adaptive Event Stream Slicing for Open-Vocabulary Event-Based Object Detection via Vision-Language Knowledge Distillation | Jinchang Zhang et.al. | 2510.00681 | null |
| 2025-10-01 | Forestpest-YOLO: A High-Performance Detection Framework for Small Forestry Pests | Aoduo Li et.al. | 2510.00547 | null |
| 2025-09-30 | Looking Beyond the Known: Towards a Data Discovery Guided Open-World Object Detection | Anay Majee et.al. | 2510.00303 | null |
| 2025-09-30 | Neural Network-Based Single-Carrier Joint Communication and Sensing: Loss Design, Constellation Shaping and Precoding | Charlotte Muth et.al. | 2509.26508 | null |
| 2025-09-30 | Point2RBox-v3: Self-Bootstrapping from Point Annotations via Integrated Pseudo-Label Refinement and Utilization | Teng Zhang et.al. | 2509.26281 | null |
| 2025-09-30 | Beyond Overall Accuracy: Pose- and Occlusion-driven Fairness Analysis in Pedestrian Detection for Autonomous Driving | Mohammad Khoshkdahan et.al. | 2509.26166 | null |
| 2025-09-30 | Towards Continual Expansion of Data Coverage: Automatic Text-guided Edge-case Synthesis | Kyeongryeol Go et.al. | 2509.26158 | null |
| 2025-09-30 | Predicting Penalty Kick Direction Using Multi-Modal Deep Learning with Pose-Guided Attention | Pasindu Ranasinghe et.al. | 2509.26088 | null |
| 2025-09-30 | Geometric Learning of Canonical Parameterizations of $2D$-curves | Ioana Ciuclea et.al. | 2509.26070 | null |
| 2025-09-30 | CEAID: Benchmark of Multilingual Machine-Generated Text Detection Methods for Central European Languages | Dominik Macko et.al. | 2509.26051 | null |
| 2025-09-30 | Adapting SAM with Dynamic Similarity Graphs for Few-Shot Parameter-Efficient Small Dense Object Detection: A Case Study of Chickpea Pods in Field Conditions | Xintong Jiang et.al. | 2509.25805 | null |
| 2025-09-29 | AttentionViG: Cross-Attention-Based Dynamic Neighbor Aggregation in Vision GNNs | Hakan Emre Gedik et.al. | 2509.25570 | null |
| 2025-09-29 | Infrastructure Sensor-enabled Vehicle Data Generation using Multi-Sensor Fusion for Proactive Safety Applications at Work Zone | Suhala Rabab Saba et.al. | 2509.25452 | null |
| 2025-09-29 | YOLO26: Key Architectural Enhancements and Performance Benchmarking for Real-Time Object Detection | Ranjan Sapkota et.al. | 2509.25164 | null |
| 2025-09-29 | Who’s Your Judge? On the Detectability of LLM-Generated Judgments | Dawei Li et.al. | 2509.25154 | link |
| 2025-09-29 | Accelerating Dynamic Image Graph Construction on FPGA for Vision GNNs | Anvitha Ramachandran et.al. | 2509.25121 | null |
| 2025-09-29 | GeoVLM-R1: Reinforcement Fine-Tuning for Improved Remote Sensing Reasoning | Mustansar Fiaz et.al. | 2509.25026 | null |
| 2025-09-29 | Comprehensive Benchmarking of YOLOv11 Architectures for Scalable and Granular Peripheral Blood Cell Detection | Mohamad Abou Ali et.al. | 2509.24595 | null |
| 2025-09-29 | Talk in Pieces, See in Whole: Disentangling and Hierarchical Aggregating Representations for Language-based Object Detection | Sojung An et.al. | 2509.24192 | null |
| 2025-09-28 | Bridging the Task Gap: Multi-Task Adversarial Transferability in CLIP and Its Derivatives | Kuanrong Liu et.al. | 2509.23917 | null |
| 2025-09-28 | Learning Adaptive Pseudo-Label Selection for Semi-Supervised 3D Object Detection | Taehun Kong et.al. | 2509.23880 | null |
| 2025-09-28 | A Multi-Camera Vision-Based Approach for Fine-Grained Assembly Quality Control | Ali Nazeri et.al. | 2509.23815 | null |
| 2025-09-28 | Diff-3DCap: Shape Captioning with Diffusion Models | Zhenyu Shu et.al. | 2509.23718 | null |
| 2025-09-27 | On the Impact of LiDAR Point Cloud Compression on Remote Semantic Segmentation | Tiago de S. Fernandes et.al. | 2509.23341 | null |
| 2025-09-27 | C3-OWD: A Curriculum Cross-modal Contrastive Learning Framework for Open-World Detection | Siheng Wang et.al. | 2509.23316 | null |
| 2025-09-27 | FMC-DETR: Frequency-Decoupled Multi-Domain Coordination for Aerial-View Object Detection | Ben Liang et.al. | 2509.23056 | null |
| 2025-09-26 | SSVIF: Self-Supervised Segmentation-Oriented Visible and Infrared Image Fusion | Zixian Zhao et.al. | 2509.22450 | null |
| 2025-09-26 | $γ$-Quant: Towards Learnable Quantization for Low-bit Pattern Recognition | Mishal Fatima et.al. | 2509.22448 | null |
| 2025-09-26 | HierLight-YOLO: A Hierarchical and Lightweight Object Detection Network for UAV Photography | Defan Chen et.al. | 2509.22365 | null |
| 2025-09-26 | Mixture of Detectors: A Compact View of Machine-Generated Text Detection | Sai Teja Lekkala et.al. | 2509.22147 | null |
| 2025-09-07 | S-LAM3D: Segmentation-Guided Monocular 3D Object Detection via Feature Space Fusion | Diana-Alexandra Sas et.al. | 2509.05999 | null |
| 2025-07-31 | 3D-MOOD: Lifting 2D to 3D for Monocular Open-Set Object Detection | Yung-Hsu Yang et.al. | 2507.23567 | link |
| 2025-07-24 | Protecting Vulnerable Voices: Synthetic Dataset Generation for Self-Disclosure Detection | Shalini Jangra et.al. | 2507.22930 | null |
| 2025-07-25 | Bias Analysis for Synthetic Face Detection: A Case Study of the Impact of Facial Attributes | Asmae Lamsaf et.al. | 2507.19705 | null |
| 2025-07-25 | Co-Win: Joint Object Detection and Instance Segmentation in LiDAR Point Clouds via Collaborative Window Processing | Haichuan Li et.al. | 2507.19691 | null |
| 2025-07-25 | An OpenSource CI/CD Pipeline for Variant-Rich Software-Defined Vehicles | Matthias Weiß et.al. | 2507.19446 | null |
| 2025-07-25 | EffiComm: Bandwidth Efficient Multi Agent Communication | Melih Yazgan et.al. | 2507.19354 | null |
| 2025-07-25 | Multistream Network for LiDAR and Camera-based 3D Object Detection in Outdoor Scenes | Muhammad Ibrahim et.al. | 2507.19304 | null |
| 2025-07-25 | Cross Spatial Temporal Fusion Attention for Remote Sensing Object Detection via Image Feature Matching | Abu Sadat Mohammad Salehin Amit et.al. | 2507.19118 | null |
| 2025-07-25 | Revisiting DETR for Small Object Detection via Noise-Resilient Query Optimization | Xiaocheng Fang et.al. | 2507.19059 | null |
| 2025-07-25 | YOLO for Knowledge Extraction from Vehicle Images: A Baseline Study | Saraa Al-Saddik et.al. | 2507.18966 | null |
| 2025-07-25 | WiSE-OD: Benchmarking Robustness in Infrared Object Detection | Heitor R. Medeiros et.al. | 2507.18925 | null |
| 2025-07-25 | Synthetic-to-Real Camouflaged Object Detection | Zhihao Luo et.al. | 2507.18911 | null |
| 2025-07-24 | Towards Large Scale Geostatistical Methane Monitoring with Part-based Object Detection | Adhemar de Senneville et.al. | 2507.18513 | null |
| 2025-07-24 | Human Scanpath Prediction in Target-Present Visual Search with Semantic-Foveal Bayesian Attention | João Luzio et.al. | 2507.18503 | null |
| 2025-07-24 | A COCO-Formatted Instance-Level Dataset for Plasmodium Falciparum Detection in Giemsa-Stained Blood Smears | Frauke Wilm et.al. | 2507.18483 | null |
| 2025-07-24 | Revisiting Physically Realizable Adversarial Object Attack against LiDAR-based Detection: Clarifying Problem Formulation and Experimental Protocols | Luo Cheng et.al. | 2507.18457 | null |
| 2025-07-24 | Boosting Multi-View Indoor 3D Object Detection via Adaptive 3D Volume Construction | Runmin Zhang et.al. | 2507.18331 | link |
| 2025-07-24 | LMM-Det: Make Large Multimodal Models Excel in Object Detection | Jincheng Li et.al. | 2507.18300 | link |
| 2025-07-24 | Evaluation of facial landmark localization performance in a surgical setting | Ines Frajtag et.al. | 2507.18248 | null |
| 2025-07-24 | Real-Time Object Detection and Classification using YOLO for Edge FPGAs | Rashed Al Amin et.al. | 2507.18174 | null |
| 2025-07-24 | WaveMamba: Wavelet-Driven Mamba Fusion for RGB-Infrared Object Detection | Haodong Zhu et.al. | 2507.18173 | null |
| 2025-07-24 | OpenNav: Open-World Navigation with Multimodal Large Language Models | Mingfeng Yuan et.al. | 2507.18033 | null |
| 2025-07-23 | Bearded Dragon Activity Recognition Pipeline: An AI-Based Approach to Behavioural Monitoring | Arsen Yermukan et.al. | 2507.17987 | null |
| 2025-07-23 | FishDet-M: A Unified Large-Scale Benchmark for Robust Fish Detection and CLIP-Guided Model Selection in Diverse Aquatic Visual Domains | Muayad Abujabal et.al. | 2507.17859 | null |
| 2025-07-23 | Perspective-Invariant 3D Object Detection | Ao Liang et.al. | 2507.17665 | null |
| 2025-07-23 | Constructing Ophthalmic MLLM for Positioning-diagnosis Collaboration Through Clinical Cognitive Chain Reasoning | Xinyao Liu et.al. | 2507.17539 | link |
| 2025-07-23 | Illicit object detection in X-ray imaging using deep learning techniques: A comparative evaluation | Jorgen Cani et.al. | 2507.17508 | link |
| 2025-07-23 | Dynamic-DINO: Fine-Grained Mixture of Experts Tuning for Real-time Open-Vocabulary Object Detection | Yehao Lu et.al. | 2507.17436 | null |
| 2025-07-23 | SFUOD: Source-Free Unknown Object Detection | Keon-Hee Park et.al. | 2507.17373 | null |
| 2025-07-23 | Optimizing Delivery Logistics: Enhancing Speed and Safety with Drone Technology | Maharshi Shastri et.al. | 2507.17253 | null |
| 2025-07-23 | A Low-Cost Machine Learning Approach for Timber Diameter Estimation | Fatemeh Hasanzadeh Fard et.al. | 2507.17219 | null |
| 2025-07-22 | Few-Shot Learning in Video and 3D Object Detection: A Survey | Md Meftahul Ferdaus et.al. | 2507.17079 | null |
| 2025-07-22 | Transformer Based Building Boundary Reconstruction using Attraction Field Maps | Muhammad Kamran et.al. | 2507.17038 | null |
| 2025-07-22 | Divisive Decisions: Improving Salience-Based Training for Generalization in Binary Classification Tasks | Jacob Piland et.al. | 2507.17000 | null |
| 2025-07-22 | Task-Specific Zero-shot Quantization-Aware Training for Object Detection | Changhao Li et.al. | 2507.16782 | null |
| 2025-07-22 | Screen2AX: Vision-Based Approach for Automatic macOS Accessibility Generation | Viktor Muryn et.al. | 2507.16704 | null |
| 2025-07-22 | QRetinex-Net: Quaternion-Valued Retinex Decomposition for Low-Level Computer Vision Applications | Sos Agaian et.al. | 2507.16683 | null |
| 2025-07-22 | Benchmarking pig detection and tracking under diverse and challenging conditions | Jonathan Henrich et.al. | 2507.16639 | null |
| 2025-07-22 | A2Mamba: Attention-augmented State Space Models for Visual Recognition | Meng Lou et.al. | 2507.16624 | null |
| 2025-07-22 | PlantSAM: An Object Detection-Driven Segmentation Pipeline for Herbarium Specimens | Youcef Sklab et.al. | 2507.16506 | null |
| 2025-07-22 | Towards Railway Domain Adaptation for LiDAR-based 3D Detection: Road-to-Rail and Sim-to-Real via SynDRA-BBox | Xavier Diaz et.al. | 2507.16413 | null |
| 2025-07-22 | Scene Text Detection and Recognition “in light of” Challenging Environmental Conditions using Aria Glasses Egocentric Vision Cameras | Joseph De Mathia et.al. | 2507.16330 | null |
| 2025-07-22 | MAN++: Scaling Momentum Auxiliary Network for Supervised Local Learning in Vision Tasks | Junhao Su et.al. | 2507.16279 | null |
| 2025-07-22 | Edge-case Synthesis for Fisheye Object Detection: A Data-centric Perspective | Seunghyeon Kim et.al. | 2507.16254 | null |
| 2025-07-21 | Experimenting active and sequential learning in a medieval music manuscript | Sachin Sharma et.al. | 2507.15633 | null |
| 2025-07-21 | Few-Shot Object Detection via Spatial-Channel State Space Model | Zhimeng Xin et.al. | 2507.15308 | null |
| 2025-07-21 | Beyond Easy Wins: A Text Hardness-Aware Benchmark for LLM-generated Text Detection | Navid Ayoobi et.al. | 2507.15286 | null |
| 2025-07-20 | Event-based Graph Representation with Spatial and Motion Vectors for Asynchronous Object Detection | Aayush Atul Verma et.al. | 2507.15150 | null |
| 2025-07-20 | BleedOrigin: Dynamic Bleeding Source Localization in Endoscopic Submucosal Dissection via Dual-Stage Detection and Tracking | Mengya Xu et.al. | 2507.15094 | null |
| 2025-07-20 | InsightX Agent: An LMM-based Agentic Framework with Integrated Tools for Reliable X-ray NDT Analysis | Jiale Liu et.al. | 2507.14899 | null |
| 2025-07-20 | An Uncertainty-aware DETR Enhancement Framework for Object Detection | Xingshu Chen et.al. | 2507.14855 | null |
| 2025-07-20 | Seeing Through Deepfakes: A Human-Inspired Framework for Multi-Face Detection | Juan Hu et.al. | 2507.14807 | null |
| 2025-07-19 | GCC-Spam: Spam Detection via GAN, Contrastive Learning, and Character Similarity Networks | Zixin Xu et.al. | 2507.14679 | null |
| 2025-07-19 | Multispectral State-Space Feature Fusion: Bridging Shared and Cross-Parametric Interactions for Object Detection | Jifeng Shen et.al. | 2507.14643 | null |
| 2025-07-18 | C-DOG: Training-Free Multi-View Multi-Object Association in Dense Scenes Without Visual Feature via Connected δ-Overlap Graphs | Yung-Hong Sun et.al. | 2507.14095 | null |
| 2025-07-18 | Enhancing LiDAR Point Features with Foundation Model Priors for 3D Object Detection | Yujian Mo et.al. | 2507.13899 | null |
| 2025-07-18 | Moving Object Detection from Moving Camera Using Focus of Expansion Likelihood and Segmentation | Masahiro Ogawa et.al. | 2507.13628 | null |
| 2025-07-17 | NSF-DOE Vera C. Rubin Observatory Observations of Interstellar Comet 3I/ATLAS (C/2025 N1) | Colin Orion Chandler et.al. | 2507.13409 | null |
| 2025-07-17 | A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains | Antonio Finocchiaro et.al. | 2507.13326 | null |
| 2025-07-17 | RS-TinyNet: Stage-wise Feature Fusion Network for Detecting Tiny Objects in Remote Sensing Images | Xiaozheng Jiang et.al. | 2507.13120 | null |
| 2025-07-17 | Decoupled PROB: Decoupled Query Initialization Tasks and Objectness-Class Learning for Open World Object Detection | Riku Inoue et.al. | 2507.13085 | null |
| 2025-07-17 | Dual LiDAR-Based Traffic Movement Count Estimation at a Signalized Intersection: Deployment, Data Collection, and Preliminary Analysis | Saswat Priyadarshi Nayak et.al. | 2507.13073 | null |
| 2025-07-17 | SOD-YOLO: Enhancing YOLO-Based Detection of Small Objects in UAV Imagery | Peijun Wang et.al. | 2507.12727 | null |
| 2025-07-16 | Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios | Van-Hoang-Anh Phan et.al. | 2507.12449 | null |
| 2025-07-16 | InterpIoU: Rethinking Bounding Box Regression with Interpolation-Based IoU Optimization | Haoyuan Liu et.al. | 2507.12420 | null |
| 2025-07-16 | AutoVDC: Automated Vision Data Cleaning Using Vision-Language Models | Santosh Vasa et.al. | 2507.12414 | null |
| 2025-07-16 | OD-VIRAT: A Large-Scale Benchmark for Object Detection in Realistic Surveillance Environments | Hayat Ullah et.al. | 2507.12396 | null |
| 2025-07-16 | Improving Lightweight Weed Detection via Knowledge Distillation | Ahmet Oğuz Saltık et.al. | 2507.12344 | null |
| 2025-07-16 | SS-DC: Spatial-Spectral Decoupling and Coupling Across Visible-Infrared Gap for Domain Adaptive Object Detection | Xiwei Zhang et.al. | 2507.12017 | null |
| 2025-07-16 | Frequency-Dynamic Attention Modulation for Dense Prediction | Linwei Chen et.al. | 2507.12006 | null |
| 2025-07-15 | Tomato Multi-Angle Multi-Pose Dataset for Fine-Grained Phenotyping | Yujie Zhang et.al. | 2507.11279 | null |
| 2025-07-15 | Using Continual Learning for Real-Time Detection of Vulnerable Road Users in Complex Traffic Scenarios | Faryal Aurooj Nasir et.al. | 2507.11046 | null |
| 2025-07-15 | Combining Transformers and CNNs for Efficient Object Detection in High-Resolution Satellite Imagery | Nicolas Drapier et.al. | 2507.11040 | null |
| 2025-07-14 | A Lightweight and Robust Framework for Real-Time Colorectal Polyp Detection Using LOF-Based Preprocessing and YOLO-v11n | Saadat Behzadi et.al. | 2507.10864 | null |
| 2025-07-14 | LLM-Guided Agentic Object Detection for Open-World Understanding | Furkan Mumcu et.al. | 2507.10844 | null |
| 2025-07-14 | Versatile and Generalizable Manipulation via Goal-Conditioned Reinforcement Learning with Grounded Object Detection | Huiyi Wang et.al. | 2507.10814 | null |
| 2025-07-14 | Fine-Grained Zero-Shot Object Detection | Hongxu Ma et.al. | 2507.10358 | null |
| 2025-07-14 | BlueGlass: A Framework for Composite AI Safety | Harshal Nandigramwar et.al. | 2507.10106 | null |
| 2025-07-14 | SRG/ART-XC All-Sky X-ray Survey: Sensitivity Assessment Based on Aperture Photometry | N. Y. Tyrin et.al. | 2507.10060 | null |
| 2025-07-14 | 3DGAA: Realistic and Robust 3D Gaussian-based Adversarial Attack for Autonomous Driving | Yixun Zhang et.al. | 2507.09993 | null |
| 2025-07-14 | Measuring the Impact of Rotation Equivariance on Aerial Object Detection | Xiuyu Wu et.al. | 2507.09896 | null |
| 2025-07-14 | Secure and Efficient UAV-Based Face Detection via Homomorphic Encryption and Edge Computing | Nguyen Van Duc et.al. | 2507.09860 | null |
| 2025-07-13 | MLoRQ: Bridging Low-Rank and Quantization for Transformer Compression | Ofir Gordon et.al. | 2507.09616 | null |
| 2025-07-12 | Stereo-based 3D Anomaly Object Detection for Autonomous Driving: A New Dataset and Baseline | Shiyi Mu et.al. | 2507.09214 | null |
| 2025-07-12 | On the Fragility of Multimodal Perception to Temporal Misalignment in Autonomous Driving | Md Hasan Shahriar et.al. | 2507.09095 | null |
| 2025-07-11 | VISTA: A Visual Analytics Framework to Enhance Foundation Model-Generated Data Labels | Xiwei Xuan et.al. | 2507.09008 | null |
| 2025-07-11 | RoundaboutHD: High-Resolution Real-World Urban Environment Benchmark for Multi-Camera Vehicle Tracking | Yuqiang Lin et.al. | 2507.08729 | null |
| 2025-07-11 | DatasetAgent: A Novel Multi-Agent System for Auto-Constructing Datasets from Real-World Images | Haoran Sun et.al. | 2507.08648 | null |
| 2025-07-11 | OnlineBEV: Recurrent Temporal Fusion in Bird’s Eye View Representations for Multi-Camera 3D Perception | Junho Koh et.al. | 2507.08644 | null |
| 2025-07-11 | Smelly, dense, and spreaded: The Object Detection for Olfactory References (ODOR) dataset | Mathias Zinnen et.al. | 2507.08384 | null |
| 2025-07-11 | Spectroscopic Observations of Four Candidates for Blue Large-Amplitude Pulsators. No BLAPs at High Galactic Latitudes | P. Pietrukowicz et.al. | 2507.08372 | null |
| 2025-07-11 | Understanding Driving Risks using Large Language Models: Toward Elderly Driver Assessment | Yuki Yoshihara et.al. | 2507.08367 | null |
| 2025-07-10 | An Embedded Real-time Object Alert System for Visually Impaired: A Monocular Depth Estimation based Approach through Computer Vision | Jareen Anjom et.al. | 2507.08165 | null |
| 2025-07-10 | Rainbow Artifacts from Electromagnetic Signal Injection Attacks on Image Sensors | Youqian Zhang et.al. | 2507.07773 | null |
| 2025-07-09 | Automated Video Segmentation Machine Learning Pipeline | Johannes Merz et.al. | 2507.07242 | null |
| 2025-07-09 | Aerial Maritime Vessel Detection and Identification | Antonella Barisic Kulas et.al. | 2507.07153 | null |
| 2025-07-09 | DenoiseCP-Net: Efficient Collective Perception in Adverse Weather via Joint LiDAR-Based 3D Object Detection and Denoising | Sven Teufel et.al. | 2507.06976 | null |
| 2025-07-09 | A multi-modal dataset for insect biodiversity with imagery and DNA at the trap and individual level | Johanna Orsholm et.al. | 2507.06972 | null |
| 2025-07-09 | Dataset and Benchmark for Enhancing Critical Retained Foreign Object Detection | Yuli Wang et.al. | 2507.06937 | null |
| 2025-07-09 | Unlocking Thermal Aerial Imaging: Synthetic Enhancement of UAV Datasets | Antonella Barisic Kulas et.al. | 2507.06797 | null |
| 2025-07-09 | LOVON: Legged Open-Vocabulary Object Navigator | Daojie Peng et.al. | 2507.06747 | null |
| 2025-07-09 | EA: An Event Autoencoder for High-Speed Vision Sensing | Riadul Islam et.al. | 2507.06459 | null |
| 2025-07-08 | Hierarchical Multi-Stage Transformer Architecture for Context-Aware Temporal Action Localization | Hayat Ullah et.al. | 2507.06411 | null |
| 2025-07-08 | ECORE: Energy-Conscious Optimized Routing for Deep Learning Models at the Edge | Daghash K. Alqahtani et.al. | 2507.06011 | null |
| 2025-07-08 | R-VLM: Region-Aware Vision Language Model for Precise GUI Grounding | Joonhyung Park et.al. | 2507.05673 | null |
| 2025-07-07 | YOLO-APD: Enhancing YOLOv8 for Robust Pedestrian Detection on Complex Road Geometries | Aquino Joctum et.al. | 2507.05376 | null |
| 2025-07-07 | From a Different Star: 3I/ATLAS in the context of the Ōtautahi-Oxford interstellar object population model | Matthew J. Hopkins et.al. | 2507.05318 | null |
| 2025-07-07 | Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations | Xiang Xu et.al. | 2507.05260 | null |
| 2025-07-07 | AI Generated Text Detection Using Instruction Fine-tuned Large Language and Transformer-Based Models | Chinnappa Guggilla et.al. | 2507.05157 | null |
| 2025-07-07 | LERa: Replanning with Visual Feedback in Instruction Following | Svyatoslav Pchelintsev et.al. | 2507.05135 | null |
| 2025-07-07 | Robustifying 3D Perception through Least-Squares Multi-Agent Graphs Object Tracking | Maria Damanaki et.al. | 2507.04762 | null |
| 2025-07-07 | CVFusion: Cross-View Fusion of 4D Radar and Camera for 3D Object Detection | Hanzhi Zhong et.al. | 2507.04587 | null |
| 2025-07-06 | MambaFusion: Height-Fidelity Dense Global Fusion for Multi-modal 3D Object Detection | Hanshi Wang et.al. | 2507.04369 | null |
| 2025-07-06 | DMAT: An End-to-End Framework for Joint Atmospheric Turbulence Mitigation and Object Detection | Paul Hill et.al. | 2507.04323 | null |
| 2025-07-06 | ZERO: Multi-modal Prompt-based Visual Grounding | Sangbum Choi et.al. | 2507.04270 | null |
| 2025-07-05 | Towards Accurate and Efficient 3D Object Detection for Autonomous Driving: A Mixture of Experts Computing System on Edge | Linshen Liu et.al. | 2507.04123 | null |
| 2025-07-04 | Zero Memory Overhead Approach for Protecting Vision Transformer Parameters | Fereshteh Baradaran et.al. | 2507.03816 | null |
| 2025-07-03 | Partial Weakly-Supervised Oriented Object Detection | Mingxin Liu et.al. | 2507.02751 | null |
| 2025-07-03 | Automatic Labelling for Low-Light Pedestrian Detection | Dimitrios Bouzoulas et.al. | 2507.02513 | null |
| 2025-07-03 | Weakly-supervised Contrastive Learning with Quantity Prompts for Moving Infrared Small Target Detection | Weiwei Duan et.al. | 2507.02454 | null |
| 2025-07-03 | A Late Collaborative Perception Framework for 3D Multi-Object and Multi-Source Association and Fusion | Maryem Fadili et.al. | 2507.02430 | null |
| 2025-07-03 | PLOT: Pseudo-Labeling via Video Object Tracking for Scalable Monocular 3D Object Detection | Seokyeong Lee et.al. | 2507.02393 | null |
| 2025-07-03 | Two-Steps Neural Networks for an Automated Cerebrovascular Landmark Detection | Rafic Nader et.al. | 2507.02349 | null |
| 2025-07-03 | Perception Activator: An intuitive and portable framework for brain cognitive exploration | Le Xu et.al. | 2507.02311 | null |
| 2025-07-03 | Understanding Trade offs When Conditioning Synthetic Data | Brandon Trabucco et.al. | 2507.02217 | null |
| 2025-07-02 | How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks | Rahul Ramachandran et.al. | 2507.01955 | link |
| 2025-07-02 | Survivability of Backdoor Attacks on Unconstrained Face Recognition Systems | Quentin Le Roux et.al. | 2507.01607 | null |
| 2025-07-02 | Learning from Random Subspace Exploration: Generalized Test-Time Augmentation with Self-supervised Distillation | Andrei Jelea et.al. | 2507.01347 | null |
| 2025-07-01 | Rapid Salient Object Detection with Difference Convolutional Neural Networks | Zhuo Su et.al. | 2507.01182 | null |
| 2025-07-01 | Robust Component Detection for Flexible Manufacturing: A Deep Learning Approach to Tray-Free Object Recognition under Variable Lighting | Fatemeh Sadat Daneshmand et.al. | 2507.00852 | null |
| 2025-07-01 | UAVD-Mamba: Deformable Token Fusion Vision Mamba for Multimodal UAV Detection | Wei Li et.al. | 2507.00849 | null |
| 2025-07-01 | High-Frequency Semantics and Geometric Priors for End-to-End Detection Transformers in Challenging UAV Imagery | Hongxing Peng et.al. | 2507.00825 | null |
| 2025-07-01 | Multi-Modal Graph Convolutional Network with Sinusoidal Encoding for Robust Human Action Segmentation | Hao Xing et.al. | 2507.00752 | null |
| 2025-07-01 | UPRE: Zero-Shot Domain Adaptation for Object Detection via Unified Prompt and Representation Enhancement | Xiao Zhang et.al. | 2507.00721 | null |
| 2025-07-01 | Rectifying Magnitude Neglect in Linear Attention | Qihang Fan et.al. | 2507.00698 | link |
| 2025-06-30 | Continual Adaptation: Environment-Conditional Parameter Generation for Object Detection in Dynamic Scenarios | Deng Li et.al. | 2506.24063 | null |
| 2025-06-30 | Visual Textualization for Image Prompted Object Detection | Yongjian Wu et.al. | 2506.23785 | null |
| 2025-06-30 | PBCAT: Patch-based composite adversarial training against physically realizable attacks on object detection | Xiao Li et.al. | 2506.23581 | null |
| 2025-06-30 | Event-based Tiny Object Detection: A Benchmark Dataset and Baseline | Nuo Chen et.al. | 2506.23575 | null |
| 2025-06-30 | OcRFDet: Object-Centric Radiance Fields for Multi-View 3D Object Detection in Autonomous Driving | Mingqian Ji et.al. | 2506.23565 | null |
| 2025-06-30 | From Sight to Insight: Unleashing Eye-Tracking in Weakly Supervised Video Salient Object Detection | Qi Qin et.al. | 2506.23519 | null |
| 2025-06-30 | Improve Underwater Object Detection through YOLOv12 Architecture and Physics-informed Augmentation | Tinh Nguyen et.al. | 2506.23505 | null |
| 2025-06-29 | Detecting What Matters: A Novel Approach for Out-of-Distribution 3D Object Detection in Autonomous Vehicles | Menna Taha et.al. | 2506.23426 | null |
| 2025-06-29 | Layer Decomposition and Morphological Reconstruction for Task-Oriented Infrared Image Enhancement | Siyuan Chai et.al. | 2506.23353 | null |
| 2025-06-29 | GeoProg3D: Compositional Visual Reasoning for City-Scale 3D Language Fields | Shunsuke Yasuki et.al. | 2506.23352 | null |
| 2025-06-27 | Attention-disentangled Uniform Orthogonal Feature Space Optimization for Few-shot Object Detection | Taijin Zhao et.al. | 2506.22161 | null |
| 2025-06-27 | Evaluating Pointing Gestures for Target Selection in Human-Robot Collaboration | Noora Sassali et.al. | 2506.22116 | null |
| 2025-06-27 | CERBERUS: Crack Evaluation & Recognition Benchmark for Engineering Reliability & Urban Stability | Justin Reinman et.al. | 2506.21909 | null |
| 2025-06-27 | Visual Content Detection in Educational Videos with Transfer Learning and Dataset Enrichment | Dipayan Biswas et.al. | 2506.21903 | null |
| 2025-06-27 | Embodied Domain Adaptation for Object Detection | Xiangyu Shi et.al. | 2506.21860 | null |
| 2025-06-26 | PhotonSplat: 3D Scene Reconstruction and Colorization from SPAD Sensors | Sai Sri Teja et.al. | 2506.21680 | null |
| 2025-06-26 | Towards Reliable Detection of Empty Space: Conditional Marked Point Processes for Object Detection | Tobias J. Riedlinger et.al. | 2506.21486 | null |
| 2025-06-26 | TITAN: Query-Token based Domain Adaptive Adversarial Learning | Tajamul Ashraf et.al. | 2506.21484 | null |
| 2025-06-26 | A Comprehensive Dataset for Underground Miner Detection in Diverse Scenario | Cyrus Addy et.al. | 2506.21451 | null |
| 2025-06-26 | DuET: Dual Incremental Object Detection via Exemplar-Free Task Arithmetic | Munish Monga et.al. | 2506.21260 | null |
| 2025-06-26 | LASFNet: A Lightweight Attention-Guided Self-Modulation Feature Fusion Network for Multimodal Object Detection | Lei Hao et.al. | 2506.21018 | null |
| 2025-06-26 | ThermalDiffusion: Visual-to-Thermal Image-to-Image Translation for Autonomous Navigation | Shruti Bansal et.al. | 2506.20969 | null |
| 2025-06-25 | Lightweight Multi-Frame Integration for Robust YOLO Object Detection in Videos | Yitong Quan et.al. | 2506.20550 | null |
| 2025-06-25 | Learning-based safety lifting monitoring system for cranes on construction sites | Hao Chen et.al. | 2506.20475 | null |
| 2025-06-25 | Feature Hallucination for Self-supervised Action Recognition | Lei Wang et.al. | 2506.20342 | null |
| 2025-06-25 | From Codicology to Code: A Comparative Study of Transformer and YOLO-based Detectors for Layout Analysis in Historical Documents | Sergio Torres Aguilar et.al. | 2506.20326 | null |
| 2025-06-25 | TDiR: Transformer based Diffusion for Image Restoration Tasks | Abbas Anwar et.al. | 2506.20302 | null |
| 2025-06-25 | Integrated optomechanical ultrasonic sensors with nano-Pascal-level sensitivity | Xuening Cao et.al. | 2506.20219 | null |
| 2025-06-24 | A Survey of Multi-sensor Fusion Perception for Embodied AI: Background, Methods, Challenges and Prospects | Shulan Ruan et.al. | 2506.19769 | null |
| 2025-06-24 | Semantic Scene Graph for Ultrasound Image Explanation and Scanning Guidance | Xuesong Li et.al. | 2506.19683 | null |
| 2025-06-24 | Probabilistic modelling and safety assurance of an agriculture robot providing light-treatment | Mustafa Adam et.al. | 2506.19620 | null |
| 2025-06-24 | USIS16K: High-Quality Dataset for Underwater Salient Instance Segmentation | Lin Hong et.al. | 2506.19472 | null |
| 2025-06-23 | SpaNN: Detecting Multiple Adversarial Patches on CNNs by Spanning Saliency Thresholds | Mauricio Byrd Victorica et.al. | 2506.18591 | null |
| 2025-06-23 | Improvement on LiDAR-Camera Calibration Using Square Targets | Zhongyuan Li et.al. | 2506.18294 | null |
| 2025-06-23 | Learning Approach to Efficient Vision-based Active Tracking of a Flying Target by an Unmanned Aerial Vehicle | Jagadeswara PKV Pothuri et.al. | 2506.18264 | null |
| 2025-06-23 | Ground tracking for improved landmine detection in a GPR system | Li Tang et.al. | 2506.18258 | null |
| 2025-06-24 | Referring Expression Instance Retrieval and A Strong End-to-End Baseline | Xiangzhao Hao et.al. | 2506.18246 | null |
| 2025-06-24 | Unfolding the Past: A Comprehensive Deep Learning Approach to Analyzing Incunabula Pages | Klaudia Ropel et.al. | 2506.18069 | null |
| 2025-06-21 | YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception | Mengqi Lei et.al. | 2506.17733 | link |
| 2025-06-21 | CSDN: A Context-Gated Self-Adaptive Detection Network for Real-Time Object Detection | Wei Haolin et.al. | 2506.17679 | null |
| 2025-06-21 | DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving | Mihir Godbole et.al. | 2506.17590 | null |
| 2025-06-20 | YASMOT: Yet another stereo image multi-object tracker | Ketil Malde et.al. | 2506.17186 | link |
| 2025-06-20 | Class Agnostic Instance-level Descriptor for Visual Instance Search | Qi-Ying Sun et.al. | 2506.16745 | null |
| 2025-06-20 | Cross-modal Offset-guided Dynamic Alignment and Fusion for Weakly Aligned UAV Object Detection | Liu Zongzhen et.al. | 2506.16737 | null |
| 2025-06-19 | How Hard Is Snow? A Paired Domain Adaptation Dataset for Clear and Snowy Weather: CADC+ | Mei Qi Tang et.al. | 2506.16531 | null |
| 2025-06-19 | Can AI Dream of Unseen Galaxies? Conditional Diffusion Model for Galaxy Morphology Augmentation | Chenrui Ma et.al. | 2506.16233 | null |
| 2025-06-19 | VideoGAN-based Trajectory Proposal for Automated Vehicles | Annajoyce Mariani et.al. | 2506.16209 | null |
| 2025-06-19 | BLADE: An Automated Framework for Classifying Light Curves from the Center for Near-Earth Object Studies (CNEOS) Fireball Database | Elizabeth A. Silber et.al. | 2506.16099 | null |
| 2025-06-19 | Polyline Path Masked Attention for Vision Transformer | Zhongchen Zhao et.al. | 2506.15940 | link |
| 2025-06-18 | PhantomHunter: Detecting Unseen Privately-Tuned LLM-Generated Text via Family-Aware Learning | Yuhui Shi et.al. | 2506.15683 | null |
| 2025-06-18 | BoxFusion: Reconstruction-Free Open-Vocabulary 3D Object Detection via Real-Time Multi-View Box Fusion | Yuqing Lan et.al. | 2506.15610 | null |
| 2025-06-18 | Retrospective Memory for Camouflaged Object Detection | Chenxi Zhang et.al. | 2506.15244 | null |
| 2025-06-18 | Fiber Signal Denoising Algorithm using Hybrid Deep Learning Networks | Linlin Wang et.al. | 2506.15125 | null |
| 2025-06-19 | Efficient Retail Video Annotation: A Robust Key Frame Generation Approach for Product and Customer Interaction Analysis | Varun Mannam et.al. | 2506.14854 | null |
| 2025-06-18 | YOLOv11-RGBT: Towards a Comprehensive Single-Stage Multispectral Object Detection Framework | Dahang Wan et.al. | 2506.14696 | null |
| 2025-06-17 | VisText-Mosquito: A Multimodal Dataset and Benchmark for AI-Based Mosquito Breeding Site Detection and Reasoning | Md. Adnanul Islam et.al. | 2506.14629 | link |
| 2025-06-17 | GAMORA: A Gesture Articulated Meta Operative Robotic Arm for Hazardous Material Handling in Containment-Level Environments | Farha Abdul Wasay et.al. | 2506.14513 | null |
| 2025-06-17 | Comparison of Two Methods for Stationary Incident Detection Based on Background Image | Deepak Ghimire et.al. | 2506.14256 | null |
| 2025-06-16 | A Point Cloud Completion Approach for the Grasping of Partially Occluded Objects and Its Applications in Robotic Strawberry Harvesting | Ali Abouzeid et.al. | 2506.14066 | link |
| 2025-06-16 | FindMeIfYouCan: Bringing Open Set metrics to $\textit{near}$, $\textit{far}$ and $\textit{farther}$ Out-of-Distribution Object Detection | Daniel Montoya et.al. | 2506.14008 | null |
| 2025-06-16 | How Real is CARLAs Dynamic Vision Sensor? A Study on the Sim-to-Real Gap in Traffic Object Detection | Kaiyuan Tan et.al. | 2506.13722 | null |
| 2025-06-17 | Lecture Video Visual Objects (LVVO) Dataset: A Benchmark for Visual Object Detection in Educational Videos | Dipayan Biswas et.al. | 2506.13657 | link |
| 2025-06-16 | UAV Object Detection and Positioning in a Mining Industrial Metaverse with Custom Geo-Referenced Data | Vasiliki Balaska et.al. | 2506.13505 | null |
| 2025-06-16 | Sparse Convolutional Recurrent Learning for Efficient Event-based Neuromorphic Object Detection | Shenqi Wang et.al. | 2506.13440 | null |
| 2025-06-16 | Cognitive Synergy Architecture: SEGO for Human-Centric Collaborative Robots | Jaehong Oh et.al. | 2506.13149 | null |
| 2025-06-15 | MGDFIS: Multi-scale Global-detail Feature Integration Strategy for Small Object Detection | Yuxiang Wang et.al. | 2506.12697 | null |
| 2025-06-14 | UniDet-D: A Unified Dynamic Spectral Attention Model for Object Detection under Adverse Weathers | Yuantao Wang et.al. | 2506.12324 | null |
| 2025-06-14 | MatchPlant: An Open-Source Pipeline for UAV-Based Single-Plant Detection and Data Extraction | Worasit Sangjan et.al. | 2506.12295 | link |
| 2025-06-13 | Vision-based Lifting of 2D Object Detections for Automated Driving | Hendrik Königshof et.al. | 2506.11839 | null |
| 2025-06-13 | Teleoperated Driving: a New Challenge for 3D Object Detection in Compressed Point Clouds | Filippo Bragato et.al. | 2506.11804 | null |
| 2025-06-13 | GPLQ: A General, Practical, and Lightning QAT Method for Vision Transformers | Guang Liang et.al. | 2506.11784 | null |
| 2025-06-13 | On the Natural Robustness of Vision-Language Models Against Visual Perception Attacks in Autonomous Driving | Pedram MohajerAnsari et.al. | 2506.11472 | null |
| 2025-06-12 | Teaching in adverse scenes: a statistically feedback-driven threshold and mask adjustment teacher-student framework for object detection in UAV images under adverse scenes | Hongyu Chen et.al. | 2506.11175 | null |
| 2025-06-12 | Discrete Lorenz Attractors in 3D Sinusoidal Maps | Sishu Shankar Muni et.al. | 2506.10788 | null |
| 2025-06-12 | Uncertainty-Masked Bernoulli Diffusion for Camouflaged Object Detection Refinement | Yuqi Shen et.al. | 2506.10712 | null |
| 2025-06-12 | Semantic-decoupled Spatial Partition Guided Point-supervised Oriented Object Detection | Xinyuan Liu et.al. | 2506.10601 | link |
| 2025-06-12 | Improving Medical Visual Representation Learning with Pathological-level Cross-Modal Alignment and Correlation Exploration | Jun Wang et.al. | 2506.10573 | null |
| 2025-06-12 | FSATFusion: Frequency-Spatial Attention Transformer for Infrared and Visible Image Fusion | Tianpei Zhang et.al. | 2506.10366 | link |
| 2025-06-11 | DySS: Dynamic Queries and State-Space Learning for Efficient 3D Object Detection from Multi-Camera Videos | Rajeev Yasarla et.al. | 2506.10242 | null |
| 2025-06-11 | CEM-FBGTinyDet: Context-Enhanced Foreground Balance with Gradient Tuning for tiny Objects | Tao Liu et.al. | 2506.09897 | null |
| 2025-06-11 | 3DGeoDet: General-purpose Geometry-aware Image-based 3D Object Detection | Yi Zhang et.al. | 2506.09541 | null |
| 2025-06-11 | MSSDF: Modality-Shared Self-supervised Distillation for High-Resolution Multi-modal Remote Sensing Image Learning | Tong Wang et.al. | 2506.09327 | null |
| 2025-06-10 | Efficient Edge Deployment of Quantized YOLOv4-Tiny for Aerial Emergency Object Detection on Raspberry Pi 5 | Sindhu Boddu et.al. | 2506.09300 | null |
| 2025-06-10 | Lightweight Object Detection Using Quantized YOLOv4-Tiny for Emergency Response in Aerial Imagery | Sindhu Boddu et.al. | 2506.09299 | null |
| 2025-06-10 | WD-DETR: Wavelet Denoising-Enhanced Real-Time Object Detection Transformer for Robot Perception with Event Cameras | Yangjie Cui et.al. | 2506.09098 | null |
| 2025-06-11 | Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models | Xuanchi Ren et.al. | 2506.09042 | link |
| 2025-06-10 | ADAM: Autonomous Discovery and Annotation Model using LLMs for Context-Aware Annotations | Amirreza Rouhi et.al. | 2506.08968 | null |
| 2025-06-10 | Data Augmentation For Small Object using Fast AutoAugment | DaeEun Yoon et.al. | 2506.08956 | null |
| 2025-06-11 | Gaussian2Scene: 3D Scene Representation Learning via Self-supervised Learning with 3D Gaussian Splatting | Keyi Liu et.al. | 2506.08777 | null |
| 2025-06-09 | CrosswalkNet: An Optimized Deep Learning Framework for Pedestrian Crosswalk Detection in Aerial Images with High-Performance Computing | Zubin Bhuyan et.al. | 2506.07885 | null |
| 2025-06-09 | SAM2Auto: Auto Annotation Using FLASH | Arash Rocky et.al. | 2506.07850 | null |
| 2025-06-09 | Design and Evaluation of Deep Learning-Based Dual-Spectrum Image Fusion Methods | Beining Xu et.al. | 2506.07779 | null |
| 2025-06-09 | SpikeSMOKE: Spiking Neural Networks for Monocular 3D Object Detection with Cross-Scale Gated Coding | Xuemei Chen et.al. | 2506.07737 | null |
| 2025-06-09 | Domain Randomization for Object Detection in Manufacturing Applications using Synthetic Data: A Comprehensive Study | Xiaomeng Zhu et.al. | 2506.07539 | null |
| 2025-06-09 | SpatialLM: Training Large Language Models for Structured Indoor Modeling | Yongsen Mao et.al. | 2506.07491 | link |
| 2025-06-09 | Happiness Finder: Exploring the Role of AI in Enhancing Well-Being During Four-Leaf Clover Searches | Anna Yokokubo et.al. | 2506.07393 | null |
| 2025-06-09 | Multiple Object Stitching for Unsupervised Representation Learning | Chengchao Shen et.al. | 2506.07364 | link |
| 2025-06-09 | CBAM-STN-TPS-YOLO: Enhancing Agricultural Object Detection through Spatially Adaptive Attention Mechanisms | Satvik Praveen et.al. | 2506.07357 | null |
| 2025-06-08 | UCOD-DPL: Unsupervised Camouflaged Object Detection via Dynamic Pseudo-label Learning | Weiqi Yan et.al. | 2506.07087 | null |
| 2025-06-06 | Domain-RAG: Retrieval-Guided Compositional Image Generation for Cross-Domain Few-Shot Object Detection | Yu Li et.al. | 2506.05872 | null |
| 2025-06-06 | Token Transforming: A Unified and Training-Free Token Compression Framework for Vision Transformer Acceleration | Fanhu Zeng et.al. | 2506.05709 | null |
| 2025-06-06 | Integer Binary-Range Alignment Neuron for Spiking Neural Networks | Binghao Ye et.al. | 2506.05679 | null |
| 2025-06-05 | CL-ISR: A Contrastive Learning and Implicit Stance Reasoning Framework for Misleading Text Detection on Social Media | Tianyi Huang et.al. | 2506.05107 | null |
| 2025-06-05 | Synthetic Dataset Generation for Autonomous Mobile Robots Using 3D Gaussian Splatting for Vision Training | Aneesh Deogan et.al. | 2506.05092 | null |
| 2025-06-06 | Bridging Annotation Gaps: Transferring Labels to Align Object Detection Datasets | Mikhail Kennerley et.al. | 2506.04737 | null |
| 2025-06-05 | Gen-n-Val: Agentic Image Data Generation and Validation | Jing-En Huang et.al. | 2506.04676 | null |
| 2025-06-05 | VoxDet: Rethinking 3D Semantic Occupancy Prediction as Dense Object Detection | Wuyang Li et.al. | 2506.04623 | null |
| 2025-06-04 | FALO: Fast and Accurate LiDAR 3D Object Detection on Resource-Constrained Devices | Shizhong Han et.al. | 2506.04499 | null |
| 2025-06-04 | Neural Object Detection for 4D STEM: High-Throughput Sub-Pixel Electron Diffraction Pattern Recognition | Arda Genc et.al. | 2506.04477 | null |
| 2025-06-04 | Diffusion Domain Teacher: Diffusion Guided Domain Adaptive Object Detector | Boyong He et.al. | 2506.04211 | link |
| 2025-06-04 | FSHNet: Fully Sparse Hybrid Network for 3D Object Detection | Shuai Liu et.al. | 2506.03714 | null |
| 2025-06-04 | How PARTs assemble into wholes: Learning the relative composition of images | Melika Ayoughi et.al. | 2506.03682 | null |
| 2025-06-05 | MambaNeXt-YOLO: A Hybrid State Space Model for Real-time Object Detection | Xiaochun Lei et.al. | 2506.03654 | null |
| 2025-06-04 | DiagNet: Detecting Objects using Diagonal Constraints on Adjacency Matrix of Graph Neural Network | Chong Hyun Lee et.al. | 2506.03571 | null |
| 2025-06-03 | SportMamba: Adaptive Non-Linear Multi-Object Tracking with State Space Models for Team Sports | Dheeraj Khanna et.al. | 2506.03335 | null |
| 2025-06-03 | Simulate Any Radar: Attribute-Controllable Radar Simulation via Waveform Parameter Embedding | Weiqing Xiao et.al. | 2506.03134 | link |
| 2025-06-03 | HACo-Det: A Study Towards Fine-Grained Machine-Generated Text Detection under Human-AI Coauthoring | Zhixiong Su et.al. | 2506.02959 | null |
| 2025-06-03 | Towards Auto-Annotation from Annotation Guidelines: A Benchmark through 3D LiDAR Detection | Yechi Ma et.al. | 2506.02914 | null |
| 2025-06-03 | A Dynamic Transformer Network for Vehicle Detection | Chunwei Tian et.al. | 2506.02765 | null |
| 2025-06-03 | Open-PMC-18M: A High-Fidelity Large Scale Medical Dataset for Multimodal Representation Learning | Negin Baghbanzadeh et.al. | 2506.02738 | null |
| 2025-06-03 | GeneA-SLAM2: Dynamic SLAM with AutoEncoder-Preprocessed Genetic Keypoints Resampling and Depth Variance-Guided Dynamic Region Removal | Shufan Qing et.al. | 2506.02736 | link |
| 2025-06-03 | Sight Guide: A Wearable Assistive Perception and Navigation System for the Vision Assistance Race in the Cybathlon 2024 | Patrick Pfreundschuh et.al. | 2506.02676 | null |
| 2025-06-03 | Probabilistic Online Event Downsampling | Andreu Girbau-Xalabarder et.al. | 2506.02547 | null |
| 2025-06-03 | Efficient Test-time Adaptive Object Detection via Sensitivity-Guided Pruning | Kunyu Wang et.al. | 2506.02462 | null |
| 2025-06-03 | Auto-Labeling Data for Object Detection | Brent A. Griffin et.al. | 2506.02359 | null |
| 2025-05-30 | Stress-testing Machine Generated Text Detection: Shifting Language Models Writing Style to Fool Detectors | Andrea Pedrotti et.al. | 2505.24523 | link |
| 2025-05-30 | Deformable Attention Mechanisms Applied to Object Detection, case of Remote Sensing | Anasse Boutayeb et.al. | 2505.24489 | null |
| 2025-05-30 | Leadership Assessment in Pediatric Intensive Care Unit Team Training | Liangyang Ouyang et.al. | 2505.24389 | null |
| 2025-05-30 | D2AF: A Dual-Driven Annotation and Filtering Framework for Visual Grounding | Yichi Zhang et.al. | 2505.24372 | null |
| 2025-05-29 | Conformal Object Detection by Sequential Risk Control | Léo Andéol et.al. | 2505.24038 | null |
| 2025-05-29 | Rooms from Motion: Un-posed Indoor 3D Object Detection as Localization and Mapping | Justin Lazarow et.al. | 2505.23756 | null |
| 2025-05-29 | Boosting Domain Incremental Learning: Selecting the Optimal Parameters is All You Need | Qiang Wang et.al. | 2505.23744 | link |
| 2025-05-29 | FMG-Det: Foundation Model Guided Robust Object Detection | Darryl Hannan et.al. | 2505.23726 | null |
| 2025-05-29 | CF-DETR: Coarse-to-Fine Transformer for Real-Time Object Detection | Woojin Shin et.al. | 2505.23317 | null |
| 2025-05-30 | WTEFNet: Real-Time Low-Light Object Detection for Advanced Driver Assistance Systems | Hao Wu et.al. | 2505.23201 | null |
| 2025-05-29 | Language-guided Learning for Object Detection Tackling Multiple Variations in Aerial Images | Sungjune Park et.al. | 2505.23193 | null |
| 2025-05-29 | DIP-R1: Deep Inspection and Perception with RL Looking Through and Understanding Complex Scenes | Sungjune Park et.al. | 2505.23179 | null |
| 2025-05-29 | The Meeseeks Mesh: Spatially Consistent 3D Adversarial Objects for BEV Detector | Aixuan Li et.al. | 2505.22499 | null |
| 2025-05-28 | VME: A Satellite Imagery Dataset and Benchmark for Detecting Vehicles in the Middle East and Beyond | Noora Al-Emadi et.al. | 2505.22353 | link |
| 2025-05-28 | Task-Driven Implicit Representations for Automated Design of LiDAR Systems | Nikhil Behari et.al. | 2505.22344 | null |
| 2025-05-29 | YH-MINER: Multimodal Intelligent System for Natural Ecological Reef Metric Extraction | Mingzhuang Wang et.al. | 2505.22250 | null |
| 2025-05-28 | S2AFormer: Strip Self-Attention for Efficient Vision Transformer | Guoan Xu et.al. | 2505.22195 | null |
| 2025-05-28 | Learning A Robust RGB-Thermal Detector for Extreme Modality Imbalance | Chao Tian et.al. | 2505.22154 | null |
| 2025-05-28 | Prototype Embedding Optimization for Human-Object Interaction Detection in Livestreaming | Menghui Zhang et.al. | 2505.22011 | null |
| 2025-05-28 | Cross-DINO: Cross the Deep MLP and Transformer for Small Object Detection | Guiping Cao et.al. | 2505.21868 | null |
| 2025-05-27 | Object Concepts Emerge from Motion | Haoqian Liang et.al. | 2505.21635 | null |
| 2025-05-27 | Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO | Muzhi Zhu et.al. | 2505.21457 | link |
| 2025-05-27 | Visual Product Graph: Bridging Visual Products And Composite Images For End-to-End Style Recommendations | Yue Li Du et.al. | 2505.21454 | null |
| 2025-05-27 | YOLO-SPCI: Enhancing Remote Sensing Object Detection via Selective-Perspective-Class Integration | Xinyuan Wang et.al. | 2505.21370 | null |
| 2025-05-27 | Assured Autonomy with Neuro-Symbolic Perception | R. Spencer Hallyburton et.al. | 2505.21322 | null |
| 2025-05-27 | Robust Video-Based Pothole Detection and Area Estimation for Intelligent Vehicles with Depth Map and Kalman Smoothing | Dehao Wang et.al. | 2505.21049 | null |
| 2025-05-27 | Facial Attribute Based Text Guided Face Anonymization | Mustafa İzzet Muştu et.al. | 2505.21002 | null |
| 2025-05-27 | YOLO-FireAD: Efficient Fire Detection via Attention-Guided Inverted Residual Learning and Dual-Pooling Feature Preservation | Weichao Pan et.al. | 2505.20884 | null |
| 2025-05-27 | Open-Det: An Efficient Learning Framework for Open-Ended Detection | Guiping Cao et.al. | 2505.20639 | null |
| 2025-05-27 | Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models | Peter Robicheaux et.al. | 2505.20612 | null |
| 2025-05-26 | From Data to Modeling: Fully Open-vocabulary Scene Graph Generation | Zuyao Chen et.al. | 2505.20106 | null |
| 2025-05-26 | Target Tracking via LiDAR-RADAR Sensor Fusion for Autonomous Racing | Marcello Cellina et.al. | 2505.20043 | null |
| 2025-05-26 | Underwater Diffusion Attention Network with Contrastive Language-Image Joint Learning for Underwater Image Enhancement | Afrah Shaahid et.al. | 2505.19895 | null |
| 2025-05-26 | ADD-SLAM: Adaptive Dynamic Dense SLAM with Gaussian Splatting | Wenhua Wu et.al. | 2505.19420 | null |
| 2025-05-26 | Neural nanophotonic object detector with ultra-wide field-of-view | Ji Chen et.al. | 2505.19379 | null |
| 2025-05-25 | What do Blind and Low-Vision People Really Want from Assistive Smart Devices? Comparison of the Literature with a Focus Study | Bhanuka Gamage et.al. | 2505.19325 | null |
| 2025-05-25 | VL-SAM-V2: Open-World Object Detection with General and Specific Query Fusion | Zhiwei Lin et.al. | 2505.18986 | null |
| 2025-05-24 | Mitigating Context Bias in Domain Adaptation for Object Detection using Mask Pooling | Hojun Son et.al. | 2505.18446 | null |
| 2025-05-23 | Sampling Strategies for Efficient Training of Deep Learning Object Detection Algorithms | Gefei Shen et.al. | 2505.18302 | null |
| 2025-05-23 | One RL to See Them All: Visual Triple Unified Reinforcement Learning | Yan Ma et.al. | 2505.18129 | link |
| 2025-05-23 | SemSegBench & DetecBench: Benchmarking Reliability and Generalization Beyond Classification | Shashank Agnihotri et.al. | 2505.18015 | null |
| 2025-05-23 | RQR3D: Reparametrizing the regression targets for BEV-based 3D object detection | Ozsel Kilinc et.al. | 2505.17732 | null |
| 2025-05-23 | Adaptive Semantic Token Communication for Transformer-based Edge Inference | Alessio Devoto et.al. | 2505.17604 | null |
| 2025-05-23 | Distance Estimation in Outdoor Driving Environments Using Phase-only Correlation Method with Event Cameras | Masataka Kobayashi et.al. | 2505.17582 | null |
| 2025-05-23 | OrionBench: A Benchmark for Chart and Human-Recognizable Object Detection in Infographics | Jiangning Zhu et.al. | 2505.17473 | null |
| 2025-05-23 | Reflectance Prediction-based Knowledge Distillation for Robust 3D Object Detection in Compressed Point Clouds | Hao Jing et.al. | 2505.17442 | null |
| 2025-05-23 | Optimizing YOLOv8 for Parking Space Detection: Comparative Analysis of Custom YOLOv8 Architecture | Apar Pokhrel et.al. | 2505.17364 | null |
| 2025-05-22 | Extending Dataset Pruning to Object Detection: A Variance-based Approach | Ryota Yagi et.al. | 2505.17245 | null |
| 2025-05-22 | Semi-Supervised State-Space Model with Dynamic Stacking Filter for Real-World Video Deraining | Shangquan Sun et.al. | 2505.16811 | null |
| 2025-05-22 | Robust Vision-Based Runway Detection through Conformal Prediction and Conformal mAP | Alya Zouzou et.al. | 2505.16740 | link |
| 2025-05-22 | CodeMerge: Codebook-Guided Model Merging for Robust Test-Time Adaptation in Autonomous Driving | Huitong Yang et.al. | 2505.16524 | null |
| 2025-05-22 | MAFE R-CNN: Selecting More Samples to Learn Category-aware Features for Small Object Detection | Yichen Li et.al. | 2505.16442 | null |
| 2025-05-22 | AdvReal: Adversarial Patch Generation Framework with Application to Adversarial Safety Evaluation of Object Detection Systems | Yuanhao Huang et.al. | 2505.16402 | link |
| 2025-05-22 | Self-Classification Enhancement and Correction for Weakly Supervised Object Detection | Yufei Yin et.al. | 2505.16294 | null |
| 2025-05-21 | MIKU-PAL: An Automated and Standardized Multi-Modal Method for Speech Paralinguistic and Affect Labeling | Cheng Yifan et.al. | 2505.15772 | null |
| 2025-05-21 | The Devil is in Fine-tuning and Long-tailed Problems: A New Benchmark for Scene Text Detection | Tianjiao Cao et.al. | 2505.15649 | link |
| 2025-05-21 | SNAP: A Benchmark for Testing the Effects of Capture Conditions on Fundamental Vision Tasks | Iuliia Kotseruba et.al. | 2505.15628 | link |
| 2025-05-21 | Detection of Underwater Multi-Targets Based on Self-Supervised Learning and Deformable Path Aggregation Feature Pyramid Network | Chang Liu et.al. | 2505.15518 | null |
| 2025-05-21 | Trends and Challenges in Authorship Analysis: A Review of ML, DL, and LLM Approaches | Nudrat Habib et.al. | 2505.15422 | null |
| 2025-05-21 | RAZER: Robust Accelerated Zero-Shot 3D Open-Vocabulary Panoptic Reconstruction with Spatio-Temporal Aggregation | Naman Patel et.al. | 2505.15373 | null |
| 2025-05-21 | AGENT-X: Adaptive Guideline-based Expert Network for Threshold-free AI-generated teXt detection | Jiatao Li et.al. | 2505.15261 | null |
| 2025-05-21 | Multispectral Detection Transformer with Infrared-Centric Sensor Fusion | Seongmin Hwang et.al. | 2505.15137 | null |
| 2025-05-20 | Colors Matter: AI-Driven Exploration of Human Feature Colors | Rama Alyoubi et.al. | 2505.14931 | link |
| 2025-05-20 | Language Models Optimized to Fool Detectors Still Have a Distinct Style (And How to Change It) | Rafael Rivera Soto et.al. | 2505.14608 | null |
| 2025-05-20 | SCAN: Semantic Document Layout Analysis for Textual and Visual Retrieval-Augmented Generation | Yuyang Dong et.al. | 2505.14381 | null |
| 2025-05-20 | FAID: Fine-grained AI-generated Text Detection using Multi-task Auxiliary and Multi-level Contrastive Learning | Minh Ngoc Ta et.al. | 2505.14271 | null |
| 2025-05-20 | Decoupling Classifier for Boosting Few-shot Object Detection and Instance Segmentation | Bin-Bin Gao et.al. | 2505.14239 | null |
| 2025-05-20 | Intra-class Patch Swap for Self-Distillation | Hongjun Choi et.al. | 2505.14124 | link |
| 2025-05-20 | Scaling Vision Mamba Across Resolutions via Fractal Traversal | Bo Li et.al. | 2505.14062 | null |
| 2025-05-20 | Automated Quality Evaluation of Cervical Cytopathology Whole Slide Images Based on Content Analysis | Lanlan Kang et.al. | 2505.13875 | null |
| 2025-05-20 | Safety2Drive: Safety-Critical Scenario Benchmark for the Evaluation of Autonomous Driving | Jingzheng Li et.al. | 2505.13872 | null |
| 2025-05-20 | Domain Gating Ensemble Networks for AI-Generated Text Detection | Arihant Tripathi et.al. | 2505.13855 | null |
| 2025-05-20 | A Challenge to Build Neuro-Symbolic Video Agents | Sahil Shah et.al. | 2505.13851 | null |
| 2025-05-19 | Dynamic Graph Induced Contour-aware Heat Conduction Network for Event-based Object Detection | Xiao Wang et.al. | 2505.12908 | link |
| 2025-05-19 | Rethinking Features-Fused-Pyramid-Neck for Object Detection | Hulin Li et.al. | 2505.12820 | link |
| 2025-05-19 | Enhancing Transformers Through Conditioned Embedded Tokens | Hemanth Saratchandran et.al. | 2505.12789 | null |
| 2025-05-19 | LiDAR MOT-DETR: A LiDAR-based Two-Stage Transformer for 3D Multiple Object Tracking | Martha Teiko Teye et.al. | 2505.12753 | null |
| 2025-05-19 | VLC Fusion: Vision-Language Conditioned Sensor Fusion for Robust Object Detection | Aditya Taparia et.al. | 2505.12715 | null |
| 2025-05-18 | LM$^2$otifs: An Explainable Framework for Machine-Generated Texts Detection | Xu Zheng et.al. | 2505.12507 | null |
| 2025-05-17 | EarthSynth: Generating Informative Earth Observation with Diffusion Models | Jiancheng Pan et.al. | 2505.12108 | null |
| 2025-05-17 | Experimental Study on Automatically Assembling Custom Catering Packages With a 3-DOF Delta Robot Using Deep Learning Methods | Reihaneh Yourdkhani et.al. | 2505.11879 | null |
| 2025-05-16 | Improving Object Detection Performance through YOLOv8: A Comprehensive Training and Evaluation Study | Rana Poureskandar et.al. | 2505.11424 | null |
| 2025-05-16 | MTevent: A Multi-Task Event Camera Dataset for 6D Pose Estimation and Moving Object Detection | Shrutarv Awasthi et.al. | 2505.11282 | null |
| 2025-05-16 | M4-SAR: A Multi-Resolution, Multi-Polarization, Multi-Scene, Multi-Source Dataset and Benchmark for Optical-SAR Fusion Object Detection | Chao Wang et.al. | 2505.10931 | null |
| 2025-05-16 | A High-Performance Thermal Infrared Object Detection Framework with Centralized Regulation | Jinke Li et.al. | 2505.10825 | null |
| 2025-05-15 | StoryReasoning Dataset: Using Chain-of-Thought for Scene Understanding and Grounded Story Generation | Daniel A. P. Oliveira et.al. | 2505.10292 | link |
| 2025-05-15 | Defect Detection in Photolithographic Patterns Using Deep Learning Models Trained on Synthetic Data | Prashant P. Shinde et.al. | 2505.10192 | null |
| 2025-05-15 | Application of YOLOv8 in monocular downward multiple Car Target detection | Shijie Lyu et.al. | 2505.10016 | null |
| 2025-05-14 | EdgeAI Drone for Autonomous Construction Site Demonstrator | Emre Girgin et.al. | 2505.09837 | link |
| 2025-05-14 | WhatsAI: Transforming Meta Ray-Bans into an Extensible Generative AI Platform for Accessibility | Nasif Zaman et.al. | 2505.09823 | null |
| 2025-05-14 | MoRAL: Motion-aware Multi-Frame 4D Radar and LiDAR Fusion for Robust 3D Object Detection | Xiangyuan Peng et.al. | 2505.09422 | null |
| 2025-05-14 | A drone that learns to efficiently find objects in agricultural fields: from simulation to the real world | Rick van Essen et.al. | 2505.09278 | null |
| 2025-05-14 | DRRNet: Macro-Micro Feature Fusion and Dual Reverse Refinement for Camouflaged Object Detection | Jianlin Sun et.al. | 2505.09168 | link |
| 2025-05-14 | Beyond General Prompts: Automated Prompt Refinement using Contrastive Class Alignment Scores for Disambiguating Objects in Vision-Language Models | Lucas Choi et.al. | 2505.09139 | null |
| 2025-05-14 | Promoting SAM for Camouflaged Object Detection via Selective Key Point-based Guidance | Guoying Liang et.al. | 2505.09123 | null |
| 2025-05-13 | Robustness Analysis against Adversarial Patch Attacks in Fully Unmanned Stores | Hyunsik Na et.al. | 2505.08835 | null |
| 2025-05-13 | Augmented Reality for RObots (ARRO): Pointing Visuomotor Policies Towards Visual Robustness | Reihaneh Mirjalili et.al. | 2505.08627 | null |
| 2025-05-14 | Thermal Detection of People with Mobility Restrictions for Barrier Reduction at Traffic Lights Controlled Intersections | Xiao Ni et.al. | 2505.08568 | link |
| 2025-05-13 | MDF: Multi-Modal Data Fusion with CNN-Based Object Detection for Enhanced Indoor Localization Using LiDAR-SLAM | Saqi Hussain Kalan et.al. | 2505.08388 | null |
| 2025-05-13 | HMPNet: A Feature Aggregation Architecture for Maritime Object Detection from a Shipborne Perspective | Yu Zhang et.al. | 2505.08231 | link |
| 2025-05-13 | Object detection in adverse weather conditions for autonomous vehicles using Instruct Pix2Pix | Unai Gurbindo et.al. | 2505.08228 | null |
| 2025-05-13 | MoKD: Multi-Task Optimization for Knowledge Distillation | Zeeshan Hayder et.al. | 2505.08170 | null |
| 2025-05-12 | LAMM-ViT: AI Face Detection via Layer-Aware Modulation of Region-Guided Attention | Jiangling Zhang et.al. | 2505.07734 | null |
| 2025-05-12 | Hybrid Spiking Vision Transformer for Object Detection with Event Cameras | Qi Xu et.al. | 2505.07715 | null |
| 2025-05-12 | Self-Supervised Event Representations: Towards Accurate, Real-Time Perception on SoC FPGAs | Kamil Jeziorek et.al. | 2505.07556 | null |
| 2025-05-12 | Automated Visual Attention Detection using Mobile Eye Tracking in Behavioral Classroom Studies | Efe Bozkir et.al. | 2505.07552 | null |
| 2025-05-12 | DepthFusion: Depth-Aware Hybrid Feature Fusion for LiDAR-Camera 3D Object Detection | Mingqian Ji et.al. | 2505.07398 | null |
| 2025-05-12 | Language-Driven Dual Style Mixing for Single-Domain Generalized Object Detection | Hongda Qin et.al. | 2505.07219 | link |
| 2025-05-11 | Differentiable NMS via Sinkhorn Matching for End-to-End Fabric Defect Detection | Zhengyang Lu et.al. | 2505.07040 | null |
| 2025-05-11 | VALISENS: A Validated Innovative Multi-Sensor System for Cooperative Automated Driving | Lei Wan et.al. | 2505.06980 | null |
| 2025-05-10 | M3CAD: Towards Generic Cooperative Autonomous Driving Benchmark | Morui Zhu et.al. | 2505.06746 | null |
| 2025-05-10 | Underwater object detection in sonar imagery with detection transformer and Zero-shot neural architecture search | XiaoTong Gu et.al. | 2505.06694 | null |
| 2025-05-09 | Camera-Only Bird’s Eye View Perception: A Neural Approach to LiDAR-Free Environmental Mapping for Autonomous Vehicles | Anupkumar Bochare et.al. | 2505.06113 | null |
| 2025-05-09 | Artificial intelligence pioneers the double-strangeness factory | Yan He et.al. | 2505.05802 | null |
| 2025-05-09 | Dome-DETR: DETR with Density-Oriented Feature-Query Manipulation for Efficient Tiny Object Detection | Zhangchi Hu et.al. | 2505.05741 | null |
| 2025-05-09 | DiGIT: Multi-Dilated Gated Encoder and Central-Adjacent Region Integrated Decoder for Temporal Action Detection Transformer | Ho-Joong Kim et.al. | 2505.05711 | link |
| 2025-05-08 | PillarMamba: Learning Local-Global Context for Roadside Point Cloud via Hybrid State Space Model | Zhang Zhang et.al. | 2505.05397 | null |
| 2025-05-08 | PaniCar: Securing the Perception of Advanced Driving Assistance Systems Against Emergency Vehicle Lighting | Elad Feldman et.al. | 2505.05183 | null |
| 2025-05-08 | Reliably Bounding False Positives: A Zero-Shot Machine-Generated Text Detection Framework via Multiscaled Conformal Prediction | Xiaowei Zhu et.al. | 2505.05084 | null |
| 2025-05-08 | FG-CLIP: Fine-Grained Visual and Textual Alignment | Chunyu Xie et.al. | 2505.05071 | null |
| 2025-05-08 | A Simple Detector with Frame Dynamics is a Strong Tracker | Chenxu Peng et.al. | 2505.04917 | null |
| 2025-05-08 | Mix-QSAM: Mixed-Precision Quantization of the Segment Anything Model | Navin Ranjan et.al. | 2505.04861 | null |
| 2025-05-07 | Lightweight RGB-D Salient Object Detection from a Speed-Accuracy Tradeoff Perspective | Songsong Duan et.al. | 2505.04758 | null |
| 2025-05-07 | Hyb-KAN ViT: Hybrid Kolmogorov-Arnold Networks Augmented Vision Transformer | Sainath Dey et.al. | 2505.04740 | null |
| 2025-05-08 | MonoCoP: Chain-of-Prediction for Monocular 3D Object Detection | Zhihao Zhang et.al. | 2505.04594 | null |
| 2025-05-07 | Edge-GPU Based Face Tracking for Face Detection and Recognition Acceleration | Asma Baobaid et.al. | 2505.04524 | null |
| 2025-05-07 | Leveraging Simultaneous Usage of Edge GPU Hardware Engines for Video Face Detection and Recognition | Asma Baobaid et.al. | 2505.04502 | null |
| 2025-05-07 | DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception | Junjie Wang et.al. | 2505.04410 | null |
| 2025-05-06 | LogisticsVLN: Vision-Language Navigation For Low-Altitude Terminal Delivery Based on Agentic UAVs | Xinyuan Zhang et.al. | 2505.03460 | null |
| 2025-05-06 | Robustness in AI-Generated Detection: Enhancing Resistance to Adversarial Attacks | Sun Haoxuan et.al. | 2505.03435 | null |
| 2025-05-06 | From Word to Sentence: A Large-Scale Multi-Instance Dataset for Open-Set Aerial Detection | Guoting Wei et.al. | 2505.03334 | null |
| 2025-05-06 | VISLIX: An XAI Framework for Validating Vision Models with Slice Discovery and Analysis | Xinyuan Yan et.al. | 2505.03132 | null |
| 2025-05-05 | Sim2Real Transfer for Vision-Based Grasp Verification | Pau Amargant et.al. | 2505.03046 | link |
| 2025-05-05 | DPNet: Dynamic Pooling Network for Tiny Object Detection | Luqi Gong et.al. | 2505.02797 | null |
| 2025-05-05 | RGBX-DiffusionDet: A Framework for Multi-Modal RGB-X Object Detection Using DiffusionDet | Eliraz Orfaig et.al. | 2505.02586 | null |
| 2025-05-05 | Point Cloud Recombination: Systematic Real Data Augmentation Using Robotic Targets for LiDAR Perception Validation | Hubert Padusinski et.al. | 2505.02476 | null |
| 2025-05-04 | Robust AI-Generated Face Detection with Imbalanced Data | Yamini Sri Krubha et.al. | 2505.02182 | link |
| 2025-05-04 | Transforming faces into video stories – VideoFace2.0 | Branko Brkljač et.al. | 2505.02060 | null |
| 2025-05-03 | DriveNetBench: An Affordable and Configurable Single-Camera Benchmarking System for Autonomous Driving Networks | Ali Al-Bustami et.al. | 2505.01893 | link |
| 2025-05-03 | OODTE: A Differential Testing Engine for the ONNX Optimizer | Nikolaos Louloudakis et.al. | 2505.01892 | null |
| 2025-05-03 | CMAWRNet: Multiple Adverse Weather Removal via a Unified Quaternion Neural Architecture | Vladimir Frants et.al. | 2505.01882 | null |
| 2025-05-03 | DualDiff: Dual-branch Diffusion Model for Autonomous Driving with Semantic Fusion | Haoteng Li et.al. | 2505.01857 | null |
| 2025-05-03 | Toward Onboard AI-Enabled Solutions to Space Object Detection for Space Sustainability | Wenxuan Zhang et.al. | 2505.01650 | null |
| 2025-05-02 | Efficient Vision-based Vehicle Speed Estimation | Andrej Macko et.al. | 2505.01203 | null |
| 2025-05-02 | CDFormer: Cross-Domain Few-Shot Object Detection Transformer Against Feature Confusion | Boyuan Meng et.al. | 2505.00938 | null |
| 2025-05-01 | Efficient On-Chip Implementation of 4D Radar-Based 3D Object Detection on Hailo-8L | Woong-Chan Byun et.al. | 2505.00757 | null |
| 2025-05-03 | Vision Mamba in Remote Sensing: A Comprehensive Survey of Techniques, Applications and Outlook | Muyi Bao et.al. | 2505.00630 | null |
| 2025-05-01 | Visual Trajectory Prediction of Vessels for Inland Navigation | Alexander Puzicha et.al. | 2505.00599 | null |
| 2025-05-01 | Synthesizing and Identifying Noise Levels in Autonomous Vehicle Camera Radar Datasets | Mathis Morales et.al. | 2505.00584 | null |
| 2025-05-01 | X-ray illicit object detection using hybrid CNN-transformer neural network architectures | Jorgen Cani et.al. | 2505.00564 | null |
| 2025-05-01 | A Robust Deep Networks based Multi-Object MultiCamera Tracking System for City Scale Traffic | Muhammad Imran Zaman et.al. | 2505.00534 | null |
| 2025-05-01 | Inconsistency-based Active Learning for LiDAR Object Detection | Esteban Rivera et.al. | 2505.00511 | null |
| 2025-05-01 | HeAL3D: Heuristical-enhanced Active Learning for 3D Object Detection | Esteban Rivera et.al. | 2505.00507 | null |
| 2025-05-01 | Quaternion Wavelet-Conditioned Diffusion Models for Image Super-Resolution | Luigi Sigillo et.al. | 2505.00334 | null |
| 2025-04-30 | V3LMA: Visual 3D-enhanced Language Model for Autonomous Driving | Jannik Lübberstedt et.al. | 2505.00156 | null |
| 2025-04-30 | LLM-Empowered Embodied Agent for Memory-Augmented Task Planning in Household Robotics | Marc Glocker et.al. | 2504.21716 | null |
| 2025-04-30 | Visual Text Processing: A Comprehensive Review and Unified Evaluation | Yan Shu et.al. | 2504.21682 | null |
| 2025-04-29 | T2ID-CAS: Diffusion Model and Class Aware Sampling to Mitigate Class Imbalance in Neck Ultrasound Anatomical Landmark Detection | Manikanta Varaganti et.al. | 2504.21231 | null |
| 2025-04-29 | FLIM-based Salient Object Detection Networks with Adaptive Decoders | Gilson Junior Soares et.al. | 2504.20872 | null |
| 2025-04-29 | A Survey on Event-based Optical Marker Systems | Nafiseh Jabbari Tofighi et.al. | 2504.20736 | null |
| 2025-04-29 | Purifying, Labeling, and Utilizing: A High-Quality Pipeline for Small Object Detection | Siwei Wang et.al. | 2504.20602 | null |
| 2025-04-29 | Style-Adaptive Detection Transformer for Single-Source Domain Generalized Object Detection | Jianhong Han et.al. | 2504.20498 | null |
| 2025-04-28 | More Clear, More Flexible, More Precise: A Comprehensive Oriented Object Detection benchmark for UAV | Kai Ye et.al. | 2504.20032 | null |
| 2025-04-28 | Lossy Source Coding with Focal Loss | Alex Dytso et.al. | 2504.19913 | null |
| 2025-04-28 | Neural network task specialization via domain constraining | Roman Malashin et.al. | 2504.19592 | null |
| 2025-04-28 | GMAR: Gradient-Driven Multi-Head Attention Rollout for Vision Transformer Interpretability | Sehyeong Jo et.al. | 2504.19414 | null |
| 2025-04-27 | Improving Small Drone Detection Through Multi-Scale Processing and Data Augmentation | Rayson Laroca et.al. | 2504.19347 | null |
| 2025-04-27 | ODExAI: A Comprehensive Object Detection Explainable AI Evaluation | Loc Phuc Truong Nguyen et.al. | 2504.19249 | null |
| 2025-04-27 | Boosting Single-domain Generalized Object Detection via Vision-Language Knowledge Interaction | Xiaoran Xu et.al. | 2504.19086 | null |
| 2025-04-26 | Federated Learning-based Semantic Segmentation for Lane and Object Detection in Autonomous Driving | Gharbi Khamis Alshammari et.al. | 2504.18939 | null |
| 2025-04-25 | Dream-Box: Object-wise Outlier Generation for Out-of-Distribution Detection | Brian K. S. Isaac-Medina et.al. | 2504.18746 | null |
| 2025-04-25 | A Review of 3D Object Detection with Vision-Language Models | Ranjan Sapkota et.al. | 2504.18738 | null |
| 2025-04-25 | Examining the Impact of Optical Aberrations to Image Classification and Object Detection Models | Patrick Müller et.al. | 2504.18510 | null |
| 2025-04-25 | Iterative Event-based Motion Segmentation by Variational Contrast Maximization | Ryo Yamaki et.al. | 2504.18447 | null |
| 2025-04-25 | A Multimodal Hybrid Late-Cascade Fusion Network for Enhanced 3D Object Detection | Carlo Sgaravatti et.al. | 2504.18419 | null |
| 2025-04-25 | A comprehensive review of classifier probability calibration metrics | Richard Oliver Lane et.al. | 2504.18278 | null |
| 2025-04-25 | LiDAR-Guided Monocular 3D Object Detection for Long-Range Railway Monitoring | Raul David Dominguez Sanchez et.al. | 2504.18203 | null |
| 2025-04-25 | Multi-Grained Compositional Visual Clue Learning for Image Intent Recognition | Yin Tang et.al. | 2504.18201 | null |
| 2025-04-25 | E-InMeMo: Enhanced Prompting for Visual In-Context Learning | Jiahao Zhang et.al. | 2504.18158 | null |
| 2025-04-25 | MASF-YOLO: An Improved YOLOv11 Network for Small Object Detection on Drone View | Liugang Lu et.al. | 2504.18136 | null |
| 2025-04-25 | Opportunistic Collaborative Planning with Large Vision Model Guided Control and Joint Query-Service Optimization | Jiayi Chen et.al. | 2504.18057 | null |
| 2025-04-25 | Direct sampling method to retrieve small objects from two-dimensional limited-aperture scattered field data | Won-Kwang Park et.al. | 2504.18036 | null |
| 2025-04-24 | DIVE: Inverting Conditional Diffusion Models for Discriminative Tasks | Yinqi Li et.al. | 2504.17253 | link |
| 2025-04-24 | Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation | Phillip Y. Lee et.al. | 2504.17207 | null |
| 2025-04-24 | AUTHENTICATION: Identifying Rare Failure Modes in Autonomous Vehicle Perception Systems using Adversarially Guided Diffusion Models | Mohammad Zarei et.al. | 2504.17179 | null |
| 2025-04-23 | Scene-Aware Location Modeling for Data Augmentation in Automotive Object Detection | Jens Petersen et.al. | 2504.17076 | null |
| 2025-04-23 | Gaussian Splatting is an Effective Data Generator for 3D Object Detection | Farhad G. Zanjani et.al. | 2504.16740 | null |
| 2025-04-23 | EHGCN: Hierarchical Euclidean-Hyperbolic Fusion via Motion-Aware GCN for Hybrid Event Stream Perception | Haosheng Chen et.al. | 2504.16616 | null |
| 2025-04-23 | Beyond Anonymization: Object Scrubbing for Privacy-Preserving 2D and 3D Vision Tasks | Murat Bilgehan Ertan et.al. | 2504.16557 | null |
| 2025-04-23 | Assessing the Feasibility of Internet-Sourced Video for Automatic Cattle Lameness Detection | Md Fahimuzzman Sohan et.al. | 2504.16404 | null |
| 2025-04-23 | Revisiting Radar Camera Alignment by Contrastive Learning for 3D Object Detection | Linhua Kong et.al. | 2504.16368 | null |
| 2025-04-22 | Vision Controlled Orthotic Hand Exoskeleton | Connor Blais et.al. | 2504.16319 | null |
| 2025-04-22 | $\pi_{0.5}$: a Vision-Language-Action Model with Open-World Generalization | Physical Intelligence et.al. | 2504.16054 | null |
| 2025-04-22 | SAGA: Semantic-Aware Gray color Augmentation for Visible-to-Thermal Domain Adaptation across Multi-View Drone and Ground-Based Vision Systems | Manjunath D et.al. | 2504.15728 | null |
| 2025-04-22 | You Sense Only Once Beneath: Ultra-Light Real-Time Underwater Object Detection | Jun Dong et.al. | 2504.15694 | null |
| 2025-04-22 | A Vision-Enabled Prosthetic Hand for Children with Upper Limb Disabilities | Md Abdul Baset Sarker et.al. | 2504.15654 | null |
| 2025-04-21 | Context Aware Grounded Teacher for Source Free Object Detection | Tajamul Ashraf et.al. | 2504.15404 | null |
| 2025-04-21 | SuoiAI: Building a Dataset for Aquatic Invertebrates in Vietnam | Tue Vo et.al. | 2504.15252 | null |
| 2025-04-21 | An Efficient Aerial Image Detection with Variable Receptive Fields | Liu Wenbin et.al. | 2504.15165 | null |
| 2025-04-19 | Balancing Privacy and Action Performance: A Penalty-Driven Approach to Image Anonymization | Nazia Aslam et.al. | 2504.14301 | null |
| 2025-04-19 | Visual Consensus Prompting for Co-Salient Object Detection | Jie Wang et.al. | 2504.14254 | link |
| 2025-04-18 | Feature Alignment and Representation Transfer in Knowledge Distillation for Large Language Models | Junjie Yang et.al. | 2504.13825 | null |
| 2025-04-18 | Lightweight LiDAR-Camera 3D Dynamic Object Detection and Multi-Class Trajectory Prediction | Yushen He et.al. | 2504.13647 | link |
| 2025-04-18 | DenSe-AdViT: A novel Vision Transformer for Dense SAR Object Detection | Yang Zhang et.al. | 2504.13638 | null |
| 2025-04-18 | HMPE: HeatMap Embedding for Efficient Transformer-Based Small Object Detection | YangChen Zeng et.al. | 2504.13469 | null |
| 2025-04-18 | Towards a Multi-Agent Vision-Language System for Zero-Shot Novel Hazardous Object Detection for Autonomous Driving Safety | Shashank Shriram et.al. | 2504.13399 | link |
| 2025-04-17 | VLLFL: A Vision-Language Model Based Lightweight Federated Learning Framework for Smart Agriculture | Long Li et.al. | 2504.13365 | null |
| 2025-04-17 | SAR Object Detection with Self-Supervised Pretraining and Curriculum-Aware Sampling | Yasin Almalioglu et.al. | 2504.13310 | null |
| 2025-04-17 | Weak Cube R-CNN: Weakly Supervised 3D Detection using only 2D Bounding Boxes | Andreas Lau Hansen et.al. | 2504.13297 | null |
| 2025-04-17 | RF-DETR Object Detection vs YOLOv12: A Study of Transformer-based and CNN-based Architectures for Single-Class and Multi-Class Greenfruit Detection in Complex Orchard Environments Under Label Ambiguity | Ranjan Sapkota et.al. | 2504.13099 | null |
| 2025-04-17 | Self-Supervised Pre-training with Combined Datasets for 3D Perception in Autonomous Driving | Shumin Wang et.al. | 2504.12709 | null |
| 2025-04-18 | RoPETR: Improving Temporal Camera-Only 3D Detection by Integrating Enhanced Rotary Position Embedding | Hang Ji et.al. | 2504.12643 | null |
| 2025-04-16 | Towards a General-Purpose Zero-Shot Synthetic Low-Light Image and Video Pipeline | Joanne Lin et.al. | 2504.12169 | null |
| 2025-04-16 | RADLER: Radar Object Detection Leveraging Semantic 3D City Models and Self-Supervised Radar-Image Learning | Yuan Luo et.al. | 2504.12167 | null |
| 2025-04-16 | pix2pockets: Shot Suggestions in 8-Ball Pool from a Single Image in the Wild | Jonas Myhre Schiøtt et.al. | 2504.12045 | null |
| 2025-04-16 | A Review of YOLOv12: Attention-Based Enhancements vs. Previous Versions | Rahima Khanam et.al. | 2504.11995 | null |
| 2025-04-16 | Multimodal Spatio-temporal Graph Learning for Alignment-free RGBT Video Object Detection | Qishun Wang et.al. | 2504.11779 | null |
| 2025-04-15 | Multi-level Cellular Automata for FLIM networks | Felipe Crispim Salvagnini et.al. | 2504.11406 | null |
| 2025-04-15 | OpenTuringBench: An Open-Model-based Benchmark and Framework for Machine-Generated Text Detection and Attribution | Lucio La Cava et.al. | 2504.11369 | null |
| 2025-04-15 | CFIS-YOLO: A Lightweight Multi-Scale Fusion Network for Edge-Deployable Wood Defect Detection | Jincheng Kang et.al. | 2504.11305 | null |
| 2025-04-15 | TSAL: Few-shot Text Segmentation Based on Attribute Learning | Chenming Li et.al. | 2504.11164 | null |
| 2025-04-15 | Flyweight FLIM Networks for Salient Object Detection in Biomedical Images | Leonardo M. Joao et.al. | 2504.11112 | null |
| 2025-04-15 | S$^2$Teacher: Step-by-step Teacher for Sparsely Annotated Oriented Object Detection | Yu Lin et.al. | 2504.11111 | null |
| 2025-04-15 | DRIFT open dataset: A drone-derived intelligence for traffic analysis in urban environment | Hyejin Lee et.al. | 2504.11019 | null |
| 2025-04-16 | GATE3D: Generalized Attention-based Task-synergized Estimation in 3D | Eunsoo Im et.al. | 2504.11014 | null |
| 2025-04-15 | CDUPatch: Color-Driven Universal Adversarial Patch Attack for Dual-Modal Visible-Infrared Detectors | Jiahuan Long et.al. | 2504.10888 | null |
| 2025-04-15 | Safe-Construct: Redefining Construction Safety Violation Recognition as 3D Multi-View Engagement Task | Aviral Chharia et.al. | 2504.10880 | null |
| 2025-04-14 | DiffMOD: Progressive Diffusion Point Denoising for Moving Object Detection in Remote Sensing | Jinyue Zhang et.al. | 2504.10278 | null |
| 2025-04-14 | Balancing Stability and Plasticity in Pretrained Detector: A Dual-Path Framework for Incremental Object Detection | Songze Li et.al. | 2504.10214 | null |
| 2025-04-14 | WildLive: Near Real-time Visual Wildlife Tracking onboard UAVs | Nguyen Ngoc Dat et.al. | 2504.10165 | null |
| 2025-04-14 | COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts | Jiansheng Li et.al. | 2504.10158 | null |
| 2025-04-14 | SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting | Dongliang Luo et.al. | 2504.09966 | null |
| 2025-04-14 | Small Object Detection with YOLO: A Performance Analysis Across Model Versions and Hardware | Muhammad Fasih Tariq et.al. | 2504.09900 | null |
| 2025-04-14 | Density-based Object Detection in Crowded Scenes | Chenyang Zhao et.al. | 2504.09819 | null |
| 2025-04-13 | Uncertainty Guided Refinement for Fine-Grained Salient Object Detection | Yao Yuan et.al. | 2504.09666 | link |
| 2025-04-13 | Pillar-Voxel Fusion Network for 3D Object Detection in Airborne Hyperspectral Point Clouds | Yanze Jiang et.al. | 2504.09506 | null |
| 2025-04-13 | Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation | Yongchao Feng et.al. | 2504.09480 | null |
| 2025-04-11 | TinyCenterSpeed: Efficient Center-Based Object Detection for Autonomous Racing | Neil Reichlin et.al. | 2504.08655 | null |
| 2025-04-11 | Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization | Jialu Li et.al. | 2504.08641 | null |
| 2025-04-10 | Enhanced Cooperative Perception Through Asynchronous Vehicle to Infrastructure Framework with Delay Mitigation for Connected and Automated Vehicles | Nithish Kumar Saravanan et.al. | 2504.08172 | null |
| 2025-04-10 | Multi-Task Learning with Multi-Annotation Triplet Loss for Improved Object Detection | Meilun Zhou et.al. | 2504.08054 | null |
| 2025-04-10 | Detect Anything 3D in the Wild | Hanxue Zhang et.al. | 2504.07958 | null |
| 2025-04-11 | Pychop: Emulating Low-Precision Arithmetic in Numerical Methods and Neural Networks | Erin Carson et.al. | 2504.07835 | null |
| 2025-04-10 | P2Object: Single Point Supervised Object Detection and Instance Segmentation | Pengfei Chen et.al. | 2504.07813 | null |
| 2025-04-10 | Nonlocal Retinex-Based Variational Model and its Deep Unfolding Twin for Low-Light Image Enhancement | Daniel Torres et.al. | 2504.07810 | null |
| 2025-04-10 | Adaptive Detection of Fast Moving Celestial Objects Using a Mixture of Experts and Physical-Inspired Neural Network | Peng Jia et.al. | 2504.07777 | null |
| 2025-04-10 | Prediction of Usage Probabilities of Shopping-Mall Corridors Using Heterogeneous Graph Neural Networks | Malik M Barakathullah et.al. | 2504.07645 | null |
| 2025-04-10 | VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model | Haozhan Shen et.al. | 2504.07615 | link |
| 2025-04-10 | RASMD: RGB And SWIR Multispectral Driving Dataset for Robust Perception in Adverse Conditions | Youngwan Jin et.al. | 2504.07603 | null |
| 2025-04-10 | WS-DETR: Robust Water Surface Object Detection through Vision-Radar Fusion with Detection Transformer | Huilin Yin et.al. | 2504.07441 | null |
| 2025-04-10 | Model Discrepancy Learning: Synthetic Faces Detection Based on Multi-Reconstruction | Qingchao Jiang et.al. | 2504.07382 | link |
| 2025-04-09 | Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection | Ruoyu Chen et.al. | 2504.07060 | null |
| 2025-04-09 | UAV Position Estimation using a LiDAR-based 3D Object Detection Method | Uthman Olawoye et.al. | 2504.07028 | null |
| 2025-04-09 | Towards Efficient Roadside LiDAR Deployment: A Fast Surrogate Metric Based on Entropy-Guided Visibility | Yuze Jiang et.al. | 2504.06772 | null |
| 2025-04-09 | Domain-Conditioned Scene Graphs for State-Grounded Task Planning | Jonas Herzog et.al. | 2504.06661 | null |
| 2025-04-09 | Visually Similar Pair Alignment for Robust Cross-Domain Object Detection | Onkar Krishna et.al. | 2504.06607 | null |
| 2025-04-08 | From Broadcast to Minimap: Achieving State-of-the-Art SoccerNet Game State Reconstruction | Vladimir Golovkin et.al. | 2504.06357 | null |
| 2025-04-08 | Analyzing the Impact of Low-Rank Adaptation for Cross-Domain Few-Shot Object Detection in Aerial Images | Hicham Talaoubrid et.al. | 2504.06330 | null |
| 2025-04-08 | Security Analysis of Thumbnail-Preserving Image Encryption and a New Framework | Dong Xie et.al. | 2504.06083 | null |
| 2025-04-08 | Balancing long- and short-term dynamics for the modeling of saliency in videos | Theodor Wulff et.al. | 2504.05913 | null |
| 2025-04-08 | PRIMEDrive-CoT: A Precognitive Chain-of-Thought Framework for Uncertainty-Aware Object Interaction in Driving Scene Scenario | Sriram Mandalika et.al. | 2504.05908 | null |
| 2025-04-08 | Intrinsic Saliency Guided Trunk-Collateral Network for Unsupervised Video Object Segmentation | Xiangyu Zheng et.al. | 2504.05904 | null |
| 2025-04-08 | KAN-SAM: Kolmogorov-Arnold Network Guided Segment Anything Model for RGB-T Salient Object Detection | Xingyuan Li et.al. | 2504.05878 | null |
| 2025-04-08 | DefMamba: Deformable Visual State Space Model | Leiye Liu et.al. | 2504.05794 | null |
| 2025-04-08 | Event-based Civil Infrastructure Visual Defect Detection: ev-CIVIL Dataset and Benchmark | Udayanga G. W. K. N. Gamage et.al. | 2504.05679 | null |
| 2025-04-08 | POD: Predictive Object Detection with Single-Frame FMCW LiDAR Point Cloud | Yining Shi et.al. | 2504.05649 | null |
| 2025-04-08 | AD-Det: Boosting Object Detection in UAV Images with Focused Small Objects and Balanced Tail Classes | Zhenteng Li et.al. | 2504.05601 | null |
| 2025-04-07 | SSLFusion: Scale & Space Aligned Latent Fusion Model for Multimodal 3D Object Detection | Bonan Ding et.al. | 2504.05170 | null |
| 2025-04-07 | Inland Waterway Object Detection in Multi-environment: Dataset and Approach | Shanshan Wang et.al. | 2504.04835 | null |
| 2025-04-07 | Playing Non-Embedded Card-Based Games with Reinforcement Learning | Tianyang Wu et.al. | 2504.04783 | null |
| 2025-04-07 | Feedback-Enhanced Hallucination-Resistant Vision-Language Model for Real-Time Scene Understanding | Zahir Alsulaimawi et.al. | 2504.04772 | null |
| 2025-04-07 | Inverse++: Vision-Centric 3D Semantic Occupancy Prediction Assisted with 3D Object Detection | Zhenxing Ming et.al. | 2504.04732 | null |
| 2025-04-06 | Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection | Jiancheng Pan et.al. | 2504.04517 | link |
| 2025-04-06 | eKalibr-Stereo: Continuous-Time Spatiotemporal Calibration for Event-Based Stereo Visual Systems | Shuolong Chen et.al. | 2504.04451 | link |
| 2025-04-05 | Autoregressive High-Order Finite Difference Modulo Imaging: High-Dynamic Range for Computer Vision Applications | Brayan Monroy et.al. | 2504.04228 | null |
| 2025-04-05 | An Optimized Density-Based Lane Keeping System for A Cost-Efficient Autonomous Vehicle Platform: AurigaBot V1 | Farbod Younesi et.al. | 2504.04217 | null |
| 2025-04-05 | Learning about the Physical World through Analytic Concepts | Jianhua Sun et.al. | 2504.04170 | null |
| 2025-04-04 | VISTA-OCR: Towards generative and interactive end to end OCR models | Laziz Hamdi et.al. | 2504.03621 | null |
| 2025-04-04 | PF3Det: A Prompted Foundation Feature Assisted Visual LiDAR 3D Detector | Kaidong Li et.al. | 2504.03563 | null |
| 2025-04-04 | ZFusion: An Effective Fuser of Camera and 4D Radar for 3D Object Perception in Autonomous Driving | Sheng Yang et.al. | 2504.03438 | null |
| 2025-04-04 | Infrared bubble recognition in the Milky Way and beyond using deep learning | Shimpei Nishimoto et.al. | 2504.03367 | null |
| 2025-04-04 | Real-Time Roadway Obstacle Detection for Electric Scooters Using Deep Learning and Multi-Sensor Fusion | Zeyang Zheng et.al. | 2504.03171 | null |
| 2025-04-04 | Finding the Reflection Point: Unpadding Images to Remove Data Augmentation Artifacts in Large Open Source Image Datasets for Machine Learning | Lucas Choi et.al. | 2504.03168 | null |
| 2025-04-03 | Attention-Aware Multi-View Pedestrian Tracking | Reef Alturki et.al. | 2504.03047 | null |
| 2025-04-03 | LiDAR-based Object Detection with Real-time Voice Specifications | Anurag Kulkarni et.al. | 2504.02920 | null |
| 2025-04-03 | BOP Challenge 2024 on Model-Based and Model-Free 6D Object Pose Estimation | Van Nguyen Nguyen et.al. | 2504.02812 | null |
| 2025-04-03 | Rip Current Segmentation: A Novel Benchmark and YOLOv8 Baseline Results | Andrei Dumitriu et.al. | 2504.02558 | null |
| 2025-04-03 | Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision | Xiaofeng Han et.al. | 2504.02477 | null |
| 2025-04-03 | CornerPoint3D: Look at the Nearest Corner Instead of the Center | Ruixiao Zhang et.al. | 2504.02464 | null |
| 2025-04-03 | Hyperspectral Remote Sensing Images Salient Object Detection: The First Benchmark Dataset and Baseline | Peifu Liu et.al. | 2504.02416 | null |
| 2025-04-03 | SemiISP/SemiIE: Semi-Supervised Image Signal Processor and Image Enhancement Leveraging One-to-Many Mapping sRGB-to-RAW | Masakazu Yoshimura et.al. | 2504.02345 | null |
| 2025-04-03 | Improving Harmful Text Detection with Joint Retrieval and External Knowledge | Zidong Yu et.al. | 2504.02310 | null |
| 2025-04-03 | LLM-Guided Evolution: An Autonomous Model Optimization for Object Detection | YiMing Yu et.al. | 2504.02280 | null |
| 2025-04-02 | Cat-Eye Inspired Active-Passive-Composite Aperture-Shared Sub-Terahertz Meta-Imager for Non-Interactive Concealed Object Detection | Mingshuang Hu et.al. | 2504.01473 | null |
| 2025-04-02 | CFMD: Dynamic Cross-layer Feature Fusion for Salient Object Detection | Jin Lian et.al. | 2504.01326 | null |
| 2025-04-01 | Enabling Efficient Processing of Spiking Neural Networks with On-Chip Learning on Commodity Neuromorphic Processors for Edge AI Systems | Rachmad Vidya Wicaksana Putra et.al. | 2504.00957 | null |
| 2025-04-01 | NeuRadar: Neural Radiance Fields for Automotive Radar Point Clouds | Mahan Rafidashti et.al. | 2504.00859 | null |
| 2025-04-01 | AttentiveGRU: Recurrent Spatio-Temporal Modeling for Advanced Radar-Based BEV Object Detection | Loveneet Saini et.al. | 2504.00559 | null |
| 2025-04-01 | Archival Faces: Detection of Faces in Digitized Historical Documents | Marek Vaško et.al. | 2504.00558 | null |
| 2025-04-01 | High-Quality Pseudo-Label Generation Based on Visual Prompt Assisted Cloud Model Update | Xinrun Xu et.al. | 2504.00526 | null |
| 2025-04-01 | Intrinsic-feature-guided 3D Object Detection | Wanjing Zhang et.al. | 2504.00382 | null |
| 2025-04-01 | CamoSAM2: Motion-Appearance Induced Auto-Refining Prompts for Video Camouflaged Object Detection | Xin Zhang et.al. | 2504.00375 | null |
| 2025-03-31 | Towards Precise Action Spotting: Addressing Temporal Misalignment in Labels with Dynamic Label Assignment | Masato Tamura et.al. | 2504.00149 | null |
| 2025-03-31 | SU-YOLO: Spiking Neural Network for Efficient Underwater Object Detection | Chenyang Li et.al. | 2503.24389 | link |
| 2025-03-31 | MB-ORES: A Multi-Branch Object Reasoner for Visual Grounding in Remote Sensing | Karim Radouane et.al. | 2503.24219 | link |
| 2025-03-31 | Spectral-Adaptive Modulation Networks for Visual Perception | Guhnoo Yun et.al. | 2503.23947 | null |
| 2025-03-31 | Reliable Traffic Monitoring Using Low-Cost Doppler Radar Units | Mishay Naidoo et.al. | 2503.23926 | null |
| 2025-03-31 | Expanding-and-Shrinking Binary Neural Networks | Xulong Shi et.al. | 2503.23709 | link |
| 2025-03-30 | Beyond Detection: Designing AI-Resilient Assessments with Automated Feedback Tool to Foster Critical Thinking | Muhammad Sajjad Akbar et.al. | 2503.23622 | null |
| 2025-03-30 | Re-Aligning Language to Visual Objects with an Agentic Workflow | Yuming Chen et.al. | 2503.23508 | null |
| 2025-03-30 | EagleVision: Object-level Attribute Multimodal LLM for Remote Sensing | Hongxiang Jiang et.al. | 2503.23330 | null |
| 2025-03-29 | Context in object detection: a systematic literature review | Mahtab Jamali et.al. | 2503.23249 | null |
| 2025-03-29 | Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection | Marc-Antoine Lavoie et.al. | 2503.23220 | null |
| 2025-03-28 | AnnoPage Dataset: Dataset of Non-Textual Elements in Documents with Fine-Grained Categorization | Martin Kišš et.al. | 2503.22526 | null |
| 2025-03-28 | Data Quality Matters: Quantifying Image Quality Impact on Machine Learning Performance | Christian Steinhauser et.al. | 2503.22375 | null |
| 2025-03-28 | ForcePose: A Deep Learning Approach for Force Calculation Based on Action Recognition Using MediaPipe Pose Estimation Combined with Object Detection | Nandakishor M et.al. | 2503.22363 | null |
| 2025-03-28 | SKDU at De-Factify 4.0: Natural Language Features for AI-Generated Text-Detection | Shrikant Malviya et.al. | 2503.22338 | link |
| 2025-03-28 | Knowledge Rectification for Camouflaged Object Detection: Unlocking Insights from Low-Quality Data | Juwei Guan et.al. | 2503.22180 | null |
| 2025-03-28 | A Survey on Remote Sensing Foundation Models: From Vision to Multimodality | Ziyue Huang et.al. | 2503.22081 | null |
| 2025-03-27 | AGILE: A Diffusion-Based Attention-Guided Image and Label Translation for Efficient Cross-Domain Plant Trait Identification | Earl Ranario et.al. | 2503.22019 | null |
| 2025-03-27 | FACETS: Efficient Once-for-all Object Detection via Constrained Iterative Search | Tony Tran et.al. | 2503.21999 | null |
| 2025-03-27 | Exponentially Weighted Instance-Aware Repeat Factor Sampling for Long-Tailed Object Detection Model Training in Unmanned Aerial Vehicles Surveillance Scenarios | Taufiq Ahmed et.al. | 2503.21893 | null |
| 2025-03-27 | Learning Class Prototypes for Unified Sparse Supervised 3D Object Detection | Yun Zhu et.al. | 2503.21099 | link |
| 2025-03-26 | SaViD: Spectravista Aesthetic Vision Integration for Robust and Discerning 3D Object Detection in Challenging Environments | Tanmoy Dam et.al. | 2503.20614 | link |
| 2025-03-26 | Small Object Detection: A Comprehensive Survey on Challenges, Techniques and Real-World Applications | Mahya Nikouei et.al. | 2503.20516 | null |
| 2025-03-25 | Gemini Robotics: Bringing AI into the Physical World | Gemini Robotics Team et.al. | 2503.20020 | null |
| 2025-03-25 | Hyperdimensional Uncertainty Quantification for Multimodal Uncertainty Fusion in Autonomous Vehicles Perception | Luke Chen et.al. | 2503.20011 | null |
| 2025-03-25 | Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models | Ilias Stogiannidis et.al. | 2503.19707 | null |
| 2025-03-25 | BiblioPage: A Dataset of Scanned Title Pages for Bibliographic Metadata Extraction | Jan Kohút et.al. | 2503.19658 | null |
| 2025-03-25 | Single Shot AI-assisted quantification of KI-67 proliferation index in breast cancer | Deepti Madurai Muthu et.al. | 2503.19606 | null |
| 2025-03-25 | MATT-GS: Masked Attention-based 3DGS for Robot Perception and Object Detection | Jee Won Lee et.al. | 2503.19330 | null |
| 2025-03-25 | Multiscale Feature Importance-based Bit Allocation for End-to-End Feature Coding for Machines | Junle Liu et.al. | 2503.19278 | null |
| 2025-03-24 | Benchmarking Object Detectors under Real-World Distribution Shifts in Satellite Imagery | Sara Al-Emadi et.al. | 2503.19202 | null |
| 2025-03-24 | Pitch Contour Exploration Across Audio Domains: A Vision-Based Transfer Learning Approach | Jakob Abeßer et.al. | 2503.19161 | null |
| 2025-03-24 | Cooperative Control of Multi-Quadrotors for Transporting Cable-Suspended Payloads: Obstacle-Aware Planning and Event-Based Nonlinear Model Predictive Control | Tohid Kargar Tasooji et.al. | 2503.19135 | null |
| 2025-03-24 | Building Blocks for Robust and Effective Semi-Supervised Real-World Object Detection | Moussa Kassem Sbeyti et.al. | 2503.18903 | null |
| 2025-03-24 | LGI-DETR: Local-Global Interaction for UAV Object Detection | Zifa Chen et.al. | 2503.18785 | null |
| 2025-03-25 | Frequency Dynamic Convolution for Dense Image Prediction | Linwei Chen et.al. | 2503.18783 | null |
| 2025-03-24 | CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection | Zhichao Sun et.al. | 2503.18430 | null |
| 2025-03-24 | Vision-Guided Loco-Manipulation with a Snake Robot | Adarsh Salagame et.al. | 2503.18308 | null |
| 2025-03-23 | Extended Visibility of Autonomous Vehicles via Optimized Cooperative Perception under Imperfect Communication | Ahmad Sarlak et.al. | 2503.18192 | null |
| 2025-03-22 | MAMAT: 3D Mamba-Based Atmospheric Turbulence Removal and its Object Detection Capability | Paul Hill et.al. | 2503.17700 | null |
| 2025-03-22 | Sense4FL: Vehicular Crowdsensing Enhanced Federated Learning for Autonomous Driving | Yanan Ma et.al. | 2503.17697 | null |
| 2025-03-21 | Should we pre-train a decoder in contrastive learning for dense prediction tasks? | Sébastien Quetin et.al. | 2503.17526 | null |
| 2025-03-21 | Event-Based Crossing Dataset (EBCD) | Joey Mulé et.al. | 2503.17499 | null |
| 2025-03-21 | An Iterative Feedback Mechanism for Improving Natural Language Class Descriptions in Open-Vocabulary Object Detection | Louis Y. Kim et.al. | 2503.17285 | null |
| 2025-03-21 | Which2comm: An Efficient Collaborative Perception Framework for 3D Object Detection | Duanrui Yu et.al. | 2503.17175 | null |
| 2025-03-21 | Hi-ALPS – An Experimental Robustness Quantification of Six LiDAR-based Object Detection Systems for Autonomous Driving | Alexandra Arzberger et.al. | 2503.17168 | null |
| 2025-03-21 | R-LiViT: A LiDAR-Visual-Thermal Dataset Enabling Vulnerable Road User Focused Roadside Perception | Jonas Mirlach et.al. | 2503.17122 | null |
| 2025-03-21 | Exploring Few-Shot Object Detection on Blood Smear Images: A Case Study of Leukocytes and Schistocytes | Davide Antonio Mura et.al. | 2503.17107 | null |
| 2025-03-21 | R2LDM: An Efficient 4D Radar Super-Resolution Framework Leveraging Diffusion Model | Boyuan Zheng et.al. | 2503.17097 | null |
| 2025-03-21 | Superpowering Open-Vocabulary Object Detectors for X-ray Vision | Pablo Garcia-Fernandez et.al. | 2503.17071 | null |
| 2025-03-21 | Scoring, Remember, and Reference: Catching Camouflaged Objects in Videos | Yuang Feng et.al. | 2503.17050 | null |
| 2025-03-21 | Salient Object Detection in Traffic Scene through the TSOD10K Dataset | Yu Qiu et.al. | 2503.16910 | null |
| 2025-03-21 | Seg2Box: 3D Object Detection by Point-Wise Semantics Supervision | Maoji Zheng et.al. | 2503.16811 | null |
| 2025-03-20 | RESFL: An Uncertainty-Aware Framework for Responsible Federated Learning by Balancing Privacy, Fairness and Utility in Autonomous Vehicles | Dawood Wasif et.al. | 2503.16251 | null |
| 2025-03-20 | MapGlue: Multimodal Remote Sensing Image Matching | Peihao Wu et.al. | 2503.16185 | null |
| 2025-03-20 | Uncertainty Meets Diversity: A Comprehensive Active Learning Framework for Indoor 3D Object Detection | Jiangyi Wang et.al. | 2503.16125 | null |
| 2025-03-20 | Semantic-Guided Global-Local Collaborative Networks for Lightweight Image Super-Resolution | Wanshu Fan et.al. | 2503.16056 | null |
| 2025-03-19 | A Context-Driven Training-Free Network for Lightweight Scene Text Segmentation and Recognition | Ritabrata Chakraborty et.al. | 2503.15639 | null |
| 2025-03-19 | DCA: Dividing and Conquering Amnesia in Incremental Object Detection | Aoting Zhang et.al. | 2503.15295 | null |
| 2025-03-19 | Test-Time Backdoor Detection for Object Detection Models | Hangtao Zhang et.al. | 2503.15293 | null |
| 2025-03-19 | GO-N3RDet: Geometry Optimized NeRF-enhanced 3D Object Detector | Zechuan Li et.al. | 2503.15211 | null |
| 2025-03-19 | UltraFlwr – An Efficient Federated Medical and Surgical Object Detection Framework | Yang Li et.al. | 2503.15161 | null |
| 2025-03-19 | An Investigation of Beam Density on LiDAR Object Detection Performance | Christoph Griesbacher et.al. | 2503.15087 | null |
| 2025-03-19 | SPADE: Systematic Prompt Framework for Automated Dialogue Expansion in Machine-Generated Text Detection | Haoyi Li et.al. | 2503.15044 | null |
| 2025-03-19 | Fine-Grained Open-Vocabulary Object Detection with Fined-Grained Prompts: Task, Dataset and Benchmark | Ying Liu et.al. | 2503.14862 | null |
| 2025-03-19 | State Space Model Meets Transformer: A New Paradigm for 3D Object Detection | Chuxin Wang et.al. | 2503.14493 | null |
| 2025-03-18 | Panoramic Distortion-Aware Tokenization for Person Detection and Localization Using Transformers in Overhead Fisheye Images | Nobuhiko Wakai et.al. | 2503.14228 | null |
| 2025-03-18 | A Revisit to the Decoder for Camouflaged Object Detection | Seung Woo Ko et.al. | 2503.14035 | null |
| 2025-03-18 | Shift, Scale and Rotation Invariant Multiple Object Detection using Balanced Joint Transform Correlator | Xi Shen et.al. | 2503.14034 | null |
| 2025-03-18 | LEGNet: Lightweight Edge-Gaussian Driven Network for Low-Quality Remote Sensing Image Object Detection | Wei Lu et.al. | 2503.14012 | null |
| 2025-03-18 | FrustumFusionNets: A Three-Dimensional Object Detection Network Based on Tractor Road Scene | Lili Yang et.al. | 2503.13951 | null |
| 2025-03-18 | Is Discretization Fusion All You Need for Collaborative Perception? | Kang Yang et.al. | 2503.13946 | null |
| 2025-03-18 | PSA-SSL: Pose and Size-aware Self-Supervised Learning on LiDAR Point Clouds | Barza Nisar et.al. | 2503.13914 | null |
| 2025-03-18 | HSOD-BIT-V2: A New Challenging Benchmark for Hyperspectral Salient Object Detection | Yuhao Qiu et.al. | 2503.13906 | null |
| 2025-03-18 | TGBFormer: Transformer-GraphFormer Blender Network for Video Object Detection | Qiang Qi et.al. | 2503.13903 | null |
| 2025-03-17 | Beyond RGB: Adaptive Parallel Processing for RAW Object Detection | Shani Gamrian et.al. | 2503.13163 | null |
| 2025-03-17 | Who Wrote This? Identifying Machine vs Human-Generated Text in Hausa | Babangida Sani et.al. | 2503.13101 | null |
| 2025-03-17 | SparseAlign: A Fully Sparse Framework for Cooperative Object Detection | Yunshuang Yuan et.al. | 2503.12982 | null |
| 2025-03-17 | Efficient Multimodal 3D Object Detector via Instance-Level Contrastive Distillation | Zhuoqun Su et.al. | 2503.12914 | null |
| 2025-03-16 | Point Cloud Based Scene Segmentation: A Survey | Dan Halperin et.al. | 2503.12595 | null |
| 2025-03-16 | GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing | Zilun Zhang et.al. | 2503.12490 | null |
| 2025-03-16 | Deepfake Detection with Optimized Hybrid Model: EAR Biometric Descriptor via Improved RCNN | Ruchika Sharma et.al. | 2503.12381 | null |
| 2025-03-15 | An Efficient Deep Learning-Based Approach to Automating Invoice Document Validation | Aziz Amari et.al. | 2503.12267 | null |
| 2025-03-15 | Minuscule Cell Detection in AS-OCT Images with Progressive Field-of-View Focusing | Boyu Chen et.al. | 2503.12249 | null |
| 2025-03-15 | SFMNet: Sparse Focal Modulation for 3D Object Detection | Oren Shrout et.al. | 2503.12093 | null |
| 2025-03-14 | FLASHμ: Fast Localizing And Sizing of Holographic Microparticles | Ayush Paliwal et.al. | 2503.11538 | null |
| 2025-03-14 | Falcon: A Remote Sensing Vision-Language Foundation Model | Kelu Yao et.al. | 2503.11070 | null |
| 2025-03-14 | FMNet: Frequency-Assisted Mamba-Like Linear Attention Network for Camouflaged Object Detection | Ming Deng et.al. | 2503.11030 | null |
| 2025-03-14 | Comparative Analysis of Advanced AI-based Object Detection Models for Pavement Marking Quality Assessment during Daytime | Gian Antariksa et.al. | 2503.11008 | null |
| 2025-03-14 | Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection | Chuhan Zhang et.al. | 2503.11005 | null |
| 2025-03-14 | Enhanced Multi-View Pedestrian Detection Using Probabilistic Occupancy Volume | Reef Alturki et.al. | 2503.10982 | null |
| 2025-03-13 | The Power of One: A Single Example is All it Takes for Segmentation in VLMs | Mir Rayat Imtiaz Hossain et.al. | 2503.10779 | null |
| 2025-03-13 | HeightFormer: Learning Height Prediction in Voxel Features for Roadside Vision Centric 3D Object Detection via Transformer | Zhang Zhang et.al. | 2503.10777 | null |
| 2025-03-13 | Semantic-Supervised Spatial-Temporal Fusion for LiDAR-based 3D Object Detection | Chaoqun Wang et.al. | 2503.10579 | null |
| 2025-03-13 | RoCo-Sim: Enhancing Roadside Collaborative Perception through Foreground Simulation | Yuwen Du et.al. | 2503.10410 | link |
| 2025-03-13 | RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing | Fengxiang Wang et.al. | 2503.10392 | link |
| 2025-03-13 | Object detection characteristics in a learning factory environment using YOLOv8 | Toni Schneidereit et.al. | 2503.10356 | null |
| 2025-03-13 | TARS: Traffic-Aware Radar Scene Flow Estimation | Jialong Wu et.al. | 2503.10210 | null |
| 2025-03-13 | A Hierarchical Semantic Distillation Framework for Open-Vocabulary Object Detection | Shenghao Fu et.al. | 2503.10152 | link |
| 2025-03-13 | Deep Learning-Based Direct Leaf Area Estimation using Two RGBD Datasets for Model Development | Namal Jayasuriya et.al. | 2503.10129 | null |
| 2025-03-13 | Style Evolving along Chain-of-Thought for Unknown-Domain Object Detection | Zihao Zhang et.al. | 2503.09968 | null |
| 2025-03-12 | CleverDistiller: Simple and Spatially Consistent Cross-modal Distillation | Hariprasath Govindarajan et.al. | 2503.09878 | null |
| 2025-03-12 | How good are deep learning methods for automated road safety analysis using video data? An experimental study | Qingwu Liu et.al. | 2503.09807 | null |
| 2025-03-12 | Deep Learning for Climate Action: Computer Vision Analysis of Visual Narratives on X | Katharina Prasse et.al. | 2503.09361 | null |
| 2025-03-12 | Fully-Synthetic Training for Visual Quality Inspection in Automotive Production | Christoph Huber et.al. | 2503.09354 | null |
| 2025-03-12 | DitHub: A Modular Framework for Incremental Open-Vocabulary Object Detection | Chiara Cappellino et.al. | 2503.09271 | null |
| 2025-03-12 | Polygonizing Roof Segments from High-Resolution Aerial Images Using Yolov8-Based Edge Detection | Qipeng Mei et.al. | 2503.09187 | null |
| 2025-03-12 | RFUAV: A Benchmark Dataset for Unmanned Aerial Vehicle Detection and Identification | Rui Shi et.al. | 2503.09033 | link |
| 2025-03-12 | Dual-Domain Homogeneous Fusion with Cross-Modal Mamba and Progressive Decoder for 3D Object Detection | Xuzhong Hu et.al. | 2503.08992 | null |
| 2025-03-11 | GBlobs: Explicit Local Structure via Gaussian Blobs for Improved Cross-Domain LiDAR-based 3D Object Detection | Dušan Malić et.al. | 2503.08639 | null |
| 2025-03-11 | Referring to Any Person | Qing Jiang et.al. | 2503.08507 | link |
| 2025-03-11 | SuperCap: Multi-resolution Superpixel-based Image Captioning | Henry Senior et.al. | 2503.08496 | null |
| 2025-03-13 | Learning to Detect Objects from Multi-Agent LiDAR Scans without Manual Labels | Qiming Xia et.al. | 2503.08421 | null |
| 2025-03-11 | Embodied Crowd Counting | Runling Long et.al. | 2503.08367 | null |
| 2025-03-11 | Physics-based AI methodology for Material Parameter Extraction from Optical Data | M. Koumans et.al. | 2503.08183 | null |
| 2025-03-11 | Bring Remote Sensing Object Detect Into Nature Language Model: Using SFT Method | Fei Wang et.al. | 2503.08144 | null |
| 2025-03-11 | Accelerate 3D Object Detection Models via Zero-Shot Attention Key Pruning | Lizhen Xu et.al. | 2503.08101 | link |
| 2025-03-11 | SparseVoxFormer: Sparse Voxel-based Transformer for Multi-modal 3D Object Detection | Hyeongseok Son et.al. | 2503.08092 | null |
| 2025-03-11 | Simulating Automotive Radar with Lidar and Camera Inputs | Peili Song et.al. | 2503.08068 | null |
| 2025-03-10 | YOLOE: Real-Time Seeing Anything | Ao Wang et.al. | 2503.07465 | link |
| 2025-03-10 | HGO-YOLO: Advancing Anomaly Behavior Detection with Hierarchical Features and Lightweight Optimized Detection | Qizhi Zheng et.al. | 2503.07371 | null |
| 2025-03-10 | Mitigating Hallucinations in YOLO-based Object Detection Models: A Revisit to Out-of-Distribution Detection | Weicheng He et.al. | 2503.07330 | null |
| 2025-03-10 | Semantic Communications with Computer Vision Sensing for Edge Video Transmission | Yubo Peng et.al. | 2503.07252 | null |
| 2025-03-10 | MIRAM: Masked Image Reconstruction Across Multiple Scales for Breast Lesion Risk Prediction | Hung Q. Vo et.al. | 2503.07157 | null |
| 2025-03-10 | A Light Perspective for 3D Object Detection | Marcelo Eduardo Pederiva et.al. | 2503.07133 | null |
| 2025-03-10 | SimROD: A Simple Baseline for Raw Object Detection with Global and Local Enhancements | Haiyang Xie et.al. | 2503.07101 | link |
| 2025-03-10 | RS2V-L: Vehicle-Mounted LiDAR Data Generation from Roadside Sensor Observations | Ruidan Xing et.al. | 2503.07085 | null |
| 2025-03-10 | Availability-aware Sensor Fusion via Unified Canonical Space for 4D Radar, LiDAR, and Camera | Dong-Hee Paek et.al. | 2503.07029 | null |
| 2025-03-10 | Large Language Model Guided Progressive Feature Alignment for Multimodal UAV Object Detection | Wentao Wu et.al. | 2503.06948 | null |
| 2025-03-06 | Collaborative Evaluation of Deepfake Text with Deliberation-Enhancing Dialogue Systems | Jooyoung Lee et.al. | 2503.04945 | null |
| 2025-03-06 | Fine-Tuning Florence2 for Enhanced Object Detection in Un-constructed Environments: Vision-Language Model Approach | Soumyadeep Ro et.al. | 2503.04918 | null |
| 2025-03-06 | Floxels: Fast Unsupervised Voxel Based Scene Flow Estimation | David T. Hoffmann et.al. | 2503.04718 | null |
| 2025-03-06 | DEAL-YOLO: Drone-based Efficient Animal Localization using YOLO | Aditya Prashant Naidu et.al. | 2503.04698 | null |
| 2025-03-06 | Teach YOLO to Remember: A Self-Distillation Approach for Continual Object Detection | Riccardo De Monte et.al. | 2503.04688 | null |
| 2025-03-06 | ReynoldsFlow: Exquisite Flow Estimation via Reynolds Transport Theorem | Yu-Hsi Chen et.al. | 2503.04500 | link |
| 2025-03-06 | A lightweight model FDM-YOLO for small target improvement based on YOLOv8 | Xuerui Zhang et.al. | 2503.04452 | null |
| 2025-03-06 | Shaken, Not Stirred: A Novel Dataset for Visual Understanding of Glasses in Human-Robot Bartending Tasks | Lukáš Gajdošech et.al. | 2503.04308 | null |
| 2025-03-06 | CA-W3D: Leveraging Context-Aware Knowledge for Weakly Supervised Monocular 3D Detection | Chupeng Liu et.al. | 2503.04154 | null |
| 2025-03-06 | Robust Computer-Vision based Construction Site Detection for Assistive-Technology Applications | Junchi Feng et.al. | 2503.04139 | null |
| 2025-03-06 | Fractional Correspondence Framework in Detection Transformer | Masoumeh Zareapoor et.al. | 2503.04107 | null |
| 2025-03-05 | DualDiff+: Dual-Branch Diffusion for High-Fidelity Video Generation with Reward Guidance | Zhao Yang et.al. | 2503.03689 | link |
| 2025-03-05 | 4D Radar Ground Truth Augmentation with LiDAR-to-4D Radar Data Synthesis | Woo-Jin Jung et.al. | 2503.03637 | null |
| 2025-03-05 | Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders | Kristian Kuznetsov et.al. | 2503.03601 | null |
| 2025-03-05 | Simulation-Based Performance Evaluation of 3D Object Detection Methods with Deep Learning for a LiDAR Point Cloud Dataset in a SOTIF-related Use Case | Milin Patel et.al. | 2503.03548 | link |
| 2025-03-05 | AI-Driven Multi-Stage Computer Vision System for Defect Detection in Laser-Engraved Industrial Nameplates | Adhish Anitha Vilasan et.al. | 2503.03395 | null |
| 2025-03-05 | MIAdapt: Source-free Few-shot Domain Adaptive Object Detection for Microscopic Images | Nimra Dilawar et.al. | 2503.03370 | null |
| 2025-03-05 | Automated Attendee Recognition System for Large-Scale Social Events or Conference Gathering | Dhruv Motwani et.al. | 2503.03330 | null |
| 2025-03-05 | BEVMOSNet: Multimodal Fusion for BEV Moving Object Segmentation | Hiep Truong Cong et.al. | 2503.03280 | null |
| 2025-03-05 | Find Matching Faces Based On Face Parameters | Setu A. Bhatt et.al. | 2503.03204 | null |
| 2025-03-04 | Revolutionizing Traffic Management with AI-Powered Machine Vision: A Step Toward Smart Cities | Seyed Hossein Hosseini DolatAbadi et.al. | 2503.02967 | null |
| 2025-03-04 | Class-Aware PillarMix: Can Mixed Sample Data Augmentation Enhance 3D Object Detection with Radar Point Clouds? | Miao Zhang et.al. | 2503.02687 | null |
| 2025-03-04 | Exploring Model Quantization in GenAI-based Image Inpainting and Detection of Arable Plants | Sourav Modak et.al. | 2503.02420 | null |
| 2025-03-04 | Robust detection of overlapping bioacoustic sound events | Louis Mahon et.al. | 2503.02389 | null |
| 2025-03-04 | YOLO-PRO: Enhancing Instance-Specific Object Detection with Full-Channel Global Self-Attention | Lin Huang et.al. | 2503.02348 | null |
| 2025-03-04 | SSNet: Saliency Prior and State Space Model-based Network for Salient Object Detection in RGB-D Images | Gargi Panda et.al. | 2503.02270 | null |
| 2025-03-03 | Generalized Diffusion Detector: Mining Robust Features from Diffusion Models for Domain-Generalized Detection | Boyong He et.al. | 2503.02101 | null |
| 2025-03-03 | Uncertainty Representation in a SOTIF-Related Use Case with Dempster-Shafer Theory for LiDAR Sensor-Based Object Detection | Milin Patel et.al. | 2503.02087 | link |
| 2025-03-03 | Visual-RFT: Visual Reinforcement Fine-Tuning | Ziyu Liu et.al. | 2503.01785 | link |
| 2025-03-03 | Enhancing Object Detection Accuracy in Underwater Sonar Images through Deep Learning-based Denoising | Ziyu Wang et.al. | 2503.01655 | null |
| 2025-03-03 | Evaluating Stenosis Detection with Grounding DINO, YOLO, and DINO-DETR | Muhammad Musab Ansari et.al. | 2503.01601 | null |
| 2025-02-28 | The Common Objects Underwater (COU) Dataset for Robust Underwater Object Detection | Rishi Mukherjee et.al. | 2502.20651 | null |
| 2025-02-28 | RTGen: Real-Time Generative Detection Transformer | Chi Ruan et.al. | 2502.20622 | null |
| 2025-02-28 | LV-DOT: LiDAR-visual dynamic obstacle detection and tracking for autonomous robot navigation | Zhefan Xu et.al. | 2502.20607 | null |
| 2025-02-27 | Multi-Scale Neighborhood Occupancy Masked Autoencoder for Self-Supervised Learning in LiDAR Point Clouds | Mohamed Abdelsamad et.al. | 2502.20316 | null |
| 2025-02-27 | OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels | Meng Lou et.al. | 2502.20087 | link |
| 2025-02-27 | Night-Voyager: Consistent and Efficient Nocturnal Vision-Aided State Estimation in Object Maps | Tianxiao Gao et.al. | 2502.20054 | null |
| 2025-02-27 | Learning Mask Invariant Mutual Information for Masked Image Modeling | Tao Huang et.al. | 2502.19718 | null |
| 2025-02-27 | BEVDiffuser: Plug-and-Play Diffusion Model for BEV Denoising with Ground-Truth Guidance | Xin Ye et.al. | 2502.19694 | null |
| 2025-02-26 | Ev-3DOD: Pushing the Temporal Boundaries of 3D Object Detection with Event Cameras | Hoonhee Cho et.al. | 2502.19630 | link |
| 2025-02-26 | Is Your Paper Being Reviewed by an LLM? A New Benchmark Dataset and Approach for Detecting AI Text in Peer Review | Sungduk Yu et.al. | 2502.19614 | null |
| 2025-02-23 | Rewards-based image analysis in microscopy | Kamyar Barakati et.al. | 2502.18522 | null |
| 2025-02-25 | Multi-Perspective Data Augmentation for Few-shot Object Detection | Anh-Khoa Nguyen Vu et.al. | 2502.18195 | null |
| 2025-02-25 | Progressive Local Alignment for Medical Multimodal Pre-training | Huimin Yan et.al. | 2502.18047 | null |
| 2025-02-25 | Automatic Vehicle Detection using DETR: A Transformer-Based Approach for Navigating Treacherous Roads | Istiaq Ahmed Fahad et.al. | 2502.17843 | null |
| 2025-02-24 | Semi-Supervised Weed Detection in Vegetable Fields: In-domain and Cross-domain Experiments | Boyang Deng et.al. | 2502.17673 | null |
| 2025-02-24 | Experimental validation of UAV search and detection system in real wilderness environment | Stella Dumenčić et.al. | 2502.17372 | null |
| 2025-02-24 | LCV2I: Communication-Efficient and High-Performance Collaborative Perception Framework with Low-Resolution LiDAR | Xinxin Feng et.al. | 2502.17039 | null |
| 2025-02-24 | Sarang at DEFACTIFY 4.0: Detecting AI-Generated Text Using Noised Data and an Ensemble of DeBERTa Models | Avinash Trivedi et.al. | 2502.16857 | null |
| 2025-02-23 | Geometry-Aware 3D Salient Object Detection Network | Chen Wang et.al. | 2502.16488 | null |
| 2025-02-26 | MQADet: A Plug-and-Play Paradigm for Enhancing Open-Vocabulary Object Detection via Multimodal Question Answering | Caixiong Li et.al. | 2502.16486 | null |
| 2025-02-23 | Cross-domain Few-shot Object Detection with Multi-modal Textual Enrichment | Zeyu Shangguan et.al. | 2502.16469 | null |
| 2025-02-23 | Deep learning approaches to surgical video segmentation and object detection: A Scoping Review | Devanish N. Kamtam et.al. | 2502.16459 | null |
| 2025-02-22 | FeatSharp: Your Vision Model Features, Sharper | Mike Ranzinger et.al. | 2502.16025 | link |
| 2025-02-21 | Generative AI Framework for 3D Object Generation in Augmented Reality | Majid Behravan et.al. | 2502.15869 | null |
| 2025-02-21 | Machine-generated text detection prevents language model collapse | George Drayson et.al. | 2502.15654 | link |
| 2025-02-21 | Depth-aware Fusion Method based on Image and 4D Radar Spectrum for 3D Object Detection | Yue Sun et.al. | 2502.15516 | null |
| 2025-02-21 | Q-PETR: Quant-aware Position Embedding Transformation for Multi-View 3D Object Detection | Jiangyong Yu et.al. | 2502.15488 | null |
| 2025-02-21 | PFSD: A Multi-Modal Pedestrian-Focus Scene Dataset for Rich Tasks in Semi-Structured Environments | Yueting Liu et.al. | 2502.15342 | null |
| 2025-02-20 | Synth It Like KITTI: Synthetic Data Generation for Object Detection in Driving Scenarios | Richard Marcus et.al. | 2502.15076 | null |
| 2025-02-20 | YOLOv12: A Breakdown of the Key Architectural Features | Mujadded Al Rabbani Alif et.al. | 2502.14740 | null |
| 2025-02-20 | LXLv2: Enhanced LiDAR Excluded Lean 3D Object Detection with Fusion of 4D Radar and Camera | Weiyi Xiong et.al. | 2502.14503 | null |
| 2025-02-20 | ODVerse33: Is the New YOLO Version Always Better? A Multi Domain benchmark from YOLO v5 to v11 | Tianyou Jiang et.al. | 2502.14314 | null |
| 2025-02-19 | PedDet: Adaptive Spectral Optimization for Multimodal Pedestrian Detection | Rui Zhao et.al. | 2502.14063 | link |
| 2025-02-19 | Image compositing is all you need for data augmentation | Ang Jia Ning Shermaine et.al. | 2502.13936 | null |
| 2025-02-19 | MSVCOD: A Large-Scale Multi-Scene Dataset for Video Camouflage Object Detection | Shuyong Gao et.al. | 2502.13859 | null |
| 2025-02-19 | An Overall Real-Time Mechanism for Classification and Quality Evaluation of Rice | Wanke Xia et.al. | 2502.13764 | null |
| 2025-02-18 | Multiple Distribution Shift – Aerial (MDS-A): A Dataset for Test-Time Error Detection and Model Adaptation | Noel Ngu et.al. | 2502.13289 | null |
| 2025-02-18 | RobuRCDet: Enhancing Robustness of Radar-Camera Fusion in Bird’s Eye View for 3D Object Detection | Jingtong Yue et.al. | 2502.13071 | null |
| 2025-02-18 | Task-Oriented Semantic Communication for Stereo-Vision 3D Object Detection | Zijian Cao et.al. | 2502.12735 | null |
| 2025-02-18 | Iron Sharpens Iron: Defending Against Attacks in Machine-Generated Text Detection with Adversarial Training | Yuanfan Li et.al. | 2502.12734 | null |
| 2025-02-18 | DAMamba: Vision State Space Model with Dynamic Adaptive Scan | Tanzhe Li et.al. | 2502.12627 | null |
| 2025-02-18 | Who Writes What: Unveiling the Impact of Author Roles on AI-generated Text Detection | Jiatao Li et.al. | 2502.12611 | null |
| 2025-02-18 | Gaseous Object Detection | Kailai Zhou et.al. | 2502.12415 | null |
| 2025-02-17 | AI-generated Text Detection with a GLTR-based Approach | Lucía Yan Wu et.al. | 2502.12064 | null |
| 2025-02-17 | Enhancing Transparent Object Pose Estimation: A Fusion of GDR-Net and Edge Detection | Tessa Pulli et.al. | 2502.12027 | null |
| 2025-02-17 | ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability | Ryuto Koike et.al. | 2502.11336 | null |
| 2025-02-16 | DAViMNet: SSMs-Based Domain Adaptive Object Detection | A. Enes Doruk et.al. | 2502.11178 | null |
| 2025-02-15 | CLoCKDistill: Consistent Location-and-Context-aware Knowledge Distillation for DETRs | Qizhen Lan et.al. | 2502.10683 | null |
| 2025-02-14 | Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding | Wenxuan Guo et.al. | 2502.10392 | null |
| 2025-02-14 | Object Detection and Tracking | Md Pranto et.al. | 2502.10310 | null |
| 2025-02-14 | Artificial Intelligence to Assess Dental Findings from Panoramic Radiographs – A Multinational Study | Yin-Chih Chelsea Wang et.al. | 2502.10277 | null |
| 2025-02-13 | Instance Segmentation of Scene Sketches Using Natural Image Priors | Mia Tang et.al. | 2502.09608 | null |
| 2025-02-13 | Wholly-WOOD: Wholly Leveraging Diversified-quality Labels for Weakly-supervised Oriented Object Detection | Yi Yu et.al. | 2502.09471 | link |
| 2025-02-13 | Mitigating the Impact of Prominent Position Shift in Drone-based RGBT Object Detection | Yan Zhang et.al. | 2502.09311 | null |
| 2025-02-13 | Billet Number Recognition Based on Test-Time Adaptation | Yuan Wei et.al. | 2502.09026 | null |
| 2025-02-12 | Uncertainty Aware Human-machine Collaboration in Camouflaged Object Detection | Ziyue Yang et.al. | 2502.08373 | link |
| 2025-02-12 | Modification and Generated-Text Detection: Achieving Dual Detection Capabilities for the Outputs of LLM by Watermark | Yuhang Cai et.al. | 2502.08332 | null |
| 2025-02-12 | Plantation Monitoring Using Drone Images: A Dataset and Performance Review | Yashwanth Karumanchi et.al. | 2502.08233 | null |
| 2025-02-12 | Take What You Need: Flexible Multi-Task Semantic Communications with Channel Adaptation | Xiang Chen et.al. | 2502.08221 | null |
| 2025-02-13 | SARChat-Bench-2M: A Multi-Task Vision-Language Benchmark for SAR Image Interpretation | Zhiming Ma et.al. | 2502.08168 | null |
| 2025-02-12 | Knowledge Swapping via Learning and Unlearning | Mingyu Xing et.al. | 2502.08075 | null |
| 2025-02-11 | Visual-based spatial audio generation system for multi-speaker environments | Xiaojing Liu et.al. | 2502.07538 | null |
| 2025-02-11 | Quantitative Analysis of Objects in Prisoner Artworks | Thea Christoffersen et.al. | 2502.07440 | null |
| 2025-02-11 | Fast-COS: A Fast One-Stage Object Detector Based on Reparameterized Attention Vision Transformer for Autonomous Driving | Novendra Setyawan et.al. | 2502.07417 | null |
| 2025-02-11 | Multi-Task-oriented Nighttime Haze Imaging Enhancer for Vision-driven Measurement Systems | Ai Chen et.al. | 2502.07351 | link |
| 2025-02-11 | SparseFormer: Detecting Objects in HRW Shots via Sparse Vision Transformer | Wenxi Li et.al. | 2502.07216 | null |
| 2025-02-11 | Dense Object Detection Based on De-homogenized Queries | Yueming Huang et.al. | 2502.07194 | null |
| 2025-02-11 | Foreign-Object Detection in High-Voltage Transmission Line Based on Improved YOLOv8m | Zhenyue Wang et.al. | 2502.07175 | null |
| 2025-02-11 | A Survey on Mamba Architecture for Vision Applications | Fady Ibrahim et.al. | 2502.07161 | null |
| 2025-02-10 | Multimodal Search on a Line | Jared Coleman et.al. | 2502.07000 | null |
| 2025-02-10 | AgilePilot: DRL-Based Drone Agent for Real-Time Motion Planning in Dynamic Environments by Leveraging Object Detection | Roohan Ahmed Khan et.al. | 2502.06725 | null |
| 2025-02-10 | EdgeMLBalancer: A Self-Adaptive Approach for Dynamic Model Switching on Resource-Constrained Edge Devices | Akhila Matathammal et.al. | 2502.06493 | null |
| 2025-02-10 | PLATTER: A Page-Level Handwritten Text Recognition System for Indic Scripts | Badri Vishal Kasuba et.al. | 2502.06172 | null |
| 2025-02-10 | Enhancing Document Key Information Localization Through Data Augmentation | Yue Dai et.al. | 2502.06132 | null |
| 2025-02-10 | Improved YOLOv5s model for key components detection of power transmission lines | Chen Chen et.al. | 2502.06127 | null |
| 2025-02-10 | A Novel Multi-Teacher Knowledge Distillation for Real-Time Object Detection using 4D Radar | Seung-Hyun Song et.al. | 2502.06114 | null |
| 2025-02-09 | Training-free Anomaly Event Detection via LLM-guided Symbolic Pattern Discovery | Yuhui Zeng et.al. | 2502.05843 | null |
| 2025-02-08 | Demystifying Catastrophic Forgetting in Two-Stage Incremental Object Detector | Qirui Wu et.al. | 2502.05540 | null |
| 2025-02-07 | Invizo: Arabic Handwritten Document Optical Character Recognition Solution | Alhossien Waly et.al. | 2502.05277 | null |
| 2025-02-07 | LP-DETR: Layer-wise Progressive Relations for Object Detection | Zhengjian Kang et.al. | 2502.05147 | null |
| 2025-02-07 | Counting Fish with Temporal Representations of Sonar Video | Kai Van Brunt et.al. | 2502.05129 | null |
| 2025-02-07 | DetVPCC: RoI-based Point Cloud Sequence Compression for 3D Object Detection | Mingxuan Yan et.al. | 2502.04804 | null |
| 2025-02-07 | MHAF-YOLO: Multi-Branch Heterogeneous Auxiliary Fusion YOLO for accurate object detection | Zhiqiang Yang et.al. | 2502.04656 | link |
| 2025-02-07 | AIQViT: Architecture-Informed Post-Training Quantization for Vision Transformers | Runqing Jiang et.al. | 2502.04628 | null |
| 2025-02-06 | An Optimized YOLOv5 Based Approach For Real-time Vehicle Detection At Road Intersections Using Fisheye Cameras | Md. Jahin Alam et.al. | 2502.04566 | null |
| 2025-02-06 | Group-Adaptive Threshold Optimization for Robust AI-Generated Text Detection | Minseok Jung et.al. | 2502.04528 | null |
| 2025-02-06 | OneTrack-M: A multitask approach to transformer-based MOT models | Luiz C. S. de Araujo et.al. | 2502.04478 | null |
| 2025-02-07 | Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances | Yi Yu et.al. | 2502.04268 | null |
| 2025-02-06 | An object detection approach for lane change and overtake detection from motion profiles | Andrea Benericetti et.al. | 2502.04244 | null |
| 2025-02-06 | YOLOv4: A Breakthrough in Real-Time Object Detection | Athulya Sundaresan Geetha et.al. | 2502.04161 | null |
| 2025-02-06 | Advanced Object Detection and Pose Estimation with Hybrid Task Cascade and High-Resolution Networks | Yuhui Jin et.al. | 2502.03877 | null |
| 2025-02-06 | Pursuing Better Decision Boundaries for Long-Tailed Object Detection via Category Information Amount | Yanbiao Ma et.al. | 2502.03852 | null |
| 2025-02-06 | Single-Domain Generalized Object Detection by Balancing Domain Diversity and Invariance | Zhenwei He et.al. | 2502.03835 | null |
| 2025-02-06 | UAV Cognitive Semantic Communications Enabled by Knowledge Graph for Robust Object Detection | Xi Song et.al. | 2502.03761 | null |
| 2025-02-06 | RAMOTS: A Real-Time System for Aerial Multi-Object Tracking based on Deep Learning and Big Data Technology | Nhat-Tan Do et.al. | 2502.03760 | null |
| 2025-02-05 | An Empirical Study of Methods for Small Object Detection from Satellite Imagery | Xiaohui Yuan et.al. | 2502.03674 | null |
| 2025-02-05 | Gompertz Linear Units: Leveraging Asymmetry for Enhanced Learning Dynamics | Indrashis Das et.al. | 2502.03654 | link |
| 2025-02-05 | RoboGrasp: A Universal Grasping Policy for Robust Robotic Control | Yiqi Huang et.al. | 2502.03072 | null |
| 2025-02-05 | Enhancing Quantum-ready QUBO-based Suppression for Object Detection with Appearance and Confidence Features | Keiichiro Yamamura et.al. | 2502.02895 | null |
| 2025-02-05 | RS-YOLOX: A High Precision Detector for Object Detection in Satellite Remote Sensing Images | Lei Yang et.al. | 2502.02850 | null |
| 2025-02-04 | Learning the RoPEs: Better 2D and 3D Position Encodings with STRING | Connor Schenck et.al. | 2502.02562 | null |
| 2025-02-04 | Uncertainty Quantification for Collaborative Object Detection Under Adversarial Attacks | Huiqun Huang et.al. | 2502.02537 | null |
| 2025-02-04 | Improving Generalization Ability for 3D Object Detection by Learning Sparsity-invariant Features | Hsin-Cheng Lu et.al. | 2502.02322 | null |
| 2025-02-04 | From Fog to Failure: How Dehazing Can Harm Clear Image Object Detection | Ashutosh Kumar et.al. | 2502.02027 | null |
| 2025-02-04 | Memory Efficient Transformer Adapter for Dense Predictions | Dong Zhang et.al. | 2502.01962 | null |
| 2025-02-04 | INTACT: Inducing Noise Tolerance through Adversarial Curriculum Training for LiDAR-based Safety-Critical Perception and Autonomy | Nastaran Darabi et.al. | 2502.01896 | null |
| 2025-02-04 | SimBEV: A Synthetic Multi-Task Multi-Sensor Driving Data Generation Tool and Dataset | Goodarz Mehr et.al. | 2502.01894 | link |
| 2025-02-03 | Reliability-Driven LiDAR-Camera Fusion for Robust 3D Object Detection | Reza Sadeghian et.al. | 2502.01856 | null |
| 2025-02-03 | GauCho: Gaussian Distributions with Cholesky Decomposition for Oriented Object Detection | Jeffri Murrugarra-LLerena et.al. | 2502.01565 | null |
| 2025-02-03 | Human Body Restoration with One-Step Diffusion Model and A New Benchmark | Jue Gong et.al. | 2502.01411 | null |
| 2025-01-31 | Let Human Sketches Help: Empowering Challenging Image Segmentation Task with Freehand Sketches | Ying Zang et.al. | 2501.19329 | null |
| 2025-01-31 | Beyond checkmate: exploring the creative chokepoints in AI text | Nafis Irtiza Tripto et.al. | 2501.19301 | link |
| 2025-01-31 | GO: The Great Outdoors Multimodal Dataset | Peng Jiang et.al. | 2501.19274 | null |
| 2025-01-31 | Adversarial Attacks on AI-Generated Text Detection Models: A Token Probability-Based Approach Using Embeddings | Ahmed K. Kadhim et.al. | 2501.18998 | null |
| 2025-01-31 | Early Diagnosis and Severity Assessment of Weligama Coconut Leaf Wilt Disease and Coconut Caterpillar Infestation using Deep Learning-based Image Processing Techniques | Samitha Vidhanaarachchi et.al. | 2501.18835 | null |
| 2025-01-30 | Tuning Event Camera Biases Heuristic for Object Detection Applications in Staring Scenarios | David El-Chai Ben-Ezra et.al. | 2501.18788 | null |
| 2025-01-30 | Adaptive Object Detection for Indoor Navigation Assistance: A Performance Evaluation of Real-Time Algorithms | Abhinav Pratap et.al. | 2501.18444 | null |
| 2025-01-29 | Real Time Scheduling Framework for Multi Object Detection via Spiking Neural Networks | Donghwa Kang et.al. | 2501.18412 | null |
| 2025-01-30 | IROAM: Improving Roadside Monocular 3D Object Detection Learning from Autonomous Vehicle Data Domain | Zhe Wang et.al. | 2501.18162 | null |
| 2025-02-03 | Efficient Feature Fusion for UAV Object Detection | Xudong Wang et.al. | 2501.17983 | null |
| 2025-01-29 | TransRAD: Retentive Vision Transformer for Enhanced Radar Object Detection | Lei Cheng et.al. | 2501.17977 | link |
| 2025-01-28 | Object Detection with Deep Learning for Rare Event Search in the GADGET II TPC | Tyler Wheeler et.al. | 2501.17892 | null |
| 2025-01-29 | Detection of Oscillation-like Patterns in Eclipsing Binary Light Curves using Neural Network-based Object Detection Algorithms | Burak Ulaş et.al. | 2501.17538 | null |
| 2025-01-30 | Assessing the Capability of YOLO- and Transformer-based Object Detectors for Real-time Weed Detection | Alicia Allmendinger et.al. | 2501.17387 | null |
| 2025-01-28 | DINOSTAR: Deep Iterative Neural Object Detector Self-Supervised Training for Roadside LiDAR Applications | Muhammad Shahbaz et.al. | 2501.17076 | null |
| 2025-01-28 | Contextual Self-paced Learning for Weakly Supervised Spatio-Temporal Video Grounding | Akash Kumar et.al. | 2501.17053 | null |
| 2025-01-28 | Approach Towards Semi-Automated Certification for Low Criticality ML-Enabled Airborne Applications | Chandrasekar Sridhar et.al. | 2501.17028 | null |
| 2025-01-28 | Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection | Xiangyu Gao et.al. | 2501.16981 | null |
| 2025-01-28 | B-FPGM: Lightweight Face Detection via Bayesian-Optimized Soft FPGM Pruning | Nikolaos Kaparinos et.al. | 2501.16917 | null |
| 2025-01-28 | SSF-PAN: Semantic Scene Flow-Based Perception for Autonomous Navigation in Traffic Scenarios | Yinqi Chen et.al. | 2501.16754 | null |
| 2025-01-28 | DebugAgent: Efficient and Interpretable Error Slice Discovery for Comprehensive Model Debugging | Muxi Chen et.al. | 2501.16751 | null |
| 2025-01-28 | DFCon: Attention-Driven Supervised Contrastive Learning for Robust Deepfake Detection | MD Sadik Hossain Shanto et.al. | 2501.16704 | null |
| 2025-01-27 | Efficient Object Detection of Marine Debris using Pruned YOLO Model | Abi Aryaza et.al. | 2501.16571 | null |
| 2025-01-27 | Object Detection for Medical Image Analysis: Insights from the RT-DETR Model | Weijie He et.al. | 2501.16469 | null |
| 2025-01-27 | The Linear Attention Resurrection in Vision Transformer | Chuanyang Zheng et.al. | 2501.16182 | null |
| 2025-01-27 | Real-Time Brain Tumor Detection in Intraoperative Ultrasound Using YOLO11: From Model Training to Deployment in the Operating Room | Santiago Cepeda et.al. | 2501.15994 | null |
| 2025-01-26 | Classifying Deepfakes Using Swin Transformers | Aprille J. Xi et.al. | 2501.15656 | null |
| 2025-01-26 | A Privacy Enhancing Technique to Evade Detection by Street Video Cameras Without Using Adversarial Accessories | Jacob Shams et.al. | 2501.15653 | null |
| 2025-01-26 | Breaking the SSL-AL Barrier: A Synergistic Semi-Supervised Active Learning Framework for 3D Object Detection | Zengran Wang et.al. | 2501.15449 | null |
| 2025-01-26 | FAVbot: An Autonomous Target Tracking Micro-Robot with Frequency Actuation Control | Zhijian Hao et.al. | 2501.15426 | null |
| 2025-01-26 | Doracamom: Joint 3D Detection and Occupancy Prediction with Multi-view 4D Radars and Cameras for Omnidirectional Perception | Lianqing Zheng et.al. | 2501.15394 | null |
| 2025-01-26 | iFormer: Integrating ConvNet and Transformer for Mobile Application | Chuanyang Zheng et.al. | 2501.15369 | link |
| 2025-01-25 | Explainable YOLO-Based Dyslexia Detection in Synthetic Handwriting Data | Nora Fink et.al. | 2501.15263 | null |
| 2025-01-25 | SpikSSD: Better Extraction and Fusion for Object Detection with Spiking Neuron Networks | Yimeng Fan et.al. | 2501.15151 | link |
| 2025-01-24 | LiDAR-Based Vehicle Detection and Tracking for Autonomous Racing | Marcello Cellina et.al. | 2501.14502 | null |
| 2025-01-24 | TD-RD: A Top-Down Benchmark with Real-Time Framework for Road Damage Detection | Xi Xiao et.al. | 2501.14302 | null |
| 2025-01-24 | A Comprehensive Framework for Semantic Similarity Detection Using Transformer Architectures and Enhanced Ensemble Techniques | Lifu Gao et.al. | 2501.14288 | null |
| 2025-01-23 | Efficient Precision Control in Object Detection Models for Enhanced and Reliable Ovarian Follicle Counting | Vincent Blot et.al. | 2501.14036 | null |
| 2025-01-23 | PointOBB-v3: Expanding Performance Boundaries of Single Point-Supervised Oriented Object Detection | Peiyuan Zhang et.al. | 2501.13898 | link |
| 2025-01-23 | First Lessons Learned of an Artificial Intelligence Robotic System for Autonomous Coarse Waste Recycling Using Multispectral Imaging-Based Methods | Timo Lange et.al. | 2501.13855 | null |
| 2025-01-23 | Integrating Causality with Neurochaos Learning: Proposed Approach and Research Agenda | Nanjangud C. Narendra et.al. | 2501.13763 | null |
| 2025-01-23 | You Only Crash Once v2: Perceptually Consistent Strong Features for One-Stage Domain Adaptive Detection of Space Terrain | Timothy Chase Jr et.al. | 2501.13725 | null |
| 2025-01-23 | YOLO11-JDE: Fast and Accurate Multi-Object Tracking with Self-Supervised Re-ID | Iñaki Erregue et.al. | 2501.13710 | link |
| 2025-01-23 | Emotion estimation from video footage with LSTM | Samer Attrah et.al. | 2501.13432 | link |
| 2025-01-23 | Multi-aspect Knowledge Distillation with Large Language Model | Taegyeong Lee et.al. | 2501.13341 | link |
| 2025-01-22 | MONA: Moving Object Detection from Videos Shot by Dynamic Camera | Boxun Hu et.al. | 2501.13183 | null |
| 2025-01-21 | Large-image Object Detection for Fine-grained Recognition of Punches Patterns in Medieval Panel Painting | Josh Bruegger et.al. | 2501.12489 | link |
| 2025-01-21 | TOFFE – Temporally-binned Object Flow from Events for High-speed and Energy-Efficient Object Detection and Tracking | Adarsh Kumar Kosta et.al. | 2501.12482 | null |
| 2025-01-21 | Benchmarking Image Perturbations for Testing Automated Driving Assistance Systems | Stefano Carlo Lambertenghi et.al. | 2501.12269 | null |
| 2025-01-21 | DLEN: Dual Branch of Transformer for Low-Light Image Enhancement in Dual Domains | Junyu Xia et.al. | 2501.12235 | null |
| 2025-01-21 | SVGS-DSGAT: An IoT-Enabled Innovation in Underwater Robotic Object Detection Technology | Dongli Wu et.al. | 2501.12169 | null |
| 2025-01-21 | Co-Paced Learning Strategy Based on Confidence for Flying Bird Object Detection Model Training | Zi-Wei Sun et.al. | 2501.12071 | null |
| 2025-01-21 | SMamba: Sparse Mamba for Event-based Object Detection | Nan Yang et.al. | 2501.11971 | null |
| 2025-01-21 | LuxVeri at GenAI Detection Task 1: Inverse Perplexity Weighted Ensemble for Robust Detection of AI-Generated Text across English and Multilingual Contexts | Md Kamrujjaman Mobin et.al. | 2501.11914 | null |
| 2025-01-20 | Synthetic Data Can Mislead Evaluations: Membership Inference as Machine Text Detection | Ali Naseh et.al. | 2501.11786 | null |
| 2025-01-20 | Everyone’s Privacy Matters! An Analysis of Privacy Leakage from Real-World Facial Images on Twitter and Associated User Behaviors | Yuqi Niu et.al. | 2501.11756 | null |
| 2025-01-20 | Automatic Labelling & Semantic Segmentation with 4D Radar Tensors | Botao Sun et.al. | 2501.11351 | null |
| 2025-01-20 | Enhancing SAR Object Detection with Self-Supervised Pre-training on Masked Auto-Encoders | Xinyang Pu et.al. | 2501.11249 | null |
| 2025-01-17 | MutualForce: Mutual-Aware Enhancement for 4D Radar-LiDAR 3D Object Detection | Xiangyuan Peng et.al. | 2501.10266 | null |
| 2025-01-17 | Leveraging Confident Image Regions for Source-Free Domain-Adaptive Object Detection | Mohamed Lamine Mekhalfi et.al. | 2501.10081 | null |
| 2025-01-17 | One-D-Piece: Image Tokenizer Meets Quality-Controllable Compression | Keita Miwa et.al. | 2501.10064 | null |
| 2025-01-17 | LWGANet: A Lightweight Group Attention Backbone for Remote Sensing Visual Tasks | Wei Lu et.al. | 2501.10040 | link |
| 2025-01-17 | FLORA: Formal Language Model Enables Robust Training-free Zero-shot Object Referring Analysis | Zhe Chen et.al. | 2501.09887 | null |
| 2025-01-16 | Qwen it detect machine-generated text? | Teodor-George Marchitan et.al. | 2501.09813 | link |
| 2025-01-16 | A Simple Aerial Detection Baseline of Multimodal Language Models | Qingyun Li et.al. | 2501.09720 | link |
| 2025-01-16 | Practical Continual Forgetting for Pre-trained Vision Models | Hongbo Zhao et.al. | 2501.09705 | link |
| 2025-01-16 | Exploring AI-based System Design for Pixel-level Protected Health Information Detection in Medical Images | Tuan Truong et.al. | 2501.09552 | null |
| 2025-01-16 | Multi-task deep-learning for sleep event detection and stage classification | Adriana Anido-Alonso et.al. | 2501.09519 | link |
| 2025-01-16 | The Devil is in the Details: Simple Remedies for Image-to-LiDAR Representation Learning | Wonjun Jo et.al. | 2501.09485 | null |
| 2025-01-16 | MonoSOWA: Scalable monocular 3D Object detector Without human Annotations | Jan Skvrna et.al. | 2501.09481 | link |
| 2025-01-16 | RE-POSE: Synergizing Reinforcement Learning-Based Partitioning and Offloading for Edge Object Detection | Jianrui Shi et.al. | 2501.09465 | null |
| 2025-01-16 | On the Relation between Optical Aperture and Automotive Object Detection | Ofer Bar-Shalom et.al. | 2501.09456 | null |
| 2025-01-16 | SoccerSynth-Detection: A Synthetic Dataset for Soccer Player Detection | Haobin Qin et.al. | 2501.09281 | null |
| 2025-01-15 | GenAI Content Detection Task 3: Cross-Domain Machine-Generated Text Detection Challenge | Liam Dugan et.al. | 2501.08913 | null |
| 2025-01-15 | PACF: Prototype Augmented Compact Features for Improving Domain Adaptive Object Detection | Chenguang Liu et.al. | 2501.08605 | null |
| 2025-01-14 | Predicting Performance of Object Detection Models in Electron Microscopy Using Random Forests | Ni Li et.al. | 2501.08465 | link |
| 2025-01-14 | Bootstrapping Corner Cases: High-Resolution Inpainting for Safety Critical Detect and Avoid for Automated Flying | Jonathan Lyhs et.al. | 2501.08142 | null |
| 2025-01-14 | Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation | Yunzhi Zhuge et.al. | 2501.07806 | link |
| 2025-01-14 | Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding | Zhaokai Wang et.al. | 2501.07783 | link |
| 2025-01-13 | SST-EM: Advanced Metrics for Evaluating Semantic, Spatial and Temporal Aspects in Video Editing | Varun Biyyala et.al. | 2501.07554 | link |
| 2025-01-13 | ML Mule: Mobile-Driven Context-Aware Collaborative Learning | Haoxiang Yu et.al. | 2501.07536 | null |
| 2025-01-13 | TimberVision: A Multi-Task Dataset and Framework for Log-Component Segmentation and Tracking in Autonomous Forestry Operations | Daniel Steininger et.al. | 2501.07360 | link |
| 2025-01-13 | Toward Realistic Camouflaged Object Detection: Benchmarks and Method | Zhimeng Xin et.al. | 2501.07297 | link |
| 2025-01-13 | Dual Scale-aware Adaptive Masked Knowledge Distillation for Object Detection | ZhouRui Zhang et.al. | 2501.07101 | null |
| 2025-01-11 | CoreNet: Conflict Resolution Network for Point-Pixel Misalignment and Sub-Task Suppression of 3D LiDAR-Camera Object Detection | Yiheng Li et.al. | 2501.06550 | link |
| 2025-01-11 | CPDR: Towards Highly-Efficient Salient Object Detection via Crossed Post-decoder Refinement | Yijie Li et.al. | 2501.06441 | null |
| 2025-01-11 | FocusDD: Real-World Scene Infusion for Robust Dataset Distillation | Youbing Hu et.al. | 2501.06405 | null |
| 2025-01-10 | A Holistically Point-guided Text Framework for Weakly-Supervised Camouflaged Object Detection | Tsui Qin Mok et.al. | 2501.06038 | null |
| 2025-01-10 | Minimizing Occlusion Effect on Multi-View Camera Perception in BEV with Multi-Sensor Fusion | Sanjay Kumar et.al. | 2501.05997 | null |
| 2025-01-10 | EDNet: Edge-Optimized Small Target Detection in UAV Imagery – Faster Context Attention, Better Feature Fusion, and Hardware Acceleration | Zhifan Song et.al. | 2501.05885 | null |
| 2025-01-10 | Automatic detection of single-electron regime of quantum dots and definition of virtual gates using U-Net and clustering | Yui Muto et.al. | 2501.05878 | null |
| 2025-01-10 | Zero-shot Shark Tracking and Biometrics from Aerial Imagery | Chinmay K Lalgudi et.al. | 2501.05717 | null |
| 2025-01-10 | Dark Energy Survey Year 6 Results: Synthetic-source Injection Across the Full Survey Using Balrog | D. Anbajagane et.al. | 2501.05683 | null |
| 2025-01-09 | Approximate Supervised Object Distance Estimation on Unmanned Surface Vehicles | Benjamin Kiefer et.al. | 2501.05567 | null |
| 2025-01-09 | Performance of YOLOv7 in Kitchen Safety While Handling Knife | Athulya Sundaresan Geetha et.al. | 2501.05399 | null |
| 2025-01-09 | The global consensus on the risk management of autonomous driving | Sebastian Krügel et.al. | 2501.05391 | null |
| 2025-01-09 | A Systematic Literature Review on Deep Learning-based Depth Estimation in Computer Vision | Ali Rohan et.al. | 2501.05147 | null |
| 2025-01-09 | CorrDiff: Adaptive Delay-aware Detector with Temporal Cue Inputs for Real-time Object Detection | Xiang Zhang et.al. | 2501.05132 | null |
| 2025-01-09 | AD-L-JEPA: Self-Supervised Spatial World Models with Joint Embedding Predictive Architecture for Autonomous Driving with LiDAR Data | Haoran Zhu et.al. | 2501.04969 | link |
| 2025-01-09 | Online Continual Learning: A Systematic Literature Review of Approaches, Challenges, and Benchmarks | Seyed Amir Bidaki et.al. | 2501.04897 | link |
| 2025-01-08 | Video Summarisation with Incident and Context Information using Generative AI | Ulindu De Silva et.al. | 2501.04764 | null |
| 2025-01-08 | Boosting Salient Object Detection with Knowledge Distillated from Large Foundation Models | Miaoyang He et.al. | 2501.04582 | null |
| 2025-01-08 | Combining YOLO and Visual Rhythm for Vehicle Counting | Victor Nascimento Ribeiro et.al. | 2501.04534 | link |
| 2025-01-08 | RSAR: Restricted State Angle Resolver and Rotated SAR Benchmark | Xin Zhang et.al. | 2501.04440 | link |
| 2025-01-08 | Integrating LLMs with ITS: Recent Advances, Potentials, Challenges, and Future Directions | Doaa Mahmud et.al. | 2501.04437 | null |
| 2025-01-08 | FGU3R: Fine-Grained Fusion via Unified 3D Representation for Multimodal 3D Object Detection | Guoxin Zhang et.al. | 2501.04373 | null |
| 2025-01-08 | H-MBA: Hierarchical MamBa Adaptation for Multi-Modal Video Understanding in Autonomous Driving | Siran Chen et.al. | 2501.04302 | null |
| 2025-01-08 | UPAQ: A Framework for Real-Time and Energy-Efficient 3D Object Detection in Autonomous Vehicles | Abhishek Balasubramaniam et.al. | 2501.04213 | null |
| 2025-01-07 | LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving | Lingdong Kong et.al. | 2501.04005 | null |
| 2025-01-07 | Not all tokens are created equal: Perplexity Attention Weighted Networks for AI generated text detection | Pablo Miralles-González et.al. | 2501.03940 | null |
| 2025-01-07 | Visual question answering: from early developments to recent advances – a survey | Ngoc Dung Huynh et.al. | 2501.03939 | null |
| 2025-01-07 | SCC-YOLO: An Improved Object Detector for Assisting in Brain Tumor Diagnosis | Runci Bai et.al. | 2501.03836 | null |
| 2025-01-07 | Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection | Xinbin Yuan et.al. | 2501.03775 | link |
| 2025-01-07 | AuxDepthNet: Real-Time Monocular 3D Object Detection with Depth-Sensitive Features | Ruochen Zhang et.al. | 2501.03700 | null |
| 2025-01-07 | Anomaly Triplet-Net: Progress Recognition Model Using Deep Metric Learning Considering Occlusion for Manual Assembly Work | Takumi Kitsukawa et.al. | 2501.03533 | null |
| 2025-01-07 | SceneVTG++: Controllable Multilingual Visual Text Generation in the Wild | Jiawei Liu et.al. | 2501.02962 | null |
| 2025-01-05 | Multispectral Pedestrian Detection with Sparsely Annotated Label | Chan Lee et.al. | 2501.02640 | null |
| 2025-01-05 | Generalization-Enhanced Few-Shot Object Detection in Remote Sensing | Hui Lin et.al. | 2501.02474 | link |
| 2025-01-04 | Who Wrote This? Zero-Shot Statistical Tests for LLM-Generated Text Detection using Finite Sample Concentration Inequalities | Tara Radvand et.al. | 2501.02406 | link |
| 2025-01-04 | V2X-DGPE: Addressing Domain Gaps and Pose Errors for Robust Collaborative 3D Object Detection | Sichao Wang et.al. | 2501.02363 | link |
| 2025-01-04 | Accurate Crop Yield Estimation of Blueberries using Deep Learning and Smart Drones | Hieu D. Nguyen et.al. | 2501.02344 | null |
| 2025-01-04 | On The Causal Network Of Face-selective Regions In Human Brain During Movie Watching | Ali Bavafa et.al. | 2501.02333 | null |
| 2025-01-04 | RadarNeXt: Real-Time and Reliable 3D Object Detector Based On 4D mmWave Imaging Radar | Liye Jia et.al. | 2501.02314 | null |
| 2025-01-03 | A Separable Self-attention Inspired by the State Space Model for Computer Vision | Juntao Zhang et.al. | 2501.02040 | link |
| 2025-01-03 | UAV-DETR: Efficient End-to-End Object Detection for Unmanned Aerial Vehicle Imagery | Huaxiang Zhang et.al. | 2501.01855 | null |
| 2025-01-03 | Dual Mutual Learning Network with Global-local Awareness for RGB-D Salient Object Detection | Kang Yi et.al. | 2501.01648 | link |
| 2025-01-02 | A Multi-task Supervised Compression Model for Split Computing | Yoshitomo Matsubara et.al. | 2501.01420 | link |
| 2025-01-02 | MSC-Bench: Benchmarking and Analyzing Multi-Sensor Corruption for Driving Perception | Xiaoshuai Hao et.al. | 2501.01037 | null |
| 2025-01-01 | A Novel Approach using CapsNet and Deep Belief Network for Detection and Identification of Oral Leukopenia | Hirthik Mathesh GV et.al. | 2501.00876 | null |
| 2025-01-01 | NMM-HRI: Natural Multi-modal Human-Robot Interaction with Voice and Deictic Posture via Large Language Model | Yuzhi Lai et.al. | 2501.00785 | null |
| 2024-12-31 | Gaussian Building Mesh (GBM): Extract a Building’s 3D Mesh with Google Earth and Gaussian Splatting | Kyle Gao et.al. | 2501.00625 | null |
| 2024-12-31 | B2Net: Camouflaged Object Detection via Boundary Aware and Boundary Fusion | Junmin Cai et.al. | 2501.00426 | null |
| 2024-12-31 | Research on vehicle detection based on improved YOLOv8 network | Haocheng Guo et.al. | 2501.00300 | null |
| 2024-12-30 | TiGDistill-BEV: Multi-view BEV 3D Object Detection via Target Inner-Geometry Learning Distillation | Shaoqing Xu et.al. | 2412.20911 | link |
| 2024-12-30 | Humanoid Robot RHP Friends: Seamless Combination of Autonomous and Teleoperated Tasks in a Nursing Context | Mehdi Benallegue et.al. | 2412.20770 | null |
| 2024-12-30 | Solar Filaments Detection using Active Contours Without Edges | Sanmoy Bandyopadhyay et.al. | 2412.20749 | null |
| 2024-12-30 | Open-Set Object Detection By Aligning Known Class Representations | Hiran Sarkar et.al. | 2412.20701 | null |
| 2024-12-30 | SM3Det: A Unified Model for Multi-Modal Remote Sensing Object Detection | Yuxuan Li et.al. | 2412.20665 | link |
| 2024-12-30 | YOLO-UniOW: Efficient Universal Open-World Object Detection | Lihao Liu et.al. | 2412.20645 | link |
| 2024-12-29 | Controlling Out-of-Domain Gaps in LLMs for Genre Classification and Generated Text Detection | Dmitri Roussinov et.al. | 2412.20595 | link |
| 2024-12-29 | A Novel FPGA-based CNN Hardware Accelerator: Optimization for Convolutional Layers using Karatsuba Ofman Multiplier | Amit Sarkar et.al. | 2412.20393 | null |
| 2024-12-29 | Differential Evolution Integrated Hybrid Deep Learning Model for Object Detection in Pre-made Dishes | Lujia Lv et.al. | 2412.20370 | null |
| 2024-12-28 | Plastic Waste Classification Using Deep Learning: Insights from the WaDaBa Dataset | Suman Kunwar et.al. | 2412.20232 | null |
| 2024-12-27 | Chimera: A Block-Based Neural Architecture Search Framework for Event-Based Object Detection | Diego A. Silva et.al. | 2412.19646 | null |
| 2024-12-27 | Optimizing Helmet Detection with Hybrid YOLO Pipelines: A Detailed Analysis | Vaikunth M et.al. | 2412.19467 | null |
| 2024-12-26 | Revisiting Monocular 3D Object Detection from Scene-Level Depth Retargeting to Instance-Level Spatial Refinement | Qiude Zhang et.al. | 2412.19165 | null |
| 2024-12-26 | From Coin to Data: The Impact of Object Detection on Digital Numismatics | Rafael Cabral et.al. | 2412.19091 | null |
| 2024-12-26 | Assessing Pre-trained Models for Transfer Learning through Distribution of Spectral Components | Tengxue Zhang et.al. | 2412.19085 | null |
| 2024-12-25 | MTCAE-DFER: Multi-Task Cascaded Autoencoder for Dynamic Facial Expression Recognition | Peihao Xiang et.al. | 2412.18988 | null |
| 2024-12-25 | CGCOD: Class-Guided Camouflaged Object Detection | Chenxi Zhang et.al. | 2412.18977 | null |
| 2024-12-25 | HV-BEV: Decoupling Horizontal and Vertical Feature Sampling for Multi-View 3D Object Detection | Di Wu et.al. | 2412.18884 | null |
| 2024-12-25 | TSceneJAL: Joint Active Learning of Traffic Scenes for 3D Object Detection | Chenyang Lei et.al. | 2412.18870 | null |
| 2024-12-25 | Distortion-Aware Adversarial Attacks on Bounding Boxes of Object Detectors | Pham Phuc et.al. | 2412.18815 | link |
| 2024-12-24 | Sampling Bag of Views for Open-Vocabulary Object Detection | Hojun Choi et.al. | 2412.18273 | null |
| 2024-12-24 | Efficient Detection Framework Adaptation for Edge Computing: A Plug-and-play Neural Network Toolbox Enabling Edge Deployment | Jiaqi Wu et.al. | 2412.18230 | null |
| 2024-12-24 | SDM-Car: A Dataset for Small and Dim Moving Vehicles Detection in Satellite Videos | Zhen Zhang et.al. | 2412.18214 | link |
| 2024-12-24 | Spectrum-oriented Point-supervised Saliency Detector for Hyperspectral Images | Peifu Liu et.al. | 2412.18112 | link |
| 2024-12-24 | Multi-Point Positional Insertion Tuning for Small Object Detection | Kanoko Goto et.al. | 2412.18090 | null |
| 2024-12-24 | COMO: Cross-Mamba Interaction and Offset-Guided Fusion for Multimodal Object Detection | Chang Liu et.al. | 2412.18076 | null |
| 2024-12-23 | Comprehensive Multi-Modal Prototypes are Simple and Effective Classifiers for Vast-Vocabulary Object Detection | Yitong Chen et.al. | 2412.17800 | link |
| 2024-12-23 | Enhanced Temporal Processing in Spiking Neural Networks for Static Object Detection Using 3D Convolutions | Huaxu He et.al. | 2412.17654 | null |
| 2024-12-23 | Impact of Evidence Theory Uncertainty on Training Object Detection Models | M. Tahasanul Ibrahim et.al. | 2412.17405 | null |
| 2024-12-23 | Feature Based Methods Domain Adaptation for Object Detection: A Review Paper | Helia Mohamadi et.al. | 2412.17325 | null |
| 2024-12-23 | Towards Unsupervised Model Selection for Domain Adaptive Object Detection | Hengfu Yu et.al. | 2412.17284 | null |
| 2024-12-22 | NumbOD: A Spatial-Frequency Fusion Attack Against Object Detectors | Ziqi Zhou et.al. | 2412.16955 | link |
| 2024-12-22 | Separating Drone Point Clouds From Complex Backgrounds by Cluster Filter – Technical Report for CVPR 2024 UG2 Challenge | Hanfang Liang et.al. | 2412.16947 | null |
| 2024-12-22 | Seamless Detection: Unifying Salient Object Detection and Camouflaged Object Detection | Yi Liu et.al. | 2412.16840 | link |
| 2024-12-22 | Human-Guided Image Generation for Expanding Small-Scale Training Image Datasets | Changjian Chen et.al. | 2412.16839 | null |
| 2024-12-21 | IV-tuning: Parameter-Efficient Transfer Learning for Infrared-Visible Tasks | Yaming Zhang et.al. | 2412.16654 | link |
| 2024-12-20 | NeRF-To-Real Tester: Neural Radiance Fields as Test Image Generators for Vision of Autonomous Systems | Laura Weihl et.al. | 2412.16141 | null |
| 2024-12-20 | MR-GDINO: Efficient Open-World Continual Object Detection | Bowen Dong et.al. | 2412.15979 | link |
| 2024-12-20 | Mask-RadarNet: Enhancing Transformer With Spatial-Temporal Semantic Context for Radar Object Detection in Autonomous Driving | Yuzhi Wu et.al. | 2412.15595 | null |
| 2024-12-19 | Exploring Machine Learning Engineering for Object Detection and Tracking by Unmanned Aerial Vehicle (UAV) | Aneesha Guna et.al. | 2412.15347 | null |
| 2024-12-19 | Leveraging Color Channel Independence for Improved Unsupervised Object Detection | Bastian Jäckl et.al. | 2412.15150 | null |
| 2024-12-19 | Explainable Tampered Text Detection via Multimodal Large Models | Chenfan Qu et.al. | 2412.14816 | null |
| 2024-12-19 | Explicit Relational Reasoning Network for Scene Text Detection | Yuchen Su et.al. | 2412.14692 | null |
| 2024-12-19 | A Light-Weight Framework for Open-Set Object Detection with Decoupled Feature Alignment in Joint Space | Yonghao He et.al. | 2412.14680 | link |
| 2024-12-19 | Progressive Fine-to-Coarse Reconstruction for Accurate Low-Bit Post-Training Quantization in Vision Transformers | Rui Ding et.al. | 2412.14633 | null |
| 2024-12-19 | Alignment-Free RGB-T Salient Object Detection: A Large-scale Dataset and Progressive Correlation Network | Kunpeng Wang et.al. | 2412.14576 | link |
| 2024-12-19 | SCKD: Semi-Supervised Cross-Modality Knowledge Distillation for 4D Radar Object Detection | Ruoyu Xu et.al. | 2412.14571 | null |
| 2024-12-18 | HA-RDet: Hybrid Anchor Rotation Detector for Oriented Object Detection | Phuc D. A. Nguyen et.al. | 2412.14379 | link |
| 2024-12-18 | Joint Perception and Prediction for Autonomous Driving: A Survey | Lucas Dal’Col et.al. | 2412.14088 | link |
| 2024-12-18 | Object Style Diffusion for Generalized Object Detection in Urban Scene | Hao Li et.al. | 2412.13815 | null |
| 2024-12-18 | MMO-IG: Multi-Class and Multi-Scale Object Image Generation for Remote Sensing | Chuang Yang et.al. | 2412.13684 | null |
| 2024-12-18 | Evaluation of LLM Vulnerabilities to Being Misused for Personalized Disinformation Generation | Aneta Zugecova et.al. | 2412.13666 | null |
| 2024-12-18 | Multi-View Pedestrian Occupancy Prediction with a Novel Synthetic Dataset | Sithu Aung et.al. | 2412.13569 | null |
| 2024-12-18 | Comparative Analysis of YOLOv9, YOLOv10 and RT-DETR for Real-Time Weed Detection | Ahmet Oğuz Saltık et.al. | 2412.13490 | null |
| 2024-12-17 | Continuous Patient Monitoring with AI: Real-Time Analysis of Video in Hospital Care Settings | Paolo Gabriel et.al. | 2412.13152 | null |
| 2024-12-17 | A New Adversarial Perspective for LiDAR-based 3D Object Detection | Shijun Zheng et.al. | 2412.13017 | null |
| 2024-12-17 | What is YOLOv6? A Deep Insight into the Object Detection Model | Athulya Sundaresan Geetha et.al. | 2412.13006 | null |
| 2024-12-17 | Differential Alignment for Domain Adaptive Object Detection | Xinyu He et.al. | 2412.12830 | null |
| 2024-12-17 | RCTrans: Radar-Camera Transformer via Radar Densifier and Sequential Decoder for 3D Object Detection | Yiheng Li et.al. | 2412.12799 | link |
| 2024-12-17 | RaCFormer: Towards High-Quality 3D Object Detection via Query-based Radar-Camera Fusion | Xiaomeng Chu et.al. | 2412.12725 | null |
| 2024-12-17 | Efficient Oriented Object Detection with Enhanced Small Object Recognition in Aerial Images | Zhifei Shi et.al. | 2412.12562 | null |
| 2024-12-17 | CREST: An Efficient Conjointly-trained Spike-driven Framework for Event-based Object Detection Exploiting Spatiotemporal Dynamics | Ruixin Mao et.al. | 2412.12525 | link |
| 2024-12-17 | PromptDet: A Lightweight 3D Object Detection Framework with LiDAR Prompts | Kun Guo et.al. | 2412.12460 | link |
| 2024-12-16 | Domain Generalization in Autonomous Driving: Evaluating YOLOv8s, RT-DETR, and YOLO-NAS with the ROAD-Almaty Dataset | Madiyar Alimov et.al. | 2412.12349 | null |
| 2024-12-16 | Coconut Palm Tree Counting on Drone Images with Deep Object Detection and Synthetic Training Data | Tobias Rohe et.al. | 2412.11949 | null |
| 2024-12-16 | Sonar-based Deep Learning in Underwater Robotics: Overview, Robustness and Challenges | Martin Aubard et.al. | 2412.11840 | null |
| 2024-12-16 | CLDA-YOLO: Visual Contrastive Learning Based Domain Adaptive YOLO Detector | Tianheng Qiu et.al. | 2412.11812 | null |
| 2024-12-16 | PhysAug: A Physical-guided and Frequency-based Data Augmentation for Single-Domain Generalized Object Detection | Xiaoran Xu et.al. | 2412.11807 | link |
| 2024-12-16 | Impact of Face Alignment on Face Image Quality | Eren Onaran et.al. | 2412.11779 | null |
| 2024-12-16 | Learning UAV-based path planning for efficient localization of objects using prior knowledge | Rick van Essen et.al. | 2412.11717 | null |
| 2024-12-16 | Oriented Tiny Object Detection: A Dataset, Benchmark, and Dynamic Unbiased Learning | Chang Xu et.al. | 2412.11582 | null |
| 2024-12-16 | Glimpse: Enabling White-Box Methods to Use Proprietary Models for Zero-Shot LLM-Generated Text Detection | Guangsheng Bao et.al. | 2412.11506 | link |
| 2024-12-16 | HGSFusion: Radar-Camera Fusion with Hybrid Generation and Synchronization for 3D Object Detection | Zijian Gu et.al. | 2412.11489 | link |
| 2024-12-16 | Universal Domain Adaptive Object Detection via Dual Probabilistic Alignment | Yuanfan Zheng et.al. | 2412.11443 | link |
| 2024-12-13 | A dual contrastive framework | Yuan Sun et.al. | 2412.10348 | null |
| 2024-12-13 | MVQ: Towards Efficient DNN Compression and Acceleration with Masked Vector Quantization | Shuaiting Li et.al. | 2412.10261 | null |
| 2024-12-13 | Copy-Move Detection in Optical Microscopy: A Segmentation Network and A Dataset | Hao-Chiang Shao et.al. | 2412.10258 | null |
| 2024-12-13 | UN-DETR: Promoting Objectness Learning via Joint Supervision for Unknown Object Detection | Haomiao Liu et.al. | 2412.10176 | link |
| 2024-12-13 | HS-FPN: High Frequency and Spatial Perception FPN for Tiny Object Detection | Zican Shi et.al. | 2412.10116 | null |
| 2024-12-13 | RemDet: Rethinking Efficient Model Design for UAV Object Detection | Chen Li et.al. | 2412.10040 | link |
| 2024-12-13 | Timealign: A multi-modal object detection method for time misalignment fusing in autonomous driving | Zhihang Song et.al. | 2412.10033 | null |
| 2024-12-13 | Object-Focused Data Selection for Dense Prediction Tasks | Niclas Popp et.al. | 2412.10032 | null |
| 2024-12-13 | CP-DETR: Concept Prompt Guide DETR Toward Stronger Universal Object Detection | Qibo Chen et.al. | 2412.09799 | null |
| 2024-12-12 | FD2-Net: Frequency-Driven Feature Decomposition Network for Infrared-Visible Object Detection | Ke Li et.al. | 2412.09258 | null |
| 2024-12-12 | UADet: A Remarkably Simple Yet Effective Uncertainty-Aware Open-Set Object Detection Framework | Silin Cheng et.al. | 2412.09229 | null |
| 2024-12-12 | ContextHOI: Spatial Context Learning for Human-Object Interaction Detection | Mingda Jia et.al. | 2412.09050 | null |
| 2024-12-12 | STEAM: Squeeze and Transform Enhanced Attention Module | Rishabh Sabharwal et.al. | 2412.09023 | null |
| 2024-12-12 | Sensing for Space Safety and Sustainability: A Deep Learning Approach with Vision Transformers | Wenxuan Zhang et.al. | 2412.08913 | null |
| 2024-12-11 | DALI: Domain Adaptive LiDAR Object Detection via Distribution-level and Instance-level Pseudo Label Denoising | Xiaohu Lu et.al. | 2412.08806 | link |
| 2024-12-11 | Utilizing Multi-step Loss for Single Image Reflection Removal | Abdelrahman Elnenaey et.al. | 2412.08582 | link |
| 2024-12-11 | PointCFormer: a Relation-based Progressive Feature Extraction Network for Point Cloud Completion | Yi Zhong et.al. | 2412.08421 | null |
| 2024-12-11 | Physical Informed Driving World Model | Zhuoran Yang et.al. | 2412.08410 | null |
| 2024-12-11 | Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation | Jiaming Lv et.al. | 2412.08139 | null |
| 2024-12-11 | DTAA: A Detect, Track and Avoid Architecture for navigation in spaces with Multiple Velocity Objects | Samuel Nordström et.al. | 2412.08121 | null |
| 2024-12-11 | THUD++: Large-Scale Dynamic Indoor Scene Dataset and Benchmark for Mobile Robots | Zeshun Li et.al. | 2412.08096 | null |
| 2024-12-11 | MAGIC: Mastering Physical Adversarial Generation in Context through Collaborative LLM Agents | Yun Xing et.al. | 2412.08014 | null |
| 2024-12-10 | Low-Latency Scalable Streaming for Event-Based Vision | Andrew Hamara et.al. | 2412.07889 | null |
| 2024-12-10 | Leveraging Content and Context Cues for Low-Light Image Enhancement | Igor Morawski et.al. | 2412.07693 | link |
| 2024-12-10 | Multimodal Contextualized Support for Enhancing Video Retrieval System | Quoc-Bao Nguyen-Le et.al. | 2412.07584 | null |
| 2024-12-10 | Making the Flow Glow – Robot Perception under Severe Lighting Conditions using Normalizing Flow Gradients | Simon Kristoffersson Lind et.al. | 2412.07565 | null |
| 2024-12-10 | Enhancing 3D Object Detection in Autonomous Vehicles Based on Synthetic Virtual Environment Analysis | Vladislav Li et.al. | 2412.07509 | null |
| 2024-12-10 | DSFEC: Efficient and Deployable Deep Radar Object Detection | Gayathri Dandugula et.al. | 2412.07411 | null |
| 2024-12-10 | Benchmarking Vision-Based Object Tracking for USVs in Complex Maritime Environments | Muhayy Ud Din et.al. | 2412.07392 | null |
| 2024-12-09 | FlexEvent: Event Camera Object Detection at Arbitrary Frequencies | Dongyue Lu et.al. | 2412.06708 | null |
| 2024-12-09 | EMOv2: Pushing 5M Vision Model Frontier | Jiangning Zhang et.al. | 2412.06674 | link |
| 2024-12-09 | Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset | Xiao Wang et.al. | 2412.06647 | null |
| 2024-12-09 | Prediction of Occluded Pedestrians in Road Scenes using Human-like Reasoning: Insights from the OccluRoads Dataset | Melo Castillo Angie Nataly et.al. | 2412.06549 | null |
| 2024-12-09 | Self-Paced Learning Strategy with Easy Sample Prior Based on Confidence for the Flying Bird Object Detection Model Training | Zi-Wei Sun et.al. | 2412.06306 | null |
| 2024-12-09 | No Annotations for Object Detection in Art through Stable Diffusion | Patrick Ramos et.al. | 2412.06286 | link |
| 2024-12-09 | DenseVLM: A Retrieval and Decoupled Alignment Framework for Open-Vocabulary Dense Prediction | Yunheng Li et.al. | 2412.06244 | null |
| 2024-12-09 | A Real-Time Defense Against Object Vanishing Adversarial Patch Attacks for Object Detection in Autonomous Vehicles | Jaden Mu et.al. | 2412.06215 | null |
| 2024-12-09 | PoLaRIS Dataset: A Maritime Object Detection and Tracking Dataset in Pohang Canal | Jiwon Choi et.al. | 2412.06192 | null |
| 2024-12-08 | Tiny Object Detection with Single Point Supervision | Haoran Zhu et.al. | 2412.05837 | null |
| 2024-12-06 | From classical techniques to convolution-based models: A review of object detection algorithms | Fnu Neha et.al. | 2412.05252 | null |
| 2024-12-06 | Towards Flexible 3D Perception: Object-Centric Occupancy Completion Augments 3D Object Detection | Chaoda Zheng et.al. | 2412.05154 | link |
| 2024-12-06 | DEYOLO: Dual-Feature-Enhancement YOLO for Cross-Modality Object Detection | Yishuo Chen et.al. | 2412.04931 | link |
| 2024-12-06 | Beyond Boxes: Mask-Guided Spatio-Temporal Feature Aggregation for Video Object Detection | Khurram Azeem Hashmi et.al. | 2412.04915 | null |
| 2024-12-05 | Cubify Anything: Scaling Indoor 3D Object Detection | Justin Lazarow et.al. | 2412.04458 | null |
| 2024-12-05 | Reflective Teacher: Semi-Supervised Multimodal 3D Object Detection in Bird’s-Eye-View via Uncertainty Measure | Saheli Hazra et.al. | 2412.04337 | null |
| 2024-12-05 | YOLO-CCA: A Context-Based Approach for Traffic Sign Detection | Linfeng Jiang et.al. | 2412.04289 | link |
| 2024-12-05 | DEIM: DETR with Improved Matching for Fast Convergence | Shihua Huang et.al. | 2412.04234 | link |
| 2024-12-05 | Frequency-Adaptive Low-Latency Object Detection Using Events and Frames | Haitian Zhang et.al. | 2412.04149 | null |
| 2024-12-05 | MVUDA: Unsupervised Domain Adaptation for Multi-view Pedestrian Detection | Erik Brorsson et.al. | 2412.04117 | link |
| 2024-12-05 | Thermal and RGB Images Work Better Together in Wind Turbine Damage Detection | Serhii Svystun et.al. | 2412.04114 | null |
| 2024-12-05 | SoRA: Singular Value Decomposed Low-Rank Adaptation for Domain Generalizable Representation Learning | Seokju Yun et.al. | 2412.04077 | null |
| 2024-12-05 | Space to Policy: Scalable Brick Kiln Detection and Automatic Compliance Monitoring with Geospatial Data | Zeel B Patel et.al. | 2412.04065 | null |
| 2024-12-05 | UNCOVER: Unknown Class Object Detection for Autonomous Vehicles in Real-time | Lars Schmarje et.al. | 2412.03986 | null |
| 2024-12-04 | Perception Tokens Enhance Visual Reasoning in Multimodal Language Models | Mahtab Bigverdi et.al. | 2412.03548 | null |
| 2024-12-04 | Data Fusion of Semantic and Depth Information in the Context of Object Detection | Md Abu Yusuf et.al. | 2412.03490 | null |
| 2024-12-04 | Task-driven Image Fusion with Learnable Fusion Loss | Haowen Bai et.al. | 2412.03240 | null |
| 2024-12-04 | ObjectFinder: Open-Vocabulary Assistive System for Interactive Object Search by Blind People | Ruiping Liu et.al. | 2412.03118 | null |
| 2024-12-04 | TREND: Unsupervised 3D Representation Learning via Temporal Forecasting for LiDAR Perception | Runjian Chen et.al. | 2412.03054 | null |
| 2024-12-04 | Assessing the performance of CT image denoisers using Laguerre-Gauss Channelized Hotelling Observer for lesion detection | Prabhat Kc et.al. | 2412.02920 | null |
| 2024-12-03 | EvRT-DETR: The Surprising Effectiveness of DETR-based Detection for Event Cameras | Dmitrii Torbunov et.al. | 2412.02890 | null |
| 2024-12-03 | Optimized CNNs for Rapid 3D Point Cloud Object Recognition | Tianyi Lyu et.al. | 2412.02855 | null |
| 2024-12-03 | Gaussian Splatting Under Attack: Investigating Adversarial Noise in 3D Objects | Abdurrahman Zeybey et.al. | 2412.02803 | null |
| 2024-12-03 | SJTU: Spatial judgments in multimodal models towards unified segmentation through coordinate detection | Joongwon Chae et.al. | 2412.02565 | null |
| 2024-12-03 | Underload: Defending against Latency Attacks for Object Detectors on Edge Devices | Tianyi Wang et.al. | 2412.02171 | null |
| 2024-12-03 | Redundant Queries in DETR-Based 3D Detection Methods: Unnecessary and Prunable | Lizhen Xu et.al. | 2412.02054 | null |
| 2024-12-02 | Smart Parking with Pixel-Wise ROI Selection for Vehicle Detection Using YOLOv8, YOLOv9, YOLOv10, and YOLOv11 | Gustavo P. C. P. da Luz et.al. | 2412.01983 | null |
| 2024-12-02 | HPRM: High-Performance Robotic Middleware for Intelligent Autonomous Systems | Jacky Kwok et.al. | 2412.01799 | null |
| 2024-12-02 | Identifying Reliable Predictions in Detection Transformers | Young-Jin Park et.al. | 2412.01782 | null |
| 2024-12-02 | FEVER-OOD: Free Energy Vulnerability Elimination for Robust Out-of-Distribution Detection | Brian K. S. Isaac-Medina et.al. | 2412.01596 | null |
| 2024-12-02 | Divide-and-Conquer: Confluent Triple-Flow Network for RGB-T Salient Object Detection | Hao Tang et.al. | 2412.01556 | null |
| 2024-12-03 | GFreeDet: Exploiting Gaussian Splatting and Foundation Models for Model-free Unseen Object Detection in the BOP Challenge 2024 | Xingyu Liu et.al. | 2412.01552 | null |
| 2024-12-02 | Improving Object Detection by Modifying Synthetic Data with Explainable AI | Nitish Mital et.al. | 2412.01477 | null |
| 2024-11-29 | SpaRC: Sparse Radar-Camera Fusion for 3D Object Detection | Philipp Wolters et.al. | 2411.19860 | null |
| 2024-11-29 | Feedback-driven object detection and iterative model improvement | Sönke Tenckhoff et.al. | 2411.19835 | link |
| 2024-11-29 | Real-Time Anomaly Detection in Video Streams | Fabien Poirier et.al. | 2411.19731 | null |
| 2024-11-29 | LDA-AQU: Adaptive Query-guided Upsampling via Local Deformable Attention | Zewen Du et.al. | 2411.19585 | link |
| 2024-11-29 | Bootstraping Clustering of Gaussians for View-consistent 3D Scene Understanding | Wenbo Zhang et.al. | 2411.19551 | null |
| 2024-11-28 | Automatic Prompt Generation and Grounding Object Detection for Zero-Shot Image Anomaly Detection | Tsun-Hin Cheung et.al. | 2411.19220 | null |
| 2024-11-28 | Co-Learning: Towards Semi-Supervised Object Detection with Road-side Cameras | Jicheng Yuan et.al. | 2411.19143 | null |
| 2024-11-28 | On Moving Object Segmentation from Monocular Video with Transformers | Christian Homeyer et.al. | 2411.19141 | null |
| 2024-11-28 | Dynamic Attention and Bi-directional Fusion for Safety Helmet Wearing Detection | Junwei Feng et.al. | 2411.19071 | null |
| 2024-11-28 | MVFormer: Diversifying Feature Normalization and Token Mixing for Efficient Vision Transformers | Jongseong Bae et.al. | 2411.18995 | null |
| 2024-11-27 | Exploring Depth Information for Detecting Manipulated Face Videos | Haoyue Wang et.al. | 2411.18572 | null |
| 2024-11-27 | Efficient Dynamic LiDAR Odometry for Mobile Robots with Structured Point Clouds | Jonathan Lichtenfeld et.al. | 2411.18443 | link |
| 2024-11-27 | Deep Fourier-embedded Network for Bi-modal Salient Object Detection | Pengfei Lyu et.al. | 2411.18409 | link |
| 2024-11-27 | Optimizing Multispectral Object Detection: A Bag of Tricks and Comprehensive Benchmarks | Chen Zhou et.al. | 2411.18288 | link |
| 2024-11-27 | From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects | Zizhao Li et.al. | 2411.18207 | link |
| 2024-11-27 | RPEE-HEADS: A Novel Benchmark for Pedestrian Head Detection in Crowd Videos | Mohamad Abubaker et.al. | 2411.18164 | null |
| 2024-11-27 | Revisiting Misalignment in Multispectral Pedestrian Detection: A Language-Driven Approach for Cross-modal Alignment Fusion | Taeheon Kim et.al. | 2411.17995 | null |
| 2024-11-27 | ROICtrl: Boosting Instance Control for Visual Generation | Yuchao Gu et.al. | 2411.17949 | null |
| 2024-11-26 | Box for Mask and Mask for Box: weak losses for multi-task partially supervised learning | Hoàng-Ân Lê et.al. | 2411.17536 | link |
| 2024-11-26 | TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba | Xiaowen Ma et.al. | 2411.17473 | link |
| 2024-11-26 | Communication-Efficient Cooperative SLAMMOT via Determining the Number of Collaboration Vehicles | Susu Fang et.al. | 2411.17432 | null |
| 2024-11-26 | DGNN-YOLO: Dynamic Graph Neural Networks with YOLO11 for Small Object Detection and Tracking in Traffic Surveillance | Shahriar Soudeep et.al. | 2411.17251 | null |
| 2024-11-26 | Event-based Spiking Neural Networks for Object Detection: A Review of Datasets, Architectures, Learning Rules, and Implementation | Craig Iaboni et.al. | 2411.17006 | link |
| 2024-11-25 | Online Episodic Memory Visual Query Localization with Egocentric Streaming Object Memory | Zaira Manigrasso et.al. | 2411.16934 | null |
| 2024-11-25 | Open Vocabulary Monocular 3D Object Detection | Jin Yao et.al. | 2411.16833 | link |
| 2024-11-25 | Imperceptible Adversarial Examples in the Physical World | Weilin Xu et.al. | 2411.16622 | null |
| 2024-11-25 | STDWeb: Simple Transient Detection pipeline for the Web | Sergey Karpov et.al. | 2411.16470 | null |
| 2024-11-25 | Machine Learning for the Digital Typhoon Dataset: Extensions to Multiple Basins and New Developments in Representations and Tasks | Asanobu Kitamoto et.al. | 2411.16421 | link |
| 2024-11-25 | CutS3D: Cutting Semantics in 3D for 2D Unsupervised Instance Segmentation | Leon Sick et.al. | 2411.16319 | null |
| 2024-11-25 | Diagnosis of diabetic retinopathy using machine learning & deep learning technique | Eric Shah et.al. | 2411.16250 | null |
| 2024-11-25 | Interpreting Object-level Foundation Models via Visual Precision Search | Ruoyu Chen et.al. | 2411.16198 | null |
| 2024-11-25 | Learn from Foundation Model: Fruit Detection Model without Manual Annotation | Yanan Wang et.al. | 2411.16196 | null |
| 2024-11-25 | CIA: Controllable Image Augmentation Framework Based on Stable Diffusion | Mohamed Benkedadra et.al. | 2411.16128 | null |
| 2024-11-25 | You only thermoelastically deform once: Point Absorber Detection in LIGO Test Masses with YOLO | Simon R. Goode et.al. | 2411.16104 | null |
| 2024-11-25 | Leverage Task Context for Object Affordance Ranking | Haojie Huang et.al. | 2411.16082 | null |
| 2024-11-22 | A Real-Time DETR Approach to Bangladesh Road Object Detection for Autonomous Vehicles | Irfan Nafiz Shahan et.al. | 2411.15110 | null |
| 2024-11-22 | MSSF: A 4D Radar and Camera Fusion Framework With Multi-Stage Sampling for 3D Object Detection in Autonomous Driving | Hongsi Liu et.al. | 2411.15016 | null |
| 2024-11-22 | VisionPAD: A Vision-Centric Pre-training Paradigm for Autonomous Driving | Haiming Zhang et.al. | 2411.14716 | null |
| 2024-11-21 | Unveiling the Hidden: A Comprehensive Evaluation of Underwater Image Enhancement and Its Impact on Object Detection | Ali Awad et.al. | 2411.14626 | null |
| 2024-11-21 | DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding | Tianhe Ren et.al. | 2411.14347 | link |
| 2024-11-21 | AnywhereDoor: Multi-Target Backdoor Attacks on Object Detection | Jialin Lu et.al. | 2411.14243 | null |
| 2024-11-21 | Transforming Static Images Using Generative Models for Video Salient Object Detection | Suhwan Cho et.al. | 2411.13975 | link |
| 2024-11-21 | Multitask Learning for SAR Ship Detection with Gaussian-Mask Joint Segmentation | Ming Zhao et.al. | 2411.13847 | null |
| 2024-11-20 | MambaDETR: Query-based Temporal Modeling using State Space Model for Multi-View 3D Object Detection | Tong Ning et.al. | 2411.13628 | null |
| 2024-11-20 | DIS-Mine: Instance Segmentation for Disaster-Awareness in Poor-Light Condition in Underground Mines | Mizanur Rahman Jewel et.al. | 2411.13544 | null |
| 2024-11-20 | A Resource Efficient Fusion Network for Object Detection in Bird’s-Eye View using Camera and Raw Radar Data | Kavin Chandrasekaran et.al. | 2411.13311 | link |
| 2024-11-20 | VADet: Multi-frame LiDAR 3D Object Detection using Variable Aggregation | Chengjie Huang et.al. | 2411.13186 | null |
| 2024-11-20 | RAW-Diffusion: RGB-Guided Diffusion Models for High-Fidelity RAW Image Generation | Christoph Reinders et.al. | 2411.13150 | link |
| 2024-11-20 | YCB-LUMA: YCB Object Dataset with Luminance Keying for Object Localization | Thomas Pöllabauer et.al. | 2411.13149 | link |
| 2024-11-20 | Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension | Yongdong Luo et.al. | 2411.13093 | link |
| 2024-11-20 | Bounding-box Watermarking: Defense against Model Extraction Attacks on Object Detectors | Satoru Koda et.al. | 2411.13047 | null |
| 2024-11-20 | Collaborative Feature-Logits Contrastive Learning for Open-Set Semi-Supervised Object Detection | Xinhao Zhong et.al. | 2411.13001 | null |
| 2024-11-19 | Maps from Motion (MfM): Generating 2D Semantic Maps from Sparse Multi-view Images | Matteo Toso et.al. | 2411.12620 | null |
| 2024-11-19 | GaussianPretrain: A Simple Unified 3D Gaussian Representation for Visual Pre-training in Autonomous Driving | Shaoqing Xu et.al. | 2411.12452 | null |
| 2024-11-19 | Physics-Guided Detector for SAR Airplanes | Zhongling Huang et.al. | 2411.12301 | link |
| 2024-11-18 | Scaling Deep Learning Research with Kubernetes on the NRP Nautilus HyperCluster | J. Alex Hurt et.al. | 2411.12038 | null |
| 2024-11-18 | LightFFDNets: Lightweight Convolutional Neural Networks for Rapid Facial Forgery Detection | Günel Jabbarlı et.al. | 2411.11826 | null |
| 2024-11-18 | WoodYOLO: A Novel Object Detector for Wood Species Detection in Microscopic Images | Lars Nieradzik et.al. | 2411.11738 | null |
| 2024-11-18 | Exploring Emerging Trends and Research Opportunities in Visual Place Recognition | Antonios Gasteratos et.al. | 2411.11481 | null |
| 2024-11-18 | SL-YOLO: A Stronger and Lighter Drone Target Detection Model | Defan Chen et.al. | 2411.11477 | null |
| 2024-11-19 | EVT: Efficient View Transformation for Multi-Modal 3D Object Detection | Yongjin Lee et.al. | 2411.10715 | null |
| 2024-11-15 | Vision Eagle Attention: A New Lens for Advancing Image Classification | Mahmudul Hasan et.al. | 2411.10564 | link |
| 2024-11-15 | Interactive Image-Based Aphid Counting in Yellow Water Traps under Stirring Actions | Xumin Gao et.al. | 2411.10357 | null |
| 2024-11-15 | RETR: Multi-View Radar Detection Transformer for Indoor Perception | Ryoma Yataka et.al. | 2411.10293 | null |
| 2024-11-15 | Visual-Linguistic Agent: Towards Collaborative Contextual Object Reasoning | Jingru Yang et.al. | 2411.10252 | null |
| 2024-11-15 | Real-Time AI-Driven People Tracking and Counting Using Overhead Cameras | Ishrath Ahamed et.al. | 2411.10072 | null |
| 2024-11-15 | Diachronic Document Dataset for Semantic Layout Analysis | Thibault Clérice et.al. | 2411.10068 | null |
| 2024-11-14 | Adversarial Attacks Using Differentiable Rendering: A Survey | Matthew Hull et.al. | 2411.09749 | null |
| 2024-11-14 | Local-Global Attention: An Adaptive Mechanism for Multi-Scale Feature Integration | Yifan Shao et.al. | 2411.09604 | link |
| 2024-11-14 | Long-Tailed Object Detection Pre-training: Dynamic Rebalancing Contrastive Learning with Dual Reconstruction | Chen-Long Duan et.al. | 2411.09453 | null |
| 2024-11-14 | Instruction-Driven Fusion of Infrared-Visible Images: Tailoring for Diverse Downstream Tasks | Zengyi Yang et.al. | 2411.09387 | null |
| 2024-11-14 | DT-JRD: Deep Transformer based Just Recognizable Difference Prediction Model for Video Coding for Machines | Junqi Liu et.al. | 2411.09308 | null |
| 2024-11-14 | Cross-Modal Consistency in Multimodal Large Language Models | Xiang Zhang et.al. | 2411.09273 | null |
| 2024-11-14 | LEAP:D – A Novel Prompt-based Approach for Domain-Generalized Aerial Object Detection | Chanyeong Park et.al. | 2411.09180 | null |
| 2024-11-13 | Multimodal Object Detection using Depth and Image Data for Manufacturing Parts | Nazanin Mahjourian et.al. | 2411.09062 | null |
| 2024-11-13 | DART-LLM: Dependency-Aware Multi-Robot Task Decomposition and Execution using Large Language Models | Yongdong Wang et.al. | 2411.09022 | null |
| 2024-11-13 | UIFormer: A Unified Transformer-based Framework for Incremental Few-Shot Object Detection and Instance Segmentation | Chengyuan Zhang et.al. | 2411.08569 | null |
| 2024-11-13 | Methodology for a Statistical Analysis of Influencing Factors on 3D Object Detection Performance | Anton Kuznietsov et.al. | 2411.08482 | null |
| 2024-11-13 | V2X-R: Cooperative LiDAR-4D Radar Fusion for 3D Object Detection with Denoising Diffusion | Xun Huang et.al. | 2411.08402 | link |
| 2024-11-12 | Large-scale Remote Sensing Image Target Recognition and Automatic Annotation | Wuzheng Dong et.al. | 2411.07802 | link |
| 2024-11-12 | Efficient 3D Perception on Multi-Sweep Point Cloud with Gumbel Spatial Pruning | Jianhao Li et.al. | 2411.07742 | null |
| 2024-11-12 | Depthwise Separable Convolutions with Deep Residual Convolutions | Md Arid Hasan et.al. | 2411.07544 | null |
| 2024-11-11 | Transformers for Charged Particle Track Reconstruction in High Energy Physics | Samuel Van Stroud et.al. | 2411.07149 | null |
| 2024-11-11 | Multi-scale Frequency Enhancement Network for Blind Image Deblurring | Yawen Xiang et.al. | 2411.06893 | null |
| 2024-11-11 | Fast and Efficient Transformer-based Method for Bird’s Eye View Instance Prediction | Miguel Antunes-García et.al. | 2411.06851 | link |
| 2024-11-11 | AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness | Yizhuo Yang et.al. | 2411.06789 | null |
| 2024-11-11 | United Domain Cognition Network for Salient Object Detection in Optical Remote Sensing Images | Yanguang Sun et.al. | 2411.06703 | link |
| 2024-11-11 | Track Any Peppers: Weakly Supervised Sweet Pepper Tracking Using VLMs | Jia Syuen Lim et.al. | 2411.06702 | null |
| 2024-11-11 | LFSamba: Marry SAM with Mamba for Light Field Salient Object Detection | Zhengyi Liu et.al. | 2411.06652 | null |
| 2024-11-09 | Robust Detection of LLM-Generated Text: A Comparative Analysis | Yongye Su et.al. | 2411.06248 | null |
| 2024-11-09 | LSSInst: Improving Geometric Modeling in LSS-Based BEV Perception with Instance Representation | Weijie Ma et.al. | 2411.06173 | link |
| 2024-11-09 | AI-Compass: A Comprehensive and Effective Multi-module Testing Tool for AI Systems | Zhiyu Zhu et.al. | 2411.06146 | null |
| 2024-11-08 | Open-set object detection: towards unified problem formulation and benchmarking | Hejer Ammar et.al. | 2411.05564 | null |
| 2024-11-08 | ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving | Tao Ma et.al. | 2411.05311 | null |
| 2024-11-08 | SimpleBEV: Improved LiDAR-Camera Fusion Architecture for 3D Object Detection | Yun Zhao et.al. | 2411.05292 | null |
| 2024-11-07 | On the Inherent Robustness of One-Stage Object Detection against Out-of-Distribution Data | Aitor Martinez-Seras et.al. | 2411.04586 | null |
| 2024-11-07 | l0-Regularized Sparse Coding-based Interpretable Network for Multi-Modal Image Fusion | Gargi Panda et.al. | 2411.04519 | null |
| 2024-11-07 | Pose2Trajectory: Using Transformers on Body Pose to Predict Tennis Player’s Trajectory | Ali K. AlShami et.al. | 2411.04501 | null |
| 2024-11-07 | SuperQ-GRASP: Superquadrics-based Grasp Pose Estimation on Larger Objects for Mobile-Manipulation | Xun Tu et.al. | 2411.04386 | null |
| 2024-11-07 | UEVAVD: A Dataset for Developing UAV’s Eye View Active Object Detection | Xinhua Jiang et.al. | 2411.04348 | null |
| 2024-11-07 | GazeGen: Gaze-Driven User Interaction for Visual Content Generation | He-Yen Hsieh et.al. | 2411.04335 | null |
| 2024-11-06 | An Enhancement of Haar Cascade Algorithm Applied to Face Recognition for Gate Pass Security | Clarence A. Antipona et.al. | 2411.03831 | null |
| 2024-11-06 | Understanding the Effects of Human-written Paraphrases in LLM-generated Text Detection | Hiu Ting Lau et.al. | 2411.03806 | link |
| 2024-11-06 | Efficient Fourier Filtering Network with Contrastive Learning for UAV-based Unaligned Bi-modal Salient Object Detection | Pengfei Lyu et.al. | 2411.03728 | link |
| 2024-11-06 | Estimation of Psychosocial Work Environment Exposures Through Video Object Detection. Proof of Concept Using CCTV Footage | Claus D. Hansen et.al. | 2411.03724 | null |
| 2024-11-06 | Hybrid Attention for Robust RGB-T Pedestrian Detection in Real-World Conditions | Arunkumar Rathinam et.al. | 2411.03576 | null |
| 2024-11-05 | An Application-Agnostic Automatic Target Recognition System Using Vision Language Models | Anthony Palladino et.al. | 2411.03491 | null |
| 2024-11-05 | Self-supervised cross-modality learning for uncertainty-aware object detection and recognition in applications which lack pre-labelled training data | Irum Mehboob et.al. | 2411.03082 | null |
| 2024-11-05 | CRT-Fusion: Camera, Radar, Temporal Fusion Using Motion Information for 3D Object Detection | Jisong Kim et.al. | 2411.03013 | null |
| 2024-11-05 | Centerness-based Instance-aware Knowledge Distillation with Task-wise Mutual Lifting for Object Detection on Drone Imagery | Bowei Du et.al. | 2411.02861 | null |
| 2024-11-05 | Correlation of Object Detection Performance with Visual Saliency and Depth Estimation | Matthias Bartolo et.al. | 2411.02844 | link |
| 2024-11-05 | ERUP-YOLO: Enhancing Object Detection Robustness for Adverse Weather Condition by Unified Image-Adaptive Processing | Yuka Ogino et.al. | 2411.02799 | null |
| 2024-11-05 | Real-Time Text Detection with Similar Mask in Traffic, Industrial, and Natural Scenes | Xu Han et.al. | 2411.02794 | link |
| 2024-11-05 | Efficient Feature Aggregation and Scale-Aware Regression for Monocular 3D Object Detection | Yifan Wang et.al. | 2411.02747 | null |
| 2024-11-05 | Analysis of Multi-epoch JWST Images of $\sim 300$ Little Red Dots: Tentative Detection of Variability in a Minority of Sources | Zijian Zhang et.al. | 2411.02729 | null |
| 2024-11-04 | Intelligent Video Recording Optimization using Activity Detection for Surveillance Systems | Youssef Elmir et.al. | 2411.02632 | null |
| 2024-11-04 | SIRA: Scalable Inter-frame Relation and Association for Radar Perception | Ryoma Yataka et.al. | 2411.02220 | null |
| 2024-11-04 | Advanced computer vision for extracting georeferenced vehicle trajectories from drone imagery | Robert Fonod et.al. | 2411.02136 | null |
| 2024-11-04 | Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation | Yan Li et.al. | 2411.02057 | link |
| 2024-11-04 | V-CAS: A Realtime Vehicle Anti Collision System Using Vision Transformer on Multi-Camera Streams | Muhammad Waqas Ashraf et.al. | 2411.01963 | null |
| 2024-11-04 | Exploiting Contextual Uncertainty of Visual Data for Efficient Training of Deep Models | Sharat Agarwal et.al. | 2411.01925 | null |
| 2024-11-04 | LiDAttack: Robust Black-box Attack on LiDAR-based Object Detection | Jinyin Chen et.al. | 2411.01889 | link |
| 2024-11-03 | ROAD-Waymo: Action Awareness at Scale for Autonomous Driving | Salman Khan et.al. | 2411.01683 | null |
| 2024-11-03 | OSAD: Open-Set Aircraft Detection in SAR Images | Xiayang Xiao et.al. | 2411.01597 | null |
| 2024-11-03 | One for All: Multi-Domain Joint Training for Point Cloud Based 3D Object Detection | Zhenyu Wang et.al. | 2411.01584 | null |
| 2024-11-03 | A Visual Question Answering Method for SAR Ship: Breaking the Requirement for Multimodal Dataset Construction and Model Fine-Tuning | Fei Wang et.al. | 2411.01445 | null |
| 2024-10-31 | ImOV3D: Learning Open-Vocabulary Point Clouds 3D Object Detection from Only 2D Images | Timing Yang et.al. | 2410.24001 | link |
| 2024-10-31 | Localization, balance and affinity: a stronger multifaceted collaborative salient object detector in remote sensing images | Yakun Xie et.al. | 2410.23991 | null |
| 2024-10-31 | Uncertainty Estimation for 3D Object Detection via Evidential Learning | Nikita Durasov et.al. | 2410.23910 | null |
| 2024-10-31 | From Web Data to Real Fields: Low-Cost Unsupervised Domain Adaptation for Agricultural Robots | Vasileios Tzouras et.al. | 2410.23906 | null |
| 2024-10-31 | Open-Set 3D object detection in LiDAR data as an Out-of-Distribution problem | Louis Soum-Fontez et.al. | 2410.23767 | null |
| 2024-10-31 | DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios | Junchao Wu et.al. | 2410.23746 | link |
| 2024-10-31 | GigaCheck: Detecting LLM-generated Content | Irina Tolstykh et.al. | 2410.23728 | null |
| 2024-10-31 | Context-Aware Token Selection and Packing for Enhanced Vision Transformer | Tianyi Zhang et.al. | 2410.23608 | null |
| 2024-10-30 | EMMA: End-to-End Multimodal Model for Autonomous Driving | Jyh-Jing Hwang et.al. | 2410.23262 | null |
| 2024-10-30 | S3PT: Scene Semantics and Structure Guided Clustering to Boost Self-Supervised Pre-Training for Autonomous Driving | Maciej K. Wozniak et.al. | 2410.23085 | null |
| 2024-10-30 | First Place Solution to the ECCV 2024 ROAD++ Challenge @ ROAD++ Spatiotemporal Agent Detection 2024 | Tengfei Zhang et.al. | 2410.23077 | null |
| 2024-10-30 | AdaptiveISP: Learning an Adaptive Image Signal Processor for Object Detection | Yujin Wang et.al. | 2410.22939 | null |
| 2024-10-30 | YOLOv11 for Vehicle Detection: Advancements, Performance, and Applications in Intelligent Transportation Systems | Mujadded Al Rabbani Alif et.al. | 2410.22898 | null |
| 2024-10-29 | Unified Domain Generalization and Adaptation for Multi-View 3D Object Detection | Gyusam Chang et.al. | 2410.22461 | null |
| 2024-10-29 | Lighten CARAFE: Dynamic Lightweight Upsampling with Guided Reassemble Kernels | Ruigang Fu et.al. | 2410.22139 | link |
| 2024-10-29 | Data Generation for Hardware-Friendly Post-Training Quantization | Lior Dikstein et.al. | 2410.22110 | null |
| 2024-10-29 | Cognitive Semantic Augmentation LEO Satellite Networks for Earth Observation | Hong-fu Chou et.al. | 2410.21916 | null |
| 2024-10-29 | PK-YOLO: Pretrained Knowledge Guided YOLO for Brain Tumor Detection in Multiplanar MRI Slices | Ming Kang et.al. | 2410.21822 | link |
| 2024-10-28 | MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps | Yating Xu et.al. | 2410.21566 | link |
| 2024-10-28 | TACO: Adversarial Camouflage Optimization on Trucks to Fool Object Detectors | Adonisz Dimitriu et.al. | 2410.21443 | null |
| 2024-10-28 | Joint Audio-Visual Idling Vehicle Detection with Streamlined Input Dependencies | Xiwen Li et.al. | 2410.21170 | null |
| 2024-10-28 | Synthetica: Large Scale Synthetic Data for Robot Perception | Ritvik Singh et.al. | 2410.21153 | null |
| 2024-10-28 | DeTeCtive: Detecting AI-generated Text via Multi-Level Contrastive Learning | Xun Guo et.al. | 2410.20964 | link |
| 2024-10-28 | IndraEye: Infrared Electro-Optical UAV-based Perception Dataset for Robust Downstream Tasks | Manjunath D et.al. | 2410.20953 | null |
| 2024-10-28 | SparseTem: Boosting the Efficiency of CNN-Based Video Encoders by Exploiting Temporal Continuity | Kunyun Wang et.al. | 2410.20790 | null |
| 2024-10-27 | Sebica: Lightweight Spatial and Efficient Bidirectional Channel Attention Super Resolution Network | Chongxiao Liu et.al. | 2410.20546 | null |
| 2024-10-27 | Guidance Disentanglement Network for Optics-Guided Thermal UAV Image Super-Resolution | Zhicheng Zhao et.al. | 2410.20466 | link |
| 2024-10-27 | Open-Vocabulary Object Detection via Language Hierarchy | Jiaxing Huang et.al. | 2410.20371 | null |
| 2024-10-27 | Historical Test-time Prompt Tuning for Vision Foundation Models | Jingyi Zhang et.al. | 2410.20346 | null |
| 2024-10-25 | OReole-FM: successes and challenges toward billion-parameter foundation models for high-resolution satellite imagery | Philipe Dias et.al. | 2410.19965 | null |
| 2024-10-25 | MetaTrading: An Immersion-Aware Model Trading Framework for Vehicular Metaverse Services | Hongjia Wu et.al. | 2410.19665 | null |
| 2024-10-25 | Frozen-DETR: Enhancing DETR with Image Understanding from Frozen Foundation Models | Shenghao Fu et.al. | 2410.19635 | null |
| 2024-10-25 | MonoDGP: Monocular 3D Object Detection with Decoupled-Query and Geometry-Error Priors | Fanqi Pu et.al. | 2410.19590 | link |
| 2024-10-25 | DECADE: Towards Designing Efficient-yet-Accurate Distance Estimation Modules for Collision Avoidance in Mobile Advanced Driver Assistance Systems | Muhammad Zaeem Shahzad et.al. | 2410.19336 | null |
| 2024-10-25 | In-Simulation Testing of Deep Learning Vision Models in Autonomous Robotic Manipulators | Dmytro Humeniuk et.al. | 2410.19277 | null |
| 2024-10-24 | HUE Dataset: High-Resolution Event and Frame Sequences for Low-Light Vision | Burak Ercan et.al. | 2410.19164 | null |
| 2024-10-24 | Optimizing Edge Offloading Decisions for Object Detection | Jiaming Qiu et.al. | 2410.18919 | link |
| 2024-10-24 | You Only Look Around: Learning Illumination Invariant Feature for Low-light Object Detection | Mingbo Hong et.al. | 2410.18398 | null |
| 2024-10-24 | Thermal Chameleon: Task-Adaptive Tone-mapping for Radiometric Thermal-Infrared images | Dong-Guw Lee et.al. | 2410.18340 | link |
| 2024-10-23 | KhmerST: A Low-Resource Khmer Scene Text Detection and Recognition Benchmark | Vannkinh Nom et.al. | 2410.18277 | null |
| 2024-10-23 | Automated Defect Detection and Grading of Piarom Dates Using Deep Learning | Nasrin Azimi et.al. | 2410.18208 | null |
| 2024-10-23 | DREB-Net: Dual-stream Restoration Embedding Blur-feature Fusion Network for High-mobility UAV Object Detection | Qingpeng Li et.al. | 2410.17822 | link |
| 2024-10-23 | YOLO-Vehicle-Pro: A Cloud-Edge Collaborative Framework for Object Detection in Autonomous Driving under Adverse Weather Conditions | Xiguang Li et.al. | 2410.17734 | null |
| 2024-10-23 | YOLOv11: An Overview of the Key Architectural Enhancements | Rahima Khanam et.al. | 2410.17725 | null |
| 2024-10-23 | PlantCamo: Plant Camouflage Detection | Jinyu Yang et.al. | 2410.17598 | link |
| 2024-10-23 | OVT-B: A New Large-Scale Benchmark for Open-Vocabulary Multi-Object Tracking | Haiji Liang et.al. | 2410.17534 | link |
| 2024-10-22 | EPContrast: Effective Point-level Contrastive Learning for Large-scale Point Cloud Understanding | Zhiyi Pan et.al. | 2410.17207 | null |
| 2024-10-22 | YOLO-TS: Real-Time Traffic Sign Detection with Enhanced Accuracy Using Optimized Receptive Fields and Anchor-Free Fusion | Junzhou Chen et.al. | 2410.17144 | null |
| 2024-10-22 | FlightAR: AR Flight Assistance Interface with Multiple Video Streams and Object Detection Aimed at Immersive Drone Control | Oleg Sautenkov et.al. | 2410.16943 | null |
| 2024-10-22 | AttriPrompter: Auto-Prompting with Attribute Semantics for Zero-shot Nuclei Detection via Visual-Language Pre-trained Models | Yongjian Wu et.al. | 2410.16820 | link |
| 2024-10-22 | DSORT-MCU: Detecting Small Objects in Real-Time on Microcontroller Units | Liam Boyle et.al. | 2410.16769 | null |
| 2024-10-22 | DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model | Zhixiong Nan et.al. | 2410.16707 | null |
| 2024-10-22 | Fire and Smoke Detection with Burning Intensity Representation | Xiaoyi Han et.al. | 2410.16642 | link |
| 2024-10-21 | Griffon-G: Bridging Vision-Language and Vision-Centric Tasks via Large Multimodal Models | Yufei Zhan et.al. | 2410.16163 | link |
| 2024-10-21 | Multi-Sensor Fusion for UAV Classification Based on Feature Maps of Image and Radar Data | Nikos Sakellariou et.al. | 2410.16089 | null |
| 2024-10-21 | Few-shot target-driven instance detection based on open-vocabulary object detection models | Ben Crulis et.al. | 2410.16028 | null |
| 2024-10-21 | How Important are Data Augmentations to Close the Domain Gap for Object Detection in Orbit? | Maximilian Ulmer et.al. | 2410.15766 | null |
| 2024-10-21 | P-YOLOv8: Efficient and Accurate Real-Time Detection of Distracted Driving | Mohamed R. Elshamy et.al. | 2410.15602 | null |
| 2024-10-21 | Deep Learning and Machine Learning – Object Detection and Semantic Segmentation: From Theory to Applications | Jintao Ren et.al. | 2410.15584 | null |
| 2024-10-21 | Online Pseudo-Label Unified Object Detection for Multiple Datasets Training | XiaoJun Tang et.al. | 2410.15569 | null |
| 2024-10-20 | TrackMe: A Simple and Effective Multiple Object Tracking Annotation Tool | Thinh Phan et.al. | 2410.15518 | null |
| 2024-10-20 | YOLO-RD: Introducing Relevant and Compact Explicit Knowledge to YOLO by Retriever-Dictionary | Hao-Tang Tsui et.al. | 2410.15346 | null |
| 2024-10-20 | Open-vocabulary vs. Closed-set: Best Practice for Few-shot Object Detection Considering Text Describability | Yusuke Hosoya et.al. | 2410.15315 | null |
| 2024-10-18 | MultiOrg: A Multi-rater Organoid-detection Dataset | Christina Bukas et.al. | 2410.14612 | null |
| 2024-10-18 | Beyond Binary: Towards Fine-Grained LLM-Generated Text Detection via Role Recognition and Involvement Measurement | Zihao Cheng et.al. | 2410.14259 | null |
| 2024-10-18 | Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech | Shuwei He et.al. | 2410.14101 | link |
| 2024-10-18 | Enhancing In-vehicle Multiple Object Tracking Systems with Embeddable Ising Machines | Kosuke Tatsumura et.al. | 2410.14093 | null |
| 2024-10-17 | FaceSaliencyAug: Mitigating Geographic, Gender and Stereotypical Biases via Saliency-Based Data Augmentation | Teerath Kumar et.al. | 2410.14070 | null |
| 2024-10-17 | Spatiotemporal Object Detection for Improved Aerial Vehicle Detection in Traffic Monitoring | Kristina Telegraph et.al. | 2410.13616 | null |
| 2024-10-17 | RemoteDet-Mamba: A Hybrid Mamba-CNN Network for Multi-modal Object Detection in Remote Sensing Images | Kejun Ren et.al. | 2410.13532 | null |
| 2024-10-16 | Syn2Real Domain Generalization for Underwater Mine-like Object Detection Using Side-Scan Sonar | Aayush Agrawal et.al. | 2410.12953 | null |
| 2024-10-16 | MambaBEV: An efficient 3D detection model with Mamba2 | Zihan You et.al. | 2410.12673 | null |
| 2024-10-16 | On the Risk of Evidence Pollution for Malicious Social Text Detection in the Era of LLMs | Herun Wan et.al. | 2410.12600 | null |
| 2024-10-16 | Cocoon: Robust Multi-Modal Perception with Uncertainty-Aware Sensor Fusion | Minkyoung Cho et.al. | 2410.12592 | null |
| 2024-10-16 | Feature Augmentation for Self-supervised Contrastive Learning: A Closer Look | Yong Zhang et.al. | 2410.12396 | null |
| 2024-10-16 | Real-time Stereo-based 3D Object Detection for Streaming Perception | Changcai Li et.al. | 2410.12394 | link |
| 2024-10-16 | Context-Infused Visual Grounding for Art | Selina Khan et.al. | 2410.12369 | link |
| 2024-10-16 | Fusion from Decomposition: A Self-Supervised Approach for Image Fusion and Beyond | Pengwei Liang et.al. | 2410.12274 | null |
| 2024-10-16 | Optimizing YOLOv5s Object Detection through Knowledge Distillation algorithm | Guanming Huang et.al. | 2410.12259 | null |
| 2024-10-16 | SAM-Guided Masked Token Prediction for 3D Scene Understanding | Zhimin Chen et.al. | 2410.12158 | null |
| 2024-10-16 | Unveiling the Limits of Alignment: Multi-modal Dynamic Local Fusion Network and A Benchmark for Unaligned RGBT Video Object Detection | Qishun Wang et.al. | 2410.12143 | null |
| 2024-10-15 | Fractal Calibration for long-tailed object detection | Konstantinos Panagiotis Alexandridis et.al. | 2410.11774 | link |
| 2024-10-15 | POLO – Point-based, multi-class animal detection | Giacomo May et.al. | 2410.11741 | null |
| 2024-10-15 | YOLO-ELA: Efficient Local Attention Modeling for High-Performance Real-Time Insulator Defect Detection | Olalekan Akindele et.al. | 2410.11727 | null |
| 2024-10-15 | SeaDATE: Remedy Dual-Attention Transformer with Semantic Alignment via Contrast Learning for Multimodal Object Detection | Shuhan Dong et.al. | 2410.11358 | null |
| 2024-10-15 | Open World Object Detection: A Survey | Yiming Li et.al. | 2410.11301 | null |
| 2024-10-15 | Representation Similarity: A Better Guidance of DNN Layer Sharing for Edge Computing without Training | Bryan Bo Cao et.al. | 2410.11233 | null |
| 2024-10-15 | TEOcc: Radar-camera Multi-modal Occupancy Prediction via Temporal Enhancement | Zhiwei Lin et.al. | 2410.11228 | null |
| 2024-10-15 | CVCP-Fusion: On Implicit Depth Estimation for 3D Bounding Box Prediction | Pranav Gupta et.al. | 2410.11211 | link |
| 2024-10-15 | Multiview Scene Graph | Juexiao Zhang et.al. | 2410.11187 | link |
| 2024-10-14 | UAV3D: A Large-scale 3D Perception Benchmark for Unmanned Aerial Vehicles | Hui Ye et.al. | 2410.11125 | null |
| 2024-10-14 | ROSAR: An Adversarial Re-Training Framework for Robust Side-Scan Sonar Object Detection | Martin Aubard et.al. | 2410.10554 | link |
| 2024-10-14 | Learning to Ground VLMs without Forgetting | Aritra Bhowmik et.al. | 2410.10491 | null |
| 2024-10-14 | SMART-TRACK: A Novel Kalman Filter-Guided Sensor Fusion For Robust UAV Object Tracking in Dynamic Environments | Khaled Gabr et.al. | 2410.10409 | null |
| 2024-10-14 | V2M: Visual 2-Dimensional Mamba for Image Representation Learning | Chengkun Wang et.al. | 2410.10382 | link |
| 2024-10-14 | GlobalMamba: Global Image Serialization for Vision Mamba | Chengkun Wang et.al. | 2410.10316 | link |
| 2024-10-14 | ROA-BEV: 2D Region-Oriented Attention for BEV-based 3D Object | Jiwei Chen et.al. | 2410.10298 | null |
| 2024-10-14 | Out-of-Bounding-Box Triggers: A Stealthy Approach to Cheat Object Detectors | Tao Lin et.al. | 2410.10091 | link |
| 2024-10-15 | Optimizing Waste Management with Advanced Object Detection for Garbage Classification | Everest Z. Kuang et.al. | 2410.09975 | null |
| 2024-10-13 | EITNet: An IoT-Enhanced Framework for Real-Time Basketball Action Recognition | Jingyu Liu et.al. | 2410.09954 | null |
| 2024-10-13 | LoLI-Street: Benchmarking Low-Light Image Enhancement and Beyond | Md Tanvir Islam et.al. | 2410.09831 | link |
| 2024-10-11 | DA-Ada: Learning Domain-Aware Adapter for Domain Adaptive Object Detection | Haochen Li et.al. | 2410.09004 | null |
| 2024-10-11 | LIME-Eval: Rethinking Low-light Image Enhancement Evaluation via Object Detection | Mingjia Li et.al. | 2410.08810 | null |
| 2024-10-11 | Hespi: A pipeline for automatically detecting information from herbarium specimen sheets | Robert Turnbull et.al. | 2410.08740 | null |
| 2024-10-11 | MMLF: Multi-modal Multi-class Late Fusion for Object Detection with Uncertainty Estimation | Qihang Yang et.al. | 2410.08739 | null |
| 2024-10-11 | Boosting Open-Vocabulary Object Detection by Handling Background Samples | Ruizhe Zeng et.al. | 2410.08645 | null |
| 2024-10-11 | DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention | Nguyen Huu Bao Long et.al. | 2410.08582 | link |
| 2024-10-11 | VOVTrack: Exploring the Potentiality in Videos for Open-Vocabulary Object Tracking | Zekun Qian et.al. | 2410.08529 | null |
| 2024-10-10 | Are We Ready for Real-Time LiDAR Semantic Segmentation in Autonomous Driving? | Samir Abou Haidar et.al. | 2410.08365 | null |
| 2024-10-10 | PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection | Botao Ren et.al. | 2410.08210 | null |
| 2024-10-10 | Robust AI-Generated Text Detection by Restricted Embeddings | Kristian Kuznetsov et.al. | 2410.08113 | link |
| 2024-10-10 | Dynamic Object Catching with Quadruped Robot Front Legs | André Schakkal et.al. | 2410.08065 | null |
| 2024-10-10 | HeightFormer: A Semantic Alignment Monocular 3D Object Detection Method from Roadside Perspective | Pei Liu et.al. | 2410.07758 | null |
| 2024-10-10 | O1O: Grouping of Known Classes to Identify Unknown Objects as Odd-One-Out | Mısra Yavuz et.al. | 2410.07514 | null |
| 2024-10-09 | Progressive Multi-Modal Fusion for Robust 3D Object Detection | Rohit Mohan et.al. | 2410.07475 | null |
| 2024-10-09 | Self-Supervised Learning for Real-World Object Detection: a Survey | Alina Ciocarlan et.al. | 2410.07442 | null |
| 2024-10-09 | Robust infrared small target detection using self-supervised and a contrario paradigms | Alina Ciocarlan et.al. | 2410.07437 | null |
| 2024-10-09 | SurANet: Surrounding-Aware Network for Concealed Object Detection via Highly-Efficient Interactive Contrastive Learning Strategy | Yuhan Kang et.al. | 2410.06842 | link |
| 2024-10-09 | Rethinking the Evaluation of Visible and Infrared Image Fusion | Dayan Guan et.al. | 2410.06811 | link |
| 2024-10-09 | QuadMamba: Learning Quadtree-based Selective Scan for Visual State Space Model | Fei Xie et.al. | 2410.06806 | null |
| 2024-10-09 | QuadBEV: An Efficient Quadruple-Task Perception Framework via Bird’s-Eye-View Representation | Yuxin Li et.al. | 2410.06516 | null |
| 2024-10-08 | Adver-City: Open-Source Multi-Modal Dataset for Collaborative Perception Under Adverse Weather Conditions | Mateus Karvat et.al. | 2410.06380 | link |
| 2024-10-08 | Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach | Sha Guo et.al. | 2410.06149 | null |
| 2024-10-08 | Training-free LLM-generated Text Detection by Mining Token Probability Sequences | Yihuai Xu et.al. | 2410.06072 | link |
| 2024-10-08 | Training-Free Open-Ended Object Detection and Segmentation via Attention as Prompts | Zhiwei Lin et.al. | 2410.05963 | null |
| 2024-10-08 | Learning Gaussian Data Augmentation in Feature Space for One-shot Object Detection in Manga | Takara Taniguchi et.al. | 2410.05935 | null |
| 2024-10-08 | Unobserved Object Detection using Generative Models | Subhransu S. Bhattacharjee et.al. | 2410.05869 | link |
| 2024-10-07 | Real-Time Truly-Coupled Lidar-Inertial Motion Correction and Spatiotemporal Dynamic Object Detection | Cedric Le Gentil et.al. | 2410.05152 | null |
| 2024-10-07 | Human-in-the-loop Reasoning For Traffic Sign Detection: Collaborative Approach Yolo With Video-llava | Mehdi Azarafza et.al. | 2410.05096 | null |
| 2024-10-07 | Improving Object Detection via Local-global Contrastive Learning | Danai Triantafyllidou et.al. | 2410.05058 | null |
| 2024-10-07 | Windshield Integration of Thermal and Color Fusion for Automatic Emergency Braking in Low Visibility Conditions | Gabriel Jobert et.al. | 2410.04928 | null |
| 2024-10-07 | Improved detection of discarded fish species through BoxAL active learning | Maria Sokolova et.al. | 2410.04880 | link |
| 2024-10-06 | Learning De-Biased Representations for Remote-Sensing Imagery | Zichen Tian et.al. | 2410.04546 | link |
| 2024-10-05 | AI as Humanity’s Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text | Ximing Lu et.al. | 2410.04265 | link |
| 2024-10-05 | ETHcavation: A Dataset and Pipeline for Panoptic Scene Understanding and Object Tracking in Dynamic Construction Environments | Lorenzo Terenzi et.al. | 2410.04250 | null |
| 2024-10-05 | Fast Object Detection with a Machine Learning Edge Device | Richard C. Rodriguez et.al. | 2410.04173 | null |
| 2024-10-05 | Robust Task-Oriented Communication Framework for Real-Time Collaborative Vision Perception | Zhengru Fang et.al. | 2410.04168 | link |
| 2024-10-04 | DRAFTS: A Deep Learning-Based Radio Fast Transient Search Pipeline | Yong-Kun Zhang et.al. | 2410.03200 | null |
| 2024-10-03 | Is Your Paper Being Reviewed by an LLM? Investigating AI Text Detectability in Peer Review | Sungduk Yu et.al. | 2410.03019 | null |
| 2024-10-04 | Learning 3D Perception from Others’ Predictions | Jinsu Yoo et.al. | 2410.02646 | null |
| 2024-10-02 | Enhancing Screen Time Identification in Children with a Multi-View Vision Language Model and Screen Time Tracker | Xinlong Hou et.al. | 2410.01966 | null |
| 2024-10-02 | 3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection | Yang Cao et.al. | 2410.01647 | link |
| 2024-10-02 | Gaussian-Det: Learning Closed-Surface Gaussians for 3D Object Detection | Hongru Yan et.al. | 2410.01404 | null |
| 2024-10-02 | Finetuning Pre-trained Model with Limited Data for LiDAR-based 3D Object Detection by Bridging Domain Gaps | Jiyun Jang et.al. | 2410.01319 | null |
| 2024-10-02 | Panopticus: Omnidirectional 3D Object Detection on Resource-constrained Edge Devices | Jeho Lee et.al. | 2410.01270 | null |
| 2024-10-02 | High and Low Resolution Tradeoffs in Roadside Multimodal Sensing | Shaozu Ding et.al. | 2410.01250 | link |
| 2024-10-02 | Perceptual Piercing: Human Visual Cue-based Object Detection in Low Visibility Conditions | Ashutosh Kumar et.al. | 2410.01225 | link |
| 2024-10-02 | A versatile machine learning workflow for high-throughput analysis of supported metal catalyst particles | Arda Genc et.al. | 2410.01213 | link |
| 2024-10-01 | Synthetic imagery for fuzzy object detection: A comparative study | Siavash H. Khajavi et.al. | 2410.01124 | null |
| 2024-10-01 | Generating Seamless Virtual Immunohistochemical Whole Slide Images with Content and Color Consistency | Sitong Liu et.al. | 2410.01072 | null |
| 2024-10-01 | ARPOV: Expanding Visualization of Object Detection in AR with Panoramic Mosaic Stitching | Erin McGowan et.al. | 2410.01055 | null |
| 2024-09-30 | Accelerating Non-Maximum Suppression: A Graph Theory Perspective | King-Siong Si et.al. | 2409.20520 | link |
| 2024-09-30 | NUTRIVISION: A System for Automatic Diet Management in Smart Healthcare | Madhumita Veeramreddy et.al. | 2409.20508 | null |
| 2024-09-30 | Navigating Threats: A Survey of Physical Adversarial Attacks on LiDAR Perception Systems in Autonomous Vehicles | Amira Guesmi et.al. | 2409.20426 | null |
| 2024-09-30 | Training a Computer Vision Model for Commercial Bakeries with Primarily Synthetic Images | Thomas H. Schmitt et.al. | 2409.20122 | null |
| 2024-09-30 | GearTrack: Automating 6D Pose Estimation | Yu Deng et.al. | 2409.19986 | null |
| 2024-09-30 | TSdetector: Temporal-Spatial Self-correction Collaborative Learning for Colonoscopy Video Detection | Kaini Wang et.al. | 2409.19983 | null |
| 2024-09-30 | DAOcc: 3D Object Detection Assisted Multi-Sensor Fusion for 3D Occupancy Prediction | Zhen Yang et.al. | 2409.19972 | link |
| 2024-09-30 | HazyDet: Open-source Benchmark for Drone-view Object Detection with Depth-cues in Hazy Scenes | Changfeng Feng et.al. | 2409.19833 | link |
| 2024-09-29 | Applying the Lower-Biased Teacher Model in Semi-Supervised Object Detection | Shuang Wang et.al. | 2409.19703 | null |
| 2024-09-29 | OrientedFormer: An End-to-End Transformer-Based Oriented Object Detector in Remote Sensing Images | Jiaqi Zhao et.al. | 2409.19648 | link |
| 2024-09-27 | Spectral Wavelet Dropout: Regularization in the Wavelet Domain | Rinor Cakaj et.al. | 2409.18951 | null |
| 2024-09-27 | MCUBench: A Benchmark of Tiny Object Detectors on MCUs | Sudhakar Sah et.al. | 2409.18866 | link |
| 2024-09-27 | A Novel Unified Architecture for Low-Shot Counting by Detection and Segmentation | Jer Pelhan et.al. | 2409.18686 | link |
| 2024-09-27 | Query matching for spatio-temporal action detection with query-based object detector | Shimon Hori et.al. | 2409.18408 | null |
| 2024-09-26 | Efficient Microscopic Image Instance Segmentation for Food Crystal Quality Control | Xiaoyu Ji et.al. | 2409.18291 | null |
| 2024-09-26 | Advancing Object Detection in Transportation with Multimodal Large Language Models (MLLMs): A Comprehensive Review and Empirical Testing | Huthaifa I. Ashqar et.al. | 2409.18286 | null |
| 2024-09-26 | GSON: A Group-based Social Navigation Framework with Large Multimodal Model | Shangyi Luo et.al. | 2409.18084 | null |
| 2024-09-27 | A New Dataset for Monocular Depth Estimation Under Viewpoint Shifts | Aurel Pjetri et.al. | 2409.17851 | null |
| 2024-09-26 | Scene Understanding in Pick-and-Place Tasks: Analyzing Transformations Between Initial and Final Scenes | Seraj Ghasemi et.al. | 2409.17720 | null |
| 2024-09-26 | SLO-Aware Task Offloading within Collaborative Vehicle Platoons | Boris Sedlak et.al. | 2409.17667 | null |
| 2024-09-26 | CAMOT: Camera Angle-aware Multi-Object Tracking | Felix Limanta et.al. | 2409.17533 | null |
| 2024-09-25 | Transient Adversarial 3D Projection Attacks on Object Detection in Autonomous Driving | Ce Zhou et.al. | 2409.17403 | null |
| 2024-09-25 | AgRegNet: A Deep Regression Network for Flower and Fruit Density Estimation, Localization, and Counting in Orchards | Uddhav Bhattarai et.al. | 2409.17400 | null |
| 2024-09-25 | Energy-Efficient & Real-Time Computer Vision with Intelligent Skipping via Reconfigurable CMOS Image Sensors | Md Abdullah-Al Kaiser et.al. | 2409.17341 | null |
| 2024-09-25 | BitQ: Tailoring Block Floating Point Precision for Improved DNN Efficiency on Resource-Constrained Devices | Yongqi Xu et.al. | 2409.17093 | link |
| 2024-09-25 | EventHDR: from Event to High-Speed HDR Videos and Beyond | Yunhao Zou et.al. | 2409.17029 | null |
| 2024-09-25 | Focus Entirety and Perceive Environment for Arbitrary-Shaped Text Detection | Xu Han et.al. | 2409.16827 | null |
| 2024-09-25 | XAI-guided Insulator Anomaly Detection for Imbalanced Datasets | Maximilian Andreas Hoefler et.al. | 2409.16821 | null |
| 2024-09-25 | Spotlight Text Detector: Spotlight on Candidate Regions Like a Camera | Xu Han et.al. | 2409.16820 | null |
| 2024-09-25 | Benchmarking Deep Learning Models for Object Detection on Edge Computing Devices | Daghash K. Alqahtani et.al. | 2409.16808 | null |
| 2024-09-25 | Pix2Next: Leveraging Vision Foundation Models for RGB to NIR Image Translation | Youngwan Jin et.al. | 2409.16706 | link |
| 2024-09-25 | TSBP: Improving Object Detection in Histology Images via Test-time Self-guided Bounding-box Propagation | Tingting Yang et.al. | 2409.16678 | link |
| 2024-09-25 | Source-Free Domain Adaptation for YOLO Object Detection | Simon Varailhon et.al. | 2409.16538 | null |
| 2024-09-24 | Real-Time Detection of Electronic Components in Waste Printed Circuit Boards: A Transformer-Based Approach | Muhammad Mohsin et.al. | 2409.16496 | null |
| 2024-09-24 | Tiny Robotics Dataset and Benchmark for Continual Object Detection | Francesco Pasti et.al. | 2409.16215 | link |
| 2024-09-24 | Seeing Faces in Things: A Model and Dataset for Pareidolia | Mark Hamilton et.al. | 2409.16143 | null |
| 2024-09-24 | HA-FGOVD: Highlighting Fine-grained Attributes via Explicit Linear Composition for Open-Vocabulary Object Detection | Yuqi Ma et.al. | 2409.16136 | null |
| 2024-09-24 | Neuromorphic Drone Detection: an Event-RGB Multimodal Approach | Gabriele Magrini et.al. | 2409.16099 | null |
| 2024-09-24 | Open-World Object Detection with Instance Representation Learning | Sunoh Lee et.al. | 2409.16073 | null |
| 2024-09-24 | Towards Robust Object Detection: Identifying and Removing Backdoors via Module Inconsistency Analysis | Xianda Zhang et.al. | 2409.16057 | null |
| 2024-09-24 | Zero-Shot Detection of AI-Generated Images | Davide Cozzolino et.al. | 2409.15875 | null |
| 2024-09-24 | Automated Assessment of Multimodal Answer Sheets in the STEM domain | Rajlaxmi Patil et.al. | 2409.15749 | null |
| 2024-09-24 | Real-Time Pedestrian Detection on IoT Edge Devices: A Lightweight Deep Learning Approach | Muhammad Dany Alfikri et.al. | 2409.15740 | null |
| 2024-09-24 | PDT: Uav Target Detection Dataset for Pests and Diseases Tree | Mingle Zhou et.al. | 2409.15679 | link |
| 2024-09-18 | Applications of Knowledge Distillation in Remote Sensing: A Survey | Yassine Himeur et.al. | 2409.12111 | null |
| 2024-09-18 | Agglomerative Token Clustering | Joakim Bruslund Haurum et.al. | 2409.11923 | link |
| 2024-09-18 | RockTrack: A 3D Robust Multi-Camera-Ken Multi-Object Tracking Framework | Xiaoyu Li et.al. | 2409.11749 | null |
| 2024-09-17 | Open-Set Semantic Uncertainty Aware Metric-Semantic Graph Matching | Kurran Singh et.al. | 2409.11555 | null |
| 2024-09-17 | VALO: A Versatile Anytime Framework for LiDAR-based Object Detection Deep Neural Networks | Ahmet Soyyigit et.al. | 2409.11542 | link |
| 2024-09-17 | STCMOT: Spatio-Temporal Cohesion Learning for UAV-Based Multiple Object Tracking | Jianbo Ma et.al. | 2409.11234 | link |
| 2024-09-19 | Vision foundation models: can they be applied to astrophysics data? | E. Lastufka et.al. | 2409.11175 | null |
| 2024-09-17 | UltimateDO: An Efficient Framework to Marry Occupancy Prediction with 3D Object Detection via Channel2height | Zichen Yu et.al. | 2409.11160 | null |
| 2024-09-17 | Unleashing the Potential of Mamba: Boosting a LiDAR 3D Sparse Detector by Using Cross-Model Knowledge Distillation | Rui Yu et.al. | 2409.11018 | null |
| 2024-09-17 | TrajSSL: Trajectory-Enhanced Semi-Supervised 3D Object Detection | Philip Jacobson et.al. | 2409.10901 | null |
| 2024-09-18 | Context-Dependent Interactable Graphical User Interface Element Detection for Spatial Computing Applications | Shuqing Li et.al. | 2409.10811 | null |
| 2024-09-16 | Online Learning via Memory: Retrieval-Augmented Detector Adaptation | Yanan Jian et.al. | 2409.10716 | null |
| 2024-09-16 | CoMamba: Real-time Cooperative Perception Unlocked with State Space Models | Jinlong Li et.al. | 2409.10699 | null |
| 2024-09-16 | Point2Graph: An End-to-end Point Cloud-based 3D Open-Vocabulary Scene Graph for Robot Navigation | Yifan Xu et.al. | 2409.10350 | null |
| 2024-09-16 | Performance of Human Annotators in Object Detection and Segmentation of Remotely Sensed Data | Roni Blushtein-Livnon et.al. | 2409.10272 | null |
| 2024-09-16 | Self-Updating Vehicle Monitoring Framework Employing Distributed Acoustic Sensing towards Real-World Settings | Xi Wang et.al. | 2409.10259 | null |
| 2024-09-16 | DAE-Fuse: An Adaptive Discriminative Autoencoder for Multi-Modality Image Fusion | Yuchen Guo et.al. | 2409.10080 | null |
| 2024-09-16 | Towards Physically-Realizable Adversarial Attacks in Embodied Vision Navigation | Meng Chen et.al. | 2409.10071 | link |
| 2024-09-16 | LithoHoD: A Litho Simulator-Powered Framework for IC Layout Hotspot Detection | Hao-Chiang Shao et.al. | 2409.10021 | null |
| 2024-09-16 | Comprehensive Study on Sentiment Analysis: From Rule-based to modern LLM based system | Shailja Gupta et.al. | 2409.09989 | null |
| 2024-09-15 | Tracking Virtual Meetings in the Wild: Re-identification in Multi-Participant Virtual Meetings | Oriel Perl et.al. | 2409.09841 | null |
| 2024-09-15 | Template-based Multi-Domain Face Recognition | Anirudh Nanduri et.al. | 2409.09832 | null |
| 2024-09-15 | PersonaMark: Personalized LLM watermarking for model protection and user attribution | Yuehan Zhang et.al. | 2409.09739 | null |
| 2024-09-13 | Interactive Masked Image Modeling for Multimodal Object Detection in Remote Sensing | Minh-Duc Vu et.al. | 2409.08885 | null |
| 2024-09-13 | Direct-CP: Directed Collaborative Perception for Connected and Autonomous Vehicles via Proactive Attention | Yihang Tao et.al. | 2409.08840 | null |
| 2024-09-13 | RT-DETRv3: Real-time End-to-End Object Detection with Hierarchical Dense Positive Supervision | Shuo Wang et.al. | 2409.08475 | null |
| 2024-09-12 | X-ray Fluoroscopy Guided Localization and Steering of Medical Microrobots through Virtual Enhancement | Husnu Halid Alabay et.al. | 2409.08337 | null |
| 2024-09-12 | What is YOLOv9: An In-Depth Exploration of the Internal Features of the Next-Generation Object Detector | Muhammad Yaseen et.al. | 2409.07813 | null |
| 2024-09-11 | Object Depth and Size Estimation using Stereo-vision and Integration with SLAM | Layth Hamad et.al. | 2409.07623 | null |
| 2024-09-11 | Zero-Shot Machine-Generated Text Detection Using Mixture of Large Language Models | Matthieu Dubois et.al. | 2409.07615 | null |
| 2024-09-11 | ENACT: Entropy-based Clustering of Attention Input for Improving the Computational Performance of Object Detection Transformers | Giorgos Savathrakis et.al. | 2409.07541 | link |
| 2024-09-11 | Watchlist Challenge: 3rd Open-set Face Detection and Identification | Furkan Kasım et.al. | 2409.07220 | null |
| 2024-09-11 | SCLNet: A Scale-Robust Complementary Learning Network for Object Detection in UAV Images | Xuexue Li et.al. | 2409.07024 | null |
| 2024-09-11 | ODYSSEE: Oyster Detection Yielded by Sensor Systems on Edge Electronics | Xiaomin Lin et.al. | 2409.07003 | null |
| 2024-09-11 | Brain-Inspired Stepwise Patch Merging for Vision Transformers | Yonghao Yu et.al. | 2409.06963 | null |
| 2024-09-10 | Cross-Modal Self-Supervised Learning with Effective Contrastive Units for LiDAR Point Clouds | Mu Cai et.al. | 2409.06827 | link |
| 2024-09-10 | Technical Report of Mobile Manipulator Robot for Industrial Environments | Erfan Amoozad Khalili et.al. | 2409.06693 | null |
| 2024-09-10 | A comprehensive study on Blood Cancer detection and classification using Convolutional Neural Network | Md Taimur Ahad et.al. | 2409.06689 | null |
| 2024-09-10 | When to Extract ReID Features: A Selective Approach for Improved Multiple Object Tracking | Emirhan Bayar et.al. | 2409.06617 | link |
| 2024-09-10 | Transtreaming: Adaptive Delay-aware Transformer for Real-time Streaming Perception | Xiang Zhang et.al. | 2409.06584 | null |
| 2024-09-10 | Semi-Supervised 3D Object Detection with Channel Augmentation using Transformation Equivariance | Minju Kang et.al. | 2409.06583 | null |
| 2024-09-10 | Knowledge Distillation via Query Selection for Detection Transformer | Yi Liu et.al. | 2409.06443 | null |
| 2024-09-10 | An Attribute-Enriched Dataset and Auto-Annotated Pipeline for Open Detection | Pengfei Qi et.al. | 2409.06300 | null |
| 2024-09-09 | Replay Consolidation with Label Propagation for Continual Object Detection | Riccardo De Monte et.al. | 2409.05650 | null |
| 2024-09-09 | Renormalized Connection for Scale-preferred Object Detection in Satellite Imagery | Fan Zhang et.al. | 2409.05624 | null |
| 2024-09-09 | LEROjD: Lidar Extended Radar-Only Object Detection | Patrick Palmer et.al. | 2409.05564 | link |
| 2024-09-09 | Proto-OOD: Enhancing OOD Object Detection with Prototype Feature Similarity | Junkun Chen et.al. | 2409.05466 | null |
| 2024-09-09 | Distribution Discrepancy and Feature Heterogeneity for Active 3D Object Detection | Huang-Yu Chen et.al. | 2409.05425 | null |
| 2024-09-08 | A Low-Computational Video Synopsis Framework with a Standard Dataset | Ramtin Malekpour et.al. | 2409.05230 | link |
| 2024-09-08 | Can OOD Object Detectors Learn from Foundation Models? | Jiahui Liu et.al. | 2409.05162 | link |
| 2024-09-08 | WaterSeeker: Efficient Detection of Watermarked Segments in Large Documents | Leyi Pan et.al. | 2409.05112 | null |
| 2024-09-08 | Visual Grounding with Multi-modal Conditional Adaptation | Ruilin Yao et.al. | 2409.04999 | link |
| 2024-09-08 | Multi-V2X: A Large Scale Multi-modal Multi-penetration-rate Dataset for Cooperative Perception | Rongsong Li et.al. | 2409.04980 | null |
| 2024-09-06 | Future Does Matter: Boosting 3D Object Detection with Temporal Motion Estimation in Point Cloud Sequences | Rui Yu et.al. | 2409.04390 | null |
| 2024-09-06 | UniDet3D: Multi-dataset Indoor 3D Object Detection | Maksim Kolodiazhnyi et.al. | 2409.04234 | link |
| 2024-09-06 | Feature Compression for Cloud-Edge Multimodal 3D Object Detection | Chongzhen Tian et.al. | 2409.04123 | null |
| 2024-09-06 | D4: Text-guided diffusion model-based domain adaptive data augmentation for vineyard shoot detection | Kentaro Hirahara et.al. | 2409.04060 | null |
| 2024-09-06 | BFA-YOLO: Balanced multiscale object detection network for multi-view building facade attachments detection | Yangguang Chen et.al. | 2409.04025 | null |
| 2024-09-05 | LowFormer: Hardware Efficient Design for Convolutional Transformer Backbones | Moritz Nottebaum et.al. | 2409.03460 | link |
| 2024-09-05 | Training-free Conversion of Pretrained ANNs to SNNs for Low-Power and High-Performance Applications | Tong Bu et.al. | 2409.03368 | null |
| 2024-09-05 | YOLO-PPA based Efficient Traffic Sign Detection for Cruise Control in Autonomous Driving | Jingyu Zhang et.al. | 2409.03320 | null |
| 2024-09-05 | Gr-IoU: Ground-Intersection over Union for Robust Multi-Object Tracking with 3D Geometric Constraints | Keisuke Toida et.al. | 2409.03252 | null |
| 2024-09-04 | Boundless: Generating Photorealistic Synthetic Data for Object Detection in Urban Streetscapes | Mehmet Kerem Turkcan et.al. | 2409.03022 | link |
| 2024-09-04 | Real-Time Dynamic Scale-Aware Fusion Detection Network: Take Road Damage Detection as an example | Weichao Pan et.al. | 2409.02546 | null |
| 2024-09-04 | TP-GMOT: Tracking Generic Multiple Object by Textual Prompt with Motion-Appearance Cost (MAC) SORT | Duy Le Dinh Anh et.al. | 2409.02490 | link |
| 2024-09-04 | Rapid Automatic Multiple Moving Objects Detection Method Based on Feature Extraction from Images with Non-sidereal Tracking | Lei Wang et.al. | 2409.02405 | null |
| 2024-09-04 | Pluralistic Salient Object Detection | Xuelu Feng et.al. | 2409.02368 | null |
| 2024-09-03 | Site Selection for the Second Flyeye Telescope: A Simulation Study for Optimizing Near-Earth Object Discovery | D. Föhring et.al. | 2409.02329 | null |
| 2024-09-03 | K-Origins: Better Colour Quantification for Neural Networks | Lewis Mason et.al. | 2409.02281 | null |
| 2024-09-03 | Evaluation and Comparison of Visual Language Models for Transportation Engineering Problems | Sanjita Prajapati et.al. | 2409.02278 | null |
| 2024-09-03 | A Modern Take on Visual Relationship Reasoning for Grasp Planning | Paolo Rabino et.al. | 2409.02035 | null |
| 2024-09-03 | Latent Distillation for Continual Object Detection at the Edge | Francesco Pasti et.al. | 2409.01872 | link |
| 2024-09-03 | Real-Time Indoor Object Detection based on hybrid CNN-Transformer Approach | Salah Eddine Laidoudi et.al. | 2409.01871 | null |
| 2024-08-30 | Structuring a Training Strategy to Robustify Perception Models with Realistic Image Augmentations | Ahmed Hammam et.al. | 2408.17311 | null |
| 2024-08-30 | Hybrid Classification-Regression Adaptive Loss for Dense Object Detection | Yanquan Huang et.al. | 2408.17182 | null |
| 2024-08-30 | UTrack: Multi-Object Tracking with Uncertain Detections | Edgardo Solano-Carrillo et.al. | 2408.17098 | link |
| 2024-08-30 | PIB: Prioritized Information Bottleneck Framework for Collaborative Edge Video Analytics | Zhengru Fang et.al. | 2408.17047 | null |
| 2024-08-30 | CP-VoteNet: Contrastive Prototypical VoteNet for Few-Shot Point Cloud Object Detection | Xuejing Li et.al. | 2408.17036 | null |
| 2024-08-30 | MakeWay: Object-Aware Costmaps for Proactive Indoor Navigation Using LiDAR | Binbin Xu et.al. | 2408.17034 | null |
| 2024-08-29 | Analyzing Errors in Controlled Turret System Given Target Location Input from Artificial Intelligence Methods in Automatic Target Recognition | Matthew Karlson et.al. | 2408.16923 | null |
| 2024-08-29 | Space3D-Bench: Spatial 3D Question Answering Benchmark | Emilia Szymanska et.al. | 2408.16662 | null |
| 2024-08-29 | SODAWideNet++: Combining Attention and Convolutions for Salient Object Detection | Rohit Venkata Sai Dulam et.al. | 2408.16645 | null |
| 2024-08-29 | UAV-Based Human Body Detector Selection and Fusion for Geolocated Saliency Map Generation | Piotr Rudol et.al. | 2408.16501 | null |
| 2024-08-29 | Weakly Supervised Object Detection for Automatic Tooth-marked Tongue Recognition | Yongcun Zhang et.al. | 2408.16451 | link |
| 2024-08-29 | Enhancing Sound Source Localization via False Negative Elimination | Zengjie Song et.al. | 2408.16448 | link |
| 2024-08-29 | High-yield large-scale suspended graphene membranes over closed cavities for sensor applications | Sebastian Lukas et.al. | 2408.16408 | null |
| 2024-08-29 | FA-YOLO: Research On Efficient Feature Selection YOLO Improved Algorithm Based On FMDS and AGMF Modules | Yukang Huo et.al. | 2408.16313 | null |
| 2024-08-29 | Anno-incomplete Multi-dataset Detection | Yiran Xu et.al. | 2408.16247 | null |
| 2024-08-29 | PolarBEVDet: Exploring Polar Representation for Multi-View 3D Object Detection in Bird’s-Eye-View | Zichen Yu et.al. | 2408.16200 | null |
| 2024-08-28 | ChartEye: A Deep Learning Framework for Chart Information Extraction | Osama Mustafa et.al. | 2408.16123 | null |
| 2024-08-28 | microYOLO: Towards Single-Shot Object Detection on Microcontrollers | Mark Deutel et.al. | 2408.15865 | null |
| 2024-08-28 | What is YOLOv8: An In-Depth Exploration of the Internal Features of the Next-Generation Object Detector | Muhammad Yaseen et.al. | 2408.15857 | null |
| 2024-08-28 | Network transferability of adversarial patches in real-time object detection | Jens Bayer et.al. | 2408.15833 | link |
| 2024-08-28 | Object Detection for Vehicle Dashcams using Transformers | Osama Mustafa et.al. | 2408.15809 | null |
| 2024-08-29 | RIDE: Boosting 3D Object Detection for LiDAR Point Clouds via Rotation-Invariant Analysis | Zhaoxuan Wang et.al. | 2408.15643 | null |
| 2024-08-28 | MMDRFuse: Distilled Mini-Model with Dynamic Refresh for Multi-Modality Image Fusion | Yanglin Deng et.al. | 2408.15641 | link |
| 2024-08-28 | Semantic and goal-oriented edge computing for satellite Earth Observation | Beatriz Soret et.al. | 2408.15639 | null |
| 2024-08-28 | Transfer Learning from Simulated to Real Scenes for Monocular 3D Object Detection | Sondos Mohamed et.al. | 2408.15637 | null |
| 2024-08-28 | Can Visual Language Models Replace OCR-Based Visual Question Answering Pipelines in Production? A Case Study in Retail | Bianca Lamm et.al. | 2408.15626 | null |
| 2024-08-28 | RoboSense: Large-scale Dataset and Benchmark for Multi-sensor Low-speed Autonomous Driving | Haisheng Su et.al. | 2408.15503 | null |
| 2024-08-27 | A Review of Transformer-Based Models for Computer Vision Tasks: Capturing Global Context and Spatial Relationships | Gracile Astlin Pereira et.al. | 2408.15178 | null |
| 2024-08-27 | Adapting Segment Anything Model to Multi-modal Salient Object Detection with Semantic Feature Fusion Guidance | Kunpeng Wang et.al. | 2408.15063 | null |
| 2024-08-27 | Hierarchical Graph Interaction Transformer with Dynamic Token Clustering for Camouflaged Object Detection | Siyuan Yao et.al. | 2408.15020 | link |
| 2024-08-27 | Knowledge Discovery in Optical Music Recognition: Enhancing Information Retrieval with Instance Segmentation | Elona Shatri et.al. | 2408.15002 | null |
| 2024-08-27 | BOX3D: Lightweight Camera-LiDAR Fusion for 3D Object Detection and Localization | Mario A. V. Saucedo et.al. | 2408.14941 | null |
| 2024-08-26 | PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection | Yidi Li et.al. | 2408.14600 | null |
| 2024-08-26 | A Survey of Camouflaged Object Detection and Beyond | Fengyang Xiao et.al. | 2408.14562 | null |
| 2024-08-26 | Beyond Few-shot Object Detection: A Detailed Survey | Vishal Chudasama et.al. | 2408.14249 | null |
| 2024-08-26 | TC-PDM: Temporally Consistent Patch Diffusion Models for Infrared-to-Visible Video Translation | Anh-Dzung Doan et.al. | 2408.14227 | null |
| 2024-08-26 | EMDFNet: Efficient Multi-scale and Diverse Feature Network for Traffic Sign Detection | Pengyu Li et.al. | 2408.14189 | null |
| 2024-08-26 | More Pictures Say More: Visual Intersection Network for Open Set Object Detection | Bingcheng Dong et.al. | 2408.14032 | null |
| 2024-08-25 | Bridging the Gap between Real-world and Synthetic Images for Testing Autonomous Driving Systems | Mohammad Hossein Amini et.al. | 2408.13950 | null |
| 2024-08-25 | OpenNav: Efficient Open Vocabulary 3D Object Detection for Smart Wheelchair Navigation | Muhammad Rameez ur Rahman et.al. | 2408.13936 | link |
| 2024-08-25 | Infrared Domain Adaptation with Zero-Shot Quantization | Burak Sevsay et.al. | 2408.13925 | null |
| 2024-08-25 | TraIL-Det: Transformation-Invariant Local Feature Networks for 3D LiDAR Object Detection with Unsupervised Pre-Training | Li Li et.al. | 2408.13902 | null |
| 2024-08-25 | Selectively Dilated Convolution for Accuracy-Preserving Sparse Pillar-based Embedded 3D Object Detection | Seongmin Park et.al. | 2408.13798 | null |
| 2024-08-24 | Mean Height Aided Post-Processing for Pedestrian Detection | Jing Yuan et.al. | 2408.13646 | null |
| 2024-08-23 | MCTR: Multi Camera Tracking Transformer | Alexandru Niculescu-Mizil et.al. | 2408.13243 | null |
| 2024-08-23 | DeTPP: Leveraging Object Detection for Robust Long-Horizon Event Prediction | Ivan Karpukhin et.al. | 2408.13131 | null |
| 2024-08-23 | VFM-Det: Towards High-Performance Vehicle Detection via Large Foundation Models | Wentao Wu et.al. | 2408.13031 | link |
| 2024-08-23 | Can AI Assistance Aid in the Grading of Handwritten Answer Sheets? | Pritam Sil et.al. | 2408.12870 | null |
| 2024-08-23 | Symmetric masking strategy enhances the performance of Masked Image Modeling | Khanh-Binh Nguyen et.al. | 2408.12772 | null |
| 2024-08-22 | CatFree3D: Category-agnostic 3D Object Detection with Diffusion | Wenjing Bian et.al. | 2408.12747 | null |
| 2024-08-22 | Revisiting Cross-Domain Problem for LiDAR-based 3D Object Detection | Ruixiao Zhang et.al. | 2408.12708 | null |
| 2024-08-22 | xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations | Can Qin et.al. | 2408.12590 | null |
| 2024-08-22 | Enhanced Parking Perception by Multi-Task Fisheye Cross-view Transformers | Antonyo Musabini et.al. | 2408.12575 | null |
| 2024-08-22 | Comparing YOLOv5 Variants for Vehicle Detection: A Performance Analysis | Athulya Sundaresan Geetha et.al. | 2408.12550 | null |
| 2024-08-22 | UMAD: University of Macau Anomaly Detection Benchmark Dataset | Dong Li et.al. | 2408.12527 | link |
| 2024-08-22 | Class-balanced Open-set Semi-supervised Object Detection for Medical Images | Zhanyun Lu et.al. | 2408.12355 | null |
| 2024-08-22 | OVA-DETR: Open Vocabulary Aerial Object Detection Using Image-Text Alignment and Fusion | Guoting Wei et.al. | 2408.12246 | null |
| 2024-08-22 | On the Credibility of Backdoor Attacks Against Object Detectors in the Physical World | Bao Gia Doan et.al. | 2408.12122 | null |
| 2024-08-21 | CARLA Drone: Monocular 3D Object Detection from a Different Perspective | Johannes Meier et.al. | 2408.11958 | null |
| 2024-08-21 | SBDet: A Symmetry-Breaking Object Detector via Relaxed Rotation-Equivariance | Zhiqiang Wu et.al. | 2408.11760 | null |
| 2024-08-21 | Video-to-Text Pedestrian Monitoring (VTPM): Leveraging Computer Vision and Large Language Models for Privacy-Preserve Pedestrian Activity Monitoring at Intersections | Ahmed S. Abdelrahman et.al. | 2408.11649 | null |
| 2024-08-21 | Domain-invariant Progressive Knowledge Distillation for UAV-based Object Detection | Liang Yao et.al. | 2408.11407 | null |
| 2024-08-20 | On the Potential of Open-Vocabulary Models for Object Detection in Unusual Street Scenes | Sadia Ilyas et.al. | 2408.11221 | null |
| 2024-08-20 | Quantum Inverse Contextual Vision Transformers (Q-ICVT): A New Frontier in 3D Object Detection for AVs | Sanjay Bhargav Dharavath et.al. | 2408.11207 | link |
| 2024-08-20 | A Closer Look at Data Augmentation Strategies for Finetuning-Based Low/Few-Shot Object Detection | Vladislav Li et.al. | 2408.10940 | null |
| 2024-08-20 | Aligning Object Detector Bounding Boxes with Human Preference | Ombretta Strafforello et.al. | 2408.10844 | null |
| 2024-08-20 | LightMDETR: A Lightweight Approach for Low-Cost Open-Vocabulary Object Detection Training | Binta Sow et.al. | 2408.10787 | null |
| 2024-08-20 | Just a Hint: Point-Supervised Camouflaged Object Detection | Huafeng Chen et.al. | 2408.10777 | null |
| 2024-08-21 | Generative AI in Industrial Machine Vision – A Review | Hans Aoyang Zhou et.al. | 2408.10775 | null |
| 2024-08-20 | Detection of Intracranial Hemorrhage for Trauma Patients | Antoine P. Sanner et.al. | 2408.10768 | null |
| 2024-08-20 | SAM-COD: SAM-guided Unified Framework for Weakly-Supervised Camouflaged Object Detection | Huafeng Chen et.al. | 2408.10760 | null |
| 2024-08-20 | Leveraging Temporal Contexts to Enhance Vehicle-Infrastructure Cooperative Perception | Jiaru Zhong et.al. | 2408.10531 | null |
| 2024-08-19 | Leveraging Superfluous Information in Contrastive Representation Learning | Xuechu Yu et.al. | 2408.10292 | null |
| 2024-08-19 | SHARP: Segmentation of Hands and Arms by Range using Pseudo-Depth for Enhanced Egocentric 3D Hand Pose Estimation and Action Recognition | Wiktor Mucha et.al. | 2408.10037 | null |
| 2024-08-19 | Segment-Anything Models Achieve Zero-shot Robustness in Autonomous Driving | Jun Yan et.al. | 2408.09839 | link |
| 2024-08-19 | Latent Diffusion for Guided Document Table Generation | Syed Jawwad Haider Hamdani et.al. | 2408.09800 | null |
| 2024-08-18 | Adversarial Attacked Teacher for Unsupervised Domain Adaptive Object Detection | Kaiwen Wang et.al. | 2408.09431 | null |
| 2024-08-18 | Boundary-Recovering Network for Temporal Action Detection | Jihwan Kim et.al. | 2408.09354 | null |
| 2024-08-18 | YOLOv1 to YOLOv10: The fastest and most accurate real-time object detection systems | Chien-Yao Wang et.al. | 2408.09332 | null |
| 2024-08-17 | GSLAMOT: A Tracklet and Query Graph-based Simultaneous Locating, Mapping, and Multiple Object Tracking System | Shuo Wang et.al. | 2408.09191 | null |
| 2024-08-17 | PADetBench: Towards Benchmarking Physical Attacks against Object Detection | Jiawei Lian et.al. | 2408.09181 | link |
| 2024-08-17 | MaskBEV: Towards A Unified Framework for BEV Detection and Map Segmentation | Xiao Zhao et.al. | 2408.09122 | null |
| 2024-08-17 | Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community | Jiancheng Pan et.al. | 2408.09110 | null |
| 2024-08-16 | SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation | Xinyu Xiong et.al. | 2408.08870 | link |
| 2024-08-16 | Multimodal Relational Triple Extraction with Query-based Entity Object Transformer | Lei Hei et.al. | 2408.08709 | null |
| 2024-08-16 | Tell Codec What Worth Compressing: Semantically Disentangled Image Coding for Machine with LMMs | Jinming Liu et.al. | 2408.08575 | null |
| 2024-08-15 | 5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks | Dongshuo Yin et.al. | 2408.08345 | link |
| 2024-08-15 | Learned Multimodal Compression for Autonomous Driving | Hadi Hadizadeh et.al. | 2408.08211 | null |
| 2024-08-16 | OC3D: Weakly Supervised Outdoor 3D Object Detection with Only Coarse Click Annotation | Qiming Xia et.al. | 2408.08092 | null |
| 2024-08-15 | CamoTeacher: Dual-Rotation Consistency Learning for Semi-Supervised Camouflaged Object Detection | Xunfa Lai et.al. | 2408.08050 | null |
| 2024-08-15 | Co-Fix3D: Enhancing 3D Object Detection with Collaborative Refinement | Wenxuan Li et.al. | 2408.07999 | null |
| 2024-08-15 | GOReloc: Graph-based Object-Level Relocalization for Visual SLAM | Yutong Wang et.al. | 2408.07917 | link |
| 2024-08-14 | See It All: Contextualized Late Aggregation for 3D Dense Captioning | Minjung Kim et.al. | 2408.07648 | null |
| 2024-08-14 | Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving | Yuqing Wen et.al. | 2408.07605 | null |
| 2024-08-14 | Infra-YOLO: Efficient Neural Network Structure with Model Compression for Real-Time Infrared Small Object Detection | Zhonglin Chen et.al. | 2408.07455 | null |
| 2024-08-14 | Sign language recognition based on deep learning and low-cost handcrafted descriptors | Alvaro Leandro Cavalcante Carneiro et.al. | 2408.07244 | link |
| 2024-08-13 | Vision Language Model for Interpretable and Fine-grained Detection of Safety Compliance in Diverse Workplaces | Zhiling Chen et.al. | 2408.07146 | null |
| 2024-08-13 | Divide and Conquer: Improving Multi-Camera 3D Perception with 2D Semantic-Depth Priors and Input-Dependent Queries | Qi Song et.al. | 2408.06901 | null |
| 2024-08-13 | Integrating Saliency Ranking and Reinforcement Learning for Enhanced Object Detection | Matthias Bartolo et.al. | 2408.06803 | link |
| 2024-08-13 | Exploring Domain Shift on Radar-Based 3D Object Detection Amidst Diverse Environmental Conditions | Miao Zhang et.al. | 2408.06772 | null |
| 2024-08-13 | Unified-IoU: For High-Quality Object Detection | Xiangjie Luo et.al. | 2408.06636 | link |
| 2024-08-13 | A lightweight YOLOv5-FFM model for occlusion pedestrian detection | Xiangjie Luo et.al. | 2408.06633 | null |
| 2024-08-13 | MV-DETR: Multi-modality indoor object detection by Multi-View DEtection TRansformers | Zichao Dong et.al. | 2408.06604 | null |
| 2024-08-12 | Latent Disentanglement for Low Light Image Enhancement | Zhihao Zheng et.al. | 2408.06245 | null |
| 2024-08-12 | MR3D-Net: Dynamic Multi-Resolution 3D Sparse Voxel Grid Fusion for LiDAR-Based Collective Perception | Sven Teufel et.al. | 2408.06137 | link |
| 2024-08-12 | DPDETR: Decoupled Position Detection Transformer for Infrared-Visible Object Detection | Junjie Guo et.al. | 2408.06123 | null |
| 2024-08-12 | Optimizing Vision Transformers with Data-Free Knowledge Transfer | Gousia Habib et.al. | 2408.05952 | null |
| 2024-08-12 | MV2DFusion: Leveraging Modality-Specific Object Semantics for Multi-Modal 3D Detection | Zitian Wang et.al. | 2408.05945 | null |
| 2024-08-12 | Multi-scale Contrastive Adaptor Learning for Segmenting Anything in Underperformed Scenes | Ke Zhou et.al. | 2408.05936 | null |
| 2024-08-12 | Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts | Peng Wu et.al. | 2408.05905 | null |
| 2024-08-12 | Toward Pedestrian Head Tracking: A Benchmark Dataset and an Information Fusion Network | Kailai Sun et.al. | 2408.05877 | null |
| 2024-08-11 | U-DECN: End-to-End Underwater Object Detection ConvNet with Improved DeNoising Training | Zhuoyan Liu et.al. | 2408.05780 | link |
| 2024-08-11 | FADE: A Dataset for Detecting Falling Objects around Buildings in Video | Zhigang Tu et.al. | 2408.05750 | null |
| 2024-08-09 | DeepInteraction++: Multi-Modality Interaction for Autonomous Driving | Zeyu Yang et.al. | 2408.05075 | link |
| 2024-08-09 | RadarPillars: Efficient Object Detection from 4D Radar Point Clouds | Alexander Musiat et.al. | 2408.05020 | null |
| 2024-08-09 | Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation | Yifan Feng et.al. | 2408.04804 | link |
| 2024-08-08 | SOD-YOLOv8 – Enhancing YOLOv8 for Small Object Detection in Traffic Scenes | Boshra Khalili et.al. | 2408.04786 | null |
| 2024-08-08 | Data-Driven Pixel Control: Challenges and Prospects | Saurabh Farkya et.al. | 2408.04767 | null |
| 2024-08-10 | SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More | Tianrun Chen et.al. | 2408.04579 | null |
| 2024-08-07 | Impact Analysis of Data Drift Towards The Development of Safety-Critical Automotive System | Md Shahi Amran Hossain et.al. | 2408.04476 | null |
| 2024-08-08 | Detecting Car Speed using Object Detection and Depth Estimation: A Deep Learning Framework | Subhasis Dasgupta et.al. | 2408.04360 | null |
| 2024-08-08 | Multi-Scale and Detail-Enhanced Segment Anything Model for Salient Object Detection | Shixuan Gao et.al. | 2408.04326 | null |
| 2024-08-08 | LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection | Mervat Abassy et.al. | 2408.04284 | null |
| 2024-08-08 | Learning to Rewrite: Generalized LLM-Generated Text Detection | Wei Hao et.al. | 2408.04237 | null |
| 2024-08-07 | PaveCap: The First Multimodal Framework for Comprehensive Pavement Condition Assessment with Dense Captioning and PCI Estimation | Blessing Agyei Kyem et.al. | 2408.04110 | link |
| 2024-08-07 | Vision-Language Guidance for LiDAR-based Unsupervised 3D Object Detection | Christian Fruhwirth-Reisinger et.al. | 2408.03790 | null |
| 2024-08-07 | Data Generation Scheme for Thermal Modality with Edge-Guided Adversarial Conditional Diffusion Model | Guoqing Zhu et.al. | 2408.03748 | link |
| 2024-08-07 | CAS-ViT: Convolutional Additive Self-attention Vision Transformers for Efficient Mobile Applications | Tianfang Zhang et.al. | 2408.03703 | link |
| 2024-08-07 | L4DR: LiDAR-4DRadar Fusion for Weather-Robust 3D Object Detection | Xun Huang et.al. | 2408.03677 | null |
| 2024-08-07 | Designing Extremely Memory-Efficient CNNs for On-device Vision Tasks | Jaewook Lee et.al. | 2408.03663 | null |
| 2024-08-07 | Leveraging LLMs for Enhanced Open-Vocabulary 3D Scene Understanding in Autonomous Driving | Amirhosein Chahe et.al. | 2408.03516 | null |
| 2024-08-07 | GUI Element Detection Using SOTA YOLO Deep Learning Models | Seyed Shayan Daneshvar et.al. | 2408.03507 | null |
| 2024-08-06 | AI Foundation Models in Remote Sensing: A Survey | Siqi Lu et.al. | 2408.03464 | null |
| 2024-08-06 | Biomedical Image Segmentation: A Systematic Literature Review of Deep Learning Based Object Detection Methods | Fazli Wahid et.al. | 2408.03393 | null |
| 2024-08-06 | Nighttime Pedestrian Detection Based on Fore-Background Contrast Learning | He Yao et.al. | 2408.03030 | null |
| 2024-08-06 | Diverse Generation while Maintaining Semantic Coordination: A Diffusion-Based Data Augmentation Method for Object Detection | Sen Nie et.al. | 2408.02891 | null |
| 2024-08-05 | HQOD: Harmonious Quantization for Object Detection | Long Huang et.al. | 2408.02561 | null |
| 2024-08-05 | Tensorial template matching for fast cross-correlation with rotations and its application for tomography | Antonio Martinez-Sanchez et.al. | 2408.02398 | null |
| 2024-08-05 | Mixture-of-Noises Enhanced Forgery-Aware Predictor for Multi-Face Manipulation Detection and Localization | Changtao Miao et.al. | 2408.02306 | null |
| 2024-08-05 | AssemAI: Interpretable Image-Based Anomaly Detection for Manufacturing Pipelines | Renjith Prasad et.al. | 2408.02181 | null |
| 2024-08-04 | KAN-RCBEVDepth: A multi-modal fusion algorithm in object detection for autonomous driving | Zhihao Lai et.al. | 2408.02088 | null |
| 2024-08-06 | A Survey and Evaluation of Adversarial Attacks for Object Detection | Khoi Nguyen Tiet Nguyen et.al. | 2408.01934 | null |
| 2024-08-04 | CAF-YOLO: A Robust Framework for Multi-Scale Lesion Detection in Biomedical Imagery | Zilin Chen et.al. | 2408.01897 | null |
| 2024-08-03 | Supervised Image Translation from Visible to Infrared Domain for Object Detection | Prahlad Anand et.al. | 2408.01843 | null |
| 2024-08-03 | Domain penalisation for improved Out-of-Distribution Generalisation | Shuvam Jena et.al. | 2408.01746 | null |
| 2024-08-03 | LAM3D: Leveraging Attention for Monocular 3D Object Detection | Diana-Alexandra Sas et.al. | 2408.01739 | null |
| 2024-08-02 | A Robotics-Inspired Scanpath Model Reveals the Importance of Uncertainty and Semantic Object Cues for Gaze Guidance in Dynamic Scenes | Vito Mengers et.al. | 2408.01322 | null |
| 2024-08-02 | Underwater Object Detection Enhancement via Channel Stabilization | Muhammad Ali et.al. | 2408.01293 | null |
| 2024-08-02 | PGNeXt: High-Resolution Salient Object Detection via Pyramid Grafting Network | Changqun Xia et.al. | 2408.01137 | null |
| 2024-08-02 | Effect of Fog Particle Size Distribution on 3D Object Detection Under Adverse Weather Conditions | Ajinkya Shinde et.al. | 2408.01085 | null |
| 2024-08-02 | Boosting Gaze Object Prediction via Pixel-level Supervision from Vision Foundation Model | Yang Jin et.al. | 2408.01044 | null |
| 2024-08-02 | MambaST: A Plug-and-Play Cross-Spectral Spatial-Temporal Fuser for Efficient Pedestrian Detection | Xiangbo Gao et.al. | 2408.01037 | null |
| 2024-08-02 | Visible-Thermal Multiple Object Tracking: Large-scale Video Dataset and Progressive Fusion Approach | Yabin Zhu et.al. | 2408.00969 | null |
| 2024-08-01 | Joint Neural Networks for One-shot Object Recognition and Detection | Camilo J. Vargas et.al. | 2408.00701 | null |
| 2024-08-01 | Harnessing Uncertainty-aware Bounding Boxes for Unsupervised 3D Object Detection | Ruiyang Zhang et.al. | 2408.00619 | null |
| 2024-08-01 | U2UData: A Large-scale Cooperative Perception Dataset for Swarm UAVs Autonomous Flight | Tongtong Feng et.al. | 2408.00606 | null |
| 2024-08-01 | MUFASA: Multi-View Fusion and Adaptation Network with Spatial Awareness for Radar Object Detection | Xiangyuan Peng et.al. | 2408.00565 | null |
| 2024-08-01 | Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval | Gangyan Zeng et.al. | 2408.00441 | null |
| 2024-08-01 | MonoMM: A Multi-scale Mamba-Enhanced Network for Real-time Monocular 3D Object Detection | Youjia Fu et.al. | 2408.00438 | null |
| 2024-08-01 | DNTextSpotter: Arbitrary-Shaped Scene Text Spotting via Improved Denoising Training | Yu Xie et.al. | 2408.00355 | null |
| 2024-08-01 | A Simple Background Augmentation Method for Object Detection with Diffusion Model | Yuhang Li et.al. | 2408.00350 | null |
| 2024-08-01 | Diff3DETR: Agent-based Diffusion Model for Semi-supervised 3D Object Detection | Jiacheng Deng et.al. | 2408.00286 | null |
| 2024-08-01 | RoCo: Robust Collaborative Perception By Iterative Object Matching and Pose Adjustment | Zhe Huang et.al. | 2408.00257 | null |
| 2024-07-31 | Dynamic Object Queries for Transformer-based Incremental Object Detection | Jichuan Zhang et.al. | 2407.21687 | null |
| 2024-07-31 | Spatial Transformer Network YOLO Model for Agricultural Object Detection | Yash Zambre et.al. | 2407.21652 | null |
| 2024-07-31 | Evaluating SAM2’s Role in Camouflaged Object Detection: From SAM to SAM2 | Lv Tang et.al. | 2407.21596 | null |
| 2024-07-31 | InScope: A New Real-world 3D Infrastructure-side Collaborative Perception Dataset for Open Traffic Scenarios | Xiaofei Zhang et.al. | 2407.21581 | null |
| 2024-07-31 | Voxel Scene Graph for Intracranial Hemorrhage | Antoine P. Sanner et.al. | 2407.21580 | null |
| 2024-07-31 | MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection | Kuo Wang et.al. | 2407.21465 | null |
| 2024-07-31 | Generalized Tampered Scene Text Detection in the era of Generative AI | Chenfan Qu et.al. | 2407.21422 | null |
| 2024-07-30 | Candidate Distant Trans-Neptunian Objects Detected by the New Horizons Subaru TNO Survey | Wesley C. Fraser et.al. | 2407.21142 | null |
| 2024-07-30 | What is YOLOv5: A deep look into the internal features of the popular object detector | Rahima Khanam et.al. | 2407.20892 | null |
| 2024-07-30 | WARM-3D: A Weakly-Supervised Sim2Real Domain Adaptation Framework for Roadside Monocular 3D Object Detection | Xingcheng Zhou et.al. | 2407.20818 | null |
| 2024-07-31 | Integer-Valued Training and Spike-Driven Inference Spiking Neural Network for High-performance and Energy-efficient Object Detection | Xinhao Luo et.al. | 2407.20708 | link |
| 2024-07-29 | Uncertainty-Rectified YOLO-SAM for Weakly Supervised ICH Segmentation | Pascal Spiegler et.al. | 2407.20461 | null |
| 2024-07-29 | MEVDT: Multi-Modal Event-Based Vehicle Detection and Tracking Dataset | Zaid A. El Shair et.al. | 2407.20446 | null |
| 2024-07-30 | AxiomVision: Accuracy-Guaranteed Adaptive Visual Model Selection for Perspective-Aware Video Analytics | Xiangxiang Dai et.al. | 2407.20124 | link |
| 2024-07-29 | Octave-YOLO: Cross frequency detection network with octave convolution | Sangjune Shin et.al. | 2407.19746 | null |
| 2024-07-29 | Cross-Layer Feature Pyramid Transformer for Small Object Detection in Aerial Images | Zewen Du et.al. | 2407.19696 | null |
| 2024-07-29 | Practical Video Object Detection via Feature Selection and Aggregation | Yuheng Shi et.al. | 2407.19650 | link |
| 2024-07-28 | Solving Short-Term Relocalization Problems In Monocular Keyframe Visual SLAM Using Spatial And Semantic Data | Azmyin Md. Kamal et.al. | 2407.19518 | link |
| 2024-07-28 | Depth-Wise Convolutions in Vision Transformers for Efficient Training on Small Datasets | Tianxiao Zhang et.al. | 2407.19394 | link |
| 2024-07-27 | Sewer Image Super-Resolution with Depth Priors and Its Lightweight Network | Gang Pan et.al. | 2407.19271 | null |
| 2024-07-27 | Enhancing Tree Type Detection in Forest Fire Risk Assessment: Multi-Stage Approach and Color Encoding with Forest Fire Risk Evaluation Framework for UAV Imagery | Jinda Zhang et.al. | 2407.19184 | null |
| 2024-07-27 | Reducing Spurious Correlation for Federated Domain Generalization | Shuran Ma et.al. | 2407.19174 | null |
| 2024-07-27 | Robust Multimodal 3D Object Detection via Modality-Agnostic Decoding and Proximity-based Modality Ensemble | Juhan Cha et.al. | 2407.19156 | link |
| 2024-07-26 | Local Binary Pattern (LBP) Optimization for Feature Extraction | Zeinab Sedaghatjoo et.al. | 2407.18665 | null |
| 2024-07-25 | LION: Linear Group RNN for 3D Object Detection in Point Clouds | Zhe Liu et.al. | 2407.18232 | link |
| 2024-07-25 | XS-VID: An Extremely Small Video Object Detection Dataset | Jiahao Guo et.al. | 2407.18137 | null |
| 2024-07-25 | SaccadeDet: A Novel Dual-Stage Architecture for Rapid and Accurate Detection in Gigapixel Images | Wenxi Li et.al. | 2407.17956 | null |
| 2024-07-25 | A Novel Perception Entropy Metric for Optimizing Vehicle Perception with LiDAR Deployment | Yongjiang He et.al. | 2407.17942 | null |
| 2024-07-25 | Hierarchical Object Detection and Recognition Framework for Practical Plant Disease Diagnosis | Kohei Iwano et.al. | 2407.17906 | null |
| 2024-07-25 | Advancing 3D Point Cloud Understanding through Deep Transfer Learning: A Comprehensive Survey | Shahab Saquib Sohail et.al. | 2407.17877 | null |
| 2024-07-25 | Enhancing Fine-grained Object Detection in Aerial Images via Orthogonal Mapping | Haoran Zhu et.al. | 2407.17738 | link |
| 2024-07-26 | Unsqueeze [CLS] Bottleneck to Learn Rich Representations | Qing Su et.al. | 2407.17671 | link |
| 2024-07-24 | SDLNet: Statistical Deep Learning Network for Co-Occurring Object Detection and Identification | Binay Kumar Singh et.al. | 2407.17664 | null |
| 2024-07-24 | PEEKABOO: Hiding parts of an image for unsupervised object localization | Hasib Zunair et.al. | 2407.17628 | link |
| 2024-07-24 | ALPI: Auto-Labeller with Proxy Injection for 3D Object Detection using 2D Labels Only | Saad Lahlali et.al. | 2407.17197 | null |
| 2024-07-24 | DVPE: Divided View Position Embedding for Multi-View 3D Object Detection | Jiasen Wang et.al. | 2407.16955 | link |
| 2024-07-23 | What Matters in Range View 3D Object Detection | Benjamin Wilson et.al. | 2407.16789 | link |
| 2024-07-23 | A Framework for Pupil Tracking with Event Cameras | Khadija Iddrisu et.al. | 2407.16665 | null |
| 2024-07-24 | Velocity Driven Vision: Asynchronous Sensor Fusion Birds Eye View Models for Autonomous Vehicles | Seamie Hayes et.al. | 2407.16636 | null |
| 2024-07-23 | COALA: A Practical and Vision-Centric Federated Learning Platform | Weiming Zhuang et.al. | 2407.16560 | link |
| 2024-07-23 | Dynamic Retraining-Updating Mean Teacher for Source-Free Object Detection | Trinh Le Ba Khanh et.al. | 2407.16497 | link |
| 2024-07-23 | MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection | Youngmin Oh et.al. | 2407.16448 | link |
| 2024-07-23 | ESOD: Efficient Small Object Detection on High-Resolution Images | Kai Liu et.al. | 2407.16424 | null |
| 2024-07-23 | Understanding Impacts of Electromagnetic Signal Injection Attacks on Object Detection | Youqian Zhang et.al. | 2407.16327 | null |
| 2024-07-23 | DeepClean: Integrated Distortion Identification and Algorithm Selection for Rectifying Image Corruptions | Aditya Kapoor et.al. | 2407.16302 | null |
| 2024-07-23 | FoRA: Low-Rank Adaptation Model beyond Multimodal Siamese Network | Weiying Xie et.al. | 2407.16129 | link |
| 2024-07-22 | PLayerTV: Advanced Player Tracking and Identification for Automatic Soccer Highlight Clips | Håkon Maric Solberg et.al. | 2407.16076 | null |
| 2024-07-22 | Disentangling spatio-temporal knowledge for weakly supervised object detection and segmentation in surgical video | Guiqiu Liao et.al. | 2407.15794 | null |
| 2024-07-22 | Towards Open-World Object-based Anomaly Detection via Self-Supervised Outlier Synthesis | Brian K. S. Isaac-Medina et.al. | 2407.15763 | null |
| 2024-07-22 | Counter Turing Test ($CT^2$): Investigating AI-Generated Text Detection for Hindi – Ranking LLMs based on Hindi AI Detectability Index ($ADI_{hi}$) | Ishan Kavathekar et.al. | 2407.15694 | null |
| 2024-07-22 | YOLOv10 for Automated Fracture Detection in Pediatric Wrist Trauma X-rays | Ammar Ahmed et.al. | 2407.15689 | link |
| 2024-07-22 | SS-SFR: Synthetic Scenes Spatial Frequency Response on Virtual KITTI and Degraded Automotive Simulations for Object Detection | Daniel Jakab et.al. | 2407.15646 | null |
| 2024-07-22 | YOLO-pdd: A Novel Multi-scale PCB Defect Detection Method Using Deep Representations with Sequential Images | Bowen Liu et.al. | 2407.15427 | null |
| 2024-07-22 | Learning High-resolution Vector Representation from Multi-Camera Images for 3D Object Detection | Zhili Chen et.al. | 2407.15354 | null |
| 2024-07-22 | Explore the LiDAR-Camera Dynamic Adjustment Fusion for 3D Object Detection | Yiran Yang et.al. | 2407.15334 | null |
| 2024-07-21 | Weak-to-Strong Compositional Learning from Generative Models for Language-based Object Detection | Kwanyong Park et.al. | 2407.15296 | null |
| 2024-07-21 | Multiple Object Detection and Tracking in Panoramic Videos for Cycling Safety Analysis | Jingwei Guo et.al. | 2407.15199 | null |
| 2024-07-19 | Enhancing Layout Hotspot Detection Efficiency with YOLOv8 and PCA-Guided Augmentation | Dongyang Wu et.al. | 2407.14498 | null |
| 2024-07-19 | MLMT-CNN for Object Detection and Segmentation in Multi-layer and Multi-spectral Images | Majedaldein Almahasneh et.al. | 2407.14473 | null |
| 2024-07-19 | EmoCAM: Toward Understanding What Drives CNN-based Emotion Recognition | Youssef Doulfoukar et.al. | 2407.14314 | null |
| 2024-07-19 | Bucketed Ranking-based Losses for Efficient Training of Object Detectors | Feyza Yavuz et.al. | 2407.14204 | link |
| 2024-07-19 | Visual Text Generation in the Wild | Yuanzhi Zhu et.al. | 2407.14138 | link |
| 2024-07-18 | GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model | Abdelrahman Shaker et.al. | 2407.13772 | link |
| 2024-07-18 | General Geometry-aware Weakly Supervised 3D Object Detection | Guowen Zhang et.al. | 2407.13748 | link |
| 2024-07-18 | Enhancing Source-Free Domain Adaptive Object Detection with Low-confidence Pseudo Label Distillation | Ilhoon Yoon et.al. | 2407.13524 | link |
| 2024-07-18 | The use of the symmetric finite difference in the local binary pattern (symmetric LBP) | Zeinab Sedaghatjoo et.al. | 2407.13178 | null |
| 2024-07-18 | Learning Camouflaged Object Detection from Noisy Pseudo Label | Jin Zhang et.al. | 2407.13157 | null |
| 2024-07-18 | DFMSD: Dual Feature Masking Stage-wise Knowledge Distillation for Object Detection | Zhourui Zhang et.al. | 2407.13147 | null |
| 2024-07-18 | FocusDiffuser: Perceiving Local Disparities for Camouflaged Object Detection | Jianwei Zhao et.al. | 2407.13133 | null |
| 2024-07-17 | AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer | Zhuguanyu Wu et.al. | 2407.12951 | link |
| 2024-07-17 | Toward INT4 Fixed-Point Training via Exploring Quantization Error for Gradients | Dohyung Kim et.al. | 2407.12637 | null |
| 2024-07-17 | CerberusDet: Unified Multi-Task Object Detection | Irina Tolstykh et.al. | 2407.12632 | link |
| 2024-07-17 | Weighting Pseudo-Labels via High-Activation Feature Index Similarity and Object Detection for Semi-Supervised Segmentation | Prantik Howlader et.al. | 2407.12630 | link |
| 2024-07-17 | Enhancing Wrist Abnormality Detection with YOLO: Analysis of State-of-the-art Single-stage Detection Models | Ammar Ahmed et.al. | 2407.12597 | link |
| 2024-07-17 | Embracing Events and Frames with Hierarchical Feature Refinement Network for Object Detection | Hu Cao et.al. | 2407.12582 | null |
| 2024-07-17 | Close the Sim2real Gap via Physically-based Structured Light Synthetic Data Simulation | Kaixin Bai et.al. | 2407.12449 | null |
| 2024-07-17 | GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval | Han Zhou et.al. | 2407.12431 | link |
| 2024-07-17 | Exploring Deeper! Segment Anything Model with Depth Perception for Camouflaged Object Detection | Zhenni Yu et.al. | 2407.12339 | null |
| 2024-07-16 | AFIDAF: Alternating Fourier and Image Domain Adaptive Filters as an Efficient Alternative to Attention in ViTs | Yunling Zheng et.al. | 2407.12217 | null |
| 2024-07-16 | The object detection method aids in image reconstruction evaluation and clinical interpretation of meniscal abnormalities | Natalia Konovalova et.al. | 2407.12184 | null |
| 2024-07-16 | A Case for Application-Aware Space Radiation Tolerance in Orbital Computing | Meiqi Wang et.al. | 2407.11853 | null |
| 2024-07-16 | Improving Unsupervised Video Object Segmentation via Fake Flow Generation | Suhwan Cho et.al. | 2407.11714 | link |
| 2024-07-16 | Relation DETR: Exploring Explicit Position Relation Prior for Object Detection | Xiuquan Hou et.al. | 2407.11699 | link |
| 2024-07-16 | Bridge Past and Future: Overcoming Information Asymmetry in Incremental Object Detection | Qijie Mo et.al. | 2407.11499 | null |
| 2024-07-16 | Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes | Zhi Cai et.al. | 2407.11464 | link |
| 2024-07-16 | Generative AI Driven Task-Oriented Adaptive Semantic Communications | Yuzhou Fu et.al. | 2407.11354 | null |
| 2024-07-16 | LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction | Penghui Du et.al. | 2407.11335 | link |
| 2024-07-16 | TCFormer: Visual Recognition via Token Clustering Transformer | Wang Zeng et.al. | 2407.11321 | link |
| 2024-07-16 | PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer | Pierre-David Letourneau et.al. | 2407.11306 | null |
| 2024-07-15 | OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models | Zijian Zhou et.al. | 2407.11213 | link |
| 2024-07-15 | Interpreting Hand gestures using Object Detection and Digits Classification | Sangeetha K et.al. | 2407.10902 | null |
| 2024-07-15 | RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception | Chunliang Li et.al. | 2407.10876 | link |
| 2024-07-15 | OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection | Jinghua Hou et.al. | 2407.10753 | link |
| 2024-07-15 | Anticipating Future Object Compositions without Forgetting | Youssef Zahran et.al. | 2407.10723 | null |
| 2024-07-15 | OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer | Yu Wang et.al. | 2407.10655 | link |
| 2024-07-15 | Backdoor Attacks against Image-to-Image Networks | Wenbo Jiang et.al. | 2407.10445 | null |
| 2024-07-14 | Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data | Tuo Feng et.al. | 2407.10200 | link |
| 2024-07-14 | LabelDistill: Label-guided Cross-modal Knowledge Distillation for Camera-based 3D Object Detection | Sanmin Kim et.al. | 2407.10164 | link |
| 2024-07-14 | FSD-BEV: Foreground Self-Distillation for Multi-view 3D Object Detection | Zheng Jiang et.al. | 2407.10135 | null |
| 2024-07-14 | When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset | Yi Zhang et.al. | 2407.10125 | null |
| 2024-07-12 | DART: An Automated End-to-End Object Detection Pipeline with Data Diversification, Open-Vocabulary Bounding Box Annotation, Pseudo-Label Review, and Model Training | Chen Xin et.al. | 2407.09174 | link |
| 2024-07-12 | Open Vocabulary Multi-Label Video Classification | Rohit Gupta et.al. | 2407.09073 | null |
| 2024-07-12 | DroneMOT: Drone-based Multi-Object Tracking Considering Detection Difficulties and Simultaneous Moving of Drones and Objects | Peng Wang et.al. | 2407.09051 | null |
| 2024-07-12 | Task-driven single-image super-resolution reconstruction of document scans | Maciej Zyrek et.al. | 2407.08993 | null |
| 2024-07-11 | OmniNOCS: A unified NOCS dataset and model for 3D lifting of 2D objects | Akshay Krishnan et.al. | 2407.08711 | null |
| 2024-07-11 | Approaching Outside: Scaling Unsupervised 3D Object Detection from 2D Scene | Ruiyang Zhang et.al. | 2407.08569 | link |
| 2024-07-11 | Projecting Points to Axes: Oriented Object Detection via Point-Axis Representation | Zeyang Zhao et.al. | 2407.08489 | link |
| 2024-07-11 | Semi-Supervised Object Detection: A Survey on Progress from CNN to Transformer | Tahira Shehzadi et.al. | 2407.08460 | null |
| 2024-07-11 | PowerYOLO: Mixed Precision Model for Hardware Efficient Object Detection with Event Data | Dominika Przewlocka-Rus et.al. | 2407.08272 | null |
| 2024-07-11 | Knowledge distillation to effectively attain both region-of-interest and global semantics from an image where multiple objects appear | Seonwhee Jin et.al. | 2407.08257 | link |
| 2024-07-11 | Enrich the content of the image Using Context-Aware Copy Paste | Qiushi Guo et.al. | 2407.08151 | null |
| 2024-07-11 | DMM: Disparity-guided Multispectral Mamba for Oriented Object Detection in Remote Sensing | Minghang Zhou et.al. | 2407.08132 | null |
| 2024-07-10 | MambaVision: A Hybrid Mamba-Transformer Vision Backbone | Ali Hatamizadeh et.al. | 2407.08083 | link |
| 2024-07-10 | Bayesian Detector Combination for Object Detection with Crowdsourced Annotations | Zhi Qin Tan et.al. | 2407.07958 | link |
| 2024-07-10 | Cross Domain Object Detection via Multi-Granularity Confidence Alignment based Mean Teacher | Jiangming Chen et.al. | 2407.07780 | null |
| 2024-07-10 | LSM: A Comprehensive Metric for Assessing the Safety of Lane Detection Systems in Autonomous Driving | Jörg Gamerdinger et.al. | 2407.07740 | null |
| 2024-07-10 | Few-Shot Domain Adaptive Object Detection for Microscopic Images | Sumayya Inayat et.al. | 2407.07633 | null |
| 2024-07-10 | Simplifying Source-Free Domain Adaptation for Object Detection: Effective Self-Training Strategies and Performance Insights | Yan Hao et.al. | 2407.07586 | link |
| 2024-07-09 | Exploring Camera Encoder Designs for Autonomous Driving Perception | Barath Lakshmanan et.al. | 2407.07276 | null |
| 2024-07-09 | ConvNLP: Image-based AI Text Detection | Suriya Prakash Jambunathan et.al. | 2407.07225 | null |
| 2024-07-09 | Category-level Object Detection, Pose Estimation and Reconstruction from Stereo Images | Chuanrui Zhang et.al. | 2407.06984 | null |
| 2024-07-09 | Cue Point Estimation using Object Detection | Giulia Argüello et.al. | 2407.06823 | link |
| 2024-07-09 | CoLA: Conditional Dropout and Language-driven Robust Dual-modal Salient Object Detection | Shuang Hao et.al. | 2407.06780 | link |
| 2024-07-09 | Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions | Yu-Guan Hsieh et.al. | 2407.06723 | null |
| 2024-07-08 | Stochastic Traveling Salesperson Problem with Neighborhoods for Object Detection | Cheng Peng et.al. | 2407.06366 | null |
| 2024-07-08 | GeoWATCH for Detecting Heavy Construction in Heterogeneous Time Series of Satellite Images | Jon Crall et.al. | 2407.06337 | null |
| 2024-07-08 | Multi-clue Consistency Learning to Bridge Gaps Between General and Oriented Object in Semi-supervised Detection | Chenxu Wang et.al. | 2407.05909 | link |
| 2024-07-08 | Boosting 3D Object Detection with Semantic-Aware Multi-Branch Framework | Hao Jing et.al. | 2407.05769 | null |
| 2024-07-08 | Short-term Object Interaction Anticipation with Disentangled Object Detection @ Ego4D Short Term Object Interaction Anticipation Challenge | Hyunjin Cho et.al. | 2407.05713 | link |
| 2024-07-08 | Weakly Supervised Test-Time Domain Adaptation for Object Detection | Anh-Dzung Doan et.al. | 2407.05607 | null |
| 2024-07-08 | Towards Reflected Object Detection: A Benchmark | Zhongtian Wang et.al. | 2407.05575 | null |
| 2024-07-08 | GMC: A General Framework of Multi-stage Context Learning and Utilization for Visual Detection Tasks | Xuan Wang et.al. | 2407.05566 | null |
| 2024-07-07 | CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs | Akshat Ramachandran et.al. | 2407.05266 | link |
| 2024-07-07 | Unlocking Textual and Visual Wisdom: Open-Vocabulary 3D Object Detection Enhanced by Comprehensive Guidance from Text and Image | Pengkun Jiao et.al. | 2407.05256 | null |
| 2024-07-06 | SCSA: Exploring the Synergistic Effects Between Spatial and Channel Attention | Yunzhong Si et.al. | 2407.05128 | null |
| 2024-07-06 | Quantizing YOLOv7: A Comprehensive Study | Mohammadamin Baghbanbashi et.al. | 2407.04943 | null |
| 2024-07-05 | SH17: A Dataset for Human Safety and Personal Protective Equipment Detection in Manufacturing Industry | Hafiz Mughees Ahmad et.al. | 2407.04590 | link |
| 2024-07-05 | Optimizing the image correction pipeline for pedestrian detection in the thermal-infrared domain | Christophe Karam et.al. | 2407.04484 | null |
| 2024-07-05 | Multi-Branch Auxiliary Fusion YOLO with Re-parameterization Heterogeneous Convolutional for accurate object detection | Zhiqiang Yang et.al. | 2407.04381 | link |
| 2024-07-05 | Towards Stable 3D Object Detection | Jiabao Wang et.al. | 2407.04305 | null |
| 2024-07-05 | Research, Applications and Prospects of Event-Based Pedestrian Detection: A Survey | Han Wang et.al. | 2407.04277 | null |
| 2024-07-04 | LiDAR-based Real-Time Object Detection and Tracking in Dynamic Environments | Wenqiang Du et.al. | 2407.04115 | null |
| 2024-07-04 | FIPGNet: Pyramid grafting network with feature interaction strategies | Ziyi Ding et.al. | 2407.04085 | null |
| 2024-07-04 | Detect Closer Surfaces that can be Seen: New Modeling and Evaluation in Cross-domain 3D Object Detection | Ruixiao Zhang et.al. | 2407.04061 | null |
| 2024-07-04 | The Solution for the GAIIC2024 RGB-TIR object detection Challenge | Xiangyu Wu et.al. | 2407.03872 | null |
| 2024-07-04 | StreamLTS: Query-based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection | Yunshuang Yuan et.al. | 2407.03825 | null |
| 2024-07-03 | Visual Grounding with Attention-Driven Constraint Balancing | Weitai Kang et.al. | 2407.03243 | null |
| 2024-07-03 | Category-Aware Dynamic Label Assignment with High-Quality Oriented Proposal | Mingkui Feng et.al. | 2407.03205 | null |
| 2024-07-03 | SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding | Weitai Kang et.al. | 2407.03200 | link |
| 2024-07-03 | Global Context Modeling in YOLOv8 for Pediatric Wrist Fracture Detection | Rui-Yang Ju et.al. | 2407.03163 | link |
| 2024-07-03 | YOLOv5, YOLOv8 and YOLOv10: The Go-To Detectors for Real-time Vision | Muhammad Hussain et.al. | 2407.02988 | null |
| 2024-07-03 | Mast Kalandar at SemEval-2024 Task 8: On the Trail of Textual Origins: RoBERTa-BiLSTM Approach to Detect AI-Generated Text | Jainit Sushil Bafna et.al. | 2407.02978 | null |
| 2024-07-03 | A Pairwise DomMix Attentive Adversarial Network for Unsupervised Domain Adaptive Object Detection | Jie Shao et.al. | 2407.02835 | null |
| 2024-07-03 | ADFQ-ViT: Activation-Distribution-Friendly Post-Training Quantization for Vision Transformers | Yanfeng Jiang et.al. | 2407.02763 | null |
| 2024-07-02 | SMILe: Leveraging Submodular Mutual Information For Robust Few-Shot Object Detection | Anay Majee et.al. | 2407.02665 | null |
| 2024-07-02 | Robust ADAS: Enhancing Robustness of Machine Learning-based Advanced Driver Assistance Systems for Adverse Weather | Muhammad Zaeem Shahzad et.al. | 2407.02581 | null |
| 2024-07-02 | Similarity Distance-Based Label Assignment for Tiny Object Detection | Shuohao Shi et.al. | 2407.02394 | link |
| 2024-07-02 | OpenSlot: Mixed Open-set Recognition with Object-centric Learning | Xu Yin et.al. | 2407.02386 | null |
| 2024-07-02 | DM3D: Distortion-Minimized Weight Pruning for Lossless 3D Object Detection | Kaixin Xu et.al. | 2407.02098 | null |
| 2024-07-02 | Multi-Grained Contrast for Data-Efficient Unsupervised Representation Learning | Chengchao Shen et.al. | 2407.02014 | link |
| 2024-07-02 | Adaptive Modality Balanced Online Knowledge Distillation for Brain-Eye-Computer based Dim Object Detection | Zixing Li et.al. | 2407.01894 | link |
| 2024-07-01 | Scarecrow monitoring system: employing mobilenet ssd for enhanced animal supervision | Balaji VS et.al. | 2407.01435 | null |
| 2024-07-01 | Formal Verification of Object Detection | Avraham Raviv et.al. | 2407.01295 | null |
| 2024-07-01 | Cross-Architecture Auxiliary Feature Space Translation for Efficient Few-Shot Personalized Object Detection | Francesco Barbato et.al. | 2407.01193 | null |
| 2024-07-01 | Eliminating Position Bias of Language Models: A Mechanistic Approach | Ziqi Wang et.al. | 2407.01100 | link |
| 2024-07-01 | No More Potentially Dynamic Objects: Static Point Cloud Map Generation based on 3D Object Detection and Ground Projection | Soojin Woo et.al. | 2407.01073 | null |
| 2024-06-28 | Detecting Subtle Differences between Human and Model Languages Using Spectrum of Relative Likelihood | Yang Xu et.al. | 2406.19874 | link |
| 2024-07-01 | Mobile Robot Oriented Large-Scale Indoor Dataset for Dynamic Scene Understanding | Yifan Tang et.al. | 2406.19791 | null |
| 2024-06-28 | Basketball-SORT: An Association Method for Complex Multi-object Occlusion Problems in Basketball Multi-object Tracking | Qingrui Hu et.al. | 2406.19655 | null |
| 2024-06-27 | Robustness Testing of Black-Box Models Against CT Degradation Through Test-Time Augmentation | Jack Highton et.al. | 2406.19557 | null |
| 2024-06-27 | BOrg: A Brain Organoid-Based Mitosis Dataset for Automatic Analysis of Brain Diseases | Muhammad Awais et.al. | 2406.19556 | link |
| 2024-06-27 | Weighted Circle Fusion: Ensembling Circle Representation from Different Object Detection Results | Jialin Yue et.al. | 2406.19540 | null |
| 2024-06-27 | Stereo Vision Based Robot for Remote Monitoring with VR Support | Mohamed Fazil M. S. et.al. | 2406.19498 | null |
| 2024-06-27 | HUWSOD: Holistic Self-training for Unified Weakly Supervised Object Detection | Liujuan Cao et.al. | 2406.19394 | link |
| 2024-06-27 | STAL3D: Unsupervised Domain Adaptation for 3D Object Detection via Collaborating Self-Training and Adversarial Learning | Yanan Zhang et.al. | 2406.19362 | null |
| 2024-06-27 | Towards Reducing Data Acquisition and Labeling for Defect Detection using Simulated Data | Lukas Malte Kemeter et.al. | 2406.19175 | null |
| 2024-06-27 | FDLite: A Single Stage Lightweight Face Detector Network | Yogesh Aggarwal et.al. | 2406.19107 | null |
| 2024-06-27 | Segment Anything Model for automated image data annotation: empirical studies using text prompts from Grounding DINO | Fuseini Mumuni et.al. | 2406.19057 | null |
| 2024-06-27 | BiCo-Fusion: Bidirectional Complementary LiDAR-Camera Fusion for Semantic- and Spatial-Aware 3D Object Detection | Yang Song et.al. | 2406.19048 | null |
| 2024-06-27 | A Universal Railway Obstacle Detection System based on Semi-supervised Segmentation And Optical Flow | Qiushi Guo et.al. | 2406.18908 | null |
| 2024-06-26 | SpY: A Context-Based Approach to Spacecraft Component Detection | Trupti Mahendrakar et.al. | 2406.18709 | null |
| 2024-06-26 | Unveiling the Unknown: Conditional Evidence Decoupling for Unknown Rejection | Zhaowei Wu et.al. | 2406.18443 | link |
| 2024-06-26 | Detecting Machine-Generated Texts: Not Just “AI vs Humans” and Explainability is Complicated | Jiazhou Ji et.al. | 2406.18259 | null |
| 2024-06-26 | CTS: Sim-to-Real Unsupervised Domain Adaptation on 3D Detection | Meiying Zhang et.al. | 2406.18129 | null |
| 2024-06-26 | The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval | Meinardus Boris et.al. | 2406.18113 | link |
| 2024-06-25 | Unmasking the Imposters: In-Domain Detection of Human vs. Machine-Generated Tweets | Bryan E. Tuck et.al. | 2406.17967 | null |
| 2024-06-25 | ET tu, CLIP? Addressing Common Object Errors for Unseen Environments | Ye Won Byun et.al. | 2406.17876 | null |
| 2024-06-25 | MDHA: Multi-Scale Deformable Transformer with Hybrid Anchors for Multi-View 3D Object Detection | Michelle Adeline et.al. | 2406.17654 | link |
| 2024-06-25 | Embedded event based object detection with spiking neural network | Jonathan Courtois et.al. | 2406.17617 | null |
| 2024-06-27 | Towards Open-set Camera 3D Object Detection | Zhuolin He et.al. | 2406.17297 | null |
| 2024-06-25 | Exploring Test-Time Adaptation for Object Detection in Continually Changing Environments | Shilei Cao et.al. | 2406.16439 | null |
| 2024-06-24 | Artistic-style text detector and a new Movie-Poster dataset | Aoxiang Ning et.al. | 2406.16307 | null |
| 2024-06-24 | Investigating the Influence of Prompt-Specific Shortcuts in AI Generated Text Detection | Choonghyun Park et.al. | 2406.16275 | null |
| 2024-06-23 | Review of Zero-Shot and Few-Shot AI Algorithms in The Medical Domain | Maged Badawi et.al. | 2406.16143 | null |
| 2024-06-22 | Understanding Student and Academic Staff Perceptions of AI Use in Assessment and Feedback | Jasper Roe et.al. | 2406.15808 | null |
| 2024-06-22 | Smart Feature is What You Need | Zhaoxin Hu et.al. | 2406.15805 | link |
| 2024-06-22 | MR-MLLM: Mutual Reinforcement of Multimodal Comprehension and Vision Perception | Guanqun Wang et.al. | 2406.15768 | null |
| 2024-06-21 | Towards Robust Training Datasets for Machine Learning with Ontologies: A Case Study for Emergency Road Vehicle Detection | Lynn Vonderhaar et.al. | 2406.15268 | null |
| 2024-06-21 | DiPEx: Dispersing Prompt Expansion for Class-Agnostic Object Detection | Jia Syuen Lim et.al. | 2406.14924 | null |
| 2024-06-21 | MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection | Zhuoxiao Chen et.al. | 2406.14878 | null |
| 2024-06-20 | Visible-Thermal Tiny Object Detection: A Benchmark Dataset and Baselines | Xinyi Ying et.al. | 2406.14482 | link |
| 2024-06-20 | Enhanced Bank Check Security: Introducing a Novel Dataset and Transformer-Based Approach for Detection and Verification | Muhammad Saif Ullah Khan et.al. | 2406.14370 | link |
| 2024-06-20 | HoTPP Benchmark: Are We Good at the Long Horizon Events Forecasting? | Ivan Karpukhin et.al. | 2406.14341 | link |
| 2024-06-20 | LeYOLO, New Scalable and Efficient CNN Architecture for Object Detection | Lilian Hollard et.al. | 2406.14239 | link |
| 2024-06-20 | SSAD: Self-supervised Auxiliary Detection Framework for Panoramic X-ray based Dental Disease Diagnosis | Zijian Cai et.al. | 2406.13963 | link |
| 2024-06-20 | Towards the in-situ Trunk Identification and Length Measurement of Sea Cucumbers via Bézier Curve Modelling | Shuaixin Liu et.al. | 2406.13951 | link |
| 2024-06-19 | DPO: Dual-Perturbation Optimization for Test-time Adaptation in 3D Object Detection | Zhuoxiao Chen et.al. | 2406.13891 | link |
| 2024-06-19 | Semantic Enhanced Few-shot Object Detection | Zheng Wang et.al. | 2406.13498 | null |
| 2024-06-19 | Snowy Scenes, Clear Detections: A Robust Model for Traffic Light Detection in Adverse Weather Conditions | Shivank Garg et.al. | 2406.13473 | link |
| 2024-06-19 | Strengthening Layer Interaction via Dynamic Layer Attention | Kaishen Wang et.al. | 2406.13392 | link |
| 2024-06-18 | Privacy Preserving Federated Learning in Medical Imaging with Uncertainty Estimation | Nikolas Koutsoubis et.al. | 2406.12815 | link |
| 2024-06-18 | Online Anchor-based Training for Image Classification Tasks | Maria Tzelepi et.al. | 2406.12662 | null |
| 2024-06-18 | Applying Ensemble Methods to Model-Agnostic Machine-Generated Text Detection | Ivan Ong et.al. | 2406.12570 | null |
| 2024-06-18 | MultiSocial: Multilingual Benchmark of Machine-Generated Text Detection of Social-Media Texts | Dominik Macko et.al. | 2406.12549 | null |
| 2024-06-18 | ViDSOD-100: A New Dataset and a Baseline Model for RGB-D Video Salient Object Detection | Junhao Lin et.al. | 2406.12536 | link |
| 2024-06-18 | SDNIA-YOLO: A Robust Object Detection Model for Extreme Weather Conditions | Yuexiong Ding et.al. | 2406.12395 | null |
| 2024-06-18 | Competitive Learning for Achieving Content-specific Filters in Video Coding for Machines | Honglei Zhang et.al. | 2406.12367 | null |
| 2024-06-18 | Certified ML Object Detection for Surveillance Missions | Mohammed Belcaid et.al. | 2406.12362 | null |
| 2024-06-18 | DASSF: Dynamic-Attention Scale-Sequence Fusion for Aerial Object Detection | Haodong Li et.al. | 2406.12285 | null |
| 2024-06-18 | The Solution for CVPR2024 Foundational Few-Shot Object Detection Challenge | Hongpeng Pan et.al. | 2406.12225 | null |
| 2024-06-17 | V3Det Challenge 2024 on Vast Vocabulary and Open Vocabulary Object Detection: Methods and Results | Jiaqi Wang et.al. | 2406.11739 | null |
| 2024-06-17 | YOLO-FEDER FusionNet: A Novel Deep Learning Architecture for Drone Detection | Tamara R. Lenhard et.al. | 2406.11641 | null |
| 2024-06-17 | Low-power Ship Detection in Satellite Images Using Neuromorphic Hardware | Gregor Lenz et.al. | 2406.11319 | null |
| 2024-06-17 | Semi-Supervised Domain Adaptation Using Target-Oriented Domain Augmentation for 3D Object Detection | Yecheol Kim et.al. | 2406.11313 | link |
| 2024-06-17 | Syn-to-Real Unsupervised Domain Adaptation for Indoor 3D Object Detection | Yunsong Wang et.al. | 2406.11311 | null |
| 2024-06-17 | Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding | Yunsong Wang et.al. | 2406.11283 | null |
| 2024-06-17 | YOLO9tr: A Lightweight Model for Pavement Damage Detection Utilizing a Generalized Efficient Layer Aggregation Network and Attention Mechanism | Sompote Youwai et.al. | 2406.11254 | link |
| 2024-06-16 | GANmut: Generating and Modifying Facial Expressions | Maria Surani et.al. | 2406.11079 | null |
| 2024-06-16 | Exploring the Limitations of Detecting Machine-Generated Text | Jad Doughman et.al. | 2406.11073 | null |
| 2024-06-16 | Open-Vocabulary X-ray Prohibited Item Detection via Fine-tuning CLIP | Shuyang Lin et.al. | 2406.10961 | null |
| 2024-06-14 | EFM3D: A Benchmark for Measuring Progress Towards 3D Egocentric Foundation Models | Julian Straub et.al. | 2406.10224 | null |
| 2024-06-14 | YOLOv1 to YOLOv10: A comprehensive review of YOLO variants and their application in the agricultural domain | Mujadded Al Rabbani Alif et.al. | 2406.10139 | null |
| 2024-06-14 | Shelf-Supervised Multi-Modal Pre-Training for 3D Object Detection | Mehar Khurana et.al. | 2406.10115 | null |
| 2024-06-14 | Automated GIS-Based Framework for Detecting Crosswalk Changes from Bi-Temporal High-Resolution Aerial Images | Richard Boadu Antwi et.al. | 2406.09731 | null |
| 2024-06-14 | An alternate approach for estimating grain-growth kinetics | Manoj Prabakar et.al. | 2406.09653 | null |
| 2024-06-13 | Scene Graph Generation in Large-Size VHR Satellite Imagery: A Large-Scale Dataset and A Context-Aware Approach | Yansheng Li et.al. | 2406.09410 | link |
| 2024-06-13 | Towards Evaluating the Robustness of Visual State Space Models | Hashmat Shadab Malik et.al. | 2406.09407 | link |
| 2024-06-13 | Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models | Yushi Hu et.al. | 2406.09403 | null |
| 2024-06-13 | Enhanced Object Detection: A Study on Vast Vocabulary Object Detection Track for V3Det Challenge 2024 | Peixi Wu et.al. | 2406.09201 | null |
| 2024-06-13 | Navigating the Shadows: Unveiling Effective Disturbances for Modern AI Content Detectors | Ying Zhou et.al. | 2406.08922 | link |
| 2024-06-13 | Computer vision-based model for detecting turning lane features on Florida’s public roadways | Richard Boadu Antwi et.al. | 2406.08822 | null |
| 2024-06-13 | BEVSpread: Spread Voxel Pooling for Bird’s-Eye-View Representation in Vision-based Roadside 3D Object Detection | Wenjie Wang et.al. | 2406.08785 | null |
| 2024-06-12 | UnO: Unsupervised Occupancy Fields for Perception and Forecasting | Ben Agro et.al. | 2406.08691 | null |
| 2024-06-12 | Transformation-Dependent Adversarial Attacks | Yaoteng Tan et.al. | 2406.08443 | null |
| 2024-06-12 | Dataset Enhancement with Instance-Level Augmentations | Orest Kupyn et.al. | 2406.08249 | link |
| 2024-06-12 | Chemistry3D: Robotic Interaction Benchmark for Chemistry Experiments | Shoujie Li et.al. | 2406.08160 | null |
| 2024-06-12 | CT3D++: Improving 3D Object Detection with Keypoint-induced Channel-wise Transformer | Hualian Sheng et.al. | 2406.08152 | null |
| 2024-06-12 | MWIRSTD: A MWIR Small Target Detection Dataset | Nikhil Kumar et.al. | 2406.08063 | link |
| 2024-06-12 | Sense Less, Generate More: Pre-training LiDAR Perception with Masked Autoencoders for Ultra-Efficient 3D Sensing | Sina Tayebati et.al. | 2406.07833 | null |
| 2024-06-11 | A Deep Learning Approach to Detect Complete Safety Equipment For Construction Workers Based On YOLOv7 | Md. Shariful Islam et.al. | 2406.07707 | null |
| 2024-06-11 | Transforming a rare event search into a not-so-rare event search in real-time with deep learning-based object detection | J. Schueler et.al. | 2406.07538 | null |
| 2024-06-11 | Understanding Visual Concepts Across Models | Brandon Trabucco et.al. | 2406.07506 | link |
| 2024-06-11 | Minimizing Energy Costs in Deep Learning Model Training: The Gaussian Sampling Approach | Challapalli Phanindra Revanth et.al. | 2406.07332 | null |
| 2024-06-11 | Unsupervised Object Detection with Theoretical Guarantees | Marian Longa et.al. | 2406.07284 | null |
| 2024-06-11 | Advancing Grounded Multimodal Named Entity Recognition via LLM-Based Reformulation and Box-Based Segmentation | Jinyuan Li et.al. | 2406.07268 | null |
| 2024-06-11 | EFFOcc: A Minimal Baseline for EFficient Fusion-based 3D Occupancy Network | Yining Shi et.al. | 2406.07042 | link |
| 2024-06-11 | RS-DFM: A Remote Sensing Distributed Foundation Model for Diverse Downstream Tasks | Zhechao Wang et.al. | 2406.07032 | null |
| 2024-06-12 | LiSD: An Efficient Multi-Task Learning Framework for LiDAR Segmentation and Detection | Jiahua Xu et.al. | 2406.07023 | null |
| 2024-06-11 | Teaching with Uncertainty: Unleashing the Potential of Knowledge Distillation in Object Detection | Junfei Yi et.al. | 2406.06999 | null |
| 2024-06-10 | UnSupDLA: Towards Unsupervised Document Layout Analysis | Talha Uddin Sheikh et.al. | 2406.06236 | null |
| 2024-06-10 | UEMM-Air: A Synthetic Multi-modal Dataset for Unmanned Aerial Vehicle Object Detection | Fan Liu et.al. | 2406.06230 | link |
| 2024-06-10 | ReCon1M: A Large-scale Benchmark Dataset for Relation Comprehension in Remote Sensing Imagery | Xian Sun et.al. | 2406.06028 | null |
| 2024-06-10 | Solution for SMART-101 Challenge of CVPR Multi-modal Algorithmic Reasoning Task 2024 | Jinwoo Ahn et.al. | 2406.05963 | null |
| 2024-06-10 | Open-Vocabulary Part-Based Grasping | Tjeard van Oort et.al. | 2406.05951 | null |
| 2024-06-09 | Stealthy Targeted Backdoor Attacks against Image Captioning | Wenshu Fan et.al. | 2406.05874 | null |
| 2024-06-09 | Scaling Graph Convolutions for Mobile Vision | William Avery et.al. | 2406.05850 | link |
| 2024-06-09 | Mamba YOLO: SSMs-Based YOLO For Object Detection | Zeyu Wang et.al. | 2406.05835 | link |
| 2024-06-09 | ControlLoc: Physical-World Hijacking Attack on Visual Perception in Autonomous Driving | Chen Ma et.al. | 2406.05810 | null |
| 2024-06-09 | SAM-PM: Enhancing Video Camouflaged Object Detection using Spatio-Temporal Attention | Muhammad Nawfal Meeran et.al. | 2406.05802 | link |
| 2024-06-07 | Nacala-Roof-Material: Drone Imagery for Roof Detection, Classification, and Segmentation to Support Mosquito-borne Disease Risk Assessment | Venkanna Babu Guthula et.al. | 2406.04949 | null |
| 2024-06-07 | EGOR: Efficient Generated Objects Replay for incremental object detection | Zijia An et.al. | 2406.04829 | null |
| 2024-06-07 | UCDNet: Multi-UAV Collaborative 3D Object Detection Network by Reliable Feature Mapping | Pengju Tian et.al. | 2406.04648 | null |
| 2024-06-07 | UVCPNet: A UAV-Vehicle Collaborative Perception Network for 3D Object Detection | Yuchao Wang et.al. | 2406.04647 | null |
| 2024-06-06 | CORU: Comprehensive Post-OCR Parsing and Receipt Understanding Dataset | Abdelrahman Abdallah et.al. | 2406.04493 | link |
| 2024-06-06 | DeTra: A Unified Model for Object Detection and Trajectory Forecasting | Sergio Casas et.al. | 2406.04426 | null |
| 2024-06-06 | Parameter-Inverted Image Pyramid Networks | Xizhou Zhu et.al. | 2406.04330 | link |
| 2024-06-06 | LenslessFace: An End-to-End Optimized Lensless System for Privacy-Preserving Face Verification | Xin Cai et.al. | 2406.04129 | null |
| 2024-06-06 | Semmeldetector: Application of Machine Learning in Commercial Bakeries | Thomas H. Schmitt et.al. | 2406.04050 | null |
| 2024-06-06 | Frequency-based Matcher for Long-tailed Semantic Segmentation | Shan Li et.al. | 2406.03917 | link |
| 2024-06-06 | Instance Segmentation and Teeth Classification in Panoramic X-rays | Devichand Budagam et.al. | 2406.03747 | link |
| 2024-06-05 | FedPylot: Navigating Federated Learning for Real-Time Object Detection in Internet of Vehicles | Cyprien Quéméneur et.al. | 2406.03611 | link |
| 2024-06-05 | LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection | Qiang Chen et.al. | 2406.03459 | link |
| 2024-06-05 | Global Clipper: Enhancing Safety and Reliability of Transformer-based Object Detection Models | Qutub Syed Sha et.al. | 2406.03229 | null |
| 2024-06-05 | Situation Monitor: Diversity-Driven Zero-Shot Out-of-Distribution Detection using Budding Ensemble Architecture for Object Detection | Qutub Syed et.al. | 2406.03188 | null |
| 2024-06-05 | Enhanced Automotive Object Detection via RGB-D Fusion in a DiffusionDet Framework | Eliraz Orfaig et.al. | 2406.03129 | null |
| 2024-06-04 | Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation | Mohamed El Amine Boudjoghra et.al. | 2406.02548 | link |
| 2024-06-04 | SatSplatYOLO: 3D Gaussian Splatting-based Virtual Object Detection Ensembles for Satellite Feature Recognition | Van Minh Nguyen et.al. | 2406.02533 | null |
| 2024-06-04 | GrootVL: Tree Topology is All You Need in State Space Model | Yicheng Xiao et.al. | 2406.02395 | link |
| 2024-06-04 | Low-Rank Adaption on Transformer-based Oriented Object Detector for Satellite Onboard Processing of Remote Sensing Images | Xinyang Pu et.al. | 2406.02385 | link |
| 2024-06-04 | Radar Spectra-Language Model for Automotive Scene Parsing | Mariia Pushkareva et.al. | 2406.02158 | null |
| 2024-06-04 | Detecting Endangered Marine Species in Autonomous Underwater Vehicle Imagery Using Point Annotations and Few-Shot Learning | Heather Doig et.al. | 2406.01932 | null |
| 2024-06-04 | GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer | Ding Jia et.al. | 2406.01210 | link |
| 2024-06-03 | Learning Adaptive Fusion Bank for Multi-modal Salient Object Detection | Kunpeng Wang et.al. | 2406.01127 | link |
| 2024-06-03 | Visual Car Brand Classification by Implementing a Synthetic Image Dataset Creation Pipeline | Jan Lippemeier et.al. | 2406.01071 | null |
| 2024-06-03 | Multi-Object Tracking based on Imaging Radar 3D Object Detection | Patrick Palmer et.al. | 2406.01011 | null |
| 2024-05-31 | Power of Cooperative Supervision: Multiple Teachers Framework for Enhanced 3D Semi-Supervised Object Detection | Jin-Hee Lee et.al. | 2405.20720 | link |
| 2024-05-30 | On Calibration of Object Detectors: Pitfalls, Evaluation and Baselines | Selim Kuzucu et.al. | 2405.20459 | link |
| 2024-05-30 | RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection | Fangyi Chen et.al. | 2405.19854 | null |
| 2024-05-30 | Improving Object Detector Training on Synthetic Data by Starting With a Strong Baseline Methodology | Frank A. Ruis et.al. | 2405.19822 | null |
| 2024-05-30 | Towards Unified Multi-granularity Text Detection with Interactive Attention | Xingyu Wan et.al. | 2405.19765 | null |
| 2024-05-30 | Fully Test-Time Adaptation for Monocular 3D Object Detection | Hongbin Lin et.al. | 2405.19682 | link |
| 2024-05-30 | YotoR-You Only Transform One Representation | José Ignacio Díaz Villa et.al. | 2405.19629 | null |
| 2024-05-29 | Enabling Visual Recognition at Radio Frequency | Haowen Lai et.al. | 2405.19516 | null |
| 2024-05-29 | Model Agnostic Defense against Adversarial Patch Attacks on Object Detection in Unmanned Aerial Vehicles | Saurabh Pathak et.al. | 2405.19179 | null |
| 2024-05-29 | RGB-T Object Detection via Group Shuffled Multi-receptive Attention and Multi-modal Supervision | Jinzhong Wang et.al. | 2405.18955 | null |
| 2024-05-29 | SSGA-Net: Stepwise Spatial Global-local Aggregation Networks for Autonomous Driving | Yiming Cui et.al. | 2405.18857 | null |
| 2024-05-29 | PillarHist: A Quantization-aware Pillar Feature Encoder based on Height-aware Histogram | Sifan Zhou et.al. | 2405.18734 | null |
| 2024-05-28 | A Review and Implementation of Object Detection Models and Optimizations for Real-time Medical Mask Detection during the COVID-19 Pandemic | Ioanna Gogou et.al. | 2405.18387 | link |
| 2024-05-28 | Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving? | Yifan Bai et.al. | 2405.18361 | null |
| 2024-05-28 | Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention | Weitai Kang et.al. | 2405.18295 | link |
| 2024-05-28 | DMT-JEPA: Discriminative Masked Targets for Joint-Embedding Predictive Architecture | Shentong Mo et.al. | 2405.17995 | link |
| 2024-05-28 | Transformer and Hybrid Deep Learning Based Models for Machine-Generated Text Detection | Teodor-George Marchitan et.al. | 2405.17964 | null |
| 2024-05-28 | Self-supervised Pre-training for Transferable Multi-modal Perception | Xiaohao Xu et.al. | 2405.17942 | null |
| 2024-05-28 | Boosting General Trimap-free Matting in the Real-World Image | Leo Shan et.al. | 2405.17916 | null |
| 2024-05-28 | The Binary Quantized Neural Network for Dense Prediction via Specially Designed Upsampling and Attention | Xingyu Ding et.al. | 2405.17776 | null |
| 2024-05-27 | Understanding differences in applying DETR to natural and medical images | Yanqi Xu et.al. | 2405.17677 | null |
| 2024-05-27 | Hardness-Aware Scene Synthesis for Semi-Supervised 3D Object Detection | Shuai Zeng et.al. | 2405.17422 | link |
| 2024-05-27 | Tracking Small Birds by Detection Candidate Region Filtering and Detection History-aware Association | Tingwei Liu et.al. | 2405.17323 | null |
| 2024-05-27 | Enhanced Automotive Radar Collaborative Sensing By Exploiting Constructive Interference | Lifan Xu et.al. | 2405.17297 | null |
| 2024-05-27 | SCaRL- A Synthetic Multi-Modal Dataset for Autonomous Driving | Avinash Nittur Ramesh et.al. | 2405.17030 | null |
| 2024-05-27 | Collective Perception Datasets for Autonomous Driving: A Comprehensive Review | Sven Teufel et.al. | 2405.16973 | null |
| 2024-05-27 | OED: Towards One-stage End-to-End Dynamic Scene Graph Generation | Guan Wang et.al. | 2405.16925 | link |
| 2024-05-27 | ContrastAlign: Toward Robust BEV Feature Alignment via Contrastive Learning for Multi-Modal 3D Object Detection | Ziying Song et.al. | 2405.16873 | null |
| 2024-05-27 | A re-calibration method for object detection with multi-modal alignment bias in autonomous driving | Zhihang Song et.al. | 2405.16848 | null |
| 2024-05-26 | A Study on Unsupervised Anomaly Detection and Defect Localization using Generative Model in Ultrasonic Non-Destructive Testing | Yusaku Ando et.al. | 2405.16580 | null |
| 2024-05-26 | AI-Generated Text Detection and Classification Based on BERT Deep Learning Algorithm | Hao Wang et.al. | 2405.16422 | null |
| 2024-05-24 | UNION: Unsupervised 3D Object Detection using Object Appearance-based Pseudo-Classes | Ted Lentsch et.al. | 2405.15688 | link |
| 2024-05-24 | Multimodal Object Detection via Probabilistic a priori Information Integration | Hafsa El Hafyani et.al. | 2405.15596 | null |
| 2024-05-24 | Scale-Invariant Feature Disentanglement via Adversarial Learning for UAV-based Object Detection | Fan Liu et.al. | 2405.15465 | null |
| 2024-05-24 | Leveraging knowledge distillation for partial multi-task learning from multiple remote sensing datasets | Hoàng-Ân Lê et.al. | 2405.15394 | null |
| 2024-05-24 | Towards Global Optimal Visual In-Context Learning Prompt Selection | Chengming Xu et.al. | 2405.15279 | null |
| 2024-05-24 | Unbiased Faster R-CNN for Single-source Domain Generalized Object Detection | Yajing Liu et.al. | 2405.15225 | null |
| 2024-05-24 | ODGEN: Domain-specific Object Detection Data Generation with Diffusion Models | Jingyuan Zhu et.al. | 2405.15199 | null |
| 2024-05-24 | MonoDETRNext: Next-generation Accurate and Efficient Monocular 3D Object Detection Method | Pan Liao et.al. | 2405.15176 | null |
| 2024-05-23 | Learning to Detect and Segment Mobile Objects from Unlabeled Videos | Yihong Sun et.al. | 2405.14841 | null |
| 2024-05-23 | Designing A Sustainable Marine Debris Clean-up Framework without Human Labels | Raymond Wang et.al. | 2405.14815 | null |
| 2024-05-23 | Drones Help Drones: A Collaborative Framework for Multi-Drone Object Trajectory Prediction and Beyond | Zhechao Wang et.al. | 2405.14674 | null |
| 2024-05-23 | Improving Single Domain-Generalized Object Detection: A Focus on Diversification and Alignment | Muhammad Sohail Danish et.al. | 2405.14497 | null |
| 2024-05-23 | YOLOv10: Real-Time End-to-End Object Detection | Ao Wang et.al. | 2405.14458 | link |
| 2024-05-23 | Harmony: A Joint Self-Supervised and Weakly-Supervised Framework for Learning General Purpose Visual Representations | Mohammed Baharoon et.al. | 2405.14239 | null |
| 2024-05-22 | Two Heads are Better Than One: Neural Networks Quantization with 2D Hilbert Curve-based Output Representation | Mykhailo Uss et.al. | 2405.14024 | null |
| 2024-05-22 | TS40K: a 3D Point Cloud Dataset of Rural Terrain and Electrical Transmission System | Diogo Lavado et.al. | 2405.13989 | null |
| 2024-05-22 | Class-Conditional self-reward mechanism for improved Text-to-Image models | Safouane El Ghazouali et.al. | 2405.13473 | link |
| 2024-05-22 | Adaptive Wireless Image Semantic Transmission and Over-The-Air Testing | Jiarun Ding et.al. | 2405.13403 | null |
| 2024-05-21 | BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once | Theodore Zhao et.al. | 2405.12971 | null |
| 2024-05-21 | AMFD: Distillation via Adaptive Multimodal Fusion for Multispectral Pedestrian Detection | Zizhao Chen et.al. | 2405.12944 | link |
| 2024-05-21 | Predicting the Influence of Adverse Weather on Pedestrian Detection with Automotive Radar and Lidar Sensors | Daniel Weihmayr et.al. | 2405.12736 | null |
| 2024-05-21 | Spotting AI’s Touch: Identifying LLM-Paraphrased Spans in Text | Yafu Li et.al. | 2405.12689 | null |
| 2024-05-21 | Automating Attendance Management in Human Resources: A Design Science Approach Using Computer Vision and Facial Recognition | Bao-Thien Nguyen-Tat et.al. | 2405.12633 | null |
| 2024-05-21 | FFAM: Feature Factorization Activation Map for Explanation of 3D Detectors | Shuai Liu et.al. | 2405.12601 | link |
| 2024-05-21 | Dataset and Benchmark for Urdu Natural Scenes Text Detection, Recognition and Visual Question Answering | Hiba Maryam et.al. | 2405.12533 | null |
| 2024-05-21 | Active Object Detection with Knowledge Aggregation and Distillation from Large Models | Dejie Yang et.al. | 2405.12509 | null |
| 2024-05-21 | Mutual Information Analysis in Multimodal Learning Systems | Hadi Hadizadeh et.al. | 2405.12456 | null |
| 2024-05-20 | Multi-View Attentive Contextualization for Multi-View 3D Object Detection | Xianpeng Liu et.al. | 2405.12200 | null |
| 2024-05-20 | Bangladeshi Native Vehicle Detection in Wild | Bipin Saha et.al. | 2405.12150 | link |
| 2024-05-20 | Salience-guided Ground Factor for Robust Localization of Delivery Robots in Complex Urban Environments | Jooyong Park et.al. | 2405.11855 | null |
| 2024-05-20 | DATR: Unsupervised Domain Adaptive Detection Transformer with Dataset-Level Adaptation and Prototypical Alignment | Jianhong Han et.al. | 2405.11765 | link |
| 2024-05-20 | Versatile Teacher: A Class-aware Teacher-student Framework for Cross-domain Adaptation | Runou Yang et.al. | 2405.11754 | link |
| 2024-05-19 | FADet: A Multi-sensor 3D Object Detection Network based on Local Featured Attention | Ziang Guo et.al. | 2405.11682 | link |
| 2024-05-19 | SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization | Jialong Guo et.al. | 2405.11582 | link |
| 2024-05-19 | The First Swahili Language Scene Text Detection and Recognition Dataset | Fadila Wendigoundi Douamba et.al. | 2405.11437 | link |
| 2024-05-18 | InfRS: Incremental Few-Shot Object Detection in Remote Sensing Images | Wuzhou Li et.al. | 2405.11293 | null |
| 2024-05-18 | Visible and Clear: Finding Tiny Objects in Difference Map | Bing Cao et.al. | 2405.11276 | null |
| 2024-05-17 | A Versatile Framework for Analyzing Galaxy Image Data by Implanting Human-in-the-loop on a Large Vision Model | Mingxiang Fu et.al. | 2405.10890 | null |
| 2024-05-17 | DeepPavlov at SemEval-2024 Task 8: Leveraging Transfer Learning for Detecting Boundaries of Machine-Generated Texts | Anastasia Voznyuk et.al. | 2405.10629 | link |
| 2024-05-17 | DuoSpaceNet: Leveraging Both Bird’s-Eye-View and Perspective View Representations for 3D Object Detection | Zhe Huang et.al. | 2405.10577 | null |
| 2024-05-16 | Drone-type-Set: Drone types detection benchmark for drone detection and tracking | Kholoud AlDosari et.al. | 2405.10398 | null |
| 2024-05-16 | Grounded 3D-LLM with Referent Tokens | Yilun Chen et.al. | 2405.10370 | link |
| 2024-05-16 | Grounding DINO 1.5: Advance the “Edge” of Open-Set Object Detection | Tianhe Ren et.al. | 2405.10300 | link |
| 2024-05-16 | Towards Task-Compatible Compressible Representations | Anderson de Andrade et.al. | 2405.10244 | link |
| 2024-05-16 | SpecDETR: A Transformer-based Hyperspectral Point Object Detection Network | Zhaoxu Li et.al. | 2405.10148 | link |
| 2024-05-16 | SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection | Mingxuan Liu et.al. | 2405.10053 | link |
| 2024-05-16 | FPDIoU Loss: A Loss Function for Efficient Bounding Box Regression of Rotated Object Detection | Siliang Ma et.al. | 2405.09942 | null |
| 2024-05-16 | Infrared Adversarial Car Stickers | Xiaopei Zhu et.al. | 2405.09924 | null |
| 2024-05-16 | PillarNeXt: Improving the 3D detector by introducing Voxel2Pillar feature encoding and extracting multi-scale features | Xusheng Li et.al. | 2405.09828 | null |
| 2024-05-16 | Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection | Feiran Li et.al. | 2405.09782 | link |
| 2024-05-15 | Synth-to-Real Unsupervised Domain Adaptation for Instance Segmentation | Guo Yachan et.al. | 2405.09682 | null |
| 2024-05-15 | Dynamic Loss Decay based Robust Oriented Object Detection on Remote Sensing Images with Noisy Labels | Guozhang Liu et.al. | 2405.09024 | null |
| 2024-05-14 | CLIP with Quality Captions: A Strong Pretraining for Vision Tasks | Pavan Kumar Anasosalu Vasu et.al. | 2405.08911 | null |
| 2024-05-14 | Open-Vocabulary Object Detection via Neighboring Region Attention Alignment | Sunyuan Qiang et.al. | 2405.08593 | null |
| 2024-05-14 | Semantic Contextualization of Face Forgery: A New Definition, Dataset, and Detection Method | Mian Zou et.al. | 2405.08487 | link |
| 2024-05-14 | RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images | Zong-Wei Hong et.al. | 2405.08483 | link |
| 2024-05-14 | Multimodal Collaboration Networks for Geospatial Vehicle Detection in Dense, Occluded, and Large-Scale Events | Xin Wu et.al. | 2405.08251 | link |
| 2024-05-13 | RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors | Liam Dugan et.al. | 2405.07940 | null |
| 2024-05-13 | oTTC: Object Time-to-Contact for Motion Estimation in Autonomous Driving | Abdul Hannan Khan et.al. | 2405.07698 | null |
| 2024-05-13 | MonoMAE: Enhancing Monocular 3D Detection through Depth-Aware Masked Autoencoders | Xueying Jiang et.al. | 2405.07696 | null |
| 2024-05-13 | Quality-aware Selective Fusion Network for V-D-T Salient Object Detection | Liuxin Bao et.al. | 2405.07655 | link |
| 2024-05-13 | Fast Training Data Acquisition for Object Detection and Segmentation using Black Screen Luminance Keying | Thomas Pöllabauer et.al. | 2405.07653 | null |
| 2024-05-13 | Integrity Monitoring of 3D Object Detection in Automated Driving Systems using Raw Activation Patterns and Spatial Filtering | Hakan Yekta Yatbaz et.al. | 2405.07600 | null |
| 2024-05-13 | Environmental Matching Attack Against Unmanned Aerial Vehicles Object Detection | Dehong Kong et.al. | 2405.07595 | null |
| 2024-05-13 | Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis | Tianci Bi et.al. | 2405.07481 | null |
| 2024-05-13 | Enhancing 3D Object Detection by Using Neural Network with Self-adaptive Thresholding | Houze Liu et.al. | 2405.07479 | null |
| 2024-05-12 | MAML MOT: Multiple Object Tracking based on Meta-Learning | Jiayi Chen et.al. | 2405.07272 | null |
| 2024-05-10 | How to Augment for Atmospheric Turbulence Effects on Thermal Adapted Object Detection Models? | Engin Uzun et.al. | 2405.06383 | null |
| 2024-05-10 | Precise Apple Detection and Localization in Orchards using YOLOv5 for Robotic Harvesting Systems | Jiang Ziyue et.al. | 2405.06260 | null |
| 2024-05-09 | CSA-Net: Channel-wise Spatially Autocorrelated Attention Networks | Nick et.al. | 2405.05755 | null |
| 2024-05-09 | Depth Awakens: A Depth-perceptual Attention Fusion Network for RGB-D Camouflaged Object Detection | Xinran Liu et.al. | 2405.05614 | null |
| 2024-05-09 | The object detection model uses combined extraction with KNN and RF classification | Florentina Tatrin Kurniati et.al. | 2405.05551 | null |
| 2024-05-08 | Reviewing Intelligent Cinematography: AI research for camera-based video production | Adrian Azzarelli et.al. | 2405.05039 | null |
| 2024-05-07 | A Novel Wide-Area Multiobject Detection System with High-Probability Region Searching | Xianlei Long et.al. | 2405.04589 | null |
| 2024-05-07 | DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving | Chen Min et.al. | 2405.04390 | null |
| 2024-05-07 | A New Dataset and Comparative Study for Aphid Cluster Detection and Segmentation in Sorghum Fields | Raiyan Rahman et.al. | 2405.04305 | null |
| 2024-05-07 | ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers | Jinke Li et.al. | 2405.04299 | link |
| 2024-05-07 | Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore | Junchao Wu et.al. | 2405.04286 | link |
| 2024-05-07 | Deep Event-based Object Detection in Autonomous Driving: A Survey | Bingquan Zhou et.al. | 2405.03995 | null |
| 2024-05-06 | BadFusion: 2D-Oriented Backdoor Attacks against 3D Object Detection | Saket S. Chaturvedi et.al. | 2405.03884 | null |
| 2024-05-06 | RepVGG-GELAN: Enhanced GELAN with VGG-STYLE ConvNets for Brain Tumour Detection | Thennarasi Balakrishnan et.al. | 2405.03541 | link |
| 2024-05-06 | Low-light Object Detection | Pengpeng Li et.al. | 2405.03519 | null |
| 2024-05-06 | Salient Object Detection From Arbitrary Modalities | Nianchang Huang et.al. | 2405.03352 | null |
| 2024-05-06 | Modality Prompts for Arbitrary Modality Salient Object Detection | Nianchang Huang et.al. | 2405.03351 | null |
| 2024-05-06 | Vietnamese AI Generated Text Detection | Quang-Dan Tran et.al. | 2405.03206 | null |
| 2024-05-06 | PTQ4SAM: Post-Training Quantization for Segment Anything | Chengtao Lv et.al. | 2405.03144 | link |
| 2024-05-05 | Performance Evaluation of Real-Time Object Detection for Electric Scooters | Dong Chen et.al. | 2405.03039 | link |
| 2024-05-05 | SalFAU-Net: Saliency Fusion Attention U-Net for Salient Object Detection | Kassaw Abraham Mulat et.al. | 2405.02906 | null |
| 2024-05-07 | Adaptive Guidance Learning for Camouflaged Object Detection | Zhennan Chen et.al. | 2405.02824 | null |
| 2024-05-05 | PVTransformer: Point-to-Voxel Transformer for Scalable 3D Object Detection | Zhaoqi Leng et.al. | 2405.02811 | null |
| 2024-05-02 | Segmentation-Free Outcome Prediction in Head and Neck Cancer: Deep Learning-based Feature Extraction from Multi-Angle Maximum Intensity Projections (MA-MIPs) of PET Images | Amirhosein Toosi et.al. | 2405.01756 | null |
| 2024-05-02 | PointCompress3D – A Point Cloud Compression Framework for Roadside LiDARs in Intelligent Transportation Systems | Walter Zimmer et.al. | 2405.01750 | null |
| 2024-05-02 | Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey | Guoping Xu et.al. | 2405.01725 | link |
| 2024-05-02 | SOAR: Advancements in Small Body Object Detection for Aerial Imagery Using State Space Models and Programmable Gradients | Tushar Verma et.al. | 2405.01699 | null |
| 2024-05-02 | Imagine the Unseen: Occluded Pedestrian Detection via Adversarial Feature Completion | Shanshan Zhang et.al. | 2405.01311 | null |
| 2024-05-02 | Overcoming LLM Challenges using RAG-Driven Precision in Coffee Leaf Disease Remediation | Dr. Selva Kumar S et.al. | 2405.01310 | null |
| 2024-05-02 | Towards Consistent Object Detection via LiDAR-Camera Synergy | Kai Luo et.al. | 2405.01258 | link |
| 2024-05-02 | Federated Learning with Heterogeneous Data Handling for Robust Vehicular Object Detection | Ahmad Khalil et.al. | 2405.01108 | null |
| 2024-05-01 | Grains of Saliency: Optimizing Saliency-based Training of Biometric Attack Detection Models | Colton R. Crum et.al. | 2405.00650 | null |
| 2024-05-01 | Object detection under the linear subspace model with application to cryo-EM images | Amitay Eldar et.al. | 2405.00364 | null |
| 2024-04-30 | Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation | Yunhao Ge et.al. | 2404.19752 | null |
| 2024-04-30 | Quantifying Nematodes through Images: Datasets, Models, and Baselines of Deep Learning | Zhipeng Yuan et.al. | 2404.19748 | null |
| 2024-04-30 | Masked Multi-Query Slot Attention for Unsupervised Object Discovery | Rishav Pramanik et.al. | 2404.19654 | link |
| 2024-04-30 | Physical Backdoor: Towards Temperature-based Backdoor Attacks in the Physical World | Wen Yin et.al. | 2404.19417 | null |
| 2024-04-30 | UniFS: Universal Few-shot Instance Perception with Point Representations | Sheng Jin et.al. | 2404.19401 | null |
| 2024-04-30 | Pseudo Label Refinery for Unsupervised Domain Adaptation on Cross-dataset 3D Object Detection | Zhanwei Zhang et.al. | 2404.19384 | null |
| 2024-04-30 | Robust Pedestrian Detection via Constructing Versatile Pedestrian Knowledge Bank | Sungjune Park et.al. | 2404.19299 | null |
| 2024-04-29 | MiPa: Mixed Patch Infrared-Visible Modality Agnostic Object Detection | Heitor R. Medeiros et.al. | 2404.18849 | null |
| 2024-04-29 | Leveraging PointNet and PointNet++ for Lyft Point Cloud Classification Challenge | Rajat K. Doshi et.al. | 2404.18665 | null |
| 2024-04-29 | CoSense3D: an Agent-based Efficient Learning Framework for Collective Perception | Yunshuang Yuan et.al. | 2404.18617 | null |
| 2024-04-29 | Assessing Quality Metrics for Neural Reality Gap Input Mitigation in Autonomous Driving Testing | Stefano Carlo Lambertenghi et.al. | 2404.18577 | null |
| 2024-04-29 | Efficient Meta-Learning Enabled Lightweight Multiscale Few-Shot Object Detection in Remote Sensing Images | Wenbin Guan et.al. | 2404.18426 | null |
| 2024-04-29 | Multi-modal Perception Dataset of In-water Objects for Autonomous Surface Vehicles | Mingi Jeong et.al. | 2404.18411 | null |
| 2024-04-28 | FAD-SAR: A Novel Fishing Activity Detection System via Synthetic Aperture Radar Images Based on Deep Learning Method | Yanbing Bai et.al. | 2404.18245 | null |
| 2024-04-28 | RadSimReal: Bridging the Gap Between Synthetic and Real Data in Radar Object Detection With Simulation | Oded Bialer et.al. | 2404.18150 | null |
| 2024-04-27 | Reliable Student: Addressing Noise in Semi-Supervised 3D Object Detection | Farzad Nozarian et.al. | 2404.17910 | link |
| 2024-04-27 | A Hybrid Approach for Document Layout Analysis in Document images | Tahira Shehzadi et.al. | 2404.17888 | null |
| 2024-04-26 | Inhomogeneous illuminated image enhancement under extremely low visibility condition | Libang Chen et.al. | 2404.17503 | null |
| 2024-04-26 | Cost-Sensitive Uncertainty-Based Failure Recognition for Object Detection | Moussa Kassem Sbeyti et.al. | 2404.17427 | null |
| 2024-04-26 | Enhancing mmWave Radar Point Cloud via Visual-inertial Supervision | Cong Fan et.al. | 2404.17229 | null |
| 2024-04-26 | MorphText: Deep Morphology Regularized Arbitrary-shape Scene Text Detection | Chengpei Xu et.al. | 2404.17151 | null |
| 2024-04-25 | Generating Minimalist Adversarial Perturbations to Test Object-Detection Models: An Adaptive Multi-Metric Evolutionary Search Approach | Cristopher McIntyre-Garcia et.al. | 2404.17020 | link |
| 2024-04-25 | Constellation Dataset: Benchmarking High-Altitude Object Detection for an Urban Intersection | Mehmet Kerem Turkcan et.al. | 2404.16944 | link |
| 2024-04-25 | Self-Balanced R-CNN for Instance Segmentation | Leonardo Rossi et.al. | 2404.16633 | link |
| 2024-04-25 | Cross-Domain Spatial Matching for Camera and Radar Sensor Data Fusion in Autonomous Vehicle Perception System | Daniel Dworak et.al. | 2404.16548 | null |
| 2024-04-25 | Commonsense Prototype for Outdoor Unsupervised 3D Object Detection | Hai Wu et.al. | 2404.16493 | link |
| 2024-04-25 | IMWA: Iterative Model Weight Averaging Benefits Class-Imbalanced Learning Tasks | Zitong Huang et.al. | 2404.16331 | null |
| 2024-04-25 | CFMW: Cross-modality Fusion Mamba for Multispectral Object Detection under Adverse Weather Conditions | Haoyuan Li et.al. | 2404.16302 | link |
| 2024-04-24 | AutoGluon-Multimodal (AutoMM): Supercharging Multimodal AutoML with Foundation Models | Zhiqiang Tang et.al. | 2404.16233 | null |
| 2024-04-24 | Observational parameters of Blue Large-Amplitude Pulsators | P. Pietrukowicz et.al. | 2404.16089 | null |
| 2024-04-24 | A Survey on Visual Mamba | Hanwei Zhang et.al. | 2404.15956 | null |
| 2024-04-24 | Steal Now and Attack Later: Evaluating Robustness of Object Detection against Black-box Adversarial Attacks | Erh-Chung Chen et.al. | 2404.15881 | null |
| 2024-04-24 | Revisiting Out-of-Distribution Detection in LiDAR-based 3D Object Detection | Michael Kösel et.al. | 2404.15879 | link |
| 2024-04-23 | CFPFormer: Feature-pyramid like Transformer Decoder for Segmentation and Detection | Hongyi Cai et.al. | 2404.15451 | null |
| 2024-04-23 | ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning | Weifeng Chen et.al. | 2404.15449 | null |
| 2024-04-23 | Source-free Domain Adaptation for Video Object Detection Under Adverse Image Conditions | Xingguang Zhang et.al. | 2404.15252 | null |
| 2024-04-23 | Efficient Transformer Encoders for Mask2Former-style models | Manyi Yao et.al. | 2404.15244 | null |
| 2024-04-23 | Gallbladder Cancer Detection in Ultrasound Images based on YOLO and Faster R-CNN | Sara Dadjouy et.al. | 2404.15129 | null |
| 2024-04-23 | External Prompt Features Enhanced Parameter-efficient Fine-tuning for Salient Object Detection | Wen Liang et.al. | 2404.15008 | null |
| 2024-04-23 | ContextualFusion: Context-Based Multi-Sensor Fusion for 3D Object Detection in Adverse Operating Conditions | Shounak Sural et.al. | 2404.14780 | null |
| 2024-04-23 | Unified Unsupervised Salient Object Detection via Knowledge Transfer | Yao Yuan et.al. | 2404.14759 | link |
| 2024-04-22 | SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection | Yuxia Wang et.al. | 2404.14183 | null |
| 2024-04-22 | Text in the Dark: Extremely Low-Light Text Image Enhancement | Che-Tsung Lin et.al. | 2404.14135 | null |
| 2024-04-22 | CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective | Wencheng Zhu et.al. | 2404.14109 | null |
| 2024-04-22 | Benchmarking Multi-Modal LLMs for Testing Visual Deep Learning Systems Through the Lens of Image Mutation | Liwen Wang et.al. | 2404.13945 | null |
| 2024-04-22 | NeRF-DetS: Enhancing Multi-View 3D Object Detection with Sampling-adaptive Network of Continuous NeRF-based Representation | Chi Huang et.al. | 2404.13921 | null |
| 2024-04-22 | TeamTrack: A Dataset for Multi-Sport Multi-Object Tracking in Full-pitch Videos | Atom Scott et.al. | 2404.13868 | null |
| 2024-04-22 | Toward Robust LiDAR based 3D Object Detection via Density-Aware Adaptive Thresholding | Eunho Lee et.al. | 2404.13852 | null |
| 2024-04-21 | A Nasal Cytology Dataset for Object Detection and Deep Learning | Mauro Camporeale et.al. | 2404.13745 | null |
| 2024-04-23 | Clio: Real-time Task-Driven Open-Set 3D Scene Graphs | Dominic Maggio et.al. | 2404.13696 | null |
| 2024-04-20 | FisheyeDetNet: Object Detection on Fisheye Surround View Camera Systems for Automated Driving | Ganesh Sistu et.al. | 2404.13443 | null |
| 2024-04-19 | A comparison between single-stage and two-stage 3D tracking algorithms for greenhouse robotics | David Rapado-Rincon et.al. | 2404.12963 | null |
| 2024-04-19 | Language-Driven Active Learning for Diverse Open-Set 3D Object Detection | Ross Greer et.al. | 2404.12856 | null |
| 2024-04-19 | ECOR: Explainable CLIP for Object Recognition | Ali Rasekh et.al. | 2404.12839 | null |
| 2024-04-19 | A Point-Based Approach to Efficient LiDAR Multi-Task Perception | Christopher Lang et.al. | 2404.12798 | null |
| 2024-04-19 | ELEV-VISION-SAM: Integrated Vision Language and Foundation Model for Automated Estimation of Building Lowest Floor Elevation | Yu-Hsuan Ho et.al. | 2404.12606 | null |
| 2024-04-18 | The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models | Cheng Shi et.al. | 2404.11957 | link |
| 2024-04-18 | Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition | Xunsong Li et.al. | 2404.11903 | null |
| 2024-04-17 | TempBEV: Improving Learned BEV Encoders with Combined Image and BEV Space Temporal Aggregation | Thomas Monninger et.al. | 2404.11803 | null |
| 2024-04-17 | Multimodal 3D Object Detection on Unseen Domains | Deepti Hegde et.al. | 2404.11764 | null |
| 2024-04-17 | Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection | Deepti Hegde et.al. | 2404.11737 | null |
| 2024-04-17 | Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems | Luca Bompani et.al. | 2404.11488 | link |
| 2024-04-17 | EcoMLS: A Self-Adaptation Approach for Architecting Green ML-Enabled Systems | Meghana Tedla et.al. | 2404.11411 | null |
| 2024-04-17 | Detector Collapse: Backdooring Object Detection to Catastrophic Overload or Blindness | Hangtao Zhang et.al. | 2404.11357 | null |
| 2024-04-17 | Simple In-place Data Augmentation for Surveillance Object Detection | Munkh-Erdene Otgonbold et.al. | 2404.11226 | null |
| 2024-04-17 | Feature Corrective Transfer Learning: End-to-End Solutions to Object Detection in Non-Ideal Visual Conditions | Chuheng Wei et.al. | 2404.11214 | null |
| 2024-04-17 | GhostNetV3: Exploring the Training Strategies for Compact Models | Zhenhua Liu et.al. | 2404.11202 | link |
| 2024-04-17 | How to deal with glare for improved perception of Autonomous Vehicles | Muhammad Z. Alam et.al. | 2404.10992 | null |
| 2024-04-17 | Leveraging 3D LiDAR Sensors to Enable Enhanced Urban Safety and Public Health: Pedestrian Monitoring and Abnormal Activity Detection | Nawfal Guefrachi et.al. | 2404.10978 | null |
| 2024-04-16 | OSR-ViT: A Simple and Modular Framework for Open-Set Object Detection and Discovery | Matthew Inkawhich et.al. | 2404.10865 | null |
| 2024-04-16 | Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark | Jiangning Zhang et.al. | 2404.10760 | null |
| 2024-04-16 | Watch Your Step: Optimal Retrieval for Continual Learning at Scale | Truman Hickok et.al. | 2404.10758 | null |
| 2024-04-16 | Efficient optimal dispersed Haar-like filters for face detection | Zeinab Sedaghatjoo et.al. | 2404.10476 | null |
| 2024-04-16 | Camera clustering for scalable stream-based active distillation | Dani Manjah et.al. | 2404.10411 | null |
| 2024-04-15 | Low-Light Image Enhancement Framework for Improved Object Detection in Fisheye Lens Datasets | Dai Quoc Tran et.al. | 2404.10078 | link |
| 2024-04-15 | Explainable Light-Weight Deep Learning Pipeline for Improved Drought Stress | Aswini Kumar Patra et.al. | 2404.10073 | null |
| 2024-04-15 | VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection | Bonan Ding et.al. | 2404.09431 | null |
| 2024-04-14 | TEXT2TASTE: A Versatile Egocentric Vision System for Intelligent Reading Assistance Using Large Language Model | Wiktor Mucha et.al. | 2404.09254 | null |
| 2024-04-14 | DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection | Lewei Yao et.al. | 2404.09216 | null |
| 2024-04-14 | Coreset Selection for Object Detection | Hojun Lee et.al. | 2404.09161 | null |
| 2024-04-14 | Fusion-Mamba for Cross-modality Object Detection | Wenhao Dong et.al. | 2404.09146 | null |
| 2024-04-13 | The Snake’s Beating Heart? A Millisecond Pulsar Binary in the Galactic Center Radio Filament G359.1$-$0.2 | Marcus E. Lower et.al. | 2404.09098 | null |
| 2024-04-13 | BG-YOLO: A Bidirectional-Guided Method for Underwater Object Detection | Jian Zhang et.al. | 2404.08979 | null |
| 2024-04-13 | Shifting Spotlight for Co-supervision: A Simple yet Efficient Single-branch Network to See Through Camouflage | Yang Hu et.al. | 2404.08936 | null |
| 2024-04-12 | Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation | Yanhao Zheng et.al. | 2404.08603 | link |
| 2024-04-12 | FashionFail: Addressing Failure Cases in Fashion Object Detection and Segmentation | Riza Velioglu et.al. | 2404.08582 | link |
| 2024-04-12 | Analyzing Decades-Long Environmental Changes in Namibia Using Archival Aerial Photography and Deep Learning | Girmaw Abebe Tadesse et.al. | 2404.08544 | null |
| 2024-04-12 | MambaDFuse: A Mamba-based Dual-phase Model for Multi-modality Image Fusion | Zhe Li et.al. | 2404.08406 | null |
| 2024-04-12 | Overcoming Scene Context Constraints for Object Detection in wild using Defilters | Vamshi Krishna Kancharla et.al. | 2404.08293 | null |
| 2024-04-11 | ConsistencyDet: Robust Object Detector with Denoising Paradigm of Consistency Model | Lifan Jiang et.al. | 2404.07773 | link |
| 2024-04-11 | Exploiting Object-based and Segmentation-based Semantic Features for Deep Learning-based Indoor Scene Classification | Ricardo Pereira et.al. | 2404.07739 | null |
| 2024-04-11 | Run-time Monitoring of 3D Object Detection in Automated Driving Systems Using Early Layer Neural Activation Patterns | Hakan Yekta Yatbaz et.al. | 2404.07685 | null |
| 2024-04-11 | Finding Dino: A plug-and-play framework for unsupervised detection of out-of-distribution objects using prototypes | Poulami Sinhamahapatra et.al. | 2404.07664 | null |
| 2024-04-11 | Separated Attention: An Improved Cycle GAN Based Under Water Image Enhancement Method | Tashmoy Ghosh et.al. | 2404.07649 | null |
| 2024-04-11 | GLID: Pre-training a Generalist Encoder-Decoder Vision Model | Jihao Liu et.al. | 2404.07603 | null |
| 2024-04-11 | SFSORT: Scene Features-based Simple Online Real-Time Tracker | M. M. Morsali et.al. | 2404.07553 | link |
| 2024-04-11 | The Sydney Radio Star Catalogue: properties of radio stars at megahertz to gigahertz frequencies | Laura N. Driessen et.al. | 2404.07418 | null |
| 2024-04-11 | Simplifying Two-Stage Detectors for On-Device Inference in Remote Sensing | Jaemin Kang et.al. | 2404.07405 | null |
| 2024-04-11 | A fine-tuning workflow for automatic first-break picking with deep learning | Amir Mardan et.al. | 2404.07400 | link |
| 2024-04-10 | Identification of Fine-grained Systematic Errors via Controlled Scene Generation | Valentyn Boreiko et.al. | 2404.07045 | null |
| 2024-04-10 | Accurate Tennis Court Line Detection on Amateur Recorded Matches | Sameer Agrawal et.al. | 2404.06977 | null |
| 2024-04-10 | SARA: Smart AI Reading Assistant for Reading Comprehension | Enkeleda Thaqi et.al. | 2404.06906 | null |
| 2024-04-10 | Sparse Points to Dense Clouds: Enhancing 3D Detection with Limited LiDAR Data | Aakash Kumar et.al. | 2404.06715 | null |
| 2024-04-10 | Scaling Multi-Camera 3D Object Detection through Weak-to-Strong Eliciting | Hao Lu et.al. | 2404.06700 | link |
| 2024-04-09 | Learning Embeddings with Centroid Triplet Loss for Object Identification in Robotic Grasping | Anas Gouda et.al. | 2404.06277 | link |
| 2024-04-09 | Label-Efficient 3D Object Detection For Road-Side Units | Minh-Quan Dao et.al. | 2404.06256 | null |
| 2024-04-09 | Automatic Defect Detection in Sewer Network Using Deep Learning Based Object Detector | Bach Ha et.al. | 2404.06219 | null |
| 2024-04-09 | YOLC: You Only Look Clusters for Tiny Object Detection in Aerial Images | Chenguang Liu et.al. | 2404.06180 | null |
| 2024-04-09 | Enhanced Radar Perception via Multi-Task Learning: Towards Refined Data for Sensor Fusion Applications | Huawei Sun et.al. | 2404.06165 | null |
| 2024-04-09 | Improving Facial Landmark Detection Accuracy and Efficiency with Knowledge Distillation | Zong-Wei Hong et.al. | 2404.06029 | null |
| 2024-04-08 | Retrieval-Augmented Open-Vocabulary Object Detection | Jooyeon Kim et.al. | 2404.05687 | link |
| 2024-04-08 | 3D-COCO: extension of MS-COCO dataset for image detection and 3D reconstruction modules | Maxence Bideaux et.al. | 2404.05641 | null |
| 2024-04-08 | PetKaz at SemEval-2024 Task 8: Can Linguistics Capture the Specifics of LLM-generated Text? | Kseniia Petukhova et.al. | 2404.05483 | null |
| 2024-04-08 | Detecting Every Object from Events | Haitian Zhang et.al. | 2404.05285 | link |
| 2024-04-08 | MOSE: Boosting Vision-based Roadside 3D Object Detection with Scene Cues | Xiahan Chen et.al. | 2404.05280 | null |
| 2024-04-08 | Rendering-Enhanced Automatic Image-to-Point Cloud Registration for Roadside Scenes | Yu Sheng et.al. | 2404.05164 | null |
| 2024-04-08 | Better Monocular 3D Detectors with LiDAR from the Past | Yurong You et.al. | 2404.05139 | link |
| 2024-04-07 | AirShot: Efficient Few-Shot Detection for Autonomous Exploration | Zihan Wang et.al. | 2404.05069 | link |
| 2024-04-07 | PlateSegFL: A Privacy-Preserving License Plate Detection Using Federated Segmentation Learning | Md. Shahriar Rahman Anuvab et.al. | 2404.05049 | null |
| 2024-04-07 | PathFinder: Attention-Driven Dynamic Non-Line-of-Sight Tracking with a Mobile Robot | Shenbagaraj Kannapiran et.al. | 2404.05024 | null |
| 2024-04-05 | SCAResNet: A ResNet Variant Optimized for Tiny Object Detection in Transmission and Distribution Towers | Weile Li et.al. | 2404.04179 | link |
| 2024-04-05 | Designing Robots to Help Women | Martin Cooney et.al. | 2404.04123 | null |
| 2024-04-04 | Is CLIP the main roadblock for fine-grained open-world perception? | Lorenzo Bianchi et.al. | 2404.03539 | link |
| 2024-04-04 | DQ-DETR: DETR with Dynamic Query for Tiny Object Detection | Yi-Xin Huang et.al. | 2404.03507 | link |
| 2024-04-05 | A Methodology to Study the Impact of Spiking Neural Network Parameters considering Event-Based Automotive Data | Iqra Bano et.al. | 2404.03493 | null |
| 2024-04-04 | MonoCD: Monocular 3D Object Detection with Complementary Depths | Longfei Yan et.al. | 2404.03181 | link |
| 2024-04-03 | DPFT: Dual Perspective Fusion Transformer for Camera-Radar-based Object Detection | Felix Fent et.al. | 2404.03015 | null |
| 2024-04-03 | ALOHa: A New Measure for Hallucination in Captioning Models | Suzanne Petryk et.al. | 2404.02904 | null |
| 2024-04-03 | FlightScope: A Deep Comprehensive Assessment of Aircraft Detection Algorithms in Satellite Imagery | Safouane El Ghazouali et.al. | 2404.02877 | link |
| 2024-04-03 | HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras | Zhongyu Xia et.al. | 2404.02517 | link |
| 2024-04-04 | TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression | Ho-Joong Kim et.al. | 2404.02405 | null |
| 2024-04-04 | EGTR: Extracting Graph from Transformer for Scene Graph Generation | Jinbae Im et.al. | 2404.02072 | link |
| 2024-04-03 | Cooperative Students: Navigating Unsupervised Domain Adaptation in Nighttime Object Detection | Jicheng Yuan et.al. | 2404.01988 | link |
| 2024-04-02 | Towards Enhanced Analysis of Lung Cancer Lesions in EBUS-TBNA – A Semi-Supervised Video Object Detection Method | Jyun-An Lin et.al. | 2404.01929 | null |
| 2024-04-02 | Humanizing Machine-Generated Content: Evading AI-Text Detection through Adversarial Attack | Ying Zhou et.al. | 2404.01907 | link |
| 2024-04-02 | Scene Adaptive Sparse Transformer for Event-based Object Detection | Yansong Peng et.al. | 2404.01882 | link |
| 2024-04-02 | Semi-Supervised Domain Adaptation for Wildfire Detection | JooYoung Jang et.al. | 2404.01842 | null |
| 2024-04-02 | Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection | Tahira Shehzadi et.al. | 2404.01819 | null |
| 2024-04-02 | Analyzing the Single Event Upset Vulnerability of Binarized Neural Networks on SRAM FPGAs | Ioanna Souvatzoglou et.al. | 2404.01757 | null |
| 2024-04-02 | Disentangled Pre-training for Human-Object Interaction Detection | Zhuolong Li et.al. | 2404.01725 | null |
| 2024-04-02 | Task Integration Distillation for Object Detectors | Hai Su et.al. | 2404.01699 | null |
| 2024-03-29 | PLoc: A New Evaluation Criterion Based on Physical Location for Autonomous Driving Datasets | Ruining Yang et.al. | 2403.19893 | null |
| 2024-03-29 | MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection | Ali Behrouz et.al. | 2403.19888 | link |
| 2024-03-28 | DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | Donghyun Kim et.al. | 2403.19588 | link |
| 2024-03-28 | OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation | Zhenyu Wang et.al. | 2403.19580 | null |
| 2024-03-28 | AIpom at SemEval-2024 Task 8: Detecting AI-produced Outputs in M4 | Alexander Shirnin et.al. | 2403.19354 | null |
| 2024-03-28 | Sparse Generation: Making Pseudo Labels Sparse for weakly supervision with points | Tian Ma et.al. | 2403.19306 | null |
| 2024-03-28 | CAT: Exploiting Inter-Class Dynamics for Domain Adaptive Object Detection | Mikhail Kennerley et.al. | 2403.19278 | link |
| 2024-03-28 | Algorithmic Ways of Seeing: Using Object Detection to Facilitate Art Exploration | Louie Søs Meyer et.al. | 2403.19174 | null |
| 2024-03-28 | CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation | Lingjun Zhao et.al. | 2403.19104 | null |
| 2024-03-28 | A Real-Time Framework for Domain-Adaptive Underwater Object Detection with Image Enhancement | Junjie Wen et.al. | 2403.19079 | null |
| 2024-03-27 | Illicit object detection in X-ray images using Vision Transformers | Jorgen Cani et.al. | 2403.19043 | null |
| 2024-03-27 | Benchmarking Object Detectors with COCO: A New Path Forward | Shweta Singh et.al. | 2403.18819 | link |
| 2024-03-27 | PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations | Ehsan Latif et.al. | 2403.18721 | null |
| 2024-03-27 | CosalPure: Learning Concept from Group Images for Robust Co-Saliency Detection | Jiayi Zhu et.al. | 2403.18554 | null |
| 2024-03-27 | BAM: Box Abstraction Monitors for Real-time OoD Detection in Object Detection | Changshun Wu et.al. | 2403.18373 | null |
| 2024-03-27 | Ship in Sight: Diffusion Models for Ship-Image Super Resolution | Luigi Sigillo et.al. | 2403.18370 | link |
| 2024-03-27 | DODA: Diffusion for Object-detection Domain Adaptation in Agriculture | Shuai Xiang et.al. | 2403.18334 | null |
| 2024-03-27 | Tracking-Assisted Object Detection with Event Cameras | Ting-Kang Yen et.al. | 2403.18330 | null |
| 2024-03-27 | SGDM: Static-Guided Dynamic Module Make Stronger Visual Models | Wenjie Xing et.al. | 2403.18282 | null |
| 2024-03-27 | Road Obstacle Detection based on Unknown Objectness Scores | Chihiro Noguchi et.al. | 2403.18207 | null |
| 2024-03-26 | State of the art applications of deep learning within tracking and detecting marine debris: A survey | Zoe Moorton et.al. | 2403.18067 | null |
| 2024-03-26 | The Solution for the CVPR 2023 1st foundation model challenge-Track2 | Haonan Xu et.al. | 2403.17702 | null |
| 2024-03-26 | PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition | Chenhongyi Yang et.al. | 2403.17695 | link |
| 2024-03-26 | UADA3D: Unsupervised Adversarial Domain Adaptation for 3D Object Detection with Sparse LiDAR and Large Domain Gaps | Maciej K Wozniak et.al. | 2403.17633 | null |
| 2024-03-26 | SSF3D: Strict Semi-Supervised 3D Object Detection with Switching Filter | Songbur Wong et.al. | 2403.17390 | null |
| 2024-03-26 | Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection | Jiacheng Zhang et.al. | 2403.17387 | null |
| 2024-03-26 | AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving | Mingfu Liang et.al. | 2403.17373 | null |
| 2024-03-26 | Staircase Localization for Autonomous Exploration in Urban Environments | Jinrae Kim et.al. | 2403.17330 | null |
| 2024-03-25 | Co-Occurring of Object Detection and Identification towards unlabeled object discovery | Binay Kumar Singh et.al. | 2403.17223 | null |
| 2024-03-25 | Optimizing LiDAR Placements for Robust Driving Perception in Adverse Conditions | Ye Li et.al. | 2403.17009 | link |
| 2024-03-25 | Isolated Diffusion: Optimizing Multi-Concept Text-to-Image Generation Training-Freely with Isolated Diffusion Guidance | Jingyuan Zhu et.al. | 2403.16954 | null |
| 2024-03-25 | TrustAI at SemEval-2024 Task 8: A Comprehensive Analysis of Multi-domain Machine Generated Text Detection Techniques | Ashok Urlana et.al. | 2403.16592 | null |
| 2024-03-25 | RCBEVDet: Radar-camera Fusion in Bird’s Eye View for 3D Object Detection | Zhiwei Lin et.al. | 2403.16440 | link |
| 2024-03-25 | ASDF: Assembly State Detection Utilizing Late Fusion by Integrating 6D Pose Estimation | Hannah Schieber et.al. | 2403.16400 | link |
| 2024-03-25 | Impact of Video Compression Artifacts on Fisheye Camera Visual Perception Tasks | Madhumitha Sakthi et.al. | 2403.16338 | null |
| 2024-03-24 | Cross-domain Multi-modal Few-shot Object Detection via Rich Text | Zeyu Shangguan et.al. | 2403.16188 | null |
| 2024-03-24 | Semantic Is Enough: Only Semantic Information For NeRF Reconstruction | Ruibo Wang et.al. | 2403.16043 | null |
| 2024-03-23 | Adversarial Defense Teacher for Cross-Domain Object Detection under Poor Visibility Conditions | Kaiwen Wang et.al. | 2403.15786 | null |
| 2024-03-23 | EAGLE: A Domain Generalization Framework for AI-generated Text Detection | Amrita Bhattacharjee et.al. | 2403.15690 | null |
| 2024-03-25 | Point-DETR3D: Leveraging Imagery Data with Spatial Point Prior for Weakly Semi-supervised 3D Object Detection | Hongzhi Gao et.al. | 2403.15317 | null |
| 2024-03-22 | CR3DT: Camera-RADAR Fusion for 3D Detection and Tracking | Nicolas Baumann et.al. | 2403.15313 | link |
| 2024-03-22 | IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection | Junbo Yin et.al. | 2403.15241 | null |
| 2024-03-22 | MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection | Taeheon Kim et.al. | 2403.15209 | null |
| 2024-03-22 | SFOD: Spiking Fusion Object Detector | Yimeng Fan et.al. | 2403.15192 | link |
| 2024-03-22 | CRPlace: Camera-Radar Fusion with BEV Representation for Place Recognition | Shaowei Fu et.al. | 2403.15183 | null |
| 2024-03-22 | An In-Depth Analysis of Data Reduction Methods for Sustainable Deep Learning | Víctor Toscano-Durán et.al. | 2403.15150 | null |
| 2024-03-22 | Gradient-based Sampling for Class Imbalanced Semi-supervised Object Detection | Jiaming Li et.al. | 2403.15127 | link |
| 2024-03-22 | VRSO: Visual-Centric Reconstruction for Static Object Annotation | Chenyao Yu et.al. | 2403.15026 | null |
| 2024-03-22 | Vehicle Detection Performance in Nordic Region | Hamam Mokayed et.al. | 2403.15017 | null |
| 2024-03-21 | T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy | Qing Jiang et.al. | 2403.14610 | link |
| 2024-03-21 | UAV-Assisted Maritime Search and Rescue: A Holistic Approach | Martin Messmer et.al. | 2403.14281 | null |
| 2024-03-21 | Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection | Tim Salzmann et.al. | 2403.14270 | null |
| 2024-03-21 | 3D Object Detection from Point Cloud via Voting Step Diffusion | Haoran Hou et.al. | 2403.14133 | null |
| 2024-03-20 | EcoSense: Energy-Efficient Intelligent Sensing for In-Shore Ship Detection through Edge-Cloud Collaboration | Wenjun Huang et.al. | 2403.14027 | null |
| 2024-03-20 | RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition | Ziyu Liu et.al. | 2403.13805 | link |
| 2024-03-20 | Bounding Box Stability against Feature Dropout Reflects Detector Generalization across Environments | Yang Yang et.al. | 2403.13803 | link |
| 2024-03-20 | Fostc3net: A Lightweight YOLOv5 Based On the Network Structure Optimization | Danqing Ma et.al. | 2403.13703 | null |
| 2024-03-20 | Find n’ Propagate: Open-Vocabulary 3D Object Detection in Urban Environments | Djamahl Etchegaray et.al. | 2403.13556 | link |
| 2024-03-20 | MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Di Wang et.al. | 2403.13430 | link |
| 2024-03-20 | Few-shot Oriented Object Detection with Memorable Contrastive Learning in Remote Sensing Images | Jiawei Zhou et.al. | 2403.13375 | null |
| 2024-03-20 | Adaptive Ensembles of Fine-Tuned Transformers for LLM-Generated Text Detection | Zhixin Lai et.al. | 2403.13335 | null |
| 2024-03-20 | DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception | Yibo Wang et.al. | 2403.13304 | null |
| 2024-03-20 | Facilitating Pornographic Text Detection for Open-Domain Dialogue Systems via Knowledge Distillation of Large Language Models | Huachuan Qiu et.al. | 2403.13250 | null |
| 2024-03-19 | SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model | Armen Avetisyan et.al. | 2403.13064 | null |
| 2024-03-19 | Wildfire danger prediction optimization with transfer learning | Spiros Maggioros et.al. | 2403.12871 | link |
| 2024-03-19 | As Firm As Their Foundations: Can open-sourced foundation models be used to create adversarial examples for downstream tasks? | Anjun Hu et.al. | 2403.12693 | null |
| 2024-03-19 | EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-based Detection with Recurrent Spiking Neural Networks | Ziming Wang et.al. | 2403.12574 | null |
| 2024-03-19 | DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM | Yixuan Wu et.al. | 2403.12488 | null |
| 2024-03-19 | TransformMix: Learning Transformation and Mixing Strategies from Data | Tsz-Him Cheung et.al. | 2403.12429 | null |
| 2024-03-19 | VisionGPT: LLM-Assisted Real-Time Anomaly Detection for Safe Visual Navigation | Hao Wang et.al. | 2403.12415 | null |
| 2024-03-19 | Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition | Jielin Qiu et.al. | 2403.12339 | null |
| 2024-03-18 | EffiPerception: an Efficient Framework for Various Perception Tasks | Xinhao Xiang et.al. | 2403.12317 | null |
| 2024-03-18 | Prototipo de un Contador Bidireccional Automático de Personas basado en sensores de visión 3D (Prototype of an Automatic Bidirectional People Counter Based on 3D Vision Sensors) | Benjamín Ojeda-Magaña et.al. | 2403.12310 | null |
| 2024-03-18 | Align and Distill: Unifying and Improving Domain Adaptive Object Detection | Justin Kay et.al. | 2403.12029 | link |
| 2024-03-18 | TrajectoryNAS: A Neural Architecture Search for Trajectory Prediction | Ali Asghar Sharifi et.al. | 2403.11695 | null |
| 2024-03-18 | Just Add $100 More: Augmenting NeRF-based Pseudo-LiDAR Point Cloud for Resolving Class-imbalance Problem | Mincheol Chang et.al. | 2403.11573 | null |
| 2024-03-18 | R2SNet: Scalable Domain Adaptation for Object Detection in Cloud-Based Robots Ecosystems via Proposal Refinement | Michele Antonazzi et.al. | 2403.11567 | null |
| 2024-03-18 | Continual Forgetting for Pre-trained Vision Models | Hongbo Zhao et.al. | 2403.11530 | link |
| 2024-03-17 | V2X-DGW: Domain Generalization for Multi-agent Perception under Adverse Weather Conditions | Baolu Li et.al. | 2403.11371 | link |
| 2024-03-17 | Advanced Knowledge Extraction of Physical Design Drawings, Translation and conversion to CAD formats using Deep Learning | Jesher Joshua M et.al. | 2403.11291 | null |
| 2024-03-17 | ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models | Siyuan Huang et.al. | 2403.11289 | link |
| 2024-03-17 | CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations | Yuwei Zhang et.al. | 2403.11220 | link |
| 2024-03-17 | GRA: Detecting Oriented Objects through Group-wise Rotating and Attention | Jiangshan Wang et.al. | 2403.11127 | null |
| 2024-03-17 | Self-supervised co-salient object detection via feature correspondence at multiple scales | Souradeep Chakraborty et.al. | 2403.11107 | link |
| 2024-03-14 | Open-Vocabulary Object Detection with Meta Prompt Representation and Instance Contrastive Optimization | Zhao Wang et.al. | 2403.09433 | null |
| 2024-03-14 | D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection | Dinh Phat Do et.al. | 2403.09359 | link |
| 2024-03-14 | Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring | Yufei Zhan et.al. | 2403.09333 | link |
| 2024-03-14 | EfficientMFD: Towards More Efficient Multimodal Synchronous Fusion Detection | Jiaqing Zhang et.al. | 2403.09323 | link |
| 2024-03-14 | Knowledge Distillation in YOLOX-ViT for Side-Scan Sonar Object Detection | Martin Aubard et.al. | 2403.09313 | link |
| 2024-03-14 | MOTPose: Multi-object 6D Pose Estimation for Dynamic Video Sequences using Attention-based Temporal Fusion | Arul Selvam Periyasamy et.al. | 2403.09309 | null |
| 2024-03-14 | CLIP-EBC: CLIP Can Count Accurately through Enhanced Blockwise Classification | Yiming Ma et.al. | 2403.09281 | link |
| 2024-03-14 | D-YOLO a robust framework for object detection in adverse weather conditions | Zihan Chu et.al. | 2403.09233 | null |
| 2024-03-14 | Improving Distant 3D Object Detection Using 2D Box Supervision | Zetong Yang et.al. | 2403.09230 | null |
| 2024-03-14 | PoIFusion: Multi-Modal 3D Object Detection via Fusion at Points of Interest | Jiajun Deng et.al. | 2403.09212 | null |
| 2024-03-13 | VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis | Enric Corona et.al. | 2403.08764 | null |
| 2024-03-13 | MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning | Jialv Zou et.al. | 2403.08760 | link |
| 2024-03-13 | Data Augmentation in Human-Centric Vision | Wentao Jiang et.al. | 2403.08650 | null |
| 2024-03-13 | PRAGO: Differentiable Multi-View Pose Optimization From Objectness Detections | Matteo Taiana et.al. | 2403.08586 | null |
| 2024-03-13 | A Multimodal Fusion Network For Student Emotion Recognition Based on Transformer and Tensor Product | Ao Xiang et.al. | 2403.08511 | null |
| 2024-03-13 | Improved YOLOv5 Based on Attention Mechanism and FasterNet for Foreign Object Detection on Railway and Airway tracks | Zongqing Qi et.al. | 2403.08499 | null |
| 2024-03-13 | IAMCV Multi-Scenario Vehicle Interaction Dataset | Novel Certad et.al. | 2403.08455 | null |
| 2024-03-13 | Advancing Security in AI Systems: A Novel Approach to Detecting Backdoors in Deep Neural Networks | Khondoker Murad Hossain et.al. | 2403.08208 | null |
| 2024-03-12 | TaskCLIP: Extend Large Vision-Language Model for Task Oriented Object Detection | Hanning Chen et.al. | 2403.08108 | null |
| 2024-03-12 | Aedes aegypti Egg Counting with Neural Networks for Object Detection | Micheli Nayara de Oliveira Vicente et.al. | 2403.08016 | null |
| 2024-03-12 | Mondrian: On-Device High-Performance Video Analytics with Compressive Packed Inference | Changmin Jeon et.al. | 2403.07598 | null |
| 2024-03-12 | PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution | Honghao Chen et.al. | 2403.07589 | null |
| 2024-03-12 | A Survey of Vision Transformers in Autonomous Driving: Current Trends and Future Directions | Quoc-Vinh Lai-Dang et.al. | 2403.07542 | null |
| 2024-03-12 | JSTR: Joint Spatio-Temporal Reasoning for Event-based Moving Object Detection | Hanyu Zhou et.al. | 2403.07436 | null |
| 2024-03-12 | Eliminating Cross-modal Conflicts in BEV Space for LiDAR-Camera 3D Object Detection | Jiahui Fu et.al. | 2403.07372 | null |
| 2024-03-12 | GPT-generated Text Detection: Benchmark Dataset and Tensor-based Detection Method | Zubair Qazi et.al. | 2403.07321 | link |
| 2024-03-12 | MENTOR: Multilingual tExt detectioN TOward leaRning by analogy | Hsin-Ju Lin et.al. | 2403.07286 | null |
| 2024-03-12 | SparseLIF: High-Performance Sparse LiDAR-Camera Fusion for 3D Object Detection | Hongcheng Zhang et.al. | 2403.07284 | null |
| 2024-03-12 | Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction | Alexander Timans et.al. | 2403.07263 | null |
| 2024-03-11 | Class Imbalance in Object Detection: An Experimental Diagnosis and Study of Mitigation Strategies | Nieves Crasto et.al. | 2403.07113 | link |
| 2024-03-11 | Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head | Tiancheng Zhao et.al. | 2403.06892 | link |
| 2024-03-11 | LeOCLR: Leveraging Original Images for Contrastive Learning of Visual Representations | Mohammad Alkhalefi et.al. | 2403.06813 | null |
| 2024-03-11 | Genetic Learning for Designing Sim-to-Real Data Augmentations | Bram Vanherle et.al. | 2403.06786 | null |
| 2024-03-11 | Evaluating the Energy Efficiency of Few-Shot Learning for Object Detection in Industrial Settings | Georgios Tsoumplekas et.al. | 2403.06631 | null |
| 2024-03-11 | Cross-domain and Cross-dimension Learning for Image-to-Graph Transformers | Alexander H. Berger et.al. | 2403.06601 | null |
| 2024-03-11 | SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection | Yuxuan Li et.al. | 2403.06534 | link |
| 2024-03-11 | 3D Semantic Segmentation-Driven Representations for 3D Object Detection | Hayeon O et.al. | 2403.06501 | null |
| 2024-03-11 | Fine-Grained Pillar Feature Encoding Via Spatio-Temporal Virtual Grid for 3D Object Detection | Konyul Park et.al. | 2403.06433 | null |
| 2024-03-10 | Transformer based Multitask Learning for Image Captioning and Object Detection | Debolena Basak et.al. | 2403.06292 | null |
| 2024-03-10 | Poly Kernel Inception Network for Remote Sensing Detection | Xinhao Cai et.al. | 2403.06258 | link |
| 2024-03-08 | EVD4UAV: An Altitude-Sensitive Benchmark to Evade Vehicle Detection in UAV | Huiming Sun et.al. | 2403.05422 | null |
| 2024-03-08 | SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection | Yahao Lu et.al. | 2403.05416 | link |
| 2024-03-08 | Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery | Xavier Bou et.al. | 2403.05381 | null |
| 2024-03-08 | Frequency-Adaptive Dilated Convolution for Semantic Segmentation | Linwei Chen et.al. | 2403.05369 | link |
| 2024-03-08 | VLM-PL: Advanced Pseudo Labeling approach Class Incremental Object Detection with Vision-Language Model | Junsu Kim et.al. | 2403.05346 | null |
| 2024-03-08 | Improving the Successful Robotic Grasp Detection Using Convolutional Neural Networks | Hamed Hosseini et.al. | 2403.05211 | null |
| 2024-03-08 | LanePtrNet: Revisiting Lane Detection as Point Voting and Grouping on Curves | Jiayan Cao et.al. | 2403.05155 | null |
| 2024-03-08 | RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features | Geonho Bang et.al. | 2403.05061 | null |
| 2024-03-08 | ActFormer: Scalable Collaborative Perception via Active Queries | Suozhi Huang et.al. | 2403.04968 | null |
| 2024-03-07 | FriendNet: Detection-Friendly Dehazing Network | Yihua Fan et.al. | 2403.04443 | null |
| 2024-03-07 | Effectiveness Assessment of Recent Large Vision-Language Models | Yao Jiang et.al. | 2403.04306 | null |
| 2024-03-07 | ACC-ViT : Atrous Convolution’s Comeback in Vision Transformers | Nabil Ibtehaz et.al. | 2403.04200 | null |
| 2024-03-07 | CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoors Object Detection from Multi-view Images | Guanlin Shen et.al. | 2403.04198 | null |
| 2024-03-07 | Scalable and Robust Transformer Decoders for Interpretable Image Classification with Foundation Models | Evelyn Mannix et.al. | 2403.04125 | null |
| 2024-03-07 | CMDA: Cross-Modal and Domain Adversarial Adaptation for LiDAR-Based 3D Object Detection | Gyusam Chang et.al. | 2403.03721 | null |
| 2024-03-06 | Adversarial Infrared Geometry: Using Geometry to Perform Adversarial Attack against Infrared Pedestrian Detectors | Kalibinuer Tiliwalidi et.al. | 2403.03674 | null |
| 2024-03-06 | Towards Detecting AI-Generated Text within Human-AI Collaborative Hybrid Texts | Zijie Zeng et.al. | 2403.03506 | link |
| 2024-03-06 | Multi-task Learning for Real-time Autonomous Driving Leveraging Task-adaptive Attention Generator | Wonhyeok Choi et.al. | 2403.03468 | null |
| 2024-03-06 | FLAME Diffuser: Grounded Wildfire Image Synthesis using Mask Guided Diffusion | Hao Wang et.al. | 2403.03463 | null |
| 2024-03-06 | Performance Evaluation of Semi-supervised Learning Frameworks for Multi-Class Weed Detection | Jiajia Li et.al. | 2403.03390 | link |
| 2024-03-05 | Detecting Concrete Visual Tokens for Multimodal Machine Translation | Braeden Bowen et.al. | 2403.03075 | null |
| 2024-03-05 | Loss Design for Single-carrier Joint Communication and Neural Network-based Sensing | Charlotte Muth et.al. | 2403.02929 | null |
| 2024-03-05 | Are Dense Labels Always Necessary for 3D Object Detection from Point Cloud? | Chenqiang Gao et.al. | 2403.02818 | null |
| 2024-03-05 | Bootstrapping Rare Object Detection in High-Resolution Satellite Imagery | Akram Zaytar et.al. | 2403.02736 | null |
| 2024-03-05 | FastOcc: Accelerating 3D Occupancy Prediction by Fusing the 2D Bird’s-Eye View and Perspective View | Jiawei Hou et.al. | 2403.02710 | null |
| 2024-03-05 | False Positive Sampling-based Data Augmentation for Enhanced 3D Object Detection Accuracy | Jiyong Oh et.al. | 2403.02639 | null |
| 2024-03-05 | BSDP: Brain-inspired Streaming Dual-level Perturbations for Online Open World Object Detection | Yu Chen et.al. | 2403.02637 | null |
| 2024-03-04 | NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function | Abdullah Nazhat Abdullah et.al. | 2403.02411 | link |
| 2024-03-04 | COMMIT: Certifying Robustness of Multi-Sensor Fusion Systems against Semantic Attacks | Zijian Huang et.al. | 2403.02329 | null |
| 2024-03-04 | Scalable Vision-Based 3D Object Detection and Monocular Depth Estimation for Autonomous Driving | Yuxuan Liu et.al. | 2403.02037 | link |
| 2024-03-02 | TUMTraf V2X Cooperative Perception Dataset | Walter Zimmer et.al. | 2403.01316 | null |
| 2024-03-02 | Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection | Taeheon Kim et.al. | 2403.01300 | null |
| 2024-03-02 | Run-time Introspection of 2D Object Detection in Automated Driving Systems Using Learning Representations | Hakan Yekta Yatbaz et.al. | 2403.01172 | null |
| 2024-03-02 | ELA: Efficient Local Attention for Deep Convolutional Neural Networks | Wei Xu et.al. | 2403.01123 | null |
| 2024-03-02 | Face Swap via Diffusion Model | Feifei Wang et.al. | 2403.01108 | link |
| 2024-03-02 | Beyond Night Visibility: Adaptive Multi-Scale Fusion of Infrared and Visible Images | Shufan Pei et.al. | 2403.01083 | null |
| 2024-03-01 | Learning Causal Features for Incremental Object Detection | Zhenwei He et.al. | 2403.00591 | null |
| 2024-03-01 | Abductive Ego-View Accident Video Understanding for Safe Driving Perception | Jianwu Fang et.al. | 2403.00436 | null |
| 2024-03-04 | DAMS-DETR: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion | Junjie Guo et.al. | 2403.00326 | null |
| 2024-03-01 | ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting | Chen Duan et.al. | 2403.00303 | link |
| 2024-02-29 | SeMoLi: What Moves Together Belongs Together | Jenny Seidenschwarz et.al. | 2402.19463 | null |
| 2024-02-29 | Genie: Smart ROS-based Caching for Connected Autonomous Robots | Zexin Li et.al. | 2402.19410 | null |
| 2024-02-29 | ProtoP-OD: Explainable Object Detection with Prototypical Parts | Pavlos Rath-Manakidis et.al. | 2402.19142 | null |
| 2024-02-29 | Theoretically Achieving Continuous Representation of Oriented Bounding Boxes | Zikai Xiao et.al. | 2402.18975 | link |
| 2024-02-29 | Boosting Semi-Supervised Object Detection in Remote Sensing Images With Active Teaching | Boxuan Zhang et.al. | 2402.18958 | null |
| 2024-02-29 | Edge Computing Enabled Real-Time Video Analysis via Adaptive Spatial-Temporal Semantic Filtering | Xiang Chen et.al. | 2402.18927 | null |
| 2024-02-29 | A Simple yet Effective Network based on Vision Transformer for Camouflaged Object and Salient Object Detection | Chao Hao et.al. | 2402.18922 | null |
| 2024-02-29 | Privacy-Preserving Autoencoder for Collaborative Object Detection | Bardia Azizian et.al. | 2402.18864 | null |
| 2024-02-29 | Debiased Novel Category Discovering and Localization | Juexiao Feng et.al. | 2402.18821 | null |
| 2024-02-28 | Spatial Coherence Loss for Salient and Camouflaged Object Detection and Beyond | Ziyun Yang et.al. | 2402.18698 | null |
| 2024-02-28 | UniMODE: Unified Monocular 3D Object Detection | Zhuoling Li et.al. | 2402.18573 | null |
| 2024-02-28 | Detection of Micromobility Vehicles in Urban Traffic Videos | Khalil Sabri et.al. | 2402.18503 | link |
| 2024-02-28 | Sunshine to Rainstorm: Cross-Weather Knowledge Distillation for Robust 3D Object Detection | Xun Huang et.al. | 2402.18493 | null |
| 2024-02-28 | Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization | Deng Li et.al. | 2402.18447 | null |
| 2024-02-28 | Unveiling novel insights into Kirchhoff migration for effective object detection using experimental Fresnel dataset | Won-Kwang Park et.al. | 2402.18322 | null |
| 2024-02-28 | Zero-Shot Aerial Object Detection with Visual Description Regularization | Zhengqing Zang et.al. | 2402.18233 | null |
| 2024-02-28 | VulMCI : Code Splicing-based Pixel-row Oversampling for More Continuous Vulnerability Image Generation | Tao Peng et.al. | 2402.18189 | null |
| 2024-02-27 | SDDGR: Stable Diffusion-based Deep Generative Replay for Class Incremental Object Detection | Junsu Kim et.al. | 2402.17323 | null |
| 2024-02-27 | A Vanilla Multi-Task Framework for Dense Visual Prediction Solution to 1st VCL Challenge – Multi-Task Robustness Track | Zehui Chen et.al. | 2402.17319 | null |
| 2024-02-27 | Probing Multimodal Large Language Models for Global and Local Semantic Representation | Mingxu Tao et.al. | 2402.17304 | null |