Object Detection - 2024-11 | Paper Arxiv Daily

Publish Date	Title	Authors	PDF (arXiv ID)	Translate	Read	Code
2024-11-29	SpaRC: Sparse Radar-Camera Fusion for 3D Object Detection	Philipp Wolters et.al.	2411.19860	translate	read	null
2024-11-29	Feedback-driven object detection and iterative model improvement	Sönke Tenckhoff et.al.	2411.19835	translate	read	link
2024-11-29	Real-Time Anomaly Detection in Video Streams	Fabien Poirier et.al.	2411.19731	translate	read	null
2024-11-29	LDA-AQU: Adaptive Query-guided Upsampling via Local Deformable Attention	Zewen Du et.al.	2411.19585	translate	read	link
2024-11-29	Bootstraping Clustering of Gaussians for View-consistent 3D Scene Understanding	Wenbo Zhang et.al.	2411.19551	translate	read	null
2024-11-28	Automatic Prompt Generation and Grounding Object Detection for Zero-Shot Image Anomaly Detection	Tsun-Hin Cheung et.al.	2411.19220	translate	read	null
2024-11-28	Co-Learning: Towards Semi-Supervised Object Detection with Road-side Cameras	Jicheng Yuan et.al.	2411.19143	translate	read	null
2024-11-28	On Moving Object Segmentation from Monocular Video with Transformers	Christian Homeyer et.al.	2411.19141	translate	read	null
2024-11-28	Dynamic Attention and Bi-directional Fusion for Safety Helmet Wearing Detection	Junwei Feng et.al.	2411.19071	translate	read	null
2024-11-28	MVFormer: Diversifying Feature Normalization and Token Mixing for Efficient Vision Transformers	Jongseong Bae et.al.	2411.18995	translate	read	null
2024-11-27	Exploring Depth Information for Detecting Manipulated Face Videos	Haoyue Wang et.al.	2411.18572	translate	read	null
2024-11-27	Efficient Dynamic LiDAR Odometry for Mobile Robots with Structured Point Clouds	Jonathan Lichtenfeld et.al.	2411.18443	translate	read	link
2024-11-27	Deep Fourier-embedded Network for Bi-modal Salient Object Detection	Pengfei Lyu et.al.	2411.18409	translate	read	link
2024-11-27	Optimizing Multispectral Object Detection: A Bag of Tricks and Comprehensive Benchmarks	Chen Zhou et.al.	2411.18288	translate	read	link
2024-11-27	From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects	Zizhao Li et.al.	2411.18207	translate	read	link
2024-11-27	RPEE-HEADS: A Novel Benchmark for Pedestrian Head Detection in Crowd Videos	Mohamad Abubaker et.al.	2411.18164	translate	read	null
2024-11-27	Revisiting Misalignment in Multispectral Pedestrian Detection: A Language-Driven Approach for Cross-modal Alignment Fusion	Taeheon Kim et.al.	2411.17995	translate	read	null
2024-11-27	ROICtrl: Boosting Instance Control for Visual Generation	Yuchao Gu et.al.	2411.17949	translate	read	null
2024-11-26	Box for Mask and Mask for Box: weak losses for multi-task partially supervised learning	Hoàng-Ân Lê et.al.	2411.17536	translate	read	link
2024-11-26	TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba	Xiaowen Ma et.al.	2411.17473	translate	read	link
2024-11-26	Communication-Efficient Cooperative SLAMMOT via Determining the Number of Collaboration Vehicles	Susu Fang et.al.	2411.17432	translate	read	null
2024-11-26	DGNN-YOLO: Dynamic Graph Neural Networks with YOLO11 for Small Object Detection and Tracking in Traffic Surveillance	Shahriar Soudeep et.al.	2411.17251	translate	read	null
2024-11-26	Event-based Spiking Neural Networks for Object Detection: A Review of Datasets, Architectures, Learning Rules, and Implementation	Craig Iaboni et.al.	2411.17006	translate	read	link
2024-11-25	Online Episodic Memory Visual Query Localization with Egocentric Streaming Object Memory	Zaira Manigrasso et.al.	2411.16934	translate	read	null
2024-11-25	Open Vocabulary Monocular 3D Object Detection	Jin Yao et.al.	2411.16833	translate	read	link
2024-11-25	Imperceptible Adversarial Examples in the Physical World	Weilin Xu et.al.	2411.16622	translate	read	null
2024-11-25	STDWeb: Simple Transient Detection pipeline for the Web	Sergey Karpov et.al.	2411.16470	translate	read	null
2024-11-25	Machine Learning for the Digital Typhoon Dataset: Extensions to Multiple Basins and New Developments in Representations and Tasks	Asanobu Kitamoto et.al.	2411.16421	translate	read	link
2024-11-25	CutS3D: Cutting Semantics in 3D for 2D Unsupervised Instance Segmentation	Leon Sick et.al.	2411.16319	translate	read	null
2024-11-25	Diagnosis of diabetic retinopathy using machine learning & deep learning technique	Eric Shah et.al.	2411.16250	translate	read	null
2024-11-25	Interpreting Object-level Foundation Models via Visual Precision Search	Ruoyu Chen et.al.	2411.16198	translate	read	null
2024-11-25	Learn from Foundation Model: Fruit Detection Model without Manual Annotation	Yanan Wang et.al.	2411.16196	translate	read	null
2024-11-25	CIA: Controllable Image Augmentation Framework Based on Stable Diffusion	Mohamed Benkedadra et.al.	2411.16128	translate	read	null
2024-11-25	You only thermoelastically deform once: Point Absorber Detection in LIGO Test Masses with YOLO	Simon R. Goode et.al.	2411.16104	translate	read	null
2024-11-25	Leverage Task Context for Object Affordance Ranking	Haojie Huang et.al.	2411.16082	translate	read	null
2024-11-22	A Real-Time DETR Approach to Bangladesh Road Object Detection for Autonomous Vehicles	Irfan Nafiz Shahan et.al.	2411.15110	translate	read	null
2024-11-22	MSSF: A 4D Radar and Camera Fusion Framework With Multi-Stage Sampling for 3D Object Detection in Autonomous Driving	Hongsi Liu et.al.	2411.15016	translate	read	null
2024-11-22	VisionPAD: A Vision-Centric Pre-training Paradigm for Autonomous Driving	Haiming Zhang et.al.	2411.14716	translate	read	null
2024-11-21	Unveiling the Hidden: A Comprehensive Evaluation of Underwater Image Enhancement and Its Impact on Object Detection	Ali Awad et.al.	2411.14626	translate	read	null
2024-11-21	DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding	Tianhe Ren et.al.	2411.14347	translate	read	link
2024-11-21	AnywhereDoor: Multi-Target Backdoor Attacks on Object Detection	Jialin Lu et.al.	2411.14243	translate	read	null
2024-11-21	Transforming Static Images Using Generative Models for Video Salient Object Detection	Suhwan Cho et.al.	2411.13975	translate	read	link
2024-11-21	Multitask Learning for SAR Ship Detection with Gaussian-Mask Joint Segmentation	Ming Zhao et.al.	2411.13847	translate	read	null
2024-11-20	MambaDETR: Query-based Temporal Modeling using State Space Model for Multi-View 3D Object Detection	Tong Ning et.al.	2411.13628	translate	read	null
2024-11-20	DIS-Mine: Instance Segmentation for Disaster-Awareness in Poor-Light Condition in Underground Mines	Mizanur Rahman Jewel et.al.	2411.13544	translate	read	null
2024-11-20	A Resource Efficient Fusion Network for Object Detection in Bird’s-Eye View using Camera and Raw Radar Data	Kavin Chandrasekaran et.al.	2411.13311	translate	read	link
2024-11-20	VADet: Multi-frame LiDAR 3D Object Detection using Variable Aggregation	Chengjie Huang et.al.	2411.13186	translate	read	null
2024-11-20	RAW-Diffusion: RGB-Guided Diffusion Models for High-Fidelity RAW Image Generation	Christoph Reinders et.al.	2411.13150	translate	read	link
2024-11-20	YCB-LUMA: YCB Object Dataset with Luminance Keying for Object Localization	Thomas Pöllabauer et.al.	2411.13149	translate	read	link
2024-11-20	Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension	Yongdong Luo et.al.	2411.13093	translate	read	link
2024-11-20	Bounding-box Watermarking: Defense against Model Extraction Attacks on Object Detectors	Satoru Koda et.al.	2411.13047	translate	read	null
2024-11-20	Collaborative Feature-Logits Contrastive Learning for Open-Set Semi-Supervised Object Detection	Xinhao Zhong et.al.	2411.13001	translate	read	null
2024-11-19	Maps from Motion (MfM): Generating 2D Semantic Maps from Sparse Multi-view Images	Matteo Toso et.al.	2411.12620	translate	read	null
2024-11-19	GaussianPretrain: A Simple Unified 3D Gaussian Representation for Visual Pre-training in Autonomous Driving	Shaoqing Xu et.al.	2411.12452	translate	read	null
2024-11-19	Physics-Guided Detector for SAR Airplanes	Zhongling Huang et.al.	2411.12301	translate	read	link
2024-11-18	Scaling Deep Learning Research with Kubernetes on the NRP Nautilus HyperCluster	J. Alex Hurt et.al.	2411.12038	translate	read	null
2024-11-18	LightFFDNets: Lightweight Convolutional Neural Networks for Rapid Facial Forgery Detection	Günel Jabbarlı et.al.	2411.11826	translate	read	null
2024-11-18	WoodYOLO: A Novel Object Detector for Wood Species Detection in Microscopic Images	Lars Nieradzik et.al.	2411.11738	translate	read	null
2024-11-18	Exploring Emerging Trends and Research Opportunities in Visual Place Recognition	Antonios Gasteratos et.al.	2411.11481	translate	read	null
2024-11-18	SL-YOLO: A Stronger and Lighter Drone Target Detection Model	Defan Chen et.al.	2411.11477	translate	read	null
2024-11-19	EVT: Efficient View Transformation for Multi-Modal 3D Object Detection	Yongjin Lee et.al.	2411.10715	translate	read	null
2024-11-15	Vision Eagle Attention: A New Lens for Advancing Image Classification	Mahmudul Hasan et.al.	2411.10564	translate	read	link
2024-11-15	Interactive Image-Based Aphid Counting in Yellow Water Traps under Stirring Actions	Xumin Gao et.al.	2411.10357	translate	read	null
2024-11-15	RETR: Multi-View Radar Detection Transformer for Indoor Perception	Ryoma Yataka et.al.	2411.10293	translate	read	null
2024-11-15	Visual-Linguistic Agent: Towards Collaborative Contextual Object Reasoning	Jingru Yang et.al.	2411.10252	translate	read	null
2024-11-15	Real-Time AI-Driven People Tracking and Counting Using Overhead Cameras	Ishrath Ahamed et.al.	2411.10072	translate	read	null
2024-11-15	Diachronic Document Dataset for Semantic Layout Analysis	Thibault Clérice et.al.	2411.10068	translate	read	null
2024-11-14	Adversarial Attacks Using Differentiable Rendering: A Survey	Matthew Hull et.al.	2411.09749	translate	read	null
2024-11-14	Local-Global Attention: An Adaptive Mechanism for Multi-Scale Feature Integration	Yifan Shao et.al.	2411.09604	translate	read	link
2024-11-14	Long-Tailed Object Detection Pre-training: Dynamic Rebalancing Contrastive Learning with Dual Reconstruction	Chen-Long Duan et.al.	2411.09453	translate	read	null
2024-11-14	Instruction-Driven Fusion of Infrared-Visible Images: Tailoring for Diverse Downstream Tasks	Zengyi Yang et.al.	2411.09387	translate	read	null
2024-11-14	DT-JRD: Deep Transformer based Just Recognizable Difference Prediction Model for Video Coding for Machines	Junqi Liu et.al.	2411.09308	translate	read	null
2024-11-14	Cross-Modal Consistency in Multimodal Large Language Models	Xiang Zhang et.al.	2411.09273	translate	read	null
2024-11-14	LEAP:D – A Novel Prompt-based Approach for Domain-Generalized Aerial Object Detection	Chanyeong Park et.al.	2411.09180	translate	read	null
2024-11-13	Multimodal Object Detection using Depth and Image Data for Manufacturing Parts	Nazanin Mahjourian et.al.	2411.09062	translate	read	null
2024-11-13	DART-LLM: Dependency-Aware Multi-Robot Task Decomposition and Execution using Large Language Models	Yongdong Wang et.al.	2411.09022	translate	read	null
2024-11-13	UIFormer: A Unified Transformer-based Framework for Incremental Few-Shot Object Detection and Instance Segmentation	Chengyuan Zhang et.al.	2411.08569	translate	read	null
2024-11-13	Methodology for a Statistical Analysis of Influencing Factors on 3D Object Detection Performance	Anton Kuznietsov et.al.	2411.08482	translate	read	null
2024-11-13	V2X-R: Cooperative LiDAR-4D Radar Fusion for 3D Object Detection with Denoising Diffusion	Xun Huang et.al.	2411.08402	translate	read	link
2024-11-12	Large-scale Remote Sensing Image Target Recognition and Automatic Annotation	Wuzheng Dong et.al.	2411.07802	translate	read	link
2024-11-12	Efficient 3D Perception on Multi-Sweep Point Cloud with Gumbel Spatial Pruning	Jianhao Li et.al.	2411.07742	translate	read	null
2024-11-12	Depthwise Separable Convolutions with Deep Residual Convolutions	Md Arid Hasan et.al.	2411.07544	translate	read	null
2024-11-11	Transformers for Charged Particle Track Reconstruction in High Energy Physics	Samuel Van Stroud et.al.	2411.07149	translate	read	null
2024-11-11	Multi-scale Frequency Enhancement Network for Blind Image Deblurring	Yawen Xiang et.al.	2411.06893	translate	read	null
2024-11-11	Fast and Efficient Transformer-based Method for Bird’s Eye View Instance Prediction	Miguel Antunes-García et.al.	2411.06851	translate	read	link
2024-11-11	AV-PedAware: Self-Supervised Audio-Visual Fusion for Dynamic Pedestrian Awareness	Yizhuo Yang et.al.	2411.06789	translate	read	null
2024-11-11	United Domain Cognition Network for Salient Object Detection in Optical Remote Sensing Images	Yanguang Sun et.al.	2411.06703	translate	read	link
2024-11-11	Track Any Peppers: Weakly Supervised Sweet Pepper Tracking Using VLMs	Jia Syuen Lim et.al.	2411.06702	translate	read	null
2024-11-11	LFSamba: Marry SAM with Mamba for Light Field Salient Object Detection	Zhengyi Liu et.al.	2411.06652	translate	read	null
2024-11-09	Robust Detection of LLM-Generated Text: A Comparative Analysis	Yongye Su et.al.	2411.06248	translate	read	null
2024-11-09	LSSInst: Improving Geometric Modeling in LSS-Based BEV Perception with Instance Representation	Weijie Ma et.al.	2411.06173	translate	read	link
2024-11-09	AI-Compass: A Comprehensive and Effective Multi-module Testing Tool for AI Systems	Zhiyu Zhu et.al.	2411.06146	translate	read	null
2024-11-08	Open-set object detection: towards unified problem formulation and benchmarking	Hejer Ammar et.al.	2411.05564	translate	read	null
2024-11-08	ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving	Tao Ma et.al.	2411.05311	translate	read	null
2024-11-08	SimpleBEV: Improved LiDAR-Camera Fusion Architecture for 3D Object Detection	Yun Zhao et.al.	2411.05292	translate	read	null
2024-11-07	On the Inherent Robustness of One-Stage Object Detection against Out-of-Distribution Data	Aitor Martinez-Seras et.al.	2411.04586	translate	read	null
2024-11-07	l0-Regularized Sparse Coding-based Interpretable Network for Multi-Modal Image Fusion	Gargi Panda et.al.	2411.04519	translate	read	null
2024-11-07	Pose2Trajectory: Using Transformers on Body Pose to Predict Tennis Player’s Trajectory	Ali K. AlShami et.al.	2411.04501	translate	read	null
2024-11-07	SuperQ-GRASP: Superquadrics-based Grasp Pose Estimation on Larger Objects for Mobile-Manipulation	Xun Tu et.al.	2411.04386	translate	read	null
2024-11-07	UEVAVD: A Dataset for Developing UAV’s Eye View Active Object Detection	Xinhua Jiang et.al.	2411.04348	translate	read	null
2024-11-07	GazeGen: Gaze-Driven User Interaction for Visual Content Generation	He-Yen Hsieh et.al.	2411.04335	translate	read	null
2024-11-06	An Enhancement of Haar Cascade Algorithm Applied to Face Recognition for Gate Pass Security	Clarence A. Antipona et.al.	2411.03831	translate	read	null
2024-11-06	Understanding the Effects of Human-written Paraphrases in LLM-generated Text Detection	Hiu Ting Lau et.al.	2411.03806	translate	read	link
2024-11-06	Efficient Fourier Filtering Network with Contrastive Learning for UAV-based Unaligned Bi-modal Salient Object Detection	Pengfei Lyu et.al.	2411.03728	translate	read	link
2024-11-06	Estimation of Psychosocial Work Environment Exposures Through Video Object Detection. Proof of Concept Using CCTV Footage	Claus D. Hansen et.al.	2411.03724	translate	read	null
2024-11-06	Hybrid Attention for Robust RGB-T Pedestrian Detection in Real-World Conditions	Arunkumar Rathinam et.al.	2411.03576	translate	read	null
2024-11-05	An Application-Agnostic Automatic Target Recognition System Using Vision Language Models	Anthony Palladino et.al.	2411.03491	translate	read	null
2024-11-05	Self-supervised cross-modality learning for uncertainty-aware object detection and recognition in applications which lack pre-labelled training data	Irum Mehboob et.al.	2411.03082	translate	read	null
2024-11-05	CRT-Fusion: Camera, Radar, Temporal Fusion Using Motion Information for 3D Object Detection	Jisong Kim et.al.	2411.03013	translate	read	null
2024-11-05	Centerness-based Instance-aware Knowledge Distillation with Task-wise Mutual Lifting for Object Detection on Drone Imagery	Bowei Du et.al.	2411.02861	translate	read	null
2024-11-05	Correlation of Object Detection Performance with Visual Saliency and Depth Estimation	Matthias Bartolo et.al.	2411.02844	translate	read	link
2024-11-05	ERUP-YOLO: Enhancing Object Detection Robustness for Adverse Weather Condition by Unified Image-Adaptive Processing	Yuka Ogino et.al.	2411.02799	translate	read	null
2024-11-05	Real-Time Text Detection with Similar Mask in Traffic, Industrial, and Natural Scenes	Xu Han et.al.	2411.02794	translate	read	link
2024-11-05	Efficient Feature Aggregation and Scale-Aware Regression for Monocular 3D Object Detection	Yifan Wang et.al.	2411.02747	translate	read	null
2024-11-05	Analysis of Multi-epoch JWST Images of $\sim 300$ Little Red Dots: Tentative Detection of Variability in a Minority of Sources	Zijian Zhang et.al.	2411.02729	translate	read	null
2024-11-04	Intelligent Video Recording Optimization using Activity Detection for Surveillance Systems	Youssef Elmir et.al.	2411.02632	translate	read	null
2024-11-04	SIRA: Scalable Inter-frame Relation and Association for Radar Perception	Ryoma Yataka et.al.	2411.02220	translate	read	null
2024-11-04	Advanced computer vision for extracting georeferenced vehicle trajectories from drone imagery	Robert Fonod et.al.	2411.02136	translate	read	null
2024-11-04	Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation	Yan Li et.al.	2411.02057	translate	read	link
2024-11-04	V-CAS: A Realtime Vehicle Anti Collision System Using Vision Transformer on Multi-Camera Streams	Muhammad Waqas Ashraf et.al.	2411.01963	translate	read	null
2024-11-04	Exploiting Contextual Uncertainty of Visual Data for Efficient Training of Deep Models	Sharat Agarwal et.al.	2411.01925	translate	read	null
2024-11-04	LiDAttack: Robust Black-box Attack on LiDAR-based Object Detection	Jinyin Chen et.al.	2411.01889	translate	read	link
2024-11-03	ROAD-Waymo: Action Awareness at Scale for Autonomous Driving	Salman Khan et.al.	2411.01683	translate	read	null
2024-11-03	OSAD: Open-Set Aircraft Detection in SAR Images	Xiayang Xiao et.al.	2411.01597	translate	read	null
2024-11-03	One for All: Multi-Domain Joint Training for Point Cloud Based 3D Object Detection	Zhenyu Wang et.al.	2411.01584	translate	read	null
2024-11-03	A Visual Question Answering Method for SAR Ship: Breaking the Requirement for Multimodal Dataset Construction and Model Fine-Tuning	Fei Wang et.al.	2411.01445	translate	read	null
