Object Detection - 2024-03
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-03-29 | PLoc: A New Evaluation Criterion Based on Physical Location for Autonomous Driving Datasets | Ruining Yang et.al. | 2403.19893 | translate | read | null |
| 2024-03-29 | MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection | Ali Behrouz et.al. | 2403.19888 | translate | read | link |
| 2024-03-28 | DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | Donghyun Kim et.al. | 2403.19588 | translate | read | link |
| 2024-03-28 | OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation | Zhenyu Wang et.al. | 2403.19580 | translate | read | null |
| 2024-03-28 | AIpom at SemEval-2024 Task 8: Detecting AI-produced Outputs in M4 | Alexander Shirnin et.al. | 2403.19354 | translate | read | null |
| 2024-03-28 | Sparse Generation: Making Pseudo Labels Sparse for weakly supervision with points | Tian Ma et.al. | 2403.19306 | translate | read | null |
| 2024-03-28 | CAT: Exploiting Inter-Class Dynamics for Domain Adaptive Object Detection | Mikhail Kennerley et.al. | 2403.19278 | translate | read | link |
| 2024-03-28 | Algorithmic Ways of Seeing: Using Object Detection to Facilitate Art Exploration | Louie Søs Meyer et.al. | 2403.19174 | translate | read | null |
| 2024-03-28 | CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation | Lingjun Zhao et.al. | 2403.19104 | translate | read | null |
| 2024-03-28 | A Real-Time Framework for Domain-Adaptive Underwater Object Detection with Image Enhancement | Junjie Wen et.al. | 2403.19079 | translate | read | null |
| 2024-03-27 | Illicit object detection in X-ray images using Vision Transformers | Jorgen Cani et.al. | 2403.19043 | translate | read | null |
| 2024-03-27 | Benchmarking Object Detectors with COCO: A New Path Forward | Shweta Singh et.al. | 2403.18819 | translate | read | link |
| 2024-03-27 | PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations | Ehsan Latif et.al. | 2403.18721 | translate | read | null |
| 2024-03-27 | CosalPure: Learning Concept from Group Images for Robust Co-Saliency Detection | Jiayi Zhu et.al. | 2403.18554 | translate | read | null |
| 2024-03-27 | BAM: Box Abstraction Monitors for Real-time OoD Detection in Object Detection | Changshun Wu et.al. | 2403.18373 | translate | read | null |
| 2024-03-27 | Ship in Sight: Diffusion Models for Ship-Image Super Resolution | Luigi Sigillo et.al. | 2403.18370 | translate | read | link |
| 2024-03-27 | DODA: Diffusion for Object-detection Domain Adaptation in Agriculture | Shuai Xiang et.al. | 2403.18334 | translate | read | null |
| 2024-03-27 | Tracking-Assisted Object Detection with Event Cameras | Ting-Kang Yen et.al. | 2403.18330 | translate | read | null |
| 2024-03-27 | SGDM: Static-Guided Dynamic Module Make Stronger Visual Models | Wenjie Xing et.al. | 2403.18282 | translate | read | null |
| 2024-03-27 | Road Obstacle Detection based on Unknown Objectness Scores | Chihiro Noguchi et.al. | 2403.18207 | translate | read | null |
| 2024-03-26 | State of the art applications of deep learning within tracking and detecting marine debris: A survey | Zoe Moorton et.al. | 2403.18067 | translate | read | null |
| 2024-03-26 | The Solution for the CVPR 2023 1st foundation model challenge-Track2 | Haonan Xu et.al. | 2403.17702 | translate | read | null |
| 2024-03-26 | PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition | Chenhongyi Yang et.al. | 2403.17695 | translate | read | link |
| 2024-03-26 | UADA3D: Unsupervised Adversarial Domain Adaptation for 3D Object Detection with Sparse LiDAR and Large Domain Gaps | Maciej K Wozniak et.al. | 2403.17633 | translate | read | null |
| 2024-03-26 | SSF3D: Strict Semi-Supervised 3D Object Detection with Switching Filter | Songbur Wong et.al. | 2403.17390 | translate | read | null |
| 2024-03-26 | Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection | Jiacheng Zhang et.al. | 2403.17387 | translate | read | null |
| 2024-03-26 | AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving | Mingfu Liang et.al. | 2403.17373 | translate | read | null |
| 2024-03-26 | Staircase Localization for Autonomous Exploration in Urban Environments | Jinrae Kim et.al. | 2403.17330 | translate | read | null |
| 2024-03-25 | Co-Occurring of Object Detection and Identification towards unlabeled object discovery | Binay Kumar Singh et.al. | 2403.17223 | translate | read | null |
| 2024-03-25 | Optimizing LiDAR Placements for Robust Driving Perception in Adverse Conditions | Ye Li et.al. | 2403.17009 | translate | read | link |
| 2024-03-25 | Isolated Diffusion: Optimizing Multi-Concept Text-to-Image Generation Training-Freely with Isolated Diffusion Guidance | Jingyuan Zhu et.al. | 2403.16954 | translate | read | null |
| 2024-03-25 | TrustAI at SemEval-2024 Task 8: A Comprehensive Analysis of Multi-domain Machine Generated Text Detection Techniques | Ashok Urlana et.al. | 2403.16592 | translate | read | null |
| 2024-03-25 | RCBEVDet: Radar-camera Fusion in Bird’s Eye View for 3D Object Detection | Zhiwei Lin et.al. | 2403.16440 | translate | read | link |
| 2024-03-25 | ASDF: Assembly State Detection Utilizing Late Fusion by Integrating 6D Pose Estimation | Hannah Schieber et.al. | 2403.16400 | translate | read | link |
| 2024-03-25 | Impact of Video Compression Artifacts on Fisheye Camera Visual Perception Tasks | Madhumitha Sakthi et.al. | 2403.16338 | translate | read | null |
| 2024-03-24 | Cross-domain Multi-modal Few-shot Object Detection via Rich Text | Zeyu Shangguan et.al. | 2403.16188 | translate | read | null |
| 2024-03-24 | Semantic Is Enough: Only Semantic Information For NeRF Reconstruction | Ruibo Wang et.al. | 2403.16043 | translate | read | null |
| 2024-03-23 | Adversarial Defense Teacher for Cross-Domain Object Detection under Poor Visibility Conditions | Kaiwen Wang et.al. | 2403.15786 | translate | read | null |
| 2024-03-23 | EAGLE: A Domain Generalization Framework for AI-generated Text Detection | Amrita Bhattacharjee et.al. | 2403.15690 | translate | read | null |
| 2024-03-25 | Point-DETR3D: Leveraging Imagery Data with Spatial Point Prior for Weakly Semi-supervised 3D Object Detection | Hongzhi Gao et.al. | 2403.15317 | translate | read | null |
| 2024-03-22 | CR3DT: Camera-RADAR Fusion for 3D Detection and Tracking | Nicolas Baumann et.al. | 2403.15313 | translate | read | link |
| 2024-03-22 | IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection | Junbo Yin et.al. | 2403.15241 | translate | read | null |
| 2024-03-22 | MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection | Taeheon Kim et.al. | 2403.15209 | translate | read | null |
| 2024-03-22 | SFOD: Spiking Fusion Object Detector | Yimeng Fan et.al. | 2403.15192 | translate | read | link |
| 2024-03-22 | CRPlace: Camera-Radar Fusion with BEV Representation for Place Recognition | Shaowei Fu et.al. | 2403.15183 | translate | read | null |
| 2024-03-22 | An In-Depth Analysis of Data Reduction Methods for Sustainable Deep Learning | Víctor Toscano-Durán et.al. | 2403.15150 | translate | read | null |
| 2024-03-22 | Gradient-based Sampling for Class Imbalanced Semi-supervised Object Detection | Jiaming Li et.al. | 2403.15127 | translate | read | link |
| 2024-03-22 | VRSO: Visual-Centric Reconstruction for Static Object Annotation | Chenyao Yu et.al. | 2403.15026 | translate | read | null |
| 2024-03-22 | Vehicle Detection Performance in Nordic Region | Hamam Mokayed et.al. | 2403.15017 | translate | read | null |
| 2024-03-21 | T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy | Qing Jiang et.al. | 2403.14610 | translate | read | link |
| 2024-03-21 | UAV-Assisted Maritime Search and Rescue: A Holistic Approach | Martin Messmer et.al. | 2403.14281 | translate | read | null |
| 2024-03-21 | Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection | Tim Salzmann et.al. | 2403.14270 | translate | read | null |
| 2024-03-21 | 3D Object Detection from Point Cloud via Voting Step Diffusion | Haoran Hou et.al. | 2403.14133 | translate | read | null |
| 2024-03-20 | EcoSense: Energy-Efficient Intelligent Sensing for In-Shore Ship Detection through Edge-Cloud Collaboration | Wenjun Huang et.al. | 2403.14027 | translate | read | null |
| 2024-03-20 | RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition | Ziyu Liu et.al. | 2403.13805 | translate | read | link |
| 2024-03-20 | Bounding Box Stability against Feature Dropout Reflects Detector Generalization across Environments | Yang Yang et.al. | 2403.13803 | translate | read | link |
| 2024-03-20 | Fostc3net:A Lightweight YOLOv5 Based On the Network Structure Optimization | Danqing Ma et.al. | 2403.13703 | translate | read | null |
| 2024-03-20 | Find n’ Propagate: Open-Vocabulary 3D Object Detection in Urban Environments | Djamahl Etchegaray et.al. | 2403.13556 | translate | read | link |
| 2024-03-20 | MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Di Wang et.al. | 2403.13430 | translate | read | link |
| 2024-03-20 | Few-shot Oriented Object Detection with Memorable Contrastive Learning in Remote Sensing Images | Jiawei Zhou et.al. | 2403.13375 | translate | read | null |
| 2024-03-20 | Adaptive Ensembles of Fine-Tuned Transformers for LLM-Generated Text Detection | Zhixin Lai et.al. | 2403.13335 | translate | read | null |
| 2024-03-20 | DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception | Yibo Wang et.al. | 2403.13304 | translate | read | null |
| 2024-03-20 | Facilitating Pornographic Text Detection for Open-Domain Dialogue Systems via Knowledge Distillation of Large Language Models | Huachuan Qiu et.al. | 2403.13250 | translate | read | null |
| 2024-03-19 | SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model | Armen Avetisyan et.al. | 2403.13064 | translate | read | null |
| 2024-03-19 | Wildfire danger prediction optimization with transfer learning | Spiros Maggioros et.al. | 2403.12871 | translate | read | link |
| 2024-03-19 | As Firm As Their Foundations: Can open-sourced foundation models be used to create adversarial examples for downstream tasks? | Anjun Hu et.al. | 2403.12693 | translate | read | null |
| 2024-03-19 | EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-based Detection with Recurrent Spiking Neural Networks | Ziming Wang et.al. | 2403.12574 | translate | read | null |
| 2024-03-19 | DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM | Yixuan Wu et.al. | 2403.12488 | translate | read | null |
| 2024-03-19 | TransformMix: Learning Transformation and Mixing Strategies from Data | Tsz-Him Cheung et.al. | 2403.12429 | translate | read | null |
| 2024-03-19 | VisionGPT: LLM-Assisted Real-Time Anomaly Detection for Safe Visual Navigation | Hao Wang et.al. | 2403.12415 | translate | read | null |
| 2024-03-19 | Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition | Jielin Qiu et.al. | 2403.12339 | translate | read | null |
| 2024-03-18 | EffiPerception: an Efficient Framework for Various Perception Tasks | Xinhao Xiang et.al. | 2403.12317 | translate | read | null |
| 2024-03-18 | Prototipo de un Contador Bidireccional Automático de Personas basado en sensores de visión 3D | Benjamín Ojeda-Magaña et.al. | 2403.12310 | translate | read | null |
| 2024-03-18 | Align and Distill: Unifying and Improving Domain Adaptive Object Detection | Justin Kay et.al. | 2403.12029 | translate | read | link |
| 2024-03-18 | TrajectoryNAS: A Neural Architecture Search for Trajectory Prediction | Ali Asghar Sharifi et.al. | 2403.11695 | translate | read | null |
| 2024-03-18 | Just Add $100 More: Augmenting NeRF-based Pseudo-LiDAR Point Cloud for Resolving Class-imbalance Problem | Mincheol Chang et.al. | 2403.11573 | translate | read | null |
| 2024-03-18 | R2SNet: Scalable Domain Adaptation for Object Detection in Cloud-Based Robots Ecosystems via Proposal Refinement | Michele Antonazzi et.al. | 2403.11567 | translate | read | null |
| 2024-03-18 | Continual Forgetting for Pre-trained Vision Models | Hongbo Zhao et.al. | 2403.11530 | translate | read | link |
| 2024-03-17 | V2X-DGW: Domain Generalization for Multi-agent Perception under Adverse Weather Conditions | Baolu Li et.al. | 2403.11371 | translate | read | link |
| 2024-03-17 | Advanced Knowledge Extraction of Physical Design Drawings, Translation and conversion to CAD formats using Deep Learning | Jesher Joshua M et.al. | 2403.11291 | translate | read | null |
| 2024-03-17 | ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models | Siyuan Huang et.al. | 2403.11289 | translate | read | link |
| 2024-03-17 | CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations | Yuwei Zhang et.al. | 2403.11220 | translate | read | link |
| 2024-03-17 | GRA: Detecting Oriented Objects through Group-wise Rotating and Attention | Jiangshan Wang et.al. | 2403.11127 | translate | read | null |
| 2024-03-17 | Self-supervised co-salient object detection via feature correspondence at multiple scales | Souradeep Chakraborty et.al. | 2403.11107 | translate | read | link |
| 2024-03-14 | Open-Vocabulary Object Detection with Meta Prompt Representation and Instance Contrastive Optimization | Zhao Wang et.al. | 2403.09433 | translate | read | null |
| 2024-03-14 | D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection | Dinh Phat Do et.al. | 2403.09359 | translate | read | link |
| 2024-03-14 | Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring | Yufei Zhan et.al. | 2403.09333 | translate | read | link |
| 2024-03-14 | EfficientMFD: Towards More Efficient Multimodal Synchronous Fusion Detection | Jiaqing Zhang et.al. | 2403.09323 | translate | read | link |
| 2024-03-14 | Knowledge Distillation in YOLOX-ViT for Side-Scan Sonar Object Detection | Martin Aubard et.al. | 2403.09313 | translate | read | link |
| 2024-03-14 | MOTPose: Multi-object 6D Pose Estimation for Dynamic Video Sequences using Attention-based Temporal Fusion | Arul Selvam Periyasamy et.al. | 2403.09309 | translate | read | null |
| 2024-03-14 | CLIP-EBC: CLIP Can Count Accurately through Enhanced Blockwise Classification | Yiming Ma et.al. | 2403.09281 | translate | read | link |
| 2024-03-14 | D-YOLO a robust framework for object detection in adverse weather conditions | Zihan Chu et.al. | 2403.09233 | translate | read | null |
| 2024-03-14 | Improving Distant 3D Object Detection Using 2D Box Supervision | Zetong Yang et.al. | 2403.09230 | translate | read | null |
| 2024-03-14 | PoIFusion: Multi-Modal 3D Object Detection via Fusion at Points of Interest | Jiajun Deng et.al. | 2403.09212 | translate | read | null |
| 2024-03-13 | VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis | Enric Corona et.al. | 2403.08764 | translate | read | null |
| 2024-03-13 | MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning | Jialv Zou et.al. | 2403.08760 | translate | read | link |
| 2024-03-13 | Data Augmentation in Human-Centric Vision | Wentao Jiang et.al. | 2403.08650 | translate | read | null |
| 2024-03-13 | PRAGO: Differentiable Multi-View Pose Optimization From Objectness Detections | Matteo Taiana et.al. | 2403.08586 | translate | read | null |
| 2024-03-13 | A Multimodal Fusion Network For Student Emotion Recognition Based on Transformer and Tensor Product | Ao Xiang et.al. | 2403.08511 | translate | read | null |
| 2024-03-13 | Improved YOLOv5 Based on Attention Mechanism and FasterNet for Foreign Object Detection on Railway and Airway tracks | Zongqing Qi et.al. | 2403.08499 | translate | read | null |
| 2024-03-13 | IAMCV Multi-Scenario Vehicle Interaction Dataset | Novel Certad et.al. | 2403.08455 | translate | read | null |
| 2024-03-13 | Advancing Security in AI Systems: A Novel Approach to Detecting Backdoors in Deep Neural Networks | Khondoker Murad Hossain et.al. | 2403.08208 | translate | read | null |
| 2024-03-12 | TaskCLIP: Extend Large Vision-Language Model for Task Oriented Object Detection | Hanning Chen et.al. | 2403.08108 | translate | read | null |
| 2024-03-12 | Aedes aegypti Egg Counting with Neural Networks for Object Detection | Micheli Nayara de Oliveira Vicente et.al. | 2403.08016 | translate | read | null |
| 2024-03-12 | Mondrian: On-Device High-Performance Video Analytics with Compressive Packed Inference | Changmin Jeon et.al. | 2403.07598 | translate | read | null |
| 2024-03-12 | PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution | Honghao Chen et.al. | 2403.07589 | translate | read | null |
| 2024-03-12 | A Survey of Vision Transformers in Autonomous Driving: Current Trends and Future Directions | Quoc-Vinh Lai-Dang et.al. | 2403.07542 | translate | read | null |
| 2024-03-12 | JSTR: Joint Spatio-Temporal Reasoning for Event-based Moving Object Detection | Hanyu Zhou et.al. | 2403.07436 | translate | read | null |
| 2024-03-12 | Eliminating Cross-modal Conflicts in BEV Space for LiDAR-Camera 3D Object Detection | Jiahui Fu et.al. | 2403.07372 | translate | read | null |
| 2024-03-12 | GPT-generated Text Detection: Benchmark Dataset and Tensor-based Detection Method | Zubair Qazi et.al. | 2403.07321 | translate | read | link |
| 2024-03-12 | MENTOR: Multilingual tExt detectioN TOward leaRning by analogy | Hsin-Ju Lin et.al. | 2403.07286 | translate | read | null |
| 2024-03-12 | SparseLIF: High-Performance Sparse LiDAR-Camera Fusion for 3D Object Detection | Hongcheng Zhang et.al. | 2403.07284 | translate | read | null |
| 2024-03-12 | Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction | Alexander Timans et.al. | 2403.07263 | translate | read | null |
| 2024-03-11 | Class Imbalance in Object Detection: An Experimental Diagnosis and Study of Mitigation Strategies | Nieves Crasto et.al. | 2403.07113 | translate | read | link |
| 2024-03-11 | Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head | Tiancheng Zhao et.al. | 2403.06892 | translate | read | link |
| 2024-03-11 | LeOCLR: Leveraging Original Images for Contrastive Learning of Visual Representations | Mohammad Alkhalefi et.al. | 2403.06813 | translate | read | null |
| 2024-03-11 | Genetic Learning for Designing Sim-to-Real Data Augmentations | Bram Vanherle et.al. | 2403.06786 | translate | read | null |
| 2024-03-11 | Evaluating the Energy Efficiency of Few-Shot Learning for Object Detection in Industrial Settings | Georgios Tsoumplekas et.al. | 2403.06631 | translate | read | null |
| 2024-03-11 | Cross-domain and Cross-dimension Learning for Image-to-Graph Transformers | Alexander H. Berger et.al. | 2403.06601 | translate | read | null |
| 2024-03-11 | SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection | Yuxuan Li et.al. | 2403.06534 | translate | read | link |
| 2024-03-11 | 3D Semantic Segmentation-Driven Representations for 3D Object Detection | Hayeon O et.al. | 2403.06501 | translate | read | null |
| 2024-03-11 | Fine-Grained Pillar Feature Encoding Via Spatio-Temporal Virtual Grid for 3D Object Detection | Konyul Park et.al. | 2403.06433 | translate | read | null |
| 2024-03-10 | Transformer based Multitask Learning for Image Captioning and Object Detection | Debolena Basak et.al. | 2403.06292 | translate | read | null |
| 2024-03-10 | Poly Kernel Inception Network for Remote Sensing Detection | Xinhao Cai et.al. | 2403.06258 | translate | read | link |
| 2024-03-08 | EVD4UAV: An Altitude-Sensitive Benchmark to Evade Vehicle Detection in UAV | Huiming Sun et.al. | 2403.05422 | translate | read | null |
| 2024-03-08 | SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection | Yahao Lu et.al. | 2403.05416 | translate | read | link |
| 2024-03-08 | Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery | Xavier Bou et.al. | 2403.05381 | translate | read | null |
| 2024-03-08 | Frequency-Adaptive Dilated Convolution for Semantic Segmentation | Linwei Chen et.al. | 2403.05369 | translate | read | link |
| 2024-03-08 | VLM-PL: Advanced Pseudo Labeling approach Class Incremental Object Detection with Vision-Language Model | Junsu Kim et.al. | 2403.05346 | translate | read | null |
| 2024-03-08 | Improving the Successful Robotic Grasp Detection Using Convolutional Neural Networks | Hamed Hosseini et.al. | 2403.05211 | translate | read | null |
| 2024-03-08 | LanePtrNet: Revisiting Lane Detection as Point Voting and Grouping on Curves | Jiayan Cao et.al. | 2403.05155 | translate | read | null |
| 2024-03-08 | RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features | Geonho Bang et.al. | 2403.05061 | translate | read | null |
| 2024-03-08 | ActFormer: Scalable Collaborative Perception via Active Queries | Suozhi Huang et.al. | 2403.04968 | translate | read | null |
| 2024-03-07 | FriendNet: Detection-Friendly Dehazing Network | Yihua Fan et.al. | 2403.04443 | translate | read | null |
| 2024-03-07 | Effectiveness Assessment of Recent Large Vision-Language Models | Yao Jiang et.al. | 2403.04306 | translate | read | null |
| 2024-03-07 | ACC-ViT : Atrous Convolution’s Comeback in Vision Transformers | Nabil Ibtehaz et.al. | 2403.04200 | translate | read | null |
| 2024-03-07 | CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoors Object Detection from Multi-view Images | Guanlin Shen et.al. | 2403.04198 | translate | read | null |
| 2024-03-07 | Scalable and Robust Transformer Decoders for Interpretable Image Classification with Foundation Models | Evelyn Mannix et.al. | 2403.04125 | translate | read | null |
| 2024-03-07 | CMDA: Cross-Modal and Domain Adversarial Adaptation for LiDAR-Based 3D Object Detection | Gyusam Chang et.al. | 2403.03721 | translate | read | null |
| 2024-03-06 | Adversarial Infrared Geometry: Using Geometry to Perform Adversarial Attack against Infrared Pedestrian Detectors | Kalibinuer Tiliwalidi et.al. | 2403.03674 | translate | read | null |
| 2024-03-06 | Towards Detecting AI-Generated Text within Human-AI Collaborative Hybrid Texts | Zijie Zeng et.al. | 2403.03506 | translate | read | link |
| 2024-03-06 | Multi-task Learning for Real-time Autonomous Driving Leveraging Task-adaptive Attention Generator | Wonhyeok Choi et.al. | 2403.03468 | translate | read | null |
| 2024-03-06 | FLAME Diffuser: Grounded Wildfire Image Synthesis using Mask Guided Diffusion | Hao Wang et.al. | 2403.03463 | translate | read | null |
| 2024-03-06 | Performance Evaluation of Semi-supervised Learning Frameworks for Multi-Class Weed Detection | Jiajia Li et.al. | 2403.03390 | translate | read | link |
| 2024-03-05 | Detecting Concrete Visual Tokens for Multimodal Machine Translation | Braeden Bowen et.al. | 2403.03075 | translate | read | null |
| 2024-03-05 | Loss Design for Single-carrier Joint Communication and Neural Network-based Sensing | Charlotte Muth et.al. | 2403.02929 | translate | read | null |
| 2024-03-05 | Are Dense Labels Always Necessary for 3D Object Detection from Point Cloud? | Chenqiang Gao et.al. | 2403.02818 | translate | read | null |
| 2024-03-05 | Bootstrapping Rare Object Detection in High-Resolution Satellite Imagery | Akram Zaytar et.al. | 2403.02736 | translate | read | null |
| 2024-03-05 | FastOcc: Accelerating 3D Occupancy Prediction by Fusing the 2D Bird’s-Eye View and Perspective View | Jiawei Hou et.al. | 2403.02710 | translate | read | null |
| 2024-03-05 | False Positive Sampling-based Data Augmentation for Enhanced 3D Object Detection Accuracy | Jiyong Oh et.al. | 2403.02639 | translate | read | null |
| 2024-03-05 | BSDP: Brain-inspired Streaming Dual-level Perturbations for Online Open World Object Detection | Yu Chen et.al. | 2403.02637 | translate | read | null |
| 2024-03-04 | NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function | Abdullah Nazhat Abdullah et.al. | 2403.02411 | translate | read | link |
| 2024-03-04 | COMMIT: Certifying Robustness of Multi-Sensor Fusion Systems against Semantic Attacks | Zijian Huang et.al. | 2403.02329 | translate | read | null |
| 2024-03-04 | Scalable Vision-Based 3D Object Detection and Monocular Depth Estimation for Autonomous Driving | Yuxuan Liu et.al. | 2403.02037 | translate | read | link |
| 2024-03-02 | TUMTraf V2X Cooperative Perception Dataset | Walter Zimmer et.al. | 2403.01316 | translate | read | null |
| 2024-03-02 | Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection | Taeheon Kim et.al. | 2403.01300 | translate | read | null |
| 2024-03-02 | Run-time Introspection of 2D Object Detection in Automated Driving Systems Using Learning Representations | Hakan Yekta Yatbaz et.al. | 2403.01172 | translate | read | null |
| 2024-03-02 | ELA: Efficient Local Attention for Deep Convolutional Neural Networks | Wei Xu et.al. | 2403.01123 | translate | read | null |
| 2024-03-02 | Face Swap via Diffusion Model | Feifei Wang et.al. | 2403.01108 | translate | read | link |
| 2024-03-02 | Beyond Night Visibility: Adaptive Multi-Scale Fusion of Infrared and Visible Images | Shufan Pei et.al. | 2403.01083 | translate | read | null |
| 2024-03-01 | Learning Causal Features for Incremental Object Detection | Zhenwei He et.al. | 2403.00591 | translate | read | null |
| 2024-03-01 | Abductive Ego-View Accident Video Understanding for Safe Driving Perception | Jianwu Fang et.al. | 2403.00436 | translate | read | null |
| 2024-03-04 | DAMS-DETR: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion | Junjie Guo et.al. | 2403.00326 | translate | read | null |
| 2024-03-01 | ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting | Chen Duan et.al. | 2403.00303 | translate | read | link |