Scene Understanding - 2024-02

| Publish Date | Title | Authors | PDF | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-02-29 | FusionVision: A comprehensive approach of 3D object reconstruction and segmentation from RGB-D cameras using YOLO and fast segment anything | Safouane El Ghazouali et al. | 2403.00175 | translate | read | link |
| 2024-02-29 | One model to use them all: Training a segmentation model with complementary datasets | Alexander C. Jenke et al. | 2402.19340 | translate | read | link |
| 2024-02-29 | Feature boosting with efficient attention for scene parsing | Vivek Singh et al. | 2402.19250 | translate | read | null |
| 2024-02-29 | PCDepth: Pattern-based Complementary Learning for Monocular Depth Estimation by Best of Both Worlds | Haotian Liu et al. | 2402.18925 | translate | read | null |
| 2024-02-28 | Windowed-FourierMixer: Enhancing Clutter-Free Room Modeling with Fourier Transform | Bruno Henriques et al. | 2402.18287 | translate | read | null |
| 2024-02-27 | LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free Environment | Yiming Ren et al. | 2402.17171 | translate | read | null |
| 2024-02-27 | Efficiently Leveraging Linguistic Priors for Scene Text Spotting | Nguyen Nguyen et al. | 2402.17134 | translate | read | null |
| 2024-02-26 | DreamUp3D: Object-Centric Generative Models for Single-View 3D Scene Understanding and Real-to-Sim Transfer | Yizhe Wu et al. | 2402.16308 | translate | read | null |
| 2024-02-24 | Sequential Visual and Semantic Consistency for Semi-supervised Text Recognition | Mingkun Yang et al. | 2402.15806 | translate | read | null |
| 2024-02-23 | OpenSUN3D: 1st Workshop Challenge on Open-Vocabulary 3D Scene Understanding | Francis Engelmann et al. | 2402.15321 | translate | read | null |
| 2024-02-22 | S^2Former-OR: Single-Stage Bimodal Transformer for Scene Graph Generation in OR | Jialun Pei et al. | 2402.14461 | translate | read | null |
| 2024-02-22 | Swin3D++: Effective Multi-Source Pretraining for 3D Indoor Scene Understanding | Yu-Qi Yang et al. | 2402.14215 | translate | read | link |
| 2024-02-21 | Class-Aware Mask-Guided Feature Refinement for Scene Text Recognition | Mingkun Yang et al. | 2402.13643 | translate | read | link |
| 2024-02-25 | DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models | Xiaoyu Tian et al. | 2402.12289 | translate | read | null |

(<a href=../Scene_Understanding.md>back to Scene Understanding</a>)