Scene Understanding - 2024-02 | Paper Arxiv Daily

Publish Date	Title	Authors	arXiv ID	Code
2024-02-29	FusionVision: A comprehensive approach of 3D object reconstruction and segmentation from RGB-D cameras using YOLO and fast segment anything	Safouane El Ghazouali et al.	2403.00175	link
2024-02-29	One model to use them all: Training a segmentation model with complementary datasets	Alexander C. Jenke et al.	2402.19340	link
2024-02-29	Feature boosting with efficient attention for scene parsing	Vivek Singh et al.	2402.19250	null
2024-02-29	PCDepth: Pattern-based Complementary Learning for Monocular Depth Estimation by Best of Both Worlds	Haotian Liu et al.	2402.18925	null
2024-02-28	Windowed-FourierMixer: Enhancing Clutter-Free Room Modeling with Fourier Transform	Bruno Henriques et al.	2402.18287	null
2024-02-27	LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free Environment	Yiming Ren et al.	2402.17171	null
2024-02-27	Efficiently Leveraging Linguistic Priors for Scene Text Spotting	Nguyen Nguyen et al.	2402.17134	null
2024-02-26	DreamUp3D: Object-Centric Generative Models for Single-View 3D Scene Understanding and Real-to-Sim Transfer	Yizhe Wu et al.	2402.16308	null
2024-02-24	Sequential Visual and Semantic Consistency for Semi-supervised Text Recognition	Mingkun Yang et al.	2402.15806	null
2024-02-23	OpenSUN3D: 1st Workshop Challenge on Open-Vocabulary 3D Scene Understanding	Francis Engelmann et al.	2402.15321	null
2024-02-22	S^2Former-OR: Single-Stage Bimodal Transformer for Scene Graph Generation in OR	Jialun Pei et al.	2402.14461	null
2024-02-22	Swin3D++: Effective Multi-Source Pretraining for 3D Indoor Scene Understanding	Yu-Qi Yang et al.	2402.14215	link
2024-02-21	Class-Aware Mask-Guided Feature Refinement for Scene Text Recognition	Mingkun Yang et al.	2402.13643	link
2024-02-25	DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models	Xiaoyu Tian et al.	2402.12289	null
