Scene Understanding - 2024-10 | Paper Arxiv Daily

Publish Date	Title	Authors	PDF	Translate	Read	Code
2024-10-30	UniRiT: Towards Few-Shot Non-Rigid Point Cloud Registration	Geng Li et al.	2410.22909	translate	read	null
2024-10-30	Situational Scene Graph for Structured Human-centric Situation Understanding	Chinthani Sugandhika et al.	2410.22829	translate	read	null
2024-10-30	Symbolic Graph Inference for Compound Scene Understanding	FNU Aryan et al.	2410.22626	translate	read	null
2024-10-29	Senna: Bridging Large Vision-Language Models and End-to-End Autonomous Driving	Bo Jiang et al.	2410.22313	translate	read	link
2024-10-26	Towards Robust Algorithms for Surgical Phase Recognition via Digital Twin-based Scene Representation	Hao Ding et al.	2410.20026	translate	read	null
2024-10-23	Surgical Scene Segmentation by Transformer With Asymmetric Feature Enhancement	Cheng Yuan et al.	2410.17642	translate	read	link
2024-10-22	PerspectiveNet: Multi-View Perception for Dynamic Scene Understanding	Vinh Nguyen et al.	2410.16824	translate	read	null
2024-10-20	Scene Graph Generation with Role-Playing Large Language Models	Guikun Chen et al.	2410.15364	translate	read	null
2024-10-20	Large Language Models for Autonomous Driving (LLM4AD): Concept, Benchmark, Simulation, and Real-Vehicle Experiment	Can Cui et al.	2410.15281	translate	read	null
2024-10-19	Semantically Safe Robot Manipulation: From Semantic Scene Understanding to Motion Safeguards	Lukas Brunke et al.	2410.15185	translate	read	null
2024-10-19	Part-Whole Relational Fusion Towards Multi-Modal Scene Understanding	Yi Liu et al.	2410.14944	translate	read	link
2024-10-17	ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding	Guangda Ji et al.	2410.13924	translate	read	link
2024-10-17	VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding	Runsen Xu et al.	2410.13860	translate	read	link
2024-10-16	3D Gaussian Splatting in Robotics: A Survey	Siting Zhu et al.	2410.12262	translate	read	null
2024-10-17	SAM-Guided Masked Token Prediction for 3D Scene Understanding	Zhimin Chen et al.	2410.12158	translate	read	null
2024-10-16	Leveraging Large Vision Language Model For Better Automatic Web GUI Testing	Siyi Wang et al.	2410.12157	translate	read	null
2024-10-15	MCTBench: Multimodal Cognition towards Text-Rich Visual Scenes Benchmark	Bin Shan et al.	2410.11538	translate	read	link
2024-10-14	3DArticCyclists: Generating Simulated Dynamic 3D Cyclists for Human-Object Interaction (HOI) and Autonomous Driving Applications	Eduardo R. Corral-Soto et al.	2410.10782	translate	read	null
2024-10-17	Stratified Domain Adaptation: A Progressive Self-Training Approach for Scene Text Recognition	Kha Nhat Le et al.	2410.09913	translate	read	null
2024-10-13	LoLI-Street: Benchmarking Low-Light Image Enhancement and Beyond	Md Tanvir Islam et al.	2410.09831	translate	read	link
2024-10-12	Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors	Hritam Basak et al.	2410.09467	translate	read	null
2024-10-11	Dual-AEB: Synergizing Rule-Based and Multimodal Large Language Models for Effective Emergency Braking	Wei Zhang et al.	2410.08616	translate	read	null
2024-10-10	A transition towards virtual representations of visual scenes	Américo Pereira et al.	2410.07987	translate	read	null
2024-10-10	RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation	Songming Liu et al.	2410.07864	translate	read	null
2024-10-11	Test-Time Intensity Consistency Adaptation for Shadow Detection	Leyi Zhu et al.	2410.07695	translate	read	null
2024-10-10	3D Vision-Language Gaussian Splatting	Qucheng Peng et al.	2410.07577	translate	read	null
2024-10-09	Evaluating the Impact of Point Cloud Colorization on Semantic Segmentation Accuracy	Qinfeng Zhu et al.	2410.06725	translate	read	null
2024-10-09	Open-RGBT: Open-vocabulary RGB-T Zero-shot Semantic Segmentation in Open-world Environments	Meng Yu et al.	2410.06626	translate	read	null
2024-10-08	BoxMap: Efficient Structural Mapping and Navigation	Zili Wang et al.	2410.06263	translate	read	null
2024-10-08	OrionNav: Online Planning for Robot Autonomy with Context-Aware LLM and Open-Vocabulary Semantic Scene Graphs	Venkata Naren Devarakonda et al.	2410.06239	translate	read	null
2024-10-07	Resource-Efficient Multiview Perception: Integrating Semantic Masking with Masked Autoencoders	Kosta Dakic et al.	2410.04817	translate	read	null
2024-10-07	Diffusion Models in 3D Vision: A Survey	Zhen Wang et al.	2410.04738	translate	read	null
2024-10-06	In-Place Panoptic Radiance Field Segmentation with Perceptual Prior for 3D Scene Understanding	Shenghao Li et al.	2410.04529	translate	read	null
2024-10-05	ETHcavation: A Dataset and Pipeline for Panoptic Scene Understanding and Object Tracking in Dynamic Construction Environments	Lorenzo Terenzi et al.	2410.04250	translate	read	null
2024-10-05	Fast Object Detection with a Machine Learning Edge Device	Richard C. Rodriguez et al.	2410.04173	translate	read	null
2024-10-04	SPARTUN3D: Situated Spatial Understanding of 3D World in Large Language Models	Yue Zhang et al.	2410.03878	translate	read	null
2024-10-03	RESSCAL3D++: Joint Acquisition and Semantic Segmentation of 3D Point Clouds	Remco Royen et al.	2410.02323	translate	read	link
2024-10-01	A Critical Assessment of Visual Sound Source Localization Models Including Negative Audio	Xavier Juanola et al.	2410.01020	translate	read	link
2024-10-02	BehAV: Behavioral Rule Guided Autonomy Using VLMs for Robot Navigation in Outdoor Scenes	Kasun Weerakoon et al.	2409.16484	translate	read	null
