Scene Understanding - 2024-03
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-03-31 | Adapting to Length Shift: FlexiLength Network for Trajectory Prediction | Yi Xu et.al. | 2404.00742 | translate | read | null |
| 2024-03-31 | Neural Radiance Field-based Visual Rendering: A Comprehensive Review | Mingyuan Yao et.al. | 2404.00714 | translate | read | null |
| 2024-03-29 | VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection | Zihua Liu et.al. | 2404.00149 | translate | read | null |
| 2024-03-29 | HGS-Mapping: Online Dense Mapping Using Hybrid Gaussian Representation in Urban Scenes | Ke Wu et.al. | 2403.20159 | translate | read | null |
| 2024-03-27 | Object Pose Estimation via the Aggregation of Diffusion Features | Tianfu Wang et.al. | 2403.18791 | translate | read | link |
| 2024-03-25 | Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding | Lingdong Kong et.al. | 2403.17010 | translate | read | link |
| 2024-03-25 | Towards Trustworthy Automated Driving through Qualitative Scene Understanding and Explanations | Nassim Belmecheri et.al. | 2403.16908 | translate | read | null |
| 2024-03-25 | DOCTR: Disentangled Object-Centric Transformer for Point Scene Understanding | Xiaoxuan Yu et.al. | 2403.16431 | translate | read | link |
| 2024-03-24 | AutoInst: Automatic Instance-Based Segmentation of LiDAR 3D Scans | Cedric Perauer et.al. | 2403.16318 | translate | read | null |
| 2024-03-24 | Improving Scene Graph Generation with Relation Words’ Debiasing in Vision-Language Models | Yuxuan Wang et.al. | 2403.16184 | translate | read | null |
| 2024-03-24 | Multi-Task Learning with Multi-Task Optimization | Lu Bai et.al. | 2403.16162 | translate | read | null |
| 2024-03-24 | Semantic Is Enough: Only Semantic Information For NeRF Reconstruction | Ruibo Wang et.al. | 2403.16043 | translate | read | null |
| 2024-03-22 | Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting | Jun Guo et.al. | 2403.15624 | translate | read | null |
| 2024-03-22 | DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data | Hanrong Ye et.al. | 2403.15389 | translate | read | null |
| 2024-03-21 | DSGG: Dense Relation Transformer for an End-to-end Scene Graph Generation | Zeeshan Hayder et.al. | 2403.14886 | translate | read | null |
| 2024-03-21 | Evaluating Panoramic 3D Estimation in Indoor Lighting Analysis | Zining Cheng et.al. | 2403.14836 | translate | read | null |
| 2024-03-21 | SurroundSDF: Implicit 3D Scene Understanding Based on Signed Distance Field | Lizhe Liu et.al. | 2403.14366 | translate | read | null |
| 2024-03-21 | Exosense: A Vision-Centric Scene Understanding System For Safe Exoskeleton Navigation | Jianeng Wang et.al. | 2403.14320 | translate | read | null |
| 2024-03-21 | Volumetric Environment Representation for Vision-Language Navigation | Rui Liu et.al. | 2403.14158 | translate | read | null |
| 2024-03-21 | 3D Object Detection from Point Cloud via Voting Step Diffusion | Haoran Hou et.al. | 2403.14133 | translate | read | null |
| 2024-03-20 | Efficient scene text image super-resolution with semantic guidance | LeoWu TomyEnrique et.al. | 2403.13330 | translate | read | link |
| 2024-03-19 | SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model | Armen Avetisyan et.al. | 2403.13064 | translate | read | null |
| 2024-03-19 | HUGS: Holistic Urban 3D Scene Understanding via Gaussian Splatting | Hongyu Zhou et.al. | 2403.12722 | translate | read | null |
| 2024-03-19 | M2DA: Multi-Modal Fusion Transformer Incorporating Driver Attention for Autonomous Driving | Dongyang Xu et.al. | 2403.12552 | translate | read | null |
| 2024-03-19 | Multi-Object RANSAC: Efficient Plane Clustering Method in a Clutter | Seunghyeon Lim et.al. | 2403.12449 | translate | read | null |
| 2024-03-19 | Geometric Constraints in Deep Learning Frameworks: A Survey | Vibhas K Vats et.al. | 2403.12431 | translate | read | null |
| 2024-03-18 | R3DS: Reality-linked 3D Scenes for Panoramic Scene Understanding | Qirui Wu et.al. | 2403.12301 | translate | read | null |
| 2024-03-18 | HiKER-SGG: Hierarchical Knowledge Enhanced Robust Scene Graph Generation | Ce Zhang et.al. | 2403.12033 | translate | read | link |
| 2024-03-18 | Agent3D-Zero: An Agent for Zero-shot 3D Understanding | Sha Zhang et.al. | 2403.11835 | translate | read | null |
| 2024-03-18 | OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation | Haochen Jiang et.al. | 2403.11796 | translate | read | null |
| 2024-03-19 | Urban Scene Diffusion through Semantic Occupancy Map | Junge Zhang et.al. | 2403.11697 | translate | read | null |
| 2024-03-18 | Hierarchical Spatial Proximity Reasoning for Vision-and-Language Navigation | Ming Xu et.al. | 2403.11541 | translate | read | link |
| 2024-03-18 | Beyond Uncertainty: Risk-Aware Active View Acquisition for Safe Robot Navigation and 3D Scene Understanding with FisherRF | Guangyi Liu et.al. | 2403.11396 | translate | read | null |
| 2024-03-17 | Omni-Recon: Towards General-Purpose Neural Radiance Fields for Versatile 3D Applications | Yonggan Fu et.al. | 2403.11131 | translate | read | link |
| 2024-03-16 | N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields | Yash Bhalgat et.al. | 2403.10997 | translate | read | null |
| 2024-03-16 | Segment Any Object Model (SAOM): Real-to-Simulation Fine-Tuning Strategy for Multi-Class Multi-Instance Segmentation | Mariia Khan et.al. | 2403.10780 | translate | read | null |
| 2024-03-15 | Robust Shape Fitting for 3D Scene Abstraction | Florian Kluger et.al. | 2403.10452 | translate | read | link |
| 2024-03-15 | Do Visual-Language Maps Capture Latent Semantics? | Matti Pekkanen et.al. | 2403.10117 | translate | read | null |
| 2024-03-15 | Enhancing Human-Centered Dynamic Scene Understanding via Multiple LLMs Collaborated Reasoning | Hang Zhang et.al. | 2403.10107 | translate | read | null |
| 2024-03-14 | GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding | Chengyao Wang et.al. | 2403.09639 | translate | read | link |
| 2024-03-12 | IndicSTR12: A Dataset for Indic Scene Text Recognition | Harsh Lunia et.al. | 2403.08007 | translate | read | null |
| 2024-03-12 | Efficient Global Navigational Planning in 3D Structures based on Point Cloud Tomography | Bowen Yang et.al. | 2403.07631 | translate | read | link |
| 2024-03-12 | Open-Vocabulary Scene Text Recognition via Pseudo-Image Labeling and Margin Loss | Xuhua Ren et.al. | 2403.07518 | translate | read | null |
| 2024-03-12 | MoAI: Mixture of All Intelligence for Large Language and Vision Models | Byung-Kwan Lee et.al. | 2403.07508 | translate | read | link |
| 2024-03-11 | Mapping High-level Semantic Regions in Indoor Environments without Object Recognition | Roberto Bigazzi et.al. | 2403.07076 | translate | read | null |
| 2024-03-11 | Optimizing Latent Graph Representations of Surgical Scenes for Zero-Shot Domain Transfer | Siddhant Satyanaik et.al. | 2403.06953 | translate | read | null |
| 2024-03-08 | Stealing Stable Diffusion Prior for Robust Monocular Depth Estimation | Yifan Mao et.al. | 2403.05056 | translate | read | link |
| 2024-03-07 | Towards Scene Graph Anticipation | Rohith Peddi et.al. | 2403.04899 | translate | read | null |
| 2024-03-07 | Embodied Understanding of Driving Scenarios | Yunsong Zhou et.al. | 2403.04593 | translate | read | link |
| 2024-03-07 | Out of the Room: Generalizing Event-Based Dynamic Motion Segmentation for Complex Scenes | Stamatios Georgoulis et.al. | 2403.04562 | translate | read | null |
| 2024-03-06 | GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding | Zi-Ting Chou et.al. | 2403.03608 | translate | read | null |
| 2024-03-05 | OORD: The Oxford Offroad Radar Dataset | Matthew Gadd et.al. | 2403.02845 | translate | read | link |
| 2024-03-05 | HUNTER: Unsupervised Human-centric 3D Detection via Transferring Knowledge from Synthetic Instances to Real Scenes | Yichen Yao et.al. | 2403.02769 | translate | read | null |