Depth Estimation - 2024-12
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-12-31 | Tech Report: Divide and Conquer 3D Real-Time Reconstruction for Improved IGS | Yicheng Zhu et.al. | 2501.01465 | translate | read | null |
| 2024-12-30 | FPGA-based Acceleration of Neural Network for Image Classification using Vitis AI | Zhengdong Li et.al. | 2412.20974 | translate | read | null |
| 2024-12-29 | MetricDepth: Enhancing Monocular Depth Estimation with Deep Metric Learning | Chunpu Liu et.al. | 2412.20390 | translate | read | null |
| 2024-12-28 | Multi-Modality Driven LoRA for Adverse Condition Depth Estimation | Guanglei Yang et.al. | 2412.20162 | translate | read | null |
| 2024-12-28 | DepthMamba with Adaptive Fusion | Zelin Meng et.al. | 2412.19964 | translate | read | null |
| 2024-12-26 | An End-to-End Depth-Based Pipeline for Selfie Image Rectification | Ahmed Alhawwary et.al. | 2412.19189 | translate | read | null |
| 2024-12-26 | Revisiting Monocular 3D Object Detection from Scene-Level Depth Retargeting to Instance-Level Spatial Refinement | Qiude Zhang et.al. | 2412.19165 | translate | read | null |
| 2024-12-26 | MVS-GS: High-Quality 3D Gaussian Splatting Mapping via Online Multi-View Stereo | Byeonggwon Lee et.al. | 2412.19130 | translate | read | null |
| 2024-12-26 | Learning Monocular Depth from Events via Egomotion Compensation | Haitao Meng et.al. | 2412.19067 | translate | read | null |
| 2024-12-24 | RSGaussian:3D Gaussian Splatting with LiDAR for Aerial Remote Sensing Novel View Synthesis | Yiling Yao et.al. | 2412.18380 | translate | read | null |
| 2024-12-27 | LiRCDepth: Lightweight Radar-Camera Depth Estimation via Knowledge Distillation and Uncertainty Guidance | Huawei Sun et.al. | 2412.16380 | translate | read | link |
| 2024-12-19 | Flowing from Words to Pixels: A Framework for Cross-Modality Evolution | Qihao Liu et.al. | 2412.15213 | translate | read | null |
| 2024-12-19 | Scaling 4D Representations | João Carreira et.al. | 2412.15212 | translate | read | null |
| 2024-12-18 | Foundation Models Meet Low-Cost Sensors: Test-Time Adaptation for Rescaling Disparity for Zero-Shot Metric Depth Estimation | Rémi Marsal et.al. | 2412.14103 | translate | read | null |
| 2024-12-18 | Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation | Haotong Lin et.al. | 2412.14015 | translate | read | null |
| 2024-12-18 | Marigold-DC: Zero-Shot Monocular Depth Completion with Guided Diffusion | Massimiliano Viola et.al. | 2412.13389 | translate | read | null |
| 2024-12-18 | Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera | Zhengdi Yu et.al. | 2412.12861 | translate | read | null |
| 2024-12-17 | PromptDet: A Lightweight 3D Object Detection Framework with LiDAR Prompts | Kun Guo et.al. | 2412.12460 | translate | read | null |
| 2024-12-16 | V-MIND: Building Versatile Monocular Indoor 3D Detector with Diverse 2D Annotations | Jin-Cheng Jhang et.al. | 2412.11412 | translate | read | null |
| 2024-12-16 | Depth-Centric Dehazing and Depth-Estimation from Real-World Hazy Driving Video | Junkai Fan et.al. | 2412.11395 | translate | read | null |
| 2024-12-15 | ViPOcc: Leveraging Visual Priors from Vision Foundation Models for Single-View 3D Occupancy Prediction | Yi Feng et.al. | 2412.11210 | translate | read | link |
| 2024-12-14 | MAL: Cluster-Masked and Multi-Task Pretraining for Enhanced xLSTM Vision Performance | Wenjun Huang et.al. | 2412.10730 | translate | read | null |
| 2024-12-12 | Stereo4D: Learning How Things Move in 3D from Internet Stereo Videos | Linyi Jin et.al. | 2412.09621 | translate | read | null |
| 2024-12-12 | T-SVG: Text-Driven Stereoscopic Video Generation | Qiao Jin et.al. | 2412.09323 | translate | read | null |
| 2024-12-12 | Cross-View Completion Models are Zero-shot Correspondence Estimators | Honggyu An et.al. | 2412.09072 | translate | read | null |
| 2024-12-11 | BLADE: Single-view Body Mesh Learning through Accurate Depth Estimation | Shengze Wang et.al. | 2412.08640 | translate | read | null |
| 2024-12-13 | Utilizing Multi-step Loss for Single Image Reflection Removal | Abdelrahman Elnenaey et.al. | 2412.08582 | translate | read | link |
| 2024-12-11 | Dense Depth from Event Focal Stack | Kenta Horikawa et.al. | 2412.08120 | translate | read | null |
| 2024-12-10 | Diffusion-Based Attention Warping for Consistent 3D Scene Editing | Eyal Gomel et.al. | 2412.07984 | translate | read | null |
| 2024-12-10 | Balancing Shared and Task-Specific Representations: A Hybrid Approach to Depth-Aware Video Panoptic Segmentation | Kurt H. W. Stolle et.al. | 2412.07966 | translate | read | null |
| 2024-12-09 | SphereUFormer: A U-Shaped Transformer for Spherical 360 Perception | Yaniv Benny et.al. | 2412.06968 | translate | read | null |
| 2024-12-09 | Driv3R: Learning Dense 4D Reconstruction for Autonomous Driving | Xin Fei et.al. | 2412.06777 | translate | read | link |
| 2024-12-09 | MAtCha Gaussians: Atlas of Charts for High-Quality Geometry and Photorealism From Sparse Views | Antoine Guédon et.al. | 2412.06767 | translate | read | null |
| 2024-12-09 | On-Device Self-Supervised Learning of Low-Latency Monocular Depth from Only Events | Jesse Hagenaars et.al. | 2412.06359 | translate | read | null |
| 2024-12-09 | Omni-Scene: Omni-Gaussian Representation for Ego-Centric Sparse-View Scene Reconstruction | Dongxu Wei et.al. | 2412.06273 | translate | read | null |
| 2024-12-09 | Event fields: Capturing light fields at high speed, resolution, and dynamic range | Ziyuan Qu et.al. | 2412.06191 | translate | read | null |
| 2024-12-08 | GVDepth: Zero-Shot Monocular Depth Estimation for Ground Vehicles based on Probabilistic Cue Fusion | Karlo Koledic et.al. | 2412.06080 | translate | read | null |
| 2024-12-08 | Prism: Semi-Supervised Multi-View Stereo with Monocular Structure Priors | Alex Rich et.al. | 2412.05771 | translate | read | null |
| 2024-12-10 | TACO: Learning Multi-modal Action Models with Synthetic Chains-of-Thought-and-Action | Zixian Ma et.al. | 2412.05479 | translate | read | null |
| 2024-12-06 | SimC3D: A Simple Contrastive 3D Pretraining Framework Using RGB Images | Jiahua Dong et.al. | 2412.05274 | translate | read | null |
| 2024-12-06 | Penetrative rotating magnetoconvection subject to lateral variations in temperature gradients | Tirtharaj Barman et.al. | 2412.05235 | translate | read | null |
| 2024-12-06 | PanoDreamer: 3D Panorama Synthesis from a Single Image | Avinash Paliwal et.al. | 2412.04827 | translate | read | link |
| 2024-12-05 | LAA-Net: A Physical-prior-knowledge Based Network for Robust Nighttime Depth Estimation | Kebin Peng et.al. | 2412.04666 | translate | read | null |
| 2024-12-05 | MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos | Zhengqi Li et.al. | 2412.04463 | translate | read | null |
| 2024-12-05 | MT3DNet: Multi-Task learning Network for 3D Surgical Scene Reconstruction | Mithun Parab et.al. | 2412.03928 | translate | read | null |
| 2024-12-04 | Perception Tokens Enhance Visual Reasoning in Multimodal Language Models | Mahtab Bigverdi et.al. | 2412.03548 | translate | read | null |
| 2024-12-04 | Dense Scene Reconstruction from Light-Field Images Affected by Rolling Shutter | Hermes McGriff et.al. | 2412.03518 | translate | read | null |
| 2024-12-04 | MultiGO: Towards Multi-level Geometry Learning for Monocular 3D Textured Human Reconstruction | Gangjian Zhang et.al. | 2412.03103 | translate | read | null |
| 2024-12-05 | Align3R: Aligned Monocular Depth Estimation for Dynamic Videos | Jiahao Lu et.al. | 2412.03079 | translate | read | null |
| 2024-12-03 | Single-Shot Metric Depth from Focused Plenoptic Cameras | Blanca Lasheras-Hernandez et.al. | 2412.02386 | translate | read | null |
| 2024-12-03 | Dual Exposure Stereo for Extended Dynamic Range 3D Imaging | Juhyung Choi et.al. | 2412.02351 | translate | read | null |
| 2024-12-03 | Amodal Depth Anything: Amodal Depth Estimation in the Wild | Zhenyu Li et.al. | 2412.02336 | translate | read | null |
| 2024-12-03 | GSGTrack: Gaussian Splatting-Guided Object Pose Tracking from RGB Videos | Zhiyuan Chen et.al. | 2412.02267 | translate | read | null |
| 2024-12-03 | FoveaSPAD: Exploiting Depth Priors for Adaptive and Efficient Single-Photon 3D Imaging | Justin Folden et.al. | 2412.02052 | translate | read | null |
| 2024-12-02 | Mutli-View 3D Reconstruction using Knowledge Distillation | Aditya Dutt et.al. | 2412.02039 | translate | read | link |
| 2024-12-02 | AVS-Net: Audio-Visual Scale Net for Self-supervised Monocular Metric Depth Estimation | Xiaohu Liu et.al. | 2412.01637 | translate | read | null |
| 2024-12-02 | STATIC : Surface Temporal Affine for TIme Consistency in Video Monocular Depth Estimation | Sunghun Yang et.al. | 2412.01090 | translate | read | null |
| 2024-12-01 | FiffDepth: Feed-forward Transformation of Diffusion-Based Generators for Detailed Depth Estimation | Yunpeng Bai et.al. | 2412.00671 | translate | read | null |