Depth Estimation
| Publish Date | Title | Authors | arXiv ID | Code |
|---|---|---|---|---|
| 2025-12-18 | Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation | Xin Lin et.al. | 2512.16913 | null |
| 2025-12-18 | N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models | Yuxin Wang et.al. | 2512.16561 | null |
| 2025-12-17 | In Pursuit of Pixel Supervision for Visual Pre-training | Lihe Yang et.al. | 2512.15715 | null |
| 2025-12-16 | DASP: Self-supervised Nighttime Monocular Depth Estimation with Domain Adaptation of Spatiotemporal Priors | Yiheng Huang et.al. | 2512.14536 | null |
| 2025-12-16 | Elastic3D: Controllable Stereo Video Conversion with Guided Latent Decoding | Nando Metzger et.al. | 2512.14236 | null |
| 2025-12-16 | Robust Single-shot Structured Light 3D Imaging via Neural Feature Decoding | Jiaheng Li et.al. | 2512.14028 | null |
| 2025-12-16 | Deep Learning Perspective of Scene Understanding in Autonomous Robots | Afia Maham et.al. | 2512.14020 | null |
| 2025-12-15 | StarryGazer: Leveraging Monocular Depth Estimation Models for Domain-Agnostic Single Depth Image Completion | Sangmin Hong et.al. | 2512.13147 | null |
| 2025-12-13 | BokehDepth: Enhancing Monocular Depth Estimation through Bokeh Generation | Hangwei Zhang et.al. | 2512.12425 | null |
| 2025-12-12 | ProbeMDE: Uncertainty-Guided Active Proprioception for Monocular Depth Estimation in Surgical Robotics | Britton Jordan et.al. | 2512.11773 | null |
| 2025-12-11 | Empowering Dynamic Urban Navigation with Stereo and Mid-Level Vision | Wentao Zhou et.al. | 2512.10956 | null |
| 2025-12-11 | Video Depth Propagation | Luigi Piccinelli et.al. | 2512.10725 | null |
| 2025-12-11 | SpaceDrive: Infusing Spatial Awareness into VLM-based Autonomous Driving | Peizheng Li et.al. | 2512.10719 | null |
| 2025-12-11 | Robust Shape from Focus via Multiscale Directional Dilated Laplacian and Recurrent Network | Khurram Ashfaq et.al. | 2512.10498 | null |
| 2025-12-09 | Scale-invariant and View-relational Representation Learning for Full Surround Monocular Depth | Kyumin Hwang et.al. | 2512.08700 | null |
| 2025-12-09 | Development & first Performance evaluation of multi-element monolithic HPGe detector for X-ray spectroscopy | N. Goyal et.al. | 2512.08389 | null |
| 2025-12-09 | Accuracy Does Not Guarantee Human-Likeness in Monocular Depth Estimators | Yuki Kubota et.al. | 2512.08163 | null |
| 2025-12-08 | More than Segmentation: Benchmarking SAM 3 for Segmentation, 3D Perception, and Reconstruction in Robotic Surgery | Wenzhen Dong et.al. | 2512.07596 | null |
| 2025-12-07 | CoT4Det: A Chain-of-Thought Framework for Perception-Oriented Vision-Language Tasks | Yu Qi et.al. | 2512.06663 | null |
| 2025-12-06 | HuPrior3R: Incorporating Human Priors for Better 3D Dynamic Reconstruction from Monocular Videos | Weitao Xiong et.al. | 2512.06368 | null |
| 2025-12-05 | See in Depth: Training-Free Surgical Scene Segmentation with Monocular Depth Priors | Kunyi Yang et.al. | 2512.05529 | null |
| 2025-12-05 | YOLO and SGBM Integration for Autonomous Tree Branch Detection and Depth Estimation in Radiata Pine Pruning Applications | Yida Lin et.al. | 2512.05412 | null |
| 2025-12-03 | Gamma-from-Mono: Road-Relative, Metric, Self-Supervised Monocular Geometry for Vehicular Applications | Gasser Elazab et.al. | 2512.04303 | null |
| 2025-12-03 | Unique Lives, Shared World: Learning from Single-Life Videos | Tengda Han et.al. | 2512.04085 | null |
| 2025-12-03 | SpaceTools: Tool-Augmented Spatial Reasoning via Double Interactive RL | Siyi Chen et.al. | 2512.04069 | null |
| 2025-12-03 | MDE-AgriVLN: Agricultural Vision-and-Language Navigation with Monocular Depth Estimation | Xiaobei Zhao et.al. | 2512.03958 | null |
| 2025-12-03 | Generalization Evaluation of Deep Stereo Matching Methods for UAV-Based Forestry Applications | Yida Lin et.al. | 2512.03427 | null |
| 2025-12-02 | DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling | Kairun Wen et.al. | 2512.03000 | null |
| 2025-12-02 | BEVDilation: LiDAR-Centric Multi-Modal Fusion for 3D Object Detection | Guowen Zhang et.al. | 2512.02972 | null |
| 2025-12-01 | DepthScape: Authoring 2.5D Designs via Depth Estimation, Semantic Understanding, and Geometry Extraction | Xia Su et.al. | 2512.02263 | null |
| 2025-12-01 | BlinkBud: Detecting Hazards from Behind via Sampled Monocular 3D Detection on a Single Earbud | Yunzhe Li et.al. | 2512.01366 | null |
| 2025-11-30 | Lotus-2: Advancing Geometric Dense Prediction with Powerful Image Generative Model | Jing He et.al. | 2512.01030 | null |
| 2025-11-30 | EAG3R: Event-Augmented 3D Geometry Estimation for Dynamic and Extreme-Lighting Scenes | Xiaoshan Wu et.al. | 2512.00771 | null |
| 2025-11-26 | Multi-modal On-Device Learning for Monocular Depth Estimation on Ultra-low-power MCUs | Davide Nadalini et.al. | 2512.00086 | null |
| 2025-11-28 | Geometry-Consistent 4D Gaussian Splatting for Sparse-Input Dynamic View Synthesis | Yiwei Li et.al. | 2511.23044 | null |
| 2025-11-27 | Advances in electromagnetic techniques for subsurface infrastructure detection: A comprehensive review of methods, challenges, and innovations | Arasti Afrasiabi et.al. | 2511.22673 | null |
| 2025-11-27 | IE-SRGS: An Internal-External Knowledge Fusion Framework for High-Fidelity 3D Gaussian Splatting Super-Resolution | Xiang Feng et.al. | 2511.22233 | null |
| 2025-11-25 | MODEST: Multi-Optics Depth-of-Field Stereo Dataset | Nisarg K. Trivedi et.al. | 2511.20853 | null |
| 2025-11-25 | 3D-Aware Multi-Task Learning with Cross-View Correlations for Dense Scene Understanding | Xiaoye Wang et.al. | 2511.20646 | null |
| 2025-11-25 | DeLightMono: Enhancing Self-Supervised Monocular Depth Estimation in Endoscopy by Decoupling Uneven Illumination | Mingyang Ou et.al. | 2511.20058 | null |
| 2025-11-24 | Real-Time Object Tracking with On-Device Deep Learning for Adaptive Beamforming in Dynamic Acoustic Environments | Jorge Ortigoso-Narro et.al. | 2511.19396 | null |
| 2025-11-24 | DensifyBeforehand: LiDAR-assisted Content-aware Densification for Efficient and Quality 3D Gaussian Splatting | Phurtivilai Patt et.al. | 2511.19294 | null |
| 2025-11-24 | Understanding Task Transfer in Vision-Language Models | Bhuvan Sachdeva et.al. | 2511.18787 | null |
| 2025-11-22 | AdaPerceiver: Transformers with Adaptive Width, Depth, and Tokens | Purvish Jajal et.al. | 2511.18105 | null |
| 2025-11-21 | Vision-Guided Optic Flow Navigation for Small Lunar Missions | Sean Cowan et.al. | 2511.17720 | null |
| 2025-11-21 | DepthFocus: Controllable Depth Estimation for See-Through Scenes | Junhong Min et.al. | 2511.16993 | null |
| 2025-11-20 | Dexterity from Smart Lenses: Multi-Fingered Robot Manipulation with In-the-Wild Human Demonstrations | Irmak Guzey et.al. | 2511.16661 | null |
| 2025-11-20 | Lite Any Stereo: Efficient Zero-Shot Stereo Matching | Junpeng Jing et.al. | 2511.16555 | null |
| 2025-11-20 | CylinderDepth: Cylindrical Spatial Attention for Multi-View Consistent Self-Supervised Surround Depth Estimation | Samer Abualhanud et.al. | 2511.16428 | null |
| 2025-11-20 | Upsample Anything: A Simple and Hard to Beat Baseline for Feature Upsampling | Minseok Seo et.al. | 2511.16301 | null |
| 2025-11-19 | Learning Depth from Past Selves: Self-Evolution Contrast for Robust Depth Estimation | Jing Cao et.al. | 2511.15167 | null |
| 2025-11-18 | EGSA-PT: Edge-Guided Spatial Attention with Progressive Training for Monocular Depth Estimation and Segmentation of Transparent Objects | Gbenga Omotara et.al. | 2511.14970 | null |
| 2025-11-18 | Cheating Stereo Matching in Full-scale: Physical Adversarial Attack against Binocular Depth Estimation in Autonomous Driving | Kangqiao Zhao et.al. | 2511.14386 | null |
| 2025-11-18 | Enhancing Generalization of Depth Estimation Foundation Model via Weakly-Supervised Adaptation with Regularization | Yan Huang et.al. | 2511.14238 | null |
| 2025-11-18 | RTS-Mono: A Real-Time Self-Supervised Monocular Depth Estimation Method for Real-World Deployment | Zeyu Cheng et.al. | 2511.14107 | null |
| 2025-11-17 | Towards Metric-Aware Multi-Person Mesh Recovery by Jointly Optimizing Human Crowd in Camera Space | Kaiwen Wang et.al. | 2511.13282 | null |
| 2025-11-17 | Difficulty-Aware Label-Guided Denoising for Monocular 3D Object Detection | Soyul Lee et.al. | 2511.13195 | null |
| 2025-11-13 | Depth Anything 3: Recovering the Visual Space from Any Views | Haotong Lin et.al. | 2511.10647 | null |
| 2025-11-13 | OmniVGGT: Omni-Modality Driven Visual Geometry Grounded Transformer | Haosong Peng et.al. | 2511.10560 | null |
| 2025-11-13 | Depth-Consistent 3D Gaussian Splatting via Physical Defocus Modeling and Multi-View Geometric Supervision | Yu Deng et.al. | 2511.10316 | null |
| 2025-11-13 | RobIA: Robust Instance-aware Continual Test-time Adaptation for Deep Stereo | Jueun Ko et.al. | 2511.10107 | null |
| 2025-11-12 | PALMS+: Modular Image-Based Floor Plan Localization Leveraging Depth Foundation Model | Yunqian Cheng et.al. | 2511.09724 | null |
| 2025-11-12 | PIFF: A Physics-Informed Generative Flow Model for Real-Time Flood Depth Mapping | ChunLiang Wu et.al. | 2511.09130 | null |
| 2025-11-11 | WEDepth: Efficient Adaptation of World Knowledge for Monocular Depth Estimation | Gongshu Wang et.al. | 2511.08036 | null |
| 2025-11-11 | Visual Bridge: Universal Visual Perception Representations Generating | Yilin Gao et.al. | 2511.07877 | null |
| 2025-11-10 | FlowFeat: Pixel-Dense Embedding of Motion Profiles | Nikita Araslanov et.al. | 2511.07696 | null |
| 2025-11-09 | How Wide and How Deep? Mitigating Over-Squashing of GNNs via Channel Capacity Constrained Estimation | Zinuo You et.al. | 2511.06443 | null |
| 2025-11-09 | Temporal-Guided Visual Foundation Models for Event-Based Vision | Ruihao Xia et.al. | 2511.06238 | null |
| 2025-11-08 | Light-Field Dataset for Disparity Based Depth Estimation | Suresh Nehra et.al. | 2511.05866 | null |
| 2025-11-06 | FiCABU: A Fisher-Based, Context-Adaptive Machine Unlearning Processor for Edge AI | Eun-Su Cho et.al. | 2511.05605 | null |
| 2025-11-07 | No Pose Estimation? No Problem: Pose-Agnostic and Instance-Aware Test-Time Adaptation for Monocular Depth Estimation | Mingyu Sung et.al. | 2511.05055 | null |
| 2025-11-06 | Machine Learning-Driven Analysis of kSZ Maps to Predict CMB Optical Depth $τ$ | Farshid Farhadi Khouzani et.al. | 2511.04770 | null |
| 2025-11-06 | Asymptotics of constrained $M$-estimation under convexity | Victor-Emmanuel Brunel et.al. | 2511.04612 | null |
| 2025-11-06 | Annual net community production and carbon exports in the central Sargasso Sea from autonomous underwater glider observations | Ruth G. Curry et.al. | 2511.04544 | null |
| 2025-11-06 | BoRe-Depth: Self-supervised Monocular Depth Estimation with Boundary Refinement for Embedded Systems | Chang Liu et.al. | 2511.04388 | null |
| 2025-11-06 | Simple 3D Pose Features Support Human and Machine Social Scene Understanding | Wenshuo Qin et.al. | 2511.03988 | null |
| 2025-11-05 | Thermodynamic Probes of Multipartite Entanglement in Strongly Interacting Quantum Systems | Harsh Sharma et.al. | 2511.03266 | null |
| 2025-11-05 | Quantum Sensing of Copper-Phthalocyanine Electron Spins via NV Relaxometry | Boning Li et.al. | 2511.03200 | null |
| 2025-11-05 | Exploring the spectral characteristics of the periodic burster 4U 1323-62: Type-I X-ray burst and persistent emission | Mahasweta Bhattacharya et.al. | 2511.03172 | null |
| 2025-11-04 | EvtSlowTV – A Large and Diverse Dataset for Event-Based Depth Estimation | Sadiq Layi Macaulay et.al. | 2511.02953 | null |
| 2025-11-04 | Classical shadows for sample-efficient measurements of gauge-invariant observables | Jacob Bringewatt et.al. | 2511.02904 | null |
| 2025-11-04 | Hydrogen site-dependent physical properties of hydrous magnesium silicates: implications for water storage and transport in the mantle transition zone | Zifan Wang et.al. | 2511.02416 | null |
| 2025-11-04 | Monocular absolute depth estimation from endoscopy via domain-invariant feature learning and latent consistency | Hao Li et.al. | 2511.02247 | null |
| 2025-11-04 | Bayesian spatio-temporal weighted regression for integrating missing and misaligned environmental data | Yovna Junglee et.al. | 2511.02149 | null |
| 2025-11-03 | Opto-Electronic Convolutional Neural Network Design Via Direct Kernel Optimization | Ali Almuallem et.al. | 2511.02065 | null |
| 2025-11-03 | Dynamic Reconstruction of Ultrasound-Derived Flow Fields With Physics-Informed Neural Fields | Viraj Patel et.al. | 2511.01804 | null |
| 2025-11-03 | HGFreNet: Hop-hybrid GraphFomer for 3D Human Pose Estimation with Trajectory Consistency in Frequency Domain | Kai Zhai et.al. | 2511.01756 | null |
| 2025-11-03 | Discriminately Treating Motion Components Evolves Joint Depth and Ego-Motion Learning | Mengtan Zhang et.al. | 2511.01502 | null |
| 2025-11-03 | Floor Plan-Guided Visual Navigation Incorporating Depth and Directional Cues | Wei Huang et.al. | 2511.01493 | null |
| 2025-11-03 | Fast End-to-End Framework for Cosmological Parameter Inference from CMB Data Using Machine Learning | Larissa Santos et.al. | 2511.01291 | null |
| 2025-11-03 | Contextual Relevance and Adaptive Sampling for LLM-Based Document Reranking | Jerry Huang et.al. | 2511.01208 | null |
| 2025-10-31 | VLM6D: VLM based 6Dof Pose Estimation based on RGB-D Images | Md Selim Sarowar et.al. | 2511.00120 | null |
| 2025-10-31 | MoRE: 3D Visual Geometry Reconstruction Meets Mixture-of-Experts | Jingnan Gao et.al. | 2510.27234 | null |
| 2025-10-30 | FlowQ-Net: A Generative Framework for Automated Quantum Circuit Design | Jun Dai et.al. | 2510.26688 | null |
| 2025-10-30 | Interstellar Comet 3I/ATLAS: Evidence for Galactic Cosmic Ray Processing | R. Maggiolo et.al. | 2510.26308 | null |
| 2025-10-29 | Quantum simulation of actinide chemistry: towards scalable algorithms on trapped ion quantum computers | Kesha Sorathia et.al. | 2510.25675 | null |
| 2025-10-29 | Continuous subsurface property retrieval from sparse radar observations using physics informed neural networks | Ishfaq Aziz et.al. | 2510.25648 | null |
| 2025-10-29 | SPADE: Sparsity Adaptive Depth Estimator for Zero-Shot, Real-Time, Monocular Depth Estimation in Underwater Environments | Hongjie Zhang et.al. | 2510.25463 | null |
| 2025-10-29 | Seeing Clearly and Deeply: An RGBD Imaging Approach with a Bio-inspired Monocentric Design | Zongxi Yu et.al. | 2510.25314 | null |
| 2025-10-28 | GeVI-SLAM: Gravity-Enhanced Stereo Visual Inertial SLAM for Underwater Robots | Yuan Shen et.al. | 2510.24533 | null |
| 2025-10-27 | The case for an Astrometric Mission Extension of Euclid. Extending Gaia by 6 magnitudes with Euclid covering one-third of the sky | Luigi “Rolly” BEDIN et.al. | 2510.23694 | null |
| 2025-10-27 | More Than Generation: Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models | Hongkai Lin et.al. | 2510.23574 | null |
| 2025-10-27 | Group-Level and Personalized Optimization for the Insula and Hippocampus Focal Electric Field in Transcranial Temporal Interferential Stimulation: A Computational Study | Taiga Inoue et.al. | 2510.23290 | null |
| 2025-10-27 | Precise Time Delay Measurement and Compensation for Tightly Coupled Underwater SINS/piUSBL Navigation | Jin Huang et.al. | 2510.23286 | null |
| 2025-10-27 | Development of the Reconstruction Procedure of the Fluorescence detector Array of Single-pixel Telescopes for measuring Ultra-High Energy Cosmic Rays | Fraser Bradfield et.al. | 2510.23219 | null |
| 2025-10-27 | Resource analysis of Shor’s elliptic curve algorithm with an improved quantum adder on a two-dimensional lattice | Quan Gu et.al. | 2510.23212 | null |
| 2025-10-27 | Seq-DeepIPC: Sequential Sensing for End-to-End Control in Legged Robot Navigation | Oskar Natan et.al. | 2510.23057 | null |
| 2025-10-26 | LVD-GS: Gaussian Splatting SLAM for Dynamic Scenes via Hierarchical Explicit-Implicit Representation Collaboration Rendering | Wenkai Zhu et.al. | 2510.22669 | null |
| 2025-10-26 | qc-kmeans: A Quantum Compressive K-Means Algorithm for NISQ Devices | Pedro Chumpitaz-Flores et.al. | 2510.22540 | null |
| 2025-10-25 | EndoSfM3D: Learning to 3D Reconstruct Any Endoscopic Surgery Scene using Self-supervised Foundation Model | Changhao Zhang et.al. | 2510.22359 | null |
| 2025-10-25 | I2-NeRF: Learning Neural Radiance Fields Under Physically-Grounded Media Interactions | Shuhong Liu et.al. | 2510.22161 | null |
| 2025-10-25 | CogStereo: Neural Stereo Matching with Implicit Spatial Cognition Embedding | Lihuang Fang et.al. | 2510.22119 | null |
| 2025-10-25 | Impact of Charge Transfer Inefficiency on transit light-curves: A correction strategy for PLATO | Shaunak Mishra et.al. | 2510.22092 | null |
| 2025-10-24 | An Hα Transit of HD 189733b to Assess Stellar Activity Across the Transit Chord Close to JWST Observations | Kingsley E. Ehrich et.al. | 2510.21703 | null |
| 2025-10-24 | MedAlign: A Synergistic Framework of Multimodal Preference Optimization and Federated Meta-Cognitive Reasoning | Siyong Chen et.al. | 2510.21093 | null |
| 2025-10-23 | PPMStereo: Pick-and-Play Memory Construction for Consistent Dynamic Stereo Matching | Yun Wang et.al. | 2510.20178 | null |
| 2025-10-22 | Projecting Hurricane Risk in Atlantic Canada under Climate Change | Saeed Saviz Naeini et.al. | 2510.20074 | null |
| 2025-10-22 | Toward A Better Understanding of Monocular Depth Evaluation | Siyang Wu et.al. | 2510.19814 | null |
| 2025-10-22 | FAUST. XXVIII. High-Resolution ALMA Observations of Class 0/I Disks: Structure, Optical Depths, and Temperatures | M. J. Maureira et.al. | 2510.19635 | null |
| 2025-10-22 | Insights into the Unknown: Federated Data Diversity Analysis on Molecular Data | Markus Bujotzek et.al. | 2510.19535 | null |
| 2025-10-22 | PRGCN: A Graph Memory Network for Cross-Sequence Pattern Reuse in 3D Human Pose Estimation | Zhuoyang Xie et.al. | 2510.19475 | null |
| 2025-10-22 | Seabed-Net: A multi-task network for joint bathymetry estimation and seabed classification from remote sensing imagery in shallow waters | Panagiotis Agrafiotis et.al. | 2510.19329 | null |
| 2025-10-22 | SFGFusion: Surface Fitting Guided 3D Object Detection with 4D Radar and Camera Fusion | Xiaozhi Li et.al. | 2510.19215 | null |
| 2025-10-21 | Kinematic Analysis and Integration of Vision Algorithms for a Mobile Manipulator Employed Inside a Self-Driving Laboratory | Shifa Sulaiman et.al. | 2510.19081 | null |
| 2025-10-21 | Adaptive hyperviscosity stabilisation for the RBF-FD method in solving advection-dominated transport equations | Miha Rot et.al. | 2510.18772 | null |
| 2025-10-21 | PLANA3R: Zero-shot Metric Planar 3D Reconstruction via Feed-Forward Planar Splatting | Changkun Liu et.al. | 2510.18714 | null |
| 2025-10-21 | GeoDiff: Geometry-Guided Diffusion for Metric Depth Estimation | Tuan Pham et.al. | 2510.18291 | null |
| 2025-10-20 | Believe It or Not: How Deeply do LLMs Believe Implanted Facts? | Stewart Slocum et.al. | 2510.17941 | null |
| 2025-10-20 | PAGE-4D: Disentangled Pose and Geometry Estimation for VGGT-4D Perception | Kaichen Zhou et.al. | 2510.17568 | null |
| 2025-10-20 | M2H: Multi-Task Learning with Efficient Window-Based Cross-Task Attention for Monocular Spatial Perception | U. V. B. L Udugama et.al. | 2510.17363 | null |
| 2025-10-20 | Capturing Head Avatar with Hand Contacts from a Monocular Video | Haonan He et.al. | 2510.17181 | null |
| 2025-10-19 | How Universal Are SAM2 Features? | Masoud Khairi Atani et.al. | 2510.17051 | null |
| 2025-10-19 | A Low-Complexity View Synthesis Distortion Estimation Method for 3D Video with Large Baseline Considerations | Chongyuan Bi et.al. | 2510.17037 | null |
| 2025-10-19 | Prediction-Augmented Trees for Reliable Statistical Inference | Vikram Kher et.al. | 2510.16937 | null |
| 2025-10-18 | Self-Supervised Learning to Fly using Efficient Semantic Segmentation and Metric Depth Estimation for Low-Cost Autonomous UAVs | Sebastian Mocanu et.al. | 2510.16624 | null |
| 2025-10-18 | OOS-DSD: Improving Out-of-stock Detection in Retail Images using Auxiliary Tasks | Franko Šikić et.al. | 2510.16508 | null |
| 2025-10-15 | Decision-focused Sensing and Forecasting for Adaptive and Rapid Flood Response: An Implicit Learning Approach | Qian Sun et.al. | 2510.16015 | null |
| 2025-10-17 | FIDDLE: Reinforcement Learning for Quantum Fidelity Enhancement | Hoang M. Ngo et.al. | 2510.15833 | null |
| 2025-10-17 | Adaptive time Compressed QITE (ACQ) and its geometrical interpretation | Alberto Acevedo Meléndez et.al. | 2510.15781 | null |
| 2025-10-16 | SaLon3R: Structure-aware Long-term Generalizable 3D Reconstruction from Unposed Images | Jiaxin Guo et.al. | 2510.15072 | null |
| 2025-10-16 | C4D: 4D Made from 3D through Dual Correspondences | Shizun Wang et.al. | 2510.14960 | null |
| 2025-10-16 | Multi-modal video data-pipelines for machine learning with minimal human supervision | Mihai-Cristian Pîrvu et.al. | 2510.14862 | null |
| 2025-10-15 | XD-RCDepth: Lightweight Radar-Camera Depth Estimation with Explainability-Aligned and Distribution-Aware Distillation | Huawei Sun et.al. | 2510.13565 | null |
| 2025-10-15 | FlyAwareV2: A Multimodal Cross-Domain UAV Dataset for Urban Scene Understanding | Francesco Barbato et.al. | 2510.13243 | null |
| 2025-10-14 | E-MoFlow: Learning Egomotion and Optical Flow from Event Data via Implicit Regularization | Wenpu Li et.al. | 2510.12753 | null |
| 2025-10-14 | Spatial Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model | Fuhao Li et.al. | 2510.12276 | null |
| 2025-10-13 | Evaluating the effects of preprocessing, method selection, and hyperparameter tuning on SAR-based flood mapping and water depth estimation | Jean-Paul Travert et.al. | 2510.11305 | null |
| 2025-10-11 | Gesplat: Robust Pose-Free 3D Reconstruction via Geometry-Guided Gaussian Splatting | Jiahui Lu et.al. | 2510.10097 | null |
| 2025-10-10 | Fast Self-Supervised depth and mask aware Association for Multi-Object Tracking | Milad Khanchi et.al. | 2510.09878 | null |
| 2025-10-10 | Hybrid-grained Feature Aggregation with Coarse-to-fine Language Guidance for Self-supervised Monocular Depth Estimation | Wenyao Zhang et.al. | 2510.09320 | null |
| 2025-10-10 | Online Video Depth Anything: Temporally-Consistent Depth Prediction with Low Memory Consumption | Johann-Friedrich Feiden et.al. | 2510.09182 | null |
| 2025-10-08 | Into the Rabbit Hull: From Task-Relevant Concepts in DINO to Minkowski Geometry | Thomas Fel et.al. | 2510.08638 | null |
| 2025-10-09 | RayFusion: Ray Fusion Enhanced Collaborative Visual Perception | Shaohong Wang et.al. | 2510.08017 | null |
| 2025-10-09 | CVD-STORM: Cross-View Video Diffusion with Spatial-Temporal Reconstruction Model for Autonomous Driving | Tianrui Zhang et.al. | 2510.07944 | null |
| 2025-10-09 | An End-to-End Room Geometry Constrained Depth Estimation Framework for Indoor Panorama Images | Kanglin Ning et.al. | 2510.07817 | null |
| 2025-10-08 | Pixel-Perfect Depth with Semantics-Prompted Diffusion Transformers | Gangwei Xu et.al. | 2510.07316 | null |
| 2025-10-08 | MV-Performer: Taming Video Diffusion Model for Faithful and Synchronized Multi-view Performer Synthesis | Yihao Zhi et.al. | 2510.07190 | null |
| 2025-10-07 | Human3R: Everyone Everywhere All at Once | Yue Chen et.al. | 2510.06219 | null |
| 2025-10-07 | EgoNight: Towards Egocentric Vision Understanding at Night with a Challenging Benchmark | Deheng Zhang et.al. | 2510.06218 | null |
| 2025-10-07 | Dropping the D: RGB-D SLAM Without the Depth Sensor | Mert Kiray et.al. | 2510.06216 | null |
| 2025-10-07 | DeLTa: Demonstration and Language-Guided Novel Transparent Object Manipulation | Taeyeop Lee et.al. | 2510.05662 | null |
| 2025-10-07 | Human Action Recognition from Point Clouds over Time | James Dickens et.al. | 2510.05506 | null |
| 2025-10-06 | HybridFlow: Quantification of Aleatoric and Epistemic Uncertainty with a Single Hybrid Model | Peter Van Katwyk et.al. | 2510.05054 | null |
| 2025-10-06 | Benchmark on Monocular Metric Depth Estimation in Wildlife Setting | Niccolò Niccoli et.al. | 2510.04723 | null |
| 2025-10-04 | Evaluating High-Resolution Piano Sustain Pedal Depth Estimation with Musically Informed Metrics | Hanwen Zhang et.al. | 2510.03750 | null |
| 2025-10-03 | Whisker-based Tactile Flight for Tiny Drones | Chaoxiang Ye et.al. | 2510.03119 | null |
| 2025-10-02 | Non-Rigid Structure-from-Motion via Differential Geometry with Recoverable Conformal Scale | Yongbo Chen et.al. | 2510.01665 | null |
| 2025-10-01 | Temporal Score Rescaling for Temperature Sampling in Diffusion and Flow Models | Yanbo Xu et.al. | 2510.01184 | null |
| 2025-09-30 | DA$^{2}$: Depth Anything in Any Direction | Haodong Li et.al. | 2509.26618 | link |
| 2025-09-30 | DEPTHOR++: Robust Depth Enhancement from a Real-World Lightweight dToF and RGB Guidance | Jijun Xiang et.al. | 2509.26498 | null |
| 2025-09-30 | EasyOcc: 3D Pseudo-Label Supervision for Fully Self-Supervised Semantic Occupancy Prediction Models | Seamie Hayes et.al. | 2509.26087 | null |
| 2025-09-30 | PFDepth: Heterogeneous Pinhole-Fisheye Joint Depth Estimation via Distortion-aware Gaussian-Splatted Volumetric Fusion | Zhiwei Zhang et.al. | 2509.26008 | null |
| 2025-09-29 | DepthLM: Metric Depth From Vision Language Models | Zhipeng Cai et.al. | 2509.25413 | link |
| 2025-09-29 | Fast Feature Field ($\text{F}^3$): A Predictive Representation of Events | Richeek Das et.al. | 2509.25146 | null |
| 2025-09-29 | BRIDGE – Building Reinforcement-Learning Depth-to-Image Data Generation Engine for Monocular Depth Estimation | Dingning Liu et.al. | 2509.25077 | link |
| 2025-09-29 | HBSplat: Robust Sparse-View Gaussian Reconstruction with Hybrid-Loss Guided Depth and Bidirectional Warping | Yu Ma et.al. | 2509.24893 | null |
| 2025-09-28 | RPG360: Robust 360 Depth Estimation with Perspective Foundation Models and Graph Optimization | Dongki Jung et.al. | 2509.23991 | null |
| 2025-09-28 | FastViDAR: Real-Time Omnidirectional Depth Estimation via Alternative Hierarchical Attention | Hangtian Zhao et.al. | 2509.23733 | null |
| 2025-09-28 | Efficient Domain-Adaptive Multi-Task Dense Prediction with Vision Foundation Models | Beomseok Kang et.al. | 2509.23626 | null |
| 2025-09-26 | CCNeXt: An Effective Self-Supervised Stereo Depth Estimation Approach | Alexandre Lopes et.al. | 2509.22627 | link |
| 2025-09-26 | EfficientDepth: A Fast and Detail-Preserving Monocular Depth Estimation Model | Andrii Litvynchuk et.al. | 2509.22527 | null |
| 2025-09-26 | DualFocus: Depth from Focus with Spatio-Focal Dual Variational Constraints | Sungmin Woo et.al. | 2509.21992 | null |
| 2025-09-25 | Finding 3D Positions of Distant Objects from Noisy Camera Movement and Semantic Segmentation Sequences | Julius Pesonen et.al. | 2509.20906 | null |
| 2025-09-24 | Shared Neural Space: Unified Precomputed Feature Encoding for Multi-Task and Cross Domain Vision | Jing Li et.al. | 2509.20481 | null |
| 2025-09-24 | BiTAA: A Bi-Task Adversarial Attack for Object Detection and Depth Estimation via 3D Gaussian Splatting | Yixun Zhang et.al. | 2509.19793 | null |
| 2025-09-24 | VIMD: Monocular Visual-Inertial Motion and Depth Estimation | Saimouli Katragadda et.al. | 2509.19713 | null |
| 2025-09-24 | Enhancing Transformer-Based Vision Models: Addressing Feature Map Anomalies Through Novel Optimization Strategies | Sumit Mamtani et.al. | 2509.19687 | null |
| 2025-09-23 | An on-chip Pixel Processing Approach with 2.4μs latency for Asynchronous Read-out of SPAD-based dToF Flash LiDARs | Yiyang Liu et.al. | 2509.19192 | null |
| 2025-09-23 | RS3DBench: A Comprehensive Benchmark for 3D Spatial Perception in Remote Sensing | Jiayu Wang et.al. | 2509.18897 | null |
| 2025-09-23 | Zero-shot Monocular Metric Depth for Endoscopic Images | Nicolas Toussaint et.al. | 2509.18642 | null |
| 2025-09-18 | URNet: Uncertainty-aware Refinement Network for Event-based Stereo Depth Estimation | Yifeng Cheng et.al. | 2509.18184 | null |
| 2025-09-22 | RadarSFD: Single-Frame Diffusion with Pretrained Priors for Radar Point Clouds | Bin Zhao et.al. | 2509.18068 | null |
| 2025-09-22 | Predicting Depth Maps from Single RGB Images and Addressing Missing Information in Depth Estimation | Mohamad Mofeed Chaar et.al. | 2509.17686 | null |
| 2025-09-22 | Evict3R: Training-Free Token Eviction for Memory-Bounded Streaming Visual Geometry Transformers | Soroush Mahdi et.al. | 2509.17650 | null |
| 2025-09-22 | GPS Denied IBVS-Based Navigation and Collision Avoidance of UAV Using a Low-Cost RGB Camera | Xiaoyu Wang et.al. | 2509.17435 | null |
| 2025-09-21 | ConfidentSplat: Confidence-Weighted Depth Fusion for Accurate 3D Gaussian Splatting SLAM | Amanuel T. Dufera et.al. | 2509.16863 | null |
| 2025-09-19 | 3D Gaussian Flats: Hybrid 2D/3D Photometric Scene Reconstruction | Maria Taktasheva et.al. | 2509.16423 | null |
| 2025-09-19 | StereoAdapter: Adapting Stereo Depth Estimation to Underwater Scenes | Zhengri Wu et.al. | 2509.16415 | link |
| 2025-09-19 | Towards Sharper Object Boundaries in Self-Supervised Depth Estimation | Aurélien Cecille et.al. | 2509.15987 | null |
| 2025-09-19 | Shedding Light on Depth: Explainability Assessment in Monocular Depth Estimation | Lorenzo Cirillo et.al. | 2509.15980 | null |
| 2025-09-19 | MS-GS: Multi-Appearance Sparse-View 3D Gaussian Splatting in the Wild | Deming Li et.al. | 2509.15548 | null |
| 2025-09-18 | Depth AnyEvent: A Cross-Modal Distillation Paradigm for Event-Based Monocular Depth Estimation | Luca Bartolomei et.al. | 2509.15224 | null |
| 2025-09-18 | Lightweight and Accurate Multi-View Stereo with Confidence-Aware Diffusion Model | Fangjinhua Wang et.al. | 2509.15220 | null |
| 2025-09-18 | UCorr: Wire Detection and Depth Estimation for Autonomous Drones | Benedikt Kolbeinsson et.al. | 2509.14989 | null |
| 2025-09-18 | MapAnything: Mapping Urban Assets using Single Street-View Images | Miriam Louise Carnot et.al. | 2509.14839 | null |
| 2025-09-16 | Gen2Real: Towards Demo-Free Dexterous Manipulation by Harnessing Generated Video | Kai Ye et.al. | 2509.14178 | null |
| 2025-09-17 | UM-Depth: Uncertainty Masked Self-Supervised Monocular Depth Estimation with Visual Odometry | Tae-Wook Um et.al. | 2509.13713 | null |
| 2025-09-17 | Gaussian Alignment for Relative Camera Pose Estimation via Single-View Reconstruction | Yumin Li et.al. | 2509.13652 | null |
| 2025-09-16 | ColonCrafter: A Depth Estimation Model for Colonoscopy Videos Using Diffusion Priors | Romain Hardy et.al. | 2509.13525 | null |
| 2025-09-16 | MINGLE: VLMs for Semantically Complex Region Detection in Urban Scenes | Liu Liu et.al. | 2509.13484 | null |
| 2025-09-16 | MapAnything: Universal Feed-Forward Metric 3D Reconstruction | Nikhil Keetha et.al. | 2509.13414 | null |
| 2025-09-16 | ROOM: A Physics-Based Continuum Robot Simulator for Photorealistic Medical Datasets Generation | Salvatore Esposito et.al. | 2509.13177 | link |
| 2025-09-15 | BREA-Depth: Bronchoscopy Realistic Airway-geometric Depth Estimation | Francis Xiatian Zhang et.al. | 2509.11885 | null |
| 2025-09-14 | In-Vivo Skin 3-D Surface Reconstruction and Wrinkle Depth Estimation using Handheld High Resolution Tactile Sensing | Akhil Padmanabha et.al. | 2509.11385 | null |
| 2025-09-14 | The System Description of CPS Team for Track on Driving with Language of CVPR 2024 Autonomous Grand Challenge | Jinghan Peng et.al. | 2509.11071 | null |
| 2025-09-12 | Self-supervised Learning Of Visual Pose Estimation Without Pose Labels By Classifying LED States | Nicholas Carlotti et.al. | 2509.10405 | null |
| 2025-09-10 | Computational Imaging for Enhanced Computer Vision | Humera Shaikh et.al. | 2509.08712 | null |
| 2025-09-10 | Deep Visual Odometry for Stereo Event Cameras | Sheng Zhong et.al. | 2509.08235 | null |
| 2025-09-09 | Zero-Shot Metric Depth Estimation via Monocular Visual-Inertial Rescaling for Autonomous Aerial Navigation | Steven Yang et.al. | 2509.08159 | null |
| 2025-09-09 | MCTED: A Machine-Learning-Ready Dataset for Digital Elevation Model Generation From Mars Imagery | Rafał Osadnik et.al. | 2509.08027 | null |
| 2025-09-08 | Event Spectroscopy: Event-based Multispectral and Depth Sensing using Structured Light | Christian Geckeler et.al. | 2509.06741 | null |
| 2025-09-08 | VIM-GS: Visual-Inertial Monocular Gaussian Splatting via Object-level Guidance in Large Scenes | Shengkai Zhang et.al. | 2509.06685 | null |
| 2025-09-07 | S-LAM3D: Segmentation-Guided Monocular 3D Object Detection via Feature Space Fusion | Diana-Alexandra Sas et.al. | 2509.05999 | null |
| 2025-09-06 | MonoGlass3D: Monocular 3D Glass Detection with Plane Regression and Adaptive Feature Fusion | Kai Zhang et.al. | 2509.05599 | null |
| 2025-09-05 | FloodVision: Urban Flood Depth Estimation Using Foundation Vision-Language Models and Domain Knowledge Graph | Zhangding Liu et.al. | 2509.04772 | null |
| 2025-09-03 | Uncertainty-aware Test-Time Training (UT$^3$) for Efficient On-the-fly Domain Adaptive Dense Regression | Uddeshya Upadhyay et.al. | 2509.03012 | null |
| 2025-09-03 | DUViN: Diffusion-Based Underwater Visual Navigation via Knowledge-Transferred Depth Features | Jinghe Yang et.al. | 2509.02983 | null |
| 2025-09-02 | Physics-Informed Machine Learning with Adaptive Grids for Optical Microrobot Depth Estimation | Lan Wei et.al. | 2509.02343 | null |
| 2025-09-02 | Doctoral Thesis: Geometric Deep Learning For Camera Pose Prediction, Registration, Depth Estimation, and 3D Reconstruction | Xueyang Kang et.al. | 2509.01873 | null |
| 2025-09-01 | EndoGMDE: Generalizable Monocular Depth Estimation with Mixture of Low-Rank Experts for Diverse Endoscopic Scenes | Liangjing Shao et.al. | 2509.01206 | null |
| 2025-08-31 | ER-LoRA: Effective-Rank Guided Adaptation for Weather-Generalized Depth Estimation | Weilong Yan et.al. | 2509.00665 | null |
| 2025-08-23 | ARTPS: Depth-Enhanced Hybrid Anomaly Detection and Learnable Curiosity Score for Autonomous Rover Target Prioritization | Poyraz Baydemir et.al. | 2509.00042 | null |
| 2025-08-28 | Enhancing Pseudo-Boxes via Data-Level LiDAR-Camera Fusion for Unsupervised 3D Object Detection | Mingqian Ji et.al. | 2508.20530 | null |
| 2025-08-27 | OpenM3D: Open Vocabulary Multi-view Indoor 3D Object Detection without Human Annotations | Peng-Hao Hsu et.al. | 2508.20063 | null |
| 2025-08-26 | SoccerNet 2025 Challenges Results | Silvio Giancola et.al. | 2508.19182 | null |
| 2025-08-25 | Impact of Target and Tool Visualization on Depth Perception and Usability in Optical See-Through AR | Yue Yang et.al. | 2508.18481 | null |
| 2025-08-25 | EndoUFM: Utilizing Foundation Models for Monocular depth estimation of endoscopic images | Xinning Yao et.al. | 2508.17916 | null |
| 2025-08-23 | Balanced Sharpness-Aware Minimization for Imbalanced Regression | Yahao Liu et.al. | 2508.16973 | null |
| 2025-08-20 | FOCUS: Frequency-Optimized Conditioning of DiffUSion Models for mitigating catastrophic forgetting during Test-Time Adaptation | Gabriel Tjio et.al. | 2508.14437 | null |
| 2025-08-19 | ROVR-Open-Dataset: A Large-Scale Depth Dataset for Autonomous Driving | Xianda Guo et.al. | 2508.13977 | null |
| 2025-08-18 | Batching-Aware Joint Model Onloading and Offloading for Hierarchical Multi-Task Inference | Seohyeon Cha et.al. | 2508.13380 | null |
| 2025-08-18 | DMS: Diffusion-Based Multi-Baseline Stereo Generation for Improving Self-Supervised Depth Estimation | Zihua Liu et.al. | 2508.13091 | null |
| 2025-08-15 | DashCam Video: A complementary low-cost data stream for on-demand forest-infrastructure system monitoring | Durga Joshi et.al. | 2508.11591 | null |
| 2025-08-15 | Unifying Scale-Aware Depth Prediction and Perceptual Priors for Monocular Endoscope Pose Estimation and Tissue Reconstruction | Muzammil Khan et.al. | 2508.11282 | null |
| 2025-08-15 | CHARM3R: Towards Unseen Camera Height Robust Monocular 3D Detector | Abhinav Kumar et.al. | 2508.11185 | null |
| 2025-08-12 | Vision-Only Gaussian Splatting for Collaborative Semantic Occupancy Prediction | Cheng Chen et.al. | 2508.10936 | null |
| 2025-08-14 | SC-Lane: Slope-aware and Consistent Road Height Estimation Framework for 3D Lane Detection | Chaesong Park et.al. | 2508.10411 | null |
| 2025-08-12 | A new dataset and comparison for multi-camera frame synthesis | Conall Daly et.al. | 2508.09068 | null |
| 2025-08-12 | Deep Spectral Epipolar Representations for Dense Light Field Reconstruction | Noor Islam S. Mohammad et.al. | 2508.08900 | null |
| 2025-08-11 | GRASPTrack: Geometry-Reasoned Association via Segmentation and Projection for Multi-Object Tracking | Xudong Han et.al. | 2508.08117 | null |
| 2025-08-11 | TRIDE: A Text-assisted Radar-Image weather-aware fusion network for Depth Estimation | Huawei Sun et.al. | 2508.08038 | null |
| 2025-08-11 | Autonomous Navigation of Cloud-Controlled Quadcopters in Confined Spaces Using Multi-Modal Perception and LLM-Driven High Semantic Reasoning | Shoaib Ahmmad et.al. | 2508.07885 | null |
| 2025-08-10 | MonoMPC: Monocular Vision Based Navigation with Learned Collision Model and Risk-Aware Model Predictive Control | Basant Sharma et.al. | 2508.07387 | null |
| 2025-08-10 | DIP-GS: Deep Image Prior For Gaussian Splatting Sparse View Recovery | Rajaei Khatib et.al. | 2508.07372 | null |
| 2025-08-10 | Similarity Matters: A Novel Depth-guided Network for Image Restoration and A New Dataset | Junyi He et.al. | 2508.07211 | null |
| 2025-08-10 | Acoustic source depth estimation method based on a single hydrophone in Arctic underwater | Jinbao Weng et.al. | 2508.07157 | null |
| 2025-08-09 | AugLift: Boosting Generalization in Lifting-based 3D Human Pose Estimation | Nikolai Warner et.al. | 2508.07112 | null |
| 2025-08-08 | Neural Field Representations of Mobile Computational Photography | Ilya Chugunov et.al. | 2508.05907 | null |
| 2025-08-07 | Propagating Sparse Depth via Depth Foundation Model for Out-of-Distribution Depth Completion | Shenglun Chen et.al. | 2508.04984 | null |
| 2025-08-06 | Extending Foundational Monocular Depth Estimators to Fisheye Cameras with Calibration Tokens | Suchisrit Gangopadhyay et.al. | 2508.04928 | null |
| 2025-08-06 | BridgeDepth: Bridging Monocular and Stereo Reasoning with Latent Alignment | Tongfan Guan et.al. | 2508.04611 | null |
| 2025-08-06 | Pseudo Depth Meets Gaussian: A Feed-forward RGB SLAM Baseline | Linqing Zhao et.al. | 2508.04597 | null |
| 2025-08-06 | MuGS: Multi-Baseline Generalizable Gaussian Splatting Reconstruction | Yaopeng Lou et.al. | 2508.04297 | null |
| 2025-08-06 | DET-GS: Depth- and Edge-Aware Regularization for High-Fidelity 3D Gaussian Splatting | Zexu Huang et.al. | 2508.04099 | null |
| 2025-08-05 | Monocular Depth Estimation with Global-Aware Discretization and Local Context Modeling | Heng Wu et.al. | 2508.03186 | null |
| 2025-08-04 | VRSight: An AI-Driven Scene Description System to Improve Virtual Reality Accessibility for Blind People | Daniel Killough et.al. | 2508.02958 | null |
| 2025-08-04 | Elucidating the Role of Feature Normalization in IJEPA | Adam Colton et.al. | 2508.02829 | null |
| 2025-08-04 | Rethinking Transparent Object Grasping: Depth Completion with Monocular Depth Estimation and Instance Mask | Yaofeng Cheng et.al. | 2508.02507 | null |
| 2025-08-02 | 3DRot: 3D Rotation Augmentation for RGB-Based 3D Tasks | Shitian Yang et.al. | 2508.01423 | null |
| 2025-08-02 | A Coarse-to-Fine Approach to Multi-Modality 3D Occupancy Grounding | Zhan Shi et.al. | 2508.01197 | link |
| 2025-07-29 | A Survey on Deep Multi-Task Learning in Connected Autonomous Vehicles | Jiayuan Wang et.al. | 2508.00917 | null |
| 2025-07-29 | TESPEC: Temporally-Enhanced Self-Supervised Pretraining for Event Cameras | Mohammad Mohammadi et.al. | 2508.00913 | null |
| 2025-07-28 | Sparse 3D Perception for Rose Harvesting Robots: A Two-Stage Approach Bridging Simulation and Real-World Applications | Taha Samavati et.al. | 2508.00900 | null |
| 2025-08-01 | Can Large Pretrained Depth Estimation Models Help With Image Dehazing? | Hongfei Zhang et.al. | 2508.00698 | null |
| 2025-07-31 | Stereo 3D Gaussian Splatting SLAM for Outdoor Urban Scenes | Xiaohan Li et.al. | 2507.23677 | null |
| 2025-07-30 | A Dual-Feature Extractor Framework for Accurate Back Depth and Spine Morphology Estimation from Monocular RGB Images | Yuxin Wei et.al. | 2507.22691 | null |
| 2025-07-30 | UAVScenes: A Multi-Modal Dataset for UAVs | Sijie Wang et.al. | 2507.22412 | link |
| 2025-07-29 | PanoSplatt3R: Leveraging Perspective Pretraining for Generalized Unposed Wide-Baseline Panorama Reconstruction | Jiahui Ren et.al. | 2507.21960 | null |
| 2025-07-25 | Event-Based De-Snowing for Autonomous Driving | Manasi Muglikar et.al. | 2507.20901 | null |
| 2025-07-28 | Endoscopic Depth Estimation Based on Deep Learning: A Survey | Ke Niu et.al. | 2507.20881 | null |
| 2025-07-26 | UniCT Depth: Event-Image Fusion Based Monocular Depth Estimation with Convolution-Compensated ViT Dual SA Block | Luoxi Jing et.al. | 2507.19948 | null |
| 2025-07-24 | Towards Scalable Spatial Intelligence via 2D-to-3D Data Lifting | Xingyu Miao et.al. | 2507.18678 | null |
| 2025-07-24 | DepthDark: Robust Monocular Depth Estimation for Low-Light Environments | Longjian Zeng et.al. | 2507.18243 | null |
| 2025-07-24 | BokehDiff: Neural Lens Blur with One-Step Diffusion | Chengxuan Zhu et.al. | 2507.18060 | null |
| 2025-07-23 | Monocular Semantic Scene Completion via Masked Recurrent Networks | Xuzhi Wang et.al. | 2507.17661 | null |
| 2025-07-22 | SDGOCC: Semantic and Depth-Guided Bird’s-Eye View Transformation for 3D Multimodal Occupancy Prediction | Zaipeng Duan et.al. | 2507.17083 | null |
| 2025-07-21 | DAViD: Data-efficient and Accurate Vision Models from Synthetic Data | Fatemeh Saleh et.al. | 2507.15365 | null |
| 2025-07-21 | BenchDepth: Are We on the Right Way to Evaluate Depth Foundation Models? | Zhenyu Li et.al. | 2507.15321 | null |
| 2025-07-20 | Region-aware Depth Scale Adaptation with Sparse Measurements | Rizhao Fan et.al. | 2507.14879 | null |
| 2025-07-20 | Training Self-Supervised Depth Completion Using Sparse Measurements and a Single Image | Rizhao Fan et.al. | 2507.14845 | null |
| 2025-07-19 | DCHM: Depth-Consistent Human Modeling for Multiview Detection | Jiahao Ma et.al. | 2507.14505 | null |
| 2025-07-19 | Motion Segmentation and Egomotion Estimation from Event-Based Normal Flow | Zhiyuan Hua et.al. | 2507.14500 | null |
| 2025-07-18 | Depth3DLane: Fusing Monocular 3D Lane Detection with Self-Supervised Monocular Depth Estimation | Max van den Hoven et.al. | 2507.13857 | null |
| 2025-07-18 | Augmented Reality in Cultural Heritage: A Dual-Model Pipeline for 3D Artwork Reconstruction | Daniele Pannone et.al. | 2507.13719 | null |
| 2025-07-17 | $\pi^3$: Scalable Permutation-Equivariant Visual Geometry Learning | Yifan Wang et.al. | 2507.13347 | null |
| 2025-07-17 | $S^2M^2$: Scalable Stereo Matching Model for Reliable Depth Estimation | Junhong Min et.al. | 2507.13229 | null |
| 2025-07-16 | Vision-based Perception for Autonomous Vehicles in Obstacle Avoidance Scenarios | Van-Hoang-Anh Phan et.al. | 2507.12449 | null |
| 2025-07-16 | Efficient Calisthenics Skills Classification through Foreground Instance Selection and Depth Estimation | Antonio Finocchiaro et.al. | 2507.12292 | null |
| 2025-07-15 | Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation | Zhen Xu et.al. | 2507.11540 | null |
| 2025-07-15 | MonoMVSNet: Monocular Priors Guided Multi-View Stereo Network | Jianfei Jiang et.al. | 2507.11333 | null |
| 2025-07-15 | Uncertainty Aware Mapping for Vision-Based Underwater Robots | Abhimanyu Bhowmik et.al. | 2507.10991 | null |
| 2025-07-14 | Static or Temporal? Semantic Scene Simplification to Aid Wayfinding in Immersive Simulations of Bionic Vision | Justin M. Kasowski et.al. | 2507.10813 | null |
| 2025-07-14 | Cameras as Relative Positional Encoding | Ruilong Li et.al. | 2507.10496 | null |
| 2025-07-14 | Spatial Lifting for Dense Prediction | Mingzhi Xu et.al. | 2507.10222 | null |
| 2025-07-13 | Prompt2DEM: High-Resolution DEMs for Urban and Open Environments from Global Prompts Using a Monocular Foundation Model | Osher Rafaeli et.al. | 2507.09681 | null |
| 2025-07-11 | ByDeWay: Boost Your multimodal LLM with DEpth prompting in a Training-Free Way | Rajarshi Roy et.al. | 2507.08679 | null |
| 2025-07-10 | An Embedded Real-time Object Alert System for Visually Impaired: A Monocular Depth Estimation based Approach through Computer Vision | Jareen Anjom et.al. | 2507.08165 | null |
| 2025-07-10 | Tree-Mamba: A Tree-Aware Mamba for Underwater Monocular Depth Estimation | Peixian Zhuang et.al. | 2507.07687 | null |
| 2025-07-10 | HOTA: Hierarchical Overlap-Tiling Aggregation for Large-Area 3D Flood Mapping | Wenfeng Jia et.al. | 2507.07585 | null |
| 2025-07-08 | LighthouseGS: Indoor Structure-aware 3D Gaussian Splatting for Panorama-Style Mobile Captures | Seungoh Han et.al. | 2507.06109 | null |
| 2025-07-14 | Beyond Appearance: Geometric Cues for Robust Video Instance Segmentation | Quanzhu Niu et.al. | 2507.05948 | null |
| 2025-07-07 | The Generalization Ridge: Information Flow in Natural Language Generation | Ruidi Chang et.al. | 2507.05387 | null |
| 2025-07-10 | VOTE: Vision-Language-Action Optimization with Trajectory Ensemble Voting | Juyi Lin et.al. | 2507.05116 | null |
| 2025-07-07 | Estimating Object Physical Properties from RGB-D Vision and Depth Robot Sensors Using Deep Learning | Ricardo Cardoso et.al. | 2507.05029 | null |
| 2025-07-06 | A View-consistent Sampling Method for Regularized Training of Neural Radiance Fields | Aoxiang Fan et.al. | 2507.04408 | null |
| 2025-07-06 | High-Resolution Sustain Pedal Depth Estimation from Piano Audio Across Room Acoustics | Kun Fang et.al. | 2507.04230 | null |
| 2025-07-03 | From Pixels to Damage Severity: Estimating Earthquake Impacts Using Semantic Segmentation of Social Media Images | Danrong Zhang et.al. | 2507.02781 | null |
| 2025-07-02 | Underwater Monocular Metric Depth Estimation: Real-World Benchmarks and Synthetic Fine-Tuning | Zijie Cai et.al. | 2507.02148 | null |
| 2025-07-02 | RobuSTereo: Robust Zero-Shot Stereo Matching under Adverse Weather | Yuran Wang et.al. | 2507.01653 | null |
| 2025-07-02 | Depth Anything at Any Condition | Boyuan Sun et.al. | 2507.01634 | null |
| 2025-07-02 | DepthSync: Diffusion Guidance-Based Depth Synchronization for Scale- and Geometry-Consistent Video Depth Estimation | Yue-Jiang Dong et.al. | 2507.01603 | null |
| 2025-07-02 | Evaluating Robustness of Monocular Depth Estimation with Procedural Scene Perturbations | Jack Nugent et.al. | 2507.00981 | null |
| 2025-06-30 | SurgiSR4K: A High-Resolution Endoscopic Video Dataset for Robotic-Assisted Minimally Invasive Procedures | Fengyi Jiang et.al. | 2507.00209 | null |
| 2025-06-30 | OcRFDet: Object-Centric Radiance Fields for Multi-View 3D Object Detection in Autonomous Driving | Mingqian Ji et.al. | 2506.23565 | null |
| 2025-06-26 | ThermalDiffusion: Visual-to-Thermal Image-to-Image Translation for Autonomous Navigation | Shruti Bansal et.al. | 2506.20969 | null |
| 2025-06-25 | THIRDEYE: Cue-Aware Monocular Depth Estimation via Brain-Inspired Multi-Stage Fusion | Calin Teodor Ioan et.al. | 2506.20877 | null |
| 2025-06-30 | StereoDiff: Stereo-Diffusion Synergy for Video Depth Estimation | Haodong Li et.al. | 2506.20756 | null |
| 2025-06-24 | Look to Locate: Vision-Based Multisensory Navigation with 3-D Digital Maps for GNSS-Challenged Environments | Ola Elmaghraby et.al. | 2506.19827 | null |
| 2025-06-23 | SOF: Sorted Opacity Fields for Fast Unbounded Surface Reconstruction | Lukas Radl et.al. | 2506.19139 | null |
| 2025-06-23 | BulletGen: Improving 4D Reconstruction with Bullet-Time Generation | Denys Rozumnyi et.al. | 2506.18601 | null |
| 2025-06-21 | Optimization-Free Patch Attack on Stereo Depth Estimation | Hangcheng Liu et.al. | 2506.17632 | null |
| 2025-06-20 | DreamCube: 3D Panorama Generation via Multi-plane Synchronization | Yukun Huang et.al. | 2506.17206 | link |
| 2025-06-20 | RGBTrack: Fast, Robust Depth-Free 6D Pose Estimation and Tracking | Teng Guo et.al. | 2506.17119 | link |
| 2025-06-20 | Monocular One-Shot Metric-Depth Alignment for RGB-Based Robot Grasping | Teng Guo et.al. | 2506.17110 | null |
| 2025-06-20 | DepthVanish: Optimizing Adversarial Interval Structures for Stereo-Depth-Invisible Patches | Yun Xing et.al. | 2506.16690 | null |
| 2025-06-19 | EndoMUST: Monocular Depth Estimation for Robotic Endoscopy via End-to-end Multi-step Self-supervised Training | Liangjing Shao et.al. | 2506.16017 | link |
| 2025-06-18 | RaCalNet: Radar Calibration Network for Sparse-Supervised Metric Depth Estimation | Xingrui Qin et.al. | 2506.15560 | null |
| 2025-06-17 | Time-Optimized Safe Navigation in Unstructured Environments through Learning Based Depth Completion | Jeffrey Mao et.al. | 2506.14975 | null |
| 2025-06-17 | DiFuse-Net: RGB and Dual-Pixel Depth Estimation using Window Bi-directional Parallax Attention and Cross-modal Transfer Learning | Kunal Swami et.al. | 2506.14709 | null |
| 2025-06-16 | Test3R: Learning to Reconstruct 3D at Test Time | Yuheng Yuan et.al. | 2506.13750 | link |
| 2025-06-16 | Multiview Geometric Regularization of Gaussian Splatting for Accurate Radiance Fields | Jungeon Kim et.al. | 2506.13508 | null |
| 2025-06-17 | Self-Supervised Enhancement for Depth from a Lightweight ToF Sensor with Monocular Images | Laiyan Ding et.al. | 2506.13444 | null |
| 2025-06-16 | TR2M: Transferring Monocular Relative Depth to Metric Depth with Language Descriptions and Scale-Oriented Contrast | Beilei Cui et.al. | 2506.13387 | link |
| 2025-06-17 | 3D Hand Mesh-Guided AI-Generated Malformed Hand Refinement with Hand Pose Transformation via Diffusion Model | Chen-Bin Feng et.al. | 2506.12680 | null |
| 2025-06-12 | Leveraging 6DoF Pose Foundation Models For Mapping Marine Sediment Burial | Jerry Yan et.al. | 2506.10386 | link |
| 2025-06-11 | DCIRNet: Depth Completion with Iterative Refinement for Dexterous Grasping of Transparent and Reflective Objects | Guanghu Xie et.al. | 2506.09491 | null |
| 2025-06-11 | MSSDF: Modality-Shared Self-supervised Distillation for High-Resolution Multi-modal Remote Sensing Image Learning | Tong Wang et.al. | 2506.09327 | null |
| 2025-06-10 | AVA-Bench: Atomic Visual Ability Benchmark for Vision Foundation Models | Zheda Mai et.al. | 2506.09082 | null |
| 2025-06-10 | One Patch to Rule Them All: Transforming Static Patches into Dynamic Attacks in the Physical World | Xingshuo Han et.al. | 2506.08482 | null |
| 2025-06-09 | Jamais Vu: Exposing the Generalization Gap in Supervised Semantic Correspondence | Octave Mariotti et.al. | 2506.08220 | null |
| 2025-06-09 | Hidden in plain sight: VLMs overlook their visual representations | Stephanie Fu et.al. | 2506.08008 | null |
| 2025-06-09 | EgoM2P: Egocentric Multimodal Multitask Pretraining | Gen Li et.al. | 2506.07886 | link |
| 2025-06-09 | Flow-Anything: Learning Real-World Optical Flow Estimation from Large-Scale Single-view Images | Yingping Liang et.al. | 2506.07740 | null |
| 2025-06-07 | Dark Channel-Assisted Depth-from-Defocus from a Single Image | Moushumi Medhi et.al. | 2506.06643 | null |
| 2025-06-06 | NTIRE 2025 Challenge on HR Depth from Images of Specular and Transparent Surfaces | Pierluigi Zama Ramirez et.al. | 2506.05815 | null |
| 2025-06-06 | Advancement and Field Evaluation of a Dual-arm Apple Harvesting Robot | Keyi Zhu et.al. | 2506.05714 | null |
| 2025-06-06 | Token Transforming: A Unified and Training-Free Token Compression Framework for Vision Transformer Acceleration | Fanhu Zeng et.al. | 2506.05709 | null |
| 2025-06-06 | Aerial Multi-View Stereo via Adaptive Depth Range Inference and Normal Cues | Yimei Liu et.al. | 2506.05655 | null |
| 2025-06-03 | Attacking Attention of Foundation Models Disrupts Downstream Tasks | Hondamunige Prasanna Silva et.al. | 2506.05394 | null |
| 2025-06-09 | Structure-Aware Radar-Camera Depth Estimation | Fuyi Zhang et.al. | 2506.05008 | null |
| 2025-06-05 | Generating Synthetic Stereo Datasets using 3D Gaussian Splatting and Expert Knowledge Transfer | Filip Slezak et.al. | 2506.04908 | null |
| 2025-06-05 | Toward Better SSIM Loss for Unsupervised Monocular Depth Estimation | Yijun Cao et.al. | 2506.04758 | null |
| 2025-06-04 | JointSplat: Probabilistic Joint Flow-Depth Optimization for Sparse-View Gaussian Splatting | Yang Xiao et.al. | 2506.03872 | null |
| 2025-06-03 | ViT-Split: Unleashing the Power of Vision Foundation Models via Efficient Splitting Heads | Yifan Li et.al. | 2506.03433 | null |
| 2025-06-02 | E3D-Bench: A Benchmark for End-to-End 3D Geometric Foundation Models | Wenyan Cong et.al. | 2506.01933 | null |
| 2025-06-01 | Perceptual Inductive Bias Is What You Need Before Contrastive Learning | Tianqin Li et.al. | 2506.01201 | null |
| 2025-05-31 | XYZ-IBD: High-precision Bin-picking Dataset for Object 6D Pose Estimation Capturing Real-world Industrial Complexity | Junwen Huang et.al. | 2506.00599 | null |
| 2025-05-31 | Improving Optical Flow and Stereo Depth Estimation by Leveraging Uncertainty-Based Learning Difficulties | Jisoo Jeong et.al. | 2506.00324 | null |
| 2025-05-30 | Harnessing Foundation Models for Robust and Generalizable 6-DOF Bronchoscopy Localization | Qingyao Tian et.al. | 2505.24249 | null |
| 2025-05-29 | Ultrafast High-Flux Single-Photon LiDAR Simulator via Neural Mapping | Weijian Zhang et.al. | 2505.23992 | null |
| 2025-05-29 | Bridging Geometric and Semantic Foundation Models for Generalized Monocular Depth Estimation | Sanggyun Ma et.al. | 2505.23400 | null |
| 2025-05-29 | GeoMan: Temporally Consistent Human Geometry Estimation using Image-to-Video Diffusion | Gwanghyun Kim et.al. | 2505.23085 | null |
| 2025-05-28 | Depth to magnetic source estimation using TDX contour | Hammed Oyekan et.al. | 2505.22780 | null |
| 2025-05-27 | Object Concepts Emerge from Motion | Haoqian Liang et.al. | 2505.21635 | null |
| 2025-05-23 | EvidenceMoE: A Physics-Guided Mixture-of-Experts with Evidential Critics for Advancing Fluorescence Light Detection and Ranging in Scattering Media | Ismail Erbas et.al. | 2505.21532 | null |
| 2025-05-27 | Occlusion Boundary and Depth: Mutual Enhancement via Multi-Task Learning | Lintao Xu et.al. | 2505.21231 | null |
| 2025-05-27 | Robust Video-Based Pothole Detection and Area Estimation for Intelligent Vehicles with Depth Map and Kalman Smoothing | Dehao Wang et.al. | 2505.21049 | null |
| 2025-05-27 | Spatial RoboGrasp: Generalized Robotic Grasping Control Policy | Yiqi Huang et.al. | 2505.20814 | null |
| 2025-05-26 | SpikeStereoNet: A Brain-Inspired Framework for Stereo Depth Estimation from Spike Streams | Zhuoheng Gao et.al. | 2505.19487 | null |
| 2025-05-25 | From Single Images to Motion Policies via Video-Generation Environment Representations | Weiming Zhi et.al. | 2505.19306 | null |
| 2025-05-23 | Repurposing Marigold for Zero-Shot Metric Depth Estimation via Defocus Blur Cues | Chinmay Talegaonkar et.al. | 2505.17358 | null |
| 2025-05-22 | MEgoHand: Multimodal Egocentric Hand-Object Interaction Motion Generation | Bohan Zhou et.al. | 2505.16602 | null |
| 2025-05-22 | BadDepth: Backdoor Attacks Against Monocular Depth Estimation in the Physical World | Ji Guo et.al. | 2505.16154 | null |
| 2025-05-21 | RadarRGBD A Multi-Sensor Fusion Dataset for Perception with RGB-D and mmWave Radar | Tieshuai Song et.al. | 2505.15860 | null |
| 2025-05-20 | M3Depth: Wavelet-Enhanced Depth Estimation on Mars via Mutual Boosting of Dual-Modal Data | Junjie Li et.al. | 2505.14159 | null |
| 2025-05-20 | Multi-Label Stereo Matching for Transparent Scene Depth Estimation | Zhidan Liu et.al. | 2505.14008 | link |
| 2025-05-20 | Event-Driven Dynamic Scene Depth Completion | Zhiqiang Yan et.al. | 2505.13279 | null |
| 2025-05-19 | DB3D-L: Depth-aware BEV Feature Transformation for Accurate 3D Lane Detection | Yehao Liu et.al. | 2505.13266 | null |
| 2025-05-24 | 3D Visual Illusion Depth Estimation | Chengtang Yao et.al. | 2505.13061 | link |
| 2025-05-19 | IA-MVS: Instance-Focused Adaptive Depth Sampling for Multi-View Stereo | Yinzhe Wang et.al. | 2505.12714 | null |
| 2025-05-18 | Depth Transfer: Learning to See Like a Simulator for Real-World Drone Navigation | Hang Yu et.al. | 2505.12428 | null |
| 2025-05-18 | Always Clear Depth: Robust Monocular Depth Estimation under Adverse Weather | Kui Jiang et.al. | 2505.12199 | link |
| 2025-05-17 | MonoMobility: Zero-Shot 3D Mobility Analysis from Monocular Videos | Hongyi Zhou et.al. | 2505.11868 | null |
| 2025-05-16 | SurgPose: Generalisable Surgical Instrument Pose Estimation using Zero-Shot Learning and Stereo Vision | Utsav Rai et.al. | 2505.11439 | null |
| 2025-05-16 | Attention on the Sphere | Boris Bonev et.al. | 2505.11157 | null |
| 2025-05-15 | Depth Anything with Any Prior | Zehan Wang et.al. | 2505.10565 | link |
| 2025-05-15 | JointDistill: Adaptive Multi-Task Distillation for Joint Depth Estimation and Scene Segmentation | Tiancong Cheng et.al. | 2505.10057 | null |
| 2025-05-14 | Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis | Bingxin Ke et.al. | 2505.09358 | link |
| 2025-05-13 | Boosting Zero-shot Stereo Matching using Large-scale Mixed Images Sources in the Real World | Yuran Wang et.al. | 2505.08607 | null |
| 2025-05-12 | Some insights into depth estimators for location and scatter in the multivariate setting | Jorge G. Adrover et.al. | 2505.07383 | null |
| 2025-05-11 | Reinforcement Learning-Based Monocular Vision Approach for Autonomous UAV Landing | Tarik Houichime et.al. | 2505.06963 | null |
| 2025-05-10 | ElectricSight: 3D Hazard Monitoring for Power Lines Using Low-Cost Sensors | Xingchen Li et.al. | 2505.06573 | null |
| 2025-05-09 | Camera-Only Bird’s Eye View Perception: A Neural Approach to LiDAR-Free Environmental Mapping for Autonomous Vehicles | Anupkumar Bochare et.al. | 2505.06113 | null |
| 2025-05-09 | MonoCoP: Chain-of-Prediction for Monocular 3D Object Detection | Zhihao Zhang et.al. | 2505.04594 | null |
| 2025-05-13 | Self-Supervised Learning for Robotic Leaf Manipulation: A Hybrid Geometric-Neural Approach | Srecharan Selvam et.al. | 2505.03702 | null |
| 2025-05-06 | LiftFeat: 3D Geometry-Aware Local Feature Matching | Yepeng Liu et.al. | 2505.03422 | link |
| 2025-05-06 | VGLD: Visually-Guided Linguistic Disambiguation for Monocular Depth Scale Recovery | Bojin Wu et.al. | 2505.02704 | link |
| 2025-05-05 | DELTA: Dense Depth from Events and LiDAR using Transformer’s Attention | Vincent Brebion et.al. | 2505.02593 | null |
| 2025-05-03 | PosePilot: Steering Camera Pose for Generative World Models with Self-supervised Depth | Bu Jin et.al. | 2505.01729 | null |
| 2025-05-02 | LMDepth: Lightweight Mamba-based Monocular Depth Estimation for Real-World Deployment | Jiahuan Long et.al. | 2505.00980 | null |
| 2025-05-01 | JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers | Kwon Byung-Ki et.al. | 2505.00482 | link |
| 2025-04-30 | HoloTime: Taming Video Diffusion Models for Panoramic 4D Scene Generation | Haiyang Zhou et.al. | 2504.21650 | link |
| 2025-04-30 | eNCApsulate: NCA for Precision Diagnosis on Capsule Endoscopes | Henry John Krumb et.al. | 2504.21562 | null |
| 2025-04-29 | Real-Time Wayfinding Assistant for Blind and Low-Vision Users | Dabbrata Das et.al. | 2504.20976 | null |
| 2025-04-29 | Large-scale visual SLAM for in-the-wild videos | Shuo Sun et.al. | 2504.20496 | null |
| 2025-04-28 | Joint Optimization of Neural Radiance Fields and Continuous Camera Motion from a Monocular Video | Hoang Chuong Nguyen et.al. | 2504.19819 | null |
| 2025-04-27 | Leveraging Multi-Modal Saliency and Fusion for Gaze Target Detection | Athul M. Mathew et.al. | 2504.19271 | null |
| 2025-04-26 | Depth as Points: Center Point-based Depth Estimation | Zhiheng Tu et.al. | 2504.18773 | null |
| 2025-04-25 | LaRI: Layered Ray Intersections for Single-view 3D Geometric Reasoning | Rui Li et.al. | 2504.18424 | null |
| 2025-04-25 | Dense Geometry Supervision for Underwater Depth Estimation | Wenxiang Gua et.al. | 2504.18233 | null |
| 2025-04-25 | LiDAR-Guided Monocular 3D Object Detection for Long-Range Railway Monitoring | Raul David Dominguez Sanchez et.al. | 2504.18203 | null |
| 2025-04-24 | The Fourth Monocular Depth Estimation Challenge | Anton Obukhov et.al. | 2504.17787 | null |
| 2025-04-24 | Occlusion-Aware Self-Supervised Monocular Depth Estimation for Weak-Texture Endoscopic Images | Zebo Huang et.al. | 2504.17582 | null |
| 2025-04-24 | Invasion depth estimation of gastric cancer in early stage using circularly polarized light scattering: Phantom studies | Mike R. Maskey et.al. | 2504.17161 | null |
| 2025-04-23 | PPS-Ctrl: Controllable Sim-to-Real Translation for Colonoscopy Depth Estimation | Xinqi Xiong et.al. | 2504.17067 | null |
| 2025-04-23 | Helping Blind People Grasp: Enhancing a Tactile Bracelet with an Automated Hand Navigation System | Marcin Furtak et.al. | 2504.16502 | null |
| 2025-04-21 | MonoTher-Depth: Enhancing Thermal Depth Estimation via Confidence-Aware Distillation | Xingxing Zuo et.al. | 2504.16127 | null |
| 2025-04-22 | DERD-Net: Learning Depth from Event-based Ray Densities | Diego de Oliveira Hitzges et.al. | 2504.15863 | null |
| 2025-04-22 | VistaDepth: Frequency Modulation With Bias Reweighting For Enhanced Long-Range Depth Estimation | Mingxia Zhan et.al. | 2504.15095 | null |
| 2025-04-20 | Seurat: From Moving Points to Depth | Seokju Cho et.al. | 2504.14687 | link |
| 2025-04-18 | Occlusion-Ordered Semantic Instance Segmentation | Soroosh Baselizadeh et.al. | 2504.14054 | null |
| 2025-04-18 | Enhancing Pothole Detection and Characterization: Integrated Segmentation and Depth Estimation in Road Anomaly Systems | Uthman Baroudi et.al. | 2504.13648 | null |
| 2025-04-17 | Perception Encoder: The best visual embeddings are not at the output of the network | Daniel Bolya et.al. | 2504.13181 | link |
| 2025-04-17 | TSGS: Improving Gaussian Splatting for Transparent Surface Reconstruction via Normal and De-lighting Priors | Mingwei Li et.al. | 2504.12799 | null |
| 2025-04-17 | Privacy-Preserving Operating Room Workflow Analysis using Digital Twins | Alejandra Perez et.al. | 2504.12552 | null |
| 2025-04-16 | Metric-Solver: Sliding Anchored Metric Depth Estimation from a Single Image | Tao Wen et.al. | 2504.12103 | null |
| 2025-04-16 | TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion | Yiran Wang et.al. | 2504.11773 | null |
| 2025-04-16 | An Online Adaptation Method for Robust Depth Estimation and Visual Odometry in the Open World | Xingwu Ji et.al. | 2504.11698 | link |
| 2025-04-15 | Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception | Ziqi Pang et.al. | 2504.11457 | link |
| 2025-04-16 | DeepWheel: Generating a 3D Synthetic Wheel Dataset for Design and Performance Evaluation | Soyoung Yoo et.al. | 2504.11347 | null |
| 2025-04-13 | TextSplat: Text-Guided Semantic Fusion for Generalizable Gaussian Splatting | Zhicong Wu et.al. | 2504.09588 | null |
| 2025-04-12 | Text To 3D Object Generation For Scalable Room Assembly | Sonia Laguna et.al. | 2504.09328 | null |
| 2025-04-11 | Cut-and-Splat: Leveraging Gaussian Splatting for Synthetic Data Generation | Bram Vanherle et.al. | 2504.08473 | link |
| 2025-04-10 | Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction | Zeren Jiang et.al. | 2504.07961 | link |
| 2025-04-09 | FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution | Gene Chou et.al. | 2504.07093 | null |
| 2025-04-08 | POMATO: Marrying Pointmap Matching with Temporal Motion for Dynamic 3D Reconstruction | Songyan Zhang et.al. | 2504.05692 | link |
| 2025-04-07 | Stereo-LiDAR Fusion by Semi-Global Matching With Discrete Disparity-Matching Cost and Semidensification | Yasuhiro Yao et.al. | 2504.05148 | link |
| 2025-04-04 | 3D Scene Understanding Through Local Random Access Sequence Modeling | Wanhee Lee et.al. | 2504.03875 | link |
| 2025-04-04 | RingMoE: Mixture-of-Modality-Experts Multi-Modal Foundation Models for Universal Remote Sensing Image Interpretation | Hanbo Bi et.al. | 2504.03166 | null |
| 2025-04-02 | FreSca: Unveiling the Scaling Space in Diffusion Models | Chao Huang et.al. | 2504.02154 | link |
| 2025-04-03 | Toward Real-world BEV Perception: Depth Uncertainty Estimation via Gaussian Splatting | Shu-Wei Lu et.al. | 2504.01957 | null |
| 2025-04-02 | A novel gesture interaction control method for rehabilitation lower extremity exoskeleton | Shuang Qiu et.al. | 2504.01888 | null |
| 2025-04-02 | DEPTHOR: Depth Enhancement from a Practical Light-Weight dToF Sensor and RGB Image | Jijun Xiang et.al. | 2504.01596 | null |
| 2025-04-01 | GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors | Tian-Xing Xu et.al. | 2504.01016 | link |
| 2025-04-01 | Monocular and Generalizable Gaussian Talking Head Animation | Shengjie Gong et.al. | 2504.00665 | null |
| 2025-03-31 | ExScene: Free-View 3D Scene Reconstruction with Gaussian Splatting from a Single Image | Tianyi Gong et.al. | 2503.23881 | null |
| 2025-03-31 | Detail-aware multi-view stereo network for depth estimation | Haitao Tian et.al. | 2503.23684 | null |
| 2025-03-30 | Blurry-Edges: Photon-Limited Depth Estimation from Defocused Boundaries | Wei Xu et.al. | 2503.23606 | null |
| 2025-03-30 | Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model | Jannik Endres et.al. | 2503.23502 | link |
| 2025-03-28 | SemAlign3D: Semantic Correspondence between RGB-Images through Aligning 3D Object-Class Representations | Krispin Wandel et.al. | 2503.22462 | null |
| 2025-03-28 | EndoLRMGS: Complete Endoscopic Scene Reconstruction combining Large Reconstruction Modelling and Gaussian Splatting | Xu Wang et.al. | 2503.22437 | link |
| 2025-03-28 | MVSAnywhere: Zero-Shot Multi-View Stereo | Sergio Izquierdo et.al. | 2503.22430 | null |
| 2025-03-28 | One Look is Enough: A Novel Seamless Patchwise Refinement for Zero-Shot Monocular Depth Estimation Models on High-Resolution Images | Byeongjun Kwon et.al. | 2503.22351 | null |
| 2025-03-28 | Intrinsic Image Decomposition for Robust Self-supervised Monocular Depth Estimation on Reflective Surfaces | Wonhyeok Choi et.al. | 2503.22209 | null |
| 2025-03-28 | Deep Depth Estimation from Thermal Image: Dataset, Benchmark, and Challenges | Ukcheol Shin et.al. | 2503.22060 | link |
| 2025-03-27 | A Unified Image-Dense Annotation Generation Model for Underwater Scenes | Hongkai Lin et.al. | 2503.21771 | link |
| 2025-03-27 | ICG-MVSNet: Learning Intra-view and Cross-view Relationships for Guidance in Multi-View Stereo | Yuxi Hu et.al. | 2503.21525 | null |
| 2025-03-26 | Synthetic-to-Real Self-supervised Robust Depth Estimation via Learning with Motion and Structure Priors | Weilong Yan et.al. | 2503.20211 | link |
| 2025-03-26 | FUSE: Label-Free Image-Event Joint Monocular Depth Estimation via Frequency-Decoupled Alignment and Degradation-Robust Fusion | Pihai Sun et.al. | 2503.19739 | link |
| 2025-03-25 | Semi-SD: Semi-Supervised Metric Depth Estimation via Surrounding Cameras for Autonomous Driving | Yusen Xie et.al. | 2503.19713 | link |
| 2025-03-25 | StableGS: A Floater-Free Framework for 3D Gaussian Splatting | Luchao Wang et.al. | 2503.18458 | null |
| 2025-03-24 | PDDM: Pseudo Depth Diffusion Model for RGB-PD Semantic Segmentation Based in Complex Indoor Scenes | Xinhua Xu et.al. | 2503.18393 | null |
| 2025-03-23 | Co-SemDepth: Fast Joint Semantic Segmentation and Depth Estimation on Aerial Images | Yara AlaaEldin et.al. | 2503.17982 | link |
| 2025-03-21 | Radar-Guided Polynomial Fitting for Metric Depth Estimation | Patrick Rim et.al. | 2503.17182 | null |
| 2025-03-21 | AnimatePainter: A Self-Supervised Rendering Framework for Reconstructing Painting Process | Junjie Hu et.al. | 2503.17029 | null |
| 2025-03-21 | Distilling Monocular Foundation Model for Fine-grained Depth Completion | Yingping Liang et.al. | 2503.16970 | null |
| 2025-03-20 | QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge | Xuan Shen et.al. | 2503.16709 | link |
| 2025-03-20 | A Recipe for Generating 3D Worlds From a Single Image | Katja Schwarz et.al. | 2503.16611 | null |
| 2025-03-20 | Learning to Efficiently Adapt Foundation Models for Self-Supervised Endoscopic 3D Scene Reconstruction from Any Cameras | Beilei Cui et.al. | 2503.15917 | null |
| 2025-03-20 | Jasmine: Harnessing Diffusion Prior for Self-supervised Depth Estimation | Jiyuan Wang et.al. | 2503.15905 | null |
| 2025-03-19 | TULIP: Towards Unified Language-Image Pretraining | Zineng Tang et.al. | 2503.15485 | null |
| 2025-03-19 | EgoDTM: Towards 3D-Aware Egocentric Video-Language Pretraining | Boshen Xu et.al. | 2503.15470 | null |
| 2025-03-19 | USAM-Net: A U-Net-based Network for Improved Stereo Correspondence and Scene Depth Estimation using Features from a Pre-trained Image Segmentation network | Joseph Emmanuel DL Dayo et.al. | 2503.14950 | null |
| 2025-03-18 | Multi-view Reconstruction via SfM-guided Monocular Depth Estimation | Haoyu Guo et.al. | 2503.14483 | null |
| 2025-03-18 | DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers | Mert Bulent Sariyildiz et.al. | 2503.14405 | null |
| 2025-03-18 | 3D Densification for Multi-Map Monocular VSLAM in Endoscopy | X. Anadón et.al. | 2503.14346 | null |
| 2025-03-17 | MonoCT: Overcoming Monocular 3D Detection Domain Shift with Consistent Teacher Models | Johannes Meier et.al. | 2503.13743 | null |
| 2025-03-17 | Improving Geometric Consistency for 360-Degree Neural Radiance Fields in Indoor Scenarios | Iryna Repinetska et.al. | 2503.13710 | null |
| 2025-03-19 | FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis | Luxi Chen et.al. | 2503.13265 | null |
| 2025-03-17 | MM-Spatial: Exploring 3D Spatial Understanding in Multimodal LLMs | Erik Daxberger et.al. | 2503.13111 | null |
| 2025-03-17 | TransDiff: Diffusion-Based Method for Manipulating Transparent Objects Using a Single RGB-D Image | Haoxiao Wang et.al. | 2503.12779 | null |
| 2025-03-16 | UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing | Tsu-Jui Fu et.al. | 2503.12652 | null |
| 2025-03-16 | Deblur Gaussian Splatting SLAM | Francesco Girlanda et.al. | 2503.12572 | null |
| 2025-03-14 | VGGT: Visual Geometry Grounded Transformer | Jianyuan Wang et.al. | 2503.11651 | null |
| 2025-03-14 | Seeing and Seeing Through the Glass: Real and Synthetic Data for Multi-Layer Depth Estimation | Hongyu Wen et.al. | 2503.11633 | null |
| 2025-03-14 | Simulating Dual-Pixel Images From Ray Tracing For Depth Estimation | Fengchen He et.al. | 2503.11213 | null |
| 2025-03-13 | Flow-NeRF: Joint Learning of Geometry, Poses, and Dense Flow within Unified Neural Representations | Xunzhi Zheng et.al. | 2503.10464 | null |
| 2025-03-15 | WonderVerse: Extendable 3D Scene Generation with Video Generative Models | Hao Feng et.al. | 2503.09160 | null |
| 2025-03-11 | Language-Depth Navigated Thermal and Visible Image Fusion | Jinchang Zhang et.al. | 2503.08676 | null |
| 2025-03-11 | CL-MVSNet: Unsupervised Multi-view Stereo with Dual-level Contrastive Learning | Kaiqiang Xiong et.al. | 2503.08219 | null |
| 2025-03-10 | SIRE: SE(3) Intrinsic Rigidity Embeddings | Cameron Smith et.al. | 2503.07739 | null |
| 2025-03-10 | LBM: Latent Bridge Matching for Fast Image-to-Image Translation | Clément Chadebec et.al. | 2503.07535 | link |
| 2025-03-12 | Endo-FASt3r: Endoscopic Foundation model Adaptation for Structure from motion | Mona Sheikh Zeinoddin et.al. | 2503.07204 | null |
| 2025-03-11 | LightMotion: A Light and Tuning-free Method for Simulating Camera Motion in Video Generation | Quanjian Song et.al. | 2503.06508 | null |
| 2025-03-08 | Towards Ambiguity-Free Spatial Foundation Model: Rethinking and Decoupling Depth Ambiguity | Xiaohao Xu et.al. | 2503.06014 | link |
| 2025-03-07 | TomatoScanner: phenotyping tomato fruit based on only RGB image | Xiaobei Zhao et.al. | 2503.05568 | null |
| 2025-03-07 | Persistent Object Gaussian Splat (POGS) for Tracking Human and Robot Manipulation of Irregularly Shaped Objects | Justin Yu et.al. | 2503.05189 | null |
| 2025-03-05 | RTFusion: A depth estimation network based on multimodal fusion in challenging scenarios | Zelin Meng et.al. | 2503.04821 | null |
| 2025-03-06 | A Novel Solution for Drone Photogrammetry with Low-overlap Aerial Images using Monocular Depth Estimation | Jiageng Zhong et.al. | 2503.04513 | null |
| 2025-03-08 | EvidMTL: Evidential Multi-Task Learning for Uncertainty-Aware Semantic Surface Mapping from Monocular RGB Images | Rohit Menon et.al. | 2503.04441 | null |
| 2025-03-06 | H3O: Hyper-Efficient 3D Occupancy Prediction with Heterogeneous Supervision | Yunxiao Shi et.al. | 2503.04059 | null |
| 2025-03-05 | Task-Agnostic Attacks Against Vision Foundation Models | Brian Pulfer et.al. | 2503.03842 | null |
| 2025-03-05 | Multi-View Depth Consistent Image Generation Using Generative AI Models: Application on Architectural Design of University Buildings | Xusheng Du et.al. | 2503.03068 | null |
| 2025-03-04 | RGBSQGrasp: Inferring Local Superquadric Primitives from Single RGB Image for Graspability-Aware Bin Picking | Yifeng Xu et.al. | 2503.02387 | null |
| 2025-03-03 | MUSt3R: Multi-view Network for Stereo 3D Reconstruction | Yohann Cabon et.al. | 2503.01661 | null |
| 2025-03-02 | Bridging Spectral-wise and Multi-spectral Depth Estimation via Geometry-guided Contrastive Learning | Ukcheol Shin et.al. | 2503.00793 | null |
| 2025-02-28 | EndoPBR: Material and Lighting Estimation for Photorealistic Surgical Simulations via Physically-based Rendering | John J. Han et.al. | 2502.20669 | null |
| 2025-02-27 | UniDepthV2: Universal Monocular Metric Depth Estimation Made Simpler | Luigi Piccinelli et.al. | 2502.20110 | link |
| 2025-02-26 | Stellar Models Also Limit Exoplanet Atmosphere Studies in Emission | Thomas J. Fauchez et.al. | 2502.19585 | null |
| 2025-02-26 | Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator | Xiankang He et.al. | 2502.19204 | link |
| 2025-02-26 | SLAM in the Dark: Self-Supervised Learning of Pose, Depth and Loop-Closure from Thermal Images | Yangfan Xu et.al. | 2502.18932 | null |
| 2025-02-21 | RGB-Only Gaussian Splatting SLAM for Unbounded Outdoor Scenes | Sicheng Yu et.al. | 2502.15633 | null |
| 2025-02-20 | CDGS: Confidence-Aware Depth Regularization for 3D Gaussian Splatting | Qilin Zhang et.al. | 2502.14684 | link |
| 2025-03-03 | Monocular Depth Estimation and Segmentation for Transparent Object with Iterative Semantic and Geometric Fusion | Jiangyuan Liu et.al. | 2502.14616 | link |
| 2025-02-20 | Self-supervised Monocular Depth Estimation Robust to Reflective Surface Leveraged by Triplet Mining | Wonhyeok Choi et.al. | 2502.14573 | null |
| 2025-02-20 | OrchardDepth: Precise Metric Depth Estimation of Orchard Scene from Monocular Camera Images | Zhichao Zheng et.al. | 2502.14279 | null |
| 2025-02-18 | Pre-training Auto-regressive Robotic Models with 4D Representations | Dantong Niu et.al. | 2502.13142 | null |
| 2025-02-18 | SHADeS: Self-supervised Monocular Depth Estimation Through Non-Lambertian Image Decomposition | Rema Daher et.al. | 2502.12994 | null |
| 2025-02-17 | Deep Neural Networks for Accurate Depth Estimation with Latent Space Features | Siddiqui Muhammad Yasir et.al. | 2502.11777 | null |
| 2025-02-16 | Adjust Your Focus: Defocus Deblurring From Dual-Pixel Images Using Explicit Multi-Scale Cross-Correlation | Kunal Swami et.al. | 2502.11002 | null |
| 2025-02-14 | RealCam-I2V: Real-World Image-to-Video Generation with Interactive Complex Camera Control | Teng Li et.al. | 2502.10059 | null |
| 2025-02-13 | SteROI-D: System Design and Mapping for Stereo Depth Inference on Regions of Interest | Jack Erhardt et.al. | 2502.09528 | null |
| 2025-02-17 | S$^2$-Diffusion: Generalizing from Instance-level to Category-level Skills in Robot Manipulation | Quantao Yang et.al. | 2502.09389 | null |
| 2025-02-13 | CoL3D: Collaborative Learning of Single-view Depth and Camera Intrinsics for Metric 3D Shape Recovery | Chenghao Zhang et.al. | 2502.08902 | null |
| 2025-02-13 | Visual-based spatial audio generation system for multi-speaker environments | Xiaojing Liu et.al. | 2502.07538 | null |
| 2025-02-11 | Learning Inverse Laplacian Pyramid for Progressive Depth Completion | Kun Wang et.al. | 2502.07289 | null |
| 2025-02-10 | From Image to Video: An Empirical Study of Diffusion Representations | Pedro Vélez et.al. | 2502.07001 | null |
| 2025-02-09 | Revisiting Gradient-based Uncertainty for Monocular Depth Estimation | Julia Hornauer et.al. | 2502.05964 | null |
| 2025-02-09 | SphereFusion: Efficient Panorama Depth Estimation via Gated Fusion | Qingsong Yan et.al. | 2502.05859 | null |
| 2025-02-05 | MetaFE-DE: Learning Meta Feature Embedding for Depth Estimation from Monocular Endoscopic Images | Dawei Lu et.al. | 2502.03493 | null |
| 2025-02-04 | DOC-Depth: A novel approach for dense depth ground truth generation | Simon de Moreau et.al. | 2502.02144 | null |
| 2025-02-01 | Leveraging Stable Diffusion for Monocular Depth Estimation via Image Semantic Encoding | Jingming Xia et.al. | 2502.01666 | null |
| 2025-02-01 | Exploring Representation-Aligned Latent Space for Better Generation | Wanghan Xu et.al. | 2502.00359 | null |
| 2025-02-01 | MonoDINO-DETR: Depth-Enhanced Monocular 3D Object Detection Using a Vision Foundation Model | Jihyeok Kim et.al. | 2502.00315 | null |
| 2025-01-30 | Zero-Shot Novel View and Depth Synthesis with Multi-View Geometric Diffusion | Vitor Guizilini et.al. | 2501.18804 | null |
| 2025-01-25 | Snapshot Compressed Imaging Based Single-Measurement Computer Vision for Videos | Fengpu Pan et.al. | 2501.15122 | null |
| 2025-01-24 | Rethinking Encoder-Decoder Flow Through Shared Structures | Frederik Laboyrie et.al. | 2501.14535 | null |
| 2025-01-23 | IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models | Jiayi Lei et.al. | 2501.13920 | null |
| 2025-01-23 | PromptMono: Cross Prompting Attention for Self-Supervised Monocular Depth Estimation in Challenging Environments | Changhao Wang et.al. | 2501.13796 | null |
| 2025-01-22 | Enhancing Monocular Depth Estimation with Multi-Source Auxiliary Tasks | Alessio Quercia et.al. | 2501.12824 | null |
| 2025-01-22 | Video Depth Anything: Consistent Depth Estimation for Super-Long Videos | Sili Chen et.al. | 2501.12375 | null |
| 2025-01-21 | Fast Underwater Scene Reconstruction using Multi-View Stereo and Physical Imaging | Shuyi Hu et.al. | 2501.11884 | null |
| 2025-01-21 | Survey on Monocular Metric Depth Estimation | Jiuling Zhang et.al. | 2501.11841 | null |
| 2025-01-19 | RDG-GS: Relative Depth Guidance with Gaussian Splatting for Real-time Sparse-View 3D Rendering | Chenlu Zhan et.al. | 2501.11102 | null |
| 2025-01-15 | BloomScene: Lightweight Structured 3D Gaussian Splatting for Crossmodal Scene Generation | Xiaolu Hou et.al. | 2501.10462 | null |
| 2025-01-20 | Zero-Shot Monocular Scene Flow Estimation in the Wild | Yiqing Liang et.al. | 2501.10357 | null |
| 2025-01-17 | One-D-Piece: Image Tokenizer Meets Quality-Controllable Compression | Keita Miwa et.al. | 2501.10064 | null |
| 2025-01-17 | Multi-Modal Attention Networks for Enhanced Segmentation and Depth Estimation of Subsurface Defects in Pulse Thermography | Mohammed Salah et.al. | 2501.09994 | link |
| 2025-01-21 | FoundationStereo: Zero-Shot Stereo Matching | Bowen Wen et.al. | 2501.09898 | null |
| 2025-01-16 | DEFOM-Stereo: Depth Foundation Model Based Stereo Matching | Hualie Jiang et.al. | 2501.09466 | link |
| 2025-01-15 | StereoGen: High-quality Stereo Image Generation from a Single Image | Xianqi Wang et.al. | 2501.08654 | null |
| 2025-01-15 | MonSter: Marry Monodepth to Stereo Unleashes Power | Junda Cheng et.al. | 2501.08643 | link |
| 2025-01-14 | A Critical Synthesis of Uncertainty Quantification and Foundation Models in Monocular Depth Estimation | Steven Landgraf et.al. | 2501.08188 | null |
| 2025-01-14 | Revisiting Birds Eye View Perception Models with Frozen Foundation Models: DINOv2 and Metric3Dv2 | Seamie Hayes et.al. | 2501.08118 | null |
| 2025-01-13 | Matching Free Depth Recovery from Structured Light | Zhuohang Yu et.al. | 2501.07113 | null |
| 2025-01-09 | Relative Pose Estimation through Affine Corrections of Monocular Depth Priors | Yifan Yu et.al. | 2501.05446 | link |
| 2025-01-09 | $DPF^*$: improved Depth Potential Function for scale-invariant sulcal depth estimation | Maxime Dieudonné et.al. | 2501.05436 | link |
| 2025-01-09 | A Systematic Literature Review on Deep Learning-based Depth Estimation in Computer Vision | Ali Rohan et.al. | 2501.05147 | null |
| 2025-01-07 | AuxDepthNet: Real-Time Monocular 3D Object Detection with Depth-Sensitive Features | Ruochen Zhang et.al. | 2501.03700 | null |
| 2025-01-05 | DepthMaster: Taming Diffusion Models for Monocular Depth Estimation | Ziyang Song et.al. | 2501.02576 | link |
| 2025-01-05 | Depth Any Camera: Zero-Shot Metric Depth Estimation from Any Camera | Yuliang Guo et.al. | 2501.02464 | null |
| 2025-01-03 | SafeAug: Safety-Critical Driving Data Augmentation from Naturalistic Datasets | Zhaobin Mo et.al. | 2501.02143 | null |
| 2025-01-03 | Laparoscopic Scene Analysis for Intraoperative Visualisation of Gamma Probe Signals in Minimally Invasive Cancer Surgery | Baoru Huang et.al. | 2501.01752 | null |
| 2025-01-03 | IGAF: Incremental Guided Attention Fusion for Depth Super-Resolution | Athanasios Tragakis et.al. | 2501.01723 | null |
| 2024-12-31 | Tech Report: Divide and Conquer 3D Real-Time Reconstruction for Improved IGS | Yicheng Zhu et.al. | 2501.01465 | null |
| 2025-01-02 | TexAVi: Generating Stereoscopic VR Video Clips from Text Descriptions | Vriksha Srihari et.al. | 2501.01156 | null |
| 2025-01-02 | PatchRefiner V2: Fast and Lightweight Real-Domain High-Resolution Metric Depth Estimation | Zhenyu Li et.al. | 2501.01121 | null |
| 2024-12-30 | FPGA-based Acceleration of Neural Network for Image Classification using Vitis AI | Zhengdong Li et.al. | 2412.20974 | null |
| 2024-12-29 | MetricDepth: Enhancing Monocular Depth Estimation with Deep Metric Learning | Chunpu Liu et.al. | 2412.20390 | null |
| 2024-12-28 | Multi-Modality Driven LoRA for Adverse Condition Depth Estimation | Guanglei Yang et.al. | 2412.20162 | null |
| 2024-12-28 | DepthMamba with Adaptive Fusion | Zelin Meng et.al. | 2412.19964 | null |
| 2024-12-26 | An End-to-End Depth-Based Pipeline for Selfie Image Rectification | Ahmed Alhawwary et.al. | 2412.19189 | null |
| 2024-12-26 | Revisiting Monocular 3D Object Detection from Scene-Level Depth Retargeting to Instance-Level Spatial Refinement | Qiude Zhang et.al. | 2412.19165 | null |
| 2024-12-26 | MVS-GS: High-Quality 3D Gaussian Splatting Mapping via Online Multi-View Stereo | Byeonggwon Lee et.al. | 2412.19130 | null |
| 2024-12-26 | Learning Monocular Depth from Events via Egomotion Compensation | Haitao Meng et.al. | 2412.19067 | null |
| 2024-12-24 | RSGaussian: 3D Gaussian Splatting with LiDAR for Aerial Remote Sensing Novel View Synthesis | Yiling Yao et.al. | 2412.18380 | null |
| 2024-12-27 | LiRCDepth: Lightweight Radar-Camera Depth Estimation via Knowledge Distillation and Uncertainty Guidance | Huawei Sun et.al. | 2412.16380 | link |
| 2024-12-19 | Flowing from Words to Pixels: A Framework for Cross-Modality Evolution | Qihao Liu et.al. | 2412.15213 | null |
| 2024-12-19 | Scaling 4D Representations | João Carreira et.al. | 2412.15212 | null |
| 2024-12-18 | Foundation Models Meet Low-Cost Sensors: Test-Time Adaptation for Rescaling Disparity for Zero-Shot Metric Depth Estimation | Rémi Marsal et.al. | 2412.14103 | null |
| 2024-12-18 | Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation | Haotong Lin et.al. | 2412.14015 | null |
| 2024-12-18 | Marigold-DC: Zero-Shot Monocular Depth Completion with Guided Diffusion | Massimiliano Viola et.al. | 2412.13389 | null |
| 2024-12-18 | Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera | Zhengdi Yu et.al. | 2412.12861 | null |
| 2024-12-17 | PromptDet: A Lightweight 3D Object Detection Framework with LiDAR Prompts | Kun Guo et.al. | 2412.12460 | null |
| 2024-12-16 | V-MIND: Building Versatile Monocular Indoor 3D Detector with Diverse 2D Annotations | Jin-Cheng Jhang et.al. | 2412.11412 | null |
| 2024-12-16 | Depth-Centric Dehazing and Depth-Estimation from Real-World Hazy Driving Video | Junkai Fan et.al. | 2412.11395 | null |
| 2024-12-15 | ViPOcc: Leveraging Visual Priors from Vision Foundation Models for Single-View 3D Occupancy Prediction | Yi Feng et.al. | 2412.11210 | link |
| 2024-12-14 | MAL: Cluster-Masked and Multi-Task Pretraining for Enhanced xLSTM Vision Performance | Wenjun Huang et.al. | 2412.10730 | null |
| 2024-12-12 | Stereo4D: Learning How Things Move in 3D from Internet Stereo Videos | Linyi Jin et.al. | 2412.09621 | null |
| 2024-12-12 | T-SVG: Text-Driven Stereoscopic Video Generation | Qiao Jin et.al. | 2412.09323 | null |
| 2024-12-12 | Cross-View Completion Models are Zero-shot Correspondence Estimators | Honggyu An et.al. | 2412.09072 | null |
| 2024-12-11 | BLADE: Single-view Body Mesh Learning through Accurate Depth Estimation | Shengze Wang et.al. | 2412.08640 | null |
| 2024-12-13 | Utilizing Multi-step Loss for Single Image Reflection Removal | Abdelrahman Elnenaey et.al. | 2412.08582 | link |
| 2024-12-11 | Dense Depth from Event Focal Stack | Kenta Horikawa et.al. | 2412.08120 | null |
| 2024-12-10 | Diffusion-Based Attention Warping for Consistent 3D Scene Editing | Eyal Gomel et.al. | 2412.07984 | null |
| 2024-12-10 | Balancing Shared and Task-Specific Representations: A Hybrid Approach to Depth-Aware Video Panoptic Segmentation | Kurt H. W. Stolle et.al. | 2412.07966 | null |
| 2024-12-09 | SphereUFormer: A U-Shaped Transformer for Spherical 360 Perception | Yaniv Benny et.al. | 2412.06968 | null |
| 2024-12-09 | Driv3R: Learning Dense 4D Reconstruction for Autonomous Driving | Xin Fei et.al. | 2412.06777 | link |
| 2024-12-09 | MAtCha Gaussians: Atlas of Charts for High-Quality Geometry and Photorealism From Sparse Views | Antoine Guédon et.al. | 2412.06767 | null |
| 2024-12-09 | On-Device Self-Supervised Learning of Low-Latency Monocular Depth from Only Events | Jesse Hagenaars et.al. | 2412.06359 | null |
| 2024-12-09 | Omni-Scene: Omni-Gaussian Representation for Ego-Centric Sparse-View Scene Reconstruction | Dongxu Wei et.al. | 2412.06273 | null |
| 2024-12-09 | Event fields: Capturing light fields at high speed, resolution, and dynamic range | Ziyuan Qu et.al. | 2412.06191 | null |
| 2024-12-08 | GVDepth: Zero-Shot Monocular Depth Estimation for Ground Vehicles based on Probabilistic Cue Fusion | Karlo Koledic et.al. | 2412.06080 | null |
| 2024-12-08 | Prism: Semi-Supervised Multi-View Stereo with Monocular Structure Priors | Alex Rich et.al. | 2412.05771 | null |
| 2024-12-10 | TACO: Learning Multi-modal Action Models with Synthetic Chains-of-Thought-and-Action | Zixian Ma et.al. | 2412.05479 | null |
| 2024-12-06 | SimC3D: A Simple Contrastive 3D Pretraining Framework Using RGB Images | Jiahua Dong et.al. | 2412.05274 | null |
| 2024-12-06 | Penetrative rotating magnetoconvection subject to lateral variations in temperature gradients | Tirtharaj Barman et.al. | 2412.05235 | null |
| 2024-12-06 | PanoDreamer: 3D Panorama Synthesis from a Single Image | Avinash Paliwal et.al. | 2412.04827 | link |
| 2024-12-05 | LAA-Net: A Physical-prior-knowledge Based Network for Robust Nighttime Depth Estimation | Kebin Peng et.al. | 2412.04666 | null |
| 2024-12-05 | MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos | Zhengqi Li et.al. | 2412.04463 | null |
| 2024-12-05 | MT3DNet: Multi-Task learning Network for 3D Surgical Scene Reconstruction | Mithun Parab et.al. | 2412.03928 | null |
| 2024-12-04 | Perception Tokens Enhance Visual Reasoning in Multimodal Language Models | Mahtab Bigverdi et.al. | 2412.03548 | null |
| 2024-12-04 | Dense Scene Reconstruction from Light-Field Images Affected by Rolling Shutter | Hermes McGriff et.al. | 2412.03518 | null |
| 2024-12-04 | MultiGO: Towards Multi-level Geometry Learning for Monocular 3D Textured Human Reconstruction | Gangjian Zhang et.al. | 2412.03103 | null |
| 2024-12-05 | Align3R: Aligned Monocular Depth Estimation for Dynamic Videos | Jiahao Lu et.al. | 2412.03079 | null |
| 2024-12-03 | Single-Shot Metric Depth from Focused Plenoptic Cameras | Blanca Lasheras-Hernandez et.al. | 2412.02386 | null |
| 2024-12-03 | Dual Exposure Stereo for Extended Dynamic Range 3D Imaging | Juhyung Choi et.al. | 2412.02351 | null |
| 2024-12-03 | Amodal Depth Anything: Amodal Depth Estimation in the Wild | Zhenyu Li et.al. | 2412.02336 | null |
| 2024-12-03 | GSGTrack: Gaussian Splatting-Guided Object Pose Tracking from RGB Videos | Zhiyuan Chen et.al. | 2412.02267 | null |
| 2024-12-03 | FoveaSPAD: Exploiting Depth Priors for Adaptive and Efficient Single-Photon 3D Imaging | Justin Folden et.al. | 2412.02052 | null |
| 2024-12-02 | Mutli-View 3D Reconstruction using Knowledge Distillation | Aditya Dutt et.al. | 2412.02039 | link |
| 2024-12-02 | AVS-Net: Audio-Visual Scale Net for Self-supervised Monocular Metric Depth Estimation | Xiaohu Liu et.al. | 2412.01637 | null |
| 2024-12-02 | STATIC: Surface Temporal Affine for TIme Consistency in Video Monocular Depth Estimation | Sunghun Yang et.al. | 2412.01090 | null |
| 2024-12-01 | FiffDepth: Feed-forward Transformation of Diffusion-Based Generators for Detailed Depth Estimation | Yunpeng Bai et.al. | 2412.00671 | null |
| 2024-11-29 | SpaRC: Sparse Radar-Camera Fusion for 3D Object Detection | Philipp Wolters et.al. | 2411.19860 | null |
| 2024-11-29 | MonoPP: Metric-Scaled Self-Supervised Monocular Depth Estimation by Planar-Parallax Geometry in Automotive Applications | Gasser Elazab et.al. | 2411.19717 | null |
| 2024-11-29 | Gaussian Splashing: Direct Volumetric Rendering Underwater | Nir Mualem et.al. | 2411.19588 | null |
| 2024-11-28 | Learning Surrogate Rainfall-driven Inundation Models with Few Data | Marzieh Alireza Mirhoseini et.al. | 2411.19323 | null |
| 2024-11-28 | AGS-Mesh: Adaptive Gaussian Splatting and Meshing with Geometric Priors for Indoor Room Reconstruction Using Smartphones | Xuqian Ren et.al. | 2411.19271 | null |
| 2024-11-28 | Video Depth without Video Models | Bingxin Ke et.al. | 2411.19189 | null |
| 2024-11-28 | 360Recon: An Accurate Reconstruction Method Based on Depth Fusion from 360 Images | Zhongmiao Yan et.al. | 2411.19102 | null |
| 2024-11-27 | Helvipad: A Real-World Dataset for Omnidirectional Stereo Depth Estimation | Mehdi Zayene et.al. | 2411.18335 | link |
| 2024-11-27 | GAPartManip: A Large-scale Part-centric Dataset for Material-Agnostic Articulated Object Manipulation | Wenbo Cui et.al. | 2411.18276 | null |
| 2024-11-27 | SharpDepth: Sharpening Metric Depth Predictions Using Diffusion Distillation | Duc-Hai Pham et.al. | 2411.18229 | null |
| 2024-11-26 | Low-rank Adaptation-based All-Weather Removal for Autonomous Navigation | Sudarshan Rajagopalan et.al. | 2411.17814 | null |
| 2024-11-26 | Spatially Visual Perception for End-to-End Robotic Learning | Travis Davies et.al. | 2411.17458 | null |
| 2024-11-26 | DepthCues: Evaluating Monocular Depth Perception in Large Vision Models | Duolikun Danier et.al. | 2411.17385 | null |
| 2024-11-26 | Boost 3D Reconstruction using Diffusion-based Monocular Camera Calibration | Junyuan Deng et.al. | 2411.17240 | link |
| 2024-11-25 | G2SDF: Surface Reconstruction from Explicit Gaussians with Implicit SDFs | Kunyi Li et.al. | 2411.16898 | null |
| 2024-11-24 | PriorDiffusion: Leverage Language Prior in Diffusion Models for Monocular Depth Estimation | Ziyao Zeng et.al. | 2411.16750 | null |
| 2024-11-25 | Generative Omnimatte: Learning to Decompose Video into Layers | Yao-Chih Lee et.al. | 2411.16683 | null |
| 2024-11-25 | One Diffusion to Generate Them All | Duong H. Le et.al. | 2411.16318 | link |
| 2024-11-24 | Gaussian Scenes: Pose-Free Sparse-View Scene Reconstruction using Depth-Enhanced Diffusion Priors | Soumava Paul et.al. | 2411.15966 | null |
| 2024-11-21 | StereoCrafter-Zero: Zero-Shot Stereo Video Generation with Noisy Restart | Jian Shi et.al. | 2411.14295 | null |
| 2024-11-20 | DATAP-SfM: Dynamic-Aware Tracking Any Point for Robust Structure from Motion in the Wild | Weicai Ye et.al. | 2411.13291 | null |
| 2024-11-20 | OceanLens: An Adaptive Backscatter and Edge Correction using Deep Learning Model for Enhanced Underwater Imaging | Rajini Makam et.al. | 2411.13230 | null |
| 2024-11-15 | SPARS3R: Semantic Prior Alignment and Regularization for Sparse 3D Reconstruction | Yutao Tang et.al. | 2411.12592 | link |
| 2024-11-18 | Towards Degradation-Robust Reconstruction in Generalizable NeRF | Chan Ho Park et.al. | 2411.11691 | null |
| 2024-11-18 | MGNiceNet: Unified Monocular Geometric Scene Understanding | Markus Schön et.al. | 2411.11466 | null |
| 2024-11-18 | The ADUULM-360 Dataset – A Multi-Modal Dataset for Depth Estimation in Adverse Weather | Markus Schön et.al. | 2411.11455 | null |
| 2024-11-18 | GPS-Gaussian+: Generalizable Pixel-wise 3D Gaussian Splatting for Real-Time Human-Scene Rendering from Sparse Views | Boyao Zhou et.al. | 2411.11363 | null |
| 2024-11-18 | Scalable Autoregressive Monocular Depth Estimation | Jinhong Wang et.al. | 2411.11361 | null |
| 2024-11-16 | MetricGold: Leveraging Text-To-Image Latent Diffusion Models for Metric Depth Estimation | Ansh Shah et.al. | 2411.10886 | link |
| 2024-11-19 | EVT: Efficient View Transformation for Multi-Modal 3D Object Detection | Yongjin Lee et.al. | 2411.10715 | null |
| 2024-11-15 | Efficient Depth Estimation for Unstable Stereo Camera Systems on AR Glasses | Yongfan Liu et.al. | 2411.10013 | null |
| 2024-11-14 | Architect: Generating Vivid and Interactive 3D Scenes with Hierarchical 2D Inpainting | Yian Wang et.al. | 2411.09823 | null |
| 2024-11-14 | Adversarial Attacks Using Differentiable Rendering: A Survey | Matthew Hull et.al. | 2411.09749 | null |
| 2024-11-14 | Mono2Stereo: Monocular Knowledge Transfer for Enhanced Stereo Matching | Yuran Wang et.al. | 2411.09151 | null |
| 2024-11-13 | OSMLoc: Single Image-Based Visual Localization in OpenStreetMap with Geometric and Semantic Guidances | Youqi Liao et.al. | 2411.08665 | null |
| 2024-11-13 | Scaling Properties of Diffusion Models for Perceptual Tasks | Rahul Ravishankar et.al. | 2411.08034 | null |
| 2024-11-11 | $SE(3)$ Equivariant Ray Embeddings for Implicit Multi-View Depth Estimation | Yinshuang Xu et.al. | 2411.07326 | null |
| 2024-11-08 | Enhancing Depth Image Estimation for Underwater Robots by Combining Image Processing and Machine Learning | Quang Truong Nguyen et.al. | 2411.05344 | null |
| 2024-11-08 | SimpleBEV: Improved LiDAR-Camera Fusion Architecture for 3D Object Detection | Yun Zhao et.al. | 2411.05292 | null |
| 2024-11-07 | D$^3$epth: Self-Supervised Depth Estimation with Dynamic Mask in Dynamic Scenes | Siyu Chen et.al. | 2411.04826 | null |
| 2024-11-06 | Revisiting Disparity from Dual-Pixel Images: Physics-Informed Lightweight Depth Estimation | Teppei Kurita et.al. | 2411.04714 | null |
| 2024-11-07 | Enhancing Bronchoscopy Depth Estimation through Synthetic-to-Real Domain Adaptation | Qingyao Tian et.al. | 2411.04404 | null |
| 2024-11-04 | PMPNet: Pixel Movement Prediction Network for Monocular Depth Estimation in Dynamic Scenes | Kebin Peng et.al. | 2411.04227 | null |
| 2024-11-06 | Adaptive Stereo Depth Estimation with Multi-Spectral Images Across All Lighting Conditions | Zihan Qin et.al. | 2411.03638 | null |
| 2024-11-05 | Monocular Event-Based Vision for Obstacle Avoidance with a Quadrotor | Anish Bhattacharya et.al. | 2411.03303 | null |
| 2024-11-05 | Correlation of Object Detection Performance with Visual Saliency and Depth Estimation | Matthias Bartolo et.al. | 2411.02844 | link |
| 2024-11-05 | FewViewGS: Gaussian Splatting with Few View Matching and Multi-stage Training | Ruihong Yin et.al. | 2411.02229 | null |
| 2024-11-05 | Improving Domain Generalization in Self-supervised Monocular Depth Estimation via Stabilized Adversarial Training | Yuanqi Yao et.al. | 2411.02149 | null |
| 2024-11-01 | MultiDepth: Multi-Sample Priors for Refining Monocular Metric Depth Estimations in Indoor Scenes | Sanghyun Byun et.al. | 2411.01048 | null |
| 2024-11-01 | On Deep Learning for Geometric and Semantic Scene Understanding Using On-Vehicle 3D LiDAR | Li Li et.al. | 2411.00600 | link |
| 2024-10-31 | Optical Lens Attack on Monocular Depth Estimation for Autonomous Driving | Ce Zhou et.al. | 2411.00192 | null |
| 2024-10-31 | ImOV3D: Learning Open-Vocabulary Point Clouds 3D Object Detection from Only 2D Images | Timing Yang et.al. | 2410.24001 | link |
| 2024-10-30 | Nested ResNet: A Vision-Based Method for Detecting the Sensing Area of a Drop-in Gamma Probe | Songyu Xu et.al. | 2410.23154 | null |
| 2024-10-29 | Active Event Alignment for Monocular Distance Estimation | Nan Cai et.al. | 2410.22280 | null |
| 2024-10-29 | PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting | Sunghwan Hong et.al. | 2410.22128 | link |
| 2024-10-27 | Unlocking Comics: The AI4VA Dataset for Visual Understanding | Peter Grönquist et.al. | 2410.20459 | link |
| 2024-10-27 | Depth Attention for Robust RGB Tracking | Yu Liu et.al. | 2410.20395 | link |
| 2024-10-21 | YOLO11 and Vision Transformers based 3D Pose Estimation of Immature Green Fruits in Commercial Apple Orchards for Robotic Thinning | Ranjan Sapkota et.al. | 2410.19846 | null |
| 2024-10-25 | MonoDGP: Monocular 3D Object Detection with Decoupled-Query and Geometry-Error Priors | Fanqi Pu et.al. | 2410.19590 | null |
| 2024-10-24 | Segmentation-aware Prior Assisted Joint Global Information Aggregated 3D Building Reconstruction | Hongxin Peng et.al. | 2410.18433 | null |
| 2024-10-24 | Thermal Chameleon: Task-Adaptive Tone-mapping for Radiometric Thermal-Infrared images | Dong-Guw Lee et.al. | 2410.18340 | link |
| 2024-10-25 | UnCLe: Unsupervised Continual Learning of Depth Completion | Suchisrit Gangopadhyay et.al. | 2410.18074 | null |
| 2024-10-21 | TIPS: Text-Image Pretraining with Spatial Awareness | Kevis-Kokitsi Maninis et.al. | 2410.16512 | null |
| 2024-10-22 | DCDepth: Progressive Monocular Depth Estimation in Discrete Cosine Domain | Kun Wang et.al. | 2410.14980 | link |
| 2024-10-17 | DepthSplat: Connecting Gaussian Splatting and Depth | Haofei Xu et.al. | 2410.13862 | link |
| 2024-10-16 | DH-VTON: Deep Text-Driven Virtual Try-On via Hybrid Attention Learning | Jiabao Wei et.al. | 2410.12501 | null |
| 2024-10-16 | Depth Estimation From Monocular Images With Enhanced Encoder-Decoder Architecture | Dabbrata Das et.al. | 2410.11610 | null |
| 2024-10-16 | CVCP-Fusion: On Implicit Depth Estimation for 3D Bounding Box Prediction | Pranav Gupta et.al. | 2410.11211 | link |
| 2024-10-14 | When Does Perceptual Alignment Benefit Vision Representations? | Shobhita Sundaram et.al. | 2410.10817 | null |
| 2024-10-14 | Depth Any Video with Scalable Synthetic Data | Honghui Yang et.al. | 2410.10815 | link |
| 2024-10-15 | Improved Depth Estimation of Bayesian Neural Networks | Bart van Erp et.al. | 2410.10395 | link |
| 2024-10-10 | Color-Guided Flying Pixel Correction in Depth Images | Ekamresh Vasudevan et.al. | 2410.08084 | null |
| 2024-10-09 | Surgical Depth Anything: Depth Estimation for Surgical Scenes using Foundation Models | Ange Lou et.al. | 2410.07434 | null |
| 2024-10-09 | Structure-Centric Robust Monocular Depth Estimation via Knowledge Distillation | Runze Chen et.al. | 2410.06982 | null |
| 2024-10-09 | Analysis of different disparity estimation techniques on aerial stereo image datasets | Ishan Narayan et.al. | 2410.06711 | null |
| 2024-10-08 | Vision Transformer based Random Walk for Group Re-Identification | Guoqing Zhang et.al. | 2410.05808 | null |
| 2024-10-08 | CUBE360: Learning Cubic Field Representation for Monocular 360 Depth Estimation for Virtual Reality | Wenjie Chang et.al. | 2410.05735 | null |
| 2024-10-07 | PhotoReg: Photometrically Registering 3D Gaussian Splatting Models | Ziwen Yuan et.al. | 2410.05044 | null |
| 2024-10-10 | Hybrid NeRF-Stereo Vision: Pioneering Depth Estimation and 3D Reconstruction in Endoscopy | Pengcheng Chen et.al. | 2410.04041 | null |
| 2024-10-04 | Refinement of Monocular Depth Maps via Multi-View Differentiable Rendering | Laura Fink et.al. | 2410.03861 | null |
| 2024-10-03 | RSA: Resolving Scale Ambiguities in Monocular Depth Estimators through Language Descriptions | Ziyao Zeng et.al. | 2410.02924 | null |
| 2024-10-02 | Depth Pro: Sharp Monocular Metric Depth in Less Than a Second | Aleksei Bochkovskii et.al. | 2410.02073 | link |
| 2024-10-01 | Towards Full-parameter and Parameter-efficient Self-learning For Endoscopic Camera Depth Estimation | Shuting Zhao et.al. | 2410.00979 | null |
| 2024-10-01 | Radar Meets Vision: Robustifying Monocular Metric Depth Prediction for Mobile Robotics | Marco Job et.al. | 2410.00736 | null |
| 2024-10-06 | Drone Stereo Vision for Radiata Pine Branch Detection and Distance Measurement: Utilizing Deep Learning and YOLO Integration | Yida Lin et.al. | 2410.00503 | null |
| 2024-10-01 | Seamless Augmented Reality Integration in Arthroscopy: A Pipeline for Articular Reconstruction and Guidance | Hongchao Shu et.al. | 2410.00386 | null |
| 2024-09-30 | CCDepth: A Lightweight Self-supervised Depth Estimation Network with Enhanced Interpretability | Xi Zhang et.al. | 2409.19933 | null |
| 2024-09-30 | EndoDepth: A Benchmark for Assessing Robustness in Endoscopic Depth Prediction | Ivan Reyes-Amezcua et.al. | 2409.19930 | link |
| 2024-09-29 | fCOP: Focal Length Estimation from Category-level Object Priors | Xinyue Zhang et.al. | 2409.19641 | null |
| 2024-09-29 | KineDepth: Utilizing Robot Kinematics for Online Metric Depth Estimation | Soofiyan Atar et.al. | 2409.19490 | null |
| 2024-09-27 | Speckle-illumination spatial frequency domain imaging with a stereo laparoscope for profile-corrected optical property mapping | Anthony A. Song et.al. | 2409.19153 | null |
| 2024-09-26 | Self-supervised Monocular Depth Estimation with Large Kernel Attention | Xuezhi Xiang et.al. | 2409.17895 | null |
| 2024-09-26 | Self-Distilled Depth Refinement with Noisy Poisson Fusion | Jiaqi Li et.al. | 2409.17880 | null |
| 2024-09-27 | A New Dataset for Monocular Depth Estimation Under Viewpoint Shifts | Aurel Pjetri et.al. | 2409.17851 | null |
| 2024-09-26 | Event-based Stereo Depth Estimation: A Survey | Suman Ghosh et.al. | 2409.17680 | null |
| 2024-09-26 | CAMOT: Camera Angle-aware Multi-Object Tracking | Felix Limanta et.al. | 2409.17533 | null |
| 2024-09-25 | Optical Lens Attack on Deep Learning Based Monocular Depth Estimation | Ce Zhou et.al. | 2409.17376 | null |
| 2024-09-25 | Parameter-efficient Bayesian Neural Networks for Uncertainty-aware Depth Estimation | Richard D. Paul et.al. | 2409.17085 | null |
| 2024-09-25 | EventHDR: from Event to High-Speed HDR Videos and Beyond | Yunhao Zou et.al. | 2409.17029 | null |
| 2024-09-25 | 3DDX: Bone Surface Reconstruction from a Single Standard-Geometry Radiograph via Dual-Face Depth Estimation | Yi Gu et.al. | 2409.16702 | null |
| 2024-09-24 | MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling | Yifang Men et.al. | 2409.16160 | null |
| 2024-09-24 | Benchmarking Robustness of Endoscopic Depth Estimation with Synthetically Corrupted Data | An Wang et.al. | 2409.16063 | link |
| 2024-09-23 | FisheyeDepth: A Real Scale Self-Supervised Depth Estimation Model for Fisheye Camera | Guoyang Zhao et.al. | 2409.15054 | link |
| 2024-09-23 | DepthART: Monocular Depth Estimation as Autoregressive Refinement Task | Bulat Gabdullin et.al. | 2409.15010 | null |
| 2024-09-23 | Generalizing monocular colonoscopy image depth estimation by uncertainty-based global and local fusion network | Sijia Du et.al. | 2409.15006 | null |
| 2024-09-23 | GroCo: Ground Constraint for Metric Self-Supervised Monocular Depth | Aurélien Cecille et.al. | 2409.14850 | null |
| 2024-09-23 | Robust and Flexible Omnidirectional Depth Estimation with Multiple 360° Cameras | Ming Li et.al. | 2409.14766 | null |
| 2024-09-25 | D3RoMa: Disparity Diffusion-based Depth Sensing for Material-Agnostic Robotic Manipulation | Songlin Wei et.al. | 2409.14365 | null |
| 2024-09-21 | @Bench: Benchmarking Vision-Language Models for Human-centered Assistive Technology | Xin Jiang et.al. | 2409.14215 | null |
| 2024-09-20 | High-Resolution Flood Probability Mapping Using Generative Machine Learning with Large-Scale Synthetic Precipitation and Inundation Data | Lipai Huang et.al. | 2409.13936 | null |
| 2024-09-18 | Panoptic-Depth Forecasting | Juana Valeria Hurtado et.al. | 2409.12008 | null |
| 2024-09-17 | Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think | Gonzalo Martin Garcia et.al. | 2409.11355 | link |
| 2024-09-15 | GRIN: Zero-Shot Metric Depth with Pixel-Level Diffusion | Vitor Guizilini et.al. | 2409.09896 | null |
| 2024-09-15 | Towards Single-Lens Controllable Depth-of-Field Imaging via All-in-Focus Aberration Correction and Monocular Depth Estimation | Xiaolong Qian et.al. | 2409.09754 | link |
| 2024-09-13 | PrimeDepth: Efficient Monocular Depth Estimation with a Stable Diffusion Preimage | Denis Zavadski et.al. | 2409.09144 | link |
| 2024-09-23 | Precision Aquaculture: An Integrated Computer Vision and IoT Approach for Optimized Tilapia Feeding | Rania Hossam et.al. | 2409.08695 | link |
| 2024-09-12 | Depth on Demand: Streaming Dense Depth from a Low Frame Rate Active Sensor | Andrea Conti et.al. | 2409.08277 | null |
| 2024-09-12 | LED: Light Enhanced Depth Estimation at Night | Simon de Moreau et.al. | 2409.08031 | link |
| 2024-09-12 | Real-time Multi-view Omnidirectional Depth Estimation System for Robots and Autonomous Driving on Real Scenes | Ming Li et.al. | 2409.07843 | null |
| 2024-09-12 | Advancing Depth Anything Model for Unsupervised Monocular Depth Estimation in Endoscopy | Bojian Li et.al. | 2409.07723 | null |
| 2024-09-12 | FIReStereo: Forest InfraRed Stereo Dataset for UAS Depth Perception in Visually Degraded Environments | Devansh Dhrafani et.al. | 2409.07715 | null |
| 2024-09-10 | Deep Neural Networks: Multi-Classification and Universal Approximation | Martín Hernández et.al. | 2409.06555 | null |
| 2024-09-10 | EDADepth: Enhanced Data Augmentation for Monocular Depth Estimation | Nischal Khanal et.al. | 2409.06183 | link |
| 2024-09-11 | EndoOmni: Zero-Shot Cross-Dataset Depth Estimation in Endoscopy by Robust Self-Learning from Noisy Labels | Qingyao Tian et.al. | 2409.05442 | null |
| 2024-09-09 | Spontaneous magnetic field and disorder effects in BaPtAs$_{1-x}$Sb$_x$ with honeycomb network | T. Adachi et.al. | 2409.05266 | null |
| 2024-09-08 | TanDepth: Leveraging Global DEMs for Metric Monocular Depth Estimation in UAVs | Horatiu Florea et.al. | 2409.05142 | null |
| 2024-09-12 | Introducing a Class-Aware Metric for Monocular Depth Estimation: An Automotive Perspective | Tim Bader et.al. | 2409.04086 | link |
| 2024-09-08 | Estimating Indoor Scene Depth Maps from Ultrasonic Echoes | Junpei Honma et.al. | 2409.03336 | null |
| 2024-09-04 | iConFormer: Dynamic Parameter-Efficient Tuning with Input-Conditioned Adaptation | Hayeon Jo et.al. | 2409.02838 | null |
| 2024-09-02 | GET-UP: GEomeTric-aware Depth Estimation with Radar Points UPsampling | Huawei Sun et.al. | 2409.02720 | null |
| 2024-09-04 | Skip-and-Play: Depth-Driven Pose-Preserved Image Generation for Any Objects | Kyungmin Jo et.al. | 2409.02653 | null |
| 2024-09-04 | UniTT-Stereo: Unified Training of Transformer for Enhanced Stereo Matching | Soomin Kim et.al. | 2409.02545 | null |
| 2024-09-04 | SG-MIM: Structured Knowledge Guided Efficient Pre-training for Dense Prediction | Sumin Son et.al. | 2409.02513 | null |
| 2024-09-04 | Plane2Depth: Hierarchical Adaptive Plane Guidance for Monocular Depth Estimation | Li Liu et.al. | 2409.02494 | null |
| 2024-09-04 | Boosting Generalizability towards Zero-Shot Cross-Dataset Single-Image Indoor Depth by Meta-Initialization | Cho-Ying Wu et.al. | 2409.02486 | null |
| 2024-09-04 | GGS: Generalizable Gaussian Splatting for Lane Switching in Autonomous Driving | Huasong Han et.al. | 2409.02382 | null |
| 2024-09-03 | DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos | Wenbo Hu et.al. | 2409.02095 | null |
| 2024-09-02 | Large Language Models Can Understanding Depth from Monocular Images | Zhongyi Xia et.al. | 2409.01133 | null |
| 2024-08-30 | DARES: Depth Anything in Robotic Endoscopic Surgery with Self-supervised Vector-LoRA of the Foundation Model | Mona Sheikh Zeinoddin et.al. | 2408.17433 | null |
| 2024-08-30 | Enhancing Underwater Imaging with 4-D Light Fields: Dataset and Method | Yuji Lin et.al. | 2408.17339 | null |
| 2024-08-30 | Synthetic Lunar Terrain: A Multimodal Open Dataset for Training and Evaluating Neuromorphic Vision Algorithms | Marcus Märtens et.al. | 2408.16971 | null |
| 2024-08-29 | EvLight++: Low-Light Video Enhancement with an Event Camera: A Large-Scale Real-World Dataset, Novel Method, and More | Kanghao Chen et.al. | 2408.16254 | null |
| 2024-08-30 | Revisiting 360 Depth Estimation with PanoGabor: A New Fusion Perspective | Zhijie Shen et.al. | 2408.16227 | link |
| 2024-08-27 | Adversarial Manhole: Challenging Monocular Depth Estimation and Semantic Segmentation Models with Patch Attack | Naufal Suryanto et.al. | 2408.14879 | null |
| 2024-08-26 | NimbleD: Enhancing Self-supervised Monocular Depth Estimation with Pseudo-labels and Large-scale Video Pre-training | Albert Luginov et.al. | 2408.14177 | null |
| 2024-08-26 | Pixel-Aligned Multi-View Generation with Depth Guided Decoder | Zhenggang Tang et.al. | 2408.14016 | null |
| 2024-08-25 | TranSplat: Generalizable 3D Gaussian Splatting from Sparse Multi-View Images with Transformers | Chuanrui Zhang et.al. | 2408.13770 | null |
| 2024-08-25 | InSpaceType: Dataset and Benchmark for Reconsidering Cross-Space Type Performance in Indoor Monocular Depth | Cho-Ying Wu et.al. | 2408.13708 | null |
| 2024-08-25 | SeeBelow: Sub-dermal 3D Reconstruction of Tumors with Surgical Robotic Palpation and Tactile Exploration | Raghava Uppuluri et.al. | 2408.13699 | null |
| 2024-08-27 | Sapiens: Foundation for Human Vision Models | Rawal Khirodkar et.al. | 2408.12569 | null |
| 2024-08-21 | LiFCal: Online Light Field Camera Calibration via Bundle Adjustment | Aymeric Fleith et.al. | 2408.11682 | null |
| 2024-08-19 | Structure-preserving Image Translation for Depth Estimation in Colonoscopy Video | Shuxian Wang et.al. | 2408.10153 | null |
| 2024-08-19 | SHARP: Segmentation of Hands and Arms by Range using Pseudo-Depth for Enhanced Egocentric 3D Hand Pose Estimation and Action Recognition | Wiktor Mucha et.al. | 2408.10037 | link |
| 2024-08-19 | P3P: Pseudo-3D Pre-training for Scaling 3D Masked Autoencoders | Xuechao Chen et.al. | 2408.10007 | null |
| 2024-08-14 | Enhanced Scale-aware Depth Estimation for Monocular Endoscopic Scenes with Geometric Modeling | Ruofeng Wei et.al. | 2408.07266 | null |
| 2024-08-12 | Towards Robust Monocular Depth Estimation in Non-Lambertian Surfaces | Junrui Zhang et.al. | 2408.06083 | null |
| 2024-08-08 | Depth Any Canopy: Leveraging Depth Foundation Models for Canopy Height Estimation | Daniele Rege Cambrin et.al. | 2408.04523 | link |
| 2024-08-08 | Detecting Car Speed using Object Detection and Depth Estimation: A Deep Learning Framework | Subhasis Dasgupta et.al. | 2408.04360 | null |
| 2024-08-08 | Design and Implementation of Smart Infrastructures and Connected Vehicles in A Mini-city Platform | Daniel Vargas et.al. | 2408.04195 | null |
| 2024-08-07 | Focal Depth Estimation: A Calibration-Free, Subject- and Daytime Invariant Approach | Benedikt W. Hosp et.al. | 2408.03591 | null |
| 2024-08-06 | BodySLAM: A Generalized Monocular Visual SLAM Framework for Surgical Applications | G. Manni et.al. | 2408.03078 | link |
| 2024-08-05 | Gaussian Mixture based Evidential Learning for Stereo Matching | Weide Liu et.al. | 2408.02796 | null |
| 2024-08-05 | Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining | Dongyang Liu et.al. | 2408.02657 | link |
| 2024-08-03 | MCPDepth: Omnidirectional Depth Estimation via Stereo Matching from Multi-Cylindrical Panoramas | Feng Qiao et.al. | 2408.01653 | null |
| 2024-08-02 | Self-Supervised Depth Estimation Based on Camera Models | Jinchang Zhang et.al. | 2408.01565 | null |
| 2024-08-01 | MonoMM: A Multi-scale Mamba-Enhanced Network for Real-time Monocular 3D Object Detection | Youjia Fu et.al. | 2408.00438 | null |
| 2024-08-01 | High-Precision Self-Supervised Monocular Depth Estimation with Rich-Resource Prior | Wencheng Han et.al. | 2408.00361 | null |
| 2024-07-31 | Unifying Event-based Flow, Stereo and Depth Estimation via Feature Similarity Matching | Pengjie Zhang et.al. | 2407.21735 | null |
| 2024-07-29 | BaseBoostDepth: Exploiting Larger Baselines For Self-supervised Monocular Depth Estimation | Kieran Saunders et.al. | 2407.20437 | null |
| 2024-07-29 | Analysis and Improvement of Rank-Ordered Mean Algorithm in Single-Photon LiDAR | William C. Yau et.al. | 2407.20399 | null |
| 2024-07-29 | Improving 2D Feature Representations by 3D-Aware Fine-Tuning | Yuanwen Yue et.al. | 2407.20229 | null |
| 2024-07-27 | Revisit Self-supervised Depth Estimation with Local Structure-from-Motion | Shengjie Zhu et.al. | 2407.19166 | null |
| 2024-07-27 | RePLAy: Remove Projective LiDAR Depthmap Artifacts via Exploiting Epipolar Geometry | Shengjie Zhu et.al. | 2407.19154 | null |
| 2024-07-26 | HybridDepth: Robust Depth Fusion for Mobile AR by Leveraging Depth from Focus and Single-Image Priors | Ashkan Ganj et.al. | 2407.18443 | link |
| 2024-07-26 | Enhanced Depth Estimation and 3D Geometry Reconstruction using Bayesian Helmholtz Stereopsis with Belief Propagation | Razieh Azizi et.al. | 2407.18195 | null |
| 2024-07-25 | BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation | Xiang Zhang et.al. | 2407.17952 | null |
| 2024-07-25 | UMono: Physical Model Informed Hybrid CNN-Transformer Framework for Underwater Monocular Depth Estimation | Jian Wang et.al. | 2407.17838 | null |
| 2024-07-24 | DarSwin-Unet: Distortion Aware Encoder-Decoder Architecture | Akshaya Athwale et.al. | 2407.17328 | null |
| 2024-07-24 | Physical Adversarial Attack on Monocular Depth Estimation via Shape-Varying Patches | Chenxing Zhao et.al. | 2407.17312 | null |
| 2024-07-23 | SINDER: Repairing the Singular Defects of DINOv2 | Haoqi Wang et.al. | 2407.16826 | link |
| 2024-07-23 | Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions | Fabio Tosi et.al. | 2407.16698 | link |
| 2024-07-23 | ToDER: Towards Colonoscopy Depth Estimation and Reconstruction with Geometry Constraint Adaptation | Zhenhua Wu et.al. | 2407.16508 | null |
| 2024-07-19 | Mono-ViFI: A Unified Learning Framework for Self-supervised Single- and Multi-frame Monocular Depth Estimation | Jinfeng Liu et.al. | 2407.14126 | link |
| 2024-07-18 | Unveiling the purely young star formation history of the SMC’s northeastern shell from colour-magnitude diagram fitting | Joanna D. Sakowska et.al. | 2407.13876 | null |
| 2024-07-18 | Many Perception Tasks are Highly Redundant Functions of their Input Data | Rahul Ramesh et.al. | 2407.13841 | null |
| 2024-07-18 | Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks | Antoni Kowalczuk et.al. | 2407.12588 | link |
| 2024-07-16 | Temporally Consistent Stereo Matching | Jiaxi Zeng et.al. | 2407.11950 | link |
| 2024-07-15 | IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation | Yuanhao Zhai et.al. | 2407.10937 | link |
| 2024-07-15 | OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection | Jinghua Hou et.al. | 2407.10753 | link |
| 2024-07-15 | Towards Scale-Aware Full Surround Monodepth with Transformers | Yuchen Yang et.al. | 2407.10406 | null |
| 2024-07-12 | ProDepth: Boosting Self-Supervised Multi-Frame Monocular Depth with Probabilistic Fusion | Sungmin Woo et.al. | 2407.09303 | link |
| 2024-07-11 | ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation | Ruijie Zhu et.al. | 2407.08187 | link |
| 2024-07-10 | Controlling Space and Time with Diffusion Models | Daniel Watson et.al. | 2407.07860 | null |
| 2024-07-07 | SCIPaD: Incorporating Spatial Clues into Unsupervised Pose-Depth Joint Learning | Yi Feng et.al. | 2407.05283 | link |
| 2024-07-05 | A Physical Model-Guided Framework for Underwater Image Enhancement and Depth Estimation | Dazhao Du et.al. | 2407.04230 | null |
| 2024-07-04 | Towards Cross-View-Consistent Self-Supervised Surround Depth Estimation | Laiyan Ding et.al. | 2407.04041 | null |
| 2024-07-02 | Parametric Modeling and Estimation of Photon Registrations for 3D Imaging | Weijian Zhang et.al. | 2407.02712 | null |
| 2024-07-02 | Depth-Aware Endoscopic Video Inpainting | Francis Xiatian Zhang et.al. | 2407.02675 | link |
| 2024-07-04 | Camera-LiDAR Cross-modality Gait Recognition | Wenxuan Guo et.al. | 2407.02038 | null |
| 2024-07-07 | CaFNet: A Confidence-Driven Framework for Radar Camera Depth Estimation | Huawei Sun et.al. | 2407.00697 | link |
| 2024-06-28 | Deep Learning-based Depth Estimation Methods from Monocular Image and Videos: A Comprehensive Survey | Uchitha Rajapaksha et.al. | 2406.19675 | null |
| 2024-07-05 | 360 in the Wild: Dataset for Depth Prediction and View Synthesis | Kibaek Park et.al. | 2406.18898 | null |
| 2024-06-27 | Dense Monocular Motion Segmentation Using Optical Flow and Pseudo Depth Map: A Zero-Shot Approach | Yuxiang Huang et.al. | 2406.18837 | null |
| 2024-06-26 | DoubleTake: Geometry Guided Depth Estimation | Mohamed Sayed et.al. | 2406.18387 | null |
| 2024-06-25 | Depth-Guided Semi-Supervised Instance Segmentation | Xin Chen et.al. | 2406.17413 | null |
| 2024-06-20 | Uncertainty and Self-Supervision in Single-View Depth | Javier Rodriguez-Puigvert et.al. | 2406.14226 | null |
| 2024-06-19 | WaterMono: Teacher-Guided Anomaly Masking and Enhancement Boosting for Robust Underwater Self-Supervised Monocular Depth Estimation | Yilin Ding et.al. | 2406.13344 | link |
| 2024-06-18 | Depth Anywhere: Enhancing 360 Monocular Depth Estimation via Perspective Distillation and Unlabeled Data Augmentation | Ning-Hsu Wang et.al. | 2406.12849 | null |
| 2024-06-21 | GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models | Yongtao Ge et.al. | 2406.12671 | link |
| 2024-06-17 | DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features | Letian Wang et.al. | 2406.12095 | null |
| 2024-06-17 | MEDeA: Multi-view Efficient Depth Adjustment | Mikhail Artemyev et.al. | 2406.12048 | null |
| 2024-06-16 | 3D Gaze Tracking for Studying Collaborative Interactions in Mixed-Reality Environments | Eduardo Davalos et.al. | 2406.11003 | null |
| 2024-06-15 | GenMM: Geometrically and Temporally Consistent Multimodal Data Generation for Video and LiDAR | Bharat Singh et.al. | 2406.10722 | null |
| 2024-06-14 | The BabyView dataset: High-resolution egocentric videos of infants’ and young children’s everyday experiences | Bria Long et.al. | 2406.10447 | null |
| 2024-06-14 | D-NPC: Dynamic Neural Point Clouds for Non-Rigid View Synthesis from Monocular Video | Moritz Kappel et.al. | 2406.10078 | null |
| 2024-06-14 | DurLAR: A High-fidelity 128-channel LiDAR Dataset with Panoramic Ambient and Reflectivity Imagery for Multi-modal Autonomous Driving Applications | Li Li et.al. | 2406.10068 | link |
| 2024-06-14 | Unsupervised Monocular Depth Estimation Based on Hierarchical Feature-Guided Diffusion | Runze Liu et.al. | 2406.09782 | null |
| 2024-06-13 | Depth Anything V2 | Lihe Yang et.al. | 2406.09414 | link |
| 2024-06-14 | WonderWorld: Interactive 3D Scene Generation from a Single Image | Hong-Xing Yu et.al. | 2406.09394 | link |
| 2024-06-13 | Scale-Invariant Monocular Depth Estimation via SSI Depth | S. Mahdi H. Miangoleh et.al. | 2406.09374 | null |
| 2024-06-13 | Multiple Prior Representation Learning for Self-Supervised Monocular Depth Estimation via Hybrid Transformer | Guodong Sun et.al. | 2406.08928 | link |
| 2024-06-13 | ToSA: Token Selective Attention for Efficient Vision Transformers | Manish Kumar Singh et.al. | 2406.08816 | null |
| 2024-06-11 | Back to the Color: Learning Depth to Specific Color Transformation for Unsupervised Depth Estimation | Yufan Zhu et.al. | 2406.07741 | link |
| 2024-06-11 | PLT-D3: A High-fidelity Dynamic Driving Simulation Dataset for Stereo Depth and Scene Flow | Joshua Tokarsky et.al. | 2406.07667 | null |
| 2024-06-11 | RS-DFM: A Remote Sensing Distributed Foundation Model for Diverse Downstream Tasks | Zhechao Wang et.al. | 2406.07032 | null |
| 2024-06-10 | PatchRefiner: Leveraging Synthetic Data for Real-Domain High-Resolution Monocular Metric Depth Estimation | Zhenyu Li et.al. | 2406.06679 | link |
| 2024-06-09 | Self-supervised Adversarial Training of Monocular Depth Estimation against Physical-World Attacks | Zhiyuan Cheng et.al. | 2406.05857 | link |
| 2024-06-09 | RefGaussian: Disentangling Reflections from 3D Gaussian Splatting for Realistic Rendering | Rui Zhang et.al. | 2406.05852 | null |
| 2024-06-07 | Normal-guided Detail-Preserving Neural Implicit Functions for High-Fidelity 3D Surface Reconstruction | Aarya Patel et.al. | 2406.04861 | null |
| 2024-06-07 | UVCPNet: A UAV-Vehicle Collaborative Perception Network for 3D Object Detection | Yuchao Wang et.al. | 2406.04647 | null |
| 2024-06-06 | MambaDepth: Enhancing Long-range Dependency for Self-Supervised Fine-Structured Monocular Depth Estimation | Ionuţ Grigore et.al. | 2406.04532 | null |
| 2024-06-06 | Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image | Stanislaw Szymanowicz et.al. | 2406.04343 | null |
| 2024-06-06 | Neural Surface Reconstruction from Sparse Views Using Epipolar Geometry | Kaichen Zhou et.al. | 2406.04301 | null |
| 2024-06-04 | VHS: High-Resolution Iterative Stereo Matching with Visual Hull Priors | Markus Plack et.al. | 2406.02552 | null |
| 2024-06-03 | L-MAGIC: Language Model Assisted Generation of Images with Coherence | Zhipeng Cai et.al. | 2406.01843 | link |
| 2024-06-04 | Learning Temporally Consistent Video Depth from Video Diffusion Priors | Jiahao Shao et.al. | 2406.01493 | link |
| 2024-06-03 | Self-Supervised Geometry-Guided Initialization for Robust Monocular Visual Odometry | Takayuki Kanai et.al. | 2406.00929 | null |
| 2024-06-01 | MoDGS: Dynamic Gaussian Splatting from Causually-captured Monocular Videos | Qingming Liu et.al. | 2406.00434 | null |
| 2024-05-30 | Uncertainty-guided Optimal Transport in Depth Supervised Sparse-View 3D Gaussian | Wei Sun et.al. | 2405.19657 | null |
| 2024-05-28 | Hybrid Multi-Head Physics-informed Neural Network for Depth Estimation in Terahertz Imaging | Mingjun Xiang et.al. | 2405.18317 | null |
| 2024-05-27 | Consistency Regularisation for Unsupervised Domain Adaptation in Monocular Depth Estimation | Amir El-Ghoussani et.al. | 2405.17704 | null |
| 2024-05-27 | Benchmarking and Improving Bird’s Eye View Perception Robustness in Autonomous Driving | Shaoyuan Xie et.al. | 2405.17426 | link |
| 2024-05-27 | All-day Depth Completion | Vadim Ezhov et.al. | 2405.17315 | null |
| 2024-05-27 | GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping | Junyoung Seo et.al. | 2405.17251 | link |
| 2024-05-27 | SDL-MVS: View Space and Depth Deformable Learning Paradigm for Multi-View Stereo Reconstruction in Remote Sensing | Yong-Qiang Mao et.al. | 2405.17140 | null |
| 2024-05-27 | DINO-SD: Champion Solution for ICRA 2024 RoboDepth Challenge | Yifan Mao et.al. | 2405.17102 | null |
| 2024-05-27 | Evaluation of Multi-task Uncertainties in Joint Semantic Segmentation and Monocular Depth Estimation | Steven Landgraf et.al. | 2405.17097 | null |
| 2024-05-27 | DCPI-Depth: Explicitly Infusing Dense Correspondence Prior to Unsupervised Monocular Depth Estimation | Mengtan Zhang et.al. | 2405.16960 | null |
| 2024-05-27 | ContrastAlign: Toward Robust BEV Feature Alignment via Contrastive Learning for Multi-Modal 3D Object Detection | Ziying Song et.al. | 2405.16873 | null |
| 2024-05-27 | Estimating Depth of Monocular Panoramic Image with Teacher-Student Model Fusing Equirectangular and Spherical Representations | Jingguo Liu et.al. | 2405.16858 | null |
| 2024-05-26 | Splat-SLAM: Globally Optimized RGB-only SLAM with 3D Gaussians | Erik Sandström et.al. | 2405.16544 | null |
| 2024-05-24 | Transparent Object Depth Completion | Yifan Zhou et.al. | 2405.15299 | null |
| 2024-05-24 | MonoDETRNext: Next-generation Accurate and Efficient Monocular 3D Object Detection Method | Pan Liao et.al. | 2405.15176 | null |
| 2024-05-23 | EvGGS: A Collaborative Learning Framework for Event-based Generalizable Gaussian Splatting | Jiaxu Wang et.al. | 2405.14959 | link |
| 2024-05-23 | Ghost-Stereo: GhostNet-based Cost Volume Enhancement and Aggregation for Stereo Matching Networks | Xingguang Jiang et.al. | 2405.14520 | null |
| 2024-05-23 | Enhanced Object Tracking by Self-Supervised Auxiliary Depth Estimation Learning | Zhenyu Wei et.al. | 2405.14195 | null |
| 2024-05-21 | Cross-spectral Gated-RGB Stereo Depth Estimation | Samuel Brucker et.al. | 2405.12759 | null |
| 2024-05-20 | Depth Reconstruction with Neural Signed Distance Fields in Structured Light Systems | Rukun Qiao et.al. | 2405.12006 | null |
| 2024-05-20 | Depth Prompting for Sensor-Agnostic Depth Estimation | Jin-Hwi Park et.al. | 2405.11867 | null |
| 2024-05-19 | CRF360D: Monocular 360 Depth Estimation via Spherical Fully-Connected CRFs | Zidong Cao et.al. | 2405.11564 | null |
| 2024-05-18 | Dusk Till Dawn: Self-supervised Nighttime Stereo Depth Estimation using Visual Foundation Models | Madhu Vankadari et.al. | 2405.11158 | link |
| 2024-05-17 | FA-Depth: Toward Fast and Accurate Self-supervised Monocular Depth Estimation | Fei Wang et.al. | 2405.10885 | link |
| 2024-05-17 | Accurate Training Data for Occupancy Map Prediction in Automated Driving Using Evidence Theory | Jonas Kälble et.al. | 2405.10575 | link |
| 2024-05-16 | Towards Task-Compatible Compressible Representations | Anderson de Andrade et.al. | 2405.10244 | link |
| 2024-05-16 | KPNDepth: Depth Estimation of Lane Images under Complex Rainy Environment | Zhengxu Shi et.al. | 2405.09964 | null |
| 2024-05-14 | CLIP with Quality Captions: A Strong Pretraining for Vision Tasks | Pavan Kumar Anasosalu Vasu et.al. | 2405.08911 | null |
| 2024-05-14 | The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition | Lingdong Kong et.al. | 2405.08816 | null |
| 2024-05-14 | EndoDAC: Efficient Adapting Foundation Model for Self-Supervised Depth Estimation from Any Endoscopic Camera | Beilei Cui et.al. | 2405.08672 | link |
| 2024-05-13 | SceneFactory: A Workflow-centric and Unified Framework for Incremental Scene Modeling | Yijun Yuan et.al. | 2405.07847 | null |
| 2024-05-16 | Ensuring UAV Safety: A Vision-only and Real-time Framework for Collision Avoidance Through Object Detection, Tracking, and Distance Estimation | Vasileios Karampinis et.al. | 2405.06749 | null |
| 2024-05-10 | MGS-SLAM: Monocular Sparse Tracking and Gaussian Mapping with Depth Smooth Regularization | Pengcheng Zhu et.al. | 2405.06241 | null |
| 2024-04-30 | A critical appraisal of water table depth estimation: Challenges and opportunities within machine learning | Joseph Janssen et.al. | 2405.04579 | null |
| 2024-05-06 | A Construct-Optimize Approach to Sparse View Synthesis without Camera Pose | Kaiwen Jiang et.al. | 2405.03659 | null |
| 2024-05-03 | M$^2$Depth: Self-supervised Two-Frame Multi-camera Metric Depth Estimation | Yingshuang Zou et.al. | 2405.02004 | null |
| 2024-05-02 | Domain-Transferred Synthetic Data Generation for Improving Monocular Depth Estimation | Seungyeop Lee et.al. | 2405.01113 | null |
| 2024-05-13 | Depth Priors in Removal Neural Radiance Fields | Zhihao Guo et.al. | 2405.00630 | null |
| 2024-04-30 | Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting | Paul Engstler et.al. | 2404.19758 | link |
| 2024-04-30 | Masked Spatial Propagation Network for Sparsity-Adaptive Depth Refinement | Jinyoung Jun et.al. | 2404.19294 | link |
| 2024-04-29 | Simple-RF: Regularizing Sparse Input Radiance Fields with Simpler Solutions | Nagabhushan Somraj et.al. | 2404.19015 | null |
| 2024-05-02 | Underwater Variable Zoom: Depth-Guided Perception Network for Underwater Image Enhancement | Zhixiong Huang et.al. | 2404.17883 | link |
| 2024-05-01 | A Novel Spike Transformer Network for Depth Estimation from Event Cameras via Cross-modality Knowledge Distillation | Xin Zhang et.al. | 2404.17335 | null |
| 2024-04-27 | The Third Monocular Depth Estimation Challenge | Jaime Spencer et.al. | 2404.16831 | null |
| 2024-04-25 | MonoPCC: Photometric-invariant Cycle Constraint for Monocular Depth Estimation of Endoscopic Images | Zhiwei Wang et.al. | 2404.16571 | null |
| 2024-04-25 | Promoting CNNs with Cross-Architecture Knowledge Distillation for Efficient Monocular Depth Estimation | Zhimeng Zheng et.al. | 2404.16386 | null |
| 2024-04-23 | SGFormer: Spherical Geometry Transformer for 360 Depth Estimation | Junsong Zhang et.al. | 2404.14979 | null |
| 2024-04-23 | Mining Supervision for Dynamic Regions in Self-Supervised Monocular Depth Estimation | Hoang Chuong Nguyen et.al. | 2404.14908 | null |
| 2024-04-22 | Self-Supervised Monocular Depth Estimation in the Dark: Towards Data Distribution Compensation | Haolin Yang et.al. | 2404.13854 | null |
| 2024-04-21 | GScream: Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal | Yuxin Wang et.al. | 2404.13679 | null |
| 2024-04-20 | High-fidelity Endoscopic Image Synthesis by Utilizing Depth-guided Neural Surfaces | Baoru Huang et.al. | 2404.13437 | null |
| 2024-04-18 | SPIdepth: Strengthened Pose Information for Self-supervised Monocular Depth Estimation | Mykola Lavreniuk et.al. | 2404.12501 | link |
| 2024-04-25 | BLINK: Multimodal Large Language Models Can See but Not Perceive | Xingyu Fu et.al. | 2404.12390 | null |
| 2024-04-17 | How to deal with glare for improved perception of Autonomous Vehicles | Muhammad Z. Alam et.al. | 2404.10992 | null |
| 2024-04-12 | Into the Fog: Evaluating Multiple Object Tracking Robustness | Nadezda Kirillova et.al. | 2404.10534 | null |
| 2024-04-17 | Digging into contrastive learning for robust depth estimation with diffusion models | Jiyuan Wang et.al. | 2404.09831 | null |
| 2024-04-15 | Virtually Enriched NYU Depth V2 Dataset for Monocular Depth Estimation: Do We Need Artificial Augmentation? | Dmitry Ignatov et.al. | 2404.09469 | link |
| 2024-04-14 | In My Perspective, In My Hands: Accurate Egocentric 2D Hand Pose and Action Recognition | Wiktor Mucha et.al. | 2404.09308 | null |
| 2024-04-12 | FusionPortableV2: A Unified Multi-Sensor Dataset for Generalized SLAM Across Diverse Platforms and Scalable Environments | Hexiang Wei et.al. | 2404.08563 | null |
| 2024-04-12 | On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation | Agneet Chatterjee et.al. | 2404.08540 | link |
| 2024-04-11 | Depth Estimation using Weighted-loss and Transfer Learning | Muhammad Adeel Hafeez et.al. | 2404.07686 | null |
| 2024-04-11 | GLID: Pre-training a Generalist Encoder-Decoder Vision Model | Jihao Liu et.al. | 2404.07603 | null |
| 2024-04-11 | Implicit and Explicit Language Guidance for Diffusion-based Visual Perception | Hefeng Wang et.al. | 2404.07600 | null |
| 2024-04-11 | Stereo-LiDAR Depth Estimation with Deformable Propagation and Learned Disparity-Depth Conversion | Ang Li et.al. | 2404.07545 | null |
| 2024-04-10 | Self-supervised Monocular Depth Estimation on Water Scenes via Specular Reflection Prior | Zhengyang Lu et.al. | 2404.07176 | null |
| 2024-04-10 | MonoSelfRecon: Purely Self-Supervised Explicit Generalizable 3D Reconstruction of Indoor Scenes from Monocular RGB Views | Runfa Li et.al. | 2404.06753 | null |
| 2024-04-09 | RoadBEV: Road Surface Reconstruction in Bird’s Eye View | Tong Zhao et.al. | 2404.06605 | link |
| 2024-04-09 | ZeST: Zero-Shot Material Transfer from a Single Image | Ta-Ying Cheng et.al. | 2404.06425 | link |
| 2024-04-09 | Matching 2D Images in 3D: Metric Relative Pose from Metric Correspondences | Axel Barroso-Laguna et.al. | 2404.06337 | null |
| 2024-04-09 | Enhanced Radar Perception via Multi-Task Learning: Towards Refined Data for Sensor Fusion Applications | Huawei Sun et.al. | 2404.06165 | null |
| 2024-04-09 | Incremental Joint Learning of Depth, Pose and Implicit Scene Representation on Monocular Camera in Large-scale Scenes | Tianchen Deng et.al. | 2404.06050 | null |
| 2024-04-06 | HawkDrive: A Transformer-driven Visual Perception System for Autonomous Driving in Night Scene | Ziang Guo et.al. | 2404.04653 | null |
| 2024-04-09 | Co-Occ: Coupling Explicit Feature Fusion with Volume Rendering Regularization for Multi-Modal 3D Semantic Occupancy Prediction | Jingyi Pan et.al. | 2404.04561 | null |
| 2024-04-05 | SpatialTracker: Tracking Any 2D Pixels in 3D Space | Yuxi Xiao et.al. | 2404.04319 | null |
| 2024-04-05 | Deep Phase Coded Image Prior | Nimrod Shabtay et.al. | 2404.03906 | null |
| 2024-04-04 | Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning | Rui Li et.al. | 2404.03658 | link |
| 2024-04-04 | MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation | Hanzhe Hu et.al. | 2404.03656 | null |
| 2024-04-05 | WorDepth: Variational Language Prior for Monocular Depth Estimation | Ziyao Zeng et.al. | 2404.03635 | link |
| 2024-04-04 | Adaptive Discrete Disparity Volume for Self-supervised Monocular Depth Estimation | Jianwei Ren et.al. | 2404.03190 | null |
| 2024-04-04 | MonoCD: Monocular 3D Object Detection with Complementary Depths | Longfei Yan et.al. | 2404.03181 | link |
| 2024-04-02 | CHOSEN: Contrastive Hypothesis Selection for Multi-View Depth Refinement | Di Qiu et.al. | 2404.02225 | null |
| 2024-04-02 | Improving Bird’s Eye View Semantic Segmentation by Task Decomposition | Tianhao Zhao et.al. | 2404.01925 | null |
| 2024-04-01 | BadPart: Unified Black-box Adversarial Patch Attacks against Pixel-wise Regression Tasks | Zhiyuan Cheng et.al. | 2404.00924 | null |
| 2024-04-01 | MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements | Lisong C. Sun et.al. | 2404.00923 | link |
| 2024-03-31 | OmniSDF: Scene Reconstruction using Omnidirectional Signed Distance Functions and Adaptive Binoctrees | Hakyeong Kim et.al. | 2404.00678 | null |
| 2024-03-30 | The Devil is in the Edges: Monocular Depth Estimation with Edge-aware Consistency Fusion | Pengzhi Li et.al. | 2404.00373 | null |
| 2024-03-30 | Reusable Architecture Growth for Continual Stereo Matching | Chenghao Zhang et.al. | 2404.00360 | null |
| 2024-03-30 | MaGRITTe: Manipulative and Generative 3D Realization from Image, Topview and Text | Takayuki Hara et.al. | 2404.00345 | null |
| 2024-03-29 | VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection | Zihua Liu et.al. | 2404.00149 | null |
| 2024-03-29 | NeSLAM: Neural Implicit Mapping and Self-Supervised Feature Tracking With Depth Completion and Denoising | Tianchen Deng et.al. | 2403.20034 | link |
| 2024-03-28 | SAID-NeRF: Segmentation-AIDed NeRF for Depth Completion of Transparent Objects | Avinash Ummadisingu et.al. | 2403.19607 | null |
| 2024-03-30 | GlORIE-SLAM: Globally Optimized RGB-only Implicit Encoding Point Cloud SLAM | Ganlin Zhang et.al. | 2403.19549 | link |
| 2024-03-28 | CoherentGS: Sparse Novel View Synthesis with Coherent 3D Gaussians | Avinash Paliwal et.al. | 2403.19495 | link |
| 2024-03-28 | FlowDepth: Decoupling Optical Flow for Self-Supervised Monocular Depth Estimation | Yiyang Sun et.al. | 2403.19294 | null |
| 2024-03-28 | Neural Fields for 3D Tracking of Anatomy and Surgical Instruments in Monocular Laparoscopic Video Clips | Beerend G. A. Gerats et.al. | 2403.19265 | null |
| 2024-03-27 | UniDepth: Universal Monocular Metric Depth Estimation | Luigi Piccinelli et.al. | 2403.18913 | link |
| 2024-04-01 | ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation | Suraj Patni et.al. | 2403.18807 | link |
| 2024-03-27 | ModaLink: Unifying Modalities for Efficient Image-to-PointCloud Place Recognition | Weidong Xie et.al. | 2403.18762 | link |
| 2024-03-27 | F$^2$Depth: Self-supervised Indoor Monocular Depth Estimation via Optical Flow Consistency and Feature Map Synthesis | Xiaotong Guo et.al. | 2403.18443 | null |
| 2024-03-26 | Track Everything Everywhere Fast and Robustly | Yunzhou Song et.al. | 2403.17931 | null |
| 2024-03-26 | Leveraging Near-Field Lighting for Monocular Depth Estimation from Endoscopy Videos | Akshay Paruchuri et.al. | 2403.17915 | null |
| 2024-03-26 | DN-Splatter: Depth and Normal Priors for Gaussian Splatting and Meshing | Matias Turkulainen et.al. | 2403.17822 | null |
| 2024-03-27 | Physical 3D Adversarial Attacks against Monocular Depth Estimation in Autonomous Driving | Junhao Zheng et.al. | 2403.17301 | link |
| 2024-03-25 | Spike-NeRF: Neural Radiance Field Based On Spike Camera | Yijia Guo et.al. | 2403.16410 | null |
| 2024-03-25 | Elite360D: Towards Efficient 360 Depth Estimation via Semantic- and Distance-Aware Bi-Projection Fusion | Hao Ai et.al. | 2403.16376 | null |
| 2024-03-23 | Depth Estimation fusing Image and Radar Measurements with Uncertain Directions | Masaya Kotani et.al. | 2403.15787 | null |
| 2024-03-22 | Language-Based Depth Hints for Monocular Depth Estimation | Dylan Auty et.al. | 2403.15551 | null |
| 2024-03-21 | Learning to Project for Cross-Task Knowledge Distillation | Dylan Auty et.al. | 2403.14494 | null |
| 2024-03-20 | DepthFM: Fast Monocular Depth Estimation with Flow Matching | Ming Gui et.al. | 2403.13788 | null |
| 2024-03-19 | When Do We Not Need Larger Vision Models? | Baifeng Shi et.al. | 2403.13043 | link |
| 2024-03-19 | FutureDepth: Learning to Predict the Future Improves Video Depth Estimation | Rajeev Yasarla et.al. | 2403.12953 | null |
| 2024-03-19 | Geometric Constraints in Deep Learning Frameworks: A Survey | Vibhas K Vats et.al. | 2403.12431 | null |
| 2024-03-18 | GraphBEV: Towards Robust BEV Feature Alignment for Multi-Modal 3D Object Detection | Ziying Song et.al. | 2403.11848 | null |
| 2024-03-18 | SSAP: A Shape-Sensitive Adversarial Patch for Comprehensive Disruption of Monocular Depth Estimation in Autonomous Navigation Applications | Amira Guesmi et.al. | 2403.11515 | null |
| 2024-03-17 | Bilateral Propagation Network for Depth Completion | Jie Tang et.al. | 2403.11270 | null |
| 2024-03-16 | MSI-NeRF: Linking Omni-Depth with View Synthesis through Multi-Sphere Image aided Generalizable Neural Radiance Field | Dongyu Yan et.al. | 2403.10840 | null |
| 2024-03-15 | SwinMTL: A Shared Architecture for Simultaneous Depth Estimation and Semantic Segmentation from Monocular Camera Images | Pardis Taghavi et.al. | 2403.10662 | link |
| 2024-03-15 | Robust Shape Fitting for 3D Scene Abstraction | Florian Kluger et.al. | 2403.10452 | link |
| 2024-03-15 | Region-aware Distribution Contrast: A Novel Approach to Multi-Task Partially Supervised Learning | Meixuan Li et.al. | 2403.10252 | null |
| 2024-03-18 | Touch-GS: Visual-Tactile Supervised 3D Gaussian Splatting | Aiden Swann et.al. | 2403.09875 | null |
| 2024-03-14 | Improving Distant 3D Object Detection Using 2D Box Supervision | Zetong Yang et.al. | 2403.09230 | null |
| 2024-03-13 | SM4Depth: Seamless Monocular Metric Depth Estimation across Multiple Cameras and Scenes by One Model | Yihao Liu et.al. | 2403.08556 | link |
| 2024-03-13 | METER: a mobile vision transformer architecture for monocular depth estimation | L. Papa et.al. | 2403.08368 | link |
| 2024-03-12 | Q-SLAM: Quadric Representations for Monocular SLAM | Chensheng Peng et.al. | 2403.08125 | null |
| 2024-03-12 | Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving | JunDa Cheng et.al. | 2403.07535 | null |
| 2024-03-12 | D4D: An RGBD diffusion model to boost monocular depth estimation | L. Papa et.al. | 2403.07516 | link |
| 2024-03-12 | SGE: Structured Light System Based on Gray Code with an Event Camera | Xingyu Lu et.al. | 2403.07326 | null |
| 2024-03-11 | Forest Inspection Dataset for Aerial Semantic Segmentation and Depth Estimation | Bianca-Cerasela-Zelia Blaga et.al. | 2403.06621 | link |
| 2024-03-11 | HDA-LVIO: A High-Precision LiDAR-Visual-Inertial Odometry in Urban Environments with Hybrid Data Association | Jian Shi et.al. | 2403.06590 | null |
| 2024-03-11 | Confidence-Aware RGB-D Face Recognition via Virtual Depth Synthesis | Zijian Chen et.al. | 2403.06529 | null |
| 2024-03-09 | DO3D: Self-supervised Learning of Decomposed Object-aware 3D Motion and Depth from Monocular Videos | Xiuzhe Wu et.al. | 2403.05895 | null |
| 2024-03-07 | Density-Regression: Efficient and Distance-Aware Deep Regressor for Uncertainty Estimation under Distribution Shifts | Ha Manh Bui et.al. | 2403.05600 | link |
| 2024-03-08 | OccFusion: Depth Estimation Free Multi-sensor Fusion for 3D Occupancy Prediction | Ji Zhang et.al. | 2403.05329 | null |
| 2024-03-08 | Stealing Stable Diffusion Prior for Robust Monocular Depth Estimation | Yifan Mao et.al. | 2403.05056 | link |
| 2024-03-06 | Multi-task Learning for Real-time Autonomous Driving Leveraging Task-adaptive Attention Generator | Wonhyeok Choi et.al. | 2403.03468 | null |
| 2024-03-07 | Scene Depth Estimation from Traditional Oriental Landscape Paintings | Sungho Kang et.al. | 2403.03408 | null |
| 2024-03-04 | Iterative Occlusion-Aware Light Field Depth Estimation using 4D Geometrical Cues | Rui Lourenço et.al. | 2403.02043 | null |
| 2024-03-04 | Scalable Vision-Based 3D Object Detection and Monocular Depth Estimation for Autonomous Driving | Yuxuan Liu et.al. | 2403.02037 | link |
| 2024-03-04 | DD-VNB: A Depth-based Dual-Loop Framework for Real-time Visually Navigated Bronchoscopy | Qingyao Tian et.al. | 2403.01683 | null |
| 2024-03-03 | Kick Back & Relax++: Scaling Beyond Ground-Truth Depth with SlowTV & CribsTV | Jaime Spencer et.al. | 2403.01569 | link |
| 2024-03-03 | Pyramid Feature Attention Network for Monocular Depth Prediction | Yifang Xu et.al. | 2403.01440 | null |
| 2024-03-03 | Depth Estimation Algorithm Based on Transformer-Encoder and Feature Fusion | Linhan Xia et.al. | 2403.01370 | null |
| 2024-03-02 | Depth Information Assisted Collaborative Mutual Promotion Network for Single Image Dehazing | Yafei Zhang et.al. | 2403.01105 | null |
| 2024-02-29 | PCDepth: Pattern-based Complementary Learning for Monocular Depth Estimation by Best of Both Worlds | Haotian Liu et.al. | 2402.18925 | null |
| 2024-02-29 | CFDNet: A Generalizable Foggy Stereo Matching Network with Contrastive Feature Distillation | Zihua Liu et.al. | 2402.18181 | null |
| 2024-02-28 | Self-Supervised Spatially Variant PSF Estimation for Aberration-Aware Depth-from-Defocus | Zhuofeng Wu et.al. | 2402.18175 | null |
| 2024-02-28 | Passive Snapshot Coded Aperture Dual-Pixel RGB-D Imaging | Bhargav Ghanekar et.al. | 2402.18102 | null |
| 2024-02-27 | A Vanilla Multi-Task Framework for Dense Visual Prediction Solution to 1st VCL Challenge – Multi-Task Robustness Track | Zehui Chen et.al. | 2402.17319 | null |
| 2024-02-26 | Automated Floodwater Depth Estimation Using Large Multimodal Model for Rapid Flood Mapping | Temitope Akinboyewa et.al. | 2402.16684 | null |
| 2024-02-22 | GAM-Depth: Self-Supervised Indoor Depth Estimation Leveraging a Gradient-Aware Mask and Semantic Constraints | Anqi Cheng et.al. | 2402.14354 | null |
| 2024-02-22 | TIE-KD: Teacher-Independent and Explainable Knowledge Distillation for Monocular Depth Estimation | Sangwon Choi et.al. | 2402.14340 | link |
| 2024-02-21 | Zero-BEV: Zero-shot Projection of Any First-Person Modality to BEV Maps | Gianluca Monaci et.al. | 2402.13848 | null |
| 2024-02-19 | An Endoscopic Chisel: Intraoperative Imaging Carves 3D Anatomical Models | Jan Emily Mangulabnan et.al. | 2402.11840 | null |
| 2024-02-19 | Unveiling the Depths: A Multi-Modal Fusion Framework for Challenging Scenarios | Jialei Xu et.al. | 2402.11826 | null |
(<a href=../README.md>back to main</a>)