Action Recognition - 2025-05
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2025-05-30 | DiG-Net: Enhancing Quality of Life through Hyper-Range Dynamic Gesture Recognition in Assistive Robotics | Eran Bamani Beeri et.al. | 2505.24786 | translate | read | null |
| 2025-05-30 | Beyond FACS: Data-driven Facial Expression Dictionaries, with Application to Predicting Autism | Evangelos Sariyanidi et.al. | 2505.24679 | translate | read | null |
| 2025-05-30 | EgoExOR: An Ego-Exo-Centric Operating Room Dataset for Surgical Activity Understanding | Ege Özsoy et.al. | 2505.24287 | translate | read | null |
| 2025-05-29 | Autoregressive Meta-Actions for Unified Controllable Trajectory Generation | Jianbo Zhao et.al. | 2505.23612 | translate | read | null |
| 2025-05-29 | CLIP-AE: CLIP-assisted Cross-view Audio-Visual Enhancement for Unsupervised Temporal Action Localization | Rui Xia et.al. | 2505.23524 | translate | read | null |
| 2025-05-29 | Spatio-Temporal Joint Density Driven Learning for Skeleton-Based Action Recognition | Shanaka Ramesh Gunasekara et.al. | 2505.23012 | translate | read | link |
| 2025-05-28 | PRISM: Video Dataset Condensation with Progressive Refinement and Insertion for Sparse Motion | Jaehyun Choi et.al. | 2505.22564 | translate | read | null |
| 2025-05-27 | DeepConvContext: A Multi-Scale Approach to Timeseries Classification in Human Activity Recognition | Marius Bock et.al. | 2505.20894 | translate | read | link |
| 2025-05-27 | TrustSkin: A Fairness Pipeline for Trustworthy Facial Affect Analysis Across Skin Tone | Ana M. Cabanas et.al. | 2505.20637 | translate | read | null |
| 2025-05-26 | Data-Free Class-Incremental Gesture Recognition with Prototype-Guided Pseudo Feature Replay | Hongsong Wang et.al. | 2505.20049 | translate | read | link |
| 2025-05-26 | PHI: Bridging Domain Shift in Long-Term Action Quality Assessment via Progressive Hierarchical Instruction | Kanglei Zhou et.al. | 2505.19972 | translate | read | link |
| 2025-05-26 | The Role of Video Generation in Enhancing Data-Limited Action Understanding | Wei Li et.al. | 2505.19495 | translate | read | null |
| 2025-05-24 | ProphetDWM: A Driving World Model for Rolling Out Future Actions and Videos | Xiaodong Wang et.al. | 2505.18650 | translate | read | null |
| 2025-05-27 | SHARDeg: A Benchmark for Skeletal Human Action Recognition in Degraded Scenarios | Simon Malzard et.al. | 2505.18048 | translate | read | null |
| 2025-05-23 | 3D Face Reconstruction Error Decomposed: A Modular Benchmark for Fair and Fast Method Evaluation | Evangelos Sariyanidi et.al. | 2505.18025 | translate | read | null |
| 2025-05-23 | Multi-task Learning For Joint Action and Gesture Recognition | Konstantinos Spathis et.al. | 2505.17867 | translate | read | null |
| 2025-05-23 | Temporal Consistency Constrained Transferable Adversarial Attacks with Background Mixup for Action Recognition | Ping Li et.al. | 2505.17807 | translate | read | link |
| 2025-05-23 | Integrating Counterfactual Simulations with Language Models for Explaining Multi-Agent Behaviour | Bálint Gyevnár et.al. | 2505.17801 | translate | read | null |
| 2025-05-23 | SVL: Spike-based Vision-language Pretraining for Efficient 3D Open-world Understanding | Xuerui Qiu et.al. | 2505.17674 | translate | read | null |
| 2025-05-23 | ProTAL: A Drag-and-Link Video Programming Framework for Temporal Action Localization | Yuchen He et.al. | 2505.17555 | translate | read | null |
| 2025-05-22 | UAV Control with Vision-based Hand Gesture Recognition over Edge-Computing | Sousannah Abdalla et.al. | 2505.17303 | translate | read | null |
| 2025-05-22 | CoMo: Learning Continuous Latent Motion from Internet Videos for Scalable Robot Learning | Jiange Yang et.al. | 2505.17006 | translate | read | null |
| 2025-05-21 | Towards Zero-Shot Differential Morphing Attack Detection with Multimodal Large Language Models | Ria Shekhawat et.al. | 2505.15332 | translate | read | null |
| 2025-05-21 | DiffProb: Data Pruning for Face Recognition | Eduarda Caldeira et.al. | 2505.15272 | translate | read | link |
| 2025-05-21 | Leveraging Foundation Models for Multimodal Graph-Based Action Recognition | Fatemeh Ziaeetabar et.al. | 2505.15192 | translate | read | null |
| 2025-05-20 | Egocentric Action-aware Inertial Localization in Point Clouds | Mingfang Zhang et.al. | 2505.14346 | translate | read | link |
| 2025-05-20 | Transfer Learning from Visual Speech Recognition to Mouthing Recognition in German Sign Language | Dinh Nam Pham et.al. | 2505.13784 | translate | read | link |
| 2025-05-18 | MTIL: Encoding Full History with Mamba for Temporal Imitation Learning | Yulin Zhou et.al. | 2505.12410 | translate | read | link |
| 2025-05-20 | Aux-Think: Exploring Reasoning Strategies for Data-Efficient Vision-Language Navigation | Shuo Wang et.al. | 2505.11886 | translate | read | null |
| 2025-05-16 | Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation | Zihan Wang et.al. | 2505.11383 | translate | read | link |
| 2025-05-15 | NeoLightning: A Modern Reimagination of Gesture-Based Sound Design | Yonghyun Kim et.al. | 2505.10686 | translate | read | link |
| 2025-05-15 | Are Spatial-Temporal Graph Convolution Networks for Human Action Recognition Over-Parameterized? | Jianyang Xie et.al. | 2505.10679 | translate | read | link |
| 2025-05-14 | Mission Balance: Generating Under-represented Class Samples using Video Diffusion Models | Danush Kumar Venkatesh et.al. | 2505.09858 | translate | read | link |
| 2025-05-13 | Reinforcement Learning meets Masked Video Modeling : Trajectory-Guided Adaptive Token Selection | Ayush K. Rai et.al. | 2505.08561 | translate | read | null |
| 2025-05-17 | Training Strategies for Efficient Embodied Reasoning | William Chen et.al. | 2505.08243 | translate | read | null |
| 2025-05-12 | H$^{\mathbf{3}}$DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning | Yiyang Lu et.al. | 2505.07819 | translate | read | null |
| 2025-05-11 | DeepSORT-Driven Visual Tracking Approach for Gesture Recognition in Interactive Systems | Tong Zhang et.al. | 2505.07110 | translate | read | null |
| 2025-05-10 | A Short Overview of Multi-Modal Wi-Fi Sensing | Zijian Zhao et.al. | 2505.06682 | translate | read | link |
| 2025-05-09 | Context Informed Incremental Learning Improves Myoelectric Control Performance in Virtual Reality Object Manipulation Tasks | Gabriel Gagné et.al. | 2505.06064 | translate | read | link |
| 2025-05-09 | Task-Adapter++: Task-specific Adaptation with Order-aware Alignment for Few-shot Action Recognition | Congqi Cao et.al. | 2505.06002 | translate | read | link |
| 2025-05-07 | DetReIDX: A Stress-Test Dataset for Real-World UAV-Based Person Recognition | Kailash A. Hambarde et.al. | 2505.04793 | translate | read | link |
| 2025-05-07 | Comparison of Visual Trackers for Biomechanical Analysis of Running | Luis F. Gomez et.al. | 2505.04713 | translate | read | null |
| 2025-05-07 | Trajectory Entropy Reinforcement Learning for Predictable and Robust Control | Bang You et.al. | 2505.04193 | translate | read | null |
| 2025-05-07 | FoodTrack: Estimating Handheld Food Portions with Egocentric Video | Ervin Wang et.al. | 2505.04055 | translate | read | null |
| 2025-05-06 | Action Spotting and Precise Event Detection in Sports: Datasets, Methods, and Challenges | Hao Xu et.al. | 2505.03991 | translate | read | null |
| 2025-05-03 | A Multimodal Framework for Explainable Evaluation of Soft Skills in Educational Environments | Jared D. T. Guerrero-Sosa et.al. | 2505.01794 | translate | read | null |
| 2025-05-01 | Predicting Estimated Times of Restoration for Electrical Outages Using Longitudinal Tabular Transformers | Bogireddy Sai Prasanna Teja et.al. | 2505.00225 | translate | read | null |
(<a href=../Action_Recognition.md>back to Action Recognition</a>)