Action Recognition - 2025-03

| Publish Date | Title | Authors | PDF | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2025-03-30 | CA^2ST: Cross-Attention in Audio, Space, and Time for Holistic Video Recognition | Jongseo Lee et al. | 2503.23447 | translate | read | null |
| 2025-03-30 | OwlSight: A Robust Illumination Adaptation Framework for Dark Video Human Action Recognition | Shihao Cheng et al. | 2503.23266 | translate | read | null |
| 2025-03-29 | Action Recognition in Real-World Ambient Assisted Living Environment | Vincent Gbouna Zakka et al. | 2503.23214 | translate | read | link |
| 2025-03-28 | ForcePose: A Deep Learning Approach for Force Calculation Based on Action Recognition Using MediaPipe Pose Estimation Combined with Object Detection | Nandakishor M et al. | 2503.22363 | translate | read | null |
| 2025-03-30 | UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning | Zhengxi Lu et al. | 2503.21620 | translate | read | link |
| 2025-03-27 | One Snapshot is All You Need: A Generalized Method for mmWave Signal Generation | Teng Huang et al. | 2503.21122 | translate | read | null |
| 2025-03-26 | ScreenLLM: Stateful Screen Schema for Efficient Action Understanding and Prediction | Yiqiao Jin et al. | 2503.20978 | translate | read | null |
| 2025-03-26 | Siformer: Feature-isolated Transformer for Efficient Skeleton-based Sign Language Recognition | Muxin Pu et al. | 2503.20436 | translate | read | null |
| 2025-03-25 | Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings | Chengan Che et al. | 2503.19740 | translate | read | link |
| 2025-03-25 | fine-CLIP: Enhancing Zero-Shot Fine-Grained Surgical Action Recognition with Vision-Language Models | Saurav Sharma et al. | 2503.19670 | translate | read | null |
| 2025-03-24 | LLaVAction: evaluating and training multi-modal large language models for action recognition | Shaokai Ye et al. | 2503.18712 | translate | read | link |
| 2025-03-24 | Surgical Action Planning with Large Language Models | Mengya Xu et al. | 2503.18296 | translate | read | null |
| 2025-03-27 | Temporal-Guided Spiking Neural Networks for Event-Based Human Action Recognition | Siyuan Yang et al. | 2503.17132 | translate | read | null |
| 2025-03-21 | BEAC: Imitating Complex Exploration and Task-oriented Behaviors for Invisible Object Nonprehensile Manipulation | Hirotaka Tahara et al. | 2503.16803 | translate | read | null |
| 2025-03-21 | Improving mmWave based Hand Hygiene Monitoring through Beam Steering and Combining Techniques | Isura Nirmal et al. | 2503.16764 | translate | read | null |
| 2025-03-19 | A Comprehensive Survey on Architectural Advances in Deep CNNs: Challenges, Applications, and Emerging Research Directions | Saddam Hussain Khan et al. | 2503.16546 | translate | read | null |
| 2025-03-25 | Deep learning framework for action prediction reveals multi-timescale locomotor control | Wei-Chen Wang et al. | 2503.16340 | translate | read | null |
| 2025-03-19 | UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction | Shravan Nayak et al. | 2503.15661 | translate | read | null |
| 2025-03-19 | Multi-Modal Gesture Recognition from Video and Surgical Tool Pose Information via Motion Invariants | Jumanh Atoum et al. | 2503.15647 | translate | read | null |
| 2025-03-21 | Body-Hand Modality Expertized Networks with Cross-attention for Fine-grained Skeleton Action Recognition | Seungyeon Cho et al. | 2503.14960 | translate | read | null |
| 2025-03-19 | DPFlow: Adaptive Optical Flow Estimation with a Dual-Pyramid Framework | Henrique Morimitsu et al. | 2503.14880 | translate | read | link |
| 2025-03-15 | Salient Temporal Encoding for Dynamic Scene Graph Generation | Zhihao Zhu et al. | 2503.14524 | translate | read | null |
| 2025-03-17 | Towards Scalable Modeling of Compressed Videos for Efficient Action Recognition | Shristi Das Biswas et al. | 2503.13724 | translate | read | null |
| 2025-03-20 | STEP: Simultaneous Tracking and Estimation of Pose for Animals and Humans | Shashikant Verma et al. | 2503.13344 | translate | read | null |
| 2025-03-17 | Dense Policy: Bidirectional Autoregressive Learning of Actions | Yue Su et al. | 2503.13217 | translate | read | null |
| 2025-03-16 | EgoEvGesture: Gesture Recognition Based on Egocentric Event Camera | Luming Wang et al. | 2503.12419 | translate | read | link |
| 2025-03-16 | ProbDiffFlow: An Efficient Learning-Free Framework for Probabilistic Single-Image Optical Flow Estimation | Mo Zhou et al. | 2503.12348 | translate | read | null |
| 2025-03-15 | Real-Time Manipulation Action Recognition with a Factorized Graph Sequence Encoder | Enes Erdogan et al. | 2503.12034 | translate | read | null |
| 2025-03-14 | Enhancing Hand Palm Motion Gesture Recognition by Eliminating Reference Frame Bias via Frame-Invariant Similarity Measures | Arno Verduyn et al. | 2503.11352 | translate | read | null |
| 2025-03-14 | Aerial Vision-and-Language Navigation with Grid-based View Selection and Map Construction | Ganlong Zhao et al. | 2503.11091 | translate | read | null |
| 2025-03-14 | VA-AR: Learning Velocity-Aware Action Representations with Mixture of Window Attention | Jiangning Wei et al. | 2503.11004 | translate | read | null |
| 2025-03-13 | Spatial-Temporal Graph Diffusion Policy with Kinematic Modeling for Bimanual Robotic Manipulation | Qi Lv et al. | 2503.10743 | translate | read | null |
| 2025-03-11 | Open-World Skill Discovery from Unsegmented Demonstrations | Jingwen Deng et al. | 2503.10684 | translate | read | link |
| 2025-03-17 | HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model | Jiaming Liu et al. | 2503.10631 | translate | read | null |
| 2025-03-13 | SurgRAW: Multi-Agent Workflow with Chain-of-Thought Reasoning for Surgical Intelligence | Chang Han Low et al. | 2503.10265 | translate | read | null |
| 2025-03-12 | A Hybrid Neural Network with Smart Skip Connections for High-Precision, Low-Latency EMG-Based Hand Gesture Recognition | Hafsa Wazir et al. | 2503.09041 | translate | read | null |
| 2025-03-12 | Unified Locomotion Transformer with Simultaneous Sim-to-Real Transfer for Quadrupeds | Dikai Liu et al. | 2503.08997 | translate | read | null |
| 2025-03-11 | PromptGAR: Flexible Promptive Group Activity Recognition | Zhangyu Jin et al. | 2503.08933 | translate | read | null |
| 2025-03-11 | MetaFold: Language-Guided Multi-Category Garment Folding Framework via Trajectory Generation and Foundation Model | Haonan Chen et al. | 2503.08372 | translate | read | null |
| 2025-03-11 | A Survey on Wi-Fi Sensing Generalizability: Taxonomy, Techniques, Datasets, and Future Research Prospects | Fei Wang et al. | 2503.08008 | translate | read | null |
| 2025-03-10 | Helios 2.0: A Robust, Ultra-Low Power Gesture Recognition System Optimised for Event-Sensor based Wearables | Prarthana Bhattacharyya et al. | 2503.07825 | translate | read | null |
| 2025-03-10 | Elderly Activity Recognition in the Wild: Results from the EAR Challenge | Anh-Kiet Duong et al. | 2503.07821 | translate | read | link |
| 2025-03-09 | TimeLoc: A Unified End-to-End Framework for Precise Timestamp Localization in Long Videos | Chen-Lin Zhang et al. | 2503.06526 | translate | read | link |
| 2025-03-09 | SGA-INTERACT: A 3D Skeleton-based Benchmark for Group Activity Understanding in Modern Basketball Tactic | Yuchen Yang et al. | 2503.06522 | translate | read | link |
| 2025-03-07 | MPTSNet: Integrating Multiscale Periodic Local Patterns and Global Dependencies for Multivariate Time Series Classification | Yang Mu et al. | 2503.05582 | translate | read | null |
| 2025-03-07 | Multi-Grained Feature Pruning for Video-Based Human Pose Estimation | Zhigang Wang et al. | 2503.05365 | translate | read | null |
| 2025-03-06 | Maestro: A 302 GFLOPS/W and 19.8GFLOPS RISC-V Vector-Tensor Architecture for Wearable Ultrasound Edge Computing | Mattia Sinigaglia et al. | 2503.04581 | translate | read | null |
| 2025-03-06 | Gate-Shift-Pose: Enhancing Action Recognition in Sports with Skeleton Information | Edoardo Bianchi et al. | 2503.04470 | translate | read | link |
| 2025-03-06 | Spatial-Temporal Perception with Causal Inference for Naturalistic Driving Action Recognition | Qing Chang et al. | 2503.04078 | translate | read | null |
| 2025-03-06 | Social Gesture Recognition in spHRI: Leveraging Fabric-Based Tactile Sensing on Humanoid Robots | Dakarai Crowder et al. | 2503.03234 | translate | read | null |
| 2025-03-04 | Semi-Supervised Audio-Visual Video Action Recognition with Audio Source Localization Guided Mixup | Seokun Kang et al. | 2503.02284 | translate | read | null |
| 2025-03-04 | FABG : End-to-end Imitation Learning for Embodied Affective Human-Robot Interaction | Yanghai Zhang et al. | 2503.01363 | translate | read | null |
| 2025-03-04 | An Efficient 3D Convolutional Neural Network with Channel-wise, Spatial-grouped, and Temporal Convolutions | Zhe Wang et al. | 2503.00796 | translate | read | null |
| 2025-03-02 | One-Shot Gesture Recognition for Underwater Diver-To-Robot Communication | Rishikesh Joshi et al. | 2503.00676 | translate | read | null |
| 2025-03-04 | Unified Video Action Model | Shuang Li et al. | 2503.00200 | translate | read | link |
