Action Recognition - 2025-03 | Paper Arxiv Daily

Publish Date	Title	Authors	arXiv ID	Code available
2025-03-30	CA^2ST: Cross-Attention in Audio, Space, and Time for Holistic Video Recognition	Jongseo Lee et al.	2503.23447	no
2025-03-30	OwlSight: A Robust Illumination Adaptation Framework for Dark Video Human Action Recognition	Shihao Cheng et al.	2503.23266	no
2025-03-29	Action Recognition in Real-World Ambient Assisted Living Environment	Vincent Gbouna Zakka et al.	2503.23214	yes
2025-03-28	ForcePose: A Deep Learning Approach for Force Calculation Based on Action Recognition Using MediaPipe Pose Estimation Combined with Object Detection	Nandakishor M et al.	2503.22363	no
2025-03-30	UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning	Zhengxi Lu et al.	2503.21620	yes
2025-03-27	One Snapshot is All You Need: A Generalized Method for mmWave Signal Generation	Teng Huang et al.	2503.21122	no
2025-03-26	ScreenLLM: Stateful Screen Schema for Efficient Action Understanding and Prediction	Yiqiao Jin et al.	2503.20978	no
2025-03-26	Siformer: Feature-isolated Transformer for Efficient Skeleton-based Sign Language Recognition	Muxin Pu et al.	2503.20436	no
2025-03-25	Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings	Chengan Che et al.	2503.19740	yes
2025-03-25	fine-CLIP: Enhancing Zero-Shot Fine-Grained Surgical Action Recognition with Vision-Language Models	Saurav Sharma et al.	2503.19670	no
2025-03-24	LLaVAction: evaluating and training multi-modal large language models for action recognition	Shaokai Ye et al.	2503.18712	yes
2025-03-24	Surgical Action Planning with Large Language Models	Mengya Xu et al.	2503.18296	no
2025-03-27	Temporal-Guided Spiking Neural Networks for Event-Based Human Action Recognition	Siyuan Yang et al.	2503.17132	no
2025-03-21	BEAC: Imitating Complex Exploration and Task-oriented Behaviors for Invisible Object Nonprehensile Manipulation	Hirotaka Tahara et al.	2503.16803	no
2025-03-21	Improving mmWave based Hand Hygiene Monitoring through Beam Steering and Combining Techniques	Isura Nirmal et al.	2503.16764	no
2025-03-19	A Comprehensive Survey on Architectural Advances in Deep CNNs: Challenges, Applications, and Emerging Research Directions	Saddam Hussain Khan et al.	2503.16546	no
2025-03-25	Deep learning framework for action prediction reveals multi-timescale locomotor control	Wei-Chen Wang et al.	2503.16340	no
2025-03-19	UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction	Shravan Nayak et al.	2503.15661	no
2025-03-19	Multi-Modal Gesture Recognition from Video and Surgical Tool Pose Information via Motion Invariants	Jumanh Atoum et al.	2503.15647	no
2025-03-21	Body-Hand Modality Expertized Networks with Cross-attention for Fine-grained Skeleton Action Recognition	Seungyeon Cho et al.	2503.14960	no
2025-03-19	DPFlow: Adaptive Optical Flow Estimation with a Dual-Pyramid Framework	Henrique Morimitsu et al.	2503.14880	yes
2025-03-15	Salient Temporal Encoding for Dynamic Scene Graph Generation	Zhihao Zhu et al.	2503.14524	no
2025-03-17	Towards Scalable Modeling of Compressed Videos for Efficient Action Recognition	Shristi Das Biswas et al.	2503.13724	no
2025-03-20	STEP: Simultaneous Tracking and Estimation of Pose for Animals and Humans	Shashikant Verma et al.	2503.13344	no
2025-03-17	Dense Policy: Bidirectional Autoregressive Learning of Actions	Yue Su et al.	2503.13217	no
2025-03-16	EgoEvGesture: Gesture Recognition Based on Egocentric Event Camera	Luming Wang et al.	2503.12419	yes
2025-03-16	ProbDiffFlow: An Efficient Learning-Free Framework for Probabilistic Single-Image Optical Flow Estimation	Mo Zhou et al.	2503.12348	no
2025-03-15	Real-Time Manipulation Action Recognition with a Factorized Graph Sequence Encoder	Enes Erdogan et al.	2503.12034	no
2025-03-14	Enhancing Hand Palm Motion Gesture Recognition by Eliminating Reference Frame Bias via Frame-Invariant Similarity Measures	Arno Verduyn et al.	2503.11352	no
2025-03-14	Aerial Vision-and-Language Navigation with Grid-based View Selection and Map Construction	Ganlong Zhao et al.	2503.11091	no
2025-03-14	VA-AR: Learning Velocity-Aware Action Representations with Mixture of Window Attention	Jiangning Wei et al.	2503.11004	no
2025-03-13	Spatial-Temporal Graph Diffusion Policy with Kinematic Modeling for Bimanual Robotic Manipulation	Qi Lv et al.	2503.10743	no
2025-03-11	Open-World Skill Discovery from Unsegmented Demonstrations	Jingwen Deng et al.	2503.10684	yes
2025-03-17	HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model	Jiaming Liu et al.	2503.10631	no
2025-03-13	SurgRAW: Multi-Agent Workflow with Chain-of-Thought Reasoning for Surgical Intelligence	Chang Han Low et al.	2503.10265	no
2025-03-12	A Hybrid Neural Network with Smart Skip Connections for High-Precision, Low-Latency EMG-Based Hand Gesture Recognition	Hafsa Wazir et al.	2503.09041	no
2025-03-12	Unified Locomotion Transformer with Simultaneous Sim-to-Real Transfer for Quadrupeds	Dikai Liu et al.	2503.08997	no
2025-03-11	PromptGAR: Flexible Promptive Group Activity Recognition	Zhangyu Jin et al.	2503.08933	no
2025-03-11	MetaFold: Language-Guided Multi-Category Garment Folding Framework via Trajectory Generation and Foundation Model	Haonan Chen et al.	2503.08372	no
2025-03-11	A Survey on Wi-Fi Sensing Generalizability: Taxonomy, Techniques, Datasets, and Future Research Prospects	Fei Wang et al.	2503.08008	no
2025-03-10	Helios 2.0: A Robust, Ultra-Low Power Gesture Recognition System Optimised for Event-Sensor based Wearables	Prarthana Bhattacharyya et al.	2503.07825	no
2025-03-10	Elderly Activity Recognition in the Wild: Results from the EAR Challenge	Anh-Kiet Duong et al.	2503.07821	yes
2025-03-09	TimeLoc: A Unified End-to-End Framework for Precise Timestamp Localization in Long Videos	Chen-Lin Zhang et al.	2503.06526	yes
2025-03-09	SGA-INTERACT: A 3D Skeleton-based Benchmark for Group Activity Understanding in Modern Basketball Tactic	Yuchen Yang et al.	2503.06522	yes
2025-03-07	MPTSNet: Integrating Multiscale Periodic Local Patterns and Global Dependencies for Multivariate Time Series Classification	Yang Mu et al.	2503.05582	no
2025-03-07	Multi-Grained Feature Pruning for Video-Based Human Pose Estimation	Zhigang Wang et al.	2503.05365	no
2025-03-06	Maestro: A 302 GFLOPS/W and 19.8GFLOPS RISC-V Vector-Tensor Architecture for Wearable Ultrasound Edge Computing	Mattia Sinigaglia et al.	2503.04581	no
2025-03-06	Gate-Shift-Pose: Enhancing Action Recognition in Sports with Skeleton Information	Edoardo Bianchi et al.	2503.04470	yes
2025-03-06	Spatial-Temporal Perception with Causal Inference for Naturalistic Driving Action Recognition	Qing Chang et al.	2503.04078	no
2025-03-06	Social Gesture Recognition in spHRI: Leveraging Fabric-Based Tactile Sensing on Humanoid Robots	Dakarai Crowder et al.	2503.03234	no
2025-03-04	Semi-Supervised Audio-Visual Video Action Recognition with Audio Source Localization Guided Mixup	Seokun Kang et al.	2503.02284	no
2025-03-04	FABG : End-to-end Imitation Learning for Embodied Affective Human-Robot Interaction	Yanghai Zhang et al.	2503.01363	no
2025-03-04	An Efficient 3D Convolutional Neural Network with Channel-wise, Spatial-grouped, and Temporal Convolutions	Zhe Wang et al.	2503.00796	no
2025-03-02	One-Shot Gesture Recognition for Underwater Diver-To-Robot Communication	Rishikesh Joshi et al.	2503.00676	no
2025-03-04	Unified Video Action Model	Shuang Li et al.	2503.00200	yes
