Action Recognition - 2025-11 | Paper Arxiv Daily

Publish Date	Title	Authors	PDF	Translate	Read	Code
2025-11-29	Developing Fairness-Aware Task Decomposition to Improve Equity in Post-Spinal Fusion Complication Prediction	Yining Yuan et.al.	2512.00598	translate	read	null
2025-11-29	Integrating Skeleton Based Representations for Robust Yoga Pose Classification Using Deep Learning Models	Mohammed Mohiuddin et.al.	2512.00572	translate	read	null
2025-11-28	LatBot: Distilling Universal Latent Actions for Vision-Language-Action Models	Zuolei Li et.al.	2511.23034	translate	read	null
2025-11-27	SkeletonAgent: An Agentic Interaction Framework for Skeleton-based Action Recognition	Hongda Liu et.al.	2511.22433	translate	read	null
2025-11-27	HandyLabel: Towards Post-Processing to Real-Time Annotation Using Skeleton Based Hand Gesture Recognition	Sachin Kumar Singh et.al.	2511.22337	translate	read	null
2025-11-26	Attention-Guided Patch-Wise Sparse Adversarial Attacks on Vision-Language-Action Models	Naifu Zhang et.al.	2511.21663	translate	read	null
2025-11-26	Active Learning for GCN-based Action Recognition	Hichem Sahbi et.al.	2511.21625	translate	read	null
2025-11-26	Towards an Effective Action-Region Tracking Framework for Fine-grained Video Action Recognition	Baoli Sun et.al.	2511.21202	translate	read	null
2025-11-24	Scale What Counts, Mask What Matters: Evaluating Foundation Models for Zero-Shot Cross-Domain Wi-Fi Sensing	Cheng Jiang et.al.	2511.18792	translate	read	null
2025-11-22	ActDistill: General Action-Guided Self-Derived Distillation for Efficient Vision-Language-Action Models	Wencheng Ye et.al.	2511.18082	translate	read	null
2025-11-21	Label-Efficient Skeleton-based Recognition with Stable-Invertible Graph Convolutional Networks	Hichem Sahbi et.al.	2511.17345	translate	read	null
2025-11-21	Social-Media Based Personas Challenge: Hybrid Prediction of Common and Rare User Actions on Bluesky	Benjamin White et.al.	2511.17241	translate	read	null
2025-11-21	VLA-4D: Embedding 4D Awareness into Vision-Language-Action Models for SpatioTemporally Coherent Robotic Manipulation	Hanyu Zhou et.al.	2511.17199	translate	read	null
2025-11-21	Progress-Think: Semantic Progress Reasoning for Vision-Language Navigation	Shuo Wang et.al.	2511.17097	translate	read	null
2025-11-21	H-GAR: A Hierarchical Interaction Framework via Goal-Driven Observation-Action Refinement for Robotic Manipulation	Yijie Zhu et.al.	2511.17079	translate	read	null
2025-11-21	The Wireless Charger as a Gesture Sensor: A Novel Approach to Ubiquitous Interaction	Weiyi Wang et.al.	2511.16989	translate	read	null
2025-11-21	Parts-Mamba: Augmenting Joint Context with Part-Level Scanning for Occluded Human Skeleton	Tianyi Shen et.al.	2511.16860	translate	read	null
2025-11-20	BoxingVI: A Multi-Modal Benchmark for Boxing Action Recognition and Localization	Rahul Kumar et.al.	2511.16524	translate	read	null
2025-11-20	FOOTPASS: A Multi-Modal Multi-Agent Tactical Context Dataset for Play-by-Play Action Spotting in Soccer Broadcast Videos	Jeremie Ochin et.al.	2511.16183	translate	read	null
2025-11-19	Scriboora: Rethinking Human Pose Forecasting	Daniel Bermuth et.al.	2511.15565	translate	read	null
2025-11-18	DoGCLR: Dominance-Game Contrastive Learning Network for Skeleton-Based Action Recognition	Yanshan Li et.al.	2511.14179	translate	read	null
2025-11-18	A Machine Learning-Based Multimodal Framework for Wearable Sensor-Based Archery Action Recognition and Stress Estimation	Xianghe Liu et.al.	2511.14057	translate	read	null
2025-11-17	Computer Vision based group activity detection and action spotting	Narthana Sivalingam et.al.	2511.13315	translate	read	null
2025-11-17	MGCA-Net: Multi-Grained Category-Aware Network for Open-Vocabulary Temporal Action Localization	Zhenying Fang et.al.	2511.13039	translate	read	null
2025-11-17	View-aware Cross-modal Distillation for Multi-view Action Recognition	Trung Thanh Nguyen et.al.	2511.12870	translate	read	null
2025-11-16	Pixels or Positions? Benchmarking Modalities in Group Activity Recognition	Drishya Karki et.al.	2511.12606	translate	read	null
2025-11-15	Locomotion in CAVE: Enhancing Immersion through Full-Body Motion	Xiaohui Li et.al.	2511.12251	translate	read	null
2025-11-14	Rethinking Progression of Memory State in Robotic Manipulation: An Object-Centric Perspective	Nhat Chung et.al.	2511.11478	translate	read	null
2025-11-13	SUGAR: Learning Skeleton Representation with Visual-Motion Knowledge for Action Recognition	Qilang Ye et.al.	2511.10091	translate	read	null
2025-11-12	Revisiting Cross-Architecture Distillation: Adaptive Dual-Teacher Transfer for Lightweight Video Models	Ying Peng et.al.	2511.09469	translate	read	null
2025-11-12	Learning by Neighbor-Aware Semantics, Deciding by Open-form Flows: Towards Robust Zero-Shot Skeleton Action Recognition	Yang Chen et.al.	2511.09388	translate	read	null
2025-11-12	PressTrack-HMR: Pressure-Based Top-Down Multi-Person Global Human Mesh Recovery	Jiayue Yuan et.al.	2511.09147	translate	read	null
2025-11-11	Privacy Beyond Pixels: Latent Anonymization for Privacy-Preserving Video Understanding	Joseph Fioresi et.al.	2511.08666	translate	read	null
2025-11-09	Learning Topology-Driven Multi-Subspace Fusion for Grassmannian Deep Network	Xuan Yu et.al.	2511.08628	translate	read	null
2025-11-05	The chanciness of time	John M. Myers et.al.	2511.08611	translate	read	null
2025-11-11	SASG-DA: Sparse-Aware Semantic-Guided Diffusion Augmentation For Myoelectric Gesture Recognition	Chen Liu et.al.	2511.08344	translate	read	null
2025-11-10	Achieving Effective Virtual Reality Interactions via Acoustic Gesture Recognition based on Large Language Models	Xijie Zhang et.al.	2511.07085	translate	read	null
2025-11-10	Otter: Mitigating Background Distractions of Wide-Angle Few-Shot Action Recognition with Enhanced RWKV	Wenbo Huang et.al.	2511.06741	translate	read	null
2025-11-09	Learning-Based Robust Bayesian Persuasion with Conformal Prediction Guarantees	Heeseung Bang et.al.	2511.06223	translate	read	null
2025-11-06	Grounding Foundational Vision Models with 3D Human Poses for Robust Action Recognition	Nicholas Babey et.al.	2511.05622	translate	read	null
2025-11-06	Pose-Aware Multi-Level Motion Parsing for Action Quality Assessment	Shuaikang Zhu et.al.	2511.05611	translate	read	null
2025-11-07	Accurate online action and gesture recognition system using detectors and Deep SPD Siamese Networks	Mohamed Sanim Akremi et.al.	2511.05250	translate	read	null
2025-11-06	Unified Multimodal Diffusion Forcing for Forceful Manipulation	Zixuan Huang et.al.	2511.04812	translate	read	null
2025-11-06	X-Diffusion: Training Diffusion Policies on Cross-Embodiment Human Demonstrations	Maximus A. Pace et.al.	2511.04671	translate	read	null
2025-11-06	Evo-1: Lightweight Vision-Language-Action Model with Preserved Semantic Alignment	Tao Lin et.al.	2511.04555	translate	read	null
2025-11-06	Alternative Fairness and Accuracy Optimization in Criminal Justice	Shaolong Wu et.al.	2511.04505	translate	read	null
2025-11-06	ThaiOCRBench: A Task-Diverse Benchmark for Vision-Language Understanding in Thai	Surapon Nonesung et.al.	2511.04479	translate	read	null
2025-11-06	Temporal Action Selection for Action Chunking	Yueyang Weng et.al.	2511.04421	translate	read	null
2025-11-06	ForeRobo: Unlocking Infinite Simulation Data for 3D Goal-driven Robotic Manipulation	Dexin Wang et.al.	2511.04381	translate	read	null
2025-11-06	GraSP-VLA: Graph-based Symbolic Action Representation for Long-Horizon Planning with VLA Policies	Maëlic Neau et.al.	2511.04357	translate	read	null
2025-11-06	RCMCL: A Unified Contrastive Learning Framework for Robust Multi-Modal (RGB-D, Skeleton, Point Cloud) Action Understanding	Hasan Akgul et.al.	2511.04351	translate	read	null
2025-11-06	GUI-360$^\circ$: A Comprehensive Dataset and Benchmark for Computer-Using Agents	Jian Mu et.al.	2511.04307	translate	read	null
2025-11-06	Expectation-Realization Interpretation of Quantum Superposition	Yanting Wang et.al.	2511.04154	translate	read	null
2025-11-06	Learning from Online Videos at Inference Time for Computer-Use Agents	Yujian Liu et.al.	2511.04137	translate	read	null
2025-11-06	Unified Effective Field Theory for Nonlinear and Quantum Optics	Xiaochen Liu et.al.	2511.04118	translate	read	null
2025-11-06	Simple 3D Pose Features Support Human and Machine Social Scene Understanding	Wenshuo Qin et.al.	2511.03988	translate	read	null
2025-11-06	Use of Continuous Glucose Monitoring with Machine Learning to Identify Metabolic Subphenotypes and Inform Precision Lifestyle Changes	Ahmed A. Metwally et.al.	2511.03986	translate	read	null
2025-11-06	Temporal Zoom Networks: Distance Regression and Continuous Depth for Efficient Action Localization	Ibne Farabi Shihab et.al.	2511.03943	translate	read	null
2025-11-05	Enhancing Q-Value Updates in Deep Q-Learning via Successor-State Prediction	Lipeng Zu et.al.	2511.03836	translate	read	null
2025-11-05	Krylov Complexity Meets Confinement	Xuhao Jiang et.al.	2511.03783	translate	read	null
2025-11-05	Disentangled Concepts Speak Louder Than Words: Explainable Video Action Recognition	Jongseo Lee et.al.	2511.03725	translate	read	null
2025-11-05	A Lightweight 3D-CNN for Event-Based Human Action Recognition with Privacy-Preserving Potential	Mehdi Sefidgar Dilmaghani et.al.	2511.03665	translate	read	null
2025-11-05	LiveTradeBench: Seeking Real-World Alpha with Large Language Models	Haofei Yu et.al.	2511.03628	translate	read	link
2025-11-05	Learning Communication Skills in Multi-task Multi-agent Deep Reinforcement Learning	Changxi Zhu et.al.	2511.03348	translate	read	null
2025-11-05	Multi-Object Tracking Retrieval with LLaVA-Video: A Training-Free Solution to MOT25-StAG Challenge	Yi Yang et.al.	2511.03332	translate	read	null
2025-11-04	WorldPlanner: Monte Carlo Tree Search and MPC with Action-Conditioned Visual World Models	R. Khorrambakht et.al.	2511.03077	translate	read	null
2025-11-04	The Curved Spacetime of Transformer Architectures	Riccardo Di Sipio et.al.	2511.03060	translate	read	null
2025-11-04	VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation	Kevin Qinghong Lin et.al.	2511.02778	translate	read	link
2025-11-04	Agentic World Modeling for 6G: Near-Real-Time Generative State-Space Reasoning	Farhad Rezazadeh et.al.	2511.02748	translate	read	null
2025-11-04	Radio and Optical Flares on the dMe Flare Star EV Lac	Rachel A. Osten et.al.	2511.02719	translate	read	null
2025-11-04	MVAFormer: RGB-based Multi-View Spatio-Temporal Action Recognition with Transformer	Taiga Yamane et.al.	2511.02473	translate	read	null
2025-11-04	From the Laboratory to Real-World Application: Evaluating Zero-Shot Scene Interpretation on Edge Devices for Mobile Robotics	Nicolas Schuler et.al.	2511.02427	translate	read	null
2025-11-03	Euler-Heisenberg action for fermions coupled to gauge and axial vectors: Hessian diagonalization, sector classification, and applications	Lucas Pereira de Souza et.al.	2511.02118	translate	read	null
2025-11-03	Neural dynamics of cognitive control: Current tensions and future promise	Dale Zhou et.al.	2511.02063	translate	read	null
2025-11-03	Path-Coordinated Continual Learning with Neural Tangent Kernel-Justified Plasticity: A Theoretical Framework with Near State-of-the-Art Performance	Rathin Chandra Shit et.al.	2511.02025	translate	read	null
2025-11-03	Unified Diffusion VLA: Vision-Language-Action Model via Joint Discrete Denoising Diffusion Process	Jiayi Chen et.al.	2511.01718	translate	read	null
2025-11-03	OmniVLA: Physically-Grounded Multimodal VLA with Unified Multi-Sensor Perception for Robotic Manipulation	Heyu Guo et.al.	2511.01210	translate	read	null
2025-11-02	Rhythm in the Air: Vision-based Real-Time Music Generation through Gestures	Barathi Subramanian et.al.	2511.00793	translate	read	null
