Action Recognition - 2025-11

| Publish Date | Title | Authors | PDF | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2025-11-29 | Developing Fairness-Aware Task Decomposition to Improve Equity in Post-Spinal Fusion Complication Prediction | Yining Yuan et.al. | 2512.00598 | translate | read | null |
| 2025-11-29 | Integrating Skeleton Based Representations for Robust Yoga Pose Classification Using Deep Learning Models | Mohammed Mohiuddin et.al. | 2512.00572 | translate | read | null |
| 2025-11-28 | LatBot: Distilling Universal Latent Actions for Vision-Language-Action Models | Zuolei Li et.al. | 2511.23034 | translate | read | null |
| 2025-11-27 | SkeletonAgent: An Agentic Interaction Framework for Skeleton-based Action Recognition | Hongda Liu et.al. | 2511.22433 | translate | read | null |
| 2025-11-27 | HandyLabel: Towards Post-Processing to Real-Time Annotation Using Skeleton Based Hand Gesture Recognition | Sachin Kumar Singh et.al. | 2511.22337 | translate | read | null |
| 2025-11-26 | Attention-Guided Patch-Wise Sparse Adversarial Attacks on Vision-Language-Action Models | Naifu Zhang et.al. | 2511.21663 | translate | read | null |
| 2025-11-26 | Active Learning for GCN-based Action Recognition | Hichem Sahbi et.al. | 2511.21625 | translate | read | null |
| 2025-11-26 | Towards an Effective Action-Region Tracking Framework for Fine-grained Video Action Recognition | Baoli Sun et.al. | 2511.21202 | translate | read | null |
| 2025-11-24 | Scale What Counts, Mask What Matters: Evaluating Foundation Models for Zero-Shot Cross-Domain Wi-Fi Sensing | Cheng Jiang et.al. | 2511.18792 | translate | read | null |
| 2025-11-22 | ActDistill: General Action-Guided Self-Derived Distillation for Efficient Vision-Language-Action Models | Wencheng Ye et.al. | 2511.18082 | translate | read | null |
| 2025-11-21 | Label-Efficient Skeleton-based Recognition with Stable-Invertible Graph Convolutional Networks | Hichem Sahbi et.al. | 2511.17345 | translate | read | null |
| 2025-11-21 | Social-Media Based Personas Challenge: Hybrid Prediction of Common and Rare User Actions on Bluesky | Benjamin White et.al. | 2511.17241 | translate | read | null |
| 2025-11-21 | VLA-4D: Embedding 4D Awareness into Vision-Language-Action Models for SpatioTemporally Coherent Robotic Manipulation | Hanyu Zhou et.al. | 2511.17199 | translate | read | null |
| 2025-11-21 | Progress-Think: Semantic Progress Reasoning for Vision-Language Navigation | Shuo Wang et.al. | 2511.17097 | translate | read | null |
| 2025-11-21 | H-GAR: A Hierarchical Interaction Framework via Goal-Driven Observation-Action Refinement for Robotic Manipulation | Yijie Zhu et.al. | 2511.17079 | translate | read | null |
| 2025-11-21 | The Wireless Charger as a Gesture Sensor: A Novel Approach to Ubiquitous Interaction | Weiyi Wang et.al. | 2511.16989 | translate | read | null |
| 2025-11-21 | Parts-Mamba: Augmenting Joint Context with Part-Level Scanning for Occluded Human Skeleton | Tianyi Shen et.al. | 2511.16860 | translate | read | null |
| 2025-11-20 | BoxingVI: A Multi-Modal Benchmark for Boxing Action Recognition and Localization | Rahul Kumar et.al. | 2511.16524 | translate | read | null |
| 2025-11-20 | FOOTPASS: A Multi-Modal Multi-Agent Tactical Context Dataset for Play-by-Play Action Spotting in Soccer Broadcast Videos | Jeremie Ochin et.al. | 2511.16183 | translate | read | null |
| 2025-11-19 | Scriboora: Rethinking Human Pose Forecasting | Daniel Bermuth et.al. | 2511.15565 | translate | read | null |
| 2025-11-18 | DoGCLR: Dominance-Game Contrastive Learning Network for Skeleton-Based Action Recognition | Yanshan Li et.al. | 2511.14179 | translate | read | null |
| 2025-11-18 | A Machine Learning-Based Multimodal Framework for Wearable Sensor-Based Archery Action Recognition and Stress Estimation | Xianghe Liu et.al. | 2511.14057 | translate | read | null |
| 2025-11-17 | Computer Vision based group activity detection and action spotting | Narthana Sivalingam et.al. | 2511.13315 | translate | read | null |
| 2025-11-17 | MGCA-Net: Multi-Grained Category-Aware Network for Open-Vocabulary Temporal Action Localization | Zhenying Fang et.al. | 2511.13039 | translate | read | null |
| 2025-11-17 | View-aware Cross-modal Distillation for Multi-view Action Recognition | Trung Thanh Nguyen et.al. | 2511.12870 | translate | read | null |
| 2025-11-16 | Pixels or Positions? Benchmarking Modalities in Group Activity Recognition | Drishya Karki et.al. | 2511.12606 | translate | read | null |
| 2025-11-15 | Locomotion in CAVE: Enhancing Immersion through Full-Body Motion | Xiaohui Li et.al. | 2511.12251 | translate | read | null |
| 2025-11-14 | Rethinking Progression of Memory State in Robotic Manipulation: An Object-Centric Perspective | Nhat Chung et.al. | 2511.11478 | translate | read | null |
| 2025-11-13 | SUGAR: Learning Skeleton Representation with Visual-Motion Knowledge for Action Recognition | Qilang Ye et.al. | 2511.10091 | translate | read | null |
| 2025-11-12 | Revisiting Cross-Architecture Distillation: Adaptive Dual-Teacher Transfer for Lightweight Video Models | Ying Peng et.al. | 2511.09469 | translate | read | null |
| 2025-11-12 | Learning by Neighbor-Aware Semantics, Deciding by Open-form Flows: Towards Robust Zero-Shot Skeleton Action Recognition | Yang Chen et.al. | 2511.09388 | translate | read | null |
| 2025-11-12 | PressTrack-HMR: Pressure-Based Top-Down Multi-Person Global Human Mesh Recovery | Jiayue Yuan et.al. | 2511.09147 | translate | read | null |
| 2025-11-11 | Privacy Beyond Pixels: Latent Anonymization for Privacy-Preserving Video Understanding | Joseph Fioresi et.al. | 2511.08666 | translate | read | null |
| 2025-11-09 | Learning Topology-Driven Multi-Subspace Fusion for Grassmannian Deep Network | Xuan Yu et.al. | 2511.08628 | translate | read | null |
| 2025-11-05 | The chanciness of time | John M. Myers et.al. | 2511.08611 | translate | read | null |
| 2025-11-11 | SASG-DA: Sparse-Aware Semantic-Guided Diffusion Augmentation For Myoelectric Gesture Recognition | Chen Liu et.al. | 2511.08344 | translate | read | null |
| 2025-11-10 | Achieving Effective Virtual Reality Interactions via Acoustic Gesture Recognition based on Large Language Models | Xijie Zhang et.al. | 2511.07085 | translate | read | null |
| 2025-11-10 | Otter: Mitigating Background Distractions of Wide-Angle Few-Shot Action Recognition with Enhanced RWKV | Wenbo Huang et.al. | 2511.06741 | translate | read | null |
| 2025-11-09 | Learning-Based Robust Bayesian Persuasion with Conformal Prediction Guarantees | Heeseung Bang et.al. | 2511.06223 | translate | read | null |
| 2025-11-06 | Grounding Foundational Vision Models with 3D Human Poses for Robust Action Recognition | Nicholas Babey et.al. | 2511.05622 | translate | read | null |
| 2025-11-06 | Pose-Aware Multi-Level Motion Parsing for Action Quality Assessment | Shuaikang Zhu et.al. | 2511.05611 | translate | read | null |
| 2025-11-07 | Accurate online action and gesture recognition system using detectors and Deep SPD Siamese Networks | Mohamed Sanim Akremi et.al. | 2511.05250 | translate | read | null |
| 2025-11-06 | Unified Multimodal Diffusion Forcing for Forceful Manipulation | Zixuan Huang et.al. | 2511.04812 | translate | read | null |
| 2025-11-06 | X-Diffusion: Training Diffusion Policies on Cross-Embodiment Human Demonstrations | Maximus A. Pace et.al. | 2511.04671 | translate | read | null |
| 2025-11-06 | Evo-1: Lightweight Vision-Language-Action Model with Preserved Semantic Alignment | Tao Lin et.al. | 2511.04555 | translate | read | null |
| 2025-11-06 | Alternative Fairness and Accuracy Optimization in Criminal Justice | Shaolong Wu et.al. | 2511.04505 | translate | read | null |
| 2025-11-06 | ThaiOCRBench: A Task-Diverse Benchmark for Vision-Language Understanding in Thai | Surapon Nonesung et.al. | 2511.04479 | translate | read | null |
| 2025-11-06 | Temporal Action Selection for Action Chunking | Yueyang Weng et.al. | 2511.04421 | translate | read | null |
| 2025-11-06 | ForeRobo: Unlocking Infinite Simulation Data for 3D Goal-driven Robotic Manipulation | Dexin wang et.al. | 2511.04381 | translate | read | null |
| 2025-11-06 | GraSP-VLA: Graph-based Symbolic Action Representation for Long-Horizon Planning with VLA Policies | Maëlic Neau et.al. | 2511.04357 | translate | read | null |
| 2025-11-06 | RCMCL: A Unified Contrastive Learning Framework for Robust Multi-Modal (RGB-D, Skeleton, Point Cloud) Action Understanding | Hasan Akgul et.al. | 2511.04351 | translate | read | null |
| 2025-11-06 | GUI-360$^\circ$: A Comprehensive Dataset and Benchmark for Computer-Using Agents | Jian Mu et.al. | 2511.04307 | translate | read | null |
| 2025-11-06 | Expectation-Realization Interpretation of Quantum Superposition | Yanting Wang et.al. | 2511.04154 | translate | read | null |
| 2025-11-06 | Learning from Online Videos at Inference Time for Computer-Use Agents | Yujian Liu et.al. | 2511.04137 | translate | read | null |
| 2025-11-06 | Unified Effective Field Theory for Nonlinear and Quantum Optics | Xiaochen Liu et.al. | 2511.04118 | translate | read | null |
| 2025-11-06 | Simple 3D Pose Features Support Human and Machine Social Scene Understanding | Wenshuo Qin et.al. | 2511.03988 | translate | read | null |
| 2025-11-06 | Use of Continuous Glucose Monitoring with Machine Learning to Identify Metabolic Subphenotypes and Inform Precision Lifestyle Changes | Ahmed A. Metwally et.al. | 2511.03986 | translate | read | null |
| 2025-11-06 | Temporal Zoom Networks: Distance Regression and Continuous Depth for Efficient Action Localization | Ibne Farabi Shihab et.al. | 2511.03943 | translate | read | null |
| 2025-11-05 | Enhancing Q-Value Updates in Deep Q-Learning via Successor-State Prediction | Lipeng Zu et.al. | 2511.03836 | translate | read | null |
| 2025-11-05 | Krylov Complexity Meets Confinement | Xuhao Jiang et.al. | 2511.03783 | translate | read | null |
| 2025-11-05 | Disentangled Concepts Speak Louder Than Words: Explainable Video Action Recognition | Jongseo Lee et.al. | 2511.03725 | translate | read | null |
| 2025-11-05 | A Lightweight 3D-CNN for Event-Based Human Action Recognition with Privacy-Preserving Potential | Mehdi Sefidgar Dilmaghani et.al. | 2511.03665 | translate | read | null |
| 2025-11-05 | LiveTradeBench: Seeking Real-World Alpha with Large Language Models | Haofei Yu et.al. | 2511.03628 | translate | read | link |
| 2025-11-05 | Learning Communication Skills in Multi-task Multi-agent Deep Reinforcement Learning | Changxi Zhu et.al. | 2511.03348 | translate | read | null |
| 2025-11-05 | Multi-Object Tracking Retrieval with LLaVA-Video: A Training-Free Solution to MOT25-StAG Challenge | Yi Yang et.al. | 2511.03332 | translate | read | null |
| 2025-11-04 | WorldPlanner: Monte Carlo Tree Search and MPC with Action-Conditioned Visual World Models | R. Khorrambakht et.al. | 2511.03077 | translate | read | null |
| 2025-11-04 | The Curved Spacetime of Transformer Architectures | Riccardo Di Sipio et.al. | 2511.03060 | translate | read | null |
| 2025-11-04 | VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation | Kevin Qinghong Lin et.al. | 2511.02778 | translate | read | link |
| 2025-11-04 | Agentic World Modeling for 6G: Near-Real-Time Generative State-Space Reasoning | Farhad Rezazadeh et.al. | 2511.02748 | translate | read | null |
| 2025-11-04 | Radio and Optical Flares on the dMe Flare Star EV Lac | Rachel A. Osten et.al. | 2511.02719 | translate | read | null |
| 2025-11-04 | MVAFormer: RGB-based Multi-View Spatio-Temporal Action Recognition with Transformer | Taiga Yamane et.al. | 2511.02473 | translate | read | null |
| 2025-11-04 | From the Laboratory to Real-World Application: Evaluating Zero-Shot Scene Interpretation on Edge Devices for Mobile Robotics | Nicolas Schuler et.al. | 2511.02427 | translate | read | null |
| 2025-11-03 | Euler-Heisenberg action for fermions coupled to gauge and axial vectors: Hessian diagonalization, sector classification, and applications | Lucas Pereira de Souza et.al. | 2511.02118 | translate | read | null |
| 2025-11-03 | Neural dynamics of cognitive control: Current tensions and future promise | Dale Zhou et.al. | 2511.02063 | translate | read | null |
| 2025-11-03 | Path-Coordinated Continual Learning with Neural Tangent Kernel-Justified Plasticity: A Theoretical Framework with Near State-of-the-Art Performance | Rathin Chandra Shit et.al. | 2511.02025 | translate | read | null |
| 2025-11-03 | Unified Diffusion VLA: Vision-Language-Action Model via Joint Discrete Denoising Diffusion Process | Jiayi Chen et.al. | 2511.01718 | translate | read | null |
| 2025-11-03 | OmniVLA: Physically-Grounded Multimodal VLA with Unified Multi-Sensor Perception for Robotic Manipulation | Heyu Guo et.al. | 2511.01210 | translate | read | null |
| 2025-11-02 | Rhythm in the Air: Vision-based Real-Time Music Generation through Gestures | Barathi Subramanian et.al. | 2511.00793 | translate | read | null |

(<a href=../Action_Recognition.md>back to Action Recognition</a>)