Action Recognition

Publish Date Title Authors PDF Code
2025-12-18 OMG-Bench: A New Challenging Benchmark for Skeleton-based Online Micro Hand Gesture Recognition Haochen Chang et.al. 2512.16727 null
2025-12-18 Skeleton-Snippet Contrastive Learning with Multiscale Feature Fusion for Action Localization Qiushuo Cheng et.al. 2512.16504 null
2025-12-06 Smart Surveillance: Identifying IoT Device Behaviours using ML-Powered Traffic Analysis Reza Ryan et.al. 2512.13709 null
2025-12-15 Recurrent Video Masked Autoencoders Daniel Zoran et.al. 2512.13684 null
2025-12-14 StegaVAR: Privacy-Preserving Video Action Recognition via Steganographic Domain Analysis Lixin Chen et.al. 2512.12586 null
2025-12-13 From Human Intention to Action Prediction: A Comprehensive Benchmark for Intention-driven End-to-End Autonomous Driving Huan Zheng et.al. 2512.12302 null
2025-12-12 DynaPURLS: Dynamic Refinement of Part-aware Representations for Skeleton-based Zero-Shot Action Recognition Jingmin Zhu et.al. 2512.11941 null
2025-12-05 Explainable Adversarial-Robust Vision-Language-Action Model for Robotic Manipulation Ju-Young Kim et.al. 2512.11865 null
2025-12-12 TSkel-Mamba: Temporal Dynamic Modeling via State Space Model for Human Skeleton-based Action Recognition Yanan Liu et.al. 2512.11503 null
2025-12-12 Boosting Skeleton-based Zero-Shot Action Recognition with Training-Free Test-Time Adaptation Jingmin Zhu et.al. 2512.11458 null
2025-12-12 Task-Specific Distance Correlation Matching for Few-Shot Action Recognition Fei Long et.al. 2512.11340 null
2025-12-12 Breast-Rehab: A Postoperative Breast Cancer Rehabilitation Training Assessment System Based on Human Action Recognition Zikang Chen et.al. 2512.11245 null
2025-12-12 Multi-task Learning with Extended Temporal Shift Module for Temporal Action Localization Anh-Kiet Duong et.al. 2512.11189 null
2025-12-11 Deep Photonic Reservoir Computing with On-chip Nonlinearity Jinlong Xiang et.al. 2512.10626 null
2025-12-11 Lang2Motion: Bridging Language and Motion through Joint Embedding Spaces Bishoy Galoaa et.al. 2512.10617 null
2025-12-11 Lies We Can Trust: Quantifying Action Uncertainty with Inaccurate Stochastic Dynamics through Conformalized Nonholonomic Lie Groups Luís Marques et.al. 2512.10294 null
2025-12-10 GLaD: Geometric Latent Distillation for Vision-Language-Action Models Minghao Guo et.al. 2512.09619 null
2025-12-09 Neural Ordinary Differential Equations for Simulating Metabolic Pathway Dynamics from Time-Series Multiomics Data Udesh Habaraduwa et.al. 2512.08732 null
2025-12-09 Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning Huilin Xu et.al. 2512.08639 null
2025-12-09 Mind to Hand: Purposeful Robotic Control via Embodied Reasoning Peijun Tang et.al. 2512.08580 null
2025-12-08 A Comparative Study of EMG- and IMU-based Gesture Recognition at the Wrist and Forearm Soroush Baghernezhad et.al. 2512.07997 null
2025-12-08 Improving action classification with brain-inspired deep networks Aidas Aglinskas et.al. 2512.07729 null
2025-12-08 A Large-Scale Multimodal Dataset and Benchmarks for Human Activity Scene Understanding and Reasoning Siyang Jiang et.al. 2512.07136 null
2025-12-07 VideoVLA: Video Generators Can Be Generalizable Robot Manipulators Yichao Shen et.al. 2512.06963 null
2025-12-04 Towards Adaptive Fusion of Multimodal Deep Networks for Human Action Recognition Novanto Yudistira et.al. 2512.04943 null
2025-12-04 CIG-MAE: Cross-Modal Information-Guided Masked Autoencoder for Self-Supervised WiFi Sensing Gang Liu et.al. 2512.04723 null
2025-12-04 WiFi-based Cross-Domain Gesture Recognition Using Attention Mechanism Ruijing Liu et.al. 2512.04521 null
2025-12-03 Heatmap Pooling Network for Action Recognition from RGB Videos Mengyuan Liu et.al. 2512.03837 null
2025-12-02 SAM2Grasp: Resolve Multi-modal Grasping via Prompt-conditioned Temporal Action Prediction Shengkai Wu et.al. 2512.02609 null
2025-12-01 TBT-Former: Learning Temporal Boundary Distributions for Action Localization Thisara Rathnayaka et.al. 2512.01298 null
2025-11-29 Developing Fairness-Aware Task Decomposition to Improve Equity in Post-Spinal Fusion Complication Prediction Yining Yuan et.al. 2512.00598 null
2025-11-29 Integrating Skeleton Based Representations for Robust Yoga Pose Classification Using Deep Learning Models Mohammed Mohiuddin et.al. 2512.00572 null
2025-10-14 MOTION: ML-Assisted On-Device Low-Latency Motion Recognition Veeramani Pugazhenthi et.al. 2512.00008 null
2025-11-28 LatBot: Distilling Universal Latent Actions for Vision-Language-Action Models Zuolei Li et.al. 2511.23034 null
2025-11-27 SkeletonAgent: An Agentic Interaction Framework for Skeleton-based Action Recognition Hongda Liu et.al. 2511.22433 null
2025-11-27 HandyLabel: Towards Post-Processing to Real-Time Annotation Using Skeleton Based Hand Gesture Recognition Sachin Kumar Singh et.al. 2511.22337 null
2025-11-26 Attention-Guided Patch-Wise Sparse Adversarial Attacks on Vision-Language-Action Models Naifu Zhang et.al. 2511.21663 null
2025-11-26 Active Learning for GCN-based Action Recognition Hichem Sahbi et.al. 2511.21625 null
2025-11-26 Towards an Effective Action-Region Tracking Framework for Fine-grained Video Action Recognition Baoli Sun et.al. 2511.21202 null
2025-11-24 Scale What Counts, Mask What Matters: Evaluating Foundation Models for Zero-Shot Cross-Domain Wi-Fi Sensing Cheng Jiang et.al. 2511.18792 null
2025-11-22 ActDistill: General Action-Guided Self-Derived Distillation for Efficient Vision-Language-Action Models Wencheng Ye et.al. 2511.18082 null
2025-11-21 Label-Efficient Skeleton-based Recognition with Stable-Invertible Graph Convolutional Networks Hichem Sahbi et.al. 2511.17345 null
2025-11-21 Social-Media Based Personas Challenge: Hybrid Prediction of Common and Rare User Actions on Bluesky Benjamin White et.al. 2511.17241 null
2025-11-21 VLA-4D: Embedding 4D Awareness into Vision-Language-Action Models for SpatioTemporally Coherent Robotic Manipulation Hanyu Zhou et.al. 2511.17199 null
2025-11-21 Progress-Think: Semantic Progress Reasoning for Vision-Language Navigation Shuo Wang et.al. 2511.17097 null
2025-11-21 H-GAR: A Hierarchical Interaction Framework via Goal-Driven Observation-Action Refinement for Robotic Manipulation Yijie Zhu et.al. 2511.17079 null
2025-11-21 The Wireless Charger as a Gesture Sensor: A Novel Approach to Ubiquitous Interaction Weiyi Wang et.al. 2511.16989 null
2025-11-21 Parts-Mamba: Augmenting Joint Context with Part-Level Scanning for Occluded Human Skeleton Tianyi Shen et.al. 2511.16860 null
2025-11-20 BoxingVI: A Multi-Modal Benchmark for Boxing Action Recognition and Localization Rahul Kumar et.al. 2511.16524 null
2025-11-20 FOOTPASS: A Multi-Modal Multi-Agent Tactical Context Dataset for Play-by-Play Action Spotting in Soccer Broadcast Videos Jeremie Ochin et.al. 2511.16183 null
2025-11-19 Scriboora: Rethinking Human Pose Forecasting Daniel Bermuth et.al. 2511.15565 null
2025-11-18 DoGCLR: Dominance-Game Contrastive Learning Network for Skeleton-Based Action Recognition Yanshan Li et.al. 2511.14179 null
2025-11-18 A Machine Learning-Based Multimodal Framework for Wearable Sensor-Based Archery Action Recognition and Stress Estimation Xianghe Liu et.al. 2511.14057 null
2025-11-17 Computer Vision based group activity detection and action spotting Narthana Sivalingam et.al. 2511.13315 null
2025-11-17 MGCA-Net: Multi-Grained Category-Aware Network for Open-Vocabulary Temporal Action Localization Zhenying Fang et.al. 2511.13039 null
2025-11-17 View-aware Cross-modal Distillation for Multi-view Action Recognition Trung Thanh Nguyen et.al. 2511.12870 null
2025-11-16 Pixels or Positions? Benchmarking Modalities in Group Activity Recognition Drishya Karki et.al. 2511.12606 null
2025-11-15 Locomotion in CAVE: Enhancing Immersion through Full-Body Motion Xiaohui Li et.al. 2511.12251 null
2025-11-14 Rethinking Progression of Memory State in Robotic Manipulation: An Object-Centric Perspective Nhat Chung et.al. 2511.11478 null
2025-11-13 SUGAR: Learning Skeleton Representation with Visual-Motion Knowledge for Action Recognition Qilang Ye et.al. 2511.10091 null
2025-11-12 Revisiting Cross-Architecture Distillation: Adaptive Dual-Teacher Transfer for Lightweight Video Models Ying Peng et.al. 2511.09469 null
2025-11-12 Learning by Neighbor-Aware Semantics, Deciding by Open-form Flows: Towards Robust Zero-Shot Skeleton Action Recognition Yang Chen et.al. 2511.09388 null
2025-11-12 PressTrack-HMR: Pressure-Based Top-Down Multi-Person Global Human Mesh Recovery Jiayue Yuan et.al. 2511.09147 null
2025-11-11 Privacy Beyond Pixels: Latent Anonymization for Privacy-Preserving Video Understanding Joseph Fioresi et.al. 2511.08666 null
2025-11-09 Learning Topology-Driven Multi-Subspace Fusion for Grassmannian Deep Network Xuan Yu et.al. 2511.08628 null
2025-11-05 The chanciness of time John M. Myers et.al. 2511.08611 null
2025-11-11 SASG-DA: Sparse-Aware Semantic-Guided Diffusion Augmentation For Myoelectric Gesture Recognition Chen Liu et.al. 2511.08344 null
2025-11-10 Achieving Effective Virtual Reality Interactions via Acoustic Gesture Recognition based on Large Language Models Xijie Zhang et.al. 2511.07085 null
2025-11-10 Otter: Mitigating Background Distractions of Wide-Angle Few-Shot Action Recognition with Enhanced RWKV Wenbo Huang et.al. 2511.06741 null
2025-11-09 Learning-Based Robust Bayesian Persuasion with Conformal Prediction Guarantees Heeseung Bang et.al. 2511.06223 null
2025-11-06 Grounding Foundational Vision Models with 3D Human Poses for Robust Action Recognition Nicholas Babey et.al. 2511.05622 null
2025-11-06 Pose-Aware Multi-Level Motion Parsing for Action Quality Assessment Shuaikang Zhu et.al. 2511.05611 null
2025-11-07 Accurate online action and gesture recognition system using detectors and Deep SPD Siamese Networks Mohamed Sanim Akremi et.al. 2511.05250 null
2025-11-06 Unified Multimodal Diffusion Forcing for Forceful Manipulation Zixuan Huang et.al. 2511.04812 null
2025-11-06 X-Diffusion: Training Diffusion Policies on Cross-Embodiment Human Demonstrations Maximus A. Pace et.al. 2511.04671 null
2025-11-06 Evo-1: Lightweight Vision-Language-Action Model with Preserved Semantic Alignment Tao Lin et.al. 2511.04555 null
2025-11-06 Alternative Fairness and Accuracy Optimization in Criminal Justice Shaolong Wu et.al. 2511.04505 null
2025-11-06 ThaiOCRBench: A Task-Diverse Benchmark for Vision-Language Understanding in Thai Surapon Nonesung et.al. 2511.04479 null
2025-11-06 Temporal Action Selection for Action Chunking Yueyang Weng et.al. 2511.04421 null
2025-11-06 ForeRobo: Unlocking Infinite Simulation Data for 3D Goal-driven Robotic Manipulation Dexin Wang et.al. 2511.04381 null
2025-11-06 GraSP-VLA: Graph-based Symbolic Action Representation for Long-Horizon Planning with VLA Policies Maëlic Neau et.al. 2511.04357 null
2025-11-06 RCMCL: A Unified Contrastive Learning Framework for Robust Multi-Modal (RGB-D, Skeleton, Point Cloud) Action Understanding Hasan Akgul et.al. 2511.04351 null
2025-11-06 GUI-360$^\circ$: A Comprehensive Dataset and Benchmark for Computer-Using Agents Jian Mu et.al. 2511.04307 null
2025-11-06 Expectation-Realization Interpretation of Quantum Superposition Yanting Wang et.al. 2511.04154 null
2025-11-06 Learning from Online Videos at Inference Time for Computer-Use Agents Yujian Liu et.al. 2511.04137 null
2025-11-06 Unified Effective Field Theory for Nonlinear and Quantum Optics Xiaochen Liu et.al. 2511.04118 null
2025-11-06 Simple 3D Pose Features Support Human and Machine Social Scene Understanding Wenshuo Qin et.al. 2511.03988 null
2025-11-06 Use of Continuous Glucose Monitoring with Machine Learning to Identify Metabolic Subphenotypes and Inform Precision Lifestyle Changes Ahmed A. Metwally et.al. 2511.03986 null
2025-11-06 Temporal Zoom Networks: Distance Regression and Continuous Depth for Efficient Action Localization Ibne Farabi Shihab et.al. 2511.03943 null
2025-11-05 Enhancing Q-Value Updates in Deep Q-Learning via Successor-State Prediction Lipeng Zu et.al. 2511.03836 null
2025-11-05 Krylov Complexity Meets Confinement Xuhao Jiang et.al. 2511.03783 null
2025-11-05 Disentangled Concepts Speak Louder Than Words: Explainable Video Action Recognition Jongseo Lee et.al. 2511.03725 null
2025-11-05 A Lightweight 3D-CNN for Event-Based Human Action Recognition with Privacy-Preserving Potential Mehdi Sefidgar Dilmaghani et.al. 2511.03665 null
2025-11-05 LiveTradeBench: Seeking Real-World Alpha with Large Language Models Haofei Yu et.al. 2511.03628 link
2025-11-05 Learning Communication Skills in Multi-task Multi-agent Deep Reinforcement Learning Changxi Zhu et.al. 2511.03348 null
2025-11-05 Multi-Object Tracking Retrieval with LLaVA-Video: A Training-Free Solution to MOT25-StAG Challenge Yi Yang et.al. 2511.03332 null
2025-11-04 WorldPlanner: Monte Carlo Tree Search and MPC with Action-Conditioned Visual World Models R. Khorrambakht et.al. 2511.03077 null
2025-11-04 The Curved Spacetime of Transformer Architectures Riccardo Di Sipio et.al. 2511.03060 null
2025-11-04 VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation Kevin Qinghong Lin et.al. 2511.02778 link
2025-11-04 Agentic World Modeling for 6G: Near-Real-Time Generative State-Space Reasoning Farhad Rezazadeh et.al. 2511.02748 null
2025-11-04 Radio and Optical Flares on the dMe Flare Star EV Lac Rachel A. Osten et.al. 2511.02719 null
2025-11-04 MVAFormer: RGB-based Multi-View Spatio-Temporal Action Recognition with Transformer Taiga Yamane et.al. 2511.02473 null
2025-11-04 From the Laboratory to Real-World Application: Evaluating Zero-Shot Scene Interpretation on Edge Devices for Mobile Robotics Nicolas Schuler et.al. 2511.02427 null
2025-11-03 Euler-Heisenberg action for fermions coupled to gauge and axial vectors: Hessian diagonalization, sector classification, and applications Lucas Pereira de Souza et.al. 2511.02118 null
2025-11-03 Neural dynamics of cognitive control: Current tensions and future promise Dale Zhou et.al. 2511.02063 null
2025-11-03 Path-Coordinated Continual Learning with Neural Tangent Kernel-Justified Plasticity: A Theoretical Framework with Near State-of-the-Art Performance Rathin Chandra Shit et.al. 2511.02025 null
2025-11-03 Unified Diffusion VLA: Vision-Language-Action Model via Joint Discrete Denoising Diffusion Process Jiayi Chen et.al. 2511.01718 null
2025-11-03 OmniVLA: Physically-Grounded Multimodal VLA with Unified Multi-Sensor Perception for Robotic Manipulation Heyu Guo et.al. 2511.01210 null
2025-11-02 Rhythm in the Air: Vision-based Real-Time Music Generation through Gestures Barathi Subramanian et.al. 2511.00793 null
2025-10-30 Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail NVIDIA et.al. 2511.00088 null
2025-10-31 Enhancing Spatio-Temporal Zero-shot Action Recognition with Language-driven Description Attributes Yehna Kim et.al. 2510.27255 null
2025-10-31 GUI-Rise: Structured Reasoning and History Summarization for GUI Navigation Tao Liu et.al. 2510.27210 null
2025-10-30 Spiking Patches: Asynchronous, Sparse, and Efficient Tokens for Event Cameras Christoffer Koo Øhrstrøm et.al. 2510.26614 null
2025-10-29 Enhancing Temporal Understanding in Video-LLMs through Stacked Temporal Attention in Vision Encoders Ali Rasekh et.al. 2510.26027 null
2025-10-29 Informative Sample Selection Model for Skeleton-based Action Recognition with Limited Training Samples Zhigang Tu et.al. 2510.25345 null
2025-10-27 Evaluation of Vision-LLMs in Surveillance Video Pascal Benschop et.al. 2510.23190 null
2025-10-27 Enabling Vibration-Based Gesture Recognition on Everyday Furniture via Energy-Efficient FPGA Implementation of 1D Convolutional Networks Koki Shibata et.al. 2510.23156 null
2025-10-27 Neural Recording Power Optimization Through Machine Learning Guided Resolution Reconfiguration Aviral Pandey et.al. 2510.22924 null
2025-10-13 J-ORA: A Framework and Multimodal Dataset for Japanese Object Identification, Reference, Action Prediction in Robot Perception Jesse Atuhurra et.al. 2510.21761 link
2025-10-22 From Forecasting to Planning: Policy World Model for Collaborative State-Action Prediction Zhida Zhao et.al. 2510.19654 null
2025-10-22 Vision-Based Mistake Analysis in Procedural Activities: A Review of Advances and Challenges Konstantinos Bacharidis et.al. 2510.19292 null
2025-10-22 MobiAct: Efficient MAV Action Recognition Using MobileNetV4 with Contrastive Learning and Knowledge Distillation Zhang Nengbo et.al. 2510.19273 null
2025-10-22 See, Think, Act: Online Shopper Behavior Simulation with VLM Agents Yimeng Zhang et.al. 2510.19245 null
2025-10-21 UniHPR: Unified Human Pose Representation via Singular Value Contrastive Learning Zhongyu Jiang et.al. 2510.19078 null
2025-10-21 A Renaissance of Explicit Motion Information Mining from Transformers for Action Recognition Peiqin Zhuang et.al. 2510.18705 null
2025-10-21 Biomechanically consistent real-time action recognition for human-robot interaction Wanchen Li et.al. 2510.18373 null
2025-10-21 FST.ai 2.0: An Explainable AI Ecosystem for Fair, Fast, and Inclusive Decision-Making in Olympic and Paralympic Taekwondo Keivan Shariatmadar et.al. 2510.18193 null
2025-10-20 Muscle Anatomy-aware Geometric Deep Learning for sEMG-based Gesture Decoding Adyasha Dash et.al. 2510.17660 null
2025-10-18 MoS-VLA: A Vision-Language-Action Model with One-Shot Skill Adaptation Ruihan Zhao et.al. 2510.16617 null
2025-10-18 RefAtomNet++: Advancing Referring Atomic Video Action Recognition using Semantic Retrieval based Multi-Trajectory Mamba Kunyu Peng et.al. 2510.16444 null
2025-10-17 StretchySnake: Flexible SSM Training Unlocks Action Recognition Across Spatio-Temporal Scales Nyle Siddiqui et.al. 2510.16209 null
2025-10-17 MAVR-Net: Robust Multi-View Learning for MAV Action Recognition with Cross-View Attention Nengbo Zhang et.al. 2510.15448 null
2025-10-16 MoCom: Motion-based Inter-MAV Visual Communication Using Event Vision and Spiking Neural Networks Zhang Nengbo et.al. 2510.14770 null
2025-10-15 Generalizing WiFi Gesture Recognition via Large-Model-Aware Semantic Distillation and Alignment Feng-Qi Cui et.al. 2510.13390 null
2025-10-14 SVAG-Bench: A Large-Scale Benchmark for Multi-Instance Spatio-temporal Video Action Grounding Tanveer Hannan et.al. 2510.13016 null
2025-10-13 FOSSIL: Harnessing Feedback on Suboptimal Samples for Data-Efficient Generalisation with Imitation Learning for Embodied Vision-and-Language Tasks Sabrina McCallum et.al. 2510.11307 null
2025-10-13 Mixup Helps Understanding Multimodal Video Better Xiaoyu Ma et.al. 2510.10986 null
2025-10-12 MSF-Mamba: Motion-aware State Fusion Mamba for Efficient Micro-Gesture Recognition Deng Li et.al. 2510.10478 null
2025-10-11 Dejavu: Towards Experience Feedback Learning for Embodied Intelligence Shaokai Wu et.al. 2510.10181 null
2025-10-11 SyncLipMAE: Contrastive Masked Pretraining for Audio-Visual Talking-Face Representation Zeyu Ling et.al. 2510.10069 null
2025-10-10 Enhancing Diffusion Policy with Classifier-Free Guidance for Temporal Robotic Tasks Yuang Lu et.al. 2510.09786 null
2025-10-10 Diagnosing Shoulder Disorders Using Multimodal Large Language Models and Consumer-Grade Cameras Jindong Hong et.al. 2510.09230 null
2025-10-09 Video-STAR: Reinforcing Open-Vocabulary Action Recognition with Tools Zhenlong Yuan et.al. 2510.08480 null
2025-10-09 MMHOI: Modeling Complex 3D Multi-Human Multi-Object Interactions Kaen Kogashi et.al. 2510.07828 null
2025-10-07 Mitigating Surgical Data Imbalance with Dual-Prediction Video Diffusion Model Danush Kumar Venkatesh et.al. 2510.07345 null
2025-10-08 Customer-R1: Personalized Simulation of Human Behaviors via RL-based LLM Agent in Online Shopping Ziyi Wang et.al. 2510.07230 null
2025-10-08 TrackVLA++: Unleashing Reasoning and Memory Capabilities in VLA Models for Embodied Visual Tracking Jiahang Liu et.al. 2510.07134 null
2025-10-07 From Captions to Keyframes: KeyScore for Multimodal Frame Scoring and Video-Language Understanding Shih-Yao Lin et.al. 2510.06509 null
2025-10-07 Human Action Recognition from Point Clouds over Time James Dickens et.al. 2510.05506 null
2025-10-05 Speculative Actions: A Lossless Framework for Faster Agentic Systems Naimeng Ye et.al. 2510.04371 null
2025-10-04 Talking Tennis: Language Feedback from 3D Biomechanical Action Recognition Arushi Dashore et.al. 2510.03921 null
2025-10-04 MonitorVLM: A Vision Language Framework for Safety Violation Detection in Mining Operations Jiang Wu et.al. 2510.03666 null
2025-10-03 FocusAgent: Simple Yet Effective Ways of Trimming the Large Context of Web Agents Imene Kerboua et.al. 2510.03204 link
2025-09-27 $\texttt{BluePrint}$: A Social Media User Dataset for LLM Persona Evaluation and Training Aurélien Bück-Kaeffer et.al. 2510.02343 null
2025-10-02 Wearable and Ultra-Low-Power Fusion of EMG and A-Mode US for Hand-Wrist Kinematic Tracking Giusy Spacone et.al. 2510.02000 null
2025-10-02 Contrastive Representation Regularization for Vision-Language-Action Models Taeyoung Kim et.al. 2510.01711 null
2025-10-01 EvoStruggle: A Dataset Capturing the Evolution of Struggle across Activities and Skill Levels Shijia Feng et.al. 2510.01362 link
2025-10-01 HAMLET: Switch your Vision-Language-Action Model into a History-Aware Policy Myungkyu Koo et.al. 2510.00695 null
2025-10-01 Expandable Decision-Making States for Multi-Agent Deep Reinforcement Learning in Soccer Tactical Analysis Kenjiro Ide et.al. 2510.00480 null
2025-09-30 Towards Intuitive Human-Robot Interaction through Embodied Gesture-Driven Control with Woven Tactile Skins ChunPing Lam et.al. 2509.25951 null
2025-09-22 Six Sigma For Neural Networks: Taguchi-based optimization Sai Varun Kodathala et.al. 2509.25213 null
2025-09-29 Fast Real-Time Pipeline for Robust Arm Gesture Recognition Milán Zsolt Bagladi et.al. 2509.25042 null
2025-09-28 AssemblyHands-X: Modeling 3D Hand-Body Coordination for Understanding Bimanual Human Activities Tatsuro Banno et.al. 2509.23888 null
2025-09-27 New Synthetic Goldmine: Hand Joint Angle-Driven EMG Data Generation Framework for Micro-Gesture Recognition Nana Wang et.al. 2509.23359 null
2025-09-27 Spatiotemporal Radar Gesture Recognition with Hybrid Spiking Neural Networks: Balancing Accuracy and Efficiency Riccardo Mazzieri et.al. 2509.23303 null
2025-09-27 MMeViT: Multi-Modal ensemble ViT for Post-Stroke Rehabilitation Action Recognition Ye-eun Kim et.al. 2509.23044 null
2025-09-27 Disentangling Static and Dynamic Information for Reducing Static Bias in Action Recognition Masato Kobayashi et.al. 2509.23009 null
2025-09-26 See, Point, Fly: A Learning-Free VLM Framework for Universal Unmanned Aerial Navigation Chih Yao Hu et.al. 2509.22653 null
2025-09-26 Prompt-guided Representation Disentanglement for Action Recognition Tianci Wu et.al. 2509.21783 null
2025-09-25 SlotFM: A Motion Foundation Model with Slot Attention for Diverse Downstream Tasks Junyong Park et.al. 2509.21673 null
2025-09-25 Temporal vs. Spatial: Comparing DINOv3 and V-JEPA2 Feature Representations for Video Action Analysis Sai Varun Kodathala et.al. 2509.21595 null
2025-09-25 EMG-UP: Unsupervised Personalization in Cross-User EMG Gesture Recognition Nana Wang et.al. 2509.21589 null
2025-09-24 mmHSense: Multi-Modal and Distributed mmWave ISAC Datasets for Human Sensing Nabeel Nisar Bhat et.al. 2509.21396 null
2025-09-25 Every Subtlety Counts: Fine-grained Person Independence Micro-Action Recognition via Distributionally Robust Optimization Feng-Qi Cui et.al. 2509.21261 null
2025-09-25 Autoregressive End-to-End Planning with Time-Invariant Spatial Alignment and Multi-Objective Policy Refinement Jianbo Zhao et.al. 2509.20938 null
2025-09-25 GenFacts-Generative Counterfactual Explanations for Multi-Variate Time Series Sarah Seifi et.al. 2509.20936 null
2025-09-25 Causal Time Series Generation via Diffusion Models Yutong Xia et.al. 2509.20846 null
2025-09-23 A Bimanual Gesture Interface for ROS-Based Mobile Manipulators Using TinyML and Sensor Fusion Najeeb Ahmed Bhuiyan et.al. 2509.19521 null
2025-09-23 FERA: Foil Fencing Referee Assistant Using Pose-Based Multi-Label Move Recognition and Rule Reasoning Ziwen Chen et.al. 2509.18527 null
2025-09-22 MoCrop: Training Free Motion Guided Cropping for Efficient Video Action Recognition Binhua Huang et.al. 2509.18473 null
2025-09-22 Orcust: Stepwise-Feedback Reinforcement Learning for GUI Agent Junyu Lu et.al. 2509.17917 null
2025-09-22 Trainee Action Recognition through Interaction Analysis in CCATT Mixed-Reality Training Divya Mereddy et.al. 2509.17888 null
2025-09-22 A$^2$M$^2$-Net: Adaptively Aligned Multi-Scale Moment for Few-Shot Action Recognition Zilin Gao et.al. 2509.17638 null
2025-09-22 UIPro: Unleashing Superior Interaction Capability For GUI Agents Hongxin Li et.al. 2509.17328 null
2025-09-21 Imagine2Act: Leveraging Object-Action Motion Consistency from Imagined Goals for Robotic Manipulation Liang Heng et.al. 2509.17125 null
2025-09-21 MoCLIP-Lite: Efficient Video Recognition by Fusing CLIP with Motion Vectors Binhua Huang et.al. 2509.17084 null
2025-09-20 Automated Procedural Analysis via Video-Language Models for AI-assisted Nursing Skills Assessment Shen Chang et.al. 2509.16810 null
2025-09-19 KRAST: Knowledge-Augmented Robotic Action Recognition with Structured Text for Vision-Language Models Son Hai Nguyen et.al. 2509.16452 null
2025-09-18 RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation Yuming Jiang et.al. 2509.15212 link
2025-09-18 Doppler Radiance Field-Guided Antenna Selection for Improved Generalization in Multi-Antenna Wi-Fi-based Human Activity Recognition Navid Hasanzadeh et.al. 2509.15129 null
2025-09-18 LSTC-MDA: A Unified Framework for Long-Short Term Temporal Convolution and Mixed Data Augmentation in Skeleton-Based Action Recognition Feng Ding et.al. 2509.14619 null
2025-09-18 ClearFairy: Capturing Creative Workflows through Decision Structuring, In-Situ Questioning, and Rationale Inference Kihoon Son et.al. 2509.14537 null
2025-09-15 Domain-Adaptive Pretraining Improves Primate Behavior Recognition Felix B. Mueller et.al. 2509.12193 null
2025-09-15 Open-ended Hierarchical Streaming Video Understanding with Vision Language Models Hyolim Kang et.al. 2509.12145 null
2025-09-15 Gesture-Based Robot Control Integrating Mm-wave Radar and Behavior Trees Yuqing Song et.al. 2509.12008 null
2025-09-15 Learning Representations in Video Game Agents with Supervised Contrastive Imitation Learning Carlos Celemin et.al. 2509.11880 null
2025-09-11 Improvement of Human-Object Interaction Action Recognition Using Scene Information and Multi-Task Learning Approach Hesham M. Shehata et.al. 2509.09067 null
2025-09-10 A Contextual Bandits Approach for Personalization of Hand Gesture Recognition Duke Lin et.al. 2509.08915 null
2025-09-10 Diffusion-Based Action Recognition Generalizes to Untrained Domains Rogerio Guimaraes et.al. 2509.08908 null
2025-09-10 Chirality in Action: Time-Aware Video Representation Learning by Latent Straightening Piyush Bagad et.al. 2509.08502 null
2025-09-10 LD-ViCE: Latent Diffusion Model for Video Counterfactual Explanations Payal Varshney et.al. 2509.08422 null
2025-09-09 EHWGesture – A dataset for multimodal understanding of clinical gestures Gianluca Amprimo et.al. 2509.07525 null
2025-09-09 G3CN: Gaussian Topology Refinement Gated Graph Convolutional Network for Skeleton-Based Action Recognition Haiqing Ren et.al. 2509.07335 null
2025-08-05 Live Demonstration: Neuromorphic Radar for Gesture Recognition Satyapreet Singh Yadav et.al. 2508.03324 null
2025-07-22 Beyond Label Semantics: Language-Guided Action Anatomy for Few-shot Action Recognition Zefeng Qian et.al. 2507.16287 null
2025-07-22 SPACT18: Spiking Human Action Recognition Benchmark Dataset with Complementary RGB and Thermal Modalities Yasser Ashraf et.al. 2507.16151 null
2025-07-20 Light Future: Multimodal Action Frame Prediction via InstructPix2Pix Zesen Zhong et.al. 2507.14809 null
2025-07-17 A Real-Time System for Egocentric Hand-Object Interaction Detection in Industrial Domains Antonio Finocchiaro et.al. 2507.13326 null
2025-07-17 Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities Liuyi Wang et.al. 2507.13019 null
2025-07-17 Generalist Bimanual Manipulation via Foundation Video Diffusion Models Yao Feng et.al. 2507.12898 null
2025-07-16 Predicting Soccer Penalty Kick Direction Using Human Action Recognition David Freire-Obregón et.al. 2507.12617 null
2025-07-18 DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition Hayat Ullah et.al. 2507.12426 null
2025-07-16 Calisthenics Skills Temporal Video Segmentation Antonio Finocchiaro et.al. 2507.12245 null
2025-07-15 Diffusion-Based Imaginative Coordination for Bimanual Manipulation Huilin Xu et.al. 2507.11296 null
2025-07-15 Women Sport Actions Dataset for Visual Classification Using Small Scale Training Data Palash Ray et.al. 2507.10969 null
2025-07-14 Hand Gesture Recognition for Collaborative Robots Using Lightweight Deep Learning in Real-Time Robotic Systems Muhtadin et.al. 2507.10055 null
2025-07-13 Online Micro-gesture Recognition Using Data Augmentation and Spatial-Temporal Attention Pengyu Liu et.al. 2507.09512 null
2025-07-11 MM-Gesture: Towards Precise Micro-Gesture Recognition through Multimodal Fusion Jihao Gu et.al. 2507.08344 null
2025-07-10 Multimodal Framework for Explainable Autonomous Driving: Integrating Video, Sensor, and Textual Data for Enhanced Decision-Making and Transparency Abolfazl Zarghani et.al. 2507.07938 null
2025-07-10 EEvAct: Early Event-Based Action Recognition with High-Rate Two-Stream Spiking Neural Networks Michael Neumeier et.al. 2507.07734 null
2025-07-09 Cross-Modal Dual-Causal Learning for Long-Term Action Recognition Xu Shaowu et.al. 2507.06603 null
2025-07-08 Hierarchical Multi-Stage Transformer Architecture for Context-Aware Temporal Action Localization Hayat Ullah et.al. 2507.06411 null
2025-07-10 VOTE: Vision-Language-Action Optimization with Trajectory Ensemble Voting Juyi Lin et.al. 2507.05116 link
2025-07-07 HV-MMBench: Benchmarking MLLMs for Human-Centric Video Understanding Yuxuan Cai et.al. 2507.04909 null
2025-07-06 Visual Hand Gesture Recognition with Deep Learning: A Comprehensive Review of Methods, Datasets, Challenges and Future Research Directions Konstantinos Foteinos et.al. 2507.04465 null
2025-07-06 DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge Wenyao Zhang et.al. 2507.04447 link
2025-07-04 Masked Temporal Interpolation Diffusion for Procedure Planning in Instructional Videos Yufan Zhou et.al. 2507.03393 link
2025-07-05 AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation Sixiang Chen et.al. 2507.01961 null
2025-07-02 Variational Graph Convolutional Neural Networks Illia Oleksiienko et.al. 2507.01699 null
2025-07-01 Zero-shot Skeleton-based Action Recognition with Prototype-guided Feature Alignment Kai Zhou et.al. 2507.00566 null
2025-06-30 LineRetriever: Planning-Aware Observation Reduction for Web Agents Imene Kerboua et.al. 2507.00210 null
2025-06-30 Online Human Action Detection during Escorting Siddhartha Mondal et.al. 2506.23573 null
2025-06-29 DEL: Dense Event Localization for Multi-modal Audio-Visual Understanding Mona Ahmadian et.al. 2506.23196 null
2025-06-27 Frequency-Semantic Enhanced Variational Autoencoder for Zero-Shot Skeleton-based Action Recognition Wenhan Wu et.al. 2506.22179 null
2025-06-26 WorldVLA: Towards Autoregressive Action World Model Jun Cen et.al. 2506.21539 link
2025-06-26 EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception Sanjoy Chowdhury et.al. 2506.21080 null
2025-06-25 How do Foundation Models Compare to Skeleton-Based Approaches for Gesture Recognition in Human-Robot Interaction? Stephanie Käs et.al. 2506.20795 null
2025-06-25 CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition Joerg Deigmoeller et.al. 2506.20373 null
2025-06-25 Feature Hallucination for Self-supervised Action Recognition Lei Wang et.al. 2506.20342 null
2025-06-27 ReactEMG: Zero-Shot, Low-Latency Intent Detection via sEMG Runsheng Wang et.al. 2506.19815 null
2025-06-24 Self-Paced Collaborative and Adversarial Network for Unsupervised Domain Adaptation Weichen Zhang et.al. 2506.19267 null
2025-06-23 Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition Dustin Aganian et.al. 2506.18721 null
2025-06-23 Improving Weakly Supervised Temporal Action Localization by Exploiting Multi-resolution Information in Temporal Domain Rui Su et.al. 2506.18261 null
2025-06-23 Robot Tactile Gesture Recognition Based on Full-body Modular E-skin Shuo Jiang et.al. 2506.18256 null
2025-06-22 Adapting Vision-Language Models for Evaluating World Models Mariya Hendriksen et.al. 2506.17967 null
2025-06-21 Domain Generalization using Action Sequences for Egocentric Action Recognition Amirshayan Nasirimajd et.al. 2506.17685 null
2025-06-20 Wi-Fi Sensing Tool Release: Gathering 802.11ax Channel State Information from a Commercial Wi-Fi Access Point Zisheng Wang et.al. 2506.16957 null
2025-06-20 Language-driven Description Generation and Common Sense Reasoning for Video Action Recognition Xiaodan Hu et.al. 2506.16701 null
2025-06-19 CLIP-MG: Guiding Semantic Attention with Skeletal Pose Features and RGB Data for Micro-Gesture Recognition on the iMiGUE Dataset Santosh Patapati et.al. 2506.16385 null
2025-06-18 Accessible Gesture-Driven Augmented Reality Interaction System Yikan Wang et.al. 2506.15189 null
2025-06-17 CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion Jiahua Ma et.al. 2506.14769 null
2025-06-16 Leveraging Vision-Language Pre-training for Human Activity Recognition in Still Images Cristina Mahanta et.al. 2506.13458 null
2025-06-16 Active Multimodal Distillation for Few-shot Action Recognition Weijia Feng et.al. 2506.13322 null
2025-06-16 Action Dubber: Timing Audible Actions via Inflectional Flow Wenlong Wan et.al. 2506.13320 null
2025-06-15 Towards Fine-Grained Emotion Understanding via Skeleton-Based Micro-Gesture Recognition Hao Xu et.al. 2506.12848 null
2025-06-13 Pose Matters: Evaluating Vision Transformers and CNNs for Human Action Recognition on Small COCO Subsets MingZe Tang et.al. 2506.11678 null
2025-06-12 GynSurg: A Comprehensive Gynecology Laparoscopic Surgery Dataset Sahar Nasirihaghighi et.al. 2506.11356 null
2025-06-12 WaveFormer: A Lightweight Transformer Model for sEMG-based Gesture Recognition Yanlong Chen et.al. 2506.11168 null
2025-06-11 SLRNet: A Real-Time LSTM-Based Sign Language Recognition System Sharvari Kamble et.al. 2506.11154 link
2025-06-10 Gender Fairness of Machine Learning Algorithms for Pain Detection Dylan Green et.al. 2506.11132 null
2025-06-12 Eye, Robot: Learning to Look to Act with a BC-RL Perception-Action Loop Justin Kerr et.al. 2506.10968 null
2025-06-11 HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios Kunyu Peng et.al. 2506.09650 link
2025-06-11 Time-Unified Diffusion Policy with Action Discrimination for Robotic Manipulation Ye Niu et.al. 2506.09422 null
2025-06-11 Synthetic Human Action Video Data Generation with Pose Transfer Vaclav Knapp et.al. 2506.09411 null
2025-06-11 An Effective End-to-End Solution for Multimodal Action Recognition Songping Wang et.al. 2506.09345 null
2025-06-10 Diver-Robot Communication Dataset for Underwater Hand Gesture Recognition Igor Kvasić et.al. 2506.08974 null
2025-06-09 BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models Peiyan Li et.al. 2506.07961 link
2025-06-08 AugmentGest: Can Random Data Cropping Augmentation Boost Gesture Recognition Performance? Nada Aboudeshish et.al. 2506.07216 null
2025-06-08 SAP-Bench: Benchmarking Multimodal Large Language Models in Surgical Action Planning Mengya Xu et.al. 2506.07196 null
2025-06-07 PhysLab: A Benchmark Dataset for Multi-Granularity Visual Parsing of Physics Experiments Minghao Zou et.al. 2506.06631 null
2025-06-06 Conversational Interfaces for Parametric Conceptual Architectural Design: Integrating Mixed Reality with LLM-driven Interaction Ruochen Ji et.al. 2506.06066 null
2025-06-06 DriveAction: A Benchmark for Exploring Human-like Driving Decisions in VLA Models Yuhan Hao et.al. 2506.05667 null
2025-06-05 Robustness Evaluation for Video Models with Reinforcement Learning Ashwin Ramesh Babu et.al. 2506.05431 null
2025-06-04 Video, How Do Your Tokens Merge? Sam Pollard et.al. 2506.03885 null
2025-06-04 Zero-Shot Temporal Interaction Localization for Egocentric Videos Erhang Zhang et.al. 2506.03662 link
2025-06-04 Heterogeneous Skeleton-Based Action Representation Learning Hongsong Wang et.al. 2506.03481 null
2025-06-04 Go Beyond Earth: Understanding Human Actions and Scenes in Microgravity Environments Di Wen et.al. 2506.02845 link
2025-06-03 Technical Report for Ego4D Long-Term Action Anticipation Challenge 2025 Qiaohui Chu et.al. 2506.02550 null
2025-06-03 VS-Bench: Evaluating VLMs for Strategic Reasoning and Decision-Making in Multi-Agent Environments Zelai Xu et.al. 2506.02387 link
2025-06-03 Multi-level and Multi-modal Action Anticipation Seulgi Kim et.al. 2506.02382 null
2025-06-02 TransAct V2: Lifelong User Action Sequence Modeling on Pinterest Recommendation Xue Xia et.al. 2506.02267 null
2025-06-02 SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Mustafa Shukor et.al. 2506.01844 link
2025-06-02 Efficient Egocentric Action Recognition with Multimodal Data Marco Calzavara et.al. 2506.01757 null
2025-06-02 EPFL-Smart-Kitchen-30: Densely annotated cooking dataset with 3D kinematics to challenge video and language models Andy Bonnetto et.al. 2506.01608 link
2025-06-02 Sheep Facial Pain Assessment Under Weighted Graph Neural Networks Alam Noor et.al. 2506.01468 null
2025-06-02 EgoBrain: Synergizing Minds and Eyes For Human Action Understanding Nie Lin et.al. 2506.01353 null
2025-05-30 DiG-Net: Enhancing Quality of Life through Hyper-Range Dynamic Gesture Recognition in Assistive Robotics Eran Bamani Beeri et.al. 2505.24786 null
2025-05-30 Beyond FACS: Data-driven Facial Expression Dictionaries, with Application to Predicting Autism Evangelos Sariyanidi et.al. 2505.24679 null
2025-05-30 EgoExOR: An Ego-Exo-Centric Operating Room Dataset for Surgical Activity Understanding Ege Özsoy et.al. 2505.24287 null
2025-05-29 Autoregressive Meta-Actions for Unified Controllable Trajectory Generation Jianbo Zhao et.al. 2505.23612 null
2025-05-29 CLIP-AE: CLIP-assisted Cross-view Audio-Visual Enhancement for Unsupervised Temporal Action Localization Rui Xia et.al. 2505.23524 null
2025-05-29 Spatio-Temporal Joint Density Driven Learning for Skeleton-Based Action Recognition Shanaka Ramesh Gunasekara et.al. 2505.23012 link
2025-05-28 PRISM: Video Dataset Condensation with Progressive Refinement and Insertion for Sparse Motion Jaehyun Choi et.al. 2505.22564 null
2025-05-27 DeepConvContext: A Multi-Scale Approach to Timeseries Classification in Human Activity Recognition Marius Bock et.al. 2505.20894 link
2025-05-27 TrustSkin: A Fairness Pipeline for Trustworthy Facial Affect Analysis Across Skin Tone Ana M. Cabanas et.al. 2505.20637 null
2025-05-26 Data-Free Class-Incremental Gesture Recognition with Prototype-Guided Pseudo Feature Replay Hongsong Wang et.al. 2505.20049 link
2025-05-26 PHI: Bridging Domain Shift in Long-Term Action Quality Assessment via Progressive Hierarchical Instruction Kanglei Zhou et.al. 2505.19972 link
2025-05-26 The Role of Video Generation in Enhancing Data-Limited Action Understanding Wei Li et.al. 2505.19495 null
2025-05-24 ProphetDWM: A Driving World Model for Rolling Out Future Actions and Videos Xiaodong Wang et.al. 2505.18650 null
2025-05-27 SHARDeg: A Benchmark for Skeletal Human Action Recognition in Degraded Scenarios Simon Malzard et.al. 2505.18048 null
2025-05-23 3D Face Reconstruction Error Decomposed: A Modular Benchmark for Fair and Fast Method Evaluation Evangelos Sariyanidi et.al. 2505.18025 null
2025-05-23 Multi-task Learning For Joint Action and Gesture Recognition Konstantinos Spathis et.al. 2505.17867 null
2025-05-23 Temporal Consistency Constrained Transferable Adversarial Attacks with Background Mixup for Action Recognition Ping Li et.al. 2505.17807 link
2025-05-23 Integrating Counterfactual Simulations with Language Models for Explaining Multi-Agent Behaviour Bálint Gyevnár et.al. 2505.17801 null
2025-05-23 SVL: Spike-based Vision-language Pretraining for Efficient 3D Open-world Understanding Xuerui Qiu et.al. 2505.17674 null
2025-05-23 ProTAL: A Drag-and-Link Video Programming Framework for Temporal Action Localization Yuchen He et.al. 2505.17555 null
2025-05-22 UAV Control with Vision-based Hand Gesture Recognition over Edge-Computing Sousannah Abdalla et.al. 2505.17303 null
2025-05-22 CoMo: Learning Continuous Latent Motion from Internet Videos for Scalable Robot Learning Jiange Yang et.al. 2505.17006 null
2025-05-21 Towards Zero-Shot Differential Morphing Attack Detection with Multimodal Large Language Models Ria Shekhawat et.al. 2505.15332 null
2025-05-21 DiffProb: Data Pruning for Face Recognition Eduarda Caldeira et.al. 2505.15272 link
2025-05-21 Leveraging Foundation Models for Multimodal Graph-Based Action Recognition Fatemeh Ziaeetabar et.al. 2505.15192 null
2025-05-20 Egocentric Action-aware Inertial Localization in Point Clouds Mingfang Zhang et.al. 2505.14346 link
2025-05-20 Transfer Learning from Visual Speech Recognition to Mouthing Recognition in German Sign Language Dinh Nam Pham et.al. 2505.13784 link
2025-05-18 MTIL: Encoding Full History with Mamba for Temporal Imitation Learning Yulin Zhou et.al. 2505.12410 link
2025-05-20 Aux-Think: Exploring Reasoning Strategies for Data-Efficient Vision-Language Navigation Shuo Wang et.al. 2505.11886 null
2025-05-16 Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation Zihan Wang et.al. 2505.11383 link
2025-05-15 NeoLightning: A Modern Reimagination of Gesture-Based Sound Design Yonghyun Kim et.al. 2505.10686 link
2025-05-15 Are Spatial-Temporal Graph Convolution Networks for Human Action Recognition Over-Parameterized? Jianyang Xie et.al. 2505.10679 link
2025-05-14 Mission Balance: Generating Under-represented Class Samples using Video Diffusion Models Danush Kumar Venkatesh et.al. 2505.09858 link
2025-05-13 Reinforcement Learning meets Masked Video Modeling: Trajectory-Guided Adaptive Token Selection Ayush K. Rai et.al. 2505.08561 null
2025-05-17 Training Strategies for Efficient Embodied Reasoning William Chen et.al. 2505.08243 null
2025-05-12 H$^{\mathbf{3}}$DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning Yiyang Lu et.al. 2505.07819 null
2025-05-11 DeepSORT-Driven Visual Tracking Approach for Gesture Recognition in Interactive Systems Tong Zhang et.al. 2505.07110 null
2025-05-10 A Short Overview of Multi-Modal Wi-Fi Sensing Zijian Zhao et.al. 2505.06682 link
2025-05-09 Context Informed Incremental Learning Improves Myoelectric Control Performance in Virtual Reality Object Manipulation Tasks Gabriel Gagné et.al. 2505.06064 link
2025-05-09 Task-Adapter++: Task-specific Adaptation with Order-aware Alignment for Few-shot Action Recognition Congqi Cao et.al. 2505.06002 link
2025-05-07 DetReIDX: A Stress-Test Dataset for Real-World UAV-Based Person Recognition Kailash A. Hambarde et.al. 2505.04793 link
2025-05-07 Comparison of Visual Trackers for Biomechanical Analysis of Running Luis F. Gomez et.al. 2505.04713 null
2025-05-07 Trajectory Entropy Reinforcement Learning for Predictable and Robust Control Bang You et.al. 2505.04193 null
2025-05-07 FoodTrack: Estimating Handheld Food Portions with Egocentric Video Ervin Wang et.al. 2505.04055 null
2025-05-06 Action Spotting and Precise Event Detection in Sports: Datasets, Methods, and Challenges Hao Xu et.al. 2505.03991 null
2025-05-03 A Multimodal Framework for Explainable Evaluation of Soft Skills in Educational Environments Jared D. T. Guerrero-Sosa et.al. 2505.01794 null
2025-05-01 Predicting Estimated Times of Restoration for Electrical Outages Using Longitudinal Tabular Transformers Bogireddy Sai Prasanna Teja et.al. 2505.00225 null
2025-04-30 Direct Motion Models for Assessing Generated Videos Kelsey Allen et.al. 2505.00209 null
2025-04-30 CoCoDiff: Diversifying Skeleton Action Features via Coarse-Fine Text-Co-Guided Latent Diffusion Zhifu Zhao et.al. 2504.21266 null
2025-04-29 Beyond the Horizon: Decoupling UAVs Multi-View Action Recognition via Partial Order Transfer Wenxuan Liu et.al. 2504.20530 null
2025-04-28 ProFi-Net: Prototype-based Feature Attention with Curriculum Augmentation for WiFi-based Gesture Recognition Zhe Cui et.al. 2504.20193 null
2025-04-28 FSBench: A Figure Skating Benchmark for Advancing Artistic Sports Understanding Rong Gao et.al. 2504.19514 null
2025-04-26 3DPyranet Features Fusion for Spatio-temporal Feature Learning Ihsan Ullah et.al. 2504.18977 null
2025-04-25 POET: Prompt Offset Tuning for Continual Human Action Adaptation Prachi Garg et.al. 2504.18059 null
2025-04-25 RSRNav: Reasoning Spatial Relationship for Image-Goal Navigation Zheng Qin et.al. 2504.17991 null
2025-04-24 Robotic Task Ambiguity Resolution via Natural Language Interaction Eugenio Chisari et.al. 2504.17748 null
2025-04-23 Latent Diffusion Planning for Imitation Learning Amber Xie et.al. 2504.16925 null
2025-04-23 WiFi based Human Fall and Activity Recognition using Transformer based Encoder Decoder and Graph Neural Networks Younggeol Cho et.al. 2504.16655 null
2025-04-23 Advancing Radar Hand Gesture Recognition: A Hybrid Spectrum Synthetic Framework Merging Simulation with Neural Networks Jiaqi Tang et.al. 2504.16423 null
2025-04-21 Bridge the Gap: From Weak to Full Supervision for Temporal Action Localization with PseudoFormer Ziyi Liu et.al. 2504.14860 null
2025-04-20 Time Frequency Analysis of EMG Signal for Gesture Recognition using Fine grained Features Parshuram N. Aarotale et.al. 2504.14708 null
2025-04-22 Talk is Not Always Cheap: Promoting Wireless Sensing Models with Text Prompts Zhenkui Yang et.al. 2504.14621 link
2025-04-19 Balancing Privacy and Action Performance: A Penalty-Driven Approach to Image Anonymization Nazia Aslam et.al. 2504.14301 null
2025-04-18 Are you SURE? Enhancing Multimodal Pretraining with Missing Modalities through Uncertainty Estimation Duy A. Nguyen et.al. 2504.13465 null
2025-04-23 Chain-of-Thought Textual Reasoning for Few-shot Temporal Action Localization Hongwei Ji et.al. 2504.13460 null
2025-04-17 Wearable-Derived Behavioral and Physiological Biomarkers for Classifying Unipolar and Bipolar Depression Severity Yassine Ouzar et.al. 2504.13331 null
2025-04-17 PCBEAR: Pose Concept Bottleneck for Explainable Action Recognition Jongseo Lee et.al. 2504.13140 null
2025-04-16 SkeletonX: Data-Efficient Skeleton-based Action Recognition via Cross-sample Feature Aggregation Zongye Zhang et.al. 2504.11749 link
2025-04-14 Toward Aligning Human and Robot Actions via Multi-Modal Demonstration Learning Azizul Zahid et.al. 2504.11493 link
2025-04-14 H-MoRe: Learning Human-centric Motion Representation for Action Analysis Zhanbo Huang et.al. 2504.10676 null
2025-04-14 Towards Low-Latency Event-based Obstacle Avoidance on a FPGA-Drone Pietro Bonazzi et.al. 2504.10400 link
2025-04-14 Hierarchical Relation-augmented Representation Generalization for Few-shot Action Recognition Hongyu Qu et.al. 2504.10079 null
2025-04-14 EmbodiedAgent: A Scalable Hierarchical Approach to Overcome Practical Challenge in Multi-Robot Control Hanwen Wan et.al. 2504.10030 link
2025-04-14 Hands-On: Segmenting Individual Signs from Continuous Sequences Low Jian He et.al. 2504.08593 null
2025-04-11 Knowledge Distillation for Multimodal Egocentric Action Recognition Robust to Missing Modalities Maria Santos-Villafranca et.al. 2504.08578 null
2025-04-11 Breaking the Barriers: Video Vision Transformers for Word-Level Sign Language Recognition Alexander Brettmann et.al. 2504.07792 null
2025-04-10 Towards Micro-Action Recognition with Limited Annotations: An Asynchronous Pseudo Labeling and Training Approach Yan Zhang et.al. 2504.07785 null
2025-04-13 ID-Booth: Identity-consistent Face Generation with Diffusion Models Darian Tomašević et.al. 2504.07392 link
2025-04-09 Leveraging GCN-based Action Recognition for Teleoperation in Daily Activity Assistance Thomas M. Kwok et.al. 2504.07001 null
2025-04-09 Efficient Deployment of Spiking Neural Networks on SpiNNaker2 for DVS Gesture Recognition Using Neuromorphic Intermediate Representation Sirine Arfa et.al. 2504.06748 null
2025-04-09 Exploring Ordinal Bias in Action Recognition for Instructional Videos Joochan Kim et.al. 2504.06580 link
2025-04-08 FaceCloak: Learning to Protect Face Templates Sudipta Banerjee et.al. 2504.06131 null
2025-04-08 Modular Soft Wearable Glove for Real-Time Gesture Recognition and Dynamic 3D Shape Reconstruction Huazhi Dong et.al. 2504.05983 null
2025-04-08 Temporal Alignment-Free Video Matching for Few-shot Action Recognition SuBeen Lee et.al. 2504.05956 null
2025-04-08 SEVERE++: Evaluating Benchmark Sensitivity in Generalization of Video Representation Learning Fida Mohammad Thoker et.al. 2504.05706 null
2025-04-07 SelfMAD: Enhancing Generalization and Robustness in Morphing Attack Detection via Self-Supervised Learning Marija Ivanovska et.al. 2504.05504 null
2025-04-06 SnapPix: Efficient-Coding–Inspired In-Sensor Compression for Edge Vision Weikai Lin et.al. 2504.04535 null
2025-04-04 An Exploration-free Method for a Linear Stochastic Bandit Driven by a Linear Gaussian Dynamical System Jonathan Gornet et.al. 2504.03926 null
2025-04-04 Electromyography-Based Gesture Recognition: Hierarchical Feature Extraction for Enhanced Spatial-Temporal Dynamics Jungpil Shin et.al. 2504.03221 null
2025-04-02 UAC: Uncertainty-Aware Calibration of Neural Networks for Gesture Detection Farida Al Haddad et.al. 2504.02895 null
2025-04-03 Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets Chuning Zhu et.al. 2504.02792 null
2025-04-03 MultiSensor-Home: A Wide-area Multi-modal Multi-view Dataset for Action Recognition and Transformer-based Sensor Fusion Trung Thanh Nguyen et.al. 2504.02287 link
2025-04-07 MultiTSF: Transformer-based Sensor Fusion for Human-Centric Multi-view and Multi-modal Action Recognition Trung Thanh Nguyen et.al. 2504.02279 null
2025-04-03 SocialGesture: Delving into Multi-person Gesture Understanding Xu Cao et.al. 2504.02244 null
2025-04-02 LSC-ADL: An Activity of Daily Living (ADL)-Annotated Lifelog Dataset Generated via Semi-Automatic Clustering Minh-Quan Ho-Le et.al. 2504.02060 null
2025-04-07 Is Temporal Prompting All We Need For Limited Labeled Action Recognition? Shreyank N Gowda et.al. 2504.01890 null
2025-04-01 FDDet: Frequency-Decoupling for Boundary Refinement in Temporal Action Detection Xinnan Zhu et.al. 2504.00647 null
2025-04-01 Sample-level Adaptive Knowledge Distillation for Action Recognition Ping Li et.al. 2504.00606 null
2025-03-30 CA^2ST: Cross-Attention in Audio, Space, and Time for Holistic Video Recognition Jongseo Lee et.al. 2503.23447 null
2025-03-30 OwlSight: A Robust Illumination Adaptation Framework for Dark Video Human Action Recognition Shihao Cheng et.al. 2503.23266 null
2025-03-29 Action Recognition in Real-World Ambient Assisted Living Environment Vincent Gbouna Zakka et.al. 2503.23214 link
2025-03-28 ForcePose: A Deep Learning Approach for Force Calculation Based on Action Recognition Using MediaPipe Pose Estimation Combined with Object Detection Nandakishor M et.al. 2503.22363 null
2025-03-30 UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning Zhengxi Lu et.al. 2503.21620 link
2025-03-27 One Snapshot is All You Need: A Generalized Method for mmWave Signal Generation Teng Huang et.al. 2503.21122 null
2025-03-26 ScreenLLM: Stateful Screen Schema for Efficient Action Understanding and Prediction Yiqiao Jin et.al. 2503.20978 null
2025-03-26 Siformer: Feature-isolated Transformer for Efficient Skeleton-based Sign Language Recognition Muxin Pu et.al. 2503.20436 null
2025-03-25 Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings Chengan Che et.al. 2503.19740 link
2025-03-25 fine-CLIP: Enhancing Zero-Shot Fine-Grained Surgical Action Recognition with Vision-Language Models Saurav Sharma et.al. 2503.19670 null
2025-03-24 LLaVAction: evaluating and training multi-modal large language models for action recognition Shaokai Ye et.al. 2503.18712 link
2025-03-24 Surgical Action Planning with Large Language Models Mengya Xu et.al. 2503.18296 null
2025-03-27 Temporal-Guided Spiking Neural Networks for Event-Based Human Action Recognition Siyuan Yang et.al. 2503.17132 null
2025-03-21 BEAC: Imitating Complex Exploration and Task-oriented Behaviors for Invisible Object Nonprehensile Manipulation Hirotaka Tahara et.al. 2503.16803 null
2025-03-21 Improving mmWave based Hand Hygiene Monitoring through Beam Steering and Combining Techniques Isura Nirmal et.al. 2503.16764 null
2025-03-19 A Comprehensive Survey on Architectural Advances in Deep CNNs: Challenges, Applications, and Emerging Research Directions Saddam Hussain Khan et.al. 2503.16546 null
2025-03-25 Deep learning framework for action prediction reveals multi-timescale locomotor control Wei-Chen Wang et.al. 2503.16340 null
2025-03-19 UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction Shravan Nayak et.al. 2503.15661 null
2025-03-19 Multi-Modal Gesture Recognition from Video and Surgical Tool Pose Information via Motion Invariants Jumanh Atoum et.al. 2503.15647 null
2025-03-21 Body-Hand Modality Expertized Networks with Cross-attention for Fine-grained Skeleton Action Recognition Seungyeon Cho et.al. 2503.14960 null
2025-03-19 DPFlow: Adaptive Optical Flow Estimation with a Dual-Pyramid Framework Henrique Morimitsu et.al. 2503.14880 link
2025-03-15 Salient Temporal Encoding for Dynamic Scene Graph Generation Zhihao Zhu et.al. 2503.14524 null
2025-03-17 Towards Scalable Modeling of Compressed Videos for Efficient Action Recognition Shristi Das Biswas et.al. 2503.13724 null
2025-03-20 STEP: Simultaneous Tracking and Estimation of Pose for Animals and Humans Shashikant Verma et.al. 2503.13344 null
2025-03-17 Dense Policy: Bidirectional Autoregressive Learning of Actions Yue Su et.al. 2503.13217 null
2025-03-16 EgoEvGesture: Gesture Recognition Based on Egocentric Event Camera Luming Wang et.al. 2503.12419 link
2025-03-16 ProbDiffFlow: An Efficient Learning-Free Framework for Probabilistic Single-Image Optical Flow Estimation Mo Zhou et.al. 2503.12348 null
2025-03-15 Real-Time Manipulation Action Recognition with a Factorized Graph Sequence Encoder Enes Erdogan et.al. 2503.12034 null
2025-03-14 Enhancing Hand Palm Motion Gesture Recognition by Eliminating Reference Frame Bias via Frame-Invariant Similarity Measures Arno Verduyn et.al. 2503.11352 null
2025-03-14 Aerial Vision-and-Language Navigation with Grid-based View Selection and Map Construction Ganlong Zhao et.al. 2503.11091 null
2025-03-14 VA-AR: Learning Velocity-Aware Action Representations with Mixture of Window Attention Jiangning Wei et.al. 2503.11004 null
2025-03-13 Spatial-Temporal Graph Diffusion Policy with Kinematic Modeling for Bimanual Robotic Manipulation Qi Lv et.al. 2503.10743 null
2025-03-11 Open-World Skill Discovery from Unsegmented Demonstrations Jingwen Deng et.al. 2503.10684 link
2025-03-17 HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model Jiaming Liu et.al. 2503.10631 null
2025-03-13 SurgRAW: Multi-Agent Workflow with Chain-of-Thought Reasoning for Surgical Intelligence Chang Han Low et.al. 2503.10265 null
2025-03-12 A Hybrid Neural Network with Smart Skip Connections for High-Precision, Low-Latency EMG-Based Hand Gesture Recognition Hafsa Wazir et.al. 2503.09041 null
2025-03-12 Unified Locomotion Transformer with Simultaneous Sim-to-Real Transfer for Quadrupeds Dikai Liu et.al. 2503.08997 null
2025-03-11 PromptGAR: Flexible Promptive Group Activity Recognition Zhangyu Jin et.al. 2503.08933 null
2025-03-11 MetaFold: Language-Guided Multi-Category Garment Folding Framework via Trajectory Generation and Foundation Model Haonan Chen et.al. 2503.08372 null
2025-03-11 A Survey on Wi-Fi Sensing Generalizability: Taxonomy, Techniques, Datasets, and Future Research Prospects Fei Wang et.al. 2503.08008 null
2025-03-10 Helios 2.0: A Robust, Ultra-Low Power Gesture Recognition System Optimised for Event-Sensor based Wearables Prarthana Bhattacharyya et.al. 2503.07825 null
2025-03-10 Elderly Activity Recognition in the Wild: Results from the EAR Challenge Anh-Kiet Duong et.al. 2503.07821 link
2025-03-09 TimeLoc: A Unified End-to-End Framework for Precise Timestamp Localization in Long Videos Chen-Lin Zhang et.al. 2503.06526 link
2025-03-09 SGA-INTERACT: A 3D Skeleton-based Benchmark for Group Activity Understanding in Modern Basketball Tactic Yuchen Yang et.al. 2503.06522 link
2025-03-07 MPTSNet: Integrating Multiscale Periodic Local Patterns and Global Dependencies for Multivariate Time Series Classification Yang Mu et.al. 2503.05582 null
2025-03-07 Multi-Grained Feature Pruning for Video-Based Human Pose Estimation Zhigang Wang et.al. 2503.05365 null
2025-03-06 Maestro: A 302 GFLOPS/W and 19.8GFLOPS RISC-V Vector-Tensor Architecture for Wearable Ultrasound Edge Computing Mattia Sinigaglia et.al. 2503.04581 null
2025-03-06 Gate-Shift-Pose: Enhancing Action Recognition in Sports with Skeleton Information Edoardo Bianchi et.al. 2503.04470 link
2025-03-06 Spatial-Temporal Perception with Causal Inference for Naturalistic Driving Action Recognition Qing Chang et.al. 2503.04078 null
2025-03-06 Social Gesture Recognition in spHRI: Leveraging Fabric-Based Tactile Sensing on Humanoid Robots Dakarai Crowder et.al. 2503.03234 null
2025-03-04 Semi-Supervised Audio-Visual Video Action Recognition with Audio Source Localization Guided Mixup Seokun Kang et.al. 2503.02284 null
2025-03-04 FABG: End-to-end Imitation Learning for Embodied Affective Human-Robot Interaction Yanghai Zhang et.al. 2503.01363 null
2025-03-04 An Efficient 3D Convolutional Neural Network with Channel-wise, Spatial-grouped, and Temporal Convolutions Zhe Wang et.al. 2503.00796 null
2025-03-02 One-Shot Gesture Recognition for Underwater Diver-To-Robot Communication Rishikesh Joshi et.al. 2503.00676 null
2025-03-04 Unified Video Action Model Shuang Li et.al. 2503.00200 link
2025-02-28 BST: Badminton Stroke-type Transformer for Skeleton-based Action Recognition in Racket Sports Jing-Yuan Chang et.al. 2502.21085 null
2025-02-27 Learning to Generalize without Bias for Open-Vocabulary Action Recognition Yating Yu et.al. 2502.20158 link
2025-02-27 QORT-Former: Query-optimized Real-time Transformer for Understanding Two Hands Manipulating Objects Elkhan Ismayilzada et.al. 2502.19769 null
2025-02-26 Deep Learning For Time Series Analysis With Application On Human Motion Ali Ismail-Fawaz et.al. 2502.19364 null
2025-02-26 UQABench: Evaluating User Embedding for Prompting LLMs in Personalized Question Answering Langming Liu et.al. 2502.19178 link
2025-02-25 EgoSim: An Egocentric Multi-view Simulator and Real Dataset for Body-worn Cameras during Motion and Activity Dominik Hollidt et.al. 2502.18373 link
2025-02-25 Edge Training and Inference with Analog ReRAM Technology for Hand Gesture Recognition Victoria Clerico et.al. 2502.18152 null
2025-02-23 Trunk-branch Contrastive Network with Multi-view Deformable Aggregation for Multi-view Action Recognition Yingyuan Yang et.al. 2502.16493 null
2025-02-20 Online hand gesture recognition using Continual Graph Transformers Rim Slama et.al. 2502.14939 null
2025-02-19 Are Rules Meant to be Broken? Understanding Multilingual Moral Reasoning as a Computational Pipeline with UniMoral Shivani Kumar et.al. 2502.14083 null
2025-02-19 PSCon: Toward Conversational Product Search Jie Zou et.al. 2502.13881 link
2025-02-19 SNN-Driven Multimodal Human Action Recognition via Event Camera and Skeleton Data Fusion Naichuan Zheng et.al. 2502.13385 null
2025-02-18 Beyond Timesteps: A Novel Activation-wise Membrane Potential Propagation Mechanism for Spiking Neural Networks in 3D cloud Jian Song et.al. 2502.12791 null
2025-02-18 Adaptive Prototype Model for Attribute-based Multi-label Few-shot Action Recognition Juefeng Xiao et.al. 2502.12582 null
2025-02-25 Duo Streamers: A Streaming Gesture Recognition Framework Boxuan Zhu et.al. 2502.12297 link
2025-02-17 Can LLMs Simulate Social Media Engagement? A Study on Action-Guided Response Generation Zhongyi Qiu et.al. 2502.12073 null
2025-02-14 ManiTrend: Bridging Future Generation and Action Prediction with 3D Flow for Robotic Manipulation Yuxin He et.al. 2502.10028 null
2025-02-14 VicKAM: Visual Conceptual Knowledge Guided Action Map for Weakly Supervised Group Activity Recognition Zhuming Wang et.al. 2502.09967 null
2025-02-13 CellFlow: Simulating Cellular Morphology Changes via Flow Matching Yuhui Zhang et.al. 2502.09775 link
2025-02-12 Measuring Anxiety Levels with Head Motion Patterns in Severe Depression Population Fouad Boualeb et.al. 2502.08813 null
2025-02-18 Robot Data Curation with Mutual Information Estimators Joey Hejna et.al. 2502.08623 null
2025-02-12 DGSense: A Domain Generalization Framework for Wireless Sensing Rui Zhou et.al. 2502.08155 null
2025-02-11 Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis Amir Hosein Fadaei et.al. 2502.07277 null
2025-02-10 From Image to Video: An Empirical Study of Diffusion Representations Pedro Vélez et.al. 2502.07001 null
2025-02-10 Conformal Predictions for Human Action Recognition with Vision-Language Models Bary Tim et.al. 2502.06631 null
2025-02-10 AppVLM: A Lightweight Vision Language Model for Online App Control Georgios Papoudakis et.al. 2502.06395 null
2025-02-09 Preventing Rogue Agents Improves Multi-Agent Collaboration Ohav Barbi et.al. 2502.05986 link
2025-02-09 HyLiFormer: Hyperbolic Linear Attention for Skeleton-based Human Action Recognition Yue Li et.al. 2502.05869 null
2025-02-11 HAMSTER: Hierarchical Action Models For Open-World Robot Manipulation Yi Li et.al. 2502.05485 link
2025-02-06 HD-EPIC: A Highly-Detailed Egocentric Video Dataset Toby Perrett et.al. 2502.04144 null
2025-02-06 MD-BERT: Action Recognition in Dark Videos via Dynamic Multi-Stream Fusion and Temporal Modeling Sharana Dharshikgan Suresh Dass et.al. 2502.03724 null
2025-02-10 Kronecker Mask and Interpretive Prompts are Language-Action Video Learners Jingyi Yang et.al. 2502.03549 link
2025-02-05 SKI Models: Skeleton Induced Vision-Language Embeddings for Understanding Activities of Daily Living Arkaprava Sinha et.al. 2502.03459 null
2025-02-01 Minimalistic Video Saliency Prediction via Efficient Decoder & Spatio Temporal Action Cues Rohit Girmaji et.al. 2502.00397 null
2025-01-31 ALBAR: Adversarial Learning approach to mitigate Biases in Action Recognition Joseph Fioresi et.al. 2502.00156 null
2025-01-31 From Soft Materials to Controllers with NeuroTouch: A Neuromorphic Tactile Sensor for Real-Time Gesture Recognition Victor Hoffmann et.al. 2501.19174 null
2025-01-31 XRF V2: A Dataset for Action Summarization with Wi-Fi Signals, and IMUs in Phones, Watches, Earbuds, and Glasses Bo Lan et.al. 2501.19034 link
2025-02-03 Advances in Multimodal Adaptation and Generalization: From Traditional Approaches to Foundation Models Hao Dong et.al. 2501.18592 link
2025-01-29 Action Recognition Using Temporal Shift Module and Ensemble Learning Anh-Kiet Duong et.al. 2501.17550 link
2025-01-28 Bones of Contention: Exploring Query-Efficient Attacks Against Skeleton Recognition Systems Yuxin Cao et.al. 2501.16843 null
2025-01-27 A Low-Cost, High-Precision Human-Machine Interaction Solution Based on Multi-Coil Wireless Charging Pads Bojun Zhang et.al. 2501.15885 null
2025-01-25 Recognize Any Surgical Object: Unleashing the Power of Weakly-Supervised Data Jiajie Li et.al. 2501.15326 null
2025-01-27 ACT-JEPA: Joint-Embedding Predictive Architecture Improves Policy Representation Learning Aleksandar Vujinovic et.al. 2501.14622 null
2025-01-24 Optimizing Human Pose Estimation Through Focused Human and Joint Regions Yingying Jiao et.al. 2501.14439 null
2025-01-24 Human Activity Recognition with a 6.5 GHz Reconfigurable Intelligent Surface for Wi-Fi 6E Nuno Paulino et.al. 2501.14423 null
2025-01-23 MV-GMN: State Space Model for Multi-View Action Recognition Yuhui Lin et.al. 2501.13829 null
2025-01-23 EgoHand: Ego-centric Hand Pose Estimation and Gesture Recognition with Head-mounted Millimeter-wave Radar and IMUs Yizhe Lv et.al. 2501.13805 link
2025-01-22 SMART-Vision: Survey of Modern Action Recognition Techniques in Vision Ali K. AlShami et.al. 2501.13066 null
2025-01-22 Can masking background and object reduce static bias for zero-shot action recognition? Takumi Fukuzawa et.al. 2501.12681 null
2025-01-21 BlanketGen2-Fit3D: Synthetic Blanket Augmentation Towards Improving Real-World In-Bed Blanket Occluded Human Pose Estimation Tamás Karácsony et.al. 2501.12318 null
2025-01-21 InsTALL: Context-aware Instructional Task Assistance with Multi-modal Large Language Models Pha Nguyen et.al. 2501.12231 null
2025-01-21 DSTSA-GCN: Advancing Skeleton-Based Gesture Recognition with Semantic-Aware Spatio-Temporal Topology Modeling Hu Cui et.al. 2501.12086 null
2025-01-21 Survey on Hand Gesture Recognition from Visual Input Manousos Linardakis et.al. 2501.11992 null
2025-01-19 Rethinking Pseudo-Label Guided Learning for Weakly Supervised Temporal Action Localization from the Perspective of Noise Correction Quan Zhang et.al. 2501.11124 null
2025-01-23 HFGCN: Hypergraph Fusion Graph Convolutional Networks for Skeleton-Based Action Recognition Pengcheng Dong et.al. 2501.11007 null
2025-01-18 BAP v2: An Enhanced Task Framework for Instruction Following in Minecraft Dialogues Prashant Jayannavar et.al. 2501.10836 link
2025-01-15 Visual WetlandBirds Dataset: Bird Species Identification and Behavior Recognition in Videos Javier Rodriguez-Juan et.al. 2501.08931 link
2025-01-13 Collaborative Learning for 3D Hand-Object Reconstruction and Compositional Action Recognition from Egocentric RGB Videos Using Superquadrics Tze Ho Elden Tse et.al. 2501.07100 null
2025-01-12 DRDT3: Diffusion-Refined Decision Test-Time Training Model Xingshuai Huang et.al. 2501.06718 null
2025-01-07 Language and Planning in Robotic Navigation: A Multilingual Evaluation of State-of-the-Art Models Malak Mansour et.al. 2501.05478 null
2025-01-09 Improving Skeleton-based Action Recognition with Interactive Object Information Hao Wen et.al. 2501.05066 link
2025-01-08 Video Summarisation with Incident and Context Information using Generative AI Ulindu De Silva et.al. 2501.04764 null
2025-01-08 Assessing the Acceptance of a Mid-Air Gesture Syntax for Smart Space Interaction: An Empirical Study Ana M. Bernardos et.al. 2501.04464 null
2025-01-07 Extraction Of Cumulative Blobs From Dynamic Gestures Rishabh Naulakha et.al. 2501.04002 null
2025-01-06 Large Language Models for Video Surveillance Applications Ulindu De Silva et.al. 2501.02850 null
2025-01-05 Evolving Skeletons: Motion Dynamics in Action Recognition Jushang Qiu et.al. 2501.02593 null
2025-01-02 SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization Yongle Huang et.al. 2501.01245 link
2025-01-02 Event Masked Autoencoder: Point-wise Action Recognition with Event-Based Cameras Jingkai Sun et.al. 2501.01040 null
2025-01-01 Multiscaled Multi-Head Attention-based Video Transformer Network for Hand Gesture Recognition Mallika Garg et.al. 2501.00935 null
2025-01-01 Multimodal Large Models Are Effective Action Anticipators Binglu Wang et.al. 2501.00795 link
2024-12-31 M2I2: Learning Efficient Multi-Agent Communication via Masked State Modeling and Intention Inference Chuxiong Sun et.al. 2501.00312 null
2024-12-30 A Large-Scale Study on Video Action Dataset Condensation Yang Chen et.al. 2412.21197 null
2024-12-30 Frequency-aware Event Cloud Network Hongwei Ren et.al. 2412.20803 null
2024-12-29 FreqMixFormerV2: Lightweight Frequency-aware Mixed Transformer for Human Skeleton Action Recognition Wenhan Wu et.al. 2412.20621 link
2024-12-29 Exploiting Aggregation and Segregation of Representations for Domain Adaptive Human Pose Estimation Qucheng Peng et.al. 2412.20538 link
2024-12-29 Improving Vision-Language-Action Models via Chain-of-Affordance Jinming Li et.al. 2412.20451 null
2024-12-28 DAVE: Diverse Atomic Visual Elements Dataset with High Representation of Vulnerable Road Users in Complex and Unpredictable Environments Xijun Wang et.al. 2412.20042 null
2024-12-27 Generalized Uncertainty-Based Evidential Fusion with Hybrid Multi-Head Attention for Weak-Supervised Temporal Action Localization Yuanpeng He et.al. 2412.19418 link
2024-12-25 SWAG: Long-term Surgical Workflow Prediction with Generative-based Anticipation Maxence Boels et.al. 2412.18849 null
2024-12-25 Skeleton-based Action Recognition with Non-linear Dependency Modeling and Hilbert-Schmidt Independence Criterion Yuheng Yang et.al. 2412.18780 link
2024-12-24 Computer Vision-Driven Gesture Recognition: Toward Natural and Intuitive Human-Computer Fenghua Shao et.al. 2412.18321 null
2024-12-23 HumanVBench: Exploring Human-Centric Video Understanding Capabilities of MLLMs with Synthetic Benchmark Data Ting Zhou et.al. 2412.17574 null
2024-12-22 Video Domain Incremental Learning for Human Action Recognition in Home Environments Yuanda Hu et.al. 2412.16946 null
2024-12-21 Optical Wireless Communications: Enabling the Next Generation Network of Networks Aravindh Krishnamoorthy et.al. 2412.16798 null
2024-12-21 FACTS: Fine-Grained Action Classification for Tactical Sports Christopher Lai et.al. 2412.16454 null
2024-12-20 iRadar: Synthesizing Millimeter-Waves from Wearable Inertial Inputs for Human Gesture Sensing Huanqi Yang et.al. 2412.15980 null
2024-12-19 Synchronized and Fine-Grained Head for Skeleton-Based Ambiguous Action Recognition Hao Huang et.al. 2412.14833 null
2024-12-19 Prototypical Calibrating Ambiguous Samples for Micro-Action Recognition Kun Li et.al. 2412.14719 link
2024-12-24 Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models Xinghang Li et.al. 2412.14058 link
2024-12-18 Do Language Models Understand Time? Xi Ding et.al. 2412.13845 link
2024-12-17 CompactFlowNet: Efficient Real-time Optical Flow Estimation on Mobile Devices Andrei Znobishchev et.al. 2412.13273 null
2024-12-20 Future Aspects in Human Action Recognition: Exploring Emerging Techniques and Ethical Influences Antonios Gasteratos et.al. 2412.12990 null
2024-12-16 Designing Semi-Structured Pruning of Graph Convolutional Networks for Skeleton-based Recognition Hichem Sahbi et.al. 2412.11813 null
2024-12-13 TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies Ruijie Zheng et.al. 2412.10345 null
2024-12-13 Building a Multi-modal Spatiotemporal Expert for Zero-shot Action Recognition with CLIP Yating Yu et.al. 2412.09895 link
2024-12-14 USDRL: Unified Skeleton-Based Dense Representation Learning with Multi-Grained Feature Decorrelation Wanjiang Weng et.al. 2412.09220 link
2024-12-13 Temporal Action Localization with Cross Layer Task Decoupling and Refinement Qiang Li et.al. 2412.09202 link
2024-12-12 Goal-Conditioned Supervised Learning for Multi-Objective Recommendation Shijun Li et.al. 2412.08911 null
2024-12-10 SAT: Spatial Aptitude Training for Multimodal Language Models Arijit Ray et.al. 2412.07755 link
2024-12-10 Manta: Enhancing Mamba for Few-Shot Action Recognition of Long Sub-Sequence Wenbo Huang et.al. 2412.07481 null
2024-12-09 Mining Limited Data Sufficiently: A BERT-inspired Approach for CSI Time Series Application in Wireless Communication and Sensing Zijian Zhao et.al. 2412.06861 link
2024-12-09 Exploring the Impact of Synthetic Data on Human Gesture Recognition Tasks Using GANs George Kontogiannis et.al. 2412.06389 null
2024-12-07 Action Recognition based Industrial Safety Violation Detection Surya N Reddy et.al. 2412.05531 null
2024-12-06 CCS: Continuous Learning for Customized Incremental Wireless Sensing Services Qunhang Fu et.al. 2412.04821 null
2024-12-06 KNN-MMD: Cross Domain Wi-Fi Sensing Based on Local Distribution Alignment Zijian Zhao et.al. 2412.04783 link
2024-12-03 Proximal Control of UAVs with Federated Learning for Human-Robot Collaborative Domains Lucas Nogueira Nobrega et.al. 2412.02863 null
2024-12-03 Planning-Guided Diffusion Policy Learning for Generalizable Contact-Rich Bimanual Manipulation Xuanlin Li et.al. 2412.02676 null
2024-12-02 Human-Machine Interfaces for Subsea Telerobotics: From Soda-straw to Natural Language Interactions Adnan Abdullah et.al. 2412.01753 null
2024-12-02 HaGRIDv2: 1M Images for Static and Dynamic Hand Gesture Recognition Anton Nuzhdin et.al. 2412.01508 link
2024-12-02 EdgeOAR: Real-time Online Action Recognition On Edge Devices Wei Luo et.al. 2412.01267 null
2024-11-29 CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation Qixiu Li et.al. 2411.19650 null
2024-11-29 SkelMamba: A State Space Model for Efficient Skeleton Action Recognition of Neurological Disorders Niki Martinel et.al. 2411.19544 null
2024-11-29 Hierarchical Framework for Retrosynthesis Prediction with Enhanced Reaction Center Localization Seongeun Yun et.al. 2411.19503 null
2024-11-28 TAMT: Temporal-Aware Model Tuning for Cross-Domain Few-Shot Action Recognition Yilong Wang et.al. 2411.19041 null
2024-11-28 Revealing Key Details to See Differences: A Novel Prototypical Perspective for Skeleton-based Action Recognition Hongda Liu et.al. 2411.18941 link
2024-11-27 Robust Dynamic Gesture Recognition at Ultra-Long Distances Eran Bamani Beeri et.al. 2411.18413 null
2024-11-27 EventCrab: Harnessing Frame and Point Synergy for Event-based Action Recognition and Beyond Meiqi Cao et.al. 2411.18328 null
2024-11-27 An End-to-End Two-Stream Network Based on RGB Flow and Representation Flow for Human Action Recognition Song-Jiang Lai et.al. 2411.18002 null
2024-11-26 Pre-training for Action Recognition with Automatically Generated Fractal Datasets Davyd Svyezhentsev et.al. 2411.17584 link
2024-11-26 Real-Time Multimodal Signal Processing for HRI in RoboCup: Understanding a Human Referee Filippo Ansalone et.al. 2411.17347 null
2024-11-22 TSkips: Efficiency Through Explicit Temporal Delay Connections in Spiking Neural Networks Prajna G. Malettira et.al. 2411.16711 null
2024-11-24 OccludeNet: A Causal Journey into Mixed-View Actor-Centric Video Action Recognition under Occlusions Guanyu Zhou et.al. 2411.15729 link
2024-11-23 Machine Learning-based sEMG Signal Classification for Hand Gesture Recognition Parshuram N. Aarotale et.al. 2411.15655 null
2024-11-23 Optimizing Gesture Recognition for Seamless UI Interaction Using Convolutional Neural Networks Qi Sun et.al. 2411.15598 null
2024-11-22 When Spatial meets Temporal in Action Recognition Huilin Chen et.al. 2411.15284 null
2024-11-22 Adaptive Hyper-Graph Convolution Network for Skeleton-based Human Action Recognition with Virtual Connections Youwei Zhou et.al. 2411.14796 null
2024-11-22 Aim My Robot: Precision Local Navigation to Any Object Xiangyun Meng et.al. 2411.14770 null
2024-11-21 Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning Jiange Yang et.al. 2411.14519 null
2024-11-18 Enhancing Bidirectional Sign Language Communication: Integrating YOLOv8 and NLP for Real-Time Gesture Recognition & Translation Hasnat Jamil Bhuiyan et.al. 2411.13597 null
2024-11-23 AzSLD: Azerbaijani Sign Language Dataset for Fingerspelling, Word, and Sentence Translation with Baseline Software Nigar Alishzade et.al. 2411.12865 null
2024-11-20 Topological Symmetry Enhanced Graph Convolution for Skeleton-Based Action Recognition Zeyu Liang et.al. 2411.12560 link
2024-11-19 Rethinking Top Probability from Multi-view for Distracted Driver Behaviour Localization Quang Vinh Nguyen et.al. 2411.12525 null
2024-11-18 Video-to-Task Learning via Motion-Guided Attention for Few-Shot Action Recognition Hanyu Guo et.al. 2411.11335 null
2024-11-18 Neuron: Learning Context-Aware Evolving Representations for Zero-Shot Skeleton Action Recognition Yang Chen et.al. 2411.11288 null
2024-11-18 Efficient Transfer Learning for Video-language Foundation Models Haoxing Chen et.al. 2411.11223 link
2024-11-16 TDSM: Triplet Diffusion for Skeleton-Text Matching in Zero-Shot Action Recognition Jeonghyeok Do et.al. 2411.10745 link
2024-11-15 KuaiFormer: Transformer-Based Retrieval at Kuaishou Chi Liu et.al. 2411.10057 null
2024-11-14 Towards Scalable Handwriting Communication via EEG Decoding and Latent Embedding Integration Jun-Young Kim et.al. 2411.09170 null
2024-11-14 VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation Youpeng Wen et.al. 2411.09153 null
2024-11-13 Can MLLMs Guide Weakly-Supervised Temporal Action Localization Tasks? Quan Zhang et.al. 2411.08466 null
2024-11-13 Generative AI for Data Augmentation in Wireless Networks: Analysis, Applications, and Case Study Jinbo Wen et.al. 2411.08341 null
2024-11-12 LapGSR: Laplacian Reconstructive Network for Guided Thermal Super-Resolution Aditya Kasliwal et.al. 2411.07750 null
2024-11-12 OWLed: Outlier-weighed Layerwise Pruning for Efficient Autonomous Driving Framework Jiaxi Li et.al. 2411.07711 null
2024-11-11 ConvMixFormer: A Resource-efficient Convolution Mixer for Transformer-based Dynamic Hand Gesture Recognition Mallika Garg et.al. 2411.07118 link
2024-11-10 Extended multi-stream temporal-attention module for skeleton-based human action recognition (HAR) Faisal Mehmood et.al. 2411.06553 null
2024-11-10 SuperResolution Radar Gesture Recognition Netanel Blumenfeld et.al. 2411.06410 null
2024-11-08 Video RWKV: Video Action Recognition Based RWKV Zhuowen Yin et.al. 2411.05636 null
2024-11-06 Object Recognition in Human Computer Interaction:- A Comparative Analysis Kaushik Ranade et.al. 2411.04263 null
2024-11-06 Explaining Human Activity Recognition with SHAP: Validating Insights with Perturbation and Quantitative Measures Felix Tempel et.al. 2411.03714 link
2024-11-05 One-Stage-TFS: Thai One-Stage Fingerspelling Dataset for Fingerspelling Recognition Frameworks Siriwiwat Lata et.al. 2411.02768 null
2024-11-04 TI-PREGO: Chain of Thought and In-Context Learning for Online Mistake Detection in PRocedural EGOcentric Videos Leonardo Plini et.al. 2411.02570 null
2024-11-04 AM Flow: Adapters for Temporal Processing in Action Recognition Tanay Agrawal et.al. 2411.02065 null
2024-11-04 ARN-LSTM: A Multi-Stream Attention-Based Model for Action Recognition with Temporal Dynamics Chuanchuan Wang et.al. 2411.01769 null
2024-10-31 Technical Report for ActivityNet Challenge 2022 – Temporal Action Localization Shimin Chen et.al. 2411.00883 null
2024-10-30 A Simple and Effective Temporal Grounding Pipeline for Basketball Broadcast Footage Levi Harris et.al. 2411.00862 null
2024-11-01 STAA: Spatio-Temporal Attention Attribution for Real-Time Interpreting Transformer-based Video Models Zerui Wang et.al. 2411.00630 link
2024-11-01 Human Action Recognition (HAR) Using Skeleton-based Spatial Temporal Relative Transformer Network: ST-RTR Faisal Mehmood et.al. 2410.23806 null
2024-10-31 Recovering Complete Actions for Cross-dataset Skeleton Action Recognition Hanchao Liu et.al. 2410.23641 null
2024-10-30 Keypoint Abstraction using Large Models for Object-Relative Imitation Learning Xiaolin Fang et.al. 2410.23254 null
2024-10-30 AtGCN: A Graph Convolutional Network For Ataxic Gait Detection Karan Bania et.al. 2410.22862 null
2024-10-29 ProMQA: Question Answering Dataset for Multimodal Procedural Activity Understanding Kimihiro Hasegawa et.al. 2410.22211 link
2024-10-29 Multi-Level Feature Distillation of Joint Teachers Trained on Distinct Image Datasets Adrian Iordache et.al. 2410.22184 link
2024-10-28 Enhancing Action Recognition by Leveraging the Hierarchical Structure of Actions and Textual Context Manuel Benavent-Lledo et.al. 2410.21275 link
2024-10-28 One-Step Diffusion Policy: Fast Visuomotor Policies via Diffusion Distillation Zhendong Wang et.al. 2410.21257 null
2024-10-28 Zero-Shot Action Recognition in Surveillance Videos Joao Pereira et.al. 2410.21113 null
2024-10-28 LiGAR: LiDAR-Guided Hierarchical Transformer for Multi-Modal Group Activity Recognition Naga Venkata Sai Raviteja Chappa et.al. 2410.21108 null
2024-10-27 Exocentric To Egocentric Transfer For Action Recognition: A Short Survey Anirudh Thatipelli et.al. 2410.20621 null
2024-10-27 Idempotent Unsupervised Representation Learning for Skeleton-Based Action Recognition Lilang Lin et.al. 2410.20349 null
2024-10-28 x-RAGE: eXtended Reality – Action & Gesture Events Dataset Vivek Parmar et.al. 2410.19486 null
2024-10-24 Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms Zhangheng Li et.al. 2410.18967 link
2024-10-24 Research on gesture recognition method based on SEDCNN-SVM Mingjin Zhang et.al. 2410.18557 null
2024-10-23 Unsupervised Domain Adaptation for Action Recognition via Self-Ensembling and Conditional Embedding Alignment Indrajeet Ghosh et.al. 2410.17489 link
2024-10-22 Are Visual-Language Models Effective in Action Recognition? A Comparative Study Mahmoud Ali et.al. 2410.17149 null
2024-10-22 Masked Differential Privacy David Schneider et.al. 2410.17098 null
2024-10-22 SpikMamba: When SNN meets Mamba in Event-based Human Action Recognition Jiaqi Chen et.al. 2410.16746 link
2024-10-21 Improving the Multi-label Atomic Activity Recognition by Robust Visual Feature and Advanced Attention @ ROAD++ Atomic Activity Recognition 2024 Jiamin Cao et.al. 2410.16037 null
2024-10-19 CAGE: Causal Attention Enables Data-Efficient Generalizable Robotic Manipulation Shangning Xia et.al. 2410.14974 null
2024-10-18 DFlow: Diverse Dialogue Flow Simulation with Large Language Models Wanyu Du et.al. 2410.14853 null
2024-10-18 Storyboard guided Alignment for Fine-grained Video Action Recognition Enqi Liu et.al. 2410.14238 null
2024-10-17 SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs Yuling Gu et.al. 2410.13648 null
2024-10-16 In-Context Learning Enables Robot Action Prediction in LLMs Yida Yin et.al. 2410.12782 null
2024-10-14 Continual Learning Improves Zero-Shot Action Recognition Shreyank N Gowda et.al. 2410.10497 null
2024-10-16 PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation Kaidong Zhang et.al. 2410.10394 null
2024-10-13 EITNet: An IoT-Enhanced Framework for Real-Time Basketball Action Recognition Jingyu Liu et.al. 2410.09954 null
2024-10-13 Multi class activity classification in videos using Motion History Image generation Senthilkumar Gopal et.al. 2410.09902 link
2024-10-12 Advanced Gesture Recognition in Autism: Integrating YOLOv7, Video Augmentation and VideoMAE for Video Analysis Amit Kumar Singh et.al. 2410.09339 null
2024-10-11 Aerial Vision-and-Language Navigation via Semantic-Topo-Metric Representation Guided LLM Reasoning Yunpeng Gao et.al. 2410.08500 null
2024-10-10 Human Stone Toolmaking Action Grammar (HSTAG): A Challenging Benchmark for Fine-grained Motor Behavior Recognition Cheng Liu et.al. 2410.08410 null
2024-10-10 Understanding Spatio-Temporal Relations in Human-Object Interaction using Pyramid Graph Convolutional Network Hao Xing et.al. 2410.07912 null
2024-10-09 CHASE: Learning Convex Hull Adaptive Shift for Skeleton-based Multi-Entity Action Recognition Yuhang Wen et.al. 2410.07153 link
2024-10-09 Fourier-based Action Recognition for Wildlife Behavior Quantification with Event Cameras Friedhelm Hamann et.al. 2410.06698 null
2024-10-08 GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation Chi-Lam Cheang et.al. 2410.06158 null
2024-10-10 ActionAtlas: A VideoQA Benchmark for Domain-specialized Action Recognition Mohammadreza Salehi et.al. 2410.05774 null
2024-10-07 Exploring Gestural Interaction with a Cushion Interface for Smart Home Control Yuri Suzuki et.al. 2410.04730 null
2024-10-05 TR-LLM: Integrating Trajectory Data for Scene-Aware LLM-Based Human Action Prediction Kojiro Takeyama et.al. 2410.03993 null
2024-10-04 Shadow Augmentation for Handwashing Action Recognition: from Synthetic to Real Datasets Shengtai Ju et.al. 2410.03984 null
2024-10-04 Action Selection Learning for Multi-label Multi-view Action Recognition Trung Thanh Nguyen et.al. 2410.03302 link
2024-10-03 DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects Zhaowei Wang et.al. 2410.02730 link
2024-10-03 An Evaluation of Large Pre-Trained Models for Gesture Recognition using Synthetic Videos Arun Reddy et.al. 2410.02152 null
2024-10-02 Language Supervised Human Action Recognition with Salient Fusion: Construction Worker Action Recognition as a Use Case Mohammad Mahdavian et.al. 2410.01962 null
2024-10-02 Sparse Covariance Neural Networks Andrea Cavallo et.al. 2410.01669 link
2024-10-02 Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy Ricardo Garcia et.al. 2410.01345 link
2024-10-01 Dynamic Planning for LLM-based Graphical User Interface Automation Shaoqing Zhang et.al. 2410.00467 link
2024-09-30 SurgPETL: Parameter-Efficient Image-to-Surgical-Video Transfer Learning for Surgical Phase Recognition Shu Yang et.al. 2409.20083 null
2024-09-28 Gesture Recognition for Feedback Based Mixed Reality and Robotic Fabrication: A Case Study of the UnLog Tower Alexander Htet Kyaw et.al. 2409.19281 null
2024-09-26 SOAR: Self-supervision Optimized UAV Action Recognition with Efficient Object-Aware Pretraining Ruiqi Xian et.al. 2409.18300 null
2024-09-26 Spatial Hierarchy and Temporal Attention Guided Cross Masking for Self-supervised Skeleton-based Action Recognition Xinpeng Yin et.al. 2409.17951 link
2024-09-26 EAGLE: Egocentric AGgregated Language-video Engine Jing Bi et.al. 2409.17523 null
2024-09-25 Path-adaptive Spatio-Temporal State Space Model for Event-based Recognition with Arbitrary Duration Jiazhou Zhou et.al. 2409.16953 null
2024-09-25 Dynamic Obstacle Avoidance through Uncertainty-Based Adaptive Planning with Diffusion Vineet Punyamoorty et.al. 2409.16950 null
2024-09-24 Hand Gesture Classification Based on Forearm Ultrasound Video Snippets Using 3D Convolutional Neural Networks Keshav Bimbraw et.al. 2409.16431 null
2024-09-22 Zero-Shot Skeleton-based Action Recognition with Dual Visual-Text Alignment Jidong Kuang et.al. 2409.14336 null
2024-09-21 Egocentric zone-aware action recognition across environments Simone Alberto Peirone et.al. 2409.14205 null
2024-09-19 Interpretable Action Recognition on Hard to Classify Actions Anastasia Anichenko et.al. 2409.13091 null
2024-09-18 Distillation-free Scaling of Large SSMs for Images and Videos Hamid Suleman et.al. 2409.11867 null
2024-09-17 Mamba Fusion: Learning Actions Through Questioning Zhikang Dong et.al. 2409.11513 link
2024-09-16 Forearm Ultrasound based Gesture Recognition on Edge Keshav Bimbraw et.al. 2409.09915 null
2024-09-15 Integrating Audio Narrations to Strengthen Domain Generalization in Multimodal First-Person Action Recognition Cagri Gungor et.al. 2409.09611 null
2024-09-14 MulCPred: Learning Multi-modal Concepts for Explainable Pedestrian Action Prediction Yan Feng et.al. 2409.09446 link
2024-09-14 KAN-HyperpointNet for Point Cloud Sequence-Based 3D Human Action Recognition Zhaoyu Chen et.al. 2409.09444 null
2024-09-14 ChildPlay-Hand: A Dataset of Hand Manipulations in the Wild Arya Farkhondeh et.al. 2409.09319 link
2024-09-13 Using The Concept Hierarchy for Household Action Recognition Andrei Costinescu et.al. 2409.08853 null
2024-09-12 Customized Mid-Air Gestures for Accessibility: A $B Recognizer for Multi-Dimensional Biosignal Gestures Momona Yamagami et.al. 2409.08402 null
2024-09-12 Spatial Adaptation Layer: Interpretable Domain Adaptation For Biosignal Sensor Array Applications Joao Pereira et.al. 2409.08058 null
2024-09-16 InterACT: Inter-dependency Aware Action Chunking with Hierarchical Attention Transformers for Bimanual Manipulation Andrew Lee et.al. 2409.07914 null
2024-09-11 2D bidirectional gated recurrent unit convolutional Neural networks for end-to-end violence detection In videos Abdarahmane Traoré et.al. 2409.07588 null
2024-09-10 Data Collection-free Masked Video Modeling Yuchi Ishikawa et.al. 2409.06665 null
2024-09-10 Advancements in Gesture Recognition Techniques and Machine Learning for Enhanced Human-Robot Interaction: A Comprehensive Review Sajjad Hussain et.al. 2409.06503 null
2024-09-10 Learning Generative Interactive Environments By Trained Agent Exploration Naser Kazemi et.al. 2409.06445 link
2024-09-09 ReL-SAR: Representation Learning for Skeleton Action Recognition with Convolutional Transformers and BYOL Safwen Naimi et.al. 2409.05749 null
2024-09-11 Real-Time Human Action Recognition on Embedded Platforms Ruiqi Wang et.al. 2409.05662 null
2024-09-06 Self-Supervised Contrastive Learning for Videos using Differentiable Local Alignment Keyne Oei et.al. 2409.04607 null
2024-09-05 MVTN: A Multiscale Video Transformer Network for Hand Gesture Recognition Mallika Garg et.al. 2409.03890 link
2024-09-05 UAV (Unmanned Aerial Vehicles): Diverse Applications of UAV Datasets in Segmentation, Classification, Detection, and Tracking Md. Mahfuzur Rahman et.al. 2409.03245 null
2024-09-04 SITAR: Semi-supervised Image Transformer for Action Recognition Owais Iqbal et.al. 2409.02910 null
2024-09-04 TASAR: Transferable Attack on Skeletal Action Recognition Yunfeng Diao et.al. 2409.02483 link
2024-09-04 Unified Framework with Consistency across Modalities for Human Activity Recognition Tuyen Tran et.al. 2409.02385 null
2024-09-07 Unfolding Videos Dynamics via Taylor Expansion Siyi Chen et.al. 2409.02371 null
2024-09-03 ADHD diagnosis based on action characteristics recorded in videos using machine learning Yichun Li et.al. 2409.02274 null
2024-09-03 Action-Based ADHD Diagnosis in Video Yichun Li et.al. 2409.02261 null
2024-09-03 ReSpike: Residual Frames-based Hybrid Spiking Neural Networks for Efficient Action Recognition Shiting Xiao et.al. 2409.01564 null
2024-09-02 FinePseudo: Improving Pseudo-Labelling through Temporal-Alignablity for Semi-Supervised Fine-Grained Action Recognition Ishan Rajendrakumar Dave et.al. 2409.01448 null
2024-09-01 Fisher Information guided Purification against Backdoor Attacks Nazmul Karim et.al. 2409.00863 link
2024-09-01 A Critical Analysis on Machine Learning Techniques for Video-based Human Activity Recognition of Surveillance Systems: A Review Shahriar Jahan et.al. 2409.00731 null
2024-09-03 Open-vocabulary Temporal Action Localization using VLMs Naoki Wake et.al. 2408.17422 null
2024-08-29 Text-Enhanced Zero-Shot Action Recognition: A training-free approach Massimo Bosetti et.al. 2408.16412 null
2024-08-28 DEAR: Depth-Enhanced Action Recognition Sadegh Rahmaniboldaji et.al. 2408.15679 link
2024-08-28 Online pre-training with long-form videos Itsuki Kato et.al. 2408.15651 null
2024-09-04 Hand1000: Generating Realistic Hands from Text with Only 1,000 Images Haozhuo Zhang et.al. 2408.15461 null
2024-08-26 Comparative Analysis: Violence Recognition from Videos using Transfer Learning Dursun Dashdamirov et.al. 2408.14659 link
2024-08-25 Towards Completeness: A Generalizable Action Proposal Generator for Zero-Shot Temporal Action Localization Jia-Run Du et.al. 2408.13777 link
2024-08-25 FMI-TAL: Few-shot Multiple Instances Temporal Action Localization by Probability Distribution Learning and Interval Cluster Refinement Fengshun Wang et.al. 2408.13765 link
2024-08-25 EMG-Based Hand Gesture Recognition through Diverse Domain Feature Enhancement and Machine Learning-Based Approach Abu Saleh Musa Miah et.al. 2408.13723 null
2024-08-24 HabitAction: A Video Dataset for Human Habitual Behavior Recognition Hongwu Li et.al. 2408.13463 null
2024-08-23 N-DriverMotion: Driver motion learning and prediction using an event-based camera and directly trained spiking neural networks Hyo Jong Chung et.al. 2408.13379 null
2024-08-23 Energy-Efficient Spiking Recurrent Neural Network for Gesture Recognition on Embedded GPUs Marzieh Hassanshahi Varposhti et.al. 2408.12978 null
2024-08-21 Data-Free Class Incremental Gesture Recognition via Synthetic Feature Sampling Zhenyu Lu et.al. 2408.12629 null
2024-08-22 Frame Order Matters: A Temporal Sequence-Aware Model for Few-Shot Action Recognition Bozheng Li et.al. 2408.12475 null
2024-08-23 TWLV-I: Analysis and Insights from Holistic Evaluation on Video Foundation Models Hyeongmin Lee et.al. 2408.11318 link
2024-08-21 CrossFi: A Cross Domain Wi-Fi Sensing Framework Based on Siamese Network Zijian Zhao et.al. 2408.10919 link
2024-08-20 TDS-CLIP: Temporal Difference Side Network for Image-to-Video Transfer Learning Bin Wang et.al. 2408.10688 link
2024-08-19 Narrowing the Gap between Vision and Action in Navigation Yue Zhang et.al. 2408.10388 link
2024-08-19 SHARP: Segmentation of Hands and Arms by Range using Pseudo-Depth for Enhanced Egocentric 3D Hand Pose Estimation and Action Recognition Wiktor Mucha et.al. 2408.10037 link
2024-08-19 Event Stream based Human Action Recognition: A High-Definition Benchmark Dataset and Algorithms Xiao Wang et.al. 2408.09764 link
2024-08-18 Joint Temporal Pooling for Improving Skeleton-based Action Recognition Shanaka Ramesh Gunasekara et.al. 2408.09356 null
2024-08-17 Intuitive Human-Robot Interface: A 3-Dimensional Action Recognition and UAV Collaboration Framework Akash Chaudhary et.al. 2408.09232 null
2024-08-17 Flatten: Video Action Recognition is an Image Classification task Junlin Chen et.al. 2408.09220 null
2024-08-17 Temporal Reversed Training for Spiking Neural Networks with Generalized Spatio-Temporal Representation Lin Zuo et.al. 2408.09108 null
2024-08-16 Towards Physical World Backdoor Attacks against Skeleton Action Recognition Qichen Zheng et.al. 2408.08671 null
2024-08-15 An Advanced Deep Learning Based Three-Stream Hybrid Model for Dynamic Hand Gesture Recognition Md Abdur Rahim et.al. 2408.08035 null
2024-08-12 HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization Sakib Reza et.al. 2408.06437 link
2024-08-12 Probabilistic Vision-Language Representation for Weakly Supervised Temporal Action Localization Geuntaek Lim et.al. 2408.05955 link
2024-08-10 A Methodological and Structural Review of Hand Gesture Recognition Across Diverse Data Modalities Jungpil Shin et.al. 2408.05436 null
2024-08-10 EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition Ahmed Abdelkawy et.al. 2408.05421 link
2024-08-06 Prototype Learning for Micro-gesture Classification Guoliang Chen et.al. 2408.03097 null
2024-08-06 Online Temporal Action Localization with Memory-Augmented Transformer Youngkil Song et.al. 2408.02957 null
2024-08-05 From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation Xin Liu et.al. 2408.02769 null
2024-08-04 Enhancing Human Action Recognition and Violence Detection Through Deep Learning Audiovisual Fusion Pooya Janani et.al. 2408.02033 null
2024-08-03 MultiFuser: Multimodal Fusion Transformer for Enhanced Driver Action Recognition Ruoyu Wang et.al. 2408.01766 null
2024-08-03 Signal-SGN: A Spiking Graph Convolutional Network for Skeletal Action Recognition via Learning Temporal-Frequency Dynamics Naichuan Zheng et.al. 2408.01701 null
2024-08-01 Text-Guided Video Masked Autoencoder David Fan et.al. 2408.00759 null
2024-08-01 How Effective are Self-Supervised Models for Contact Identification in Videos Malitha Gunawardhana et.al. 2408.00498 null
2024-08-01 Task-Adapter: Task-specific Adaptation of Image Models for Few-shot Action Recognition Congqi Cao et.al. 2408.00249 null
2024-07-31 Explainable Artificial Intelligence for Quantifying Interfering and High-Risk Behaviors in Autism Spectrum Disorder in a Real-World Classroom Environment Using Privacy-Preserving Video Analysis Barun Das et.al. 2407.21691 null
2024-07-31 Skeleton-Based Action Recognition with Spatial-Structural Graph Convolution Jingyao Wang et.al. 2407.21525 null
2024-07-31 Dynamic Gesture Recognition in Ultra-Range Distance for Effective Human-Robot Interaction Eran Bamani Beeri et.al. 2407.21374 null
2024-07-29 Adversarial Robustness in RGB-Skeleton Action Recognition: Leveraging Attention Modality Reweighter Chao Liu et.al. 2407.19981 null
2024-07-29 ActivityCLIP: Enhancing Group Activity Recognition by Mining Complementary Information from Text to Supplement Image Modality Guoliang Xu et.al. 2407.19820 null
2024-07-29 PredIN: Towards Open-Set Gesture Recognition via Prediction Inconsistency Chen Liu et.al. 2407.19753 null
2024-07-28 Skeleton-based Group Activity Recognition via Spatial-Temporal Panoramic Graph Zhengcen Li et.al. 2407.19497 link
2024-07-25 MARINE: A Computer Vision Model for Detecting Rare Predator-Prey Interactions in Animal Videos Zsófia Katona et.al. 2407.18289 null
2024-07-25 Trajectory-aligned Space-time Tokens for Few-shot Action Recognition Pulkit Kumar et.al. 2407.18249 null
2024-07-26 Harnessing Temporal Causality for Advanced Temporal Action Detection Shuming Liu et.al. 2407.17792 link
2024-07-23 Fusion and Cross-Modal Transfer for Zero-Shot Human Action Recognition Abhi Kamboj et.al. 2407.16803 null
2024-07-23 PLM-Net: Perception Latency Mitigation Network for Vision-Based Lateral Control of Autonomous Vehicles Aws Khalil et.al. 2407.16740 link
2024-07-24 SOAP: Enhancing Spatio-Temporal Relation and Motion Information Capturing for Few-Shot Action Recognition Wenbo Huang et.al. 2407.16344 link
2024-07-22 Efficient and generalizable prediction of molecular alterations in multiple cancer cohorts using H&E whole slide images Kshitij Ingale et.al. 2407.15816 null
2024-07-25 Multi-Modality Co-Learning for Efficient Skeleton-based Action Recognition Jinfu Liu et.al. 2407.15706 link
2024-07-21 Semi-Supervised Pipe Video Temporal Defect Interval Localization Zhu Huang et.al. 2407.15170 null
2024-07-20 Automated Patient Positioning with Learned 3D Hand Gestures Zhongpai Gao et.al. 2407.14903 null
2024-07-20 Can VLMs be used on videos for action recognition? LLMs are Visual Reasoning Coordinators Harsh Lunia et.al. 2407.14834 null
2024-07-20 Decoupled Prompt-Adapter Tuning for Continual Activity Recognition Di Fu et.al. 2407.14811 null
2024-07-20 A Comprehensive Review of Few-shot Action Recognition Yuyang Wanyan et.al. 2407.14744 null
2024-07-19 LORTSAR: Low-Rank Transformer for Skeleton-based Action Recognition Soroush Oraki et.al. 2407.14655 null
2024-07-19 Fine-grained Knowledge Graph-driven Video-Language Learning for Action Recognition Rui Zhang et.al. 2407.14146 null
2024-07-19 Zero-Shot Underwater Gesture Recognition Sandipan Sarma et.al. 2407.14103 link
2024-07-18 Pose-guided multi-task video transformer for driver action recognition Ricardo Pizarro et.al. 2407.13750 null
2024-07-18 SA-DVAE: Improving Zero-Shot Skeleton-Based Action Recognition by Disentangled Variational Autoencoders Sheng-Wei Li et.al. 2407.13460 link
2024-07-18 QuIIL at T3 challenge: Towards Automation in Life-Saving Intervention Procedures from First-Person View Trinh T. L. Vuong et.al. 2407.13216 link
2024-07-18 Enhancing Temporal Action Localization: Advanced S6 Modeling with Recurrent Mechanism Sangyoun Lee et.al. 2407.13078 link
2024-07-17 ActionSwitch: Class-agnostic Detection of Simultaneous Actions in Streaming Videos Hyolim Kang et.al. 2407.12987 link
2024-07-17 NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models Gengze Zhou et.al. 2407.12366 link
2024-07-17 Frequency Guidance Matters: Skeletal Action Recognition by Frequency-Aware Mixed Transformer Wenhan Wu et.al. 2407.12322 null
2024-07-17 Shap-Mix: Shapley Value Guided Mixing for Long-Tailed Skeleton Based Action Recognition Jiahang Zhang et.al. 2407.12312 null
2024-07-16 Enhancing Split Computing and Early Exit Applications through Predefined Sparsity Luigi Capogrosso et.al. 2407.11763 link
2024-07-10 Exploring the Boundaries of On-Device Inference: When Tiny Falls Short, Go Hierarchical Adarsh Prasad Behera et.al. 2407.11061 null
2024-07-15 STARS: Self-supervised Tuning for 3D Action Recognition in Skeleton Sequences Soroush Mehraban et.al. 2407.10935 null
2024-07-15 Human-Centric Transformer for Domain Adaptive Action Recognition Kun-Yu Lin et.al. 2407.10860 null
2024-07-17 Augmented Neural Fine-Tuning for Efficient Backdoor Purification Nazmul Karim et.al. 2407.10052 link
2024-07-13 Region-aware Image-based Human Action Retrieval with Transformers Hongsong Wang et.al. 2407.09924 null
2024-07-16 OmniRace: 6D Hand Pose Estimation for Intuitive Guidance of Racing Drone Valerii Serpiva et.al. 2407.09841 link
2024-07-12 Full-Stage Pseudo Label Quality Enhancement for Weakly-supervised Temporal Action Localization Qianhan Feng et.al. 2407.08971 link
2024-07-11 Boosting Adversarial Transferability for Skeleton-based Action Recognition via Exploring the Model Posterior Space Yunfeng Diao et.al. 2407.08572 null
2024-07-12 Towards Adaptive Pseudo-label Learning for Semi-Supervised Temporal Action Localization Feixiang Zhou et.al. 2407.07673 null
2024-07-10 EA-VTR: Event-Aware Video-Text Retrieval Zongyang Ma et.al. 2407.07478 null
2024-07-09 Exploring Scalability of Self-Training for Open-Vocabulary Temporal Action Localization Jeongseok Hyun et.al. 2407.07024 link
2024-07-09 Rethinking Image-to-Video Adaptation: An Object-centric Perspective Rui Qian et.al. 2407.06871 null
2024-07-09 Masked Video and Body-worn IMU Autoencoder for Egocentric Action Recognition Mingfang Zhang et.al. 2407.06628 null
2024-07-08 Noise-Free Explanation for Driving Action Prediction Hongbo Zhu et.al. 2407.06339 link
2024-07-08 C2C: Component-to-Composition Learning for Zero-Shot Compositional Action Recognition Rongchang Li et.al. 2407.06113 link
2024-07-08 DMSD-CDFSAR: Distillation from Mixed-Source Domain for Cross-Domain Few-shot Action Recognition Fei Guo et.al. 2407.05657 null
2024-07-11 Helios: An extremely low power event-based gesture recognition for always-on smart eyewear Prarthana Bhattacharyya et.al. 2407.05206 null
2024-07-06 DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition Qi Wang et.al. 2407.05106 link
2024-07-05 AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation Yuhan Zhu et.al. 2407.04603 null
2024-07-05 TF-SASM: Training-free Spatial-aware Sparse Memory for Multi-object Tracking Thuc Nguyen-Quang et.al. 2407.04327 null
2024-07-05 Computer Vision for Clinical Gait Analysis: A Gait Abnormality Video Dataset Rahm Ranjan et.al. 2407.04190 null
2024-07-04 Robust Policy Learning for Multi-UAV Collision Avoidance with Causal Feature Selection Jiafan Zhuang et.al. 2407.04056 null
2024-07-04 On-Device Training Empowered Transfer Learning For Human Activity Recognition Pixi Kang et.al. 2407.03644 null
2024-07-03 Motion meets Attention: Video Motion Prompts Qixiang Chen et.al. 2407.03179 null
2024-07-02 Advancing Compressed Video Action Recognition through Progressive Knowledge Distillation Efstathia Soufleri et.al. 2407.02713 link
2024-07-02 Novel Human Machine Interface via Robust Hand Gesture Recognition System using Channel Pruned YOLOv5s Model Abir Sen et.al. 2407.02585 null
2024-07-02 Referring Atomic Video Action Recognition Kunyu Peng et.al. 2407.01872 link
2024-07-01 Mask and Compress: Efficient Skeleton-based Action Recognition in Continual Learning Matteo Mosconi et.al. 2407.01397 link
2024-06-30 Graph in Graph Neural Network Jiongshu Wang et.al. 2407.00696 link
2024-06-29 Diving Deeper Into Pedestrian Behavior Understanding: Intention Estimation, Action Prediction, and Event Risk Assessment Amir Rasouli et.al. 2407.00446 link
2024-06-29 PerAct2: A Perceiver Actor Framework for Bimanual Manipulation Tasks Markus Grotz et.al. 2407.00278 null
2024-06-27 VideoMambaPro: A Leap Forward for Mamba in Video Understanding Hui Lu et.al. 2406.19006 link
2024-06-28 CSI4Free: GAN-Augmented mmWave CSI for Improved Pose Classification Nabeel Nisar Bhat et.al. 2406.18684 null
2024-06-26 The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval Meinardus Boris et.al. 2406.18113 link
2024-07-01 EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation Baoqi Pei et.al. 2406.18070 link
2024-06-26 Expressive Keypoints for Skeleton-based Action Recognition via Skeleton Transformation Yijie Yang et.al. 2406.18011 link
2024-06-25 Using joint angles based on the international biomechanical standards for human action recognition and related tasks Kevin Schlegel et.al. 2406.17443 null
2024-06-21 Open-Vocabulary Temporal Action Localization using Multimodal Guidance Akshita Gupta et.al. 2406.15556 null
2024-06-21 SVFormer: A Direct Training Spiking Transformer for Efficient Video Action Recognition Liutao Yu et.al. 2406.15034 null
2024-06-21 Real-Time Hand Gesture Recognition: Integrating Skeleton-Based Data Fusion and Multi-Stream CNN Oluwaleke Yusuf et.al. 2406.15003 link
2024-06-20 Self-supervised Multi-actor Social Activity Understanding in Streaming Videos Shubham Trehan et.al. 2406.14472 null
2024-06-19 An Efficient yet High-Performance Method for Precise Radar-Based Imaging of Human Hand Poses Johanna Bräunig et.al. 2406.13464 null
2024-06-19 Part-aware Unified Representation of Language and Skeleton for Zero-shot Action Recognition Anqi Zhu et.al. 2406.13327 link
2024-06-21 Underwater Human-Robot and Human-Swarm Interaction: A Review and Perspective Sara Aldhaheri et.al. 2406.12473 null
2024-06-18 Deep self-supervised learning with visualisation for automatic gesture recognition Fabien Allemand et.al. 2406.12440 null
2024-06-17 Brain-inspired Computational Modeling of Action Recognition with Recurrent Spiking Neural Networks Equipped with Reinforcement Delay Learning Alireza Nadafian et.al. 2406.11778 null
2024-06-18 CM2-Net: Continual Cross-Modal Mapping Network for Driver Action Recognition Ruoyu Wang et.al. 2406.11340 null
2024-06-17 Expanding the Design Space of Computer Vision-based Interactive Systems for Group Dance Practice Soohwan Lee et.al. 2406.11236 null
2024-06-14 Nymeria: A Massive Collection of Multimodal Egocentric Daily Motion in the Wild Lingni Ma et.al. 2406.09905 null
2024-06-12 Enhancing End-to-End Autonomous Driving with Latent World Model Yingyan Li et.al. 2406.08481 link
2024-06-09 ALGO: Object-Grounded Visual Commonsense Reasoning for Open-World Egocentric Action Recognition Sanjoy Kundu et.al. 2406.05722 null
2024-06-07 SMART: Scene-motion-aware human action recognition framework for mental disorder group Zengyuan Lai et.al. 2406.04649 link
2024-06-06 Enhancing Sign Language Detection through Mediapipe and Convolutional Neural Networks (CNN) Aditya Raj Verma et.al. 2406.03729 null
2024-06-05 The Logarithmic Memristor-Based Bayesian Machine Clément Turck et.al. 2406.03492 null
2024-06-05 FILS: Self-Supervised Video Feature Prediction In Semantic Language Space Mona Ahmadian et.al. 2406.03447 null
2024-06-05 Self-Supervised Skeleton Action Representation Learning: A Benchmark and Beyond Jiahang Zhang et.al. 2406.02978 null
2024-06-04 Contrastive Language Video Time Pre-training Hengyue Liu et.al. 2406.02631 null
2024-06-04 DL-KDD: Dual-Light Knowledge Distillation for Action Recognition in the Dark Chi-Jui Chang et.al. 2406.02468 null
2024-06-04 A Generalized Apprenticeship Learning Framework for Modeling Heterogeneous Student Pedagogical Strategies Md Mirajul Islam et.al. 2406.02450 null
2024-06-04 Analyzing the Feature Extractor Networks for Face Image Synthesis Erdi Sarıtaş et.al. 2406.02153 link
2024-06-04 Analyzing the Effect of Combined Degradations on Face Recognition Erdi Sarıtaş et.al. 2406.02142 link
2024-06-03 ELSA: Evaluating Localization of Social Activities in Urban Streets Maryam Hosseini et.al. 2406.01551 null
2024-06-03 HHMR: Holistic Hand Mesh Recovery by Enhancing the Multimodal Controllability of Graph Diffusion Models Mengcheng Li et.al. 2406.01334 null
2024-06-03 Augmented Commonsense Knowledge for Remote Object Grounding Bahram Mohammadi et.al. 2406.01256 link
2024-06-03 Understanding the Cross-Domain Capabilities of Video-Based Few-Shot Action Recognition Models Georgia Markham et.al. 2406.01073 null
2024-06-02 An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition Haojun Xu et.al. 2406.00639 null
2024-05-31 Action-OOD: An End-to-End Skeleton-Based Model for Robust Out-of-Distribution Human Action Detection Jing Xu et.al. 2405.20633 link
2024-05-31 Vision-Language Meets the Skeleton: Progressively Distillation with Cross-Modal Knowledge for 3D Action Representation Learning Yang Chen et.al. 2405.20606 null
2024-05-30 ENTIRe-ID: An Extensive and Diverse Dataset for Person Re-Identification Serdar Yildiz et.al. 2405.20465 null
2024-05-30 From Forest to Zoo: Great Ape Behavior Recognition with ChimpBehave Michael Fuchs et.al. 2405.20025 null
2024-05-31 Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition Masashi Hatano et.al. 2405.19917 null
2024-05-30 EgoSurgery-Phase: A Dataset of Surgical Phase Recognition from Egocentric Open Surgery Videos Ryo Fujii et.al. 2405.19644 link
2024-05-30 SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation Junjie Zhang et.al. 2405.19586 null
2024-05-29 Matrix Manifold Neural Networks++ Xuan Son Nguyen et.al. 2405.19206 null
2024-05-29 Exploring AI-based Anonymization of Industrial Image and Video Data in the Context of Feature Preservation Sabrina Cynthia Triess et.al. 2405.19173 null
2024-05-28 Flow-Assisted Motion Learning Network for Weakly-Supervised Group Activity Recognition Muhammad Adi Nugroho et.al. 2405.18012 null
2024-05-30 Benchmarking Skeleton-based Motion Encoder Models for Clinical Applications: Estimating Parkinson’s Disease Severity in Walking Sequences Vida Adeli et.al. 2405.17817 link
2024-05-28 Hierarchical Action Recognition: A Contrastive Video-Language Approach with Hierarchical Interactions Rui Zhang et.al. 2405.17729 null
2024-05-28 EgoNCE++: Do Egocentric Video-Language Models Really Understand Hand-Object Interactions? Boshen Xu et.al. 2405.17719 link
2024-05-27 Advancements in Tactile Hand Gesture Recognition for Enhanced Human-Machine Interaction Chiara Fumelli et.al. 2405.17038 null
2024-05-27 A Cross-Dataset Study for Text-based 3D Human Motion Retrieval Léore Bensabath et.al. 2405.16909 null
2024-05-26 Flow Snapshot Neurons in Action: Deep Neural Networks Generalize to Biological Motion Perception Shuangpeng Han et.al. 2405.16493 null
2024-05-25 Application of Artificial Intelligence in Hand Gesture Recognition with Virtual Reality: Survey and Analysis of Hand Gesture Hardware Selection Jindi Wang et.al. 2405.16264 null
2024-05-22 From CNNs to Transformers in Multimodal Human Action Recognition: A Survey Muhammad Bilal Shaikh et.al. 2405.15813 null
2024-05-24 V-Zen: Efficient GUI Understanding and Precise Grounding With A Novel Multimodal LLM Abdur Rahman et.al. 2405.15341 link
2024-05-23 Enhanced Spatiotemporal Prediction Using Physical-guided And Frequency-enhanced Recurrent Neural Networks Xuanle Zhao et.al. 2405.14504 null
2024-05-23 SpGesture: Source-Free Domain-adaptive sEMG-based Gesture Recognition with Jaccard Attentive Spiking Neural Network Weiyu Guo et.al. 2405.14398 null
2024-05-23 MAMBA4D: Efficient Long-Sequence Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models Jiuming Liu et.al. 2405.14338 null
2024-05-22 Counterfactual Gradients-based Quantification of Prediction Trust in Neural Networks Mohit Prabhushankar et.al. 2405.13758 null
2024-05-21 Identity-free Artificial Emotional Intelligence via Micro-Gesture Understanding Rong Gao et.al. 2405.13206 null
2024-05-22 Building Temporal Kernels with Orthogonal Polynomials Yan Ru Pei et.al. 2405.12179 link
2024-05-18 GestFormer: Multiscale Wavelet Pooling Transformer Network for Dynamic Hand Gesture Recognition Mallika Garg et.al. 2405.11180 link
2024-05-17 Air Signing and Privacy-Preserving Signature Verification for Digital Documents P. Sarveswarasarma et.al. 2405.10868 null
2024-05-17 MC-GPT: Empowering Vision-and-Language Navigation with Memory Map and Reasoning Chains Zhaohuan Zhan et.al. 2405.10620 null
2024-05-06 MEET: Mixture of Experts Extra Tree-Based sEMG Hand Gesture Identification Naveen Gehlot et.al. 2405.09562 null
2024-05-14 Wearable Sensor-Based Few-Shot Continual Learning on Hand Gestures for Motor-Impaired Individuals via Latent Embedding Exploitation Riyad Bin Rafiq et.al. 2405.08969 link
2024-05-14 The impact of Compositionality in Zero-shot Multi-label action recognition for Object-based tasks Carmela Calabrese et.al. 2405.08695 null
2024-05-15 POWQMIX: Weighted Value Factorization with Potentially Optimal Joint Actions Recognition for Cooperative Multi-Agent Reinforcement Learning Chang Huang et.al. 2405.08036 null
2024-05-13 Coarse or Fine? Recognising Action End States without Labels Davide Moltisanti et.al. 2405.07723 link
2024-05-11 PRENet: A Plane-Fit Redundancy Encoding Point Cloud Sequence Network for Real-Time 3D Action Recognition Shenglin He et.al. 2405.06929 null
2024-05-10 CasCalib: Cascaded Calibration for Motion Capture from Sparse Unsynchronized Cameras James Tang et.al. 2405.06845 link
2024-05-09 A Survey on Backbones for Deep Video Action Recognition Zixuan Tang et.al. 2405.05584 null
2024-05-06 OmniActions: Predicting Digital Actions in Response to Real-World Multimodal Sensory Inputs with LLMs Jiahao Nick Li et.al. 2405.03901 null
2024-05-05 JOSENet: A Joint Stream Embedding Network for Violence Detection in Surveillance Videos Pietro Nardelli et.al. 2405.02961 null
2024-05-03 On the Utility of External Agent Intention Predictor for Human-AI Coordination Chenxu Wang et.al. 2405.02229 null
2024-05-11 MVP-Shot: Multi-Velocity Progressive-Alignment Framework for Few-Shot Action Recognition Hongyu Qu et.al. 2405.02077 null
2024-05-03 Enhancing Micro Gesture Recognition for Emotion Understanding via Context-aware Visual-Text Contrastive Learning Deng Li et.al. 2405.01885 link
2024-05-02 Multi-view Action Recognition via Directed Gromov-Wasserstein Discrepancy Hoang-Quan Nguyen et.al. 2405.01337 null
2024-05-07 Towards Inclusive Face Recognition Through Synthetic Ethnicity Alteration Praveen Kumar Chandaliya et.al. 2405.01273 null
2024-04-30 One-Stage Open-Vocabulary Temporal Action Detection Leveraging Temporal Multi-scale and Action Label Features Trung Thanh Nguyen et.al. 2404.19542 link
2024-04-30 Cross-Block Fine-Grained Semantic Cascade for Skeleton-Based Sports Action Recognition Zhendong Liu et.al. 2404.19383 null
2024-04-28 Enhancing Action Recognition from Low-Quality Skeleton Data via Part-Level Knowledge Distillation Cuiwei Liu et.al. 2404.18206 null
2024-04-26 SDFD: Building a Versatile Synthetic Face Image Dataset with Diverse Attributes Georgia Baltsou et.al. 2404.17255 null
2024-04-25 Learning Discriminative Spatio-temporal Representations for Semi-supervised Action Recognition Yu Wang et.al. 2404.16416 null
2024-04-25 An Improved Graph Pooling Network for Skeleton-Based Action Recognition Cong Wu et.al. 2404.16359 null
2024-04-24 Unimodal and Multimodal Sensor Fusion for Wearable Activity Recognition Hymalai Bello et.al. 2404.16005 null
2024-04-24 3D Face Morphing Attack Generation using Non-Rigid Registration Jag Mohan Singh et.al. 2404.15765 null
2024-04-25 HDBN: A Novel Hybrid Dual-branch Network for Robust Skeleton-based Action Recognition Jinfu Liu et.al. 2404.15719 link
2024-04-23 Combating Missing Modalities in Egocentric Videos at Test Time Merey Ramazanova et.al. 2404.15161 null
2024-04-23 G3R: Generating Rich and Fine-grained mmWave Radar Data from 2D Videos for Generalized Gesture Recognition Kaikai Deng et.al. 2404.14934 null
2024-04-23 Driver Activity Classification Using Generalizable Representations from Vision-Language Models Ross Greer et.al. 2404.14906 null
2024-04-23 DENOISER: Rethinking the Robustness for Open-Vocabulary Action Recognition Haozhe Cheng et.al. 2404.14890 null
2024-04-22 1st Place Solution to the 1st SkatingVerse Challenge Tao Sun et.al. 2404.14032 null
2024-04-22 CoFInAl: Enhancing Action Quality Assessment with Coarse-to-Fine Instruction Alignment Kanglei Zhou et.al. 2404.13999 link
2024-04-21 Attack on Scene Flow using Point Clouds Haniyeh Ehsani Oskouie et.al. 2404.13621 null
2024-04-20 STAT: Towards Generalizable Temporal Action Localization Yangcen Liu et.al. 2404.13311 null
2024-04-19 Ring-a-Pose: A Ring for Continuous Hand Pose Tracking Tianhong Catherine Yu et.al. 2404.12980 null
2024-04-19 VoxAtnNet: A 3D Point Clouds Convolutional Neural Network for Generalizable Face Presentation Attack Detection Raghavendra Ramachandra et.al. 2404.12680 null
2024-04-18 DeepLocalization: Using change point detection for Temporal Action Localization Mohammed Shaiqur Rahman et.al. 2404.12258 null
2024-04-18 Aligning Actions and Walking to LLM-Generated Textual Descriptions Radu Chivereanu et.al. 2404.12192 link
2024-04-18 Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition Xunsong Li et.al. 2404.11903 null
2024-04-18 sEMG-based Fine-grained Gesture Recognition via Improved LightGBM Model Xiupeng Qiao et.al. 2404.11861 null
2024-04-17 VG4D: Vision-Language Model Goes 4D Video Recognition Zhichao Deng et.al. 2404.11605 link
2024-04-17 A Data-Driven Representation for Sign Language Production Harry Walsh et.al. 2404.11499 link
2024-04-17 Lower Limb Movements Recognition Based on Feature Recursive Elimination and Backpropagation Neural Network Yongkai Ma et.al. 2404.11383 null
2024-04-17 Revisiting Noise Resilience Strategies in Gesture Recognition: Short-Term Enhancement in Surface Electromyographic Signal Analysis Weiyu Guo et.al. 2404.11213 null
2024-04-17 Kathakali Hand Gesture Recognition With Minimal Data Kavitha Raju et.al. 2404.11205 null
2024-04-16 HumMUSS: Human Motion Understanding using State Space Models Arnab Kumar Mondal et.al. 2404.10880 null
2024-04-17 Learning to Score Sign Language with Two-stage Method Hongli Wen et.al. 2404.10383 null
2024-04-16 MK-SGN: A Spiking Graph Convolutional Network with Multimodal Fusion and Knowledge Distillation for Skeleton-based Action Recognition Naichuan Zheng et.al. 2404.10210 null
2024-04-15 Design and Analysis of Efficient Attention in Transformers for Social Group Activity Recognition Masato Tamura et.al. 2404.09964 null
2024-04-15 A Diffusion-based Data Generator for Training Object Recognition Models in Ultra-Range Distance Eran Bamani et.al. 2404.09846 null
2024-04-15 Leveraging Temporal Contextualization for Video Action Recognition Minji Kim et.al. 2404.09490 link
2024-04-14 In My Perspective, In My Hands: Accurate Egocentric 2D Hand Pose and Action Recognition Wiktor Mucha et.al. 2404.09308 null
2024-04-13 Exploring Explainability in Video Action Recognition Avinab Saha et.al. 2404.09067 null
2024-04-12 MSSTNet: A Multi-Scale Spatio-Temporal CNN-Transformer Network for Dynamic Facial Expression Recognition Linhuang Wang et.al. 2404.08433 null
2024-04-11 Graph Integrated Language Transformers for Next Action Prediction in Complex Phone Calls Amin Hosseiny Marani et.al. 2404.08155 null
2024-04-11 Simba: Mamba augmented U-ShiftGCN for Skeletal Action Recognition in Videos Soumyabrata Chaudhuri et.al. 2404.07645 null
2024-04-15 Fine-Grained Side Information Guided Dual-Prompts for Zero-Shot Skeleton Action Recognition Yang Chen et.al. 2404.07487 null
2024-04-10 O-TALC: Steps Towards Combating Oversegmentation within Online Action Segmentation Matthew Kent Myers et.al. 2404.06894 null
2024-04-10 An Animation-based Augmentation Approach for Action Recognition from Discontinuous Video Xingyu Song et.al. 2404.06741 null
2024-04-07 X-VARS: Introducing Explainability in Football Refereeing with Multi-Modal Large Language Model Jan Held et.al. 2404.06332 null
2024-04-10 Algorithms for Caching and MTS with reduced number of predictions Karim Abdel Sadek et.al. 2404.06280 null
2024-04-09 ActNetFormer: Transformer-ResNet Hybrid Method for Semi-Supervised Action Recognition in Videos Sharana Dharshikgan Suresh Dass et.al. 2404.06243 link
2024-04-08 Localizing Moments of Actions in Untrimmed Videos of Infants with Autism Spectrum Disorder Halil Ismail Helvaci et.al. 2404.05849 null
2024-04-09 TIM: A Time Interval Machine for Audio-Visual Action Recognition Jacob Chalk et.al. 2404.05559 link
2024-04-11 Test-Time Zero-Shot Temporal Action Localization Benedetta Liberatori et.al. 2404.05426 link
2024-04-09 SDFR: Synthetic Data for Face Recognition Competition Hatef Otroshi Shahreza et.al. 2404.04580 null
2024-04-05 PhysPT: Physics-aware Pretrained Transformer for Estimating Human Dynamics from Monocular Videos Yufei Zhang et.al. 2404.04430 null
2024-04-05 Koala: Key frame-conditioned long video-LLM Reuben Tan et.al. 2404.04346 null
2024-04-04 UniAV: Unified Audio-Visual Perception for Multi-Task Video Localization Tiantian Geng et.al. 2404.03179 null
2024-04-03 Optimizing the Deployment of Tiny Transformers on Low-Power MCUs Victor J. B. Jung et.al. 2404.02945 link
2024-04-03 Multi-Scale Spatial-Temporal Self-Attention Graph Convolutional Networks for Skeleton-based Action Recognition Ikuo Nakamura et.al. 2404.02624 null
2024-04-02 PREGO: online mistake detection in PRocedural EGOcentric videos Alessandro Flaborea et.al. 2404.01933 link
2024-04-02 Disentangled Pre-training for Human-Object Interaction Detection Zhuolong Li et.al. 2404.01725 link
2024-04-02 Language Model Guided Interpretable Video Action Reasoning Ning Wang et.al. 2404.01591 null
2024-04-02 Leveraging YOLO-World and GPT-4V LMMs for Zero-Shot Person Detection and Action Recognition in Drone Imagery Christian Limberg et.al. 2404.01571 null
2024-04-01 LoSA: Long-Short-range Adapter for Scaling End-to-End Temporal Action Localization Akshita Gupta et.al. 2404.01282 null
2024-03-31 LLMs are Good Action Recognizers Haoxuan Qu et.al. 2404.00532 null
2024-03-29 Latent Embedding Clustering for Occlusion Robust Head Pose Estimation José Celestino et.al. 2403.20251 null
2024-03-29 A Unified Framework for Human-centric Point Cloud Video Understanding Yiteng Xu et.al. 2403.20031 null
2024-03-28 Zero-shot Prompt-based Video Encoder for Surgical Gesture Recognition Mingxing Rao et.al. 2403.19786 link
2024-03-28 Hypergraph-based Multi-View Action Recognition using Event Cameras Yue Gao et.al. 2403.19316 null
2024-03-27 PLOT-TAL – Prompt Learning with Optimal Transport for Few-Shot Temporal Action Localization Edward Fish et.al. 2403.18915 null
2024-03-27 iFace: Hand-Over-Face Gesture Recognition Leveraging Impedance Sensing Mengxi Liu et.al. 2403.18433 null
2024-03-27 An Evolutionary Network Architecture Search Framework with Adaptive Multimodal Fusion for Hand Gesture Recognition Yizhang Xia et.al. 2403.18208 null
2024-03-26 OmniVid: A Generative Framework for Universal Video Understanding Junke Wang et.al. 2403.17935 link
2024-03-25 Understanding Long Videos in One Multimodal Language Model Pass Kanchana Ranasinghe et.al. 2403.16998 link
2024-03-25 Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects Zicong Fan et.al. 2403.16428 null
2024-03-24 Emotion Recognition from the perspective of Activity Recognition Savinay Nagendra et.al. 2403.16263 null
2024-03-22 InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding Yi Wang et.al. 2403.15377 link
2024-03-22 Gesture-Controlled Aerial Robot Formation for Human-Swarm Interaction in Safety Monitoring Applications Vít Krátký et.al. 2403.15333 null
2024-03-22 GCN-DevLSTM: Path Development for Skeleton-Based Action Recognition Lei Jiang et.al. 2403.15212 link
2024-03-21 Transfer Learning for Cross-dataset Isolated Sign Language Recognition in Under-Resourced Datasets Ahmet Alp Kindiroglu et.al. 2403.14534 link
2024-03-20 Hierarchical NeuroSymbolic Approach for Action Quality Assessment Lauren Okamoto et.al. 2403.13798 null
2024-03-19 Selective, Interpretable, and Motion Consistent Privacy Attribute Obfuscation for Action Recognition Filip Ilic et.al. 2403.12710 null
2024-03-19 ExACT: Language-guided Conceptual Reasoning and Uncertainty Estimation for Event-based Action Recognition and More Jiazhou Zhou et.al. 2403.12534 null
2024-03-19 VideoBadminton: A Video Dataset for Badminton Action Recognition Qi Li et.al. 2403.12385 null
2024-03-19 Multi-View Video-Based Learning: Leveraging Weak Labels for Frame-Level Perception Vijay John et.al. 2403.11616 null
2024-03-19 VIHE: Virtual In-Hand Eye Transformer for 3D Robotic Manipulation Weiyao Wang et.al. 2403.11461 null
2024-03-17 A Lie Group Approach to Riemannian Batch Normalization Ziheng Chen et.al. 2403.11261 link
2024-03-17 Boosting Semi-Supervised Temporal Action Localization by Learning from Non-Target Classes Kun Xia et.al. 2403.11189 null
2024-03-16 CoPlay: Audio-agnostic Cognitive Scaling for Acoustic Sensing Yin Li et.al. 2403.10796 null
2024-03-15 CrossGLG: LLM Guides One-shot Skeleton-based 3D Action Recognition in a Cross-level Manner Tingbing Yan et.al. 2403.10082 null
2024-03-15 Skeleton-Based Human Action Recognition with Noisy Labels Yi Xu et.al. 2403.09975 null
2024-03-14 On the Utility of 3D Hand Poses for Action Recognition Md Salman Shamil et.al. 2403.09805 null
2024-03-14 3D-VLA: A 3D Vision-Language-Action Generative World Model Haoyu Zhen et.al. 2403.09631 link
2024-03-14 SkateFormer: Skeletal-Temporal Transformer for Human Action Recognition Jeonghyeok Do et.al. 2403.09508 link
2024-03-14 EventRPG: Event Data Augmentation with Relevance Propagation Guidance Mingyuan Sun et.al. 2403.09274 link
2024-03-14 Leveraging Foundation Model Automatic Data Augmentation Strategies and Skeletal Points for Hands Action Recognition in Industrial Assembly Lines Liang Wu et.al. 2403.09056 null
2024-03-13 Low-Cost and Real-Time Industrial Human Action Recognitions Based on Large-Scale Foundation Models Wensheng Liang et.al. 2403.08420 null
2024-03-13 NaturalVLM: Leveraging Fine-grained Natural Language for Affordance-Guided Visual Manipulation Ran Xu et.al. 2403.08355 null
2024-03-13 ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation Guanxing Lu et.al. 2403.08321 link
2024-03-12 NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning Bingqian Lin et.al. 2403.07376 link
2024-03-12 BID: Boundary-Interior Decoding for Unsupervised Temporal Action Localization Pre-Training Qihang Fang et.al. 2403.07354 null
2024-03-11 Attention Prompt Tuning: Parameter-efficient Adaptation of Pre-trained Models for Spatiotemporal Modeling Wele Gedara Chaminda Bandara et.al. 2403.06978 link
2024-03-11 Deep Learning Approaches for Human Action Recognition in Video Data Yufei Xie et.al. 2403.06810 null
2024-03-11 Real-Time Multimodal Cognitive Assistant for Emergency Medical Services Keshara Weerasinghe et.al. 2403.06734 null
2024-03-11 Multimodal Transformers for Real-Time Surgical Activity Prediction Keshara Weerasinghe et.al. 2403.06705 link
2024-03-11 epsilon-Mesh Attack: A Surface-based Adversarial Point Cloud Attack for Facial Expression Recognition Batuhan Cengiz et.al. 2403.06661 null
2024-03-11 Density-Guided Label Smoothing for Temporal Localization of Driving Actions Tunc Alkanat et.al. 2403.06616 null
2024-03-11 Transformer-based Fusion of 2D-pose and Spatio-temporal Embeddings for Distracted Driver Action Recognition Erkut Akdag et.al. 2403.06577 null
2024-03-10 Coherent Temporal Synthesis for Incremental Action Segmentation Guodong Ding et.al. 2403.06102 null
2024-03-09 Dissecting Deep RL with High Update Ratios: Combatting Value Overestimation and Divergence Marcel Hussing et.al. 2403.05996 null
2024-03-08 Benchmarking Micro-action Recognition: Dataset, Methods, and Applications Dan Guo et.al. 2403.05234 link
2024-03-06 Video Relationship Detection Using Mixture of Experts Ala Shaabana et.al. 2403.03994 link
2024-03-05 Behavior Generation with Latent Actions Seungjae Lee et.al. 2403.03181 link
2024-03-05 Learning to Use Tools via Cooperative and Interactive Agents Zhengliang Shi et.al. 2403.03031 null
2024-03-04 Gesture recognition with Brownian reservoir computing using geometrically confined skyrmion dynamics Grischa Beneke et.al. 2403.01877 null
2024-03-04 A Simple Baseline for Efficient Hand Mesh Reconstruction Zhishan Zhou et.al. 2403.01813 null
2024-03-03 A Unified Model Selection Technique for Spectral Clustering Based Motion Segmentation Yuxiang Huang et.al. 2403.01606 null
2024-03-03 Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition Kun-Yu Lin et.al. 2403.01560 link
2024-03-02 Dynamic 3D Point Cloud Sequences as 2D Videos Yiming Zeng et.al. 2403.01129 null
2024-02-29 On the Design of Human-Robot Collaboration Gestures Anas Shrinah et.al. 2402.19058 null
2024-02-23 Multimodal Transformer With a Low-Computational-Cost Guarantee Sungjin Park et.al. 2402.15096 null
2024-02-17 Implementation of a Model of the Cortex Basal Ganglia Loop Naoya Arakawa et.al. 2402.13275 null
2024-02-20 Radar-Based Recognition of Static Hand Gestures in American Sign Language Christian Schuessler et.al. 2402.12800 null
2024-02-20 Learning Domain-Invariant Temporal Dynamics for Few-Shot Action Recognition Yuke Li et.al. 2402.12706 null
2024-02-19 Comprehensive Cognitive LLM Agent for Smartphone GUI Automation Xinbei Ma et.al. 2402.11941 link
2024-02-15 Hand Shape and Gesture Recognition using Multiscale Template Matching, Background Subtraction and Binary Image Analysis Ketan Suhaas Saichandran et.al. 2402.09663 null
2024-02-14 TikTokActions: A TikTok-Derived Video Dataset for Human Action Recognition Yang Qian et.al. 2402.08875 null
2024-02-13 BdSLW60: A Word-Level Bangla Sign Language Dataset Husne Ara Rubaiyeat et.al. 2402.08635 link
2024-02-13 Vision-Based Hand Gesture Customization from a Single Demonstration Soroush Shahi et.al. 2402.08420 null
2024-02-12 PBADet: A One-Stage Anchor-Free Approach for Part-Body Association Zhongpai Gao et.al. 2402.07814 null
