Action Recognition - 2025-06 | Paper Arxiv Daily

Publish Date	Title	Authors	arXiv ID	Code
2025-06-30	LineRetriever: Planning-Aware Observation Reduction for Web Agents	Imene Kerboua et al.	2507.00210	null
2025-06-30	Online Human Action Detection during Escorting	Siddhartha Mondal et al.	2506.23573	null
2025-06-29	DEL: Dense Event Localization for Multi-modal Audio-Visual Understanding	Mona Ahmadian et al.	2506.23196	null
2025-06-27	Frequency-Semantic Enhanced Variational Autoencoder for Zero-Shot Skeleton-based Action Recognition	Wenhan Wu et al.	2506.22179	null
2025-06-26	WorldVLA: Towards Autoregressive Action World Model	Jun Cen et al.	2506.21539	link
2025-06-26	EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception	Sanjoy Chowdhury et al.	2506.21080	null
2025-06-25	How do Foundation Models Compare to Skeleton-Based Approaches for Gesture Recognition in Human-Robot Interaction?	Stephanie Käs et al.	2506.20795	null
2025-06-25	CARMA: Context-Aware Situational Grounding of Human-Robot Group Interactions by Combining Vision-Language Models with Object and Action Recognition	Joerg Deigmoeller et al.	2506.20373	null
2025-06-25	Feature Hallucination for Self-supervised Action Recognition	Lei Wang et al.	2506.20342	null
2025-06-27	ReactEMG: Zero-Shot, Low-Latency Intent Detection via sEMG	Runsheng Wang et al.	2506.19815	null
2025-06-24	Self-Paced Collaborative and Adversarial Network for Unsupervised Domain Adaptation	Weichen Zhang et al.	2506.19267	null
2025-06-23	Including Semantic Information via Word Embeddings for Skeleton-based Action Recognition	Dustin Aganian et al.	2506.18721	null
2025-06-23	Improving Weakly Supervised Temporal Action Localization by Exploiting Multi-resolution Information in Temporal Domain	Rui Su et al.	2506.18261	null
2025-06-23	Robot Tactile Gesture Recognition Based on Full-body Modular E-skin	Shuo Jiang et al.	2506.18256	null
2025-06-22	Adapting Vision-Language Models for Evaluating World Models	Mariya Hendriksen et al.	2506.17967	null
2025-06-21	Domain Generalization using Action Sequences for Egocentric Action Recognition	Amirshayan Nasirimajd et al.	2506.17685	null
2025-06-20	Wi-Fi Sensing Tool Release: Gathering 802.11ax Channel State Information from a Commercial Wi-Fi Access Point	Zisheng Wang et al.	2506.16957	null
2025-06-20	Language-driven Description Generation and Common Sense Reasoning for Video Action Recognition	Xiaodan Hu et al.	2506.16701	null
2025-06-19	CLIP-MG: Guiding Semantic Attention with Skeletal Pose Features and RGB Data for Micro-Gesture Recognition on the iMiGUE Dataset	Santosh Patapati et al.	2506.16385	null
2025-06-18	Accessible Gesture-Driven Augmented Reality Interaction System	Yikan Wang et al.	2506.15189	null
2025-06-17	CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion	Jiahua Ma et al.	2506.14769	null
2025-06-16	Leveraging Vision-Language Pre-training for Human Activity Recognition in Still Images	Cristina Mahanta et al.	2506.13458	null
2025-06-16	Active Multimodal Distillation for Few-shot Action Recognition	Weijia Feng et al.	2506.13322	null
2025-06-16	Action Dubber: Timing Audible Actions via Inflectional Flow	Wenlong Wan et al.	2506.13320	null
2025-06-15	Towards Fine-Grained Emotion Understanding via Skeleton-Based Micro-Gesture Recognition	Hao Xu et al.	2506.12848	null
2025-06-13	Pose Matters: Evaluating Vision Transformers and CNNs for Human Action Recognition on Small COCO Subsets	MingZe Tang et al.	2506.11678	null
2025-06-12	GynSurg: A Comprehensive Gynecology Laparoscopic Surgery Dataset	Sahar Nasirihaghighi et al.	2506.11356	null
2025-06-12	WaveFormer: A Lightweight Transformer Model for sEMG-based Gesture Recognition	Yanlong Chen et al.	2506.11168	null
2025-06-11	SLRNet: A Real-Time LSTM-Based Sign Language Recognition System	Sharvari Kamble et al.	2506.11154	link
2025-06-10	Gender Fairness of Machine Learning Algorithms for Pain Detection	Dylan Green et al.	2506.11132	null
2025-06-12	Eye, Robot: Learning to Look to Act with a BC-RL Perception-Action Loop	Justin Kerr et al.	2506.10968	null
2025-06-11	HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios	Kunyu Peng et al.	2506.09650	link
2025-06-11	Time-Unified Diffusion Policy with Action Discrimination for Robotic Manipulation	Ye Niu et al.	2506.09422	null
2025-06-11	Synthetic Human Action Video Data Generation with Pose Transfer	Vaclav Knapp et al.	2506.09411	null
2025-06-11	An Effective End-to-End Solution for Multimodal Action Recognition	Songping Wang et al.	2506.09345	null
2025-06-10	Diver-Robot Communication Dataset for Underwater Hand Gesture Recognition	Igor Kvasić et al.	2506.08974	null
2025-06-09	BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models	Peiyan Li et al.	2506.07961	link
2025-06-08	AugmentGest: Can Random Data Cropping Augmentation Boost Gesture Recognition Performance?	Nada Aboudeshish et al.	2506.07216	null
2025-06-08	SAP-Bench: Benchmarking Multimodal Large Language Models in Surgical Action Planning	Mengya Xu et al.	2506.07196	null
2025-06-07	PhysLab: A Benchmark Dataset for Multi-Granularity Visual Parsing of Physics Experiments	Minghao Zou et al.	2506.06631	null
2025-06-06	Conversational Interfaces for Parametric Conceptual Architectural Design: Integrating Mixed Reality with LLM-driven Interaction	Ruochen Ji et al.	2506.06066	null
2025-06-06	DriveAction: A Benchmark for Exploring Human-like Driving Decisions in VLA Models	Yuhan Hao et al.	2506.05667	null
2025-06-05	Robustness Evaluation for Video Models with Reinforcement Learning	Ashwin Ramesh Babu et al.	2506.05431	null
2025-06-04	Video, How Do Your Tokens Merge?	Sam Pollard et al.	2506.03885	null
2025-06-04	Zero-Shot Temporal Interaction Localization for Egocentric Videos	Erhang Zhang et al.	2506.03662	link
2025-06-04	Heterogeneous Skeleton-Based Action Representation Learning	Hongsong Wang et al.	2506.03481	null
2025-06-04	Go Beyond Earth: Understanding Human Actions and Scenes in Microgravity Environments	Di Wen et al.	2506.02845	link
2025-06-03	Technical Report for Ego4D Long-Term Action Anticipation Challenge 2025	Qiaohui Chu et al.	2506.02550	null
2025-06-03	VS-Bench: Evaluating VLMs for Strategic Reasoning and Decision-Making in Multi-Agent Environments	Zelai Xu et al.	2506.02387	link
2025-06-03	Multi-level and Multi-modal Action Anticipation	Seulgi Kim et al.	2506.02382	null
2025-06-02	TransAct V2: Lifelong User Action Sequence Modeling on Pinterest Recommendation	Xue Xia et al.	2506.02267	null
2025-06-02	SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics	Mustafa Shukor et al.	2506.01844	link
2025-06-02	Efficient Egocentric Action Recognition with Multimodal Data	Marco Calzavara et al.	2506.01757	null
2025-06-02	EPFL-Smart-Kitchen-30: Densely annotated cooking dataset with 3D kinematics to challenge video and language models	Andy Bonnetto et al.	2506.01608	link
2025-06-02	Sheep Facial Pain Assessment Under Weighted Graph Neural Networks	Alam Noor et al.	2506.01468	null
2025-06-02	EgoBrain: Synergizing Minds and Eyes For Human Action Understanding	Nie Lin et al.	2506.01353	null