Action Recognition - 2024-10
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-10-31 | Technical Report for ActivityNet Challenge 2022 – Temporal Action Localization | Shimin Chen et.al. | 2411.00883 | translate | read | null |
| 2024-10-30 | A Simple and Effective Temporal Grounding Pipeline for Basketball Broadcast Footage | Levi Harris et.al. | 2411.00862 | translate | read | null |
| 2024-10-31 | Recovering Complete Actions for Cross-dataset Skeleton Action Recognition | Hanchao Liu et.al. | 2410.23641 | translate | read | null |
| 2024-10-30 | Keypoint Abstraction using Large Models for Object-Relative Imitation Learning | Xiaolin Fang et.al. | 2410.23254 | translate | read | null |
| 2024-10-30 | AtGCN: A Graph Convolutional Network For Ataxic Gait Detection | Karan Bania et.al. | 2410.22862 | translate | read | null |
| 2024-10-29 | ProMQA: Question Answering Dataset for Multimodal Procedural Activity Understanding | Kimihiro Hasegawa et.al. | 2410.22211 | translate | read | link |
| 2024-10-29 | Multi-Level Feature Distillation of Joint Teachers Trained on Distinct Image Datasets | Adrian Iordache et.al. | 2410.22184 | translate | read | link |
| 2024-10-28 | Enhancing Action Recognition by Leveraging the Hierarchical Structure of Actions and Textual Context | Manuel Benavent-Lledo et.al. | 2410.21275 | translate | read | link |
| 2024-10-28 | One-Step Diffusion Policy: Fast Visuomotor Policies via Diffusion Distillation | Zhendong Wang et.al. | 2410.21257 | translate | read | null |
| 2024-10-28 | Zero-Shot Action Recognition in Surveillance Videos | Joao Pereira et.al. | 2410.21113 | translate | read | null |
| 2024-10-28 | LiGAR: LiDAR-Guided Hierarchical Transformer for Multi-Modal Group Activity Recognition | Naga Venkata Sai Raviteja Chappa et.al. | 2410.21108 | translate | read | null |
| 2024-10-27 | Exocentric To Egocentric Transfer For Action Recognition: A Short Survey | Anirudh Thatipelli et.al. | 2410.20621 | translate | read | null |
| 2024-10-27 | Idempotent Unsupervised Representation Learning for Skeleton-Based Action Recognition | Lilang Lin et.al. | 2410.20349 | translate | read | null |
| 2024-10-28 | x-RAGE: eXtended Reality – Action & Gesture Events Dataset | Vivek Parmar et.al. | 2410.19486 | translate | read | null |
| 2024-10-24 | Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms | Zhangheng Li et.al. | 2410.18967 | translate | read | link |
| 2024-10-24 | Research on gesture recognition method based on SEDCNN-SVM | Mingjin Zhang et.al. | 2410.18557 | translate | read | null |
| 2024-10-23 | Unsupervised Domain Adaptation for Action Recognition via Self-Ensembling and Conditional Embedding Alignment | Indrajeet Ghosh et.al. | 2410.17489 | translate | read | link |
| 2024-10-22 | Are Visual-Language Models Effective in Action Recognition? A Comparative Study | Mahmoud Ali et.al. | 2410.17149 | translate | read | null |
| 2024-10-22 | Masked Differential Privacy | David Schneider et.al. | 2410.17098 | translate | read | null |
| 2024-10-22 | SpikMamba: When SNN meets Mamba in Event-based Human Action Recognition | Jiaqi Chen et.al. | 2410.16746 | translate | read | link |
| 2024-10-21 | Improving the Multi-label Atomic Activity Recognition by Robust Visual Feature and Advanced Attention @ ROAD++ Atomic Activity Recognition 2024 | Jiamin Cao et.al. | 2410.16037 | translate | read | null |
| 2024-10-19 | CAGE: Causal Attention Enables Data-Efficient Generalizable Robotic Manipulation | Shangning Xia et.al. | 2410.14974 | translate | read | null |
| 2024-10-18 | DFlow: Diverse Dialogue Flow Simulation with Large Language Models | Wanyu Du et.al. | 2410.14853 | translate | read | null |
| 2024-10-18 | Storyboard guided Alignment for Fine-grained Video Action Recognition | Enqi Liu et.al. | 2410.14238 | translate | read | null |
| 2024-10-17 | SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs | Yuling Gu et.al. | 2410.13648 | translate | read | null |
| 2024-10-16 | In-Context Learning Enables Robot Action Prediction in LLMs | Yida Yin et.al. | 2410.12782 | translate | read | null |
| 2024-10-14 | Continual Learning Improves Zero-Shot Action Recognition | Shreyank N Gowda et.al. | 2410.10497 | translate | read | null |
| 2024-10-16 | PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation | Kaidong Zhang et.al. | 2410.10394 | translate | read | null |
| 2024-10-13 | EITNet: An IoT-Enhanced Framework for Real-Time Basketball Action Recognition | Jingyu Liu et.al. | 2410.09954 | translate | read | null |
| 2024-10-13 | Multi class activity classification in videos using Motion History Image generation | Senthilkumar Gopal et.al. | 2410.09902 | translate | read | link |
| 2024-10-12 | Advanced Gesture Recognition in Autism: Integrating YOLOv7, Video Augmentation and VideoMAE for Video Analysis | Amit Kumar Singh et.al. | 2410.09339 | translate | read | null |
| 2024-10-11 | Aerial Vision-and-Language Navigation via Semantic-Topo-Metric Representation Guided LLM Reasoning | Yunpeng Gao et.al. | 2410.08500 | translate | read | null |
| 2024-10-10 | Human Stone Toolmaking Action Grammar (HSTAG): A Challenging Benchmark for Fine-grained Motor Behavior Recognition | Cheng Liu et.al. | 2410.08410 | translate | read | null |
| 2024-10-10 | Understanding Spatio-Temporal Relations in Human-Object Interaction using Pyramid Graph Convolutional Network | Hao Xing et.al. | 2410.07912 | translate | read | null |
| 2024-10-09 | CHASE: Learning Convex Hull Adaptive Shift for Skeleton-based Multi-Entity Action Recognition | Yuhang Wen et.al. | 2410.07153 | translate | read | link |
| 2024-10-09 | Fourier-based Action Recognition for Wildlife Behavior Quantification with Event Cameras | Friedhelm Hamann et.al. | 2410.06698 | translate | read | null |
| 2024-10-08 | GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation | Chi-Lam Cheang et.al. | 2410.06158 | translate | read | null |
| 2024-10-10 | ActionAtlas: A VideoQA Benchmark for Domain-specialized Action Recognition | Mohammadreza Salehi et.al. | 2410.05774 | translate | read | null |
| 2024-10-07 | Exploring Gestural Interaction with a Cushion Interface for Smart Home Control | Yuri Suzuki et.al. | 2410.04730 | translate | read | null |
| 2024-10-05 | TR-LLM: Integrating Trajectory Data for Scene-Aware LLM-Based Human Action Prediction | Kojiro Takeyama et.al. | 2410.03993 | translate | read | null |
| 2024-10-04 | Shadow Augmentation for Handwashing Action Recognition: from Synthetic to Real Datasets | Shengtai Ju et.al. | 2410.03984 | translate | read | null |
| 2024-10-04 | Action Selection Learning for Multi-label Multi-view Action Recognition | Trung Thanh Nguyen et.al. | 2410.03302 | translate | read | link |
| 2024-10-03 | DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects | Zhaowei Wang et.al. | 2410.02730 | translate | read | link |
| 2024-10-03 | An Evaluation of Large Pre-Trained Models for Gesture Recognition using Synthetic Videos | Arun Reddy et.al. | 2410.02152 | translate | read | null |
| 2024-10-02 | Language Supervised Human Action Recognition with Salient Fusion: Construction Worker Action Recognition as a Use Case | Mohammad Mahdavian et.al. | 2410.01962 | translate | read | null |
| 2024-10-02 | Sparse Covariance Neural Networks | Andrea Cavallo et.al. | 2410.01669 | translate | read | link |
| 2024-10-02 | Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy | Ricardo Garcia et.al. | 2410.01345 | translate | read | link |
| 2024-10-01 | Dynamic Planning for LLM-based Graphical User Interface Automation | Shaoqing Zhang et.al. | 2410.00467 | translate | read | link |