Reinforcement Learning - 2024-11
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-11-29 | PDDLFuse: A Tool for Generating Diverse Planning Domains | Vedant Khandelwal et.al. | 2411.19886 | translate | read | null |
| 2024-11-29 | CAREL: Instruction-guided reinforcement learning with cross-modal auxiliary objectives | Armin Saghafian et.al. | 2411.19787 | translate | read | link |
| 2024-11-29 | HVAC-DPT: A Decision Pretrained Transformer for HVAC Control | Anaïs Berkes et.al. | 2411.19746 | translate | read | null |
| 2024-11-29 | Improving generalization of robot locomotion policies via Sharpness-Aware Reinforcement Learning | Severin Bochem et.al. | 2411.19732 | translate | read | null |
| 2024-11-29 | RMIO: A Model-Based MARL Framework for Scenarios with Observation Loss in Some Agents | Shi Zifeng et.al. | 2411.19639 | translate | read | null |
| 2024-11-29 | Build An Influential Bot In Social Media Simulations With Large Language Models | Bailu Jin et.al. | 2411.19635 | translate | read | null |
| 2024-11-29 | Adaptive dynamics of Ising spins in one dimension leveraging Reinforcement Learning | Anish Kumar et.al. | 2411.19602 | translate | read | null |
| 2024-11-29 | Solving Rubik’s Cube Without Tricky Sampling | Yicheng Lin et.al. | 2411.19583 | translate | read | null |
| 2024-11-29 | Training Agents with Weakly Supervised Feedback from Large Language Models | Dihong Gong et.al. | 2411.19547 | translate | read | null |
| 2024-11-29 | A Local Information Aggregation based Multi-Agent Reinforcement Learning for Robot Swarm Dynamic Task Allocation | Yang Lv et.al. | 2411.19526 | translate | read | null |
| 2024-11-27 | Robust Offline Reinforcement Learning with Linearly Structured $f$-Divergence Regularization | Cheng Tang et.al. | 2411.18612 | translate | read | null |
| 2024-11-27 | A Talent-infused Policy-gradient Approach to Efficient Co-Design of Morphology and Task Allocation Behavior of Multi-Robot Systems | Prajit KrisshnaKumar et.al. | 2411.18519 | translate | read | null |
| 2024-11-27 | G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation | Tianxing Chen et.al. | 2411.18369 | translate | read | null |
| 2024-11-27 | Two-Timescale Digital Twin Assisted Model Interference and Retraining over Wireless Network | Jiayi Cong et.al. | 2411.18329 | translate | read | null |
| 2024-11-27 | Application of Soft Actor-Critic Algorithms in Optimizing Wastewater Treatment with Time Delays Integration | Esmaeel Mohammadi et.al. | 2411.18305 | translate | read | null |
| 2024-11-27 | NeoHebbian Synapses to Accelerate Online Training of Neuromorphic Hardware | Shubham Pande et.al. | 2411.18272 | translate | read | null |
| 2024-11-27 | Dynamic Retail Pricing via Q-Learning – A Reinforcement Learning Framework for Enhanced Revenue Management | Mohit Apte et.al. | 2411.18261 | translate | read | null |
| 2024-11-27 | Dependency-Aware CAV Task Scheduling via Diffusion-Based Reinforcement Learning | Xiang Cheng et.al. | 2411.18230 | translate | read | null |
| 2024-11-27 | Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning | Di Zhang et.al. | 2411.18203 | translate | read | link |
| 2024-11-27 | Learning for Long-Horizon Planning via Neuro-Symbolic Abductive Imitation | Jie-Jing Shao et.al. | 2411.18201 | translate | read | link |
| 2024-11-26 | Multi-Objective Reinforcement Learning for Automated Resilient Cyber Defence | Ross O’Driscoll et.al. | 2411.17585 | translate | read | null |
| 2024-11-26 | Ensuring Safety in Target Pursuit Control: A CBF-Safe Reinforcement Learning Approach | Yaosheng Deng et.al. | 2411.17552 | translate | read | null |
| 2024-11-26 | IMPROVE: Improving Medical Plausibility without Reliance on Human Validation – An Enhanced Prototype-Guided Diffusion Framework | Anurag Shandilya et.al. | 2411.17535 | translate | read | null |
| 2024-11-26 | Spatially Visual Perception for End-to-End Robotic Learning | Travis Davies et.al. | 2411.17458 | translate | read | null |
| 2024-11-26 | BPP-Search: Enhancing Tree of Thought Reasoning for Mathematical Modeling Problem Solving | Teng Wang et.al. | 2411.17404 | translate | read | null |
| 2024-11-26 | Joint Combinatorial Node Selection and Resource Allocations in the Lightning Network using Attention-based Reinforcement Learning | Mahdi Salahshour et.al. | 2411.17353 | translate | read | null |
| 2024-11-26 | SIL-RRT*: Learning Sampling Distribution through Self Imitation Learning | Xuzhe Dang et.al. | 2411.17293 | translate | read | null |
| 2024-11-26 | LHPF: Look back the History and Plan for the Future in Autonomous Driving | Sheng Wang et.al. | 2411.17253 | translate | read | null |
| 2024-11-26 | Self-reconfiguration Strategies for Space-distributed Spacecraft | Tianle Liu et.al. | 2411.17137 | translate | read | null |
| 2024-11-26 | LLM-Based Offline Learning for Embodied Agents via Consistency-Guided Reward Ensemble | Yujeong Lee et.al. | 2411.17135 | translate | read | null |
| 2024-11-25 | Self-Generated Critiques Boost Reward Modeling for Language Models | Yue Yu et.al. | 2411.16646 | translate | read | null |
| 2024-11-25 | Continual Deep Reinforcement Learning with Task-Agnostic Policy Distillation | Muhammad Burhan Hafez et.al. | 2411.16532 | translate | read | link |
| 2024-11-25 | Reinforcement Learning for Bidding Strategy Optimization in Day-Ahead Energy Market | Luca Di Persio et.al. | 2411.16519 | translate | read | null |
| 2024-11-25 | Unsupervised Event Outlier Detection in Continuous Time | Somjit Nath et.al. | 2411.16427 | translate | read | null |
| 2024-11-25 | CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning | Duo Wu et.al. | 2411.16313 | translate | read | null |
| 2024-11-25 | Probing for Consciousness in Machines | Mathis Immertreu et.al. | 2411.16262 | translate | read | null |
| 2024-11-25 | Multi-Robot Reliable Navigation in Uncertain Topological Environments with Graph Attention Networks | Zhuoyuan Yu et.al. | 2411.16134 | translate | read | null |
| 2024-11-25 | End-to-End Steering for Autonomous Vehicles via Conditional Imitation Co-Learning | Mahmoud M. Kishky et.al. | 2411.16131 | translate | read | null |
| 2024-11-25 | Why the Agent Made that Decision: Explaining Deep Reinforcement Learning with Vision Masks | Rui Zuo et.al. | 2411.16120 | translate | read | null |
| 2024-11-25 | M3: Mamba-assisted Multi-Circuit Optimization via MBRL with Effective Scheduling | Youngmin Oh et.al. | 2411.16019 | translate | read | null |
| 2024-11-22 | WildLMa: Long Horizon Loco-Manipulation in the Wild | Ri-Zhao Qiu et.al. | 2411.15131 | translate | read | null |
| 2024-11-22 | Learning-based Trajectory Tracking for Bird-inspired Flapping-Wing Robots | Jiaze Cai et.al. | 2411.15130 | translate | read | null |
| 2024-11-22 | TÜLU 3: Pushing Frontiers in Open Language Model Post-Training | Nathan Lambert et.al. | 2411.15124 | translate | read | link |
| 2024-11-22 | On Multi-Agent Inverse Reinforcement Learning | Till Freihaut et.al. | 2411.15046 | translate | read | null |
| 2024-11-22 | Safe Multi-Agent Reinforcement Learning with Convergence to Generalized Nash Equilibrium | Zeyang Li et.al. | 2411.15036 | translate | read | null |
| 2024-11-22 | On the Linear Speedup of Personalized Federated Reinforcement Learning with Shared Representations | Guojun Xiong et.al. | 2411.15014 | translate | read | null |
| 2024-11-22 | Free Energy Projective Simulation (FEPS): Active inference with interpretability | Joséphine Pazem et.al. | 2411.14991 | translate | read | null |
| 2024-11-22 | Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation | Huy Le et.al. | 2411.14913 | translate | read | null |
| 2024-11-22 | Segmenting Action-Value Functions Over Time-Scales in SARSA using TD($\Delta$) | Mahammad Humayoo et.al. | 2411.14783 | translate | read | null |
| 2024-11-22 | Enhancing Molecular Design through Graph-based Topological Reinforcement Learning | Xiangyu Zhang et.al. | 2411.14726 | translate | read | null |
| 2024-11-21 | Multi-Agent Environments for Vehicle Routing Problems | Ricardo Gama et.al. | 2411.14411 | translate | read | null |
| 2024-11-21 | Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions | Yu Zhao et.al. | 2411.14405 | translate | read | link |
| 2024-11-21 | 23 DoF Grasping Policies from a Raw Point Cloud | Martin Matak et.al. | 2411.14400 | translate | read | null |
| 2024-11-21 | Model Checking for Reinforcement Learning in Autonomous Driving: One Can Do More Than You Think! | Rong Gu et.al. | 2411.14375 | translate | read | null |
| 2024-11-21 | Convex Approximation of Probabilistic Reachable Sets from Small Samples Using Self-supervised Neural Networks | Jun Xiang et.al. | 2411.14356 | translate | read | null |
| 2024-11-21 | Logarithmic Neyman Regret for Adaptive Estimation of the Average Treatment Effect | Ojash Neopane et.al. | 2411.14341 | translate | read | null |
| 2024-11-21 | Explainable Multi-Agent Reinforcement Learning for Extended Reality Codec Adaptation | Pedro Enrique Iturria-Rivera et.al. | 2411.14264 | translate | read | null |
| 2024-11-21 | Generalizing End-To-End Autonomous Driving In Real-World Environments Using Zero-Shot LLMs | Zeyu Dong et.al. | 2411.14256 | translate | read | null |
| 2024-11-21 | Natural Language Reinforcement Learning | Xidong Feng et.al. | 2411.14251 | translate | read | link |
| 2024-11-21 | Umbrella Reinforcement Learning – computationally efficient tool for hard non-linear problems | Egor E. Nuzhin et.al. | 2411.14117 | translate | read | null |
| 2024-11-20 | BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games | Davide Paglieri et.al. | 2411.13543 | translate | read | link |
| 2024-11-20 | Metacognition for Unknown Situations and Environments (MUSE) | Rodolfo Valiente et.al. | 2411.13537 | translate | read | null |
| 2024-11-20 | Robust Monocular Visual Odometry using Curriculum Learning | Assaf Lahiany et.al. | 2411.13438 | translate | read | null |
| 2024-11-20 | A Survey On Enhancing Reinforcement Learning in Complex Environments: Insights from Human and LLM Feedback | Alireza Rashidi Laleh et.al. | 2411.13410 | translate | read | null |
| 2024-11-20 | Fine-tuning Myoelectric Control through Reinforcement Learning in a Game Environment | Kilian Freitag et.al. | 2411.13327 | translate | read | null |
| 2024-11-20 | Backward Stochastic Control System with Entropy Regularization | Ziyue Chen et.al. | 2411.13219 | translate | read | null |
| 2024-11-20 | ViSTa Dataset: Do vision-language models understand sequential tasks? | Evžen Wybitul et.al. | 2411.13211 | translate | read | link |
| 2024-11-20 | Engagement-Driven Content Generation with Large Language Models | Erica Coppolillo et.al. | 2411.13187 | translate | read | null |
| 2024-11-20 | Learning Time-Optimal and Speed-Adjustable Tactile In-Hand Manipulation | Johannes Pitz et.al. | 2411.13148 | translate | read | null |
| 2024-11-20 | ReinFog: A DRL Empowered Framework for Resource Management in Edge and Cloud Computing Environments | Zhiyu Wang et.al. | 2411.13121 | translate | read | null |
| 2024-11-19 | ACING: Actor-Critic for Instruction Learning in Black-Box Large Language Models | Salma Kharrat et.al. | 2411.12736 | translate | read | link |
| 2024-11-19 | Reinforcement Learning, Collusion, and the Folk Theorem | Galit Askenazi-Golan et.al. | 2411.12725 | translate | read | null |
| 2024-11-19 | UBSoft: A Simulation Platform for Robotic Skill Learning in Unbounded Soft Environments | Chunru Lin et.al. | 2411.12711 | translate | read | null |
| 2024-11-19 | Instant Policy: In-Context Imitation Learning via Graph Diffusion | Vitalis Vosylius et.al. | 2411.12633 | translate | read | null |
| 2024-11-19 | Robotic transcatheter tricuspid valve replacement with hybrid enhanced intelligence: a new paradigm and first-in-vivo study | Shuangyi Wang et.al. | 2411.12478 | translate | read | null |
| 2024-11-19 | Variable-Frequency Imitation Learning for Variable-Speed Motion | Nozomu Masuya et.al. | 2411.12310 | translate | read | null |
| 2024-11-19 | Emergence of Implicit World Models from Mortal Agents | Kazuya Horibe et.al. | 2411.12304 | translate | read | null |
| 2024-11-19 | DT-RaDaR: Digital Twin Assisted Robot Navigation using Differential Ray-Tracing | Sunday Amatare et.al. | 2411.12284 | translate | read | null |
| 2024-11-19 | Error-Feedback Model for Output Correction in Bilateral Control-Based Imitation Learning | Hiroshi Sato et.al. | 2411.12255 | translate | read | null |
| 2024-11-19 | Efficient Training in Multi-Agent Reinforcement Learning: A Communication-Free Framework for the Box-Pushing Problem | David Ge et.al. | 2411.12246 | translate | read | null |
| 2024-11-18 | Design And Optimization Of Multi-rendezvous Manoeuvres Based On Reinforcement Learning And Convex Optimization | Antonio López Rivera et.al. | 2411.11778 | translate | read | null |
| 2024-11-18 | High-Speed Cornering Control and Real-Vehicle Deployment for Autonomous Electric Vehicles | Shiyue Zhao et.al. | 2411.11762 | translate | read | null |
| 2024-11-18 | Mapping out the Space of Human Feedback for Reinforcement Learning: A Conceptual Framework | Yannick Metz et.al. | 2411.11761 | translate | read | null |
| 2024-11-18 | Aligning Few-Step Diffusion Models with Dense Reward Difference Learning | Ziyi Zhang et.al. | 2411.11727 | translate | read | link |
| 2024-11-18 | Bitcoin Under Volatile Block Rewards: How Mempool Statistics Can Influence Bitcoin Mining | Roozbeh Sarenche et.al. | 2411.11702 | translate | read | null |
| 2024-11-18 | Robust Reinforcement Learning under Diffusion Models for Data with Jumps | Chenyang Jiang et.al. | 2411.11697 | translate | read | null |
| 2024-11-18 | Coevolution of Opinion Dynamics and Recommendation System: Modeling Analysis and Reinforcement Learning Based Manipulation | Yuhong Chen et.al. | 2411.11687 | translate | read | null |
| 2024-11-18 | No-regret Exploration in Shuffle Private Reinforcement Learning | Shaojie Bai et.al. | 2411.11647 | translate | read | null |
| 2024-11-18 | Signaling and Social Learning in Swarms of Robots | Leo Cazenille et.al. | 2411.11616 | translate | read | null |
| 2024-11-18 | A Pre-Trained Graph-Based Model for Adaptive Sequencing of Educational Documents | Jean Vassoyan et.al. | 2411.11520 | translate | read | null |
| 2024-11-15 | Mitigating Parameter Degeneracy using Joint Conditional Diffusion Model for WECC Composite Load Model in Power Systems | Feiqin Zhu et.al. | 2411.10431 | translate | read | null |
| 2024-11-15 | Continual Adversarial Reinforcement Learning (CARL) of False Data Injection detection: forgetting and explainability | Pooja Aslami et.al. | 2411.10367 | translate | read | null |
| 2024-11-15 | BMP: Bridging the Gap between B-Spline and Movement Primitives | Weiran Liao et.al. | 2411.10336 | translate | read | null |
| 2024-11-15 | Towards Sample-Efficiency and Generalization of Transfer and Inverse Reinforcement Learning: A Comprehensive Literature Review | Hossein Hassani et.al. | 2411.10268 | translate | read | null |
| 2024-11-15 | Learning Generalizable 3D Manipulation With 10 Demonstrations | Yu Ren et.al. | 2411.10203 | translate | read | null |
| 2024-11-15 | The Surprising Ineffectiveness of Pre-Trained Visual Representations for Model-Based Reinforcement Learning | Moritz Schneider et.al. | 2411.10175 | translate | read | null |
| 2024-11-15 | Imagine-2-Drive: High-Fidelity World Modeling in CARLA for Autonomous Vehicles | Anant Garg et.al. | 2411.10171 | translate | read | null |
| 2024-11-15 | Mitigating Sycophancy in Decoder-Only Transformer Architectures: Synthetic Data Intervention | Libo Wang et.al. | 2411.10156 | translate | read | link |
| 2024-11-15 | That Chip Has Sailed: A Critique of Unfounded Skepticism Around AI for Chip Design | Anna Goldie et.al. | 2411.10053 | translate | read | null |
| 2024-11-15 | Enforcing Cooperative Safety for Reinforcement Learning-based Mixed-Autonomy Platoon Control | Jingyuan Zhou et.al. | 2411.10031 | translate | read | null |
| 2024-11-14 | A Risk Sensitive Contract-unified Reinforcement Learning Approach for Option Hedging | Xianhua Peng et.al. | 2411.09659 | translate | read | null |
| 2024-11-14 | Motion Before Action: Diffusing Object Motion as Manipulation Condition | Yup Su et.al. | 2411.09658 | translate | read | null |
| 2024-11-14 | Tailoring interactions between active nematic defects with reinforcement learning | Carlos Floyd et.al. | 2411.09588 | translate | read | null |
| 2024-11-14 | Developement of Reinforcement Learning based Optimisation Method for Side-Sill Design | Aditya Borse et.al. | 2411.09499 | translate | read | null |
| 2024-11-14 | Approximated Variational Bayesian Inverse Reinforcement Learning for Large Language Model Alignment | Yuang Cai et.al. | 2411.09341 | translate | read | null |
| 2024-11-14 | Socio-Economic Consequences of Generative AI: A Review of Methodological Approaches | Carlos J. Costa et.al. | 2411.09313 | translate | read | null |
| 2024-11-14 | Enhancing reinforcement learning for population setpoint tracking in co-cultures | Sebastián Espinel-Ríos et.al. | 2411.09177 | translate | read | null |
| 2024-11-14 | Gazing at Rewards: Eye Movements as a Lens into Human and AI Decision-Making in Hybrid Visual Foraging | Bo Wang et.al. | 2411.09176 | translate | read | null |
| 2024-11-14 | Rationality based Innate-Values-driven Reinforcement Learning | Qin Yang et.al. | 2411.09160 | translate | read | null |
| 2024-11-14 | Secrecy Energy Efficiency Maximization in IRS-Assisted VLC MISO Networks with RSMA: A DS-PPO approach | Yangbo Guo et.al. | 2411.09146 | translate | read | null |
| 2024-11-13 | LLMStinger: Jailbreaking LLMs using RL fine-tuned LLMs | Piyush Jha et.al. | 2411.08862 | translate | read | null |
| 2024-11-13 | Goal-oriented Semantic Communication for Robot Arm Reconstruction in Digital Twin: Feature and Temporal Selections | Shutong Chen et.al. | 2411.08835 | translate | read | null |
| 2024-11-13 | Recommender systems and reinforcement learning for building control and occupant interaction: A text-mining driven review of scientific literature | Wenhao Zhang et.al. | 2411.08734 | translate | read | null |
| 2024-11-13 | Joint Model Caching and Resource Allocation in Generative AI-Enabled Wireless Edge Networks | Zhang Liu et.al. | 2411.08672 | translate | read | null |
| 2024-11-13 | Estimating unknown parameters in differential equations with a reinforcement learning based PSO method | Wenkui Sun et.al. | 2411.08651 | translate | read | null |
| 2024-11-13 | Towards Secure Intelligent O-RAN Architecture: Vulnerabilities, Threats and Promising Technical Solutions using LLMs | Mojdeh Karbalaee Motalleb et.al. | 2411.08640 | translate | read | null |
| 2024-11-13 | Robot See, Robot Do: Imitation Reward for Noisy Financial Environments | Sven Goluža et.al. | 2411.08637 | translate | read | null |
| 2024-11-13 | Precision-Focused Reinforcement Learning Model for Robotic Object Pushing | Lara Bergmann et.al. | 2411.08622 | translate | read | link |
| 2024-11-13 | Grammarization-Based Grasping with Deep Multi-Autoencoder Latent Space Exploration by Reinforcement Learning Agent | Leonidas Askianakis et.al. | 2411.08566 | translate | read | null |
| 2024-11-13 | Towards Practical Deep Schedulers for Allocating Cellular Radio Resources | Petteri Kela et.al. | 2411.08529 | translate | read | null |
| 2024-11-12 | Learning Memory Mechanisms for Decision Making through Demonstrations | William Yue et.al. | 2411.07954 | translate | read | link |
| 2024-11-12 | Doubly Mild Generalization for Offline Reinforcement Learning | Yixiu Mao et.al. | 2411.07934 | translate | read | link |
| 2024-11-12 | Scaling policy iteration based reinforcement learning for unknown discrete-time linear systems | Zhen Pang et.al. | 2411.07825 | translate | read | null |
| 2024-11-12 | Navigation with QPHIL: Quantizing Planner for Hierarchical Implicit Q-Learning | Alexi Canesse et.al. | 2411.07760 | translate | read | null |
| 2024-11-12 | Optimizing Traffic Signal Control using High-Dimensional State Representation and Efficient Deep Reinforcement Learning | Lawrence Francis et.al. | 2411.07759 | translate | read | null |
| 2024-11-12 | EMPERROR: A Flexible Generative Perception Error Model for Probing Self-Driving Planners | Niklas Hanselmann et.al. | 2411.07719 | translate | read | null |
| 2024-11-12 | Test Where Decisions Matter: Importance-driven Testing for Deep Reinforcement Learning | Stefan Pranger et.al. | 2411.07700 | translate | read | null |
| 2024-11-12 | Exploring Multi-Agent Reinforcement Learning for Unrelated Parallel Machine Scheduling | Maria Zampella et.al. | 2411.07634 | translate | read | null |
| 2024-11-12 | Direct Preference Optimization Using Sparse Feature-Level Constraints | Qingyu Yin et.al. | 2411.07618 | translate | read | null |
| 2024-11-12 | Entropy Controllable Direct Preference Optimization | Motoki Omura et.al. | 2411.07595 | translate | read | null |
| 2024-11-11 | ‘Explaining RL Decisions with Trajectories’: A Reproducibility Study | Karim Abdel Sadek et.al. | 2411.07200 | translate | read | link |
| 2024-11-11 | Joint Age-State Belief is All You Need: Minimizing AoII via Pull-Based Remote Estimation | Ismail Cosandal et.al. | 2411.07179 | translate | read | null |
| 2024-11-11 | Learning Multi-Agent Collaborative Manipulation for Long-Horizon Quadrupedal Pushing | Chuye Hong et.al. | 2411.07104 | translate | read | null |
| 2024-11-11 | A Multi-Agent Approach for REST API Testing with Semantic Graphs and LLM-Driven Inputs | Myeongsoo Kim et.al. | 2411.07098 | translate | read | null |
| 2024-11-11 | OCMDP: Observation-Constrained Markov Decision Process | Taiyi Wang et.al. | 2411.07087 | translate | read | null |
| 2024-11-11 | To Train or Not to Train: Balancing Efficiency and Training Cost in Deep Reinforcement Learning for Mobile Edge Computing | Maddalena Boscaro et.al. | 2411.07086 | translate | read | null |
| 2024-11-11 | Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching | Arnav Kumar Jain et.al. | 2411.07007 | translate | read | link |
| 2024-11-11 | Enhancing Robot Assistive Behaviour with Reinforcement Learning and Theory of Mind | Antonio Andriella et.al. | 2411.07003 | translate | read | link |
| 2024-11-11 | Imitation from Diverse Behaviors: Wasserstein Quality Diversity Imitation Learning with Single-Step Archive Exploration | Xingrui Yu et.al. | 2411.06965 | translate | read | null |
| 2024-11-11 | Streetwise Agents: Empowering Offline RL Policies to Outsmart Exogenous Stochastic Disturbances in RTC | Aditya Soni et.al. | 2411.06815 | translate | read | null |
| 2024-11-08 | Safe Reinforcement Learning of Robot Trajectories in the Presence of Moving Obstacles | Jonas Kiemel et.al. | 2411.05784 | translate | read | null |
| 2024-11-08 | Tract-RLFormer: A Tract-Specific RL policy based Decoder-only Transformer Network | Ankita Joshi et.al. | 2411.05757 | translate | read | null |
| 2024-11-08 | Topology-aware Reinforcement Feature Space Reconstruction for Graph Data | Wangyang Ying et.al. | 2411.05742 | translate | read | null |
| 2024-11-08 | Renewable Energy Powered and Open RAN-based Architecture for 5G Fixed Wireless Access Provisioning in Rural Areas | Anselme Ndikumana et.al. | 2411.05699 | translate | read | null |
| 2024-11-08 | Data-Driven Distributed Common Operational Picture from Heterogeneous Platforms using Multi-Agent Reinforcement Learning | Indranil Sur et.al. | 2411.05683 | translate | read | null |
| 2024-11-08 | Digital Twin Backed Closed-Loops for Energy-Aware and Open RAN-based Fixed Wireless Access Serving Rural Areas | Anselme Ndikumana et.al. | 2411.05664 | translate | read | null |
| 2024-11-08 | Acceleration for Deep Reinforcement Learning using Parallel and Distributed Computing: A Survey | Zhihong Liu et.al. | 2411.05614 | translate | read | null |
| 2024-11-08 | Smart navigation through a rotating barrier: Deep reinforcement learning with application to size-based separation of active microagents | Mohammad Hossein Masoudi et.al. | 2411.05587 | translate | read | null |
| 2024-11-08 | Tangled Program Graphs as an alternative to DRL-based control algorithms for UAVs | Hubert Szolc et.al. | 2411.05586 | translate | read | null |
| 2024-11-08 | Towards Active Flow Control Strategies Through Deep Reinforcement Learning | Ricard Montalà et.al. | 2411.05536 | translate | read | null |
| 2024-11-07 | Noisy Zero-Shot Coordination: Breaking The Common Knowledge Assumption In Zero-Shot Coordination Games | Usman Anwar et.al. | 2411.04976 | translate | read | link |
| 2024-11-07 | A Reinforcement Learning-Based Automatic Video Editing Method Using Pre-trained Vision-Language Model | Panwen Hu et.al. | 2411.04942 | translate | read | null |
| 2024-11-07 | Stem-OB: Generalizable Visual Imitation Learning with Stem-Like Convergent Observation through Diffusion Inversion | Kaizhe Hu et.al. | 2411.04919 | translate | read | link |
| 2024-11-07 | Evaluating Robustness of Reinforcement Learning Algorithms for Autonomous Shipping | Bavo Lesy et.al. | 2411.04915 | translate | read | null |
| 2024-11-07 | Think Smart, Act SMARL! Analyzing Probabilistic Logic Driven Safety in Multi-Agent Reinforcement Learning | Satchit Chatterji et.al. | 2411.04867 | translate | read | link |
| 2024-11-07 | Asymptotic regularity of a generalised stochastic Halpern scheme with applications | Nicholas Pischke et.al. | 2411.04845 | translate | read | null |
| 2024-11-07 | Plasticity Loss in Deep Reinforcement Learning: A Survey | Timo Klein et.al. | 2411.04832 | translate | read | null |
| 2024-11-07 | Harnessing the Power of Gradient-Based Simulations for Multi-Objective Optimization in Particle Accelerators | Kishansingh Rajput et.al. | 2411.04817 | translate | read | null |
| 2024-11-07 | AllGaits: Learning All Quadruped Gaits and Transitions | Guillaume Bellegarda et.al. | 2411.04787 | translate | read | null |
| 2024-11-07 | Navigating Trade-offs: Policy Summarization for Multi-Objective Reinforcement Learning | Zuzanna Osika et.al. | 2411.04784 | translate | read | link |
| 2024-11-06 | A Comparative Study of Deep Reinforcement Learning for Crop Production Management | Joseph Balderas et.al. | 2411.04106 | translate | read | null |
| 2024-11-06 | Interpretable and Efficient Data-driven Discovery and Control of Distributed Systems | Florian Wolf et.al. | 2411.04098 | translate | read | null |
| 2024-11-06 | Memorized action chunking with Transformers: Imitation learning for vision-based tissue surface scanning | Bochen Yang et.al. | 2411.04050 | translate | read | null |
| 2024-11-06 | Non-Stationary Learning of Neural Networks with Automatic Soft Parameter Reset | Alexandre Galashov et.al. | 2411.04034 | translate | read | null |
| 2024-11-06 | Predicting and Publishing Accurate Imbalance Prices Using Monte Carlo Tree Search | Fabio Pavirani et.al. | 2411.04011 | translate | read | null |
| 2024-11-06 | Object-Centric Dexterous Manipulation from Human Motion Data | Yuanpei Chen et.al. | 2411.04005 | translate | read | null |
| 2024-11-06 | ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy | Chenrui Tie et.al. | 2411.03990 | translate | read | null |
| 2024-11-06 | AdaSociety: An Adaptive Environment with Social Structures for Multi-Agent Decision-Making | Yizhe Huang et.al. | 2411.03865 | translate | read | link |
| 2024-11-06 | Beyond The Rainbow: High Performance Deep Reinforcement Learning On A Desktop PC | Tyler Clark et.al. | 2411.03820 | translate | read | null |
| 2024-11-06 | From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning | Zhirui Deng et.al. | 2411.03817 | translate | read | null |
| 2024-11-05 | Out-of-Distribution Recovery with Object-Centric Keypoint Inverse Policy For Visuomotor Imitation Learning | George Jiayuan Gao et.al. | 2411.03294 | translate | read | null |
| 2024-11-05 | Pre-trained Visual Dynamics Representations for Efficient Policy Learning | Hao Luo et.al. | 2411.03169 | translate | read | null |
| 2024-11-05 | Hierarchical Orchestra of Policies | Thomas P Cannon et.al. | 2411.03008 | translate | read | null |
| 2024-11-05 | Accelerating Task Generalisation with Multi-Level Hierarchical Options | Thomas P Cannon et.al. | 2411.02998 | translate | read | null |
| 2024-11-05 | Autonomous Decision Making for UAV Cooperative Pursuit-Evasion Game with Reinforcement Learning | Yang Zhao et.al. | 2411.02983 | translate | read | null |
| 2024-11-05 | Transformer-Based Fault-Tolerant Control for Fixed-Wing UAVs Using Knowledge Distillation and In-Context Adaptation | Francisco Giral et.al. | 2411.02975 | translate | read | null |
| 2024-11-05 | Embedding Safety into RL: A New Take on Trust Region Methods | Nikola Milosevic et.al. | 2411.02957 | translate | read | null |
| 2024-11-05 | The Unreasonable Effectiveness of LLMs for Query Optimization | Peter Akioyamen et.al. | 2411.02862 | translate | read | link |
| 2024-11-05 | ADOPT: Modified Adam Can Converge with Any $β_2$ with the Optimal Rate | Shohei Taniguchi et.al. | 2411.02853 | translate | read | link |
| 2024-11-05 | When to Localize? A Risk-Constrained Reinforcement Learning Approach | Chak Lam Shek et.al. | 2411.02788 | translate | read | null |
| 2024-11-04 | Simulation of Nanorobots with Artificial Intelligence and Reinforcement Learning for Advanced Cancer Cell Detection and Tracking | Shahab Kavousinejad et.al. | 2411.02345 | translate | read | link |
| 2024-11-04 | WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning | Zehan Qi et.al. | 2411.02337 | translate | read | null |
| 2024-11-04 | Targeted Manipulation and Deception Emerge when Optimizing LLMs for User Feedback | Marcus Williams et.al. | 2411.02306 | translate | read | link |
| 2024-11-04 | N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs | Ilya Zisman et.al. | 2411.01958 | translate | read | null |
| 2024-11-04 | RoboCrowd: Scaling Robot Data Collection through Crowdsourcing | Suvir Mirchandani et.al. | 2411.01915 | translate | read | null |
| 2024-11-04 | Efficient Active Imitation Learning with Random Network Distillation | Emilien Biré et.al. | 2411.01894 | translate | read | null |
| 2024-11-04 | Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback | Guan-Ting Lin et.al. | 2411.01834 | translate | read | null |
| 2024-11-04 | Risk-sensitive control as inference with Rényi divergence | Kaito Ito et.al. | 2411.01827 | translate | read | null |
| 2024-11-04 | IRS-Enhanced Secure Semantic Communication Networks: Cross-Layer and Context-Awared Resource Allocation | Lingyi Wang et.al. | 2411.01821 | translate | read | null |
| 2024-11-04 | So You Think You Can Scale Up Autonomous Robot Data Collection? | Suvir Mirchandani et.al. | 2411.01813 | translate | read | null |