Reinforcement Learning - 2024-11

| Publish Date | Title | Authors | PDF | Translate | Read | Code |
| --- | --- | --- | --- | --- | --- | --- |
| 2024-11-29 | PDDLFuse: A Tool for Generating Diverse Planning Domains | Vedant Khandelwal et.al. | 2411.19886 | translate | read | null |
| 2024-11-29 | CAREL: Instruction-guided reinforcement learning with cross-modal auxiliary objectives | Armin Saghafian et.al. | 2411.19787 | translate | read | link |
| 2024-11-29 | HVAC-DPT: A Decision Pretrained Transformer for HVAC Control | Anaïs Berkes et.al. | 2411.19746 | translate | read | null |
| 2024-11-29 | Improving generalization of robot locomotion policies via Sharpness-Aware Reinforcement Learning | Severin Bochem et.al. | 2411.19732 | translate | read | null |
| 2024-11-29 | RMIO: A Model-Based MARL Framework for Scenarios with Observation Loss in Some Agents | Shi Zifeng et.al. | 2411.19639 | translate | read | null |
| 2024-11-29 | Build An Influential Bot In Social Media Simulations With Large Language Models | Bailu Jin et.al. | 2411.19635 | translate | read | null |
| 2024-11-29 | Adaptive dynamics of Ising spins in one dimension leveraging Reinforcement Learning | Anish Kumar et.al. | 2411.19602 | translate | read | null |
| 2024-11-29 | Solving Rubik’s Cube Without Tricky Sampling | Yicheng Lin et.al. | 2411.19583 | translate | read | null |
| 2024-11-29 | Training Agents with Weakly Supervised Feedback from Large Language Models | Dihong Gong et.al. | 2411.19547 | translate | read | null |
| 2024-11-29 | A Local Information Aggregation based Multi-Agent Reinforcement Learning for Robot Swarm Dynamic Task Allocation | Yang Lv et.al. | 2411.19526 | translate | read | null |
| 2024-11-27 | Robust Offline Reinforcement Learning with Linearly Structured $f$-Divergence Regularization | Cheng Tang et.al. | 2411.18612 | translate | read | null |
| 2024-11-27 | A Talent-infused Policy-gradient Approach to Efficient Co-Design of Morphology and Task Allocation Behavior of Multi-Robot Systems | Prajit KrisshnaKumar et.al. | 2411.18519 | translate | read | null |
| 2024-11-27 | G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation | Tianxing Chen et.al. | 2411.18369 | translate | read | null |
| 2024-11-27 | Two-Timescale Digital Twin Assisted Model Interference and Retraining over Wireless Network | Jiayi Cong et.al. | 2411.18329 | translate | read | null |
| 2024-11-27 | Application of Soft Actor-Critic Algorithms in Optimizing Wastewater Treatment with Time Delays Integration | Esmaeel Mohammadi et.al. | 2411.18305 | translate | read | null |
| 2024-11-27 | NeoHebbian Synapses to Accelerate Online Training of Neuromorphic Hardware | Shubham Pande et.al. | 2411.18272 | translate | read | null |
| 2024-11-27 | Dynamic Retail Pricing via Q-Learning – A Reinforcement Learning Framework for Enhanced Revenue Management | Mohit Apte et.al. | 2411.18261 | translate | read | null |
| 2024-11-27 | Dependency-Aware CAV Task Scheduling via Diffusion-Based Reinforcement Learning | Xiang Cheng et.al. | 2411.18230 | translate | read | null |
| 2024-11-27 | Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning | Di Zhang et.al. | 2411.18203 | translate | read | link |
| 2024-11-27 | Learning for Long-Horizon Planning via Neuro-Symbolic Abductive Imitation | Jie-Jing Shao et.al. | 2411.18201 | translate | read | link |
| 2024-11-26 | Multi-Objective Reinforcement Learning for Automated Resilient Cyber Defence | Ross O’Driscoll et.al. | 2411.17585 | translate | read | null |
| 2024-11-26 | Ensuring Safety in Target Pursuit Control: A CBF-Safe Reinforcement Learning Approach | Yaosheng Deng et.al. | 2411.17552 | translate | read | null |
| 2024-11-26 | IMPROVE: Improving Medical Plausibility without Reliance on HumanValidation – An Enhanced Prototype-Guided Diffusion Framework | Anurag Shandilya et.al. | 2411.17535 | translate | read | null |
| 2024-11-26 | Spatially Visual Perception for End-to-End Robotic Learning | Travis Davies et.al. | 2411.17458 | translate | read | null |
| 2024-11-26 | BPP-Search: Enhancing Tree of Thought Reasoning for Mathematical Modeling Problem Solving | Teng Wang et.al. | 2411.17404 | translate | read | null |
| 2024-11-26 | Joint Combinatorial Node Selection and Resource Allocations in the Lightning Network using Attention-based Reinforcement Learning | Mahdi Salahshour et.al. | 2411.17353 | translate | read | null |
| 2024-11-26 | SIL-RRT*: Learning Sampling Distribution through Self Imitation Learning | Xuzhe Dang et.al. | 2411.17293 | translate | read | null |
| 2024-11-26 | LHPF: Look back the History and Plan for the Future in Autonomous Driving | Sheng Wang et.al. | 2411.17253 | translate | read | null |
| 2024-11-26 | Self-reconfiguration Strategies for Space-distributed Spacecraft | Tianle Liu et.al. | 2411.17137 | translate | read | null |
| 2024-11-26 | LLM-Based Offline Learning for Embodied Agents via Consistency-Guided Reward Ensemble | Yujeong Lee et.al. | 2411.17135 | translate | read | null |
| 2024-11-25 | Self-Generated Critiques Boost Reward Modeling for Language Models | Yue Yu et.al. | 2411.16646 | translate | read | null |
| 2024-11-25 | Continual Deep Reinforcement Learning with Task-Agnostic Policy Distillation | Muhammad Burhan Hafez et.al. | 2411.16532 | translate | read | link |
| 2024-11-25 | Reinforcement Learning for Bidding Strategy Optimization in Day-Ahead Energy Market | Luca Di Persio et.al. | 2411.16519 | translate | read | null |
| 2024-11-25 | Unsupervised Event Outlier Detection in Continuous Time | Somjit Nath et.al. | 2411.16427 | translate | read | null |
| 2024-11-25 | CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning | Duo Wu et.al. | 2411.16313 | translate | read | null |
| 2024-11-25 | Probing for Consciousness in Machines | Mathis Immertreu et.al. | 2411.16262 | translate | read | null |
| 2024-11-25 | Multi-Robot Reliable Navigation in Uncertain Topological Environments with Graph Attention Networks | Zhuoyuan Yu et.al. | 2411.16134 | translate | read | null |
| 2024-11-25 | End-to-End Steering for Autonomous Vehicles via Conditional Imitation Co-Learning | Mahmoud M. Kishky et.al. | 2411.16131 | translate | read | null |
| 2024-11-25 | Why the Agent Made that Decision: Explaining Deep Reinforcement Learning with Vision Masks | Rui Zuo et.al. | 2411.16120 | translate | read | null |
| 2024-11-25 | M3: Mamba-assisted Multi-Circuit Optimization via MBRL with Effective Scheduling | Youngmin Oh et.al. | 2411.16019 | translate | read | null |
| 2024-11-22 | WildLMa: Long Horizon Loco-Manipulation in the Wild | Ri-Zhao Qiu et.al. | 2411.15131 | translate | read | null |
| 2024-11-22 | Learning-based Trajectory Tracking for Bird-inspired Flapping-Wing Robots | Jiaze Cai et.al. | 2411.15130 | translate | read | null |
| 2024-11-22 | TÜLU 3: Pushing Frontiers in Open Language Model Post-Training | Nathan Lambert et.al. | 2411.15124 | translate | read | link |
| 2024-11-22 | On Multi-Agent Inverse Reinforcement Learning | Till Freihaut et.al. | 2411.15046 | translate | read | null |
| 2024-11-22 | Safe Multi-Agent Reinforcement Learning with Convergence to Generalized Nash Equilibrium | Zeyang Li et.al. | 2411.15036 | translate | read | null |
| 2024-11-22 | On the Linear Speedup of Personalized Federated Reinforcement Learning with Shared Representations | Guojun Xiong et.al. | 2411.15014 | translate | read | null |
| 2024-11-22 | Free Energy Projective Simulation (FEPS): Active inference with interpretability | Joséphine Pazem et.al. | 2411.14991 | translate | read | null |
| 2024-11-22 | Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation | Huy Le et.al. | 2411.14913 | translate | read | null |
| 2024-11-22 | Segmenting Action-Value Functions Over Time-Scales in SARSA using TD($Δ$) | Mahammad Humayoo et.al. | 2411.14783 | translate | read | null |
| 2024-11-22 | Enhancing Molecular Design through Graph-based Topological Reinforcement Learning | Xiangyu Zhang et.al. | 2411.14726 | translate | read | null |
| 2024-11-21 | Multi-Agent Environments for Vehicle Routing Problems | Ricardo Gama et.al. | 2411.14411 | translate | read | null |
| 2024-11-21 | Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions | Yu Zhao et.al. | 2411.14405 | translate | read | link |
| 2024-11-21 | 23 DoF Grasping Policies from a Raw Point Cloud | Martin Matak et.al. | 2411.14400 | translate | read | null |
| 2024-11-21 | Model Checking for Reinforcement Learning in Autonomous Driving: One Can Do More Than You Think! | Rong Gu et.al. | 2411.14375 | translate | read | null |
| 2024-11-21 | Convex Approximation of Probabilistic Reachable Sets from Small Samples Using Self-supervised Neural Networks | Jun Xiang et.al. | 2411.14356 | translate | read | null |
| 2024-11-21 | Logarithmic Neyman Regret for Adaptive Estimation of the Average Treatment Effect | Ojash Neopane et.al. | 2411.14341 | translate | read | null |
| 2024-11-21 | Explainable Multi-Agent Reinforcement Learning for Extended Reality Codec Adaptation | Pedro Enrique Iturria-Rivera et.al. | 2411.14264 | translate | read | null |
| 2024-11-21 | Generalizing End-To-End Autonomous Driving In Real-World Environments Using Zero-Shot LLMs | Zeyu Dong et.al. | 2411.14256 | translate | read | null |
| 2024-11-21 | Natural Language Reinforcement Learning | Xidong Feng et.al. | 2411.14251 | translate | read | link |
| 2024-11-21 | Umbrella Reinforcement Learning – computationally efficient tool for hard non-linear problems | Egor E. Nuzhin et.al. | 2411.14117 | translate | read | null |
| 2024-11-20 | BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games | Davide Paglieri et.al. | 2411.13543 | translate | read | link |
| 2024-11-20 | Metacognition for Unknown Situations and Environments (MUSE) | Rodolfo Valiente et.al. | 2411.13537 | translate | read | null |
| 2024-11-20 | Robust Monocular Visual Odometry using Curriculum Learning | Assaf Lahiany et.al. | 2411.13438 | translate | read | null |
| 2024-11-20 | A Survey On Enhancing Reinforcement Learning in Complex Environments: Insights from Human and LLM Feedback | Alireza Rashidi Laleh et.al. | 2411.13410 | translate | read | null |
| 2024-11-20 | Fine-tuning Myoelectric Control through Reinforcement Learning in a Game Environment | Kilian Freitag et.al. | 2411.13327 | translate | read | null |
| 2024-11-20 | Backward Stochastic Control System with Entropy Regularization | Ziyue Chen et.al. | 2411.13219 | translate | read | null |
| 2024-11-20 | ViSTa Dataset: Do vision-language models understand sequential tasks? | Evžen Wybitul et.al. | 2411.13211 | translate | read | link |
| 2024-11-20 | Engagement-Driven Content Generation with Large Language Models | Erica Coppolillo et.al. | 2411.13187 | translate | read | null |
| 2024-11-20 | Learning Time-Optimal and Speed-Adjustable Tactile In-Hand Manipulation | Johannes Pitz et.al. | 2411.13148 | translate | read | null |
| 2024-11-20 | ReinFog: A DRL Empowered Framework for Resource Management in Edge and Cloud Computing Environments | Zhiyu Wang et.al. | 2411.13121 | translate | read | null |
| 2024-11-19 | ACING: Actor-Critic for Instruction Learning in Black-Box Large Language Models | Salma Kharrat et.al. | 2411.12736 | translate | read | link |
| 2024-11-19 | Reinforcement Learning, Collusion, and the Folk Theorem | Galit Askenazi-Golan et.al. | 2411.12725 | translate | read | null |
| 2024-11-19 | UBSoft: A Simulation Platform for Robotic Skill Learning in Unbounded Soft Environments | Chunru Lin et.al. | 2411.12711 | translate | read | null |
| 2024-11-19 | Instant Policy: In-Context Imitation Learning via Graph Diffusion | Vitalis Vosylius et.al. | 2411.12633 | translate | read | null |
| 2024-11-19 | Robotic transcatheter tricuspid valve replacement with hybrid enhanced intelligence: a new paradigm and first-in-vivo study | Shuangyi Wang et.al. | 2411.12478 | translate | read | null |
| 2024-11-19 | Variable-Frequency Imitation Learning for Variable-Speed Motion | Nozomu Masuya et.al. | 2411.12310 | translate | read | null |
| 2024-11-19 | Emergence of Implicit World Models from Mortal Agents | Kazuya Horibe et.al. | 2411.12304 | translate | read | null |
| 2024-11-19 | DT-RaDaR: Digital Twin Assisted Robot Navigation using Differential Ray-Tracing | Sunday Amatare et.al. | 2411.12284 | translate | read | null |
| 2024-11-19 | Error-Feedback Model for Output Correction in Bilateral Control-Based Imitation Learning | Hiroshi Sato et.al. | 2411.12255 | translate | read | null |
| 2024-11-19 | Efficient Training in Multi-Agent Reinforcement Learning: A Communication-Free Framework for the Box-Pushing Problem | David Ge et.al. | 2411.12246 | translate | read | null |
| 2024-11-18 | Design And Optimization Of Multi-rendezvous Manoeuvres Based On Reinforcement Learning And Convex Optimization | Antonio López Rivera et.al. | 2411.11778 | translate | read | null |
| 2024-11-18 | High-Speed Cornering Control and Real-Vehicle Deployment for Autonomous Electric Vehicles | Shiyue Zhao et.al. | 2411.11762 | translate | read | null |
| 2024-11-18 | Mapping out the Space of Human Feedback for Reinforcement Learning: A Conceptual Framework | Yannick Metz et.al. | 2411.11761 | translate | read | null |
| 2024-11-18 | Aligning Few-Step Diffusion Models with Dense Reward Difference Learning | Ziyi Zhang et.al. | 2411.11727 | translate | read | link |
| 2024-11-18 | Bitcoin Under Volatile Block Rewards: How Mempool Statistics Can Influence Bitcoin Mining | Roozbeh Sarenche et.al. | 2411.11702 | translate | read | null |
| 2024-11-18 | Robust Reinforcement Learning under Diffusion Models for Data with Jumps | Chenyang Jiang et.al. | 2411.11697 | translate | read | null |
| 2024-11-18 | Coevolution of Opinion Dynamics and Recommendation System: Modeling Analysis and Reinforcement Learning Based Manipulation | Yuhong Chen et.al. | 2411.11687 | translate | read | null |
| 2024-11-18 | No-regret Exploration in Shuffle Private Reinforcement Learning | Shaojie Bai et.al. | 2411.11647 | translate | read | null |
| 2024-11-18 | Signaling and Social Learning in Swarms of Robots | Leo Cazenille et.al. | 2411.11616 | translate | read | null |
| 2024-11-18 | A Pre-Trained Graph-Based Model for Adaptive Sequencing of Educational Documents | Jean Vassoyan et.al. | 2411.11520 | translate | read | null |
| 2024-11-15 | Mitigating Parameter Degeneracy using Joint Conditional Diffusion Model for WECC Composite Load Model in Power Systems | Feiqin Zhu et.al. | 2411.10431 | translate | read | null |
| 2024-11-15 | Continual Adversarial Reinforcement Learning (CARL) of False Data Injection detection: forgetting and explainability | Pooja Aslami et.al. | 2411.10367 | translate | read | null |
| 2024-11-15 | BMP: Bridging the Gap between B-Spline and Movement Primitives | Weiran Liao et.al. | 2411.10336 | translate | read | null |
| 2024-11-15 | Towards Sample-Efficiency and Generalization of Transfer and Inverse Reinforcement Learning: A Comprehensive Literature Review | Hossein Hassani et.al. | 2411.10268 | translate | read | null |
| 2024-11-15 | Learning Generalizable 3D Manipulation With 10 Demonstrations | Yu Ren et.al. | 2411.10203 | translate | read | null |
| 2024-11-15 | The Surprising Ineffectiveness of Pre-Trained Visual Representations for Model-Based Reinforcement Learning | Moritz Schneider et.al. | 2411.10175 | translate | read | null |
| 2024-11-15 | Imagine-2-Drive: High-Fidelity World Modeling in CARLA for Autonomous Vehicles | Anant Garg et.al. | 2411.10171 | translate | read | null |
| 2024-11-15 | Mitigating Sycophancy in Decoder-Only Transformer Architectures: Synthetic Data Intervention | Libo Wang et.al. | 2411.10156 | translate | read | link |
| 2024-11-15 | That Chip Has Sailed: A Critique of Unfounded Skepticism Around AI for Chip Design | Anna Goldie et.al. | 2411.10053 | translate | read | null |
| 2024-11-15 | Enforcing Cooperative Safety for Reinforcement Learning-based Mixed-Autonomy Platoon Control | Jingyuan Zhou et.al. | 2411.10031 | translate | read | null |
| 2024-11-14 | A Risk Sensitive Contract-unified Reinforcement Learning Approach for Option Hedging | Xianhua Peng et.al. | 2411.09659 | translate | read | null |
| 2024-11-14 | Motion Before Action: Diffusing Object Motion as Manipulation Condition | Yup Su et.al. | 2411.09658 | translate | read | null |
| 2024-11-14 | Tailoring interactions between active nematic defects with reinforcement learning | Carlos Floyd et.al. | 2411.09588 | translate | read | null |
| 2024-11-14 | Developement of Reinforcement Learning based Optimisation Method for Side-Sill Design | Aditya Borse et.al. | 2411.09499 | translate | read | null |
| 2024-11-14 | Approximated Variational Bayesian Inverse Reinforcement Learning for Large Language Model Alignment | Yuang Cai et.al. | 2411.09341 | translate | read | null |
| 2024-11-14 | Socio-Economic Consequences of Generative AI: A Review of Methodological Approaches | Carlos J. Costa et.al. | 2411.09313 | translate | read | null |
| 2024-11-14 | Enhancing reinforcement learning for population setpoint tracking in co-cultures | Sebastián Espinel-Ríos et.al. | 2411.09177 | translate | read | null |
| 2024-11-14 | Gazing at Rewards: Eye Movements as a Lens into Human and AI Decision-Making in Hybrid Visual Foraging | Bo Wang et.al. | 2411.09176 | translate | read | null |
| 2024-11-14 | Rationality based Innate-Values-driven Reinforcement Learning | Qin Yang et.al. | 2411.09160 | translate | read | null |
| 2024-11-14 | Secrecy Energy Efficiency Maximization in IRS-Assisted VLC MISO Networks with RSMA: A DS-PPO approach | Yangbo Guo et.al. | 2411.09146 | translate | read | null |
| 2024-11-13 | LLMStinger: Jailbreaking LLMs using RL fine-tuned LLMs | Piyush Jha et.al. | 2411.08862 | translate | read | null |
| 2024-11-13 | Goal-oriented Semantic Communication for Robot Arm Reconstruction in Digital Twin: Feature and Temporal Selections | Shutong Chen et.al. | 2411.08835 | translate | read | null |
| 2024-11-13 | Recommender systems and reinforcement learning for building control and occupant interaction: A text-mining driven review of scientific literature | Wenhao Zhang et.al. | 2411.08734 | translate | read | null |
| 2024-11-13 | Joint Model Caching and Resource Allocation in Generative AI-Enabled Wireless Edge Networks | Zhang Liu et.al. | 2411.08672 | translate | read | null |
| 2024-11-13 | Estimating unknown parameters in differential equations with a reinforcement learning based PSO method | Wenkui Sun et.al. | 2411.08651 | translate | read | null |
| 2024-11-13 | Towards Secure Intelligent O-RAN Architecture: Vulnerabilities, Threats and Promising Technical Solutions using LLMs | Mojdeh Karbalaee Motalleb et.al. | 2411.08640 | translate | read | null |
| 2024-11-13 | Robot See, Robot Do: Imitation Reward for Noisy Financial Environments | Sven Goluža et.al. | 2411.08637 | translate | read | null |
| 2024-11-13 | Precision-Focused Reinforcement Learning Model for Robotic Object Pushing | Lara Bergmann et.al. | 2411.08622 | translate | read | link |
| 2024-11-13 | Grammarization-Based Grasping with Deep Multi-Autoencoder Latent Space Exploration by Reinforcement Learning Agent | Leonidas Askianakis et.al. | 2411.08566 | translate | read | null |
| 2024-11-13 | Towards Practical Deep Schedulers for Allocating Cellular Radio Resources | Petteri Kela et.al. | 2411.08529 | translate | read | null |
| 2024-11-12 | Learning Memory Mechanisms for Decision Making through Demonstrations | William Yue et.al. | 2411.07954 | translate | read | link |
| 2024-11-12 | Doubly Mild Generalization for Offline Reinforcement Learning | Yixiu Mao et.al. | 2411.07934 | translate | read | link |
| 2024-11-12 | Scaling policy iteration based reinforcement learning for unknown discrete-time linear systems | Zhen Pang et.al. | 2411.07825 | translate | read | null |
| 2024-11-12 | Navigation with QPHIL: Quantizing Planner for Hierarchical Implicit Q-Learning | Alexi Canesse et.al. | 2411.07760 | translate | read | null |
| 2024-11-12 | Optimizing Traffic Signal Control using High-Dimensional State Representation and Efficient Deep Reinforcement Learning | Lawrence Francis et.al. | 2411.07759 | translate | read | null |
| 2024-11-12 | EMPERROR: A Flexible Generative Perception Error Model for Probing Self-Driving Planners | Niklas Hanselmann et.al. | 2411.07719 | translate | read | null |
| 2024-11-12 | Test Where Decisions Matter: Importance-driven Testing for Deep Reinforcement Learning | Stefan Pranger et.al. | 2411.07700 | translate | read | null |
| 2024-11-12 | Exploring Multi-Agent Reinforcement Learning for Unrelated Parallel Machine Scheduling | Maria Zampella et.al. | 2411.07634 | translate | read | null |
| 2024-11-12 | Direct Preference Optimization Using Sparse Feature-Level Constraints | Qingyu Yin et.al. | 2411.07618 | translate | read | null |
| 2024-11-12 | Entropy Controllable Direct Preference Optimization | Motoki Omura et.al. | 2411.07595 | translate | read | null |
| 2024-11-11 | ‘Explaining RL Decisions with Trajectories’: A Reproducibility Study | Karim Abdel Sadek et.al. | 2411.07200 | translate | read | link |
| 2024-11-11 | Joint Age-State Belief is All You Need: Minimizing AoII via Pull-Based Remote Estimation | Ismail Cosandal et.al. | 2411.07179 | translate | read | null |
| 2024-11-11 | Learning Multi-Agent Collaborative Manipulation for Long-Horizon Quadrupedal Pushing | Chuye Hong et.al. | 2411.07104 | translate | read | null |
| 2024-11-11 | A Multi-Agent Approach for REST API Testing with Semantic Graphs and LLM-Driven Inputs | Myeongsoo Kim et.al. | 2411.07098 | translate | read | null |
| 2024-11-11 | OCMDP: Observation-Constrained Markov Decision Process | Taiyi Wang et.al. | 2411.07087 | translate | read | null |
| 2024-11-11 | To Train or Not to Train: Balancing Efficiency and Training Cost in Deep Reinforcement Learning for Mobile Edge Computing | Maddalena Boscaro et.al. | 2411.07086 | translate | read | null |
| 2024-11-11 | Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching | Arnav Kumar Jain et.al. | 2411.07007 | translate | read | link |
| 2024-11-11 | Enhancing Robot Assistive Behaviour with Reinforcement Learning and Theory of Mind | Antonio Andriella et.al. | 2411.07003 | translate | read | link |
| 2024-11-11 | Imitation from Diverse Behaviors: Wasserstein Quality Diversity Imitation Learning with Single-Step Archive Exploration | Xingrui Yu et.al. | 2411.06965 | translate | read | null |
| 2024-11-11 | Streetwise Agents: Empowering Offline RL Policies to Outsmart Exogenous Stochastic Disturbances in RTC | Aditya Soni et.al. | 2411.06815 | translate | read | null |
| 2024-11-08 | Safe Reinforcement Learning of Robot Trajectories in the Presence of Moving Obstacles | Jonas Kiemel et.al. | 2411.05784 | translate | read | null |
| 2024-11-08 | Tract-RLFormer: A Tract-Specific RL policy based Decoder-only Transformer Network | Ankita Joshi et.al. | 2411.05757 | translate | read | null |
| 2024-11-08 | Topology-aware Reinforcement Feature Space Reconstruction for Graph Data | Wangyang Ying et.al. | 2411.05742 | translate | read | null |
| 2024-11-08 | Renewable Energy Powered and Open RAN-based Architecture for 5G Fixed Wireless Access Provisioning in Rural Areas | Anselme Ndikumana et.al. | 2411.05699 | translate | read | null |
| 2024-11-08 | Data-Driven Distributed Common Operational Picture from Heterogeneous Platforms using Multi-Agent Reinforcement Learning | Indranil Sur et.al. | 2411.05683 | translate | read | null |
| 2024-11-08 | Digital Twin Backed Closed-Loops for Energy-Aware and Open RAN-based Fixed Wireless Access Serving Rural Areas | Anselme Ndikumana et.al. | 2411.05664 | translate | read | null |
| 2024-11-08 | Acceleration for Deep Reinforcement Learning using Parallel and Distributed Computing: A Survey | Zhihong Liu et.al. | 2411.05614 | translate | read | null |
| 2024-11-08 | Smart navigation through a rotating barrier: Deep reinforcement learning with application to size-based separation of active microagents | Mohammad Hossein Masoudi et.al. | 2411.05587 | translate | read | null |
| 2024-11-08 | Tangled Program Graphs as an alternative to DRL-based control algorithms for UAVs | Hubert Szolc et.al. | 2411.05586 | translate | read | null |
| 2024-11-08 | Towards Active Flow Control Strategies Through Deep Reinforcement Learning | Ricard Montalà et.al. | 2411.05536 | translate | read | null |
| 2024-11-07 | Noisy Zero-Shot Coordination: Breaking The Common Knowledge Assumption In Zero-Shot Coordination Games | Usman Anwar et.al. | 2411.04976 | translate | read | link |
| 2024-11-07 | A Reinforcement Learning-Based Automatic Video Editing Method Using Pre-trained Vision-Language Model | Panwen Hu et.al. | 2411.04942 | translate | read | null |
| 2024-11-07 | Stem-OB: Generalizable Visual Imitation Learning with Stem-Like Convergent Observation through Diffusion Inversion | Kaizhe Hu et.al. | 2411.04919 | translate | read | link |
| 2024-11-07 | Evaluating Robustness of Reinforcement Learning Algorithms for Autonomous Shipping | Bavo Lesy et.al. | 2411.04915 | translate | read | null |
| 2024-11-07 | Think Smart, Act SMARL! Analyzing Probabilistic Logic Driven Safety in Multi-Agent Reinforcement Learning | Satchit Chatterji et.al. | 2411.04867 | translate | read | link |
| 2024-11-07 | Asymptotic regularity of a generalised stochastic Halpern scheme with applications | Nicholas Pischke et.al. | 2411.04845 | translate | read | null |
| 2024-11-07 | Plasticity Loss in Deep Reinforcement Learning: A Survey | Timo Klein et.al. | 2411.04832 | translate | read | null |
| 2024-11-07 | Harnessing the Power of Gradient-Based Simulations for Multi-Objective Optimization in Particle Accelerators | Kishansingh Rajput et.al. | 2411.04817 | translate | read | null |
| 2024-11-07 | AllGaits: Learning All Quadruped Gaits and Transitions | Guillaume Bellegarda et.al. | 2411.04787 | translate | read | null |
| 2024-11-07 | Navigating Trade-offs: Policy Summarization for Multi-Objective Reinforcement Learning | Zuzanna Osika et.al. | 2411.04784 | translate | read | link |
| 2024-11-06 | A Comparative Study of Deep Reinforcement Learning for Crop Production Management | Joseph Balderas et.al. | 2411.04106 | translate | read | null |
| 2024-11-06 | Interpretable and Efficient Data-driven Discovery and Control of Distributed Systems | Florian Wolf et.al. | 2411.04098 | translate | read | null |
| 2024-11-06 | Memorized action chunking with Transformers: Imitation learning for vision-based tissue surface scanning | Bochen Yang et.al. | 2411.04050 | translate | read | null |
| 2024-11-06 | Non-Stationary Learning of Neural Networks with Automatic Soft Parameter Reset | Alexandre Galashov et.al. | 2411.04034 | translate | read | null |
| 2024-11-06 | Predicting and Publishing Accurate Imbalance Prices Using Monte Carlo Tree Search | Fabio Pavirani et.al. | 2411.04011 | translate | read | null |
| 2024-11-06 | Object-Centric Dexterous Manipulation from Human Motion Data | Yuanpei Chen et.al. | 2411.04005 | translate | read | null |
| 2024-11-06 | ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy | Chenrui Tie et.al. | 2411.03990 | translate | read | null |
| 2024-11-06 | AdaSociety: An Adaptive Environment with Social Structures for Multi-Agent Decision-Making | Yizhe Huang et.al. | 2411.03865 | translate | read | link |
| 2024-11-06 | Beyond The Rainbow: High Performance Deep Reinforcement Learning On A Desktop PC | Tyler Clark et.al. | 2411.03820 | translate | read | null |
| 2024-11-06 | From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning | Zhirui Deng et.al. | 2411.03817 | translate | read | null |
| 2024-11-05 | Out-of-Distribution Recovery with Object-Centric Keypoint Inverse Policy For Visuomotor Imitation Learning | George Jiayuan Gao et.al. | 2411.03294 | translate | read | null |
| 2024-11-05 | Pre-trained Visual Dynamics Representations for Efficient Policy Learning | Hao Luo et.al. | 2411.03169 | translate | read | null |
| 2024-11-05 | Hierarchical Orchestra of Policies | Thomas P Cannon et.al. | 2411.03008 | translate | read | null |
| 2024-11-05 | Accelerating Task Generalisation with Multi-Level Hierarchical Options | Thomas P Cannon et.al. | 2411.02998 | translate | read | null |
| 2024-11-05 | Autonomous Decision Making for UAV Cooperative Pursuit-Evasion Game with Reinforcement Learning | Yang Zhao et.al. | 2411.02983 | translate | read | null |
| 2024-11-05 | Transformer-Based Fault-Tolerant Control for Fixed-Wing UAVs Using Knowledge Distillation and In-Context Adaptation | Francisco Giral et.al. | 2411.02975 | translate | read | null |
| 2024-11-05 | Embedding Safety into RL: A New Take on Trust Region Methods | Nikola Milosevic et.al. | 2411.02957 | translate | read | null |
| 2024-11-05 | The Unreasonable Effectiveness of LLMs for Query Optimization | Peter Akioyamen et.al. | 2411.02862 | translate | read | link |
| 2024-11-05 | ADOPT: Modified Adam Can Converge with Any $β_2$ with the Optimal Rate | Shohei Taniguchi et.al. | 2411.02853 | translate | read | link |
| 2024-11-05 | When to Localize? A Risk-Constrained Reinforcement Learning Approach | Chak Lam Shek et.al. | 2411.02788 | translate | read | null |
| 2024-11-04 | Simulation of Nanorobots with Artificial Intelligence and Reinforcement Learning for Advanced Cancer Cell Detection and Tracking | Shahab Kavousinejad et.al. | 2411.02345 | translate | read | link |
| 2024-11-04 | WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning | Zehan Qi et.al. | 2411.02337 | translate | read | null |
| 2024-11-04 | Targeted Manipulation and Deception Emerge when Optimizing LLMs for User Feedback | Marcus Williams et.al. | 2411.02306 | translate | read | link |
| 2024-11-04 | N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs | Ilya Zisman et.al. | 2411.01958 | translate | read | null |
| 2024-11-04 | RoboCrowd: Scaling Robot Data Collection through Crowdsourcing | Suvir Mirchandani et.al. | 2411.01915 | translate | read | null |
| 2024-11-04 | Efficient Active Imitation Learning with Random Network Distillation | Emilien Biré et.al. | 2411.01894 | translate | read | null |
| 2024-11-04 | Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback | Guan-Ting Lin et.al. | 2411.01834 | translate | read | null |
| 2024-11-04 | Risk-sensitive control as inference with Rényi divergence | Kaito Ito et.al. | 2411.01827 | translate | read | null |
| 2024-11-04 | IRS-Enhanced Secure Semantic Communication Networks: Cross-Layer and Context-Awared Resource Allocation | Lingyi Wang et.al. | 2411.01821 | translate | read | null |
| 2024-11-04 | So You Think You Can Scale Up Autonomous Robot Data Collection? | Suvir Mirchandani et.al. | 2411.01813 | translate | read | null |

(<a href=../Reinforcement_Learning.md>back to Reinforcement Learning</a>)