Reinforcement Learning - 2024-07
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-07-31 | CREW: Facilitating Human-AI Teaming Research | Lingyu Zhang et.al. | 2408.00170 | translate | read | null |
| 2024-07-31 | Formal Ethical Obligations in Reinforcement Learning Agents: Verification and Policy Updates | Colin Shea-Blymyer et.al. | 2408.00147 | translate | read | null |
| 2024-07-31 | Adaptive Transit Signal Priority based on Deep Reinforcement Learning and Connected Vehicles in a Traffic Microsimulation Environment | Dickness Kwesiga et.al. | 2408.00098 | translate | read | null |
| 2024-07-31 | Berkeley Humanoid: A Research Platform for Learning-based Control | Qiayuan Liao et.al. | 2407.21781 | translate | read | null |
| 2024-07-31 | Human-Machine Co-Adaptation for Robot-Assisted Rehabilitation via Dual-Agent Multiple Model Reinforcement Learning (DAMMRL) | Yang An et.al. | 2407.21734 | translate | read | null |
| 2024-07-31 | Multi-agent reinforcement learning for the control of three-dimensional Rayleigh-Bénard convection | Joel Vasanth et.al. | 2407.21565 | translate | read | null |
| 2024-07-31 | Black box meta-learning intrinsic rewards for sparse-reward environments | Octavio Pappalardo et.al. | 2407.21546 | translate | read | null |
| 2024-07-31 | Multi-agent Assessment with QoS Enhancement for HD Map Updates in a Vehicular Network | Jeffrey Redondo et.al. | 2407.21460 | translate | read | null |
| 2024-07-31 | ProSpec RL: Plan Ahead, then Execute | Liangliang Liu et.al. | 2407.21359 | translate | read | null |
| 2024-07-31 | Image-Based Deep Reinforcement Learning with Intrinsically Motivated Stimuli: On the Execution of Complex Robotic Tasks | David Valencia et.al. | 2407.21338 | translate | read | null |
| 2024-07-31 | Tractable and Provably Efficient Distributional Reinforcement Learning with General Value Function Approximation | Taehyun Cho et.al. | 2407.21260 | translate | read | null |
| 2024-07-30 | VITAL: Visual Teleoperation to Enhance Robot Learning through Human-in-the-Loop Corrections | Hamidreza Kasaei et.al. | 2407.21244 | translate | read | null |
| 2024-07-30 | Learning Stable Robot Grasping with Transformer-based Tactile Control Policies | En Yen Puang et.al. | 2407.21172 | translate | read | link |
| 2024-07-30 | Securing Proof of Stake Blockchains: Leveraging Multi-Agent Reinforcement Learning for Detecting and Mitigating Malicious Nodes | Faisal Haque Bappy et.al. | 2407.20983 | translate | read | null |
| 2024-07-30 | How to Choose a Reinforcement-Learning Algorithm | Fabian Bongratz et.al. | 2407.20917 | translate | read | null |
| 2024-07-30 | ARCLE: The Abstraction and Reasoning Corpus Learning Environment for Reinforcement Learning | Hosung Lee et.al. | 2407.20806 | translate | read | link |
| 2024-07-30 | Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning | Norman Di Palo et.al. | 2407.20798 | translate | read | link |
| 2024-07-30 | Architectural Influence on Variational Quantum Circuits in Multi-Agent Reinforcement Learning: Evolutionary Strategies for Optimization | Michael Kölle et.al. | 2407.20739 | translate | read | null |
| 2024-07-30 | Online Prediction-Assisted Safe Reinforcement Learning for Electric Vehicle Charging Station Recommendation in Dynamically Coupled Transportation-Power Systems | Qionghua Liao et.al. | 2407.20679 | translate | read | null |
| 2024-07-30 | Towards Generalizable Reinforcement Learning via Causality-Guided Self-Adaptive Representations | Yupei Yang et.al. | 2407.20651 | translate | read | null |
| 2024-07-30 | Wireless Multi-User Interactive Virtual Reality in Metaverse with Edge-Device Collaborative Computing | Caolu Xu et.al. | 2407.20523 | translate | read | null |
| 2024-07-30 | Boosting Efficiency in Task-Agnostic Exploration through Causal Knowledge | Yupei Yang et.al. | 2407.20506 | translate | read | link |
| 2024-07-29 | A Method for Fast Autonomy Transfer in Reinforcement Learning | Dinuka Sahabandu et.al. | 2407.20466 | translate | read | null |
| 2024-07-29 | SAPG: Split and Aggregate Policy Gradients | Jayesh Singla et.al. | 2407.20230 | translate | read | null |
| 2024-07-29 | Privileged Reinforcement and Communication Learning for Distributed, Bandwidth-limited Multi-robot Exploration | Yixiao Ma et.al. | 2407.20203 | translate | read | null |
| 2024-07-29 | Language-Conditioned Offline RL for Multi-Robot Navigation | Steven Morad et.al. | 2407.20164 | translate | read | null |
| 2024-07-29 | Quantum Machine Learning Architecture Search via Deep Reinforcement Learning | Xin Dai et.al. | 2407.20147 | translate | read | null |
| 2024-07-29 | Diffusion-DICE: In-Sample Diffusion Guidance for Offline Reinforcement Learning | Liyuan Mao et.al. | 2407.20109 | translate | read | null |
| 2024-07-29 | Counterfactual rewards promote collective transport using individually controlled swarm microrobots | Veit-Lorenz Heuthe et.al. | 2407.20041 | translate | read | null |
| 2024-07-29 | Collision Probability Distribution Estimation via Temporal Difference Learning | Thomas Steinecker et.al. | 2407.20000 | translate | read | link |
| 2024-07-29 | Integrated Communications and Security: RIS-Assisted Simultaneous Transmission and Generation of Secret Keys | Ning Gao et.al. | 2407.19960 | translate | read | null |
| 2024-07-29 | A Differential Dynamic Programming Framework for Inverse Reinforcement Learning | Kun Cao et.al. | 2407.19902 | translate | read | null |
| 2024-07-29 | Imitation Learning for Intra-Day Power Grid Operation through Topology Actions | Matthijs de Jong et.al. | 2407.19865 | translate | read | null |
| 2024-07-26 | SOAP-RL: Sequential Option Advantage Propagation for Reinforcement Learning in POMDP Environments | Shu Ishida et.al. | 2407.18913 | translate | read | null |
| 2024-07-26 | Lessons from Learning to Spin “Pens” | Jun Wang et.al. | 2407.18902 | translate | read | null |
| 2024-07-26 | SHANGUS: Deep Reinforcement Learning Meets Heuristic Optimization for Speedy Frontier-Based Exploration of Autonomous Vehicles in Unknown Spaces | Seunghyeop Nam et.al. | 2407.18892 | translate | read | null |
| 2024-07-26 | An Accelerated Multi-level Monte Carlo Approach for Average Reward Reinforcement Learning with General Policy Parametrization | Swetha Ganesh et.al. | 2407.18878 | translate | read | null |
| 2024-07-26 | QT-TDM: Planning with Transformer Dynamics Model and Autoregressive Q-Learning | Mostafa Kotb et.al. | 2407.18841 | translate | read | null |
| 2024-07-26 | The Cross-environment Hyperparameter Setting Benchmark for Reinforcement Learning | Andrew Patterson et.al. | 2407.18840 | translate | read | null |
| 2024-07-26 | Learning a Shape-Conditioned Agent for Purely Tactile In-Hand Manipulation of Various Objects | Johannes Pitz et.al. | 2407.18834 | translate | read | null |
| 2024-07-26 | Online Planning in POMDPs with State-Requests | Raphael Avalos et.al. | 2407.18812 | translate | read | null |
| 2024-07-26 | Tuning the kinetics of intracellular transport | Ardra Suchitran et.al. | 2407.18784 | translate | read | null |
| 2024-07-26 | A Deep Reinforcement Learning Approach to Wavefront Control for Exoplanet Imaging | Yann Gutierrez et.al. | 2407.18733 | translate | read | null |
| 2024-07-25 | Recursive Introspection: Teaching Language Model Agents How to Self-Improve | Yuxiao Qu et.al. | 2407.18219 | translate | read | null |
| 2024-07-25 | Differentiable Quantum Architecture Search in Asynchronous Quantum Reinforcement Learning | Samuel Yen-Chi Chen et.al. | 2407.18202 | translate | read | null |
| 2024-07-25 | Maximum Entropy On-Policy Actor-Critic via Entropy Advantage Estimation | Jean Seong Bjorn Choe et.al. | 2407.18143 | translate | read | null |
| 2024-07-25 | MapTune: Advancing ASIC Technology Mapping via Reinforcement Learning Guided Library Tuning | Mingju Liu et.al. | 2407.18110 | translate | read | link |
| 2024-07-25 | Principal-Agent Reinforcement Learning | Dima Ivanov et.al. | 2407.18074 | translate | read | null |
| 2024-07-25 | Multi-Agent Deep Reinforcement Learning for Resilience Optimization in 5G RAN | Soumeya Kaada et.al. | 2407.18066 | translate | read | null |
| 2024-07-25 | Personalized and Context-aware Route Planning for Edge-assisted Vehicles | Dinesh Cyril Selvaraj et.al. | 2407.17980 | translate | read | null |
| 2024-07-25 | Optimal Hessian/Jacobian-Free Nonconvex-PL Bilevel Optimization | Feihu Huang et.al. | 2407.17823 | translate | read | null |
| 2024-07-25 | Advanced deep-reinforcement-learning methods for flow control: group-invariant and positional-encoding networks improve learning speed and quality | Joogoo Jeon et.al. | 2407.17822 | translate | read | null |
| 2024-07-25 | Preliminary Results of Neuromorphic Controller Design and a Parkinson’s Disease Dataset Building for Closed-Loop Deep Brain Stimulation | Ananna Biswas et.al. | 2407.17756 | translate | read | null |
| 2024-07-24 | Traversing Pareto Optimal Policies: Provably Efficient Multi-Objective Reinforcement Learning | Shuang Qiu et.al. | 2407.17466 | translate | read | null |
| 2024-07-24 | Toward human-centered shared autonomy AI paradigms for human-robot teaming in healthcare | Reza Abiri et.al. | 2407.17464 | translate | read | null |
| 2024-07-24 | SoNIC: Safe Social Navigation with Adaptive Conformal Inference and Constrained Reinforcement Learning | Jianpeng Yao et.al. | 2407.17460 | translate | read | null |
| 2024-07-24 | Joint Transmit and Jamming Power Optimization for Secrecy in Energy Harvesting Networks: A Reinforcement Learning Approach | Shalini Tripathi et.al. | 2407.17435 | translate | read | null |
| 2024-07-24 | Market Making with Exogenous Competition | Robert Boyce et.al. | 2407.17393 | translate | read | null |
| 2024-07-24 | MoveLight: Enhancing Traffic Signal Control through Movement-Centric Deep Reinforcement Learning | Junqi Shao et.al. | 2407.17303 | translate | read | null |
| 2024-07-24 | Pretrained Visual Representations in Reinforcement Learning | Emlyn Williams et.al. | 2407.17238 | translate | read | null |
| 2024-07-24 | Sublinear Regret for An Actor-Critic Algorithm in Continuous-Time Linear-Quadratic Reinforcement Learning | Yilie Huang et.al. | 2407.17226 | translate | read | null |
| 2024-07-24 | Take a Step and Reconsider: Sequence Decoding for Self-Improved Neural Combinatorial Optimization | Jonathan Pirnay et.al. | 2407.17206 | translate | read | link |
| 2024-07-24 | Path Following and Stabilisation of a Bicycle Model using a Reinforcement Learning Approach | Sebastian Weyrer et.al. | 2407.17156 | translate | read | null |
| 2024-07-23 | A Simulation Benchmark for Autonomous Racing with Large-Scale Human Data | Adrian Remonda et.al. | 2407.16680 | translate | read | link |
| 2024-07-23 | From Imitation to Refinement – Residual RL for Precise Visual Assembly | Lars Ankile et.al. | 2407.16677 | translate | read | null |
| 2024-07-23 | Efficient Discovery of Actual Causality using Abstraction-Refinement | Arshia Rafieioskouei et.al. | 2407.16629 | translate | read | null |
| 2024-07-23 | Functional Acceleration for Policy Mirror Descent | Veronica Chelu et.al. | 2407.16602 | translate | read | null |
| 2024-07-23 | Real-Time Interactions Between Human Controllers and Remote Devices in Metaverse | Kan Chen et.al. | 2407.16591 | translate | read | null |
| 2024-07-23 | TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback | Eunseop Yoon et.al. | 2407.16574 | translate | read | null |
| 2024-07-23 | Cross Anything: General Quadruped Robot Navigation through Complex Terrains | Shaoting Zhu et.al. | 2407.16412 | translate | read | null |
| 2024-07-23 | Evaluating Uncertainties in Electricity Markets via Machine Learning and Quantum Computing | Shuyang Zhu et.al. | 2407.16404 | translate | read | null |
| 2024-07-23 | Reinforcement Learning-based Adaptive Mitigation of Uncorrected DRAM Errors in the Field | Isaac Boixaderas et.al. | 2407.16377 | translate | read | null |
| 2024-07-23 | Arbitrary quantum states preparation aided by deep reinforcement learning | Zhao-Wei Wang et.al. | 2407.16368 | translate | read | null |
| 2024-07-22 | WayEx: Waypoint Exploration using a Single Demonstration | Mara Levy et.al. | 2407.15849 | translate | read | null |
| 2024-07-23 | QueST: Self-Supervised Skill Abstractions for Learning Continuous Control | Atharva Mete et.al. | 2407.15840 | translate | read | null |
| 2024-07-22 | Importance Sampling-Guided Meta-Training for Intelligent Agents in Highly Interactive Environments | Mansur Arief et.al. | 2407.15839 | translate | read | null |
| 2024-07-22 | On shallow planning under partial observability | Randy Lefebvre et.al. | 2407.15820 | translate | read | null |
| 2024-07-22 | Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning | Zhecheng Yuan et.al. | 2407.15815 | translate | read | null |
| 2024-07-22 | Concept-Based Interpretable Reinforcement Learning with Limited to No Human Labels | Zhuorui Ye et.al. | 2407.15786 | translate | read | null |
| 2024-07-22 | Diffusion Model Based Resource Allocation Strategy in Ultra-Reliable Wireless Networked Control Systems | Amirhassan Babazadeh Darabi et.al. | 2407.15784 | translate | read | null |
| 2024-07-22 | How to Shrink Confidence Sets for Many Equivalent Discrete Distributions? | Odalric-Ambrym Maillard et.al. | 2407.15662 | translate | read | null |
| 2024-07-22 | Evaluation of Reinforcement Learning for Autonomous Penetration Testing using A3C, Q-learning and DQN | Norman Becker et.al. | 2407.15656 | translate | read | null |
| 2024-07-22 | Reinforcement Learning Meets Visual Odometry | Nico Messikommer et.al. | 2407.15626 | translate | read | null |
| 2024-07-19 | Catastrophic Goodhart: regularizing RLHF with KL divergence does not mitigate heavy-tailed reward misspecification | Thomas Kwa et.al. | 2407.14503 | translate | read | null |
| 2024-07-19 | Explainable Post hoc Portfolio Management Financial Policy of a Deep Reinforcement Learning agent | Alejandra de la Rica Escudero et.al. | 2407.14486 | translate | read | link |
| 2024-07-19 | Data-Centric Human Preference Optimization with Rationales | Hoang Anh Just et.al. | 2407.14477 | translate | read | null |
| 2024-07-19 | FuzzTheREST: An Intelligent Automated Black-box RESTful API Fuzzer | Tiago Dias et.al. | 2407.14361 | translate | read | null |
| 2024-07-19 | Hyperparameter Optimization for Driving Strategies Based on Reinforcement Learning | Nihal Acharya Adde et.al. | 2407.14262 | translate | read | null |
| 2024-07-19 | On Policy Evaluation Algorithms in Distributional Reinforcement Learning | Julian Gerstenberg et.al. | 2407.14175 | translate | read | null |
| 2024-07-19 | A Comparative Study of Deep Reinforcement Learning Models: DQN vs PPO vs A2C | Neil De La Fuente et.al. | 2407.14151 | translate | read | link |
| 2024-07-19 | Track-MDP: Reinforcement Learning for Target Tracking with Controlled Sensing | Adarsh M. Subramaniam et.al. | 2407.13995 | translate | read | null |
| 2024-07-19 | The Effect of Training Schedules on Morphological Robustness and Generalization | Edoardo Barba et.al. | 2407.13965 | translate | read | link |
| 2024-07-18 | Event-Triggered Reinforcement Learning Based Joint Resource Allocation for Ultra-Reliable Low-Latency V2X Communications | Nasir Khan et.al. | 2407.13947 | translate | read | null |
| 2024-07-18 | Random Latent Exploration for Deep Reinforcement Learning | Srinath Mahankali et.al. | 2407.13755 | translate | read | null |
| 2024-07-18 | Optimistic Q-learning for average reward and episodic reinforcement learning | Priyank Agrawal et.al. | 2407.13743 | translate | read | null |
| 2024-07-18 | Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review | Masatoshi Uehara et.al. | 2407.13734 | translate | read | null |
| 2024-07-18 | A Comprehensive Review of Recommender Systems: Transitioning from Theory to Practice | Shaina Raza et.al. | 2407.13699 | translate | read | null |
| 2024-07-18 | Misspecified $Q$-Learning with Sparse Linear Function Approximation: Tight Bounds on Approximation Error | Ally Yalei Du et.al. | 2407.13622 | translate | read | null |
| 2024-07-18 | Hyp2Nav: Hyperbolic Planning and Curiosity for Crowd Navigation | Alessandro Flaborea et.al. | 2407.13567 | translate | read | null |
| 2024-07-18 | Model-based Policy Optimization using Symbolic World Model | Andrey Gorodetskiy et.al. | 2407.13518 | translate | read | null |
| 2024-07-18 | Instance Selection for Dynamic Algorithm Configuration with Reinforcement Learning: Improving Generalization | Carolin Benjamins et.al. | 2407.13513 | translate | read | null |
| 2024-07-18 | LIMT: Language-Informed Multi-Task Visual World Models | Elie Aljalbout et.al. | 2407.13466 | translate | read | null |
| 2024-07-18 | The Art of Imitation: Learning Long-Horizon Manipulation Tasks from Few Demonstrations | Jan Ole von Hartz et.al. | 2407.13432 | translate | read | null |
| 2024-07-17 | Navigating the Smog: A Cooperative Multi-Agent RL for Accurate Air Pollution Mapping through Data Assimilation | Ichrak Mokhtari et.al. | 2407.12539 | translate | read | null |
| 2024-07-17 | Towards Collaborative Intelligence: Propagating Intentions and Reasoning for Multi-Agent Coordination with Large Language Models | Xihe Qiu et.al. | 2407.12532 | translate | read | null |
| 2024-07-17 | Subequivariant Reinforcement Learning in 3D Multi-Entity Physical Environments | Runfa Chen et.al. | 2407.12505 | translate | read | null |
| 2024-07-17 | Estimating Reaction Barriers with Deep Reinforcement Learning | Adittya Pal et.al. | 2407.12453 | translate | read | null |
| 2024-07-17 | Energy-Guided Diffusion Sampling for Offline-to-Online Reinforcement Learning | Xu-Hui Liu et.al. | 2407.12448 | translate | read | link |
| 2024-07-17 | Variable-Agnostic Causal Exploration for Reinforcement Learning | Minh Hoang Nguyen et.al. | 2407.12437 | translate | read | null |
| 2024-07-17 | Flow Matching Imitation Learning for Multi-Support Manipulation | Quentin Rouxel et.al. | 2407.12381 | translate | read | null |
| 2024-07-17 | A foundation model approach to guide antimicrobial peptide design in the era of artificial intelligence driven scientific discovery | Jike Wang et.al. | 2407.12296 | translate | read | null |
| 2024-07-17 | Chip Placement with Diffusion | Vint Lee et.al. | 2407.12282 | translate | read | null |
| 2024-07-17 | Individualized Federated Learning for Traffic Prediction with Error Driven Aggregation | Hang Chen et.al. | 2407.12226 | translate | read | link |
| 2024-07-16 | Why long model-based rollouts are no reason for bad Q-value estimates | Philipp Wissmann et.al. | 2407.11751 | translate | read | null |
| 2024-07-16 | Pareto local search for a multi-objective demand response problem in residential areas with heat pumps and electric vehicles | Thomas Dengiz et.al. | 2407.11719 | translate | read | null |
| 2024-07-16 | A Comparative Analysis of Interactive Reinforcement Learning Algorithms in Warehouse Robot Grid Based Environment | Arunabh Bora et.al. | 2407.11671 | translate | read | null |
| 2024-07-16 | Exciting Action: Investigating Efficient Exploration for Learning Musculoskeletal Humanoid Locomotion | Henri-Jacques Geiß et.al. | 2407.11658 | translate | read | null |
| 2024-07-16 | Building Resilience in Wireless Communication Systems With a Secret-Key Budget | Karl-Ludwig Besser et.al. | 2407.11604 | translate | read | null |
| 2024-07-16 | Learning to Imitate Spatial Organization in Multi-robot Systems | Ayomide O. Agunloye et.al. | 2407.11592 | translate | read | null |
| 2024-07-16 | Green Resource Allocation in Cloud-Native O-RAN Enabled Small Cell Networks | Rana M. Sohaib et.al. | 2407.11563 | translate | read | null |
| 2024-07-16 | RobotKeyframing: Learning Locomotion with High-Level Objectives via Mixture of Dense and Sparse Rewards | Fatemeh Zargarbashi et.al. | 2407.11562 | translate | read | null |
| 2024-07-16 | Imitation learning with artificial neural networks for demand response with a heuristic control approach for heat pumps | Thomas Dengiz et.al. | 2407.11561 | translate | read | null |
| 2024-07-16 | DRL-based Joint Resource Scheduling of eMBB and URLLC in O-RAN | Rana M. Sohaib et.al. | 2407.11558 | translate | read | null |
| 2024-07-15 | Walking the Values in Bayesian Inverse Reinforcement Learning | Ondrej Bajgar et.al. | 2407.10971 | translate | read | null |
| 2024-07-15 | BECAUSE: Bilinear Causal Representation for Generalizable Offline Model-based Reinforcement Learning | Haohong Lin et.al. | 2407.10967 | translate | read | null |
| 2024-07-15 | Hedging Beyond the Mean: A Distributional Reinforcement Learning Perspective for Hedging Portfolios with Structured Products | Anil Sharma et.al. | 2407.10903 | translate | read | null |
| 2024-07-15 | Offline Reinforcement Learning with Imputed Rewards | Carlo Romeo et.al. | 2407.10839 | translate | read | null |
| 2024-07-15 | Exploration in Knowledge Transfer Utilizing Reinforcement Learning | Adam Jedlička et.al. | 2407.10835 | translate | read | null |
| 2024-07-15 | GuideLight: “Industrial Solution” Guidance for More Practical Traffic Signal Control Agents | Haoyuan Jiang et.al. | 2407.10811 | translate | read | null |
| 2024-07-15 | DINO Pre-training for Vision-based End-to-end Autonomous Driving | Shubham Juneja et.al. | 2407.10803 | translate | read | null |
| 2024-07-15 | Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning | Alessandro Montenegro et.al. | 2407.10775 | translate | read | null |
| 2024-07-16 | Back to Newton’s Laws: Learning Vision-based Agile Flight via Differentiable Physics | Yuang Zhang et.al. | 2407.10648 | translate | read | null |
| 2024-07-15 | Balancing the Scales: Reinforcement Learning for Fair Classification | Leon Eshuijs et.al. | 2407.10629 | translate | read | null |
| 2024-07-12 | Learning Coordinated Maneuver in Adversarial Environments | Zechen Hu et.al. | 2407.09469 | translate | read | null |
| 2024-07-12 | ASTPrompter: Weakly Supervised Automated Language Model Red-Teaming to Identify Likely Toxic Prompts | Amelia F. Hardy et.al. | 2407.09447 | translate | read | null |
| 2024-07-12 | A Benchmark Environment for Offline Reinforcement Learning in Racing Games | Girolamo Macaluso et.al. | 2407.09415 | translate | read | link |
| 2024-07-12 | Instruction Following with Goal-Conditioned Reinforcement Learning in Virtual Environments | Zoya Volovikova et.al. | 2407.09287 | translate | read | null |
| 2024-07-12 | GNN with Model-based RL for Multi-agent Systems | Hanxiao Chen et.al. | 2407.09249 | translate | read | null |
| 2024-07-12 | Constrained Intrinsic Motivation for Reinforcement Learning | Xiang Zheng et.al. | 2407.09247 | translate | read | null |
| 2024-07-12 | Decentralized multi-agent reinforcement learning algorithm using a cluster-synchronized laser network | Shun Kotoku et.al. | 2407.09124 | translate | read | null |
| 2024-07-12 | New Desiderata for Direct Preference Optimization | Xiangkun Hu et.al. | 2407.09072 | translate | read | null |
| 2024-07-12 | Aligning Diffusion Behaviors with Q-functions for Efficient Continuous Control | Huayu Chen et.al. | 2407.09024 | translate | read | null |
| 2024-07-12 | Communication-Aware Reinforcement Learning for Cooperative Adaptive Cruise Control | Sicong Jiang et.al. | 2407.08964 | translate | read | null |
| 2024-07-11 | MetaUrban: A Simulation Platform for Embodied AI in Urban Spaces | Wayne Wu et.al. | 2407.08725 | translate | read | null |
| 2024-07-11 | RoboMorph: Evolving Robot Morphology using Large Language Models | Kevin Qiu et.al. | 2407.08626 | translate | read | null |
| 2024-07-11 | A Review of Nine Physics Engines for Reinforcement Learning Research | Michael Kaup et.al. | 2407.08590 | translate | read | null |
| 2024-07-11 | HACMan++: Spatially-Grounded Motion Primitives for Manipulation | Bowen Jiang et.al. | 2407.08585 | translate | read | null |
| 2024-07-11 | Imitation Learning for Robotic Assisted Ultrasound Examination of Deep Venous Thrombosis using Kernelized Movement Primitives | Diego Dall’Alba et.al. | 2407.08506 | translate | read | null |
| 2024-07-11 | TLDR: Unsupervised Goal-Conditioned RL via Temporal Distance-Aware Representations | Junik Bae et.al. | 2407.08464 | translate | read | null |
| 2024-07-11 | Distributed Deep Reinforcement Learning Based Gradient Quantization for Federated Learning Enabled Vehicle Edge Computing | Cui Zhang et.al. | 2407.08462 | translate | read | null |
| 2024-07-11 | Joint Optimization of Age of Information and Energy Consumption in NR-V2X System based on Deep Reinforcement Learning | Shulin Song et.al. | 2407.08458 | translate | read | link |
| 2024-07-11 | A Cantor-Kantorovich Metric Between Markov Decision Processes with Application to Transfer Learning | Adrien Banse et.al. | 2407.08324 | translate | read | null |
| 2024-07-11 | A Deep Reinforcement Learning Framework and Methodology for Reducing the Sim-to-Real Gap in ASV Navigation | Luis F W Batista et.al. | 2407.08263 | translate | read | null |
| 2024-07-10 | Learning In-Hand Translation Using Tactile Skin With Shear and Normal Force Sensing | Jessica Yin et.al. | 2407.07885 | translate | read | null |
| 2024-07-10 | Green Screen Augmentation Enables Scene Generalisation in Robotic Manipulation | Eugene Teoh et.al. | 2407.07868 | translate | read | null |
| 2024-07-10 | Reinforcement Learning of Adaptive Acquisition Policies for Inverse Problems | Gianluigi Silvestri et.al. | 2407.07794 | translate | read | null |
| 2024-07-11 | BiGym: A Demo-Driven Mobile Bi-Manual Manipulation Benchmark | Nikita Chernyadev et.al. | 2407.07788 | translate | read | null |
| 2024-07-10 | Continuous Control with Coarse-to-fine Reinforcement Learning | Younggyo Seo et.al. | 2407.07787 | translate | read | null |
| 2024-07-10 | Towards Human-Like Driving: Active Inference in Autonomous Vehicle Control | Elahe Delavari et.al. | 2407.07684 | translate | read | null |
| 2024-07-10 | Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning | Dake Zhang et.al. | 2407.07631 | translate | read | null |
| 2024-07-10 | Resource Allocation for Twin Maintenance and Computing Task Processing in Digital Twin Vehicular Edge Computing Network | Yu Xie et.al. | 2407.07575 | translate | read | link |
| 2024-07-10 | CM-DQN: A Value-Based Deep Reinforcement Learning Model to Simulate Confirmation Bias | Jiacheng Shen et.al. | 2407.07454 | translate | read | link |
| 2024-07-10 | Real-time system optimal traffic routing under uncertainties – Can physics models boost reinforcement learning? | Zemian Ke et.al. | 2407.07364 | translate | read | null |
| 2024-07-09 | Safe and Reliable Training of Learning-Based Aerospace Controllers | Udayan Mandal et.al. | 2407.07088 | translate | read | null |
| 2024-07-09 | Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models | Logan Cross et.al. | 2407.07086 | translate | read | link |
| 2024-07-09 | Can Learned Optimization Make Reinforcement Learning Less Difficult? | Alexander David Goldie et.al. | 2407.07082 | translate | read | link |
| 2024-07-09 | A Unified Approach to Multi-task Legged Navigation: Temporal Logic Meets Reinforcement Learning | Jesse Jiang et.al. | 2407.06931 | translate | read | null |
| 2024-07-09 | Intercepting Unauthorized Aerial Robots in Controlled Airspace Using Reinforcement Learning | Francisco Giral et.al. | 2407.06909 | translate | read | null |
| 2024-07-09 | Learning From Crowdsourced Noisy Labels: A Signal Processing Perspective | Shahana Ibrahim et.al. | 2407.06902 | translate | read | null |
| 2024-07-09 | Energy Efficient Fair STAR-RIS for Mobile Users | Ashok S. Kumar et.al. | 2407.06868 | translate | read | null |
| 2024-07-09 | Frequency and Generalisation of Periodic Activation Functions in Reinforcement Learning | Augustine N. Mavor-Parker et.al. | 2407.06756 | translate | read | null |
| 2024-07-09 | Hierarchical Average-Reward Linearly-solvable Markov Decision Processes | Guillermo Infante et.al. | 2407.06690 | translate | read | null |
| 2024-07-09 | Powerful and Flexible: Personalized Text-to-Image Generation via Reinforcement Learning | Fanyue Wei et.al. | 2407.06642 | translate | read | link |
| 2024-07-08 | Periodic agent-state based Q-learning for POMDPs | Amit Sinha et.al. | 2407.06121 | translate | read | null |
| 2024-07-08 | QTRL: Toward Practical Quantum Reinforcement Learning via Quantum-Train | Chen-Yu Liu et.al. | 2407.06103 | translate | read | null |
| 2024-07-08 | Stranger Danger! Identifying and Avoiding Unpredictable Pedestrians in RL-based Social Robot Navigation | Sara Pohland et.al. | 2407.06056 | translate | read | link |
| 2024-07-08 | iLLM-TSC: Integration reinforcement learning and large language model for traffic signal control policy improvement | Aoyu Pang et.al. | 2407.06025 | translate | read | link |
| 2024-07-08 | Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals | Moritz Reuss et.al. | 2407.05996 | translate | read | null |
| 2024-07-08 | On Bellman equations for continuous-time policy evaluation I: discretization and approximation | Wenlong Mou et.al. | 2407.05966 | translate | read | null |
| 2024-07-08 | Graph Anomaly Detection with Noisy Labels by Reinforcement Learning | Zhu Wang et.al. | 2407.05934 | translate | read | null |
| 2024-07-08 | FedMRL: Data Heterogeneity Aware Federated Multi-agent Deep Reinforcement Learning for Medical Imaging | Pranab Sahoo et.al. | 2407.05800 | translate | read | link |
| 2024-07-08 | Structural Generalization in Autonomous Cyber Incident Response with Message-Passing Neural Networks and Reinforcement Learning | Jakob Nyberg et.al. | 2407.05775 | translate | read | link |
| 2024-07-08 | Multi-agent Reinforcement Learning-based Network Intrusion Detection System | Amine Tellache et.al. | 2407.05766 | translate | read | null |
| 2024-07-05 | Graph Reinforcement Learning in Power Grids: A Survey | Mohamed Hassouna et.al. | 2407.04522 | translate | read | null |
| 2024-07-05 | Using Petri Nets as an Integrated Constraint Mechanism for Reinforcement Learning Tasks | Timon Sachweh et.al. | 2407.04481 | translate | read | null |
| 2024-07-05 | Hindsight Preference Learning for Offline Preference-based Reinforcement Learning | Chen-Xiao Gao et.al. | 2407.04451 | translate | read | link |
| 2024-07-05 | Enhancing Safety for Autonomous Agents in Partly Concealed Urban Traffic Environments Through Representation-Based Shielding | Pierre Haritz et.al. | 2407.04343 | translate | read | null |
| 2024-07-05 | Gradient-based Regularization for Action Smoothness in Robotic Control with Reinforcement Learning | I Lee et.al. | 2407.04315 | translate | read | null |
| 2024-07-05 | Robust Decision Transformer: Tackling Data Corruption in Offline RL via Sequence Modeling | Jiawei Xu et.al. | 2407.04285 | translate | read | null |
| 2024-07-05 | Unsupervised Video Summarization via Reinforcement Learning and a Trained Evaluator | Mehryar Abbasi et.al. | 2407.04258 | translate | read | null |
| 2024-07-05 | PA-LOCO: Learning Perturbation-Adaptive Locomotion for Quadruped Robots | Zhiyuan Xiao et.al. | 2407.04224 | translate | read | null |
| 2024-07-05 | Autoverse: An Evolvable Game Language for Learning Robust Embodied Agents | Sam Earle et.al. | 2407.04221 | translate | read | null |
| 2024-07-04 | Orchestrating LLMs with Different Personalizations | Jin Peng Zhou et.al. | 2407.04181 | translate | read | null |
| 2024-07-03 | Value-Penalized Auxiliary Control from Examples for Learning without Rewards or Demonstrations | Trevor Ablett et.al. | 2407.03311 | translate | read | link |
| 2024-07-03 | A Review of the Applications of Deep Learning-Based Emergent Communication | Brendon Boldt et.al. | 2407.03302 | translate | read | null |
| 2024-07-03 | Cooperative Multi-Agent Deep Reinforcement Learning Methods for UAV-aided Mobile Edge Computing Networks | Mintae Kim et.al. | 2407.03280 | translate | read | null |
| 2024-07-03 | Policy-guided Monte Carlo on general state spaces: Application to glass-forming mixtures | Leonardo Galliano et.al. | 2407.03275 | translate | read | null |
| 2024-07-03 | PPO-based Dynamic Control of Uncertain Floating Platforms in the Zero-G Environment | Mahya Ramezani et.al. | 2407.03224 | translate | read | null |
| 2024-07-03 | Combining AI Control Systems and Human Decision Support via Robustness and Criticality | Walt Woods et.al. | 2407.03210 | translate | read | null |
| 2024-07-03 | Bunny-VisionPro: Real-Time Bimanual Dexterous Teleoperation for Imitation Learning | Runyu Ding et.al. | 2407.03162 | translate | read | null |
| 2024-07-03 | Reinforcement Learning for Sequence Design Leveraging Protein Language Models | Jithendaraa Subramanian et.al. | 2407.03154 | translate | read | null |
| 2024-07-03 | Warm-up Free Policy Optimization: Improved Regret in Linear Markov Decision Processes | Asaf Cassel et.al. | 2407.03065 | translate | read | null |
| 2024-07-03 | Improving Conversational Abilities of Quantized Large Language Models via Direct Preference Alignment | Janghwan Lee et.al. | 2407.03051 | translate | read | null |
| 2024-07-02 | PWM: Policy Learning with Large World Models | Ignat Georgiev et.al. | 2407.02466 | translate | read | null |
| 2024-07-02 | Predicting Visual Attention in Graphic Design Documents | Souradeep Chakraborty et.al. | 2407.02439 | translate | read | null |
| 2024-07-02 | Reinforcement Learning and Machine ethics: a systematic review | Ajay Vishwanath et.al. | 2407.02425 | translate | read | null |
| 2024-07-02 | Talking to Machines: do you read me? | Lina M. Rojas-Barahona et.al. | 2407.02354 | translate | read | null |
| 2024-07-02 | DextrAH-G: Pixels-to-Action Dexterous Arm-Hand Grasping with Geometric Fabrics | Tyler Ga Wei Lum et.al. | 2407.02274 | translate | read | null |
| 2024-07-02 | Safe CoR: A Dual-Expert Approach to Integrating Imitation Learning and Safe Reinforcement Learning Using Constraint Rewards | Hyeokjin Kwon et.al. | 2407.02245 | translate | read | null |
| 2024-07-02 | Robust Zero-Shot Text-to-Speech Synthesis with Reverse Inference Optimization | Yuchen Hu et.al. | 2407.02243 | translate | read | null |
| 2024-07-02 | Safety-Driven Deep Reinforcement Learning Framework for Cobots: A Sim2Real Approach | Ammar N. Abbas et.al. | 2407.02231 | translate | read | link |
| 2024-07-02 | Physics-Informed Model and Hybrid Planning for Efficient Dyna-Style Reinforcement Learning | Zakariae El Asri et.al. | 2407.02217 | translate | read | null |
| 2024-07-02 | Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning | Yifang Chen et.al. | 2407.02119 | translate | read | null |