Reinforcement Learning - 2024-05

| Publish Date | Title | Authors | PDF | Translate | Read | Code |
|:---|:---|:---|:---|:---|:---|:---|
| 2024-05-31 | Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF | Tengyang Xie et.al. | 2405.21046 | translate | read | null |
| 2024-05-31 | Direct Alignment of Language Models via Quality-Aware Self-Refinement | Runsheng Yu et.al. | 2405.21040 | translate | read | null |
| 2024-05-31 | Generating Triangulations and Fibrations with Reinforcement Learning | Per Berglund et.al. | 2405.21017 | translate | read | null |
| 2024-05-31 | Bayesian Design Principles for Offline-to-Online Reinforcement Learning | Hao Hu et.al. | 2405.20984 | translate | read | null |
| 2024-05-31 | Goal-Oriented Sensor Reporting Scheduling for Non-linear Dynamic System Monitoring | Prasoon Raghuwanshi et.al. | 2405.20983 | translate | read | null |
| 2024-05-31 | SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales | Tianyang Xu et.al. | 2405.20974 | translate | read | link |
| 2024-05-31 | Amortizing intractable inference in diffusion models for vision, language, and control | Siddarth Venkatraman et.al. | 2405.20971 | translate | read | link |
| 2024-05-31 | Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation | Shangding Gu et.al. | 2405.20860 | translate | read | null |
| 2024-05-31 | Improving Reward Models with Synthetic Critiques | Zihuiwen Ye et.al. | 2405.20850 | translate | read | null |
| 2024-05-30 | Group Robust Preference Optimization in Reward-free RLHF | Shyam Sundhar Ramesh et.al. | 2405.20304 | translate | read | link |
| 2024-05-30 | Evaluating Large Language Model Biases in Persona-Steered Generation | Andy Liu et.al. | 2405.20253 | translate | read | link |
| 2024-05-30 | InstructionCP: A fast approach to transfer Large Language Models into target language | Kuang-Ming Chen et.al. | 2405.20175 | translate | read | null |
| 2024-05-30 | Enhancing Battlefield Awareness: An Aerial RIS-assisted ISAC System with Deep Reinforcement Learning | Hyunsang Cho et.al. | 2405.20168 | translate | read | null |
| 2024-05-30 | Randomized Exploration for Reinforcement Learning with Multinomial Logistic Function Approximation | Wooseong Cho et.al. | 2405.20165 | translate | read | null |
| 2024-05-30 | NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models | Kai Wu et.al. | 2405.20081 | translate | read | null |
| 2024-05-30 | Would I Lie To You? Inference Time Alignment of Language Models using Direct Preference Heads | Avelina Asada Hadji-Kyriacou et.al. | 2405.20053 | translate | read | link |
| 2024-05-30 | Deep Reinforcement Learning for Intrusion Detection in IoT: A Survey | Afrah Gueriani et.al. | 2405.20038 | translate | read | null |
| 2024-05-30 | Safe Multi-agent Reinforcement Learning with Natural Language Constraints | Ziyan Wang et.al. | 2405.20018 | translate | read | null |
| 2024-05-30 | LAGMA: LAtent Goal-guided Multi-Agent Reinforcement Learning | Hyungho Na et.al. | 2405.19998 | translate | read | null |
| 2024-05-29 | Self-Exploring Language Models: Active Preference Elicitation for Online Alignment | Shenao Zhang et.al. | 2405.19332 | translate | read | link |
| 2024-05-29 | Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF | Shicong Cen et.al. | 2405.19320 | translate | read | null |
| 2024-05-29 | Robust Preference Optimization through Reward Model Distillation | Adam Fisch et.al. | 2405.19316 | translate | read | null |
| 2024-05-29 | Data Efficient Behavior Cloning for Fine Manipulation via Continuity-based Corrective Labels | Abhay Deshpande et.al. | 2405.19307 | translate | read | null |
| 2024-05-29 | Act Natural! Projecting Autonomous System Trajectories Into Naturalistic Behavior Sets | Hamzah I. Khan et.al. | 2405.19292 | translate | read | null |
| 2024-05-29 | Rich-Observation Reinforcement Learning with Continuous Latent Dynamics | Yuda Song et.al. | 2405.19269 | translate | read | null |
| 2024-05-29 | Exploring the impact of traffic signal control and connected and automated vehicles on intersections safety: A deep reinforcement learning approach | Amir Hossein Karbasi et.al. | 2405.19236 | translate | read | null |
| 2024-05-29 | Diffusion-based Dynamics Models for Long-Horizon Rollout in Offline Reinforcement Learning | Hanye Zhao et.al. | 2405.19189 | translate | read | null |
| 2024-05-29 | Conditional Latent ODEs for Motion Prediction in Autonomous Driving | Khang Truong Giang et.al. | 2405.19183 | translate | read | null |
| 2024-05-29 | A Study of Plasticity Loss in On-Policy Deep Reinforcement Learning | Arthur Juliani et.al. | 2405.19153 | translate | read | null |
| 2024-05-28 | Hierarchical World Models as Visual Whole-Body Humanoid Controllers | Nicklas Hansen et.al. | 2405.18418 | translate | read | null |
| 2024-05-28 | Value Alignment and Trust in Human-Robot Interaction: Insights from Simulation and User Study | Shreyas Bhat et.al. | 2405.18324 | translate | read | null |
| 2024-05-28 | Highway Reinforcement Learning | Yuhui Wang et.al. | 2405.18289 | translate | read | null |
| 2024-05-28 | Extreme Value Monte Carlo Tree Search | Masataro Asai et.al. | 2405.18248 | translate | read | null |
| 2024-05-28 | Recurrent Natural Policy Gradient for POMDPs | Semih Cayci et.al. | 2405.18221 | translate | read | null |
| 2024-05-28 | Safe Multi-Agent Reinforcement Learning with Bilevel Optimization in Autonomous Driving | Zhi Zheng et.al. | 2405.18209 | translate | read | link |
| 2024-05-28 | Mutation-Bias Learning in Games | Johann Bauer et.al. | 2405.18190 | translate | read | null |
| 2024-05-28 | Safe Reinforcement Learning in Black-Box Environments via Adaptive Shielding | Daniel Bethell et.al. | 2405.18180 | translate | read | link |
| 2024-05-28 | Defending Large Language Models Against Jailbreak Attacks via Layer-specific Editing | Wei Zhao et.al. | 2405.18166 | translate | read | link |
| 2024-05-28 | PyTAG: Tabletop Games for Multi-Agent Reinforcement Learning | Martin Balla et.al. | 2405.18123 | translate | read | link |
| 2024-05-27 | A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning | Abdulaziz Almuzairee et.al. | 2405.17416 | translate | read | null |
| 2024-05-27 | Rethinking Transformers in Solving POMDPs | Chenhao Lu et.al. | 2405.17358 | translate | read | link |
| 2024-05-27 | Opinion-Guided Reinforcement Learning | Kyanna Dagenais et.al. | 2405.17287 | translate | read | null |
| 2024-05-27 | DPN: Decoupling Partition and Navigation for Neural Solvers of Min-max Vehicle Routing Problems | Zhi Zheng et.al. | 2405.17272 | translate | read | link |
| 2024-05-27 | Surprise-Adaptive Intrinsic Motivation for Unsupervised Reinforcement Learning | Adriana Hugessen et.al. | 2405.17243 | translate | read | null |
| 2024-05-27 | InsigHTable: Insight-driven Hierarchical Table Visualization with Reinforcement Learning | Guozheng Li et.al. | 2405.17229 | translate | read | null |
| 2024-05-27 | Learning Generic and Dynamic Locomotion of Humanoids Across Discrete Terrains | Shangqun Yu et.al. | 2405.17227 | translate | read | null |
| 2024-05-27 | Flow control of three-dimensional cylinders transitioning to turbulence via multi-agent reinforcement learning | P. Suárez et.al. | 2405.17210 | translate | read | null |
| 2024-05-27 | CoSLight: Co-optimizing Collaborator Selection and Decision-making to Enhance Traffic Signal Control | Jingqing Ruan et.al. | 2405.17152 | translate | read | link |
| 2024-05-27 | Q-value Regularized Transformer for Offline Reinforcement Learning | Shengchao Hu et.al. | 2405.17098 | translate | read | null |
| 2024-05-24 | Inverse-RLignment: Inverse Reinforcement Learning from Demonstrations for LLM Alignment | Hao Sun et.al. | 2405.15624 | translate | read | null |
| 2024-05-24 | Neuromorphic dreaming: A pathway to efficient learning in artificial agents | Ingo Blakowski et.al. | 2405.15616 | translate | read | null |
| 2024-05-24 | OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code | Maxence Faldor et.al. | 2405.15568 | translate | read | link |
| 2024-05-24 | Learning Generalizable Human Motion Generator with Reinforcement Learning | Yunyao Mao et.al. | 2405.15541 | translate | read | null |
| 2024-05-24 | Randomized algorithms and PAC bounds for inverse reinforcement learning in continuous spaces | Angeliki Kamoutsi et.al. | 2405.15509 | translate | read | null |
| 2024-05-24 | Human-in-the-loop Reinforcement Learning for Data Quality Monitoring in Particle Physics Experiments | Olivia Jullian Parra et.al. | 2405.15508 | translate | read | null |
| 2024-05-24 | TD3 Based Collision Free Motion Planning for Robot Navigation | Hao Liu et.al. | 2405.15460 | translate | read | null |
| 2024-05-24 | Counterexample-Guided Repair of Reinforcement Learning Systems Using Safety Critics | David Boetius et.al. | 2405.15430 | translate | read | null |
| 2024-05-24 | Model-free reinforcement learning with noisy actions for automated experimental control in optics | Lea Richtmann et.al. | 2405.15421 | translate | read | null |
| 2024-05-24 | Efficient Recurrent Off-Policy RL Requires a Context-Encoder-Specific Learning Rate | Fan-Ming Luo et.al. | 2405.15384 | translate | read | null |
| 2024-05-23 | Privileged Sensing Scaffolds Reinforcement Learning | Edward S. Hu et.al. | 2405.14853 | translate | read | null |
| 2024-05-23 | Axioms for AI Alignment from Human Feedback | Luise Ge et.al. | 2405.14758 | translate | read | null |
| 2024-05-23 | AGILE: A Novel Framework of LLM Agents | Peiyuan Feng et.al. | 2405.14751 | translate | read | link |
| 2024-05-23 | Policy Gradient Methods for Risk-Sensitive Distributional Reinforcement Learning with Provable Convergence | Minheng Xiao et.al. | 2405.14749 | translate | read | null |
| 2024-05-23 | SimPO: Simple Preference Optimization with a Reference-Free Reward | Yu Meng et.al. | 2405.14734 | translate | read | link |
| 2024-05-23 | Multi-turn Reinforcement Learning from Preference Human Feedback | Lior Shani et.al. | 2405.14655 | translate | read | null |
| 2024-05-23 | Reinforcement Learning for Fine-tuning Text-to-speech Diffusion Models | Jingyi Chen et.al. | 2405.14632 | translate | read | null |
| 2024-05-23 | Which Experiences Are Influential for RL Agents? Efficiently Estimating The Influence of Experiences | Takuya Hiraoka et.al. | 2405.14629 | translate | read | null |
| 2024-05-23 | Closed-form Symbolic Solutions: A New Perspective on Solving Partial Differential Equations | Shu Wei et.al. | 2405.14620 | translate | read | null |
| 2024-05-23 | Discretization of continuous input spaces in the hippocampal autoencoder | Adrian F. Amil et.al. | 2405.14600 | translate | read | null |
| 2024-05-21 | Energy Rank Alignment: Using Preference Optimization to Search Chemical Space at Scale | Shriram Chennakesavalu et.al. | 2405.12961 | translate | read | null |
| 2024-05-21 | Effect of Synthetic Jets Actuator Parameters on Deep Reinforcement Learning-Based Flow Control Performance in a Square Cylinder | Wang Jia et.al. | 2405.12834 | translate | read | null |
| 2024-05-21 | Deep Reinforcement Learning for Time-Critical Wilderness Search And Rescue Using Drones | Jan-Hendrik Ewers et.al. | 2405.12800 | translate | read | null |
| 2024-05-21 | Generative AI and Large Language Models for Cyber Security: All Insights You Need | Mohamed Amine Ferrag et.al. | 2405.12750 | translate | read | null |
| 2024-05-21 | Reinforcement Learning Enabled Peer-to-Peer Energy Trading for Dairy Farms | Mian Ibad Ali Shah et.al. | 2405.12716 | translate | read | null |
| 2024-05-21 | A Multimodal Learning-based Approach for Autonomous Landing of UAV | Francisco Neves et.al. | 2405.12681 | translate | read | null |
| 2024-05-21 | Learning Causal Dynamics Models in Object-Oriented Environments | Zhongwei Yu et.al. | 2405.12615 | translate | read | null |
| 2024-05-21 | PhiBE: A PDE-based Bellman Equation for Continuous Time Policy Evaluation | Yuhua Zhu et.al. | 2405.12535 | translate | read | null |
| 2024-05-21 | GASE: Graph Attention Sampling with Edges Fusion for Solving Vehicle Routing Problems | Zhenwei Wang et.al. | 2405.12475 | translate | read | null |
| 2024-05-21 | Physics-based Scene Layout Generation from Human Motion | Jianan Li et.al. | 2405.12460 | translate | read | null |
| 2024-05-20 | Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning? | Yang Dai et.al. | 2405.12094 | translate | read | null |
| 2024-05-20 | PARALLELGPUOS: A Concurrent OS-level GPU Checkpoint and Restore System using Validated Speculation | Zhuobin Huang et.al. | 2405.12079 | translate | read | null |
| 2024-05-20 | Scrutinize What We Ignore: Reining Task Representation Shift In Context-Based Offline Meta Reinforcement Learning | Hai Zhang et.al. | 2405.12001 | translate | read | null |
| 2024-05-20 | Robust Deep Reinforcement Learning with Adaptive Adversarial Perturbations in Action Space | Qianmei Liu et.al. | 2405.11982 | translate | read | null |
| 2024-05-20 | A Constraint-Enforcing Reward for Adversarial Attacks on Text Classifiers | Tom Roth et.al. | 2405.11904 | translate | read | null |
| 2024-05-20 | Intuitive Fine-Tuning: Towards Unifying SFT and RLHF into a Single Process | Ermo Hua et.al. | 2405.11870 | translate | read | link |
| 2024-05-20 | Reward-Punishment Reinforcement Learning with Maximum Entropy | Jiexin Wang et.al. | 2405.11784 | translate | read | null |
| 2024-05-20 | Efficient Multi-agent Reinforcement Learning by Planning | Qihan Liu et.al. | 2405.11778 | translate | read | link |
| 2024-05-20 | Learning Future Representation with Synthetic Observations for Sample-efficient Reinforcement Learning | Xin Liu et.al. | 2405.11740 | translate | read | null |
| 2024-05-20 | Highway Graph to Accelerate Reinforcement Learning | Zidu Yin et.al. | 2405.11727 | translate | read | link |
| 2024-05-17 | Application of Artificial Intelligence in Schizophrenia Rehabilitation Management: Systematic Literature Review | Hongyi Yang et.al. | 2405.10883 | translate | read | null |
| 2024-05-17 | Automated Radiology Report Generation: A Review of Recent Advances | Phillip Sloan et.al. | 2405.10842 | translate | read | null |
| 2024-05-17 | Combining Teacher-Student with Representation Learning: A Concurrent Teacher-Student Reinforcement Learning Paradigm for Legged Locomotion | Hongxi Wang et.al. | 2405.10830 | translate | read | null |
| 2024-05-17 | Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities | Hao Zhou et.al. | 2405.10825 | translate | read | null |
| 2024-05-17 | A Functional Model Method for Nonconvex Nonsmooth Conditional Stochastic Optimization | Andrzej Ruszczyński et.al. | 2405.10815 | translate | read | null |
| 2024-05-17 | SignLLM: Sign Languages Production Large Language Models | Sen Fang et.al. | 2405.10718 | translate | read | null |
| 2024-05-17 | Sample-Efficient Constrained Reinforcement Learning with General Parameterization | Washim Uddin Mondal et.al. | 2405.10624 | translate | read | null |
| 2024-05-17 | An Efficient Learning Control Framework With Sim-to-Real for String-Type Artificial Muscle-Driven Robotic Systems | Jiyue Tao et.al. | 2405.10576 | translate | read | null |
| 2024-05-17 | Time-Varying Constraint-Aware Reinforcement Learning for Energy Storage Control | Jaeik Jeong et.al. | 2405.10536 | translate | read | null |
| 2024-05-17 | Towards Better Question Generation in QA-Based Event Extraction | Zijin Hong et.al. | 2405.10517 | translate | read | null |
| 2024-05-16 | Stochastic Q-learning for Large Discrete Action Spaces | Fares Fourati et.al. | 2405.10310 | translate | read | null |
| 2024-05-16 | Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning | Yuexiang Zhai et.al. | 2405.10292 | translate | read | null |
| 2024-05-16 | Keep It Private: Unsupervised Privatization of Online Text | Calvin Bao et.al. | 2405.10260 | translate | read | link |
| 2024-05-16 | A Design Trajectory Map of Human-AI Collaborative Reinforcement Learning Systems: Survey and Taxonomy | Zhaoxing Li et.al. | 2405.10214 | translate | read | null |
| 2024-05-16 | Continuous Transfer Learning for UAV Communication-aware Trajectory Design | Chenrui Sun et.al. | 2405.10087 | translate | read | null |
| 2024-05-16 | Optimizing Search and Rescue UAV Connectivity in Challenging Terrain through Multi Q-Learning | Mohammed M. H. Qazzaz et.al. | 2405.10042 | translate | read | null |
| 2024-05-16 | Reward Centering | Abhishek Naik et.al. | 2405.09999 | translate | read | null |
| 2024-05-16 | Combining RL and IL using a dynamic, performance-based modulation over learning signals and its application to local planning | Francisco Leiva et.al. | 2405.09760 | translate | read | null |
| 2024-05-16 | NIFTY Financial News Headlines Dataset | Raeid Saqur et.al. | 2405.09747 | translate | read | null |
| 2024-05-15 | Fast Two-Time-Scale Stochastic Gradient Method with Applications in Reinforcement Learning | Sihan Zeng et.al. | 2405.09660 | translate | read | null |
| 2024-05-15 | Reinforcement Learning-Based Framework for the Intelligent Adaptation of User Interfaces | Daniel Gaspar-Figueiredo et.al. | 2405.09255 | translate | read | null |
| 2024-05-15 | DVS-RG: Differential Variable Speed Limits Control using Deep Reinforcement Learning with Graph State Representation | Jingwen Yang et.al. | 2405.09163 | translate | read | null |
| 2024-05-15 | CarDreamer: Open-Source Learning Platform for World Model based Autonomous Driving | Dechen Gao et.al. | 2405.09111 | translate | read | null |
| 2024-05-15 | Chaos-based reinforcement learning with TD3 | Toshitaka Matsuki et.al. | 2405.09086 | translate | read | null |
| 2024-05-15 | Deep Learning in Earthquake Engineering: A Comprehensive Review | Yazhou Xie et.al. | 2405.09021 | translate | read | null |
| 2024-05-14 | Large Language Models for Human-Machine Collaborative Particle Accelerator Tuning through Natural Language | Jan Kaiser et.al. | 2405.08888 | translate | read | null |
| 2024-05-14 | Stable Inverse Reinforcement Learning: Policies from Control Lyapunov Landscapes | Samuel Tesfazgi et.al. | 2405.08756 | translate | read | null |
| 2024-05-14 | Hierarchical Resource Partitioning on Modern GPUs: A Reinforcement Learning Approach | Urvij Saroliya et.al. | 2405.08754 | translate | read | null |
| 2024-05-14 | Reinformer: Max-Return Sequence Modeling for offline RL | Zifeng Zhuang et.al. | 2405.08740 | translate | read | null |
| 2024-05-14 | I-CTRL: Imitation to Control Humanoid Robots Through Constrained Reinforcement Learning | Yashuai Yan et.al. | 2405.08726 | translate | read | null |
| 2024-05-15 | Enhancing Reinforcement Learning in Sensor Fusion: A Comparative Analysis of Cubature and Sampling-based Integration Methods for Rover Search Planning | Jan-Hendrik Ewers et.al. | 2405.08691 | translate | read | null |
| 2024-05-14 | A Distributed Approach to Autonomous Intersection Management via Multi-Agent Reinforcement Learning | Matteo Cederle et.al. | 2405.08655 | translate | read | link |
| 2024-05-14 | vMFER: Von Mises-Fisher Experience Resampling Based on Uncertainty of Gradient Directions for Policy Improvement | Yiwen Zhu et.al. | 2405.08638 | translate | read | null |
| 2024-05-14 | Optimizing Deep Reinforcement Learning for American Put Option Hedging | Reilly Pickard et.al. | 2405.08602 | translate | read | null |
| 2024-05-14 | Python-Based Reinforcement Learning on Simulink Models | Georg Schäfer et.al. | 2405.08567 | translate | read | null |
| 2024-05-14 | Growing Artificial Neural Networks for Control: the Role of Neuronal Diversity | Eleni Nisioti et.al. | 2405.08510 | translate | read | null |
| 2024-05-13 | Hierarchical Decision Mamba | André Correia et.al. | 2405.07943 | translate | read | link |
| 2024-05-13 | RLHF Workflow: From Reward Modeling to Online RLHF | Hanze Dong et.al. | 2405.07863 | translate | read | link |
| 2024-05-13 | Adaptive Exploration for Data-Efficient General Value Function Evaluations | Arushi Jain et.al. | 2405.07838 | translate | read | null |
| 2024-05-13 | Fixed Point Theory Analysis of a Lambda Policy Iteration with Randomization for the Ćirić Contraction Operator | Abdelkader Belhenniche et.al. | 2405.07824 | translate | read | null |
| 2024-05-13 | Hamiltonian-based Quantum Reinforcement Learning for Neural Combinatorial Optimization | Georg Kruse et.al. | 2405.07790 | translate | read | null |
| 2024-05-13 | Hype or Heuristic? Quantum Reinforcement Learning for Join Order Optimisation | Maja Franz et.al. | 2405.07770 | translate | read | null |
| 2024-05-13 | CAGES: Cost-Aware Gradient Entropy Search for Efficient Local Multi-Fidelity Bayesian Optimization | Wei-Ting Tang et.al. | 2405.07760 | translate | read | null |
| 2024-05-13 | MADRL-Based Rate Adaptation for 360° Video Streaming with Multi-Viewpoint Prediction | Haopeng Wang et.al. | 2405.07759 | translate | read | null |
| 2024-05-13 | Neural Network Compression for Reinforcement Learning Tasks | Dmitry A. Ivanov et.al. | 2405.07748 | translate | read | null |
| 2024-05-13 | Backdoor Removal for Generative Large Language Models | Haoran Li et.al. | 2405.07667 | translate | read | null |
| 2024-05-10 | Value Augmented Sampling for Language Model Alignment and Personalization | Seungwook Han et.al. | 2405.06639 | translate | read | link |
| 2024-05-10 | EcoEdgeTwin: Enhanced 6G Network via Mobile Edge Computing and Digital Twin Integration | Synthia Hossain Karobi et.al. | 2405.06507 | translate | read | null |
| 2024-05-10 | Advantageous and disadvantageous inequality aversion can be taught through vicarious learning of others’ preferences | Shen Zhang et.al. | 2405.06500 | translate | read | null |
| 2024-05-10 | Contextual Affordances for Safe Exploration in Robotic Scenarios | William Z. Ye et.al. | 2405.06422 | translate | read | null |
| 2024-05-10 | Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs | Davide Maran et.al. | 2405.06363 | translate | read | null |
| 2024-05-10 | Learning Latent Dynamic Robust Representations for World Models | Ruixiang Sun et.al. | 2405.06263 | translate | read | link |
| 2024-05-10 | Contrastive Representation for Data Filtering in Cross-Domain Offline Reinforcement Learning | Xiaoyu Wen et.al. | 2405.06192 | translate | read | link |
| 2024-05-10 | (A Partial Survey of) Decentralized, Cooperative Multi-Agent Reinforcement Learning | Christopher Amato et.al. | 2405.06161 | translate | read | null |
| 2024-05-09 | An RNN-policy gradient approach for quantum architecture search | Gang Wang et.al. | 2405.05892 | translate | read | null |
| 2024-05-09 | Safe Exploration Using Bayesian World Models and Log-Barrier Optimization | Yarden As et.al. | 2405.05890 | translate | read | null |
| 2024-05-09 | ExACT: An End-to-End Autonomous Excavator System Using Action Chunking With Transformers | Liangliang Chen et.al. | 2405.05861 | translate | read | null |
| 2024-05-09 | Policy Gradient with Active Importance Sampling | Matteo Papini et.al. | 2405.05630 | translate | read | null |
| 2024-05-09 | An Automatic Prompt Generation System for Tabular Data Tasks | Ashlesha Akella et.al. | 2405.05618 | translate | read | null |
| 2024-05-09 | Dynamic Deep Factor Graph for Multi-Agent Reinforcement Learning | Yuchen Shi et.al. | 2405.05542 | translate | read | link |
| 2024-05-08 | Model-Free Robust $φ$-Divergence Reinforcement Learning Using Both Offline and Online Data | Kishan Panaganti et.al. | 2405.05468 | translate | read | null |
| 2024-05-08 | Markowitz Meets Bellman: Knowledge-distilled Reinforcement Learning for Portfolio Management | Gang Hu et.al. | 2405.05449 | translate | read | null |
| 2024-05-08 | Learning to Play Pursuit-Evasion with Dynamic and Sensor Constraints | Burak M. Gonultas et.al. | 2405.05372 | translate | read | null |
| 2024-05-08 | Offline Model-Based Optimization via Policy-Guided Gradient Search | Yassine Chemingui et.al. | 2405.05349 | translate | read | link |
| 2024-05-08 | Conversational Topic Recommendation in Counseling and Psychotherapy with Decision Transformer and Large Language Models | Aylin Gunal et.al. | 2405.05060 | translate | read | null |
| 2024-05-08 | Fault Identification Enhancement with Reinforcement Learning (FIERL) | Valentina Zaccaria et.al. | 2405.04938 | translate | read | link |
| 2024-05-07 | RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes | Kyle Stachowicz et.al. | 2405.04714 | translate | read | null |
| 2024-05-07 | Proximal Policy Optimization with Adaptive Exploration | Andrei Lixandru et.al. | 2405.04664 | translate | read | null |
| 2024-05-07 | ACEGEN: Reinforcement learning of generative chemical agents for drug discovery | Albert Bou et.al. | 2405.04657 | translate | read | link |
| 2024-05-07 | TorchDriveEnv: A Reinforcement Learning Benchmark for Autonomous Driving with Reactive, Realistic, and Diverse Non-Playable Characters | Jonathan Wilder Lavington et.al. | 2405.04491 | translate | read | null |
| 2024-05-07 | Designing, Developing, and Validating Network Intelligence for Scaling in Service-Based Architectures based on Deep Reinforcement Learning | Paola Soto et.al. | 2405.04441 | translate | read | null |
| 2024-05-08 | DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model | DeepSeek-AI et.al. | 2405.04434 | translate | read | link |
| 2024-05-07 | The Curse of Diversity in Ensemble-Based Exploration | Zhixuan Lin et.al. | 2405.04342 | translate | read | link |
| 2024-05-07 | Deception in Reinforced Autonomous Agents: The Unconventional Rabbit Hat Trick in Legislation | Atharvan Dogra et.al. | 2405.04325 | translate | read | null |
| 2024-05-07 | Genetic Drift Regularization: on preventing Actor Injection from breaking Evolution Strategies | Paul Templier et.al. | 2405.04322 | translate | read | null |
| 2024-05-07 | Improving Offline Reinforcement Learning with Inaccurate Simulators | Yiwen Hou et.al. | 2405.04307 | translate | read | null |
| 2024-05-07 | Deep Reinforcement Learning for Multi-User RF Charging with Non-linear Energy Harvesters | Amirhossein Azarbahram et.al. | 2405.04218 | translate | read | null |
| 2024-05-07 | In-context Learning for Automated Driving Scenarios | Ziqi Zhou et.al. | 2405.04135 | translate | read | null |
| 2024-05-07 | Ranking-based Client Selection with Imitation Learning for Efficient Federated Learning | Chunlin Tian et.al. | 2405.04122 | translate | read | null |
| 2024-05-06 | $ε$-Policy Gradient for Online Pricing | Lukasz Szpruch et.al. | 2405.03624 | translate | read | null |
| 2024-05-06 | Position Paper: Leveraging Foundational Models for Black-Box Optimization: Benefits, Challenges, and Future Directions | Xingyou Song et.al. | 2405.03547 | translate | read | null |
| 2024-05-06 | ReinWiFi: A Reinforcement-Learning-Based Framework for the Application-Layer QoS Optimization of WiFi Networks | Qianren Li et.al. | 2405.03526 | translate | read | null |
| 2024-05-06 | Robotic Constrained Imitation Learning for the Peg Transfer Task in Fundamentals of Laparoscopic Surgery | Kento Kawaharazuka et.al. | 2405.03440 | translate | read | null |
| 2024-05-06 | Reverse Forward Curriculum Learning for Extreme Sample and Demonstration Efficiency in Reinforcement Learning | Stone Tao et.al. | 2405.03379 | translate | read | null |
| 2024-05-06 | Enhancing Q-Learning with Large Language Model Heuristics | Xiefeng Wu et.al. | 2405.03341 | translate | read | null |
| 2024-05-06 | Artificial Intelligence in the Autonomous Navigation of Endovascular Interventions: A Systematic Review | Harry Robertshaw et.al. | 2405.03305 | translate | read | null |
| 2024-05-06 | End-to-End Reinforcement Learning of Curative Curtailment with Partial Measurement Availability | Hinrikus Wolf et.al. | 2405.03262 | translate | read | null |
| 2024-05-06 | Federated Reinforcement Learning with Constraint Heterogeneity | Hao Jin et.al. | 2405.03236 | translate | read | null |
| 2024-05-06 | Robot Air Hockey: A Manipulation Testbed for Robot Learning with Reinforcement Learning | Caleb Chuck et.al. | 2405.03113 | translate | read | null |
| 2024-05-03 | Geometric Fabrics: a Safe Guiding Medium for Policy Learning | Karl Van Wyk et.al. | 2405.02250 | translate | read | null |
| 2024-05-03 | Learning Optimal Deterministic Policies with Stochastic Policy Gradients | Alessandro Montenegro et.al. | 2405.02235 | translate | read | null |
| 2024-05-03 | The Cambridge RoboMaster: An Agile Multi-Robot Research Platform | Jan Blumenkamp et.al. | 2405.02198 | translate | read | null |
| 2024-05-03 | Imitation Learning in Discounted Linear MDPs without exploration assumptions | Luca Viano et.al. | 2405.02181 | translate | read | null |
| 2024-05-03 | Simulating the economic impact of rationality through reinforcement learning and agent-based modelling | Simone Brusatin et.al. | 2405.02161 | translate | read | null |
| 2024-05-03 | Zero-Sum Positional Differential Games as a Framework for Robust Reinforcement Learning: Deep Q-Learning Approach | Anton Plaksin et.al. | 2405.02044 | translate | read | null |
| 2024-05-03 | Model-based reinforcement learning for protein backbone design | Frederic Renard et.al. | 2405.01983 | translate | read | null |
| 2024-05-03 | Rescale-Invariant Federated Reinforcement Learning for Resource Allocation in V2X Networks | Kaidi Xu et.al. | 2405.01961 | translate | read | null |
| 2024-05-03 | Instance-Conditioned Adaptation for Large-scale Generalization of Neural Combinatorial Optimization | Changliang Zhou et.al. | 2405.01906 | translate | read | null |
| 2024-05-03 | Reinforcement Learning control strategies for Electric Vehicles and Renewable energy sources Virtual Power Plants | Francesco Maldonato et.al. | 2405.01889 | translate | read | link |
| 2024-05-02 | Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks | Murtaza Dalal et.al. | 2405.01534 | translate | read | null |
| 2024-05-02 | FLAME: Factuality-Aware Alignment for Large Language Models | Sheng-Chieh Lin et.al. | 2405.01525 | translate | read | null |
| 2024-05-02 | NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment | Gerald Shen et.al. | 2405.01481 | translate | read | link |
| 2024-05-02 | IntervenGen: Interventional Data Generation for Robust and Data-Efficient Robot Imitation Learning | Ryan Hoque et.al. | 2405.01472 | translate | read | null |
| 2024-05-02 | Goal-conditioned reinforcement learning for ultrasound navigation guidance | Abdoul Aziz Amadou et.al. | 2405.01409 | translate | read | null |
| 2024-05-02 | Learning Force Control for Legged Manipulation | Tifanny Portela et.al. | 2405.01402 | translate | read | null |
| 2024-05-02 | Constrained Reinforcement Learning Under Model Mismatch | Zhongchang Sun et.al. | 2405.01327 | translate | read | null |
| 2024-05-02 | Non-iterative Optimization of Trajectory and Radio Resource for Aerial Network | Hyeonsu Lyu et.al. | 2405.01314 | translate | read | null |
| 2024-05-02 | Behavior Imitation for Manipulator Control and Grasping with Deep Reinforcement Learning | Liu Qiyuan et.al. | 2405.01284 | translate | read | null |
| 2024-05-02 | Reinforcement Learning for Edit-Based Non-Autoregressive Neural Machine Translation | Hao Wang et.al. | 2405.01280 | translate | read | null |
| 2024-05-01 | Self-Play Preference Optimization for Language Model Alignment | Yue Wu et.al. | 2405.00675 | translate | read | null |
| 2024-05-01 | No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO | Skander Moalla et.al. | 2405.00662 | translate | read | link |
| 2024-05-01 | HUGO – Highlighting Unseen Grid Options: Combining Deep Reinforcement Learning with a Heuristic Target Topology Approach | Malte Lehna et.al. | 2405.00629 | translate | read | null |
| 2024-05-01 | Koopman-based Deep Learning for Nonlinear System Estimation | Zexin Sun et.al. | 2405.00627 | translate | read | null |
| 2024-05-01 | Queue-based Eco-Driving at Roundabouts with Reinforcement Learning | Anna-Lena Schlamp et.al. | 2405.00625 | translate | read | null |
| 2024-05-01 | The Real, the Better: Aligning Large Language Models with Online Human Behaviors | Guanying Jiang et.al. | 2405.00578 | translate | read | null |
| 2024-05-01 | Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment | Zhili Liu et.al. | 2405.00557 | translate | read | null |
| 2024-05-01 | Navigating WebAI: Training Agents to Complete Web Tasks with Large Language Models and Reinforcement Learning | Lucas-Andreï Thil et.al. | 2405.00516 | translate | read | null |
| 2024-05-01 | MetaRM: Shifted Distributions Alignment via Meta-Learning | Shihan Dou et.al. | 2405.00438 | translate | read | null |
| 2024-05-01 | UCB-driven Utility Function Search for Multi-objective Reinforcement Learning | Yucheng Shi et.al. | 2405.00410 | translate | read | link |

(<a href=../Reinforcement_Learning.md>back to Reinforcement Learning</a>)