Reinforcement Learning - 2024-04

Publish Date Title Authors PDF Translate Read Code
2024-04-30 Collaborative Control Method of Transit Signal Priority Based on Cooperative Game and Reinforcement Learning Hao Qin et.al. 2404.19683 translate read null
2024-04-30 Towards Generalist Robot Learning from Internet Video: A Survey Robert McCarthy et.al. 2404.19664 translate read null
2024-04-30 Short term vs. long term: optimization of microswimmer navigation on different time horizons Navid Mousavi et.al. 2404.19561 translate read null
2024-04-30 Continual Model-based Reinforcement Learning for Data Efficient Wireless Network Optimisation Cengis Hasan et.al. 2404.19462 translate read null
2024-04-30 Imitation Learning: A Survey of Learning Methods, Environments and Metrics Nathan Gavenski et.al. 2404.19456 translate read null
2024-04-30 Countering Reward Over-optimization in LLM with Demonstration-Guided Reinforcement Learning Mathieu Rita et.al. 2404.19409 translate read link
2024-04-30 Numeric Reward Machines Kristina Levina et.al. 2404.19370 translate read null
2024-04-30 Pessimistic Value Iteration for Multi-Task Data Sharing in Offline Reinforcement Learning Chenjia Bai et.al. 2404.19346 translate read link
2024-04-30 Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning Qiaosheng Zhang et.al. 2404.19292 translate read null
2024-04-30 DiffuseLoco: Real-Time Legged Locomotion Control with Diffusion from Offline Datasets Xiaoyu Huang et.al. 2404.19264 translate read null
2024-04-29 DPO Meets PPO: Reinforced Token Optimization for RLHF Han Zhong et.al. 2404.18922 translate read null
2024-04-29 Sample-Efficient Robust Multi-Agent Reinforcement Learning in the Face of Environmental Uncertainty Laixi Shi et.al. 2404.18909 translate read null
2024-04-29 Overcoming Knowledge Barriers: Online Imitation Learning from Observation with Pretrained World Models Xingyuan Zhang et.al. 2404.18896 translate read null
2024-04-29 More RLHF, More Trust? On The Impact of Human Preference Alignment On Language Model Trustworthiness Aaron J. Li et.al. 2404.18870 translate read link
2024-04-29 Performance-Aligned LLMs for Generating Fast Code Daniel Nichols et.al. 2404.18864 translate read null
2024-04-29 PlanNetX: Learning an Efficient Neural Network Planner from MPC for Longitudinal Control Jasper Hoffmann et.al. 2404.18863 translate read null
2024-04-30 Winning the Social Media Influence Battle: Uncertainty-Aware Opinions to Understand and Spread True Information via Competitive Influence Maximization Qi Zhang et.al. 2404.18826 translate read null
2024-04-29 Control Policy Correction Framework for Reinforcement Learning-based Energy Arbitrage Strategies Seyed Soroush Karimi Madahi et.al. 2404.18821 translate read null
2024-04-29 Multi-Agent Synchronization Tasks Rolando Fernandez et.al. 2404.18798 translate read null
2024-04-29 Resource-rational reinforcement learning and sensorimotor causal states Sarah Marzen et.al. 2404.18775 translate read null
2024-04-26 Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo Stephen Zhao et.al. 2404.17546 translate read null
2024-04-26 Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations Puhao Li et.al. 2404.17521 translate read link
2024-04-26 Quantum Multi-Agent Reinforcement Learning for Aerial Ad-hoc Networks Theodora-Augustina Drăgan et.al. 2404.17499 translate read null
2024-04-26 Q-Learning to navigate turbulence without a map Marco Rando et.al. 2404.17495 translate read null
2024-04-26 Adaptive speed planning for Unmanned Vehicle Based on Deep Reinforcement Learning Hao Liu et.al. 2404.17379 translate read null
2024-04-26 When to Trust LLMs: Aligning Confidence with Response Quality Shuchang Tao et.al. 2404.17287 translate read null
2024-04-26 Enhancing Privacy and Security of Autonomous UAV Navigation Vatsal Aggarwal et.al. 2404.17225 translate read null
2024-04-26 Beyond Imitation: A Life-long Policy Learning Framework for Path Tracking Control of Autonomous Driving C. Gong et.al. 2404.17198 translate read null
2024-04-26 An Explainable Deep Reinforcement Learning Model for Warfarin Maintenance Dosing Using Policy Distillation and Action Forging Sadjad Anzabi Zadeh et.al. 2404.17187 translate read null
2024-04-25 Compiler for Distributed Quantum Computing: a Reinforcement Learning Approach Panagiotis Promponas et.al. 2404.17077 translate read null
2024-04-25 REBEL: Reinforcement Learning via Regressing Relative Rewards Zhaolin Gao et.al. 2404.16767 translate read null
2024-04-25 Distilling Privileged Information for Dubins Traveling Salesman Problems with Neighborhoods Min Kyu Shin et.al. 2404.16721 translate read null
2024-04-25 RUMOR: Reinforcement learning for Understanding a Model of the Real World for Navigation in Dynamic Environments Diego Martinez-Baselga et.al. 2404.16672 translate read null
2024-04-25 Hippocrates: An Open-Source Framework for Advancing Large Language Models in Healthcare Emre Can Acikgoz et.al. 2404.16621 translate read null
2024-04-25 Exploring the Dynamics of Data Transmission in 5G Networks: A Conceptual Analysis Nikita Smirnov et.al. 2404.16508 translate read null
2024-04-25 Leveraging Pretrained Latent Representations for Few-Shot Imitation Learning on a Dexterous Robotic Hand Davide Liconti et.al. 2404.16483 translate read null
2024-04-25 A Dual Perspective of Reinforcement Learning for Imposing Policy Constraints Bram De Cooman et.al. 2404.16468 translate read null
2024-04-25 Offline Reinforcement Learning with Behavioral Supervisor Tuning Padmanaba Srinivasan et.al. 2404.16399 translate read null
2024-04-25 SwarmRL: Building the Future of Smart Active Systems Samuel Tovey et.al. 2404.16388 translate read link
2024-04-25 Reinforcement Learning with Generative Models for Compact Support Sets Nico Schiavone et.al. 2404.16300 translate read link
2024-04-24 DPO: Differential reinforcement learning with application to optimal configuration search Chandrajit Bajaj et.al. 2404.15617 translate read null
2024-04-24 GRSN: Gated Recurrent Spiking Neurons for POMDPs and MARL Lang Qin et.al. 2404.15597 translate read null
2024-04-24 Multi-Agent Reinforcement Learning for Energy Networks: Computational Challenges, Progress and Open Problems Sarah Keren et.al. 2404.15583 translate read null
2024-04-23 An MRP Formulation for Supervised Learning: Generalized Temporal Difference Learning Models Yangchen Pan et.al. 2404.15518 translate read null
2024-04-23 The Power of Resets in Online Reinforcement Learning Zakaria Mhammedi et.al. 2404.15417 translate read null
2024-04-23 Planning the path with Reinforcement Learning: Optimal Robot Motion Planning in RoboCup Small Size League Environments Mateus G. Machado et.al. 2404.15410 translate read link
2024-04-23 Reinforcement Learning with Adaptive Control Regularization for Safe Control of Critical Systems Haozhe Tian et.al. 2404.15199 translate read null
2024-04-23 Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation Xun Wu et.al. 2404.15100 translate read null
2024-04-23 Impedance Matching: Enabling an RL-Based Running Jump in a Quadruped Robot Neil Guan et.al. 2404.15096 translate read null
2024-04-23 Using deep reinforcement learning to promote sustainable human behaviour on a common pool resource problem Raphael Koster et.al. 2404.15059 translate read null
2024-04-23 Cache-Aware Reinforcement Learning in Large-Scale Recommender Systems Xiaoshuang Chen et.al. 2404.14961 translate read null
2024-04-23 Multi-Objective Deep Reinforcement Learning for 5G Base Station Placement to Support Localisation for Future Sustainable Traffic Ahmed Al-Tahmeesschi et.al. 2404.14954 translate read null
2024-04-23 MultiSTOP: Solving Functional Equations with Reinforcement Learning Alessandro Trenta et.al. 2404.14909 translate read null
2024-04-23 Unitary Synthesis of Clifford+T Circuits with Reinforcement Learning Sebastian Rietsch et.al. 2404.14865 translate read null
2024-04-23 Evolutionary Reinforcement Learning via Cooperative Coevolution Chengpeng Hu et.al. 2404.14763 translate read null
2024-04-23 Rank2Reward: Learning Shaped Reward Functions from Passive Video Daniel Yang et.al. 2404.14735 translate read null
2024-04-22 Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data Fahim Tajwar et.al. 2404.14367 translate read link
2024-04-22 PLUTO: Pushing the Limit of Imitation Learning-based Planning for Autonomous Driving Jie Cheng et.al. 2404.14327 translate read null
2024-04-22 Multi-Agent Hybrid SAC for Joint SS-DSA in CRNs David R. Nickel et.al. 2404.14319 translate read null
2024-04-22 LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots Dongge Han et.al. 2404.14285 translate read null
2024-04-22 Beyond the Edge: An Advanced Exploration of Reinforcement Learning for Mobile Edge Computing, its Applications, and Future Research Trajectories Ning Yang et.al. 2404.14238 translate read null
2024-04-22 Multi-agent Reinforcement Learning-based Joint Precoding and Phase Shift Optimization for RIS-aided Cell-Free Massive MIMO Systems Yiyang Zhu et.al. 2404.14092 translate read null
2024-04-22 Mechanistic Interpretability for AI Safety – A Review Leonard Bereska et.al. 2404.14082 translate read null
2024-04-22 Research on Robot Path Planning Based on Reinforcement Learning Wang Ruiqi et.al. 2404.14077 translate read link
2024-04-22 Multi-view Disentanglement for Reinforcement Learning with Multiple Cameras Mhairi Dunion et.al. 2404.14064 translate read link
2024-04-22 A survey of air combat behavior modeling using machine learning Patrick Ribu Gorton et.al. 2404.13954 translate read null
2024-04-19 Mapping Social Choice Theory to RLHF Jessica Dai et.al. 2404.13038 translate read null
2024-04-19 Deep Reinforcement Learning-Based Active Flow Control of an Elliptical Cylinder: Transitioning from an Elliptical Cylinder to a Circular Cylinder and a Flat Plate Wang Jia et.al. 2404.13003 translate read null
2024-04-19 Goal Exploration via Adaptive Skill Distribution for Goal-Conditioned Reinforcement Learning Lisheng Wu et.al. 2404.12999 translate read null
2024-04-19 MM-PhyRLHF: Reinforcement Learning Framework for Multimodal Physics Question-Answering Avinash Anand et.al. 2404.12926 translate read null
2024-04-19 Zero-Shot Stitching in Reinforcement Learning using Relative Representations Antonio Pio Ricciardi et.al. 2404.12917 translate read null
2024-04-19 MAexp: A Generic Platform for RL-based Multi-Agent Exploration Shaohao Zhu et.al. 2404.12824 translate read link
2024-04-19 Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation Qiang He et.al. 2404.12754 translate read link
2024-04-19 Demonstration of quantum projective simulation on a single-photon-based quantum computer Giacomo Franceschetto et.al. 2404.12729 translate read null
2024-04-19 Energy Conserved Failure Detection for NS-IoT Systems Guojin Liu et.al. 2404.12713 translate read null
2024-04-19 Single-Task Continual Offline Reinforcement Learning Sibo Gai et.al. 2404.12639 translate read null
2024-04-18 From $r$ to $Q^*$ : Your Language Model is Secretly a Q-Function Rafael Rafailov et.al. 2404.12358 translate read null
2024-04-18 Improving the interpretability of GNN predictions through conformal-based graph sparsification Pablo Sanchez-Martin et.al. 2404.12356 translate read link
2024-04-18 Practical Considerations for Discrete-Time Implementations of Continuous-Time Control Barrier Function-Based Safety Filters Lukas Brunke et.al. 2404.12329 translate read null
2024-04-18 ASID: Active Exploration for System Identification in Robotic Manipulation Marius Memmel et.al. 2404.12308 translate read null
2024-04-18 RISE: 3D Perception Makes Real-World Robot Imitation Simple and Effective Chenxi Wang et.al. 2404.12281 translate read null
2024-04-18 Privacy-Preserving UCB Decision Process Verification via zk-SNARKs Xikun Jiang et.al. 2404.12186 translate read null
2024-04-18 Aligning language models with human preferences Tomasz Korbak et.al. 2404.12150 translate read link
2024-04-19 Robust and Adaptive Deep Reinforcement Learning for Enhancing Flow Control around a Square Cylinder with Varying Reynolds Numbers Wang Jia et.al. 2404.12123 translate read null
2024-04-18 X-Light: Cross-City Traffic Signal Control Using Transformer on Transformer as Meta Multi-Agent Reinforcement Learner Haoyuan Jiang et.al. 2404.12090 translate read link
2024-04-18 Trajectory Planning for Autonomous Vehicle Using Iterative Reward Prediction in Reinforcement Learning Hyunwoo Park et.al. 2404.12079 translate read null
2024-04-17 Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding Zezhong Fan et.al. 2404.11589 translate read null
2024-04-17 Deep Policy Optimization with Temporal Logic Constraints Ameesh Shah et.al. 2404.11578 translate read null
2024-04-17 Spatio-Temporal Motion Retargeting for Quadruped Robots Taerim Yoon et.al. 2404.11557 translate read null
2024-04-17 VC Theory for Inventory Policies Yaqi Xie et.al. 2404.11509 translate read null
2024-04-17 Learn to Tour: Operator Design For Solution Feasibility Mapping in Pickup-and-delivery Traveling Salesman Problem Bowen Fang et.al. 2404.11458 translate read null
2024-04-17 What-if Analysis Framework for Digital Twins in 6G Wireless Network Management Elif Ak et.al. 2404.11394 translate read null
2024-04-17 Convergence of Policy Gradient for Stochastic Linear-Quadratic Control Problem in Infinite Horizon Xinpei Zhang et.al. 2404.11382 translate read null
2024-04-17 Following the Human Thread in Social Navigation Luca Scofano et.al. 2404.11327 translate read link
2024-04-17 On Learning Parities with Dependent Noise Noah Golowich et.al. 2404.11325 translate read null
2024-04-17 Physics-informed Actor-Critic for Coordination of Virtual Inertia from Power Distribution Systems Simon Stock et.al. 2404.11149 translate read null
2024-04-16 Settling Constant Regrets in Linear Markov Decision Processes Weitong Zhang et.al. 2404.10745 translate read null
2024-04-16 N-Agent Ad Hoc Teamwork Caroline Wang et.al. 2404.10740 translate read null
2024-04-16 Bootstrapping Linear Models for Fast Online Adaptation in Human-Agent Collaboration Benjamin A Newman et.al. 2404.10733 translate read null
2024-04-16 Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning Hao-Lun Hsu et.al. 2404.10728 translate read null
2024-04-16 Automatic re-calibration of quantum devices by reinforcement learning T. Crosta et.al. 2404.10726 translate read null
2024-04-16 Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study Shusheng Xu et.al. 2404.10719 translate read null
2024-04-16 Simplex Decomposition for Portfolio Allocation Constraints in Reinforcement Learning David Winkel et.al. 2404.10683 translate read null
2024-04-16 SCALE: Self-Correcting Visual Navigation for Mobile Robots via Anti-Novelty Estimation Chang Chen et.al. 2404.10675 translate read null
2024-04-16 Continual Offline Reinforcement Learning via Diffusion-based Dual Generative Replay Jinmei Liu et.al. 2404.10662 translate read link
2024-04-16 Trajectory Planning using Reinforcement Learning for Interactive Overtaking Maneuvers in Autonomous Racing Scenarios Levent Ögretmen et.al. 2404.10658 translate read null
2024-04-15 Unveiling Imitation Learning: Exploring the Impact of Data Falsity to Large Language Model Hyunsoo Cho et.al. 2404.09717 translate read null
2024-04-15 Higher Replay Ratio Empowers Sample-Efficient Multi-Agent Reinforcement Learning Linjie Xu et.al. 2404.09715 translate read null
2024-04-15 Learn Your Reference Model for Real Good Alignment Alexey Gorbatovski et.al. 2404.09656 translate read null
2024-04-15 Reliability Estimation of News Media Sources: Birds of a Feather Flock Together Sergio Burdisso et.al. 2404.09565 translate read null
2024-04-15 Inferring Behavior-Specific Context Improves Zero-Shot Generalization in Reinforcement Learning Tidiane Camaret Ndir et.al. 2404.09521 translate read link
2024-04-14 Correlated Mean Field Imitation Learning Zhiyu Zhao et.al. 2404.09324 translate read null
2024-04-14 Egret: Reinforcement Mechanism for Sequential Computation Offloading in Edge Computing Haosong Peng et.al. 2404.09285 translate read null
2024-04-14 A Reinforcement Learning Based Backfilling Strategy for HPC Batch Jobs Elliot Kolker-Hicks et.al. 2404.09264 translate read null
2024-04-14 Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts Jing-Cheng Pang et.al. 2404.09248 translate read null
2024-04-14 Advanced Intelligent Optimization Algorithms for Multi-Objective Optimal Power Flow in Future Power Systems: A Review Yuyan Li et.al. 2404.09203 translate read null
2024-04-12 Enhancing Autonomous Vehicle Training with Language Model Integration and Critical Scenario Generation Hanlin Tian et.al. 2404.08570 translate read null
2024-04-12 RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs Shreyas Chaudhari et.al. 2404.08555 translate read null
2024-04-12 Advancing Forest Fire Prevention: Deep Reinforcement Learning for Effective Firebreak Placement Lucas Murray et.al. 2404.08523 translate read null
2024-04-12 Adversarial Imitation Learning via Boosting Jonathan D. Chang et.al. 2404.08513 translate read null
2024-04-12 Prescribing Optimal Health-Aware Operation for Urban Air Mobility with Deep Reinforcement Learning Mina Montazeri et.al. 2404.08497 translate read null
2024-04-12 Dataset Reset Policy Optimization for RLHF Jonathan D. Chang et.al. 2404.08495 translate read link
2024-04-12 Anti-Byzantine Attacks Enabled Vehicle Selection for Asynchronous Federated Learning in Vehicular Edge Computing Cui Zhang et.al. 2404.08444 translate read null
2024-04-12 SIR-RL: Reinforcement Learning for Optimized Policy Control during Epidemiological Outbreaks in Emerging Market and Developing Economies Maeghal Jain et.al. 2404.08423 translate read null
2024-04-12 TDANet: Target-Directed Attention Network For Object-Goal Visual Navigation With Zero-Shot Ability Shiwei Lian et.al. 2404.08353 translate read null
2024-04-12 Agile and versatile bipedal robot tracking control through reinforcement learning Jiayi Li et.al. 2404.08246 translate read null
2024-04-11 High-Dimension Human Value Representation in Large Language Models Samuel Cahyawijaya et.al. 2404.07900 translate read null
2024-04-11 Data-Driven System Identification of Quadrotors Subject to Motor Delays Jonas Eschmann et.al. 2404.07837 translate read null
2024-04-11 On the Sample Efficiency of Abstractions and Potential-Based Reward Shaping in Reinforcement Learning Giuseppe Canonaco et.al. 2404.07826 translate read null
2024-04-11 An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization Minshuo Chen et.al. 2404.07771 translate read null
2024-04-11 Differentially Private Reinforcement Learning with Self-Play Dan Qiao et.al. 2404.07559 translate read null
2024-04-11 Enhancing Policy Gradient with the Polyak Step-Size Adaption Yunxiang Li et.al. 2404.07525 translate read null
2024-04-11 Generative Probabilistic Planning for Optimizing Supply Chain Networks Hyung-il Ahn et.al. 2404.07511 translate read null
2024-04-11 Neural Fault Injection: Generating Software Faults from Natural Language Domenico Cotroneo et.al. 2404.07491 translate read null
2024-04-11 Leveraging Domain-Unlabeled Data in Offline Reinforcement Learning across Two Domains Soichiro Nishimori et.al. 2404.07465 translate read null
2024-04-11 UAV-enabled Collaborative Beamforming via Multi-Agent Deep Reinforcement Learning Saichao Liu et.al. 2404.07453 translate read null
2024-04-10 Reward Learning from Suboptimal Demonstrations with Applications in Surgical Electrocautery Zohre Karimi et.al. 2404.07185 translate read null
2024-04-10 Adaptive behavior with stable synapses Cristiano Capone et.al. 2404.07150 translate read null
2024-04-10 How Consistent are Clinicians? Evaluating the Predictability of Sepsis Disease Progression with Dynamics Models Unnseo Park et.al. 2404.07148 translate read null
2024-04-10 Rethinking Out-of-Distribution Detection for Reinforcement Learning: Advancing Methods for Evaluation and Detection Linas Nasvytis et.al. 2404.07099 translate read link
2024-04-10 Improving Language Model Reasoning with Self-motivated Learning Yunlong Feng et.al. 2404.07017 translate read null
2024-04-10 Agent-driven Generative Semantic Communication for Remote Surveillance Wanting Yang et.al. 2404.06997 translate read null
2024-04-10 Deep Reinforcement Learning for Mobile Robot Path Planning Hao Liu et.al. 2404.06974 translate read null
2024-04-10 UAV-Assisted Enhanced Coverage and Capacity in Dynamic MU-mMIMO IoT Systems: A Deep Reinforcement Learning Approach MohammadMahdi Ghadaksaz et.al. 2404.06726 translate read null
2024-04-10 Dual Ensemble Kalman Filter for Stochastic Optimal Control Anant A. Joshi et.al. 2404.06696 translate read null
2024-04-09 Graph Reinforcement Learning for Combinatorial Optimization: A Survey and Unifying Perspective Victor-Alexandru Darvariu et.al. 2404.06492 translate read null
2024-04-09 Deep Reinforcement Learning-Based Approach for a Single Vehicle Persistent Surveillance Problem with Fuel Constraints Hritik Bana et.al. 2404.06423 translate read null
2024-04-09 The Power in Communication: Power Regularization of Communication for Autonomy in Cooperative Multi-Agent Reinforcement Learning Nancirose Piazza et.al. 2404.06387 translate read null
2024-04-09 Policy-Guided Diffusion Matthew Thomas Jackson et.al. 2404.06356 translate read link
2024-04-09 Generative Pre-Trained Transformer for Symbolic Regression Base In-Context Reinforcement Learning Yanjie Li et.al. 2404.06330 translate read null
2024-04-09 Diverse Randomized Value Functions: A Provably Pessimistic Approach for Offline Reinforcement Learning Xudong Yu et.al. 2404.06188 translate read null
2024-04-09 A quantum information theoretic analysis of reinforcement learning-assisted quantum architecture search Abhishek Sadhu et.al. 2404.06174 translate read null
2024-04-09 Adaptable Recovery Behaviors in Robotics: A Behavior Trees and Motion Generators(BTMG) Approach for Failure Management Faseeh Ahmad et.al. 2404.06129 translate read null
2024-04-09 Automatic Configuration Tuning on Cloud Database: A Survey Limeng Zhang et.al. 2404.06043 translate read null
2024-04-09 Commute with Community: Enhancing Shared Travel through Social Networks Tian Siyuan et.al. 2404.05987 translate read null
2024-04-08 Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer Xinyang Gu et.al. 2404.05695 translate read null
2024-04-08 YaART: Yet Another ART Rendering Technology Sergey Kastryulin et.al. 2404.05666 translate read null
2024-04-08 Dynamic Backtracking in GFlowNet: Enhancing Decision Steps with Reward-Dependent Adjustment Mechanisms Shuai Guo et.al. 2404.05576 translate read null
2024-04-08 Optimal Flow Admission Control in Edge Computing via Safe Reinforcement Learning A. Fox et.al. 2404.05564 translate read null
2024-04-08 Best-of-Venom: Attacking RLHF by Injecting Poisoned Preference Data Tim Baumgärtner et.al. 2404.05530 translate read null
2024-04-08 CNN-based Game State Detection for a Foosball Table David Hagens et.al. 2404.05357 translate read null
2024-04-08 Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models Yutao Ouyang et.al. 2404.05291 translate read null
2024-04-08 SAFE-GIL: SAFEty Guided Imitation Learning Yusuf Umut Ciftci et.al. 2404.05249 translate read null
2024-04-08 MeSA-DRL: Memory-Enhanced Deep Reinforcement Learning for Advanced Socially Aware Robot Navigation in Crowded Environments Mannan Saeed Muhammad et.al. 2404.05203 translate read null
2024-04-08 Decision Transformer for Wireless Communications: A New Paradigm of Resource Management Jie Zhang et.al. 2404.05199 translate read null
2024-04-05 Growing Q-Networks: Solving Continuous Control Tasks with Adaptive Control Resolution Tim Seyde et.al. 2404.04253 translate read null
2024-04-05 Continual Policy Distillation of Reinforcement Learning-based Controllers for Soft Robotic In-Hand Manipulation Lanpei Li et.al. 2404.04219 translate read null
2024-04-05 Enhancing IoT Intelligence: A Transformer-based Reinforcement Learning Methodology Gaith Rjoub et.al. 2404.04205 translate read null
2024-04-05 Intervention-Assisted Policy Gradient Methods for Online Stochastic Queuing Network Optimization: Technical Report Jerrod Wigmore et.al. 2404.04106 translate read null
2024-04-05 Dynamic Prompt Optimizing for Text-to-Image Generation Wenyi Mo et.al. 2404.04095 translate read link
2024-04-05 Demonstration Guided Multi-Objective Reinforcement Learning Junlin Lu et.al. 2404.03997 translate read null
2024-04-05 A proximal policy optimization based intelligent home solar management Kode Creer et.al. 2404.03888 translate read null
2024-04-05 Heterogeneous Multi-Agent Reinforcement Learning for Zero-Shot Scalable Collaboration Xudong Guo et.al. 2404.03869 translate read null
2024-04-04 Exploration is Harder than Prediction: Cryptographically Separating Reinforcement Learning from Supervised Learning Noah Golowich et.al. 2404.03774 translate read null
2024-04-04 A Reinforcement Learning based Reset Policy for CDCL SAT Solvers Chunxiao Li et.al. 2404.03753 translate read null
2024-04-04 AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent Hanyu Lai et.al. 2404.03648 translate read link
2024-04-04 Sequential Recommendation for Optimizing Both Immediate Feedback and Long-term Retention Ziru Liu et.al. 2404.03637 translate read link
2024-04-04 Laser Learning Environment: A new environment for coordination-critical multi-agent tasks Yannick Molinghen et.al. 2404.03596 translate read link
2024-04-04 Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm Miao Lu et.al. 2404.03578 translate read null
2024-04-04 Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity Jake Varley et.al. 2404.03570 translate read null
2024-04-04 AdaGlimpse: Active Visual Exploration with Arbitrary Glimpse Position and Scale Adam Pardyl et.al. 2404.03482 translate read link
2024-04-04 Integrating Hyperparameter Search into GramML Hernán Ceferino Vázquez et.al. 2404.03419 translate read link
2024-04-04 Can Small Language Models Help Large Language Models Reason Better?: LM-Guided Chain-of-Thought Jooyoung Lee et.al. 2404.03414 translate read null
2024-04-04 SENSOR: Imitate Third-Person Expert’s Behaviors via Active Sensoring Kaichen Huang et.al. 2404.03386 translate read null
2024-04-04 DIDA: Denoised Imitation Learning based on Domain Adaptation Kaichen Huang et.al. 2404.03382 translate read null
2024-04-03 Learning Quadrupedal Locomotion via Differentiable Simulation Clemens Schwarke et.al. 2404.02887 translate read null
2024-04-03 Unsupervised Learning of Effective Actions in Robotics Marko Zaric et.al. 2404.02728 translate read link
2024-04-03 Reinforcement Learning in Categorical Cybernetics Jules Hedges et.al. 2404.02688 translate read null
2024-04-03 Solving a Real-World Optimization Problem Using Proximal Policy Optimization with Curriculum Learning and Reward Engineering Abhijeet Pendyala et.al. 2404.02577 translate read null
2024-04-03 SliceIt! – A Dual Simulator Framework for Learning Robot Food Slicing Cristian C. Beltran-Hernandez et.al. 2404.02569 translate read link
2024-04-03 Grid-Mapping Pseudo-Count Constraint for Offline Reinforcement Learning Yi Shen et.al. 2404.02545 translate read link
2024-04-03 Versatile Scene-Consistent Traffic Scenario Generation as Optimization with Diffusion Zhiyu Huang et.al. 2404.02524 translate read null
2024-04-03 Joint Optimization on Uplink OFDMA and MU-MIMO for IEEE 802.11ax: Deep Hierarchical Reinforcement Learning Approach Hyeonho Noh et.al. 2404.02486 translate read null
2024-04-03 Deep Reinforcement Learning for Traveling Purchaser Problems Haofeng Yuan et.al. 2404.02476 translate read null
2024-04-03 Electric Vehicle Routing Problem for Emergency Power Supply: Towards Telecom Base Station Relief Daisuke Kikuta et.al. 2404.02448 translate read link
2024-04-02 Tuning for the Unknown: Revisiting Evaluation Strategies for Lifelong RL Golnaz Mesbahi et.al. 2404.02113 translate read null
2024-04-02 Emergence of Chemotactic Strategies with Multi-Agent Reinforcement Learning Samuel Tovey et.al. 2404.01999 translate read null
2024-04-02 VLRM: Vision-Language Models act as Reward Models for Image Captioning Maksim Dzabraev et.al. 2404.01911 translate read null
2024-04-02 Active Exploration in Bayesian Model-based Reinforcement Learning for Robot Manipulation Carlos Plou et.al. 2404.01867 translate read null
2024-04-02 Keeping Behavioral Programs Alive: Specifying and Executing Liveness Requirements Tom Yaacov et.al. 2404.01858 translate read null
2024-04-02 EV2Gym: A Flexible V2G Simulator for EV Smart Charging Research and Benchmarking Stavros Orfanoudakis et.al. 2404.01849 translate read null
2024-04-02 Doubly-Robust Off-Policy Evaluation with Estimated Logging Policy Kyungbok Lee et.al. 2404.01830 translate read null
2024-04-02 Imitation Game: A Model-based and Imitation Learning Deep Reinforcement Learning Hybrid Eric MSP Veith et.al. 2404.01794 translate read null
2024-04-02 Unifying Qualitative and Quantitative Safety Verification of DNN-Controlled Systems Dapeng Zhi et.al. 2404.01769 translate read null
2024-04-02 Asymptotics of Language Model Alignment Joy Qiping Yang et.al. 2404.01730 translate read null
