Reinforcement Learning - 2024-04

Publish Date	Title	Authors	PDF	Translate	Read	Code
2024-04-30	Collaborative Control Method of Transit Signal Priority Based on Cooperative Game and Reinforcement Learning	Hao Qin et.al.	2404.19683	translate	read	null
2024-04-30	Towards Generalist Robot Learning from Internet Video: A Survey	Robert McCarthy et.al.	2404.19664	translate	read	null
2024-04-30	Short term vs. long term: optimization of microswimmer navigation on different time horizons	Navid Mousavi et.al.	2404.19561	translate	read	null
2024-04-30	Continual Model-based Reinforcement Learning for Data Efficient Wireless Network Optimisation	Cengis Hasan et.al.	2404.19462	translate	read	null
2024-04-30	Imitation Learning: A Survey of Learning Methods, Environments and Metrics	Nathan Gavenski et.al.	2404.19456	translate	read	null
2024-04-30	Countering Reward Over-optimization in LLM with Demonstration-Guided Reinforcement Learning	Mathieu Rita et.al.	2404.19409	translate	read	link
2024-04-30	Numeric Reward Machines	Kristina Levina et.al.	2404.19370	translate	read	null
2024-04-30	Pessimistic Value Iteration for Multi-Task Data Sharing in Offline Reinforcement Learning	Chenjia Bai et.al.	2404.19346	translate	read	link
2024-04-30	Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning	Qiaosheng Zhang et.al.	2404.19292	translate	read	null
2024-04-30	DiffuseLoco: Real-Time Legged Locomotion Control with Diffusion from Offline Datasets	Xiaoyu Huang et.al.	2404.19264	translate	read	null
2024-04-29	DPO Meets PPO: Reinforced Token Optimization for RLHF	Han Zhong et.al.	2404.18922	translate	read	null
2024-04-29	Sample-Efficient Robust Multi-Agent Reinforcement Learning in the Face of Environmental Uncertainty	Laixi Shi et.al.	2404.18909	translate	read	null
2024-04-29	Overcoming Knowledge Barriers: Online Imitation Learning from Observation with Pretrained World Models	Xingyuan Zhang et.al.	2404.18896	translate	read	null
2024-04-29	More RLHF, More Trust? On The Impact of Human Preference Alignment On Language Model Trustworthiness	Aaron J. Li et.al.	2404.18870	translate	read	link
2024-04-29	Performance-Aligned LLMs for Generating Fast Code	Daniel Nichols et.al.	2404.18864	translate	read	null
2024-04-29	PlanNetX: Learning an Efficient Neural Network Planner from MPC for Longitudinal Control	Jasper Hoffmann et.al.	2404.18863	translate	read	null
2024-04-30	Winning the Social Media Influence Battle: Uncertainty-Aware Opinions to Understand and Spread True Information via Competitive Influence Maximization	Qi Zhang et.al.	2404.18826	translate	read	null
2024-04-29	Control Policy Correction Framework for Reinforcement Learning-based Energy Arbitrage Strategies	Seyed Soroush Karimi Madahi et.al.	2404.18821	translate	read	null
2024-04-29	Multi-Agent Synchronization Tasks	Rolando Fernandez et.al.	2404.18798	translate	read	null
2024-04-29	Resource-rational reinforcement learning and sensorimotor causal states	Sarah Marzen et.al.	2404.18775	translate	read	null
2024-04-26	Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo	Stephen Zhao et.al.	2404.17546	translate	read	null
2024-04-26	Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations	Puhao Li et.al.	2404.17521	translate	read	link
2024-04-26	Quantum Multi-Agent Reinforcement Learning for Aerial Ad-hoc Networks	Theodora-Augustina Drăgan et.al.	2404.17499	translate	read	null
2024-04-26	Q-Learning to navigate turbulence without a map	Marco Rando et.al.	2404.17495	translate	read	null
2024-04-26	Adaptive speed planning for Unmanned Vehicle Based on Deep Reinforcement Learning	Hao Liu et.al.	2404.17379	translate	read	null
2024-04-26	When to Trust LLMs: Aligning Confidence with Response Quality	Shuchang Tao et.al.	2404.17287	translate	read	null
2024-04-26	Enhancing Privacy and Security of Autonomous UAV Navigation	Vatsal Aggarwal et.al.	2404.17225	translate	read	null
2024-04-26	Beyond Imitation: A Life-long Policy Learning Framework for Path Tracking Control of Autonomous Driving	C. Gong et.al.	2404.17198	translate	read	null
2024-04-26	An Explainable Deep Reinforcement Learning Model for Warfarin Maintenance Dosing Using Policy Distillation and Action Forging	Sadjad Anzabi Zadeh et.al.	2404.17187	translate	read	null
2024-04-25	Compiler for Distributed Quantum Computing: a Reinforcement Learning Approach	Panagiotis Promponas et.al.	2404.17077	translate	read	null
2024-04-25	REBEL: Reinforcement Learning via Regressing Relative Rewards	Zhaolin Gao et.al.	2404.16767	translate	read	null
2024-04-25	Distilling Privileged Information for Dubins Traveling Salesman Problems with Neighborhoods	Min Kyu Shin et.al.	2404.16721	translate	read	null
2024-04-25	RUMOR: Reinforcement learning for Understanding a Model of the Real World for Navigation in Dynamic Environments	Diego Martinez-Baselga et.al.	2404.16672	translate	read	null
2024-04-25	Hippocrates: An Open-Source Framework for Advancing Large Language Models in Healthcare	Emre Can Acikgoz et.al.	2404.16621	translate	read	null
2024-04-25	Exploring the Dynamics of Data Transmission in 5G Networks: A Conceptual Analysis	Nikita Smirnov et.al.	2404.16508	translate	read	null
2024-04-25	Leveraging Pretrained Latent Representations for Few-Shot Imitation Learning on a Dexterous Robotic Hand	Davide Liconti et.al.	2404.16483	translate	read	null
2024-04-25	A Dual Perspective of Reinforcement Learning for Imposing Policy Constraints	Bram De Cooman et.al.	2404.16468	translate	read	null
2024-04-25	Offline Reinforcement Learning with Behavioral Supervisor Tuning	Padmanaba Srinivasan et.al.	2404.16399	translate	read	null
2024-04-25	SwarmRL: Building the Future of Smart Active Systems	Samuel Tovey et.al.	2404.16388	translate	read	link
2024-04-25	Reinforcement Learning with Generative Models for Compact Support Sets	Nico Schiavone et.al.	2404.16300	translate	read	link
2024-04-24	DPO: Differential reinforcement learning with application to optimal configuration search	Chandrajit Bajaj et.al.	2404.15617	translate	read	null
2024-04-24	GRSN: Gated Recurrent Spiking Neurons for POMDPs and MARL	Lang Qin et.al.	2404.15597	translate	read	null
2024-04-24	Multi-Agent Reinforcement Learning for Energy Networks: Computational Challenges, Progress and Open Problems	Sarah Keren et.al.	2404.15583	translate	read	null
2024-04-23	An MRP Formulation for Supervised Learning: Generalized Temporal Difference Learning Models	Yangchen Pan et.al.	2404.15518	translate	read	null
2024-04-23	The Power of Resets in Online Reinforcement Learning	Zakaria Mhammedi et.al.	2404.15417	translate	read	null
2024-04-23	Planning the path with Reinforcement Learning: Optimal Robot Motion Planning in RoboCup Small Size League Environments	Mateus G. Machado et.al.	2404.15410	translate	read	link
2024-04-23	Reinforcement Learning with Adaptive Control Regularization for Safe Control of Critical Systems	Haozhe Tian et.al.	2404.15199	translate	read	null
2024-04-23	Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation	Xun Wu et.al.	2404.15100	translate	read	null
2024-04-23	Impedance Matching: Enabling an RL-Based Running Jump in a Quadruped Robot	Neil Guan et.al.	2404.15096	translate	read	null
2024-04-23	Using deep reinforcement learning to promote sustainable human behaviour on a common pool resource problem	Raphael Koster et.al.	2404.15059	translate	read	null
2024-04-23	Cache-Aware Reinforcement Learning in Large-Scale Recommender Systems	Xiaoshuang Chen et.al.	2404.14961	translate	read	null
2024-04-23	Multi-Objective Deep Reinforcement Learning for 5G Base Station Placement to Support Localisation for Future Sustainable Traffic	Ahmed Al-Tahmeesschi et.al.	2404.14954	translate	read	null
2024-04-23	MultiSTOP: Solving Functional Equations with Reinforcement Learning	Alessandro Trenta et.al.	2404.14909	translate	read	null
2024-04-23	Unitary Synthesis of Clifford+T Circuits with Reinforcement Learning	Sebastian Rietsch et.al.	2404.14865	translate	read	null
2024-04-23	Evolutionary Reinforcement Learning via Cooperative Coevolution	Chengpeng Hu et.al.	2404.14763	translate	read	null
2024-04-23	Rank2Reward: Learning Shaped Reward Functions from Passive Video	Daniel Yang et.al.	2404.14735	translate	read	null
2024-04-22	Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data	Fahim Tajwar et.al.	2404.14367	translate	read	link
2024-04-22	PLUTO: Pushing the Limit of Imitation Learning-based Planning for Autonomous Driving	Jie Cheng et.al.	2404.14327	translate	read	null
2024-04-22	Multi-Agent Hybrid SAC for Joint SS-DSA in CRNs	David R. Nickel et.al.	2404.14319	translate	read	null
2024-04-22	LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots	Dongge Han et.al.	2404.14285	translate	read	null
2024-04-22	Beyond the Edge: An Advanced Exploration of Reinforcement Learning for Mobile Edge Computing, its Applications, and Future Research Trajectories	Ning Yang et.al.	2404.14238	translate	read	null
2024-04-22	Multi-agent Reinforcement Learning-based Joint Precoding and Phase Shift Optimization for RIS-aided Cell-Free Massive MIMO Systems	Yiyang Zhu et.al.	2404.14092	translate	read	null
2024-04-22	Mechanistic Interpretability for AI Safety – A Review	Leonard Bereska et.al.	2404.14082	translate	read	null
2024-04-22	Research on Robot Path Planning Based on Reinforcement Learning	Wang Ruiqi et.al.	2404.14077	translate	read	link
2024-04-22	Multi-view Disentanglement for Reinforcement Learning with Multiple Cameras	Mhairi Dunion et.al.	2404.14064	translate	read	link
2024-04-22	A survey of air combat behavior modeling using machine learning	Patrick Ribu Gorton et.al.	2404.13954	translate	read	null
2024-04-19	Mapping Social Choice Theory to RLHF	Jessica Dai et.al.	2404.13038	translate	read	null
2024-04-19	Deep Reinforcement Learning-Based Active Flow Control of an Elliptical Cylinder: Transitioning from an Elliptical Cylinder to a Circular Cylinder and a Flat Plate	Wang Jia et.al.	2404.13003	translate	read	null
2024-04-19	Goal Exploration via Adaptive Skill Distribution for Goal-Conditioned Reinforcement Learning	Lisheng Wu et.al.	2404.12999	translate	read	null
2024-04-19	MM-PhyRLHF: Reinforcement Learning Framework for Multimodal Physics Question-Answering	Avinash Anand et.al.	2404.12926	translate	read	null
2024-04-19	Zero-Shot Stitching in Reinforcement Learning using Relative Representations	Antonio Pio Ricciardi et.al.	2404.12917	translate	read	null
2024-04-19	MAexp: A Generic Platform for RL-based Multi-Agent Exploration	Shaohao Zhu et.al.	2404.12824	translate	read	link
2024-04-19	Adaptive Regularization of Representation Rank as an Implicit Constraint of Bellman Equation	Qiang He et.al.	2404.12754	translate	read	link
2024-04-19	Demonstration of quantum projective simulation on a single-photon-based quantum computer	Giacomo Franceschetto et.al.	2404.12729	translate	read	null
2024-04-19	Energy Conserved Failure Detection for NS-IoT Systems	Guojin Liu et.al.	2404.12713	translate	read	null
2024-04-19	Single-Task Continual Offline Reinforcement Learning	Sibo Gai et.al.	2404.12639	translate	read	null
2024-04-18	From $r$ to $Q^*$: Your Language Model is Secretly a Q-Function	Rafael Rafailov et.al.	2404.12358	translate	read	null
2024-04-18	Improving the interpretability of GNN predictions through conformal-based graph sparsification	Pablo Sanchez-Martin et.al.	2404.12356	translate	read	link
2024-04-18	Practical Considerations for Discrete-Time Implementations of Continuous-Time Control Barrier Function-Based Safety Filters	Lukas Brunke et.al.	2404.12329	translate	read	null
2024-04-18	ASID: Active Exploration for System Identification in Robotic Manipulation	Marius Memmel et.al.	2404.12308	translate	read	null
2024-04-18	RISE: 3D Perception Makes Real-World Robot Imitation Simple and Effective	Chenxi Wang et.al.	2404.12281	translate	read	null
2024-04-18	Privacy-Preserving UCB Decision Process Verification via zk-SNARKs	Xikun Jiang et.al.	2404.12186	translate	read	null
2024-04-18	Aligning language models with human preferences	Tomasz Korbak et.al.	2404.12150	translate	read	link
2024-04-19	Robust and Adaptive Deep Reinforcement Learning for Enhancing Flow Control around a Square Cylinder with Varying Reynolds Numbers	Wang Jia et.al.	2404.12123	translate	read	null
2024-04-18	X-Light: Cross-City Traffic Signal Control Using Transformer on Transformer as Meta Multi-Agent Reinforcement Learner	Haoyuan Jiang et.al.	2404.12090	translate	read	link
2024-04-18	Trajectory Planning for Autonomous Vehicle Using Iterative Reward Prediction in Reinforcement Learning	Hyunwoo Park et.al.	2404.12079	translate	read	null
2024-04-17	Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding	Zezhong Fan et.al.	2404.11589	translate	read	null
2024-04-17	Deep Policy Optimization with Temporal Logic Constraints	Ameesh Shah et.al.	2404.11578	translate	read	null
2024-04-17	Spatio-Temporal Motion Retargeting for Quadruped Robots	Taerim Yoon et.al.	2404.11557	translate	read	null
2024-04-17	VC Theory for Inventory Policies	Yaqi Xie et.al.	2404.11509	translate	read	null
2024-04-17	Learn to Tour: Operator Design For Solution Feasibility Mapping in Pickup-and-delivery Traveling Salesman Problem	Bowen Fang et.al.	2404.11458	translate	read	null
2024-04-17	What-if Analysis Framework for Digital Twins in 6G Wireless Network Management	Elif Ak et.al.	2404.11394	translate	read	null
2024-04-17	Convergence of Policy Gradient for Stochastic Linear-Quadratic Control Problem in Infinite Horizon	Xinpei Zhang et.al.	2404.11382	translate	read	null
2024-04-17	Following the Human Thread in Social Navigation	Luca Scofano et.al.	2404.11327	translate	read	link
2024-04-17	On Learning Parities with Dependent Noise	Noah Golowich et.al.	2404.11325	translate	read	null
2024-04-17	Physics-informed Actor-Critic for Coordination of Virtual Inertia from Power Distribution Systems	Simon Stock et.al.	2404.11149	translate	read	null
2024-04-16	Settling Constant Regrets in Linear Markov Decision Processes	Weitong Zhang et.al.	2404.10745	translate	read	null
2024-04-16	N-Agent Ad Hoc Teamwork	Caroline Wang et.al.	2404.10740	translate	read	null
2024-04-16	Bootstrapping Linear Models for Fast Online Adaptation in Human-Agent Collaboration	Benjamin A Newman et.al.	2404.10733	translate	read	null
2024-04-16	Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning	Hao-Lun Hsu et.al.	2404.10728	translate	read	null
2024-04-16	Automatic re-calibration of quantum devices by reinforcement learning	T. Crosta et.al.	2404.10726	translate	read	null
2024-04-16	Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study	Shusheng Xu et.al.	2404.10719	translate	read	null
2024-04-16	Simplex Decomposition for Portfolio Allocation Constraints in Reinforcement Learning	David Winkel et.al.	2404.10683	translate	read	null
2024-04-16	SCALE: Self-Correcting Visual Navigation for Mobile Robots via Anti-Novelty Estimation	Chang Chen et.al.	2404.10675	translate	read	null
2024-04-16	Continual Offline Reinforcement Learning via Diffusion-based Dual Generative Replay	Jinmei Liu et.al.	2404.10662	translate	read	link
2024-04-16	Trajectory Planning using Reinforcement Learning for Interactive Overtaking Maneuvers in Autonomous Racing Scenarios	Levent Ögretmen et.al.	2404.10658	translate	read	null
2024-04-15	Unveiling Imitation Learning: Exploring the Impact of Data Falsity to Large Language Model	Hyunsoo Cho et.al.	2404.09717	translate	read	null
2024-04-15	Higher Replay Ratio Empowers Sample-Efficient Multi-Agent Reinforcement Learning	Linjie Xu et.al.	2404.09715	translate	read	null
2024-04-15	Learn Your Reference Model for Real Good Alignment	Alexey Gorbatovski et.al.	2404.09656	translate	read	null
2024-04-15	Reliability Estimation of News Media Sources: Birds of a Feather Flock Together	Sergio Burdisso et.al.	2404.09565	translate	read	null
2024-04-15	Inferring Behavior-Specific Context Improves Zero-Shot Generalization in Reinforcement Learning	Tidiane Camaret Ndir et.al.	2404.09521	translate	read	link
2024-04-14	Correlated Mean Field Imitation Learning	Zhiyu Zhao et.al.	2404.09324	translate	read	null
2024-04-14	Egret: Reinforcement Mechanism for Sequential Computation Offloading in Edge Computing	Haosong Peng et.al.	2404.09285	translate	read	null
2024-04-14	A Reinforcement Learning Based Backfilling Strategy for HPC Batch Jobs	Elliot Kolker-Hicks et.al.	2404.09264	translate	read	null
2024-04-14	Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts	Jing-Cheng Pang et.al.	2404.09248	translate	read	null
2024-04-14	Advanced Intelligent Optimization Algorithms for Multi-Objective Optimal Power Flow in Future Power Systems: A Review	Yuyan Li et.al.	2404.09203	translate	read	null
2024-04-12	Enhancing Autonomous Vehicle Training with Language Model Integration and Critical Scenario Generation	Hanlin Tian et.al.	2404.08570	translate	read	null
2024-04-12	RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs	Shreyas Chaudhari et.al.	2404.08555	translate	read	null
2024-04-12	Advancing Forest Fire Prevention: Deep Reinforcement Learning for Effective Firebreak Placement	Lucas Murray et.al.	2404.08523	translate	read	null
2024-04-12	Adversarial Imitation Learning via Boosting	Jonathan D. Chang et.al.	2404.08513	translate	read	null
2024-04-12	Prescribing Optimal Health-Aware Operation for Urban Air Mobility with Deep Reinforcement Learning	Mina Montazeri et.al.	2404.08497	translate	read	null
2024-04-12	Dataset Reset Policy Optimization for RLHF	Jonathan D. Chang et.al.	2404.08495	translate	read	link
2024-04-12	Anti-Byzantine Attacks Enabled Vehicle Selection for Asynchronous Federated Learning in Vehicular Edge Computing	Cui Zhang et.al.	2404.08444	translate	read	null
2024-04-12	SIR-RL: Reinforcement Learning for Optimized Policy Control during Epidemiological Outbreaks in Emerging Market and Developing Economies	Maeghal Jain et.al.	2404.08423	translate	read	null
2024-04-12	TDANet: Target-Directed Attention Network For Object-Goal Visual Navigation With Zero-Shot Ability	Shiwei Lian et.al.	2404.08353	translate	read	null
2024-04-12	Agile and versatile bipedal robot tracking control through reinforcement learning	Jiayi Li et.al.	2404.08246	translate	read	null
2024-04-11	High-Dimension Human Value Representation in Large Language Models	Samuel Cahyawijaya et.al.	2404.07900	translate	read	null
2024-04-11	Data-Driven System Identification of Quadrotors Subject to Motor Delays	Jonas Eschmann et.al.	2404.07837	translate	read	null
2024-04-11	On the Sample Efficiency of Abstractions and Potential-Based Reward Shaping in Reinforcement Learning	Giuseppe Canonaco et.al.	2404.07826	translate	read	null
2024-04-11	An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization	Minshuo Chen et.al.	2404.07771	translate	read	null
2024-04-11	Differentially Private Reinforcement Learning with Self-Play	Dan Qiao et.al.	2404.07559	translate	read	null
2024-04-11	Enhancing Policy Gradient with the Polyak Step-Size Adaption	Yunxiang Li et.al.	2404.07525	translate	read	null
2024-04-11	Generative Probabilistic Planning for Optimizing Supply Chain Networks	Hyung-il Ahn et.al.	2404.07511	translate	read	null
2024-04-11	Neural Fault Injection: Generating Software Faults from Natural Language	Domenico Cotroneo et.al.	2404.07491	translate	read	null
2024-04-11	Leveraging Domain-Unlabeled Data in Offline Reinforcement Learning across Two Domains	Soichiro Nishimori et.al.	2404.07465	translate	read	null
2024-04-11	UAV-enabled Collaborative Beamforming via Multi-Agent Deep Reinforcement Learning	Saichao Liu et.al.	2404.07453	translate	read	null
2024-04-10	Reward Learning from Suboptimal Demonstrations with Applications in Surgical Electrocautery	Zohre Karimi et.al.	2404.07185	translate	read	null
2024-04-10	Adaptive behavior with stable synapses	Cristiano Capone et.al.	2404.07150	translate	read	null
2024-04-10	How Consistent are Clinicians? Evaluating the Predictability of Sepsis Disease Progression with Dynamics Models	Unnseo Park et.al.	2404.07148	translate	read	null
2024-04-10	Rethinking Out-of-Distribution Detection for Reinforcement Learning: Advancing Methods for Evaluation and Detection	Linas Nasvytis et.al.	2404.07099	translate	read	link
2024-04-10	Improving Language Model Reasoning with Self-motivated Learning	Yunlong Feng et.al.	2404.07017	translate	read	null
2024-04-10	Agent-driven Generative Semantic Communication for Remote Surveillance	Wanting Yang et.al.	2404.06997	translate	read	null
2024-04-10	Deep Reinforcement Learning for Mobile Robot Path Planning	Hao Liu et.al.	2404.06974	translate	read	null
2024-04-10	UAV-Assisted Enhanced Coverage and Capacity in Dynamic MU-mMIMO IoT Systems: A Deep Reinforcement Learning Approach	MohammadMahdi Ghadaksaz et.al.	2404.06726	translate	read	null
2024-04-10	Dual Ensemble Kalman Filter for Stochastic Optimal Control	Anant A. Joshi et.al.	2404.06696	translate	read	null
2024-04-09	Graph Reinforcement Learning for Combinatorial Optimization: A Survey and Unifying Perspective	Victor-Alexandru Darvariu et.al.	2404.06492	translate	read	null
2024-04-09	Deep Reinforcement Learning-Based Approach for a Single Vehicle Persistent Surveillance Problem with Fuel Constraints	Hritik Bana et.al.	2404.06423	translate	read	null
2024-04-09	The Power in Communication: Power Regularization of Communication for Autonomy in Cooperative Multi-Agent Reinforcement Learning	Nancirose Piazza et.al.	2404.06387	translate	read	null
2024-04-09	Policy-Guided Diffusion	Matthew Thomas Jackson et.al.	2404.06356	translate	read	link
2024-04-09	Generative Pre-Trained Transformer for Symbolic Regression Base In-Context Reinforcement Learning	Yanjie Li et.al.	2404.06330	translate	read	null
2024-04-09	Diverse Randomized Value Functions: A Provably Pessimistic Approach for Offline Reinforcement Learning	Xudong Yu et.al.	2404.06188	translate	read	null
2024-04-09	A quantum information theoretic analysis of reinforcement learning-assisted quantum architecture search	Abhishek Sadhu et.al.	2404.06174	translate	read	null
2024-04-09	Adaptable Recovery Behaviors in Robotics: A Behavior Trees and Motion Generators (BTMG) Approach for Failure Management	Faseeh Ahmad et.al.	2404.06129	translate	read	null
2024-04-09	Automatic Configuration Tuning on Cloud Database: A Survey	Limeng Zhang et.al.	2404.06043	translate	read	null
2024-04-09	Commute with Community: Enhancing Shared Travel through Social Networks	Tian Siyuan et.al.	2404.05987	translate	read	null
2024-04-08	Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer	Xinyang Gu et.al.	2404.05695	translate	read	null
2024-04-08	YaART: Yet Another ART Rendering Technology	Sergey Kastryulin et.al.	2404.05666	translate	read	null
2024-04-08	Dynamic Backtracking in GFlowNet: Enhancing Decision Steps with Reward-Dependent Adjustment Mechanisms	Shuai Guo et.al.	2404.05576	translate	read	null
2024-04-08	Optimal Flow Admission Control in Edge Computing via Safe Reinforcement Learning	A. Fox et.al.	2404.05564	translate	read	null
2024-04-08	Best-of-Venom: Attacking RLHF by Injecting Poisoned Preference Data	Tim Baumgärtner et.al.	2404.05530	translate	read	null
2024-04-08	CNN-based Game State Detection for a Foosball Table	David Hagens et.al.	2404.05357	translate	read	null
2024-04-08	Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models	Yutao Ouyang et.al.	2404.05291	translate	read	null
2024-04-08	SAFE-GIL: SAFEty Guided Imitation Learning	Yusuf Umut Ciftci et.al.	2404.05249	translate	read	null
2024-04-08	MeSA-DRL: Memory-Enhanced Deep Reinforcement Learning for Advanced Socially Aware Robot Navigation in Crowded Environments	Mannan Saeed Muhammad et.al.	2404.05203	translate	read	null
2024-04-08	Decision Transformer for Wireless Communications: A New Paradigm of Resource Management	Jie Zhang et.al.	2404.05199	translate	read	null
2024-04-05	Growing Q-Networks: Solving Continuous Control Tasks with Adaptive Control Resolution	Tim Seyde et.al.	2404.04253	translate	read	null
2024-04-05	Continual Policy Distillation of Reinforcement Learning-based Controllers for Soft Robotic In-Hand Manipulation	Lanpei Li et.al.	2404.04219	translate	read	null
2024-04-05	Enhancing IoT Intelligence: A Transformer-based Reinforcement Learning Methodology	Gaith Rjoub et.al.	2404.04205	translate	read	null
2024-04-05	Intervention-Assisted Policy Gradient Methods for Online Stochastic Queuing Network Optimization: Technical Report	Jerrod Wigmore et.al.	2404.04106	translate	read	null
2024-04-05	Dynamic Prompt Optimizing for Text-to-Image Generation	Wenyi Mo et.al.	2404.04095	translate	read	link
2024-04-05	Demonstration Guided Multi-Objective Reinforcement Learning	Junlin Lu et.al.	2404.03997	translate	read	null
2024-04-05	A proximal policy optimization based intelligent home solar management	Kode Creer et.al.	2404.03888	translate	read	null
2024-04-05	Heterogeneous Multi-Agent Reinforcement Learning for Zero-Shot Scalable Collaboration	Xudong Guo et.al.	2404.03869	translate	read	null
2024-04-04	Exploration is Harder than Prediction: Cryptographically Separating Reinforcement Learning from Supervised Learning	Noah Golowich et.al.	2404.03774	translate	read	null
2024-04-04	A Reinforcement Learning based Reset Policy for CDCL SAT Solvers	Chunxiao Li et.al.	2404.03753	translate	read	null
2024-04-04	AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent	Hanyu Lai et.al.	2404.03648	translate	read	link
2024-04-04	Sequential Recommendation for Optimizing Both Immediate Feedback and Long-term Retention	Ziru Liu et.al.	2404.03637	translate	read	link
2024-04-04	Laser Learning Environment: A new environment for coordination-critical multi-agent tasks	Yannick Molinghen et.al.	2404.03596	translate	read	link
2024-04-04	Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm	Miao Lu et.al.	2404.03578	translate	read	null
2024-04-04	Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity	Jake Varley et.al.	2404.03570	translate	read	null
2024-04-04	AdaGlimpse: Active Visual Exploration with Arbitrary Glimpse Position and Scale	Adam Pardyl et.al.	2404.03482	translate	read	link
2024-04-04	Integrating Hyperparameter Search into GramML	Hernán Ceferino Vázquez et.al.	2404.03419	translate	read	link
2024-04-04	Can Small Language Models Help Large Language Models Reason Better?: LM-Guided Chain-of-Thought	Jooyoung Lee et.al.	2404.03414	translate	read	null
2024-04-04	SENSOR: Imitate Third-Person Expert’s Behaviors via Active Sensoring	Kaichen Huang et.al.	2404.03386	translate	read	null
2024-04-04	DIDA: Denoised Imitation Learning based on Domain Adaptation	Kaichen Huang et.al.	2404.03382	translate	read	null
2024-04-03	Learning Quadrupedal Locomotion via Differentiable Simulation	Clemens Schwarke et.al.	2404.02887	translate	read	null
2024-04-03	Unsupervised Learning of Effective Actions in Robotics	Marko Zaric et.al.	2404.02728	translate	read	link
2024-04-03	Reinforcement Learning in Categorical Cybernetics	Jules Hedges et.al.	2404.02688	translate	read	null
2024-04-03	Solving a Real-World Optimization Problem Using Proximal Policy Optimization with Curriculum Learning and Reward Engineering	Abhijeet Pendyala et.al.	2404.02577	translate	read	null
2024-04-03	SliceIt! – A Dual Simulator Framework for Learning Robot Food Slicing	Cristian C. Beltran-Hernandez et.al.	2404.02569	translate	read	link
2024-04-03	Grid-Mapping Pseudo-Count Constraint for Offline Reinforcement Learning	Yi Shen et.al.	2404.02545	translate	read	link
2024-04-03	Versatile Scene-Consistent Traffic Scenario Generation as Optimization with Diffusion	Zhiyu Huang et.al.	2404.02524	translate	read	null
2024-04-03	Joint Optimization on Uplink OFDMA and MU-MIMO for IEEE 802.11ax: Deep Hierarchical Reinforcement Learning Approach	Hyeonho Noh et.al.	2404.02486	translate	read	null
2024-04-03	Deep Reinforcement Learning for Traveling Purchaser Problems	Haofeng Yuan et.al.	2404.02476	translate	read	null
2024-04-03	Electric Vehicle Routing Problem for Emergency Power Supply: Towards Telecom Base Station Relief	Daisuke Kikuta et.al.	2404.02448	translate	read	link
2024-04-02	Tuning for the Unknown: Revisiting Evaluation Strategies for Lifelong RL	Golnaz Mesbahi et.al.	2404.02113	translate	read	null
2024-04-02	Emergence of Chemotactic Strategies with Multi-Agent Reinforcement Learning	Samuel Tovey et.al.	2404.01999	translate	read	null
2024-04-02	VLRM: Vision-Language Models act as Reward Models for Image Captioning	Maksim Dzabraev et.al.	2404.01911	translate	read	null
2024-04-02	Active Exploration in Bayesian Model-based Reinforcement Learning for Robot Manipulation	Carlos Plou et.al.	2404.01867	translate	read	null
2024-04-02	Keeping Behavioral Programs Alive: Specifying and Executing Liveness Requirements	Tom Yaacov et.al.	2404.01858	translate	read	null
2024-04-02	EV2Gym: A Flexible V2G Simulator for EV Smart Charging Research and Benchmarking	Stavros Orfanoudakis et.al.	2404.01849	translate	read	null
2024-04-02	Doubly-Robust Off-Policy Evaluation with Estimated Logging Policy	Kyungbok Lee et.al.	2404.01830	translate	read	null
2024-04-02	Imitation Game: A Model-based and Imitation Learning Deep Reinforcement Learning Hybrid	Eric MSP Veith et.al.	2404.01794	translate	read	null
2024-04-02	Unifying Qualitative and Quantitative Safety Verification of DNN-Controlled Systems	Dapeng Zhi et.al.	2404.01769	translate	read	null
2024-04-02	Asymptotics of Language Model Alignment	Joy Qiping Yang et.al.	2404.01730	translate	read	null
