Reinforcement Learning - 2024-03

Publish Date	Title	Authors	PDF	Translate	Read	Code
2024-03-29	Learning Visual Quadrupedal Loco-Manipulation from Demonstrations	Zhengmao He et.al.	2403.20328	translate	read	null
2024-03-29	Active flow control of a turbulent separation bubble through deep reinforcement learning	Bernat Font et.al.	2403.20295	translate	read	null
2024-03-29	Functional Bilevel Optimization for Machine Learning	Ieva Petrulionyte et.al.	2403.20233	translate	read	null
2024-03-29	Decentralized Multimedia Data Sharing in IoV: A Learning-based Equilibrium of Supply and Demand	Jiani Fan et.al.	2403.20218	translate	read	null
2024-03-29	Biologically-Plausible Topology Improved Spiking Actor Network for Efficient Deep Reinforcement Learning	Duzhen Zhang et.al.	2403.20163	translate	read	null
2024-03-29	CAESAR: Enhancing Federated RL in Heterogeneous MDPs through Convergence-Aware Sampling with Screening	Hei Yi Mak et.al.	2403.20156	translate	read	null
2024-03-29	A Learning-based Incentive Mechanism for Mobile AIGC Service in Decentralized Internet of Vehicles	Jiani Fan et.al.	2403.20151	translate	read	null
2024-03-29	Mol-AIR: Molecular Reinforcement Learning with Adaptive Intrinsic Rewards for Goal-directed Molecular Generation	Jinyeong Park et.al.	2403.20109	translate	read	link
2024-03-29	Reinforcement learning for graph theory, II. Small Ramsey numbers	Mohammad Ghebleh et.al.	2403.20055	translate	read	null
2024-03-29	Nonparametric Bellman Mappings for Reinforcement Learning: Application to Robust Adaptive Filtering	Yuki Akiyama et.al.	2403.20020	translate	read	null
2024-03-28	Human-compatible driving partners through data-regularized self-play reinforcement learning	Daphne Cornelisse et.al.	2403.19648	translate	read	link
2024-03-28	Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics	Norman Di Palo et.al.	2403.19578	translate	read	null
2024-03-28	Jointly Training and Pruning CNNs via Learnable Agent Guidance and Alignment	Alireza Ganjdanesh et.al.	2403.19490	translate	read	null
2024-03-28	Offline Imitation Learning from Multiple Baselines with Applications to Compiler Optimization	Teodor V. Marinov et.al.	2403.19462	translate	read	null
2024-03-28	RiEMann: Near Real-Time SE(3)-Equivariant Robot Manipulation without Point Cloud Segmentation	Chongkai Gao et.al.	2403.19460	translate	read	null
2024-03-28	EDA-Driven Preprocessing for SAT Solving	Zhengyuan Shi et.al.	2403.19446	translate	read	null
2024-03-28	Mixed Preference Optimization: Reinforcement Learning with Data Selection and Better Reference Model	Qi Gou et.al.	2403.19443	translate	read	null
2024-03-28	Fine-Tuning Language Models with Reward Learning on Policy	Hao Lang et.al.	2403.19279	translate	read	link
2024-03-28	Removing the need for ground truth UWB data collection: self-supervised ranging error correction using deep reinforcement learning	Dieter Coppens et.al.	2403.19262	translate	read	null
2024-03-28	Inferring Latent Temporal Sparse Coordination Graph for Multi-Agent Reinforcement Learning	Wei Duan et.al.	2403.19253	translate	read	null
2024-03-27	Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment	Li Siyao et.al.	2403.18811	translate	read	null
2024-03-27	CaT: Constraints as Terminations for Legged Locomotion Reinforcement Learning	Elliot Chane-Sane et.al.	2403.18765	translate	read	null
2024-03-27	Probabilistic Model Checking of Stochastic Reinforcement Learning Policies	Dennis Gross et.al.	2403.18725	translate	read	null
2024-03-27	FPGA-Based Neural Thrust Controller for UAVs	Sharif Azem et.al.	2403.18703	translate	read	null
2024-03-27	Safe and Robust Reinforcement-Learning: Principles and Practice	Taku Yamagata et.al.	2403.18539	translate	read	null
2024-03-27	Bridging the Gap: Regularized Reinforcement Learning for Improved Classical Motion Planning with Safety Modules	Elias Goldsztejn et.al.	2403.18524	translate	read	null
2024-03-27	VersaT2I: Improving Text-to-Image Models with Versatile Reward	Jianshu Guo et.al.	2403.18493	translate	read	null
2024-03-27	Scaling Vision-and-Language Navigation With Offline RL	Valay Bundele et.al.	2403.18454	translate	read	null
2024-03-27	FRESCO: Federated Reinforcement Energy System for Cooperative Optimization	Nicolas Mauricio Cuadrado et.al.	2403.18444	translate	read	null
2024-03-27	Reinforcement learning for graph theory, I. Reimplementation of Wagner’s approach	Salem Al-Yakoob et.al.	2403.18429	translate	read	null
2024-03-26	TractOracle: towards an anatomically-informed reward function for RL-based tractography	Antoine Théberge et.al.	2403.17845	translate	read	null
2024-03-26	Learning the Optimal Power Flow: Environment Design Matters	Thomas Wolgast et.al.	2403.17831	translate	read	link
2024-03-26	Depending on yourself when you should: Mentoring LLM with RL agents to become the master in cybersecurity games	Yikuan Yan et.al.	2403.17674	translate	read	null
2024-03-26	Learning Goal-Directed Object Pushing in Cluttered Scenes with Location-Based Attention	Nils Dengler et.al.	2403.17667	translate	read	null
2024-03-26	Uncertainty-aware Distributional Offline Reinforcement Learning	Xiaocong Chen et.al.	2403.17646	translate	read	null
2024-03-26	PeersimGym: An Environment for Solving the Task Offloading Problem with Reinforcement Learning	Frederico Metelo et.al.	2403.17637	translate	read	null
2024-03-26	Retentive Decision Transformer with Adaptive Masking for Reinforcement Learning based Recommendation Systems	Siyu Wang et.al.	2403.17634	translate	read	null
2024-03-26	LASIL: Learner-Aware Supervised Imitation Learning For Long-term Microscopic Traffic Simulation	Ke Guo et.al.	2403.17601	translate	read	link
2024-03-26	Towards a Zero-Data, Controllable, Adaptive Dialog System	Dirk Väth et.al.	2403.17582	translate	read	null
2024-03-26	VDSC: Enhancing Exploration Timing with Value Discrepancy and State Counts	Marius Captari et.al.	2403.17542	translate	read	null
2024-03-25	An LLM-Based Digital Twin for Optimizing Human-in-the Loop Systems	Hanqing Yang et.al.	2403.16809	translate	read	null
2024-03-25	Enhancing Software Effort Estimation through Reinforcement Learning-based Project Management-Oriented Feature Selection	Haoyang Chen et.al.	2403.16749	translate	read	null
2024-03-25	Deep Reinforcement Learning and Mean-Variance Strategies for Responsible Portfolio Optimization	Fernando Acero et.al.	2403.16667	translate	read	null
2024-03-25	Skill Q-Network: Learning Adaptive Skill Ensemble for Mapless Navigation in Unknown Environments	Hyunki Seong et.al.	2403.16664	translate	read	null
2024-03-25	Trajectory Planning of Robotic Manipulator in Dynamic Environment Exploiting DRL	Osama Ahmad et.al.	2403.16652	translate	read	null
2024-03-25	CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment	Feiteng Fang et.al.	2403.16649	translate	read	link
2024-03-25	Counter-example guided Imitation Learning of Feedback Controllers from Temporal Logic Specifications	Thao Dang et.al.	2403.16593	translate	read	null
2024-03-25	Arm-Constrained Curriculum Learning for Loco-Manipulation of the Wheel-Legged Robot	Zifan Wang et.al.	2403.16535	translate	read	link
2024-03-25	Towards Cooperative Maneuver Planning in Mixed Traffic at Urban Intersections	Marvin Klimke et.al.	2403.16478	translate	read	null
2024-03-25	If CLIP Could Talk: Understanding Vision-Language Model Representations Through Their Preferred Concept Descriptions	Reza Esfandiarpoor et.al.	2403.16442	translate	read	link
2024-03-25	Physics-informed RL for Maximal Safety Probability Estimation	Hikaru Hoshino et.al.	2403.16391	translate	read	null
2024-03-25	Learning Action-based Representations Using Invariance	Max Rudolph et.al.	2403.16369	translate	read	null
2024-03-22	Can large language models explore in-context?	Akshay Krishnamurthy et.al.	2403.15371	translate	read	null
2024-03-22	Planning with a Learned Policy Basis to Optimally Solve Complex Tasks	Guillermo Infante et.al.	2403.15301	translate	read	null
2024-03-22	Blockchain-based Pseudonym Management for Vehicle Twin Migrations in Vehicular Edge Metaverse	Jiawen Kang et.al.	2403.15285	translate	read	null
2024-03-22	Parametric PDE Control with Deep Reinforcement Learning and Differentiable L0-Sparse Polynomial Policies	Nicolò Botteghi et.al.	2403.15267	translate	read	null
2024-03-22	Self-Improvement for Neural Combinatorial Optimization: Sample without Replacement, but Improvement	Jonathan Pirnay et.al.	2403.15180	translate	read	null
2024-03-22	Subequivariant Reinforcement Learning Framework for Coordinated Motion Control	Haoyu Wang et.al.	2403.15100	translate	read	null
2024-03-22	Improved Long Short-Term Memory-based Wastewater Treatment Simulators for Deep Reinforcement Learning	Esmaeel Mohammadi et.al.	2403.15091	translate	read	null
2024-03-22	Automated Feature Selection for Inverse Reinforcement Learning	Daulet Baimukashev et.al.	2403.15079	translate	read	null
2024-03-22	Testing for Fault Diversity in Reinforcement Learning	Quentin Mazouni et.al.	2403.15065	translate	read	null
2024-03-22	Evidence-Driven Retrieval Augmented Response Generation for Online Misinformation	Zhenrui Yue et.al.	2403.14952	translate	read	null
2024-03-21	Rethinking Adversarial Inverse Reinforcement Learning: From the Angles of Policy Imitation and Transferable Reward Recovery	Yangchun Zhang et.al.	2403.14593	translate	read	null
2024-03-21	A Mathematical Introduction to Deep Reinforcement Learning for 5G/6G Applications	Farhad Rezazadeh et.al.	2403.14516	translate	read	null
2024-03-21	Constrained Reinforcement Learning with Smoothed Log Barrier Function	Baohe Zhang et.al.	2403.14508	translate	read	null
2024-03-21	On the continuity and smoothness of the value function in reinforcement learning and optimal control	Hans Harder et.al.	2403.14432	translate	read	null
2024-03-21	Emergent communication and learning pressures in language models: a language evolution perspective	Lukas Galke et.al.	2403.14427	translate	read	null
2024-03-21	Task-optimal data-driven surrogate models for eNMPC via differentiable simulation and optimization	Daniel Mayfrank et.al.	2403.14425	translate	read	null
2024-03-21	A reinforcement learning guided hybrid evolutionary algorithm for the latency location routing problem	Yuji Zou et.al.	2403.14405	translate	read	link
2024-03-21	Distilling Reinforcement Learning Policies for Interpretable Robot Locomotion: Gradient Boosting Machines and Symbolic Regression	Fernando Acero et.al.	2403.14328	translate	read	null
2024-03-21	Bayesian Optimization for Sample-Efficient Policy Improvement in Robotic Manipulation	Adrian Röfer et.al.	2403.14305	translate	read	null
2024-03-21	Reactor Optimization Benchmark by Reinforcement Learning	Deborah Schwarcz et.al.	2403.14273	translate	read	link
2024-03-20	Information-Theoretic Distillation for Reference-less Summarization	Jaehun Jung et.al.	2403.13780	translate	read	null
2024-03-20	Towards Principled Representation Learning from Videos for Reinforcement Learning	Dipendra Misra et.al.	2403.13765	translate	read	null
2024-03-20	Reinforcement Learning for Online Testing of Autonomous Driving Systems: a Replication and Extension Study	Luca Giamattei et.al.	2403.13729	translate	read	null
2024-03-20	Reward-Driven Automated Curriculum Learning for Interaction-Aware Self-Driving at Unsignalized Intersections	Zengqi Peng et.al.	2403.13674	translate	read	null
2024-03-20	Multi-agent Reinforcement Traffic Signal Control based on Interpretable Influence Mechanism and Biased ReLU Approximation	Zhiyue Luo et.al.	2403.13639	translate	read	null
2024-03-20	Dynamic Reward Adjustment in Multi-Reward Reinforcement Learning for Counselor Reflection Generation	Do June Min et.al.	2403.13578	translate	read	link
2024-03-20	GeRM: A Generalist Robotic Model with Mixture-of-experts for Quadruped Robot	Wenxuan Song et.al.	2403.13358	translate	read	null
2024-03-20	Waypoint-Based Reinforcement Learning for Robot Manipulation Tasks	Shaunak A. Mehta et.al.	2403.13281	translate	read	null
2024-03-20	Federated reinforcement learning for robot motion planning with zero-shot generalization	Zhenyuan Yuan et.al.	2403.13245	translate	read	null
2024-03-20	Graph Attention Network-based Block Propagation with Optimal AoI and Reputation in Web 3.0	Jiana Liao et.al.	2403.13237	translate	read	null
2024-03-19	Sample Complexity of Offline Distributionally Robust Linear Markov Decision Processes	He Wang et.al.	2403.12946	translate	read	null
2024-03-19	Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers	Vidhi Jain et.al.	2403.12943	translate	read	null
2024-03-19	Adaptive Visual Imitation Learning for Robotic Assisted Feeding Across Varied Bowl Configurations and Food Types	Rui Liu et.al.	2403.12891	translate	read	null
2024-03-19	HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning	Fucai Ke et.al.	2403.12884	translate	read	null
2024-03-19	Equivariant Ensembles and Regularization for Reinforcement Learning in Map-based Path Planning	Mirco Theile et.al.	2403.12856	translate	read	null
2024-03-19	Policy Bifurcation in Safe Reinforcement Learning	Wenjun Zou et.al.	2403.12847	translate	read	link
2024-03-19	AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents	Jieming Cui et.al.	2403.12835	translate	read	null
2024-03-19	Oriented and Non-oriented Cubical Surfaces in The Penteract	Manuel Estevez et.al.	2403.12825	translate	read	null
2024-03-19	Dynamic Manipulation of Deformable Objects using Imitation Learning with Adaptation to Hardware Constraints	Eric Hannus et.al.	2403.12685	translate	read	null
2024-03-19	Automated Contrastive Learning Strategy Search for Time Series	Baoyu Jing et.al.	2403.12641	translate	read	null
2024-03-18	The Value of Reward Lookahead in Reinforcement Learning	Nadav Merlis et.al.	2403.11637	translate	read	null
2024-03-18	Offline Multitask Representation Learning for Reinforcement Learning	Haque Ishfaq et.al.	2403.11574	translate	read	null
2024-03-18	Reinforcement Learning with Token-level Feedback for Controllable Text Generation	Wendi Li et.al.	2403.11558	translate	read	null
2024-03-18	TARN-VIST: Topic Aware Reinforcement Network for Visual Storytelling	Weiran Chen et.al.	2403.11550	translate	read	null
2024-03-18	State-Separated SARSA: A Practical Sequential Decision-Making Algorithm with Recovering Rewards	Yuto Tanimoto et.al.	2403.11520	translate	read	link
2024-03-18	Demystifying Deep Reinforcement Learning-Based Autonomous Vehicle Decision-Making	Hanxi Wan et.al.	2403.11432	translate	read	null
2024-03-18	Variational Sampling of Temporal Trajectories	Jurijs Nazarovs et.al.	2403.11418	translate	read	null
2024-03-17	Independent RL for Cooperative-Competitive Agents: A Mean-Field Perspective	Muhammad Aneeq uz Zaman et.al.	2403.11345	translate	read	null
2024-03-17	Causality from Bottom to Top: A Survey	Abraham Itzhak Weinberg et.al.	2403.11219	translate	read	null
2024-03-17	Continuous Jumping of a Parallel Wire-Driven Monopedal Robot RAMIEL Using Reinforcement Learning	Kento Kawaharazuka et.al.	2403.11205	translate	read	null
2024-03-14	Minimax Optimal and Computationally Efficient Algorithms for Distributionally Robust Offline Reinforcement Learning	Zhishuai Liu et.al.	2403.09621	translate	read	null
2024-03-14	ExploRLLM: Guiding Exploration in Reinforcement Learning with Large Language Models	Runyu Ma et.al.	2403.09583	translate	read	null
2024-03-14	A Reinforcement Learning Approach to Dairy Farm Battery Management using Q Learning	Nawazish Ali et.al.	2403.09499	translate	read	null
2024-03-14	Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision	Zhiqing Sun et.al.	2403.09472	translate	read	link
2024-03-14	A Deep Reinforcement Learning Approach for Autonomous Reconfigurable Intelligent Surfaces	Hyuckjin Choi et.al.	2403.09270	translate	read	null
2024-03-14	Leveraging Constraint Programming in a Deep Learning Approach for Dynamically Solving the Flexible Job-Shop Scheduling Problem	Imanol Echeverria et.al.	2403.09249	translate	read	null
2024-03-14	Rumor Mitigation in Social Media Platforms with Deep Reinforcement Learning	Hongyuan Su et.al.	2403.09217	translate	read	null
2024-03-14	MetroGNN: Metro Network Expansion with Reinforcement Learning	Hongyuan Su et.al.	2403.09197	translate	read	null
2024-03-14	SINDy-RL: Interpretable and Efficient Model-Based Reinforcement Learning	Nicholas Zolman et.al.	2403.09110	translate	read	link
2024-03-14	CodeUltraFeedback: An LLM-as-a-Judge Dataset for Aligning Large Language Models to Coding Preferences	Martin Weyssow et.al.	2403.09032	translate	read	link
2024-03-13	TeaMs-RL: Teaching LLMs to Teach Themselves Better Instructions via Reinforcement Learning	Shangding Gu et.al.	2403.08694	translate	read	null
2024-03-13	Digital Twin-assisted Reinforcement Learning for Resource-aware Microservice Offloading in Edge Computing	Xiangchun Chen et.al.	2403.08687	translate	read	null
2024-03-13	Meta Reinforcement Learning for Resource Allocation in Aerial Active-RIS-assisted Networks with Rate-Splitting Multiple Access	Sajad Faramarzi et.al.	2403.08648	translate	read	null
2024-03-13	Human Alignment of Large Language Models through Online Preference Optimisation	Daniele Calandriello et.al.	2403.08635	translate	read	null
2024-03-13	Specification Overfitting in Artificial Intelligence	Benjamin Roth et.al.	2403.08425	translate	read	null
2024-03-13	Optimizing Risk-averse Human-AI Hybrid Teams	Andrew Fuchs et.al.	2403.08386	translate	read	null
2024-03-13	Learning to Describe for Predicting Zero-shot Drug-Drug Interactions	Fangqi Zhu et.al.	2403.08377	translate	read	link
2024-03-13	LLM-Assisted Light: Leveraging Large Language Model Capabilities for Human-Mimetic Traffic Signal Control in Complex Urban Environments	Maonan Wang et.al.	2403.08337	translate	read	link
2024-03-14	HRLAIF: Improvements in Helpfulness and Harmlessness in Open-domain Reinforcement Learning From AI Feedback	Ang Li et.al.	2403.08309	translate	read	null
2024-03-13	SpaceOctopus: An Octopus-inspired Motion Planning Framework for Multi-arm Space Robot	Wenbo Zhao et.al.	2403.08219	translate	read	null
2024-03-12	TeleMoMa: A Modular and Versatile Teleoperation System for Mobile Manipulation	Shivin Dass et.al.	2403.07869	translate	read	null
2024-03-12	Exploring Safety Generalization Challenges of Large Language Models via Code	Qibing Ren et.al.	2403.07865	translate	read	null
2024-03-12	DexCap: Scalable and Portable Mocap Data Collection System for Dexterous Manipulation	Chen Wang et.al.	2403.07788	translate	read	null
2024-03-12	Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards	Wei Shen et.al.	2403.07708	translate	read	null
2024-03-12	Symmetric Q-learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning	Motoki Omura et.al.	2403.07704	translate	read	null
2024-03-12	Optimizing Negative Prompts for Enhanced Aesthetics and Fidelity in Text-To-Image Generation	Michael Ogezi et.al.	2403.07605	translate	read	null
2024-03-12	An Improved Strategy for Blood Glucose Control Using Multi-Step Deep Reinforcement Learning	Weiwei Gu et.al.	2403.07566	translate	read	null
2024-03-12	Ensembling Prioritized Hybrid Policies for Multi-agent Pathfinding	Huijie Tang et.al.	2403.07559	translate	read	link
2024-03-12	Constrained Optimal Fuel Consumption of HEV: A Constrained Reinforcement Learning Approach	Shuchang Yan et.al.	2403.07503	translate	read	null
2024-03-12	Optimization of Pressure Management Strategies for Geological CO2 Sequestration Using Surrogate Model-based Reinforcement Learning	Jungang Chen et.al.	2403.07360	translate	read	null
2024-03-11	Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts	Onur Celik et.al.	2403.06966	translate	read	null
2024-03-11	Unveiling the Significance of Toddler-Inspired Reward Transition in Goal-Oriented Reinforcement Learning	Junseok Park et.al.	2403.06880	translate	read	null
2024-03-11	Quantifying the Sensitivity of Inverse Reinforcement Learning to Misspecification	Joar Skalse et.al.	2403.06854	translate	read	null
2024-03-11	In-context Exploration-Exploitation for Reinforcement Learning	Zhenwen Dai et.al.	2403.06826	translate	read	null
2024-03-11	ε-Neural Thompson Sampling of Deep Brain Stimulation for Parkinson Disease Treatment	Hao-Lun Hsu et.al.	2403.06814	translate	read	null
2024-03-11	From Factor Models to Deep Learning: Machine Learning in Reshaping Empirical Asset Pricing	Junyi Ye et.al.	2403.06779	translate	read	null
2024-03-11	ALaRM: Align Language Models via Hierarchical Rewards Modeling	Yuhang Lai et.al.	2403.06754	translate	read	null
2024-03-11	Generalising Multi-Agent Cooperation through Task-Agnostic Communication	Dulhan Jayalath et.al.	2403.06750	translate	read	link
2024-03-11	Enhancing Image Caption Generation Using Reinforcement Learning with Human Feedback	Adarsh N L et.al.	2403.06735	translate	read	null
2024-03-11	Large Model driven Radiology Report Generation with Clinical Quality Reinforcement Learning	Zijian Zhou et.al.	2403.06728	translate	read	null
2024-03-08	Will GPT-4 Run DOOM?	Adrian de Wynter et.al.	2403.05468	translate	read	null
2024-03-08	Switching the Loss Reduces the Cost in Batch Reinforcement Learning	Alex Ayoub et.al.	2403.05385	translate	read	null
2024-03-08	Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation	Xiaoying Zhang et.al.	2403.05171	translate	read	null
2024-03-08	Inverse Design of Photonic Crystal Surface Emitting Lasers is a Sequence Modeling Problem	Ceyao Zhang et.al.	2403.05149	translate	read	null
2024-03-08	ChatUIE: Exploring Chat-based Unified Information Extraction using Large Language Models	Jun Xu et.al.	2403.05132	translate	read	null
2024-03-08	RLPeri: Accelerating Visual Perimetry Test with Reinforcement Learning and Convolutional Feature Extraction	Tanvi Verma et.al.	2403.05112	translate	read	null
2024-03-08	Efficient Data Collection for Robotic Manipulation via Compositional Generalization	Jensen Gao et.al.	2403.05110	translate	read	null
2024-03-08	Simulating Battery-Powered TinyML Systems Optimised using Reinforcement Learning in Image-Based Anomaly Detection	Jared M. Ping et.al.	2403.05106	translate	read	null
2024-03-08	Reset & Distill: A Recipe for Overcoming Negative Transfer in Continual Reinforcement Learning	Hongjoon Ahn et.al.	2403.05066	translate	read	null
2024-03-08	Aligning Large Language Models for Controllable Recommendations	Wensheng Lu et.al.	2403.05063	translate	read	null
2024-03-07	Teaching Large Language Models to Reason with Reinforcement Learning	Alex Havrilla et.al.	2403.04642	translate	read	null
2024-03-07	Zero-shot cross-modal transfer of Reinforcement Learning policies through a Global Workspace	Léopold Maytié et.al.	2403.04588	translate	read	null
2024-03-07	Learning Agility Adaptation for Flight in Clutter	Guangyu Zhao et.al.	2403.04586	translate	read	null
2024-03-07	Improved Algorithm for Adversarial Linear Mixture MDPs with Bandit Feedback and Unknown Transition	Long-Fei Li et.al.	2403.04568	translate	read	null
2024-03-07	Vlearn: Off-Policy Learning with Efficient State-Value Function Estimation	Fabian Otto et.al.	2403.04453	translate	read	null
2024-03-07	Learning Human-to-Humanoid Real-Time Whole-Body Teleoperation	Tairan He et.al.	2403.04436	translate	read	null
2024-03-07	iTRPL: An Intelligent and Trusted RPL Protocol based on Multi-Agent Reinforcement Learning	Debasmita Dey et.al.	2403.04416	translate	read	null
2024-03-07	Model-free $H_{\infty}$ control of Itô stochastic system via off-policy reinforcement learning	Jing Guo et.al.	2403.04412	translate	read	null
2024-03-07	Model-Free Load Frequency Control of Nonlinear Power Systems Based on Deep Reinforcement Learning	Xiaodi Chen et.al.	2403.04374	translate	read	null
2024-03-07	Symmetry Considerations for Learning Task Symmetric Robot Policies	Mayank Mittal et.al.	2403.04359	translate	read	null
2024-03-06	3D Diffusion Policy	Yanjie Ze et.al.	2403.03954	translate	read	link
2024-03-06	Stop Regressing: Training Value Functions via Classification for Scalable Deep RL	Jesse Farebrother et.al.	2403.03950	translate	read	null
2024-03-06	Reconciling Reality through Simulation: A Real-to-Sim-to-Real Approach for Robust Manipulation	Marcel Torne et.al.	2403.03949	translate	read	null
2024-03-06	Dexterous Legged Locomotion in Confined 3D Spaces with Reinforcement Learning	Zifan Xu et.al.	2403.03848	translate	read	null
2024-03-06	A Survey on Applications of Reinforcement Learning in Spatial Resource Allocation	Di Zhang et.al.	2403.03643	translate	read	null
2024-03-06	Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem	Yuhong Sun et.al.	2403.03558	translate	read	link
2024-03-06	Population-aware Online Mirror Descent for Mean-Field Games by Deep Reinforcement Learning	Zida Wu et.al.	2403.03552	translate	read	null
2024-03-05	RACE-SM: Reinforcement Learning Based Autonomous Control for Social On-Ramp Merging	Jordan Poots et.al.	2403.03359	translate	read	null
2024-03-05	Bi-KVIL: Keypoints-based Visual Imitation Learning of Bimanual Manipulation Tasks	Jianfeng Gao et.al.	2403.03270	translate	read	null
2024-03-05	Reaching Consensus in Cooperative Multi-Agent Reinforcement Learning with Goal Imagination	Liangzhou Wang et.al.	2403.03172	translate	read	null
2024-03-05	Leveraging Federated Learning and Edge Computing for Recommendation Systems within Cloud Computing Networks	Yaqian Qi et.al.	2403.03165	translate	read	null
2024-03-05	Language Guided Exploration for RL Agents in Text Environments	Hitesh Golchha et.al.	2403.03141	translate	read	null
2024-03-05	SplAgger: Split Aggregation for Meta-Reinforcement Learning	Jacob Beck et.al.	2403.03020	translate	read	null
2024-03-05	Autonomous vehicle decision and control through reinforcement learning with traffic flow randomization	Yuan Lin et.al.	2403.02882	translate	read	null
2024-03-05	SpaceHopper: A Small-Scale Legged Robot for Exploring Low-Gravity Celestial Bodies	Alexander Spiridonov et.al.	2403.02831	translate	read	null
2024-03-05	A Zero-Shot Reinforcement Learning Strategy for Autonomous Guidewire Navigation	Valentina Scarponi et.al.	2403.02777	translate	read	null
2024-03-05	RT-Sketch: Goal-Conditioned Imitation Learning from Hand-Drawn Sketches	Priya Sundaresan et.al.	2403.02709	translate	read	null
2024-03-05	Fighting Game Adaptive Background Music for Improved Gameplay	Ibrahim Khan et.al.	2403.02701	translate	read	null
2024-03-05	PPS-QMIX: Periodically Parameter Sharing for Accelerating Convergence of Multi-Agent Reinforcement Learning	Ke Zhang et.al.	2403.02635	translate	read	null
2024-03-02	Improving the Validity of Automatically Generated Feedback via Reinforcement Learning	Alexander Scarlatos et.al.	2403.01304	translate	read	link
2024-03-02	Automatic Speech Recognition using Advanced Deep Learning Approaches: A survey	Hamza Kheddar et.al.	2403.01255	translate	read	null
2024-03-02	Balancing Exploration and Exploitation in LLM using Soft RLLF for Enhanced Negation Understanding	Ha-Thanh Nguyen et.al.	2403.01185	translate	read	null
2024-03-02	Efficient Episodic Memory Utilization of Cooperative Multi-Agent Reinforcement Learning	Hyungho Na et.al.	2403.01112	translate	read	null
2024-03-02	Continuous Mean-Zero Disagreement-Regularized Imitation Learning (CMZ-DRIL)	Noah Ford et.al.	2403.01059	translate	read	null
2024-03-01	A Holistic Power Optimization Approach for Microgrid Control Based on Deep Reinforcement Learning	Fulong Yao et.al.	2403.01013	translate	read	null
2024-03-01	Policy Optimization for PDE Control with a Warm Start	Xiangyuan Zhang et.al.	2403.01005	translate	read	null
2024-03-01	On the Role of Information Structure in Reinforcement Learning for Partially-Observable Sequential Teams and Games	Awni Altabaa et.al.	2403.00993	translate	read	null
2024-03-01	SELFI: Autonomous Self-Improvement with Reinforcement Learning for Social Navigation	Noriaki Hirose et.al.	2403.00991	translate	read	null
2024-03-01	Scale-free Adversarial Reinforcement Learning	Mingyu Chen et.al.	2403.00930	translate	read	null
