Reinforcement Learning - 2024-06

Publish Date	Title	Authors	PDF	Translate	Read	Code
2024-06-28	PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful Navigators	Kuo-Hao Zeng et.al.	2406.20083	translate	read	null
2024-06-28	Applying RLAIF for Code Generation with API-usage in Lightweight LLMs	Sujan Dutta et.al.	2406.20060	translate	read	null
2024-06-28	HumanVLA: Towards Vision-Language Directed Object Rearrangement by Physical Humanoid	Xinyu Xu et.al.	2406.19972	translate	read	null
2024-06-28	Perception Stitching: Zero-Shot Perception Encoder Transfer for Visuomotor Robot Policies	Pingcheng Jian et.al.	2406.19971	translate	read	null
2024-06-28	Operator World Models for Reinforcement Learning	Pietro Novelli et.al.	2406.19861	translate	read	null
2024-06-28	3D Operation of Autonomous Excavator based on Reinforcement Learning through Independent Reward for Individual Joints	Yoonkyu Yoo et.al.	2406.19848	translate	read	null
2024-06-28	Reinforcement Learning for Efficient Design and Control Co-optimisation of Energy Systems	Marine Cauz et.al.	2406.19825	translate	read	null
2024-06-28	Identifying Ordinary Differential Equations for Data-efficient Model-based Reinforcement Learning	Tobias Nagel et.al.	2406.19817	translate	read	null
2024-06-28	Fuzzy Logic Guided Reward Function Variation: An Oracle for Testing Reinforcement Learning Programs	Shiyu Zhang et.al.	2406.19812	translate	read	null
2024-06-28	Decision Transformer for IRS-Assisted Systems with Diffusion-Driven Generative Channels	Jie Zhang et.al.	2406.19769	translate	read	null
2024-06-27	Efficient World Models with Context-Aware Tokenization	Vincent Micheli et.al.	2406.19320	translate	read	link
2024-06-27	Averaging log-likelihoods in direct alignment	Nathan Grinsztajn et.al.	2406.19188	translate	read	null
2024-06-27	Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion	Yannis Flet-Berliac et.al.	2406.19185	translate	read	null
2024-06-27	Learning Pareto Set for Multi-Objective Continuous Robot Control	Tianye Shu et.al.	2406.18924	translate	read	link
2024-06-27	Autonomous Control of a Novel Closed Chain Five Bar Active Suspension via Deep Reinforcement Learning	Nishesh Singh et.al.	2406.18899	translate	read	null
2024-06-27	State and Input Constrained Output-Feedback Adaptive Optimal Control of Affine Nonlinear Systems	Tochukwu Elijah Ogri et.al.	2406.18804	translate	read	null
2024-06-26	Decentralized Semantic Traffic Control in AVs Using RL and DQN for Dynamic Roadblocks	Emanuel Figetakis et.al.	2406.18741	translate	read	null
2024-06-26	Confident Natural Policy Gradient for Local Planning in $q_π$-realizable Constrained MDPs	Tian Tian et.al.	2406.18529	translate	read	null
2024-06-26	Mental Modeling of Reinforcement Learning Agents by Language Models	Wenhao Lu et.al.	2406.18505	translate	read	null
2024-06-26	Preference Elicitation for Offline Reinforcement Learning	Alizée Pace et.al.	2406.18450	translate	read	null
2024-06-26	Mixture of Experts in a Mixture of RL settings	Timon Willi et.al.	2406.18420	translate	read	null
2024-06-26	AlphaForge: A Framework to Mine and Dynamically Combine Formulaic Alpha Factors	Hao Shi et.al.	2406.18394	translate	read	null
2024-06-26	Reinforcement Learning with Intrinsically Motivated Feedback Graph for Lost-sales Inventory Control	Zifan Liu et.al.	2406.18351	translate	read	null
2024-06-26	AI Alignment through Reinforcement Learning from Human Feedback? Contradictions and Limitations	Adam Dahlgren Lindström et.al.	2406.18346	translate	read	null
2024-06-26	Spatial-temporal Hierarchical Reinforcement Learning for Interpretable Pathology Image Super-Resolution	Wenting Chen et.al.	2406.18310	translate	read	link
2024-06-26	Combining Automated Optimisation of Hyperparameters and Reward Shape	Julian Dierkes et.al.	2406.18293	translate	read	link
2024-06-26	Weak Reward Model Transforms Generative Models into Robust Causal Event Extraction Systems	Italo Luis da Silva et.al.	2406.18245	translate	read	link
2024-06-25	EXTRACT: Efficient Policy Learning by Extracting Transferrable Robot Skills from Offline Data	Jesse Zhang et.al.	2406.17768	translate	read	null
2024-06-25	When does Self-Prediction help? Understanding Auxiliary Tasks in Reinforcement Learning	Claas Voelcker et.al.	2406.17718	translate	read	null
2024-06-25	Privacy Preserving Reinforcement Learning for Population Processes	Samuel Yang-Zhao et.al.	2406.17649	translate	read	null
2024-06-25	KANQAS: Kolmogorov Arnold Network for Quantum Architecture Search	Akash Kundu et.al.	2406.17630	translate	read	link
2024-06-25	Leveraging Reinforcement Learning in Red Teaming for Advanced Ransomware Attack Simulations	Cheng Wang et.al.	2406.17576	translate	read	null
2024-06-25	On the consistency of hyper-parameter selection in value-based deep reinforcement learning	Johan Obando-Ceron et.al.	2406.17523	translate	read	null
2024-06-25	BricksRL: A Platform for Democratizing Robotics and Reinforcement Learning Research and Education with LEGO	Sebastian Dittert et.al.	2406.17490	translate	read	null
2024-06-25	CuDA2: An approach for Incorporating Traitor Agents into Cooperative Multi-Agent Systems	Zhen Chen et.al.	2406.17425	translate	read	null
2024-06-25	Joint Admission Control and Resource Allocation of Virtual Network Embedding via Hierarchical Deep Reinforcement Learning	Tianfu Wang et.al.	2406.17334	translate	read	link
2024-06-25	The State-Action-Reward-State-Action Algorithm in Spatial Prisoner’s Dilemma Game	Lanyu Yang et.al.	2406.17326	translate	read	null
2024-06-24	Confidence Aware Inverse Constrained Reinforcement Learning	Sriram Ganapathi Subramanian et.al.	2406.16782	translate	read	null
2024-06-24	WARP: On the Benefits of Weight Averaged Rewarded Policies	Alexandre Ramé et.al.	2406.16768	translate	read	null
2024-06-24	The MRI Scanner as a Diagnostic: Image-less Active Sampling	Yuning Du et.al.	2406.16754	translate	read	null
2024-06-24	OCALM: Object-Centric Assessment with Language Models	Timo Kaufmann et.al.	2406.16748	translate	read	null
2024-06-24	Adversarial Contrastive Decoding: Boosting Safety Alignment of Large Language Models via Opposite Prompt Optimization	Zhengyue Zhao et.al.	2406.16743	translate	read	null
2024-06-24	Probabilistic Subgoal Representations for Hierarchical Reinforcement learning	Vivienne Huiling Wang et.al.	2406.16707	translate	read	null
2024-06-24	Decentralized RL-Based Data Transmission Scheme for Energy Efficient Harvesting	Rafaela Scaciota et.al.	2406.16624	translate	read	null
2024-06-24	Towards Physically Talented Aerial Robots with Tactically Smart Swarm Behavior thereof: An Efficient Co-design Approach	Prajit KrisshnaKumar et.al.	2406.16612	translate	read	null
2024-06-24	$\text{Alpha}^2$: Discovering Logical Formulaic Alphas using Deep Reinforcement Learning	Feng Xu et.al.	2406.16505	translate	read	link
2024-06-24	Towards Comprehensive Preference Data Collection for Reward Modeling	Yulan Hu et.al.	2406.16486	translate	read	null
2024-06-21	MantisScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation	Xuan He et.al.	2406.15252	translate	read	null
2024-06-21	Open Problem: Order Optimal Regret Bounds for Kernel-Based Reinforcement Learning	Sattar Vakili et.al.	2406.15250	translate	read	null
2024-06-21	Deep UAV Path Planning with Assured Connectivity in Dense Urban Setting	Jiyong Oh et.al.	2406.15225	translate	read	null
2024-06-21	Gaussian Splatting to Real World Flight Navigation Transfer with Liquid Networks	Alex Quach et.al.	2406.15149	translate	read	null
2024-06-21	KalMamba: Towards Efficient Probabilistic State Space Models for RL under Uncertainty	Philipp Becker et.al.	2406.15131	translate	read	null
2024-06-21	A Provably Efficient Option-Based Algorithm for both High-Level and Low-Level Learning	Gianluca Drappo et.al.	2406.15124	translate	read	null
2024-06-21	Towards General Negotiation Strategies with End-to-End Reinforcement Learning	Bram M. Renting et.al.	2406.15096	translate	read	null
2024-06-21	KnobTree: Intelligent Database Parameter Configuration via Explainable Reinforcement Learning	Jiahan Chen et.al.	2406.15073	translate	read	null
2024-06-21	Behaviour Distillation	Andrei Lupu et.al.	2406.15042	translate	read	link
2024-06-21	SiT: Symmetry-Invariant Transformers for Generalisation in Reinforcement Learning	Matthias Weissenbacher et.al.	2406.15025	translate	read	null
2024-06-20	CooHOI: Learning Cooperative Human-Object Interaction with Manipulated Object Dynamics	Jiawei Gao et.al.	2406.14558	translate	read	null
2024-06-20	MacroHFT: Memory Augmented Context-aware Reinforcement Learning On High Frequency Trading	Chuqiao Zong et.al.	2406.14537	translate	read	link
2024-06-20	RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold	Amrith Setlur et.al.	2406.14532	translate	read	link
2024-06-20	Learning telic-controllable state representations	Nadav Amir et.al.	2406.14476	translate	read	null
2024-06-20	Rewarding What Matters: Step-by-Step Reinforcement Learning for Task-Oriented Dialogue	Huifang Du et.al.	2406.14457	translate	read	null
2024-06-20	Revealing the learning process in reinforcement learning agents through attention-oriented metrics	Charlotte Beylier et.al.	2406.14324	translate	read	null
2024-06-20	Resource Optimization for Tail-Based Control in Wireless Networked Control Systems	Rasika Vijithasena et.al.	2406.14301	translate	read	null
2024-06-21	REVEAL-IT: REinforcement learning with Visibility of Evolving Agent poLicy for InTerpretability	Shuang Ao et.al.	2406.14214	translate	read	link
2024-06-20	Optimizing Novelty of Top-k Recommendations using Large Language Models and Reinforcement Learning	Amit Sharma et.al.	2406.14169	translate	read	null
2024-06-20	Iterative Sizing Field Prediction for Adaptive Mesh Generation From Expert Demonstrations	Niklas Freymuth et.al.	2406.14161	translate	read	link
2024-06-18	Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts	Haoxiang Wang et.al.	2406.12845	translate	read	link
2024-06-18	Injection Optimization at Particle Accelerators via Reinforcement Learning: From Simulation to Real-World Application	Awal Awal et.al.	2406.12735	translate	read	null
2024-06-18	A Systematization of the Wagner Framework: Graph Theory Conjectures and Reinforcement Learning	Flora Angileri et.al.	2406.12667	translate	read	null
2024-06-18	Reinforcement-Learning based routing for packet-optical networks with hybrid telemetry	A. L. García Navarro et.al.	2406.12602	translate	read	null
2024-06-18	Discovering Minimal Reinforcement Learning Environments	Jarek Liesen et.al.	2406.12589	translate	read	null
2024-06-18	RichRAG: Crafting Rich Responses for Multi-faceted Queries in Retrieval-Augmented Generation	Shuting Wang et.al.	2406.12566	translate	read	null
2024-06-18	A Super-human Vision-based Reinforcement Learning Agent for Autonomous Racing in Gran Turismo	Miguel Vasco et.al.	2406.12563	translate	read	null
2024-06-18	Offline Imitation Learning with Model-based Reverse Augmentation	Jie-Jing Shao et.al.	2406.12550	translate	read	null
2024-06-18	Demonstrating Agile Flight from Pixels without State Estimation	Ismail Geles et.al.	2406.12505	translate	read	null
2024-06-18	Autonomous navigation of catheters and guidewires in mechanical thrombectomy using inverse reinforcement learning	Harry Robertshaw et.al.	2406.12499	translate	read	null
2024-06-17	WPO: Enhancing RLHF with Weighted Preference Optimization	Wenxuan Zhou et.al.	2406.11827	translate	read	link
2024-06-17	Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics	Runzhe Wu et.al.	2406.11810	translate	read	null
2024-06-17	Run Time Assured Reinforcement Learning for Six Degree-of-Freedom Spacecraft Inspection	Kyle Dunlap et.al.	2406.11795	translate	read	null
2024-06-17	FetchBench: A Simulation Benchmark for Robot Fetching	Beining Han et.al.	2406.11793	translate	read	null
2024-06-17	Optimal Transport-Assisted Risk-Sensitive Q-Learning	Zahra Shahrooei et.al.	2406.11774	translate	read	null
2024-06-17	Measuring memorization in RLHF for code completion	Aneesh Pappu et.al.	2406.11715	translate	read	null
2024-06-17	The Role of Inherent Bellman Error in Offline Reinforcement Learning with Linear Function Approximation	Noah Golowich et.al.	2406.11686	translate	read	null
2024-06-17	Communication-Efficient MARL for Platoon Stability and Energy-efficiency Co-optimization in Cooperative Adaptive Cruise Control of CAVs	Min Hua et.al.	2406.11653	translate	read	null
2024-06-17	Linear Bellman Completeness Suffices for Efficient Online Reinforcement Learning with Few Actions	Noah Golowich et.al.	2406.11640	translate	read	null
2024-06-17	Style Transfer with Multi-iteration Preference Optimization	Shuai Liu et.al.	2406.11581	translate	read	null
2024-06-14	Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs	Rui Yang et.al.	2406.10216	translate	read	null
2024-06-14	A Fundamental Trade-off in Aligned Language Models and its Relation to Sampling Adaptors	Naaman Tan et.al.	2406.10203	translate	read	null
2024-06-14	Misam: Using ML in Dataflow Selection of Sparse-Sparse Matrix Multiplication	Sanjali Yadav et.al.	2406.10166	translate	read	null
2024-06-14	Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models	Carson Denison et.al.	2406.10162	translate	read	link
2024-06-14	BiKC: Keypose-Conditioned Consistency Policy for Bimanual Robotic Manipulation	Dongjie Yu et.al.	2406.10093	translate	read	null
2024-06-14	PRIMER: Perception-Aware Robust Learning-based Multiagent Trajectory Planner	Kota Kondo et.al.	2406.10060	translate	read	null
2024-06-14	Bridging the Communication Gap: Artificial Agents Learning Sign Language through Imitation	Federico Tavella et.al.	2406.10043	translate	read	null
2024-06-14	ROAR: Reinforcing Original to Augmented Data Ratio Dynamics for Wav2Vec2.0 Based ASR	Vishwanath Pratap Singh et.al.	2406.09999	translate	read	null
2024-06-14	Robust Model-Based Reinforcement Learning with an Adversarial Auxiliary Model	Siemen Herremans et.al.	2406.09976	translate	read	link
2024-06-14	InstructRL4Pix: Training Diffusion for Image Editing by Reinforcement Learning	Tiancheng Li et.al.	2406.09973	translate	read	null
2024-06-13	Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms	Miaosen Zhang et.al.	2406.09397	translate	read	null
2024-06-13	Is Value Learning Really the Main Bottleneck in Offline RL?	Seohong Park et.al.	2406.09329	translate	read	null
2024-06-13	OpenVLA: An Open-Source Vision-Language-Action Model	Moo Jin Kim et.al.	2406.09246	translate	read	null
2024-06-13	AutomaChef: A Physics-informed Demonstration-guided Learning Framework for Granular Material Manipulation	Minglun Wei et.al.	2406.09178	translate	read	null
2024-06-13	Direct Imitation Learning-based Visual Servoing using the Large Projection Formulation	Sayantan Auddy et.al.	2406.09120	translate	read	null
2024-06-13	Adaptive Actor-Critic Based Optimal Regulation for Drift-Free Uncertain Nonlinear Systems	Ashwin P. Dani et.al.	2406.09097	translate	read	null
2024-06-13	DiffPoGAN: Diffusion Policies with Generative Adversarial Networks for Offline Reinforcement Learning	Xuemin Hu et.al.	2406.09089	translate	read	null
2024-06-13	Data-driven modeling and supervisory control system optimization for plug-in hybrid electric vehicles	Hao Zhang et.al.	2406.09082	translate	read	null
2024-06-13	Latent Assistance Networks: Rediscovering Hyperbolic Tangents in RL	Jacob E. Kooi et.al.	2406.09079	translate	read	null
2024-06-13	Dispelling the Mirage of Progress in Offline MARL through Standardised Baselines and Evaluation	Claude Formanek et.al.	2406.09068	translate	read	null
2024-06-12	RILe: Reinforced Imitation Learning	Mert Albaba et.al.	2406.08472	translate	read	null
2024-06-12	Adaptive Swarm Mesh Refinement using Deep Reinforcement Learning with Local Rewards	Niklas Freymuth et.al.	2406.08440	translate	read	null
2024-06-12	RRLS : Robust Reinforcement Learning Suite	Adil Zouitine et.al.	2406.08406	translate	read	link
2024-06-12	Scaling Value Iteration Networks to 5000 Layers for Extreme Long-Term Planning	Yuhui Wang et.al.	2406.08404	translate	read	null
2024-06-12	Time-Constrained Robust MDPs	Adil Zouitine et.al.	2406.08395	translate	read	null
2024-06-12	Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning	Mohammadreza Nakhaei et.al.	2406.08238	translate	read	link
2024-06-12	MaIL: Improving Imitation Learning with Mamba	Xiaogang Jia et.al.	2406.08234	translate	read	null
2024-06-12	Explore-Go: Leveraging Exploration for Generalisation in Deep Reinforcement Learning	Max Weltevrede et.al.	2406.08069	translate	read	null
2024-06-12	Deep reinforcement learning with positional context for intraday trading	Sven Goluža et.al.	2406.08013	translate	read	null
2024-06-12	Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning	Yizhe Huang et.al.	2406.08002	translate	read	null
2024-06-11	CDSA: Conservative Denoising Score-based Algorithm for Offline Reinforcement Learning	Zeyuan Liu et.al.	2406.07541	translate	read	null
2024-06-11	BAKU: An Efficient Transformer for Multi-Task Policy Learning	Siddhant Haldar et.al.	2406.07539	translate	read	null
2024-06-11	Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis	Qining Zhang et.al.	2406.07455	translate	read	null
2024-06-11	Enhanced Gene Selection in Single-Cell Genomics: Pre-Filtering Synergy and Reinforced Optimization	Weiliang Zhang et.al.	2406.07418	translate	read	null
2024-06-11	Federated Multi-Agent DRL for Radio Resource Management in Industrial 6G in-X subnetworks	Bjarke Madsen et.al.	2406.07383	translate	read	null
2024-06-11	World Models with Hints of Large Language Models for Goal Achieving	Zeyuan Liu et.al.	2406.07381	translate	read	null
2024-06-11	EdgeTimer: Adaptive Multi-Timescale Scheduling in Mobile Edge Computing with Deep Reinforcement Learning	Yijun Hao et.al.	2406.07342	translate	read	null
2024-06-11	Beyond Training: Optimizing Reinforcement Learning Based Job Shop Scheduling Through Adaptive Action Sampling	Constantin Waubert de Puiseau et.al.	2406.07325	translate	read	null
2024-06-11	Multi-objective Reinforcement learning from AI Feedback	Marcus Williams et.al.	2406.07295	translate	read	null
2024-06-11	Hybrid Reinforcement Learning from Offline Observation Alone	Yuda Song et.al.	2406.07253	translate	read	null
2024-06-10	Verification-Guided Shielding for Deep Reinforcement Learning	Davide Corsi et.al.	2406.06507	translate	read	null
2024-06-10	Adaptive Opponent Policy Detection in Multi-Agent MDPs: Real-Time Strategy Switch Identification Using Running Error Estimation	Mohidul Haque Mridul et.al.	2406.06500	translate	read	null
2024-06-10	Boosting Robustness in Preference-Based Reinforcement Learning with Dynamic Sparsity	Calarina Muslimani et.al.	2406.06495	translate	read	null
2024-06-10	Towards Real-World Efficiency: Domain Randomization in Reinforcement Learning for Pre-Capture of Free-Floating Moving Targets by Autonomous Robots	Bahador Beigomi et.al.	2406.06460	translate	read	link
2024-06-10	Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning?	Denis Tarasov et.al.	2406.06309	translate	read	link
2024-06-10	Learning-based cognitive architecture for enhancing coordination in human groups	Antonio Grotta et.al.	2406.06297	translate	read	null
2024-06-10	Deep Multi-Objective Reinforcement Learning for Utility-Based Infrastructural Maintenance Optimization	Jesse van Remmerden et.al.	2406.06184	translate	read	null
2024-06-10	Mastering truss structure optimization with tree search	Gabriel E. Garayalde et.al.	2406.06145	translate	read	null
2024-06-10	EXPIL: Explanatory Predicate Invention for Learning in Games	Jingyuan Sha et.al.	2406.06107	translate	read	null
2024-06-10	Sim-To-Real Transfer for Visual Reinforcement Learning of Deformable Object Manipulation for Robot-Assisted Surgery	Paul Maria Scheikl et.al.	2406.06092	translate	read	null
2024-06-07	LINX: A Language Driven Generative System for Goal-Oriented Automated Data Exploration	Tavor Lipman et.al.	2406.05107	translate	read	null
2024-06-07	Massively Multiagent Minigames for Training Generalist Agents	Kyoung Whan Choe et.al.	2406.05071	translate	read	link
2024-06-07	Online Frequency Scheduling by Learning Parallel Actions	Anastasios Giovanidis et.al.	2406.05041	translate	read	null
2024-06-07	Optimizing Automatic Differentiation with Deep Reinforcement Learning	Jamie Lohoff et.al.	2406.05027	translate	read	null
2024-06-07	Designs for Enabling Collaboration in Human-Machine Teaming via Interactive and Explainable Systems	Rohan Paleja et.al.	2406.05003	translate	read	null
2024-06-07	SLOPE: Search with Learned Optimal Pruning-based Expansion	Davor Bokan et.al.	2406.04935	translate	read	link
2024-06-07	Sim-to-real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning	Arvi Jonnarth et.al.	2406.04920	translate	read	null
2024-06-07	Online Adaptation for Enhancing Imitation Learning Policies	Federico Malato et.al.	2406.04913	translate	read	link
2024-06-07	Stabilizing Extreme Q-learning by Maclaurin Expansion	Motoki Omura et.al.	2406.04896	translate	read	null
2024-06-07	Primitive Agentic First-Order Optimization	R. Sala et.al.	2406.04841	translate	read	null
2024-06-06	ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories	Qianlan Yang et.al.	2406.04323	translate	read	null
2024-06-06	Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models	Xiang Ji et.al.	2406.04274	translate	read	null
2024-06-06	Multi-Agent Imitation Learning: Value is Easy, Regret is Hard	Jingwu Tang et.al.	2406.04219	translate	read	null
2024-06-06	Aligning Agents like Large Language Models	Adam Jelley et.al.	2406.04208	translate	read	null
2024-06-06	MARLander: A Local Path Planning for Drone Swarms using Multiagent Deep Reinforcement Learning	Demetros Aschu et.al.	2406.04159	translate	read	null
2024-06-06	Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning	Abdullah Akgül et.al.	2406.04088	translate	read	null
2024-06-06	Bootstrapping Expectiles in Reinforcement Learning	Pierre Clavier et.al.	2406.04081	translate	read	null
2024-06-06	Spatio-temporal Early Prediction based on Multi-objective Reinforcement Learning	Wei Shao et.al.	2406.04035	translate	read	link
2024-06-06	Contrastive Sparse Autoencoders for Interpreting Planning of Chess-Playing Agents	Yoann Poupart et.al.	2406.04028	translate	read	link
2024-06-06	HackAtari: Atari Learning Environments for Robust and Continual Reinforcement Learning	Quentin Delfosse et.al.	2406.03997	translate	read	link
2024-06-05	Automating Turkish Educational Quiz Generation Using Large Language Models	Kamyar Zeinalipour et.al.	2406.03397	translate	read	null
2024-06-05	LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback	Timon Ziegenbein et.al.	2406.03363	translate	read	link
2024-06-05	UDQL: Bridging The Gap between MSE Loss and The Optimal Value Function in Offline Reinforcement Learning	Yu Zhang et.al.	2406.03324	translate	read	null
2024-06-05	Revisiting Scalable Hessian Diagonal Approximations for Applications in Reinforcement Learning	Mohamed Elsayed et.al.	2406.03276	translate	read	null
2024-06-05	Prompt-based Visual Alignment for Zero-shot Policy Transfer	Haihan Gao et.al.	2406.03250	translate	read	null
2024-06-05	Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning	Inwoo Hwang et.al.	2406.03234	translate	read	link
2024-06-05	CommonPower: Supercharging Machine Learning for Smart Grids	Michael Eichelbeck et.al.	2406.03231	translate	read	link
2024-06-05	Object Manipulation in Marine Environments using Reinforcement Learning	Ahmed Nader et.al.	2406.03223	translate	read	null
2024-06-05	Adaptive Distance Functions via Kelvin Transformation	Rafael I. Cabral Muchacho et.al.	2406.03200	translate	read	null
2024-06-05	DEER: A Delay-Resilient Framework for Reinforcement Learning with Variable Delays	Bo Xia et.al.	2406.03102	translate	read	null
2024-06-04	RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots	Soroush Nasiriany et.al.	2406.02523	translate	read	link
2024-06-04	Offline Bayesian Aleatoric and Epistemic Uncertainty Quantification and Posterior Value Optimisation in Finite-State MDPs	Filippo Valdettaro et.al.	2406.02456	translate	read	null
2024-06-04	A Generalized Apprenticeship Learning Framework for Modeling Heterogeneous Student Pedagogical Strategies	Md Mirajul Islam et.al.	2406.02450	translate	read	null
2024-06-04	Algorithmic Collusion in Dynamic Pricing with Deep Reinforcement Learning	Shidi Deng et.al.	2406.02437	translate	read	null
2024-06-04	Seed-TTS: A Family of High-Quality Versatile Speech Generation Models	Philip Anastassiou et.al.	2406.02430	translate	read	link
2024-06-04	Query-based Semantic Gaussian Field for Scene Representation in Reinforcement Learning	Jiaxu Wang et.al.	2406.02370	translate	read	null
2024-06-04	How to Explore with Belief: State Entropy Maximization in POMDPs	Riccardo Zamboni et.al.	2406.02295	translate	read	null
2024-06-04	Smaller Batches, Bigger Gains? Investigating the Impact of Batch Sizes on Reinforcement Learning Based Real-World Production Scheduling	Arthur Müller et.al.	2406.02294	translate	read	null
2024-06-04	Test-Time Regret Minimization in Meta Reinforcement Learning	Mirco Mutti et.al.	2406.02282	translate	read	null
2024-06-04	Reinforcement Learning with Lookahead Information	Nadav Merlis et.al.	2406.02258	translate	read	null
2024-06-03	Fusion-PSRO: Nash Policy Fusion for Policy Space Response Oracles	Jiesong Lian et.al.	2405.21027	translate	read	null
