Reinforcement Learning - 2024-11

Publish Date	Title	Authors	PDF	Translate	Read	Code
2024-11-29	PDDLFuse: A Tool for Generating Diverse Planning Domains	Vedant Khandelwal et.al.	2411.19886	translate	read	null
2024-11-29	CAREL: Instruction-guided reinforcement learning with cross-modal auxiliary objectives	Armin Saghafian et.al.	2411.19787	translate	read	link
2024-11-29	HVAC-DPT: A Decision Pretrained Transformer for HVAC Control	Anaïs Berkes et.al.	2411.19746	translate	read	null
2024-11-29	Improving generalization of robot locomotion policies via Sharpness-Aware Reinforcement Learning	Severin Bochem et.al.	2411.19732	translate	read	null
2024-11-29	RMIO: A Model-Based MARL Framework for Scenarios with Observation Loss in Some Agents	Shi Zifeng et.al.	2411.19639	translate	read	null
2024-11-29	Build An Influential Bot In Social Media Simulations With Large Language Models	Bailu Jin et.al.	2411.19635	translate	read	null
2024-11-29	Adaptive dynamics of Ising spins in one dimension leveraging Reinforcement Learning	Anish Kumar et.al.	2411.19602	translate	read	null
2024-11-29	Solving Rubik’s Cube Without Tricky Sampling	Yicheng Lin et.al.	2411.19583	translate	read	null
2024-11-29	Training Agents with Weakly Supervised Feedback from Large Language Models	Dihong Gong et.al.	2411.19547	translate	read	null
2024-11-29	A Local Information Aggregation based Multi-Agent Reinforcement Learning for Robot Swarm Dynamic Task Allocation	Yang Lv et.al.	2411.19526	translate	read	null
2024-11-27	Robust Offline Reinforcement Learning with Linearly Structured $f$-Divergence Regularization	Cheng Tang et.al.	2411.18612	translate	read	null
2024-11-27	A Talent-infused Policy-gradient Approach to Efficient Co-Design of Morphology and Task Allocation Behavior of Multi-Robot Systems	Prajit KrisshnaKumar et.al.	2411.18519	translate	read	null
2024-11-27	G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation	Tianxing Chen et.al.	2411.18369	translate	read	null
2024-11-27	Two-Timescale Digital Twin Assisted Model Interference and Retraining over Wireless Network	Jiayi Cong et.al.	2411.18329	translate	read	null
2024-11-27	Application of Soft Actor-Critic Algorithms in Optimizing Wastewater Treatment with Time Delays Integration	Esmaeel Mohammadi et.al.	2411.18305	translate	read	null
2024-11-27	NeoHebbian Synapses to Accelerate Online Training of Neuromorphic Hardware	Shubham Pande et.al.	2411.18272	translate	read	null
2024-11-27	Dynamic Retail Pricing via Q-Learning – A Reinforcement Learning Framework for Enhanced Revenue Management	Mohit Apte et.al.	2411.18261	translate	read	null
2024-11-27	Dependency-Aware CAV Task Scheduling via Diffusion-Based Reinforcement Learning	Xiang Cheng et.al.	2411.18230	translate	read	null
2024-11-27	Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning	Di Zhang et.al.	2411.18203	translate	read	link
2024-11-27	Learning for Long-Horizon Planning via Neuro-Symbolic Abductive Imitation	Jie-Jing Shao et.al.	2411.18201	translate	read	link
2024-11-26	Multi-Objective Reinforcement Learning for Automated Resilient Cyber Defence	Ross O’Driscoll et.al.	2411.17585	translate	read	null
2024-11-26	Ensuring Safety in Target Pursuit Control: A CBF-Safe Reinforcement Learning Approach	Yaosheng Deng et.al.	2411.17552	translate	read	null
2024-11-26	IMPROVE: Improving Medical Plausibility without Reliance on Human Validation – An Enhanced Prototype-Guided Diffusion Framework	Anurag Shandilya et.al.	2411.17535	translate	read	null
2024-11-26	Spatially Visual Perception for End-to-End Robotic Learning	Travis Davies et.al.	2411.17458	translate	read	null
2024-11-26	BPP-Search: Enhancing Tree of Thought Reasoning for Mathematical Modeling Problem Solving	Teng Wang et.al.	2411.17404	translate	read	null
2024-11-26	Joint Combinatorial Node Selection and Resource Allocations in the Lightning Network using Attention-based Reinforcement Learning	Mahdi Salahshour et.al.	2411.17353	translate	read	null
2024-11-26	SIL-RRT*: Learning Sampling Distribution through Self Imitation Learning	Xuzhe Dang et.al.	2411.17293	translate	read	null
2024-11-26	LHPF: Look back the History and Plan for the Future in Autonomous Driving	Sheng Wang et.al.	2411.17253	translate	read	null
2024-11-26	Self-reconfiguration Strategies for Space-distributed Spacecraft	Tianle Liu et.al.	2411.17137	translate	read	null
2024-11-26	LLM-Based Offline Learning for Embodied Agents via Consistency-Guided Reward Ensemble	Yujeong Lee et.al.	2411.17135	translate	read	null
2024-11-25	Self-Generated Critiques Boost Reward Modeling for Language Models	Yue Yu et.al.	2411.16646	translate	read	null
2024-11-25	Continual Deep Reinforcement Learning with Task-Agnostic Policy Distillation	Muhammad Burhan Hafez et.al.	2411.16532	translate	read	link
2024-11-25	Reinforcement Learning for Bidding Strategy Optimization in Day-Ahead Energy Market	Luca Di Persio et.al.	2411.16519	translate	read	null
2024-11-25	Unsupervised Event Outlier Detection in Continuous Time	Somjit Nath et.al.	2411.16427	translate	read	null
2024-11-25	CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning	Duo Wu et.al.	2411.16313	translate	read	null
2024-11-25	Probing for Consciousness in Machines	Mathis Immertreu et.al.	2411.16262	translate	read	null
2024-11-25	Multi-Robot Reliable Navigation in Uncertain Topological Environments with Graph Attention Networks	Zhuoyuan Yu et.al.	2411.16134	translate	read	null
2024-11-25	End-to-End Steering for Autonomous Vehicles via Conditional Imitation Co-Learning	Mahmoud M. Kishky et.al.	2411.16131	translate	read	null
2024-11-25	Why the Agent Made that Decision: Explaining Deep Reinforcement Learning with Vision Masks	Rui Zuo et.al.	2411.16120	translate	read	null
2024-11-25	M3: Mamba-assisted Multi-Circuit Optimization via MBRL with Effective Scheduling	Youngmin Oh et.al.	2411.16019	translate	read	null
2024-11-22	WildLMa: Long Horizon Loco-Manipulation in the Wild	Ri-Zhao Qiu et.al.	2411.15131	translate	read	null
2024-11-22	Learning-based Trajectory Tracking for Bird-inspired Flapping-Wing Robots	Jiaze Cai et.al.	2411.15130	translate	read	null
2024-11-22	TÜLU 3: Pushing Frontiers in Open Language Model Post-Training	Nathan Lambert et.al.	2411.15124	translate	read	link
2024-11-22	On Multi-Agent Inverse Reinforcement Learning	Till Freihaut et.al.	2411.15046	translate	read	null
2024-11-22	Safe Multi-Agent Reinforcement Learning with Convergence to Generalized Nash Equilibrium	Zeyang Li et.al.	2411.15036	translate	read	null
2024-11-22	On the Linear Speedup of Personalized Federated Reinforcement Learning with Shared Representations	Guojun Xiong et.al.	2411.15014	translate	read	null
2024-11-22	Free Energy Projective Simulation (FEPS): Active inference with interpretability	Joséphine Pazem et.al.	2411.14991	translate	read	null
2024-11-22	Enhancing Exploration with Diffusion Policies in Hybrid Off-Policy RL: Application to Non-Prehensile Manipulation	Huy Le et.al.	2411.14913	translate	read	null
2024-11-22	Segmenting Action-Value Functions Over Time-Scales in SARSA using TD($Δ$)	Mahammad Humayoo et.al.	2411.14783	translate	read	null
2024-11-22	Enhancing Molecular Design through Graph-based Topological Reinforcement Learning	Xiangyu Zhang et.al.	2411.14726	translate	read	null
2024-11-21	Multi-Agent Environments for Vehicle Routing Problems	Ricardo Gama et.al.	2411.14411	translate	read	null
2024-11-21	Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions	Yu Zhao et.al.	2411.14405	translate	read	link
2024-11-21	23 DoF Grasping Policies from a Raw Point Cloud	Martin Matak et.al.	2411.14400	translate	read	null
2024-11-21	Model Checking for Reinforcement Learning in Autonomous Driving: One Can Do More Than You Think!	Rong Gu et.al.	2411.14375	translate	read	null
2024-11-21	Convex Approximation of Probabilistic Reachable Sets from Small Samples Using Self-supervised Neural Networks	Jun Xiang et.al.	2411.14356	translate	read	null
2024-11-21	Logarithmic Neyman Regret for Adaptive Estimation of the Average Treatment Effect	Ojash Neopane et.al.	2411.14341	translate	read	null
2024-11-21	Explainable Multi-Agent Reinforcement Learning for Extended Reality Codec Adaptation	Pedro Enrique Iturria-Rivera et.al.	2411.14264	translate	read	null
2024-11-21	Generalizing End-To-End Autonomous Driving In Real-World Environments Using Zero-Shot LLMs	Zeyu Dong et.al.	2411.14256	translate	read	null
2024-11-21	Natural Language Reinforcement Learning	Xidong Feng et.al.	2411.14251	translate	read	link
2024-11-21	Umbrella Reinforcement Learning – computationally efficient tool for hard non-linear problems	Egor E. Nuzhin et.al.	2411.14117	translate	read	null
2024-11-20	BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games	Davide Paglieri et.al.	2411.13543	translate	read	link
2024-11-20	Metacognition for Unknown Situations and Environments (MUSE)	Rodolfo Valiente et.al.	2411.13537	translate	read	null
2024-11-20	Robust Monocular Visual Odometry using Curriculum Learning	Assaf Lahiany et.al.	2411.13438	translate	read	null
2024-11-20	A Survey On Enhancing Reinforcement Learning in Complex Environments: Insights from Human and LLM Feedback	Alireza Rashidi Laleh et.al.	2411.13410	translate	read	null
2024-11-20	Fine-tuning Myoelectric Control through Reinforcement Learning in a Game Environment	Kilian Freitag et.al.	2411.13327	translate	read	null
2024-11-20	Backward Stochastic Control System with Entropy Regularization	Ziyue Chen et.al.	2411.13219	translate	read	null
2024-11-20	ViSTa Dataset: Do vision-language models understand sequential tasks?	Evžen Wybitul et.al.	2411.13211	translate	read	link
2024-11-20	Engagement-Driven Content Generation with Large Language Models	Erica Coppolillo et.al.	2411.13187	translate	read	null
2024-11-20	Learning Time-Optimal and Speed-Adjustable Tactile In-Hand Manipulation	Johannes Pitz et.al.	2411.13148	translate	read	null
2024-11-20	ReinFog: A DRL Empowered Framework for Resource Management in Edge and Cloud Computing Environments	Zhiyu Wang et.al.	2411.13121	translate	read	null
2024-11-19	ACING: Actor-Critic for Instruction Learning in Black-Box Large Language Models	Salma Kharrat et.al.	2411.12736	translate	read	link
2024-11-19	Reinforcement Learning, Collusion, and the Folk Theorem	Galit Askenazi-Golan et.al.	2411.12725	translate	read	null
2024-11-19	UBSoft: A Simulation Platform for Robotic Skill Learning in Unbounded Soft Environments	Chunru Lin et.al.	2411.12711	translate	read	null
2024-11-19	Instant Policy: In-Context Imitation Learning via Graph Diffusion	Vitalis Vosylius et.al.	2411.12633	translate	read	null
2024-11-19	Robotic transcatheter tricuspid valve replacement with hybrid enhanced intelligence: a new paradigm and first-in-vivo study	Shuangyi Wang et.al.	2411.12478	translate	read	null
2024-11-19	Variable-Frequency Imitation Learning for Variable-Speed Motion	Nozomu Masuya et.al.	2411.12310	translate	read	null
2024-11-19	Emergence of Implicit World Models from Mortal Agents	Kazuya Horibe et.al.	2411.12304	translate	read	null
2024-11-19	DT-RaDaR: Digital Twin Assisted Robot Navigation using Differential Ray-Tracing	Sunday Amatare et.al.	2411.12284	translate	read	null
2024-11-19	Error-Feedback Model for Output Correction in Bilateral Control-Based Imitation Learning	Hiroshi Sato et.al.	2411.12255	translate	read	null
2024-11-19	Efficient Training in Multi-Agent Reinforcement Learning: A Communication-Free Framework for the Box-Pushing Problem	David Ge et.al.	2411.12246	translate	read	null
2024-11-18	Design And Optimization Of Multi-rendezvous Manoeuvres Based On Reinforcement Learning And Convex Optimization	Antonio López Rivera et.al.	2411.11778	translate	read	null
2024-11-18	High-Speed Cornering Control and Real-Vehicle Deployment for Autonomous Electric Vehicles	Shiyue Zhao et.al.	2411.11762	translate	read	null
2024-11-18	Mapping out the Space of Human Feedback for Reinforcement Learning: A Conceptual Framework	Yannick Metz et.al.	2411.11761	translate	read	null
2024-11-18	Aligning Few-Step Diffusion Models with Dense Reward Difference Learning	Ziyi Zhang et.al.	2411.11727	translate	read	link
2024-11-18	Bitcoin Under Volatile Block Rewards: How Mempool Statistics Can Influence Bitcoin Mining	Roozbeh Sarenche et.al.	2411.11702	translate	read	null
2024-11-18	Robust Reinforcement Learning under Diffusion Models for Data with Jumps	Chenyang Jiang et.al.	2411.11697	translate	read	null
2024-11-18	Coevolution of Opinion Dynamics and Recommendation System: Modeling Analysis and Reinforcement Learning Based Manipulation	Yuhong Chen et.al.	2411.11687	translate	read	null
2024-11-18	No-regret Exploration in Shuffle Private Reinforcement Learning	Shaojie Bai et.al.	2411.11647	translate	read	null
2024-11-18	Signaling and Social Learning in Swarms of Robots	Leo Cazenille et.al.	2411.11616	translate	read	null
2024-11-18	A Pre-Trained Graph-Based Model for Adaptive Sequencing of Educational Documents	Jean Vassoyan et.al.	2411.11520	translate	read	null
2024-11-15	Mitigating Parameter Degeneracy using Joint Conditional Diffusion Model for WECC Composite Load Model in Power Systems	Feiqin Zhu et.al.	2411.10431	translate	read	null
2024-11-15	Continual Adversarial Reinforcement Learning (CARL) of False Data Injection detection: forgetting and explainability	Pooja Aslami et.al.	2411.10367	translate	read	null
2024-11-15	BMP: Bridging the Gap between B-Spline and Movement Primitives	Weiran Liao et.al.	2411.10336	translate	read	null
2024-11-15	Towards Sample-Efficiency and Generalization of Transfer and Inverse Reinforcement Learning: A Comprehensive Literature Review	Hossein Hassani et.al.	2411.10268	translate	read	null
2024-11-15	Learning Generalizable 3D Manipulation With 10 Demonstrations	Yu Ren et.al.	2411.10203	translate	read	null
2024-11-15	The Surprising Ineffectiveness of Pre-Trained Visual Representations for Model-Based Reinforcement Learning	Moritz Schneider et.al.	2411.10175	translate	read	null
2024-11-15	Imagine-2-Drive: High-Fidelity World Modeling in CARLA for Autonomous Vehicles	Anant Garg et.al.	2411.10171	translate	read	null
2024-11-15	Mitigating Sycophancy in Decoder-Only Transformer Architectures: Synthetic Data Intervention	Libo Wang et.al.	2411.10156	translate	read	link
2024-11-15	That Chip Has Sailed: A Critique of Unfounded Skepticism Around AI for Chip Design	Anna Goldie et.al.	2411.10053	translate	read	null
2024-11-15	Enforcing Cooperative Safety for Reinforcement Learning-based Mixed-Autonomy Platoon Control	Jingyuan Zhou et.al.	2411.10031	translate	read	null
2024-11-14	A Risk Sensitive Contract-unified Reinforcement Learning Approach for Option Hedging	Xianhua Peng et.al.	2411.09659	translate	read	null
2024-11-14	Motion Before Action: Diffusing Object Motion as Manipulation Condition	Yup Su et.al.	2411.09658	translate	read	null
2024-11-14	Tailoring interactions between active nematic defects with reinforcement learning	Carlos Floyd et.al.	2411.09588	translate	read	null
2024-11-14	Developement of Reinforcement Learning based Optimisation Method for Side-Sill Design	Aditya Borse et.al.	2411.09499	translate	read	null
2024-11-14	Approximated Variational Bayesian Inverse Reinforcement Learning for Large Language Model Alignment	Yuang Cai et.al.	2411.09341	translate	read	null
2024-11-14	Socio-Economic Consequences of Generative AI: A Review of Methodological Approaches	Carlos J. Costa et.al.	2411.09313	translate	read	null
2024-11-14	Enhancing reinforcement learning for population setpoint tracking in co-cultures	Sebastián Espinel-Ríos et.al.	2411.09177	translate	read	null
2024-11-14	Gazing at Rewards: Eye Movements as a Lens into Human and AI Decision-Making in Hybrid Visual Foraging	Bo Wang et.al.	2411.09176	translate	read	null
2024-11-14	Rationality based Innate-Values-driven Reinforcement Learning	Qin Yang et.al.	2411.09160	translate	read	null
2024-11-14	Secrecy Energy Efficiency Maximization in IRS-Assisted VLC MISO Networks with RSMA: A DS-PPO approach	Yangbo Guo et.al.	2411.09146	translate	read	null
2024-11-13	LLMStinger: Jailbreaking LLMs using RL fine-tuned LLMs	Piyush Jha et.al.	2411.08862	translate	read	null
2024-11-13	Goal-oriented Semantic Communication for Robot Arm Reconstruction in Digital Twin: Feature and Temporal Selections	Shutong Chen et.al.	2411.08835	translate	read	null
2024-11-13	Recommender systems and reinforcement learning for building control and occupant interaction: A text-mining driven review of scientific literature	Wenhao Zhang et.al.	2411.08734	translate	read	null
2024-11-13	Joint Model Caching and Resource Allocation in Generative AI-Enabled Wireless Edge Networks	Zhang Liu et.al.	2411.08672	translate	read	null
2024-11-13	Estimating unknown parameters in differential equations with a reinforcement learning based PSO method	Wenkui Sun et.al.	2411.08651	translate	read	null
2024-11-13	Towards Secure Intelligent O-RAN Architecture: Vulnerabilities, Threats and Promising Technical Solutions using LLMs	Mojdeh Karbalaee Motalleb et.al.	2411.08640	translate	read	null
2024-11-13	Robot See, Robot Do: Imitation Reward for Noisy Financial Environments	Sven Goluža et.al.	2411.08637	translate	read	null
2024-11-13	Precision-Focused Reinforcement Learning Model for Robotic Object Pushing	Lara Bergmann et.al.	2411.08622	translate	read	link
2024-11-13	Grammarization-Based Grasping with Deep Multi-Autoencoder Latent Space Exploration by Reinforcement Learning Agent	Leonidas Askianakis et.al.	2411.08566	translate	read	null
2024-11-13	Towards Practical Deep Schedulers for Allocating Cellular Radio Resources	Petteri Kela et.al.	2411.08529	translate	read	null
2024-11-12	Learning Memory Mechanisms for Decision Making through Demonstrations	William Yue et.al.	2411.07954	translate	read	link
2024-11-12	Doubly Mild Generalization for Offline Reinforcement Learning	Yixiu Mao et.al.	2411.07934	translate	read	link
2024-11-12	Scaling policy iteration based reinforcement learning for unknown discrete-time linear systems	Zhen Pang et.al.	2411.07825	translate	read	null
2024-11-12	Navigation with QPHIL: Quantizing Planner for Hierarchical Implicit Q-Learning	Alexi Canesse et.al.	2411.07760	translate	read	null
2024-11-12	Optimizing Traffic Signal Control using High-Dimensional State Representation and Efficient Deep Reinforcement Learning	Lawrence Francis et.al.	2411.07759	translate	read	null
2024-11-12	EMPERROR: A Flexible Generative Perception Error Model for Probing Self-Driving Planners	Niklas Hanselmann et.al.	2411.07719	translate	read	null
2024-11-12	Test Where Decisions Matter: Importance-driven Testing for Deep Reinforcement Learning	Stefan Pranger et.al.	2411.07700	translate	read	null
2024-11-12	Exploring Multi-Agent Reinforcement Learning for Unrelated Parallel Machine Scheduling	Maria Zampella et.al.	2411.07634	translate	read	null
2024-11-12	Direct Preference Optimization Using Sparse Feature-Level Constraints	Qingyu Yin et.al.	2411.07618	translate	read	null
2024-11-12	Entropy Controllable Direct Preference Optimization	Motoki Omura et.al.	2411.07595	translate	read	null
2024-11-11	‘Explaining RL Decisions with Trajectories’: A Reproducibility Study	Karim Abdel Sadek et.al.	2411.07200	translate	read	link
2024-11-11	Joint Age-State Belief is All You Need: Minimizing AoII via Pull-Based Remote Estimation	Ismail Cosandal et.al.	2411.07179	translate	read	null
2024-11-11	Learning Multi-Agent Collaborative Manipulation for Long-Horizon Quadrupedal Pushing	Chuye Hong et.al.	2411.07104	translate	read	null
2024-11-11	A Multi-Agent Approach for REST API Testing with Semantic Graphs and LLM-Driven Inputs	Myeongsoo Kim et.al.	2411.07098	translate	read	null
2024-11-11	OCMDP: Observation-Constrained Markov Decision Process	Taiyi Wang et.al.	2411.07087	translate	read	null
2024-11-11	To Train or Not to Train: Balancing Efficiency and Training Cost in Deep Reinforcement Learning for Mobile Edge Computing	Maddalena Boscaro et.al.	2411.07086	translate	read	null
2024-11-11	Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching	Arnav Kumar Jain et.al.	2411.07007	translate	read	link
2024-11-11	Enhancing Robot Assistive Behaviour with Reinforcement Learning and Theory of Mind	Antonio Andriella et.al.	2411.07003	translate	read	link
2024-11-11	Imitation from Diverse Behaviors: Wasserstein Quality Diversity Imitation Learning with Single-Step Archive Exploration	Xingrui Yu et.al.	2411.06965	translate	read	null
2024-11-11	Streetwise Agents: Empowering Offline RL Policies to Outsmart Exogenous Stochastic Disturbances in RTC	Aditya Soni et.al.	2411.06815	translate	read	null
2024-11-08	Safe Reinforcement Learning of Robot Trajectories in the Presence of Moving Obstacles	Jonas Kiemel et.al.	2411.05784	translate	read	null
2024-11-08	Tract-RLFormer: A Tract-Specific RL policy based Decoder-only Transformer Network	Ankita Joshi et.al.	2411.05757	translate	read	null
2024-11-08	Topology-aware Reinforcement Feature Space Reconstruction for Graph Data	Wangyang Ying et.al.	2411.05742	translate	read	null
2024-11-08	Renewable Energy Powered and Open RAN-based Architecture for 5G Fixed Wireless Access Provisioning in Rural Areas	Anselme Ndikumana et.al.	2411.05699	translate	read	null
2024-11-08	Data-Driven Distributed Common Operational Picture from Heterogeneous Platforms using Multi-Agent Reinforcement Learning	Indranil Sur et.al.	2411.05683	translate	read	null
2024-11-08	Digital Twin Backed Closed-Loops for Energy-Aware and Open RAN-based Fixed Wireless Access Serving Rural Areas	Anselme Ndikumana et.al.	2411.05664	translate	read	null
2024-11-08	Acceleration for Deep Reinforcement Learning using Parallel and Distributed Computing: A Survey	Zhihong Liu et.al.	2411.05614	translate	read	null
2024-11-08	Smart navigation through a rotating barrier: Deep reinforcement learning with application to size-based separation of active microagents	Mohammad Hossein Masoudi et.al.	2411.05587	translate	read	null
2024-11-08	Tangled Program Graphs as an alternative to DRL-based control algorithms for UAVs	Hubert Szolc et.al.	2411.05586	translate	read	null
2024-11-08	Towards Active Flow Control Strategies Through Deep Reinforcement Learning	Ricard Montalà et.al.	2411.05536	translate	read	null
2024-11-07	Noisy Zero-Shot Coordination: Breaking The Common Knowledge Assumption In Zero-Shot Coordination Games	Usman Anwar et.al.	2411.04976	translate	read	link
2024-11-07	A Reinforcement Learning-Based Automatic Video Editing Method Using Pre-trained Vision-Language Model	Panwen Hu et.al.	2411.04942	translate	read	null
2024-11-07	Stem-OB: Generalizable Visual Imitation Learning with Stem-Like Convergent Observation through Diffusion Inversion	Kaizhe Hu et.al.	2411.04919	translate	read	link
2024-11-07	Evaluating Robustness of Reinforcement Learning Algorithms for Autonomous Shipping	Bavo Lesy et.al.	2411.04915	translate	read	null
2024-11-07	Think Smart, Act SMARL! Analyzing Probabilistic Logic Driven Safety in Multi-Agent Reinforcement Learning	Satchit Chatterji et.al.	2411.04867	translate	read	link
2024-11-07	Asymptotic regularity of a generalised stochastic Halpern scheme with applications	Nicholas Pischke et.al.	2411.04845	translate	read	null
2024-11-07	Plasticity Loss in Deep Reinforcement Learning: A Survey	Timo Klein et.al.	2411.04832	translate	read	null
2024-11-07	Harnessing the Power of Gradient-Based Simulations for Multi-Objective Optimization in Particle Accelerators	Kishansingh Rajput et.al.	2411.04817	translate	read	null
2024-11-07	AllGaits: Learning All Quadruped Gaits and Transitions	Guillaume Bellegarda et.al.	2411.04787	translate	read	null
2024-11-07	Navigating Trade-offs: Policy Summarization for Multi-Objective Reinforcement Learning	Zuzanna Osika et.al.	2411.04784	translate	read	link
2024-11-06	A Comparative Study of Deep Reinforcement Learning for Crop Production Management	Joseph Balderas et.al.	2411.04106	translate	read	null
2024-11-06	Interpretable and Efficient Data-driven Discovery and Control of Distributed Systems	Florian Wolf et.al.	2411.04098	translate	read	null
2024-11-06	Memorized action chunking with Transformers: Imitation learning for vision-based tissue surface scanning	Bochen Yang et.al.	2411.04050	translate	read	null
2024-11-06	Non-Stationary Learning of Neural Networks with Automatic Soft Parameter Reset	Alexandre Galashov et.al.	2411.04034	translate	read	null
2024-11-06	Predicting and Publishing Accurate Imbalance Prices Using Monte Carlo Tree Search	Fabio Pavirani et.al.	2411.04011	translate	read	null
2024-11-06	Object-Centric Dexterous Manipulation from Human Motion Data	Yuanpei Chen et.al.	2411.04005	translate	read	null
2024-11-06	ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy	Chenrui Tie et.al.	2411.03990	translate	read	null
2024-11-06	AdaSociety: An Adaptive Environment with Social Structures for Multi-Agent Decision-Making	Yizhe Huang et.al.	2411.03865	translate	read	link
2024-11-06	Beyond The Rainbow: High Performance Deep Reinforcement Learning On A Desktop PC	Tyler Clark et.al.	2411.03820	translate	read	null
2024-11-06	From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning	Zhirui Deng et.al.	2411.03817	translate	read	null
2024-11-05	Out-of-Distribution Recovery with Object-Centric Keypoint Inverse Policy For Visuomotor Imitation Learning	George Jiayuan Gao et.al.	2411.03294	translate	read	null
2024-11-05	Pre-trained Visual Dynamics Representations for Efficient Policy Learning	Hao Luo et.al.	2411.03169	translate	read	null
2024-11-05	Hierarchical Orchestra of Policies	Thomas P Cannon et.al.	2411.03008	translate	read	null
2024-11-05	Accelerating Task Generalisation with Multi-Level Hierarchical Options	Thomas P Cannon et.al.	2411.02998	translate	read	null
2024-11-05	Autonomous Decision Making for UAV Cooperative Pursuit-Evasion Game with Reinforcement Learning	Yang Zhao et.al.	2411.02983	translate	read	null
2024-11-05	Transformer-Based Fault-Tolerant Control for Fixed-Wing UAVs Using Knowledge Distillation and In-Context Adaptation	Francisco Giral et.al.	2411.02975	translate	read	null
2024-11-05	Embedding Safety into RL: A New Take on Trust Region Methods	Nikola Milosevic et.al.	2411.02957	translate	read	null
2024-11-05	The Unreasonable Effectiveness of LLMs for Query Optimization	Peter Akioyamen et.al.	2411.02862	translate	read	link
2024-11-05	ADOPT: Modified Adam Can Converge with Any $β_2$ with the Optimal Rate	Shohei Taniguchi et.al.	2411.02853	translate	read	link
2024-11-05	When to Localize? A Risk-Constrained Reinforcement Learning Approach	Chak Lam Shek et.al.	2411.02788	translate	read	null
2024-11-04	Simulation of Nanorobots with Artificial Intelligence and Reinforcement Learning for Advanced Cancer Cell Detection and Tracking	Shahab Kavousinejad et.al.	2411.02345	translate	read	link
2024-11-04	WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning	Zehan Qi et.al.	2411.02337	translate	read	null
2024-11-04	Targeted Manipulation and Deception Emerge when Optimizing LLMs for User Feedback	Marcus Williams et.al.	2411.02306	translate	read	link
2024-11-04	N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs	Ilya Zisman et.al.	2411.01958	translate	read	null
2024-11-04	RoboCrowd: Scaling Robot Data Collection through Crowdsourcing	Suvir Mirchandani et.al.	2411.01915	translate	read	null
2024-11-04	Efficient Active Imitation Learning with Random Network Distillation	Emilien Biré et.al.	2411.01894	translate	read	null
2024-11-04	Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback	Guan-Ting Lin et.al.	2411.01834	translate	read	null
2024-11-04	Risk-sensitive control as inference with Rényi divergence	Kaito Ito et.al.	2411.01827	translate	read	null
2024-11-04	IRS-Enhanced Secure Semantic Communication Networks: Cross-Layer and Context-Awared Resource Allocation	Lingyi Wang et.al.	2411.01821	translate	read	null
2024-11-04	So You Think You Can Scale Up Autonomous Robot Data Collection?	Suvir Mirchandani et.al.	2411.01813	translate	read	null
