Reinforcement Learning - 2024-06

| Publish Date | Title | Authors | PDF | Translate | Read | Code |
| --- | --- | --- | --- | --- | --- | --- |
| 2024-06-28 | PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful Navigators | Kuo-Hao Zeng et al. | 2406.20083 | translate | read | null |
| 2024-06-28 | Applying RLAIF for Code Generation with API-usage in Lightweight LLMs | Sujan Dutta et al. | 2406.20060 | translate | read | null |
| 2024-06-28 | HumanVLA: Towards Vision-Language Directed Object Rearrangement by Physical Humanoid | Xinyu Xu et al. | 2406.19972 | translate | read | null |
| 2024-06-28 | Perception Stitching: Zero-Shot Perception Encoder Transfer for Visuomotor Robot Policies | Pingcheng Jian et al. | 2406.19971 | translate | read | null |
| 2024-06-28 | Operator World Models for Reinforcement Learning | Pietro Novelli et al. | 2406.19861 | translate | read | null |
| 2024-06-28 | 3D Operation of Autonomous Excavator based on Reinforcement Learning through Independent Reward for Individual Joints | Yoonkyu Yoo et al. | 2406.19848 | translate | read | null |
| 2024-06-28 | Reinforcement Learning for Efficient Design and Control Co-optimisation of Energy Systems | Marine Cauz et al. | 2406.19825 | translate | read | null |
| 2024-06-28 | Identifying Ordinary Differential Equations for Data-efficient Model-based Reinforcement Learning | Tobias Nagel et al. | 2406.19817 | translate | read | null |
| 2024-06-28 | Fuzzy Logic Guided Reward Function Variation: An Oracle for Testing Reinforcement Learning Programs | Shiyu Zhang et al. | 2406.19812 | translate | read | null |
| 2024-06-28 | Decision Transformer for IRS-Assisted Systems with Diffusion-Driven Generative Channels | Jie Zhang et al. | 2406.19769 | translate | read | null |
| 2024-06-27 | Efficient World Models with Context-Aware Tokenization | Vincent Micheli et al. | 2406.19320 | translate | read | link |
| 2024-06-27 | Averaging log-likelihoods in direct alignment | Nathan Grinsztajn et al. | 2406.19188 | translate | read | null |
| 2024-06-27 | Contrastive Policy Gradient: Aligning LLMs on sequence-level scores in a supervised-friendly fashion | Yannis Flet-Berliac et al. | 2406.19185 | translate | read | null |
| 2024-06-27 | Learning Pareto Set for Multi-Objective Continuous Robot Control | Tianye Shu et al. | 2406.18924 | translate | read | link |
| 2024-06-27 | Autonomous Control of a Novel Closed Chain Five Bar Active Suspension via Deep Reinforcement Learning | Nishesh Singh et al. | 2406.18899 | translate | read | null |
| 2024-06-27 | State and Input Constrained Output-Feedback Adaptive Optimal Control of Affine Nonlinear Systems | Tochukwu Elijah Ogri et al. | 2406.18804 | translate | read | null |
| 2024-06-26 | Decentralized Semantic Traffic Control in AVs Using RL and DQN for Dynamic Roadblocks | Emanuel Figetakis et al. | 2406.18741 | translate | read | null |
| 2024-06-26 | Confident Natural Policy Gradient for Local Planning in $q_π$-realizable Constrained MDPs | Tian Tian et al. | 2406.18529 | translate | read | null |
| 2024-06-26 | Mental Modeling of Reinforcement Learning Agents by Language Models | Wenhao Lu et al. | 2406.18505 | translate | read | null |
| 2024-06-26 | Preference Elicitation for Offline Reinforcement Learning | Alizée Pace et al. | 2406.18450 | translate | read | null |
| 2024-06-26 | Mixture of Experts in a Mixture of RL settings | Timon Willi et al. | 2406.18420 | translate | read | null |
| 2024-06-26 | AlphaForge: A Framework to Mine and Dynamically Combine Formulaic Alpha Factors | Hao Shi et al. | 2406.18394 | translate | read | null |
| 2024-06-26 | Reinforcement Learning with Intrinsically Motivated Feedback Graph for Lost-sales Inventory Control | Zifan Liu et al. | 2406.18351 | translate | read | null |
| 2024-06-26 | AI Alignment through Reinforcement Learning from Human Feedback? Contradictions and Limitations | Adam Dahlgren Lindström et al. | 2406.18346 | translate | read | null |
| 2024-06-26 | Spatial-temporal Hierarchical Reinforcement Learning for Interpretable Pathology Image Super-Resolution | Wenting Chen et al. | 2406.18310 | translate | read | link |
| 2024-06-26 | Combining Automated Optimisation of Hyperparameters and Reward Shape | Julian Dierkes et al. | 2406.18293 | translate | read | link |
| 2024-06-26 | Weak Reward Model Transforms Generative Models into Robust Causal Event Extraction Systems | Italo Luis da Silva et al. | 2406.18245 | translate | read | link |
| 2024-06-25 | EXTRACT: Efficient Policy Learning by Extracting Transferrable Robot Skills from Offline Data | Jesse Zhang et al. | 2406.17768 | translate | read | null |
| 2024-06-25 | When does Self-Prediction help? Understanding Auxiliary Tasks in Reinforcement Learning | Claas Voelcker et al. | 2406.17718 | translate | read | null |
| 2024-06-25 | Privacy Preserving Reinforcement Learning for Population Processes | Samuel Yang-Zhao et al. | 2406.17649 | translate | read | null |
| 2024-06-25 | KANQAS: Kolmogorov Arnold Network for Quantum Architecture Search | Akash Kundu et al. | 2406.17630 | translate | read | link |
| 2024-06-25 | Leveraging Reinforcement Learning in Red Teaming for Advanced Ransomware Attack Simulations | Cheng Wang et al. | 2406.17576 | translate | read | null |
| 2024-06-25 | On the consistency of hyper-parameter selection in value-based deep reinforcement learning | Johan Obando-Ceron et al. | 2406.17523 | translate | read | null |
| 2024-06-25 | BricksRL: A Platform for Democratizing Robotics and Reinforcement Learning Research and Education with LEGO | Sebastian Dittert et al. | 2406.17490 | translate | read | null |
| 2024-06-25 | CuDA2: An approach for Incorporating Traitor Agents into Cooperative Multi-Agent Systems | Zhen Chen et al. | 2406.17425 | translate | read | null |
| 2024-06-25 | Joint Admission Control and Resource Allocation of Virtual Network Embedding via Hierarchical Deep Reinforcement Learning | Tianfu Wang et al. | 2406.17334 | translate | read | link |
| 2024-06-25 | The State-Action-Reward-State-Action Algorithm in Spatial Prisoner’s Dilemma Game | Lanyu Yang et al. | 2406.17326 | translate | read | null |
| 2024-06-24 | Confidence Aware Inverse Constrained Reinforcement Learning | Sriram Ganapathi Subramanian et al. | 2406.16782 | translate | read | null |
| 2024-06-24 | WARP: On the Benefits of Weight Averaged Rewarded Policies | Alexandre Ramé et al. | 2406.16768 | translate | read | null |
| 2024-06-24 | The MRI Scanner as a Diagnostic: Image-less Active Sampling | Yuning Du et al. | 2406.16754 | translate | read | null |
| 2024-06-24 | OCALM: Object-Centric Assessment with Language Models | Timo Kaufmann et al. | 2406.16748 | translate | read | null |
| 2024-06-24 | Adversarial Contrastive Decoding: Boosting Safety Alignment of Large Language Models via Opposite Prompt Optimization | Zhengyue Zhao et al. | 2406.16743 | translate | read | null |
| 2024-06-24 | Probabilistic Subgoal Representations for Hierarchical Reinforcement learning | Vivienne Huiling Wang et al. | 2406.16707 | translate | read | null |
| 2024-06-24 | Decentralized RL-Based Data Transmission Scheme for Energy Efficient Harvesting | Rafaela Scaciota et al. | 2406.16624 | translate | read | null |
| 2024-06-24 | Towards Physically Talented Aerial Robots with Tactically Smart Swarm Behavior thereof: An Efficient Co-design Approach | Prajit KrisshnaKumar et al. | 2406.16612 | translate | read | null |
| 2024-06-24 | $\text{Alpha}^2$: Discovering Logical Formulaic Alphas using Deep Reinforcement Learning | Feng Xu et al. | 2406.16505 | translate | read | link |
| 2024-06-24 | Towards Comprehensive Preference Data Collection for Reward Modeling | Yulan Hu et al. | 2406.16486 | translate | read | null |
| 2024-06-21 | MantisScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation | Xuan He et al. | 2406.15252 | translate | read | null |
| 2024-06-21 | Open Problem: Order Optimal Regret Bounds for Kernel-Based Reinforcement Learning | Sattar Vakili et al. | 2406.15250 | translate | read | null |
| 2024-06-21 | Deep UAV Path Planning with Assured Connectivity in Dense Urban Setting | Jiyong Oh et al. | 2406.15225 | translate | read | null |
| 2024-06-21 | Gaussian Splatting to Real World Flight Navigation Transfer with Liquid Networks | Alex Quach et al. | 2406.15149 | translate | read | null |
| 2024-06-21 | KalMamba: Towards Efficient Probabilistic State Space Models for RL under Uncertainty | Philipp Becker et al. | 2406.15131 | translate | read | null |
| 2024-06-21 | A Provably Efficient Option-Based Algorithm for both High-Level and Low-Level Learning | Gianluca Drappo et al. | 2406.15124 | translate | read | null |
| 2024-06-21 | Towards General Negotiation Strategies with End-to-End Reinforcement Learning | Bram M. Renting et al. | 2406.15096 | translate | read | null |
| 2024-06-21 | KnobTree: Intelligent Database Parameter Configuration via Explainable Reinforcement Learning | Jiahan Chen et al. | 2406.15073 | translate | read | null |
| 2024-06-21 | Behaviour Distillation | Andrei Lupu et al. | 2406.15042 | translate | read | link |
| 2024-06-21 | SiT: Symmetry-Invariant Transformers for Generalisation in Reinforcement Learning | Matthias Weissenbacher et al. | 2406.15025 | translate | read | null |
| 2024-06-20 | CooHOI: Learning Cooperative Human-Object Interaction with Manipulated Object Dynamics | Jiawei Gao et al. | 2406.14558 | translate | read | null |
| 2024-06-20 | MacroHFT: Memory Augmented Context-aware Reinforcement Learning On High Frequency Trading | Chuqiao Zong et al. | 2406.14537 | translate | read | link |
| 2024-06-20 | RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold | Amrith Setlur et al. | 2406.14532 | translate | read | link |
| 2024-06-20 | Learning telic-controllable state representations | Nadav Amir et al. | 2406.14476 | translate | read | null |
| 2024-06-20 | Rewarding What Matters: Step-by-Step Reinforcement Learning for Task-Oriented Dialogue | Huifang Du et al. | 2406.14457 | translate | read | null |
| 2024-06-20 | Revealing the learning process in reinforcement learning agents through attention-oriented metrics | Charlotte Beylier et al. | 2406.14324 | translate | read | null |
| 2024-06-20 | Resource Optimization for Tail-Based Control in Wireless Networked Control Systems | Rasika Vijithasena et al. | 2406.14301 | translate | read | null |
| 2024-06-21 | REVEAL-IT: REinforcement learning with Visibility of Evolving Agent poLicy for InTerpretability | Shuang Ao et al. | 2406.14214 | translate | read | link |
| 2024-06-20 | Optimizing Novelty of Top-k Recommendations using Large Language Models and Reinforcement Learning | Amit Sharma et al. | 2406.14169 | translate | read | null |
| 2024-06-20 | Iterative Sizing Field Prediction for Adaptive Mesh Generation From Expert Demonstrations | Niklas Freymuth et al. | 2406.14161 | translate | read | link |
| 2024-06-18 | Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts | Haoxiang Wang et al. | 2406.12845 | translate | read | link |
| 2024-06-18 | Injection Optimization at Particle Accelerators via Reinforcement Learning: From Simulation to Real-World Application | Awal Awal et al. | 2406.12735 | translate | read | null |
| 2024-06-18 | A Systematization of the Wagner Framework: Graph Theory Conjectures and Reinforcement Learning | Flora Angileri et al. | 2406.12667 | translate | read | null |
| 2024-06-18 | Reinforcement-Learning based routing for packet-optical networks with hybrid telemetry | A. L. García Navarro et al. | 2406.12602 | translate | read | null |
| 2024-06-18 | Discovering Minimal Reinforcement Learning Environments | Jarek Liesen et al. | 2406.12589 | translate | read | null |
| 2024-06-18 | RichRAG: Crafting Rich Responses for Multi-faceted Queries in Retrieval-Augmented Generation | Shuting Wang et al. | 2406.12566 | translate | read | null |
| 2024-06-18 | A Super-human Vision-based Reinforcement Learning Agent for Autonomous Racing in Gran Turismo | Miguel Vasco et al. | 2406.12563 | translate | read | null |
| 2024-06-18 | Offline Imitation Learning with Model-based Reverse Augmentation | Jie-Jing Shao et al. | 2406.12550 | translate | read | null |
| 2024-06-18 | Demonstrating Agile Flight from Pixels without State Estimation | Ismail Geles et al. | 2406.12505 | translate | read | null |
| 2024-06-18 | Autonomous navigation of catheters and guidewires in mechanical thrombectomy using inverse reinforcement learning | Harry Robertshaw et al. | 2406.12499 | translate | read | null |
| 2024-06-17 | WPO: Enhancing RLHF with Weighted Preference Optimization | Wenxuan Zhou et al. | 2406.11827 | translate | read | link |
| 2024-06-17 | Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics | Runzhe Wu et al. | 2406.11810 | translate | read | null |
| 2024-06-17 | Run Time Assured Reinforcement Learning for Six Degree-of-Freedom Spacecraft Inspection | Kyle Dunlap et al. | 2406.11795 | translate | read | null |
| 2024-06-17 | FetchBench: A Simulation Benchmark for Robot Fetching | Beining Han et al. | 2406.11793 | translate | read | null |
| 2024-06-17 | Optimal Transport-Assisted Risk-Sensitive Q-Learning | Zahra Shahrooei et al. | 2406.11774 | translate | read | null |
| 2024-06-17 | Measuring memorization in RLHF for code completion | Aneesh Pappu et al. | 2406.11715 | translate | read | null |
| 2024-06-17 | The Role of Inherent Bellman Error in Offline Reinforcement Learning with Linear Function Approximation | Noah Golowich et al. | 2406.11686 | translate | read | null |
| 2024-06-17 | Communication-Efficient MARL for Platoon Stability and Energy-efficiency Co-optimization in Cooperative Adaptive Cruise Control of CAVs | Min Hua et al. | 2406.11653 | translate | read | null |
| 2024-06-17 | Linear Bellman Completeness Suffices for Efficient Online Reinforcement Learning with Few Actions | Noah Golowich et al. | 2406.11640 | translate | read | null |
| 2024-06-17 | Style Transfer with Multi-iteration Preference Optimization | Shuai Liu et al. | 2406.11581 | translate | read | null |
| 2024-06-14 | Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs | Rui Yang et al. | 2406.10216 | translate | read | null |
| 2024-06-14 | A Fundamental Trade-off in Aligned Language Models and its Relation to Sampling Adaptors | Naaman Tan et al. | 2406.10203 | translate | read | null |
| 2024-06-14 | Misam: Using ML in Dataflow Selection of Sparse-Sparse Matrix Multiplication | Sanjali Yadav et al. | 2406.10166 | translate | read | null |
| 2024-06-14 | Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models | Carson Denison et al. | 2406.10162 | translate | read | link |
| 2024-06-14 | BiKC: Keypose-Conditioned Consistency Policy for Bimanual Robotic Manipulation | Dongjie Yu et al. | 2406.10093 | translate | read | null |
| 2024-06-14 | PRIMER: Perception-Aware Robust Learning-based Multiagent Trajectory Planner | Kota Kondo et al. | 2406.10060 | translate | read | null |
| 2024-06-14 | Bridging the Communication Gap: Artificial Agents Learning Sign Language through Imitation | Federico Tavella et al. | 2406.10043 | translate | read | null |
| 2024-06-14 | ROAR: Reinforcing Original to Augmented Data Ratio Dynamics for Wav2Vec2.0 Based ASR | Vishwanath Pratap Singh et al. | 2406.09999 | translate | read | null |
| 2024-06-14 | Robust Model-Based Reinforcement Learning with an Adversarial Auxiliary Model | Siemen Herremans et al. | 2406.09976 | translate | read | link |
| 2024-06-14 | InstructRL4Pix: Training Diffusion for Image Editing by Reinforcement Learning | Tiancheng Li et al. | 2406.09973 | translate | read | null |
| 2024-06-13 | Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms | Miaosen Zhang et al. | 2406.09397 | translate | read | null |
| 2024-06-13 | Is Value Learning Really the Main Bottleneck in Offline RL? | Seohong Park et al. | 2406.09329 | translate | read | null |
| 2024-06-13 | OpenVLA: An Open-Source Vision-Language-Action Model | Moo Jin Kim et al. | 2406.09246 | translate | read | null |
| 2024-06-13 | AutomaChef: A Physics-informed Demonstration-guided Learning Framework for Granular Material Manipulation | Minglun Wei et al. | 2406.09178 | translate | read | null |
| 2024-06-13 | Direct Imitation Learning-based Visual Servoing using the Large Projection Formulation | Sayantan Auddy et al. | 2406.09120 | translate | read | null |
| 2024-06-13 | Adaptive Actor-Critic Based Optimal Regulation for Drift-Free Uncertain Nonlinear Systems | Ashwin P. Dani et al. | 2406.09097 | translate | read | null |
| 2024-06-13 | DiffPoGAN: Diffusion Policies with Generative Adversarial Networks for Offline Reinforcement Learning | Xuemin Hu et al. | 2406.09089 | translate | read | null |
| 2024-06-13 | Data-driven modeling and supervisory control system optimization for plug-in hybrid electric vehicles | Hao Zhang et al. | 2406.09082 | translate | read | null |
| 2024-06-13 | Latent Assistance Networks: Rediscovering Hyperbolic Tangents in RL | Jacob E. Kooi et al. | 2406.09079 | translate | read | null |
| 2024-06-13 | Dispelling the Mirage of Progress in Offline MARL through Standardised Baselines and Evaluation | Claude Formanek et al. | 2406.09068 | translate | read | null |
| 2024-06-12 | RILe: Reinforced Imitation Learning | Mert Albaba et al. | 2406.08472 | translate | read | null |
| 2024-06-12 | Adaptive Swarm Mesh Refinement using Deep Reinforcement Learning with Local Rewards | Niklas Freymuth et al. | 2406.08440 | translate | read | null |
| 2024-06-12 | RRLS : Robust Reinforcement Learning Suite | Adil Zouitine et al. | 2406.08406 | translate | read | link |
| 2024-06-12 | Scaling Value Iteration Networks to 5000 Layers for Extreme Long-Term Planning | Yuhui Wang et al. | 2406.08404 | translate | read | null |
| 2024-06-12 | Time-Constrained Robust MDPs | Adil Zouitine et al. | 2406.08395 | translate | read | null |
| 2024-06-12 | Residual Learning and Context Encoding for Adaptive Offline-to-Online Reinforcement Learning | Mohammadreza Nakhaei et al. | 2406.08238 | translate | read | link |
| 2024-06-12 | MaIL: Improving Imitation Learning with Mamba | Xiaogang Jia et al. | 2406.08234 | translate | read | null |
| 2024-06-12 | Explore-Go: Leveraging Exploration for Generalisation in Deep Reinforcement Learning | Max Weltevrede et al. | 2406.08069 | translate | read | null |
| 2024-06-12 | Deep reinforcement learning with positional context for intraday trading | Sven Goluža et al. | 2406.08013 | translate | read | null |
| 2024-06-12 | Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning | Yizhe Huang et al. | 2406.08002 | translate | read | null |
| 2024-06-11 | CDSA: Conservative Denoising Score-based Algorithm for Offline Reinforcement Learning | Zeyuan Liu et al. | 2406.07541 | translate | read | null |
| 2024-06-11 | BAKU: An Efficient Transformer for Multi-Task Policy Learning | Siddhant Haldar et al. | 2406.07539 | translate | read | null |
| 2024-06-11 | Reinforcement Learning from Human Feedback without Reward Inference: Model-Free Algorithm and Instance-Dependent Analysis | Qining Zhang et al. | 2406.07455 | translate | read | null |
| 2024-06-11 | Enhanced Gene Selection in Single-Cell Genomics: Pre-Filtering Synergy and Reinforced Optimization | Weiliang Zhang et al. | 2406.07418 | translate | read | null |
| 2024-06-11 | Federated Multi-Agent DRL for Radio Resource Management in Industrial 6G in-X subnetworks | Bjarke Madsen et al. | 2406.07383 | translate | read | null |
| 2024-06-11 | World Models with Hints of Large Language Models for Goal Achieving | Zeyuan Liu et al. | 2406.07381 | translate | read | null |
| 2024-06-11 | EdgeTimer: Adaptive Multi-Timescale Scheduling in Mobile Edge Computing with Deep Reinforcement Learning | Yijun Hao et al. | 2406.07342 | translate | read | null |
| 2024-06-11 | Beyond Training: Optimizing Reinforcement Learning Based Job Shop Scheduling Through Adaptive Action Sampling | Constantin Waubert de Puiseau et al. | 2406.07325 | translate | read | null |
| 2024-06-11 | Multi-objective Reinforcement learning from AI Feedback | Marcus Williams et al. | 2406.07295 | translate | read | null |
| 2024-06-11 | Hybrid Reinforcement Learning from Offline Observation Alone | Yuda Song et al. | 2406.07253 | translate | read | null |
| 2024-06-10 | Verification-Guided Shielding for Deep Reinforcement Learning | Davide Corsi et al. | 2406.06507 | translate | read | null |
| 2024-06-10 | Adaptive Opponent Policy Detection in Multi-Agent MDPs: Real-Time Strategy Switch Identification Using Running Error Estimation | Mohidul Haque Mridul et al. | 2406.06500 | translate | read | null |
| 2024-06-10 | Boosting Robustness in Preference-Based Reinforcement Learning with Dynamic Sparsity | Calarina Muslimani et al. | 2406.06495 | translate | read | null |
| 2024-06-10 | Towards Real-World Efficiency: Domain Randomization in Reinforcement Learning for Pre-Capture of Free-Floating Moving Targets by Autonomous Robots | Bahador Beigomi et al. | 2406.06460 | translate | read | link |
| 2024-06-10 | Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning? | Denis Tarasov et al. | 2406.06309 | translate | read | link |
| 2024-06-10 | Learning-based cognitive architecture for enhancing coordination in human groups | Antonio Grotta et al. | 2406.06297 | translate | read | null |
| 2024-06-10 | Deep Multi-Objective Reinforcement Learning for Utility-Based Infrastructural Maintenance Optimization | Jesse van Remmerden et al. | 2406.06184 | translate | read | null |
| 2024-06-10 | Mastering truss structure optimization with tree search | Gabriel E. Garayalde et al. | 2406.06145 | translate | read | null |
| 2024-06-10 | EXPIL: Explanatory Predicate Invention for Learning in Games | Jingyuan Sha et al. | 2406.06107 | translate | read | null |
| 2024-06-10 | Sim-To-Real Transfer for Visual Reinforcement Learning of Deformable Object Manipulation for Robot-Assisted Surgery | Paul Maria Scheikl et al. | 2406.06092 | translate | read | null |
| 2024-06-07 | LINX: A Language Driven Generative System for Goal-Oriented Automated Data Exploration | Tavor Lipman et al. | 2406.05107 | translate | read | null |
| 2024-06-07 | Massively Multiagent Minigames for Training Generalist Agents | Kyoung Whan Choe et al. | 2406.05071 | translate | read | link |
| 2024-06-07 | Online Frequency Scheduling by Learning Parallel Actions | Anastasios Giovanidis et al. | 2406.05041 | translate | read | null |
| 2024-06-07 | Optimizing Automatic Differentiation with Deep Reinforcement Learning | Jamie Lohoff et al. | 2406.05027 | translate | read | null |
| 2024-06-07 | Designs for Enabling Collaboration in Human-Machine Teaming via Interactive and Explainable Systems | Rohan Paleja et al. | 2406.05003 | translate | read | null |
| 2024-06-07 | SLOPE: Search with Learned Optimal Pruning-based Expansion | Davor Bokan et al. | 2406.04935 | translate | read | link |
| 2024-06-07 | Sim-to-real Transfer of Deep Reinforcement Learning Agents for Online Coverage Path Planning | Arvi Jonnarth et al. | 2406.04920 | translate | read | null |
| 2024-06-07 | Online Adaptation for Enhancing Imitation Learning Policies | Federico Malato et al. | 2406.04913 | translate | read | link |
| 2024-06-07 | Stabilizing Extreme Q-learning by Maclaurin Expansion | Motoki Omura et al. | 2406.04896 | translate | read | null |
| 2024-06-07 | Primitive Agentic First-Order Optimization | R. Sala et al. | 2406.04841 | translate | read | null |
| 2024-06-06 | ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories | Qianlan Yang et al. | 2406.04323 | translate | read | null |
| 2024-06-06 | Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models | Xiang Ji et al. | 2406.04274 | translate | read | null |
| 2024-06-06 | Multi-Agent Imitation Learning: Value is Easy, Regret is Hard | Jingwu Tang et al. | 2406.04219 | translate | read | null |
| 2024-06-06 | Aligning Agents like Large Language Models | Adam Jelley et al. | 2406.04208 | translate | read | null |
| 2024-06-06 | MARLander: A Local Path Planning for Drone Swarms using Multiagent Deep Reinforcement Learning | Demetros Aschu et al. | 2406.04159 | translate | read | null |
| 2024-06-06 | Deterministic Uncertainty Propagation for Improved Model-Based Offline Reinforcement Learning | Abdullah Akgül et al. | 2406.04088 | translate | read | null |
| 2024-06-06 | Bootstrapping Expectiles in Reinforcement Learning | Pierre Clavier et al. | 2406.04081 | translate | read | null |
| 2024-06-06 | Spatio-temporal Early Prediction based on Multi-objective Reinforcement Learning | Wei Shao et al. | 2406.04035 | translate | read | link |
| 2024-06-06 | Contrastive Sparse Autoencoders for Interpreting Planning of Chess-Playing Agents | Yoann Poupart et al. | 2406.04028 | translate | read | link |
| 2024-06-06 | HackAtari: Atari Learning Environments for Robust and Continual Reinforcement Learning | Quentin Delfosse et al. | 2406.03997 | translate | read | link |
| 2024-06-05 | Automating Turkish Educational Quiz Generation Using Large Language Models | Kamyar Zeinalipour et al. | 2406.03397 | translate | read | null |
| 2024-06-05 | LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback | Timon Ziegenbein et al. | 2406.03363 | translate | read | link |
| 2024-06-05 | UDQL: Bridging The Gap between MSE Loss and The Optimal Value Function in Offline Reinforcement Learning | Yu Zhang et al. | 2406.03324 | translate | read | null |
| 2024-06-05 | Revisiting Scalable Hessian Diagonal Approximations for Applications in Reinforcement Learning | Mohamed Elsayed et al. | 2406.03276 | translate | read | null |
| 2024-06-05 | Prompt-based Visual Alignment for Zero-shot Policy Transfer | Haihan Gao et al. | 2406.03250 | translate | read | null |
| 2024-06-05 | Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning | Inwoo Hwang et al. | 2406.03234 | translate | read | link |
| 2024-06-05 | CommonPower: Supercharging Machine Learning for Smart Grids | Michael Eichelbeck et al. | 2406.03231 | translate | read | link |
| 2024-06-05 | Object Manipulation in Marine Environments using Reinforcement Learning | Ahmed Nader et al. | 2406.03223 | translate | read | null |
| 2024-06-05 | Adaptive Distance Functions via Kelvin Transformation | Rafael I. Cabral Muchacho et al. | 2406.03200 | translate | read | null |
| 2024-06-05 | DEER: A Delay-Resilient Framework for Reinforcement Learning with Variable Delays | Bo Xia et al. | 2406.03102 | translate | read | null |
| 2024-06-04 | RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots | Soroush Nasiriany et al. | 2406.02523 | translate | read | link |
| 2024-06-04 | Offline Bayesian Aleatoric and Epistemic Uncertainty Quantification and Posterior Value Optimisation in Finite-State MDPs | Filippo Valdettaro et al. | 2406.02456 | translate | read | null |
| 2024-06-04 | A Generalized Apprenticeship Learning Framework for Modeling Heterogeneous Student Pedagogical Strategies | Md Mirajul Islam et al. | 2406.02450 | translate | read | null |
| 2024-06-04 | Algorithmic Collusion in Dynamic Pricing with Deep Reinforcement Learning | Shidi Deng et al. | 2406.02437 | translate | read | null |
| 2024-06-04 | Seed-TTS: A Family of High-Quality Versatile Speech Generation Models | Philip Anastassiou et al. | 2406.02430 | translate | read | link |
| 2024-06-04 | Query-based Semantic Gaussian Field for Scene Representation in Reinforcement Learning | Jiaxu Wang et al. | 2406.02370 | translate | read | null |
| 2024-06-04 | How to Explore with Belief: State Entropy Maximization in POMDPs | Riccardo Zamboni et al. | 2406.02295 | translate | read | null |
| 2024-06-04 | Smaller Batches, Bigger Gains? Investigating the Impact of Batch Sizes on Reinforcement Learning Based Real-World Production Scheduling | Arthur Müller et al. | 2406.02294 | translate | read | null |
| 2024-06-04 | Test-Time Regret Minimization in Meta Reinforcement Learning | Mirco Mutti et al. | 2406.02282 | translate | read | null |
| 2024-06-04 | Reinforcement Learning with Lookahead Information | Nadav Merlis et al. | 2406.02258 | translate | read | null |
| 2024-06-03 | Fusion-PSRO: Nash Policy Fusion for Policy Space Response Oracles | Jiesong Lian et al. | 2405.21027 | translate | read | null |

(<a href=../Reinforcement_Learning.md>back to Reinforcement Learning</a>)