Reinforcement Learning - 2024-10

| Publish Date | Title | Authors | PDF | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-10-31 | EgoMimic: Scaling Imitation Learning via Egocentric Video | Simar Kareer et al. | 2410.24221 | translate | read | link |
| 2024-10-31 | Teaching Embodied Reinforcement Learning Agents: Informativeness and Diversity of Language Use | Jiajun Xi et al. | 2410.24218 | translate | read | link |
| 2024-10-31 | ARQ: A Mixed-Precision Quantization Framework for Accurate and Certifiably Robust DNNs | Yuchen Yang et al. | 2410.24214 | translate | read | null |
| 2024-10-31 | Zonal RL-RRT: Integrated RL-RRT Path Planning with Collision Probability and Zone Connectivity | AmirMohammad Tahmasbi et al. | 2410.24205 | translate | read | link |
| 2024-10-31 | DexMimicGen: Automated Data Generation for Bimanual Dexterous Manipulation via Imitation Learning | Zhenyu Jiang et al. | 2410.24185 | translate | read | null |
| 2024-10-31 | Language-Driven Policy Distillation for Cooperative Driving in Multi-Agent Reinforcement Learning | Jiaqi Liu et al. | 2410.24152 | translate | read | null |
| 2024-10-31 | Reinforcement Learning Gradients as Vitamin for Online Finetuning Decision Transformers | Kai Yan et al. | 2410.24108 | translate | read | link |
| 2024-10-31 | Progressive Safeguards for Safe and Model-Agnostic Reinforcement Learning | Nabil Omi et al. | 2410.24096 | translate | read | null |
| 2024-10-31 | 3D-ViTac: Learning Fine-Grained Manipulation with Visuo-Tactile Sensing | Binghao Huang et al. | 2410.24091 | translate | read | null |
| 2024-10-31 | Demystifying Linear MDPs and Novel Dynamics Aggregation Framework | Joongkyu Lee et al. | 2410.24089 | translate | read | null |
| 2024-10-30 | Keypoint Abstraction using Large Models for Object-Relative Imitation Learning | Xiaolin Fang et al. | 2410.23254 | translate | read | null |
| 2024-10-30 | Carrot and Stick: Eliciting Comparison Data and Beyond | Yiling Chen et al. | 2410.23243 | translate | read | null |
| 2024-10-30 | A little less conversation, a little more action, please: Investigating the physical common-sense of LLMs in a 3D embodied environment | Matteo G. Mecattaf et al. | 2410.23242 | translate | read | null |
| 2024-10-30 | COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences | Yixin Liu et al. | 2410.23223 | translate | read | link |
| 2024-10-31 | Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval | Sheryl Hsu et al. | 2410.23214 | translate | read | null |
| 2024-10-30 | Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks | Michael Matthews et al. | 2410.23208 | translate | read | null |
| 2024-10-30 | Energy-Efficient Intra-Domain Network Slicing for Multi-Layer Orchestration in Intelligent-Driven Distributed 6G Networks: Learning Generic Assignment Skills with Unsupervised Reinforcement Learning | Navideh Ghafouri et al. | 2410.23161 | translate | read | null |
| 2024-10-30 | VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning | Yichao Liang et al. | 2410.23156 | translate | read | null |
| 2024-10-30 | From Hype to Reality: The Road Ahead of Deploying DRL in 6G Networks | Haiyuan Li et al. | 2410.23086 | translate | read | null |
| 2024-10-30 | Offline Reinforcement Learning and Sequence Modeling for Downlink Link Adaptation | Samuele Peri et al. | 2410.23031 | translate | read | null |
| 2024-10-29 | Environment as Policy: Learning to Race in Unseen Tracks | Hongze Wang et al. | 2410.22308 | translate | read | null |
| 2024-10-29 | EconoJax: A Fast & Scalable Economic Simulation in Jax | Koen Ponse et al. | 2410.22165 | translate | read | link |
| 2024-10-29 | Learning Successor Features the Simple Way | Raymond Chua et al. | 2410.22133 | translate | read | null |
| 2024-10-29 | PC-Gym: Benchmark Environments For Process Control Problems | Maximilian Bloor et al. | 2410.22093 | translate | read | null |
| 2024-10-29 | PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference | Kendong Liu et al. | 2410.21966 | translate | read | null |
| 2024-10-29 | Human-Readable Programs as Actors of Reinforcement Learning Agents Using Critic-Moderated Evolution | Senne Deproost et al. | 2410.21940 | translate | read | link |
| 2024-10-29 | Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning | Jianlan Luo et al. | 2410.21845 | translate | read | link |
| 2024-10-29 | Robot Policy Learning with Temporal Optimal Transport Reward | Yuwei Fu et al. | 2410.21795 | translate | read | link |
| 2024-10-29 | Stochastic Approximation with Unbounded Markovian Noise: A General-Purpose Theorem | Shaan Ul Haque et al. | 2410.21704 | translate | read | null |
| 2024-10-29 | Sequential choice in ordered bundles | Rajeev Kohli et al. | 2410.21670 | translate | read | null |
| 2024-10-28 | LongReward: Improving Long-context Large Language Models with AI Feedback | Jiajie Zhang et al. | 2410.21252 | translate | read | link |
| 2024-10-28 | Quantum Reinforcement Learning-Based Two-Stage Unit Commitment Framework for Enhanced Power Systems Robustness | Xiang Wei et al. | 2410.21240 | translate | read | null |
| 2024-10-28 | Offline Reinforcement Learning With Combinatorial Action Spaces | Matthew Landers et al. | 2410.21151 | translate | read | null |
| 2024-10-28 | Robustness and Generalization in Quantum Reinforcement Learning via Lipschitz Regularization | Nico Meyer et al. | 2410.21117 | translate | read | link |
| 2024-10-28 | Dual-Agent Deep Reinforcement Learning for Dynamic Pricing and Replenishment | Yi Zheng et al. | 2410.21109 | translate | read | null |
| 2024-10-28 | Stronger Regret Bounds for Safe Online Reinforcement Learning in the Linear Quadratic Regulator | Benjamin Schiffer et al. | 2410.21081 | translate | read | null |
| 2024-10-28 | Getting By Goal Misgeneralization With a Little Help From a Mentor | Tu Trinh et al. | 2410.21052 | translate | read | null |
| 2024-10-28 | FairStream: Fair Multimedia Streaming Benchmark for Reinforcement Learning Agents | Jannis Weil et al. | 2410.21029 | translate | read | null |
| 2024-10-28 | Reference-Free Formula Drift with Reinforcement Learning: From Driving Data to Tire Energy-Inspired, Real-World Policies | Franck Djeumou et al. | 2410.20990 | translate | read | null |
| 2024-10-28 | BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks | Yunhan Zhao et al. | 2410.20971 | translate | read | null |
| 2024-10-25 | Adversarial Environment Design via Regret-Guided Diffusion Models | Hojun Chung et al. | 2410.19715 | translate | read | null |
| 2024-10-25 | DA-VIL: Adaptive Dual-Arm Manipulation with Reinforcement Learning and Variable Impedance Control | Md Faizal Karim et al. | 2410.19712 | translate | read | null |
| 2024-10-25 | MILES: Making Imitation Learning Easy with Self-Supervision | Georgios Papagiannis et al. | 2410.19693 | translate | read | null |
| 2024-10-25 | Automated generation of photonic circuits for Bell tests with homodyne measurements | Corentin Lanore et al. | 2410.19670 | translate | read | null |
| 2024-10-25 | MetaTrading: An Immersion-Aware Model Trading Framework for Vehicular Metaverse Services | Hongjia Wu et al. | 2410.19665 | translate | read | null |
| 2024-10-25 | Shared Control with Black Box Agents using Oracle Queries | Inbal Avraham et al. | 2410.19612 | translate | read | null |
| 2024-10-25 | OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization | Hongliang He et al. | 2410.19609 | translate | read | link |
| 2024-10-25 | Diverse Sign Language Translation | Xin Shen et al. | 2410.19586 | translate | read | null |
| 2024-10-25 | Robotic Learning in your Backyard: A Neural Simulator from Open Source Components | Liyou Zhou et al. | 2410.19564 | translate | read | null |
| 2024-10-25 | AgentForge: A Flexible Low-Code Platform for Reinforcement Learning Agent Design | Francisco Erivaldo Fernandes Junior et al. | 2410.19528 | translate | read | null |
| 2024-10-24 | SkillMimicGen: Automated Demonstration Generation for Efficient Skill Learning and Deployment | Caelan Garrett et al. | 2410.18907 | translate | read | null |
| 2024-10-24 | Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks | Graziano A. Manduzio et al. | 2410.18890 | translate | read | null |
| 2024-10-24 | Diff-Instruct++: Training One-step Text-to-image Generator Model to Align with Human Preferences | Weijian Luo et al. | 2410.18881 | translate | read | null |
| 2024-10-24 | Learning Collusion in Episodic, Inventory-Constrained Markets | Paul Friedrich et al. | 2410.18871 | translate | read | null |
| 2024-10-24 | Towards Visual Text Design Transfer Across Languages | Yejin Choi et al. | 2410.18823 | translate | read | null |
| 2024-10-24 | PointPatchRL – Masked Reconstruction Improves Reinforcement Learning on Point Clouds | Balázs Gyenes et al. | 2410.18800 | translate | read | null |
| 2024-10-24 | Adapting MLOps for Diverse In-Network Intelligence in 6G Era: Challenges and Solutions | Peizheng Li et al. | 2410.18793 | translate | read | null |
| 2024-10-24 | Data Scaling Laws in Imitation Learning for Robotic Manipulation | Fanqi Lin et al. | 2410.18647 | translate | read | link |
| 2024-10-24 | Multi-agent cooperation through learning-aware policy gradients | Alexander Meulemans et al. | 2410.18636 | translate | read | null |
| 2024-10-24 | Leveraging Graph Neural Networks and Multi-Agent Reinforcement Learning for Inventory Control in Supply Chains | Niki Kotecha et al. | 2410.18631 | translate | read | null |
| 2024-10-23 | Prioritized Generative Replay | Renhao Wang et al. | 2410.18082 | translate | read | null |
| 2024-10-23 | Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration | Max Wilcoxson et al. | 2410.18076 | translate | read | link |
| 2024-10-23 | SPIRE: Synergistic Planning, Imitation, and Reinforcement Learning for Long-Horizon Manipulation | Zihan Zhou et al. | 2410.18065 | translate | read | null |
| 2024-10-23 | Cross-lingual Transfer of Reward Models in Multilingual Alignment | Jiwoo Hong et al. | 2410.18027 | translate | read | link |
| 2024-10-23 | Dynamic Spectrum Access for Ambient Backscatter Communication-assisted D2D Systems with Quantum Reinforcement Learning | Nguyen Van Huynh et al. | 2410.17971 | translate | read | null |
| 2024-10-23 | Slot: Provenance-Driven APT Detection through Graph Reinforcement Learning | Wei Qiao et al. | 2410.17910 | translate | read | null |
| 2024-10-23 | Reinforcement Learning under Latent Dynamics: Toward Statistical and Algorithmic Modularity | Philip Amortila et al. | 2410.17904 | translate | read | null |
| 2024-10-23 | Scalable Offline Reinforcement Learning for Mean Field Games | Axel Brunnbauer et al. | 2410.17898 | translate | read | null |
| 2024-10-23 | Learning Versatile Skills with Curriculum Masking | Yao Tang et al. | 2410.17744 | translate | read | link |
| 2024-10-23 | Optimizing Load Scheduling in Power Grids Using Reinforcement Learning and Markov Decision Processes | Dongwen Luo et al. | 2410.17696 | translate | read | null |
| 2024-10-22 | Few-shot In-Context Preference Learning Using Large Language Models | Chao Yu et al. | 2410.17233 | translate | read | null |
| 2024-10-22 | DyPNIPP: Predicting Environment Dynamics for RL-based Robust Informative Path Planning | Srujan Deolasee et al. | 2410.17186 | translate | read | null |
| 2024-10-22 | Reinforcement learning on structure-conditioned categorical diffusion for protein inverse folding | Yasha Ektefaie et al. | 2410.17173 | translate | read | link |
| 2024-10-22 | Reinforcement Learning for Data-Driven Workflows in Radio Interferometry. I. Principal Demonstration in Calibration | Brian M. Kirk et al. | 2410.17135 | translate | read | null |
| 2024-10-22 | Exploring RL-based LLM Training for Formal Language Tasks with Programmed Rewards | Alexander G. Padula et al. | 2410.17126 | translate | read | link |
| 2024-10-22 | Science Out of Its Ivory Tower: Improving Accessibility with Reinforcement Learning | Haining Wang et al. | 2410.17088 | translate | read | link |
| 2024-10-22 | Delay-Constrained Grant-Free Random Access in MIMO Systems: Distributed Pilot Allocation and Power Control | Jianan Bai et al. | 2410.17068 | translate | read | null |
| 2024-10-22 | Optimal Design for Reward Modeling in RLHF | Antoine Scheid et al. | 2410.17055 | translate | read | null |
| 2024-10-22 | Proleptic Temporal Ensemble for Improving the Speed of Robot Tasks Generated by Imitation Learning | Hyeonjun Park et al. | 2410.16981 | translate | read | null |
| 2024-10-22 | Safe Load Balancing in Software-Defined-Networking | Lam Dinh et al. | 2410.16846 | translate | read | null |
| 2024-10-21 | Improve Vision Language Model Chain-of-thought Reasoning | Ruohong Zhang et al. | 2410.16198 | translate | read | link |
| 2024-10-21 | RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style | Yantao Liu et al. | 2410.16184 | translate | read | link |
| 2024-10-21 | SMART: Self-learning Meta-strategy Agent for Reasoning Tasks | Rongxing Liu et al. | 2410.16128 | translate | read | link |
| 2024-10-21 | Statistical Inference for Temporal Difference Learning with Linear Function Approximation | Weichen Wu et al. | 2410.16106 | translate | read | null |
| 2024-10-21 | A New Approach to Solving SMAC Task: Generating Decision Tree Code from Large Language Models | Yue Deng et al. | 2410.16024 | translate | read | link |
| 2024-10-21 | Information-Theoretic Minimax Regret Bounds for Reinforcement Learning based on Duality | Raghav Bongole et al. | 2410.16013 | translate | read | null |
| 2024-10-21 | ARCADE: Scalable Demonstration Collection and Generation via Augmented Reality for Imitation Learning | Yue Yang et al. | 2410.15994 | translate | read | null |
| 2024-10-21 | Learning Quadrotor Control From Visual Features Using Differentiable Simulation | Johannes Heeg et al. | 2410.15979 | translate | read | null |
| 2024-10-21 | Diverse Policies Recovering via Pointwise Mutual Information Weighted Imitation Learning | Hanlin Yang et al. | 2410.15910 | translate | read | null |
| 2024-10-21 | FlickerFusion: Intra-trajectory Domain Generalizing Multi-Agent RL | Woosung Koh et al. | 2410.15876 | translate | read | link |
| 2024-10-18 | Online Reinforcement Learning with Passive Memory | Anay Pattanaik et al. | 2410.14665 | translate | read | null |
| 2024-10-18 | A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning | Shengjie Sun et al. | 2410.14660 | translate | read | null |
| 2024-10-18 | Harnessing Causality in Reinforcement Learning With Bagged Decision Times | Daiqi Gao et al. | 2410.14659 | translate | read | null |
| 2024-10-18 | Benchmarking Deep Reinforcement Learning for Navigation in Denied Sensor Environments | Mariusz Wisniewski et al. | 2410.14616 | translate | read | link |
| 2024-10-18 | Streaming Deep Reinforcement Learning Finally Works | Mohamed Elsayed et al. | 2410.14606 | translate | read | link |
| 2024-10-18 | Reinforcement Learning in Non-Markov Market-Making | Luca Lalor et al. | 2410.14504 | translate | read | null |
| 2024-10-18 | Transfer Reinforcement Learning in Heterogeneous Action Spaces using Subgoal Mapping | Kavinayan P. Sivakumar et al. | 2410.14484 | translate | read | null |
| 2024-10-18 | DRL Optimization Trajectory Generation via Wireless Network Intent-Guided Diffusion Models for Optimizing Resource Allocation | Junjie Wu et al. | 2410.14481 | translate | read | null |
| 2024-10-18 | From Simple to Complex: Knowledge Transfer in Safe and Efficient Reinforcement Learning for Autonomous Driving | Rongliang Zhou et al. | 2410.14468 | translate | read | null |
| 2024-10-18 | MARLIN: Multi-Agent Reinforcement Learning Guided by Language-Based Inter-Robot Negotiation | Toby Godfrey et al. | 2410.14383 | translate | read | null |
| 2024-10-17 | Diffusing States and Matching Scores: A New Framework for Imitation Learning | Runzhe Wu et al. | 2410.13855 | translate | read | link |
| 2024-10-17 | ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization | Chen Bo Calvin Zhang et al. | 2410.13837 | translate | read | link |
| 2024-10-17 | A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement | Hui Yuan et al. | 2410.13828 | translate | read | link |
| 2024-10-17 | Guided Reinforcement Learning for Robust Multi-Contact Loco-Manipulation | Jean-Pierre Sleiman et al. | 2410.13817 | translate | read | null |
| 2024-10-17 | Is Prior-Free Black-Box Non-Stationary Reinforcement Learning Feasible? | Argyrios Gerogiannis et al. | 2410.13772 | translate | read | null |
| 2024-10-17 | Transformer Guided Coevolution: Improved Team Formation in Multiagent Adversarial Games | Pranav Rajbhandari et al. | 2410.13769 | translate | read | null |
| 2024-10-17 | Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design | Chenyu Wang et al. | 2410.13643 | translate | read | link |
| 2024-10-17 | Ornstein-Uhlenbeck Adaptation as a Mechanism for Learning in Brains and Machines | Jesus Garcia Fernandez et al. | 2410.13563 | translate | read | null |
| 2024-10-17 | Contracting With a Reinforcement Learning Agent by Playing Trick or Treat | Matteo Bollini et al. | 2410.13520 | translate | read | null |
| 2024-10-17 | Integrating Large Language Models and Reinforcement Learning for Non-Linear Reasoning | Yoav Alon et al. | 2410.13501 | translate | read | null |
| 2024-10-16 | Neural-based Control for CubeSat Docking Maneuvers | Matteo Stoisa et al. | 2410.12703 | translate | read | null |
| 2024-10-16 | Dynamic Learning Rate for Deep Reinforcement Learning: A Bandit Approach | Henrique Donâncio et al. | 2410.12598 | translate | read | null |
| 2024-10-16 | Robust RL with LLM-Driven Data Synthesis and Policy Adaptation for Autonomous Driving | Sihao Wu et al. | 2410.12568 | translate | read | null |
| 2024-10-16 | Spectrum Sharing using Deep Reinforcement Learning in Vehicular Networks | Riya Dinesh Deshpande et al. | 2410.12521 | translate | read | null |
| 2024-10-16 | Insights from the Inverse: Reconstructing LLM Training Goals Through Inverse RL | Jared Joselowitz et al. | 2410.12491 | translate | read | null |
| 2024-10-16 | SAC-GLAM: Improving Online RL for LLM agents with Soft Actor-Critic and Hindsight Relabeling | Loris Gaven et al. | 2410.12481 | translate | read | null |
| 2024-10-16 | Sharpness-Aware Black-Box Optimization | Feiyang Ye et al. | 2410.12457 | translate | read | null |
| 2024-10-16 | AoI-Aware Resource Allocation for Smart Multi-QoS Provisioning | Jingqing Wang et al. | 2410.12384 | translate | read | null |
| 2024-10-16 | PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking | Markus J. Buehler et al. | 2410.12375 | translate | read | link |
| 2024-10-16 | GAN Based Top-Down View Synthesis in Reinforcement Learning Environments | Usama Younus et al. | 2410.12372 | translate | read | null |
| 2024-10-15 | Molecular Quantum Control Algorithm Design by Reinforcement Learning | Anastasia Pipi et al. | 2410.11839 | translate | read | null |
| 2024-10-15 | Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions | Ayush Jain et al. | 2410.11833 | translate | read | null |
| 2024-10-15 | Learning Smooth Humanoid Locomotion through Lipschitz-Constrained Policies | Zixuan Chen et al. | 2410.11825 | translate | read | null |
| 2024-10-15 | Solving The Dynamic Volatility Fitting Problem: A Deep Reinforcement Learning Approach | Emmanuel Gnabeyeu et al. | 2410.11789 | translate | read | null |
| 2024-10-15 | Zero-shot Model-based Reinforcement Learning using Large Language Models | Abdelhakim Benechehab et al. | 2410.11711 | translate | read | link |
| 2024-10-15 | BlendRL: A Framework for Merging Symbolic and Neural Policy Learning | Hikaru Shindo et al. | 2410.11689 | translate | read | null |
| 2024-10-15 | Understanding Likelihood Over-optimisation in Direct Alignment Algorithms | Zhengyan Shi et al. | 2410.11677 | translate | read | null |
| 2024-10-15 | Safety Filtering While Training: Improving the Performance and Sample Efficiency of Reinforcement Learning Agents | Federico Pizarro Bejarano et al. | 2410.11671 | translate | read | link |
| 2024-10-15 | Improve Value Estimation of Q Function and Reshape Reward with Monte Carlo Tree Search | Jiamian Li et al. | 2410.11642 | translate | read | null |
| 2024-10-15 | DeformPAM: Data-Efficient Learning for Long-horizon Deformable Object Manipulation via Preference-based Action Alignment | Wendi Chen et al. | 2410.11584 | translate | read | link |
| 2024-10-14 | Adaptive Diffusion Terrain Generator for Autonomous Uneven Terrain Navigation | Youwei Yu et al. | 2410.10766 | translate | read | null |
| 2024-10-14 | Online Statistical Inference for Time-varying Sample-averaged Q-learning | Saunak Kumar Panda et al. | 2410.10737 | translate | read | null |
| 2024-10-14 | Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach | Rory Young et al. | 2410.10674 | translate | read | null |
| 2024-10-14 | Transforming Game Play: A Comparative Study of DCQN and DTQN Architectures in Reinforcement Learning | William A. Stigall et al. | 2410.10660 | translate | read | null |
| 2024-10-14 | DR-MPC: Deep Residual Model Predictive Control for Real-world Social Navigation | James R. Han et al. | 2410.10646 | translate | read | null |
| 2024-10-14 | Traversability-Aware Legged Navigation by Learning from Real-World Visual Data | Hongbo Zhang et al. | 2410.10621 | translate | read | null |
| 2024-10-14 | Online waveform selection for cognitive radar | Thulasi Tholeti et al. | 2410.10591 | translate | read | null |
| 2024-10-14 | STACKFEED: Structured Textual Actor-Critic Knowledge Base Editing with FeedBack | Naman Gupta et al. | 2410.10584 | translate | read | null |
| 2024-10-14 | Burning RED: Unlocking Subtask-Driven Reinforcement Learning and Risk-Awareness in Average-Reward Markov Decision Processes | Juan Sebastian Rojas et al. | 2410.10578 | translate | read | null |
| 2024-10-14 | Continual Deep Reinforcement Learning to Prevent Catastrophic Forgetting in Jamming Mitigation | Kemal Davaslioglu et al. | 2410.10521 | translate | read | null |
| 2024-10-11 | Hierarchical Universal Value Function Approximators | Rushiv Arora et al. | 2410.08997 | translate | read | null |
| 2024-10-11 | Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control | Devdhar Patel et al. | 2410.08979 | translate | read | null |
| 2024-10-11 | MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL | Claas A Voelcker et al. | 2410.08896 | translate | read | null |
| 2024-10-11 | Drama: Mamba-Enabled Model-Based Reinforcement Learning Is Sample and Parameter Efficient | Wenlong Wang et al. | 2410.08893 | translate | read | link |
| 2024-10-11 | Adaptive optimization of wave energy conversion in oscillatory wave surge converters via SPH simulation and deep reinforcement learning | Mai Ye et al. | 2410.08871 | translate | read | null |
| 2024-10-11 | Can we hop in general? A discussion of benchmark selection and design using the Hopper environment | Claas A Voelcker et al. | 2410.08870 | translate | read | null |
| 2024-10-11 | Hybrid LLM-DDQN based Joint Optimization of V2I Communication and Autonomous Driving | Zijiang Yan et al. | 2410.08854 | translate | read | null |
| 2024-10-11 | Conformalized Interactive Imitation Learning: Handling Expert Shift and Intermittent Feedback | Michelle Zhao et al. | 2410.08852 | translate | read | null |
| 2024-10-11 | Public Transport Network Design for Equality of Accessibility via Message Passing Neural Networks and Reinforcement Learning | Duo Wang et al. | 2410.08841 | translate | read | null |
| 2024-10-11 | SOLD: Reinforcement Learning with Slot Object-Centric Latent Dynamics | Malte Mosbach et al. | 2410.08822 | translate | read | null |
| 2024-10-10 | GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment | Yuancheng Xu et al. | 2410.08193 | translate | read | null |
| 2024-10-10 | Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning | Amrith Setlur et al. | 2410.08146 | translate | read | null |
| 2024-10-10 | VerifierQ: Enhancing LLM Test Time Compute with Q-Learning-based Verifiers | Jianing Qi et al. | 2410.08048 | translate | read | null |
| 2024-10-10 | Probabilistic Satisfaction of Temporal Logic Constraints in Reinforcement Learning via Adaptive Policy-Switching | Xiaoshan Lin et al. | 2410.08022 | translate | read | null |
| 2024-10-10 | Neuroplastic Expansion in Deep Reinforcement Learning | Jiashun Liu et al. | 2410.07994 | translate | read | null |
| 2024-10-10 | Variational Inequality Methods for Multi-Agent Reinforcement Learning: Performance and Stability Gains | Baraah A. M. Sidahmed et al. | 2410.07976 | translate | read | null |
| 2024-10-10 | AI Surrogate Model for Distributed Computing Workloads | David K. Park et al. | 2410.07940 | translate | read | null |
| 2024-10-10 | Offline Hierarchical Reinforcement Learning via Inverse Optimization | Carolin Schmidt et al. | 2410.07933 | translate | read | null |
| 2024-10-10 | Efficient Reinforcement Learning with Large Language Model Priors | Xue Yan et al. | 2410.07927 | translate | read | null |
| 2024-10-10 | Meta-Learning Integration in Hierarchical Reinforcement Learning for Advanced Task Complexity | Arash Khajooeinejad et al. | 2410.07921 | translate | read | link |
| 2024-10-09 | One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation | Fabian Paischer et al. | 2410.07170 | translate | read | null |
| 2024-10-09 | Retrieval-Augmented Decision Transformer: External Memory for In-context RL | Thomas Schmied et al. | 2410.07071 | translate | read | null |
| 2024-10-09 | Safe Reinforcement Learning Filter for Multicopter Collision-Free Tracking under disturbances | Qihan Qi et al. | 2410.06852 | translate | read | null |
| 2024-10-09 | A Safety Modulator Actor-Critic Method in Model-Free Safe Reinforcement Learning and Application in UAV Hovering | Qihan Qi et al. | 2410.06847 | translate | read | null |
| 2024-10-09 | Transfer Learning for a Class of Cascade Dynamical Systems | Shima Rabiei et al. | 2410.06828 | translate | read | null |
| 2024-10-09 | Deep End-to-End Survival Analysis with Temporal Consistency | Mariana Vargas Vieyra et al. | 2410.06786 | translate | read | null |
| 2024-10-09 | Q-WSL: Leveraging Dynamic Programming for Weighted Supervised Learning in Goal-conditioned RL | Xing Lei et al. | 2410.06648 | translate | read | null |
| 2024-10-09 | Variations in Multi-Agent Actor-Critic Frameworks for Joint Optimizations in UAV Swarm Networks: Recent Evolution, Challenges, and Directions | Muhammad Morshed Alam et al. | 2410.06627 | translate | read | null |
| 2024-10-09 | Effective Exploration Based on the Structural Information Principles | Xianghua Zeng et al. | 2410.06621 | translate | read | null |
| 2024-10-09 | Disturbance Observer-based Control Barrier Functions with Residual Model Learning for Safe Reinforcement Learning | Dvij Kalaria et al. | 2410.06570 | translate | read | null |
| 2024-10-07 | DART: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control | Kaifeng Zhao et al. | 2410.05260 | translate | read | null |
| 2024-10-07 | SePPO: Semi-Policy Preference Optimization for Diffusion Alignment | Daoan Zhang et al. | 2410.05255 | translate | read | link |
| 2024-10-07 | ETGL-DDPG: A Deep Deterministic Policy Gradient Algorithm for Sparse Reward Continuous Control | Ehsan Futuhi et al. | 2410.05225 | translate | read | null |
| 2024-10-07 | Smart Jamming Attack and Mitigation on Deep Transfer Reinforcement Learning Enabled Resource Allocation for Network Slicing | Shavbo Salehi et al. | 2410.05153 | translate | read | null |
| 2024-10-07 | PAMLR: A Passive-Active Multi-Armed Bandit-Based Solution for LoRa Channel Allocation | Jihoon Yun et al. | 2410.05147 | translate | read | null |
| 2024-10-07 | Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning | Ayano Hiranaka et al. | 2410.05116 | translate | read | null |
| 2024-10-07 | AlphaRouter: Quantum Circuit Routing with Reinforcement Learning and Tree Search | Wei Tang et al. | 2410.05115 | translate | read | null |
| 2024-10-07 | Reinforcement Learning Control for Autonomous Hydraulic Material Handling Machines with Underactuated Tools | Filippo A. Spinelli et al. | 2410.05093 | translate | read | null |
| 2024-10-07 | HE-Drive: Human-Like End-to-End Driving with Vision Language Models | Junming Wang et al. | 2410.05051 | translate | read | null |
| 2024-10-07 | Active Fine-Tuning of Generalist Policies | Marco Bagatella et al. | 2410.05026 | translate | read | null |
| 2024-10-04 | Learning Humanoid Locomotion over Challenging Terrain | Ilija Radosavovic et al. | 2410.03654 | translate | read | null |
| 2024-10-04 | Aligning LLMs with Individual Preferences via Interaction | Shujin Wu et al. | 2410.03642 | translate | read | link |
| 2024-10-04 | Robust Offline Imitation Learning from Diverse Auxiliary Data | Udita Ghosh et al. | 2410.03626 | translate | read | null |
| 2024-10-04 | Open-World Reinforcement Learning over Long Short-Term Imagination | Jiajian Li et al. | 2410.03618 | translate | read | null |
| 2024-10-04 | Training on more Reachable Tasks for Generalisation in Reinforcement Learning | Max Weltevrede et al. | 2410.03565 | translate | read | null |
| 2024-10-04 | GAP-RL: Grasps As Points for RL Towards Dynamic Object Grasping | Pengwei Xie et al. | 2410.03509 | translate | read | null |
| 2024-10-04 | STREAMS: An Assistive Multimodal AI Framework for Empowering Biosignal Based Robotic Controls | Ali Rabiee et al. | 2410.03486 | translate | read | null |
| 2024-10-04 | Deep Reinforcement Learning for Delay-Optimized Task Offloading in Vehicular Fog Computing | Mohammad Parsa Toopchinezhad et al. | 2410.03472 | translate | read | null |
| 2024-10-04 | CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control | Guy Tevet et al. | 2410.03441 | translate | read | link |
| 2024-10-04 | ToolGen: Unified Tool Retrieval and Calling via Generation | Renxi Wang et al. | 2410.03439 | translate | read | link |
| 2024-10-03 | ReLIC: A Recipe for 64k Steps of In-Context Reinforcement Learning for Embodied AI | Ahmad Elawady et al. | 2410.02751 | translate | read | link |
| 2024-10-03 | MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions | Yekun Chai et al. | 2410.02743 | translate | read | link |
| 2024-10-03 | DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects | Zhaowei Wang et al. | 2410.02730 | translate | read | link |
| 2024-10-03 | Grounded Answers for Multi-agent Decision-making Problem through Generative World Model | Zeyang Liu et al. | 2410.02664 | translate | read | null |
| 2024-10-03 | Beyond Expected Returns: A Policy Gradient Algorithm for Cumulative Prospect Theoretic Reinforcement Learning | Olivier Lepel et al. | 2410.02605 | translate | read | null |
| 2024-10-03 | Boosting Sample Efficiency and Generalization in Multi-agent Reinforcement Learning via Equivariance | Joshua McClellan et al. | 2410.02581 | translate | read | null |
| 2024-10-03 | Machine Learning Approaches for Active Queue Management: A Survey, Taxonomy, and Future Directions | Mohammad Parsa Toopchinezhad et al. | 2410.02563 | translate | read | null |
| 2024-10-03 | Semantic-Guided RL for Interpretable Feature Engineering | Mohamed Bouadi et al. | 2410.02519 | translate | read | null |
| 2024-10-03 | Learning Emergence of Interaction Patterns across Independent RL Agents in Multi-Agent Environments | Vasanth Reddy Baddam et al. | 2410.02516 | translate | read | null |
| 2024-10-03 | A Hitchhiker’s Guide To Active Motion | Tobias Plasczyk et al. | 2410.02515 | translate | read | null |
| 2024-10-02 | Bellman Diffusion: Generative Modeling as Learning a Linear Operator in the Distribution Space | Yangming Li et al. | 2410.01796 | translate | read | null |
| 2024-10-02 | Open Human-Robot Collaboration using Decentralized Inverse Reinforcement Learning | Prasanth Sengadu Suresh et al. | 2410.01790 | translate | read | null |
| 2024-10-02 | Investigating on RLHF methodology | Alexey Kutalev et al. | 2410.01789 | translate | read | null |
| 2024-10-02 | Social coordination perpetuates stereotypic expectations and behaviors across generations in deep multi-agent reinforcement learning | Rebekah A. Gelpí et al. | 2410.01763 | translate | read | null |
| 2024-10-02 | PreND: Enhancing Intrinsic Motivation in Reinforcement Learning through Pre-trained Network Distillation | Mohammadamin Davoodabadi et al. | 2410.01745 | translate | read | null |
| 2024-10-02 | Mimicking Human Intuition: Cognitive Belief-Driven Q-Learning | Xingrui Gu et al. | 2410.01739 | translate | read | null |
| 2024-10-02 | Evaluating Robustness of Reward Models for Mathematical Reasoning | Sunghwan Kim et al. | 2410.01729 | translate | read | null |
| 2024-10-02 | Performant, Memory Efficient and Scalable Multi-Agent Reinforcement Learning | Omayma Mahjoub et al. | 2410.01706 | translate | read | null |
| 2024-10-02 | VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment | Amirhossein Kazemnejad et al. | 2410.01679 | translate | read | link |
| 2024-10-02 | Finding path and cycle counting formulae in graphs with Deep Reinforcement Learning | Jason Piquenot et al. | 2410.01661 | translate | read | null |
| 2024-10-01 | Enhancing GANs with Contrastive Learning-Based Multistage Progressive Finetuning SNN and RL-Based External Optimization | Osama Mustafa et al. | 2409.20340 | translate | read | null |
