Reinforcement Learning - 2025-01

| Publish Date | Title | Authors | PDF | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2025-01-31 | Vintix: Action Model via In-Context Reinforcement Learning | Andrey Polubarov et al. | 2501.19400 | translate | read | link |
| 2025-01-31 | The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking | Yuchun Miao et al. | 2501.19358 | translate | read | null |
| 2025-01-31 | Jackpot! Alignment as a Maximal Lottery | Roberto-Rafael Maura-Rivero et al. | 2501.19266 | translate | read | null |
| 2025-01-31 | Objective Metrics for Human-Subjects Evaluation in Explainable Reinforcement Learning | Balint Gyevnar et al. | 2501.19256 | translate | read | null |
| 2025-01-31 | Linear $Q$-Learning Does Not Diverge: Convergence Rates to a Bounded Set | Xinyu Liu et al. | 2501.19254 | translate | read | null |
| 2025-01-31 | An Empirical Game-Theoretic Analysis of Autonomous Cyber-Defence Agents | Gregory Palmer et al. | 2501.19206 | translate | read | null |
| 2025-01-31 | APEX: Automated Parameter Exploration for Low-Power Wireless Protocols | Mohamed Hassaan M. Hydher et al. | 2501.19194 | translate | read | null |
| 2025-01-31 | Test-Time Training Scaling for Chemical Exploration in Drug Design | Morgan Thomas et al. | 2501.19153 | translate | read | null |
| 2025-01-31 | Decorrelated Soft Actor-Critic for Efficient Deep Reinforcement Learning | Burcu Küçükoğlu et al. | 2501.19133 | translate | read | null |
| 2025-01-30 | Design and Validation of Learning Aware HMI For Learning-Enabled Increasingly Autonomous Systems | Parth Ganeriwala et al. | 2501.18506 | translate | read | null |
| 2025-01-30 | Curriculum-based Sample Efficient Reinforcement Learning for Robust Stabilization of a Quadrotor | Fausto Mauricio Lagos Suarez et al. | 2501.18490 | translate | read | null |
| 2025-01-30 | Model-Free RL Agents Demonstrate System 1-Like Intentionality | Hal Ashton et al. | 2501.18299 | translate | read | null |
| 2025-01-30 | Neural Operator based Reinforcement Learning for Control of first-order PDEs with Spatially-Varying State Delay | Jiaqi Hu et al. | 2501.18201 | translate | read | null |
| 2025-01-30 | QNN-QRL: Quantum Neural Network Integrated with Quantum Reinforcement Learning for Quantum Key Distribution | Bikash K. Behera et al. | 2501.18188 | translate | read | null |
| 2025-01-30 | Investigating Tax Evasion Emergence Using Dual Large Language Model and Deep Reinforcement Learning Powered Agent-based Simulation | Teddy Lazebnik et al. | 2501.18177 | translate | read | null |
| 2025-01-30 | B3C: A Minimalist Approach to Offline Multi-Agent Reinforcement Learning | Woojun Kim et al. | 2501.18138 | translate | read | null |
| 2025-01-30 | Diverse Preference Optimization | Jack Lanchantin et al. | 2501.18101 | translate | read | null |
| 2025-01-30 | Reward Prediction Error Prioritisation in Experience Replay: The RPE-PER Method | Hoda Yamani et al. | 2501.18093 | translate | read | null |
| 2025-01-30 | DIAL: Distribution-Informed Adaptive Learning of Multi-Task Constraints for Safety-Critical Systems | Se-Wook Yoo et al. | 2501.18086 | translate | read | null |
| 2025-01-29 | From Sparse to Dense: Toddler-inspired Reward Transition in Goal-Oriented Reinforcement Learning | Junseok Park et al. | 2501.17842 | translate | read | null |
| 2025-01-29 | Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning | Haque Ishfaq et al. | 2501.17827 | translate | read | null |
| 2025-01-29 | Consensus Based Stochastic Control | Liyao Lyu et al. | 2501.17801 | translate | read | null |
| 2025-01-29 | CAMP in the Odyssey: Provably Robust Reinforcement Learning with Certified Radius Maximization | Derui Wang et al. | 2501.17667 | translate | read | link |
| 2025-01-29 | Accelerated DC loadflow solver for topology optimization | Nico Westerbeck et al. | 2501.17529 | translate | read | null |
| 2025-01-29 | Human-Aligned Skill Discovery: Balancing Behaviour Exploration and Alignment | Maxence Hussonnois et al. | 2501.17431 | translate | read | null |
| 2025-01-29 | Certificated Actor-Critic: Hierarchical Reinforcement Learning with Control Barrier Functions for Safe Navigation | Junjun Xie et al. | 2501.17424 | translate | read | null |
| 2025-01-29 | Value Function Decomposition in Markov Recommendation Process | Xiaobei Wang et al. | 2501.17409 | translate | read | null |
| 2025-01-29 | A Dual-Agent Adversarial Framework for Robust Generalization in Deep Reinforcement Learning | Zhengpeng Xie et al. | 2501.17384 | translate | read | null |
| 2025-01-29 | ASAP: Learning Generalizable Online Bin Packing via Adaptive Selection After Pruning | Han Fang et al. | 2501.17377 | translate | read | null |
| 2025-01-28 | SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training | Tianzhe Chu et al. | 2501.17161 | translate | read | null |
| 2025-01-28 | Evidence on the Regularisation Properties of Maximum-Entropy Reinforcement Learning | Rémy Hosseinkhan Boucher et al. | 2501.17115 | translate | read | null |
| 2025-01-28 | Unlocking Transparent Alignment Through Enhanced Inverse Constitutional AI for Principle Extraction | Carl-Leander Henneking et al. | 2501.17112 | translate | read | null |
| 2025-01-28 | COS(M+O)S: Curiosity and RL-Enhanced MCTS for Exploring Story Space via Language Models | Tobias Materzok et al. | 2501.17104 | translate | read | null |
| 2025-01-28 | Learning Mean Field Control on Sparse Graphs | Christian Fabian et al. | 2501.17079 | translate | read | null |
| 2025-01-28 | Induced Modularity and Community Detection for Functionally Interpretable Reinforcement Learning | Anna Soligo et al. | 2501.17077 | translate | read | null |
| 2025-01-28 | Challenges in Ensuring AI Safety in DeepSeek-R1 Models: The Shortcomings of Reinforcement Learning Strategies | Manojkumar Parmar et al. | 2501.17030 | translate | read | null |
| 2025-01-28 | Network Slice-based Low-Altitude Intelligent Network for Advanced Air Mobility | Kai Xiong et al. | 2501.17014 | translate | read | null |
| 2025-01-28 | Heterogeneity-aware Personalized Federated Learning via Adaptive Dual-Agent Reinforcement Learning | Xi Chen et al. | 2501.16966 | translate | read | null |
| 2025-01-28 | On Rollouts in Model-Based Reinforcement Learning | Bernd Frauenknecht et al. | 2501.16918 | translate | read | link |
| 2025-01-27 | Upside Down Reinforcement Learning with Policy Generators | Jacopo Di Ventura et al. | 2501.16288 | translate | read | link |
| 2025-01-27 | Accelerating Quantum Reinforcement Learning with a Quantum Natural Policy Gradient Based Approach | Yang Xu et al. | 2501.16243 | translate | read | null |
| 2025-01-27 | Towards General-Purpose Model-Free Reinforcement Learning | Scott Fujimoto et al. | 2501.16142 | translate | read | link |
| 2025-01-27 | Quantifying the Self-Interest Level of Markov Social Dilemmas | Richard Willis et al. | 2501.16138 | translate | read | null |
| 2025-01-27 | ReFill: Reinforcement Learning for Fill-In Minimization | Elfarouk Harb et al. | 2501.16130 | translate | read | null |
| 2025-01-27 | Multi-Agent Meta-Offline Reinforcement Learning for Timely UAV Path Planning and Data Collection | Eslam Eldeeb et al. | 2501.16098 | translate | read | null |
| 2025-01-27 | Flexible Blood Glucose Control: Offline Reinforcement Learning from Human Feedback | Harry Emerson et al. | 2501.15972 | translate | read | null |
| 2025-01-27 | REINFORCE-ING Chemical Language Models in Drug Design | Morgan Thomas et al. | 2501.15971 | translate | read | null |
| 2025-01-27 | Inverse Reinforcement Learning via Convex Optimization | Hao Zhu et al. | 2501.15957 | translate | read | null |
| 2025-01-27 | Generative AI for Lyapunov Optimization Theory in UAV-based Low-Altitude Economy Networking | Zhang Liu et al. | 2501.15928 | translate | read | null |
| 2025-01-24 | An Attentive Graph Agent for Topology-Adaptive Cyber Defence | Ilya Orson Sandoval et al. | 2501.14700 | translate | read | link |
| 2025-01-24 | ACT-JEPA: Joint-Embedding Predictive Architecture Improves Policy Representation Learning | Aleksandar Vujinovic et al. | 2501.14622 | translate | read | null |
| 2025-01-24 | COMIX: Generalized Conflict Management in O-RAN xApps – Architecture, Workflow, and a Power Control case | Anastasios Giannopoulos et al. | 2501.14619 | translate | read | null |
| 2025-01-24 | Age and Power Minimization via Meta-Deep Reinforcement Learning in UAV Networks | Sankani Sarathchandra et al. | 2501.14603 | translate | read | null |
| 2025-01-24 | Reducing Action Space for Deep Reinforcement Learning via Causal Effect Estimation | Wenzhang Liu et al. | 2501.14543 | translate | read | link |
| 2025-01-24 | Breaking the Pre-Planning Barrier: Real-Time Adaptive Coordination of Mission and Charging UAVs Using Graph Reinforcement Learning | Yuhan Hu et al. | 2501.14488 | translate | read | null |
| 2025-01-24 | MARL-OT: Multi-Agent Reinforcement Learning Guided Online Fuzzing to Detect Safety Violation in Autonomous Driving Systems | Linfeng Liang et al. | 2501.14451 | translate | read | null |
| 2025-01-24 | Learning more with the same effort: how randomization improves the robustness of a robotic deep reinforcement learning agent | Lucía Güitta-López et al. | 2501.14443 | translate | read | null |
| 2025-01-24 | SKIL: Semantic Keypoint Imitation Learning for Generalizable Data-efficient Manipulation | Shengjie Wang et al. | 2501.14400 | translate | read | null |
| 2025-01-24 | Reinforcement Learning for Efficient Returns Management | Pascal Linden et al. | 2501.14394 | translate | read | null |
| 2025-01-23 | CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation | Guofeng Cui et al. | 2501.13927 | translate | read | null |
| 2025-01-23 | Improving Video Generation with Human Feedback | Jie Liu et al. | 2501.13918 | translate | read | link |
| 2025-01-23 | GUI-Bee: Align GUI Action Grounding to Novel Environments via Autonomous Exploration | Yue Fan et al. | 2501.13896 | translate | read | null |
| 2025-01-23 | Utilizing Evolution Strategies to Train Transformers in Reinforcement Learning | Matyáš Lorenc et al. | 2501.13883 | translate | read | link |
| 2025-01-23 | A space-decoupling framework for optimization on bounded-rank matrices with orthogonally invariant constraints | Yan Yang et al. | 2501.13830 | translate | read | null |
| 2025-01-23 | Large Language Model driven Policy Exploration for Recommender Systems | Jie Wang et al. | 2501.13816 | translate | read | null |
| 2025-01-23 | Integrating Causality with Neurochaos Learning: Proposed Approach and Research Agenda | Nanjangud C. Narendra et al. | 2501.13763 | translate | read | null |
| 2025-01-23 | Scalable Safe Multi-Agent Reinforcement Learning for Multi-Agent System | Haikuo Du et al. | 2501.13727 | translate | read | null |
| 2025-01-23 | WFCRL: A Multi-Agent Reinforcement Learning Benchmark for Wind Farm Control | Claire Bizon Monroc et al. | 2501.13592 | translate | read | link |
| 2025-01-23 | Explainable AI-aided Feature Selection and Model Reduction for DRL-based V2X Resource Allocation | Nasir Khan et al. | 2501.13552 | translate | read | null |
| 2025-01-22 | Which Sensor to Observe? Timely Tracking of a Joint Markov Source with Model Predictive Control | Ismail Cosandal et al. | 2501.13099 | translate | read | null |
| 2025-01-22 | Attention-Driven Hierarchical Reinforcement Learning with Particle Filtering for Source Localization in Dynamic Fields | Yiwei Shi et al. | 2501.13084 | translate | read | null |
| 2025-01-22 | Evolution and The Knightian Blindspot of Machine Learning | Joel Lehman et al. | 2501.13075 | translate | read | null |
| 2025-01-22 | AdaWM: Adaptive World Model based Planning for Autonomous Driving | Hang Wang et al. | 2501.13072 | translate | read | null |
| 2025-01-22 | Optimizing Return Distributions with Distributional Dynamic Programming | Bernardo Ávila Pires et al. | 2501.13028 | translate | read | null |
| 2025-01-22 | MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking | Sebastian Farquhar et al. | 2501.13011 | translate | read | null |
| 2025-01-22 | An Offline Multi-Agent Reinforcement Learning Framework for Radio Resource Management | Eslam Eldeeb et al. | 2501.12991 | translate | read | null |
| 2025-01-22 | DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning | DeepSeek-AI et al. | 2501.12948 | translate | read | link |
| 2025-01-22 | Offline Critic-Guided Diffusion Policy for Multi-User Delay-Constrained Scheduling | Zhuoran Li et al. | 2501.12942 | translate | read | null |
| 2025-01-22 | Reinforcement learning Based Automated Design of Differential Evolution Algorithm for Black-box Optimization | Xu Yang et al. | 2501.12881 | translate | read | null |
| 2025-01-21 | InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model | Yuhang Zang et al. | 2501.12368 | translate | read | link |
| 2025-01-21 | ARM-IRL: Adaptive Resilience Metric Quantification Using Inverse Reinforcement Learning | Abhijeet Sahu et al. | 2501.12362 | translate | read | null |
| 2025-01-21 | Sum Rate Enhancement using Machine Learning for Semi-Self Sensing Hybrid RIS-Enabled ISAC in THz Bands | Sara Farrag Mobarak et al. | 2501.12353 | translate | read | null |
| 2025-01-21 | Towards neural reinforcement learning for large deviations in nonequilibrium systems with memory | Venkata D. Pamulaparthy et al. | 2501.12333 | translate | read | null |
| 2025-01-21 | Heuristic Deep Reinforcement Learning for Phase Shift Optimization in RIS-assisted Secure Satellite Communication Systems with RSMA | Tingnan Bao et al. | 2501.12311 | translate | read | null |
| 2025-01-21 | RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression | Uri Gadot et al. | 2501.12216 | translate | read | null |
| 2025-01-21 | Experience-replay Innovative Dynamics | Tuo Zhang et al. | 2501.12199 | translate | read | null |
| 2025-01-21 | Extend Adversarial Policy Against Neural Machine Translation via Unknown Token | Wei Zou et al. | 2501.12183 | translate | read | null |
| 2025-01-21 | DNRSelect: Active Best View Selection for Deferred Neural Rendering | Dongli Wu et al. | 2501.12150 | translate | read | null |
| 2025-01-21 | Tackling Uncertainties in Multi-Agent Reinforcement Learning through Integration of Agent Termination Dynamics | Somnath Hazra et al. | 2501.12061 | translate | read | link |
| 2025-01-17 | DexForce: Extracting Force-informed Actions from Kinesthetic Demonstrations for Dexterous Manipulation | Claire Chen et al. | 2501.10356 | translate | read | null |
| 2025-01-17 | Enhancing AI Transparency: XRL-Based Resource Management and RAN Slicing for 6G ORAN Architecture | Suvidha Mhatre et al. | 2501.10292 | translate | read | null |
| 2025-01-17 | Enhancing UAV Path Planning Efficiency Through Accelerated Learning | Joseanne Viana et al. | 2501.10141 | translate | read | null |
| 2025-01-17 | Spatio-temporal Graph Learning on Adaptive Mined Key Frames for High-performance Multi-Object Tracking | Futian Wang et al. | 2501.10129 | translate | read | null |
| 2025-01-17 | PaSa: An LLM Agent for Comprehensive Academic Paper Search | Yichen He et al. | 2501.10120 | translate | read | link |
| 2025-01-17 | GAWM: Global-Aware World Model for Multi-Agent Reinforcement Learning | Zifeng Shi et al. | 2501.10116 | translate | read | null |
| 2025-01-17 | Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics | Chenhao Li et al. | 2501.10100 | translate | read | null |
| 2025-01-17 | ForestProtector: An IoT Architecture Integrating Machine Vision and Deep Reinforcement Learning for Efficient Wildfire Monitoring | Kenneth Bonilla-Ormachea et al. | 2501.09926 | translate | read | null |
| 2025-01-17 | SLIM: Sim-to-Real Legged Instructive Manipulation via Long-Horizon Visuomotor Learning | Haichao Zhang et al. | 2501.09905 | translate | read | null |
| 2025-01-16 | From Explainability to Interpretability: Interpretable Policies in Reinforcement Learning Via Model Explanation | Peilang Li et al. | 2501.09858 | translate | read | null |
| 2025-01-16 | Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models | Fengli Xu et al. | 2501.09686 | translate | read | null |
| 2025-01-16 | Optimizing hypergraph product codes with random walks, simulated annealing and reinforcement learning | Bruno C. A. Freire et al. | 2501.09622 | translate | read | null |
| 2025-01-16 | Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment | Chaoqi Wang et al. | 2501.09620 | translate | read | null |
| 2025-01-16 | EVaDE : Event-Based Variational Thompson Sampling for Model-Based Reinforcement Learning | Siddharth Aravindan et al. | 2501.09611 | translate | read | null |
| 2025-01-16 | RE-POSE: Synergizing Reinforcement Learning-Based Partitioning and Offloading for Edge Object Detection | Jianrui Shi et al. | 2501.09465 | translate | read | null |
| 2025-01-16 | ADAGE: A generic two-layer framework for adaptive agent based modelling | Benjamin Patrick Evans et al. | 2501.09429 | translate | read | null |
| 2025-01-16 | Fast Searching of Extreme Operating Conditions for Relay Protection Setting Calculation Based on Graph Neural Network and Reinforcement Learning | Yan Li et al. | 2501.09399 | translate | read | null |
| 2025-01-16 | Contract-Inspired Contest Theory for Controllable Image Generation in Mobile Edge Metaverse | Guangyuan Liu et al. | 2501.09391 | translate | read | null |
| 2025-01-16 | Adaptive Contextual Caching for Mobile Edge Large Language Model Service | Guangyuan Liu et al. | 2501.09383 | translate | read | null |
| 2025-01-16 | Solving Infinite-Player Games with Player-to-Strategy Networks | Carlos Martin et al. | 2501.09330 | translate | read | null |
| 2025-01-15 | Computing Approximated Fixpoints via Dampened Mann Iteration | Paolo Baldan et al. | 2501.08950 | translate | read | null |
| 2025-01-15 | A Reinforcement Learning Approach to Quiet and Safe UAM Traffic Management | Surya Murthy et al. | 2501.08941 | translate | read | null |
| 2025-01-15 | Reinforcement learning-based adaptive time-integration for nonsmooth dynamics | David Riley et al. | 2501.08934 | translate | read | null |
| 2025-01-15 | Projection Implicit Q-Learning with Support Constraint for Offline Reinforcement Learning | Xinchen Han et al. | 2501.08907 | translate | read | null |
| 2025-01-15 | Deep Learning Meets Queue-Reactive: A Framework for Realistic Limit Order Book Simulation | Hamza Bodor et al. | 2501.08822 | translate | read | null |
| 2025-01-15 | Multi-visual modality micro drone-based structural damage detection | Isaac Osei Agyemanga et al. | 2501.08807 | translate | read | null |
| 2025-01-15 | Networked Agents in the Dark: Team Value Learning under Partial Observability | Guilherme S. Varela et al. | 2501.08778 | translate | read | null |
| 2025-01-15 | SPEQ: Stabilization Phases for Efficient Q-Learning in High Update-To-Data Ratio Reinforcement Learning | Carlo Romeo et al. | 2501.08669 | translate | read | null |
| 2025-01-15 | Application of Deep Reinforcement Learning to UAV Swarming for Ground Surveillance | Raúl Arranz et al. | 2501.08655 | translate | read | null |
| 2025-01-15 | RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation | Kaiqu Liang et al. | 2501.08617 | translate | read | null |
| 2025-01-14 | FDPP: Fine-tune Diffusion Policy with Human Preference | Yuxin Chen et al. | 2501.08259 | translate | read | null |
| 2025-01-14 | Dynamic Pricing in High-Speed Railways Using Multi-Agent Reinforcement Learning | Enrique Adrian Villarrubia-Martin et al. | 2501.08234 | translate | read | null |
| 2025-01-14 | Optimization of Link Configuration for Satellite Communication Using Reinforcement Learning | Tobias Rohe et al. | 2501.08220 | translate | read | null |
| 2025-01-14 | In-situ graph reasoning and knowledge expansion using Graph-PReFLexOR | Markus J. Buehler et al. | 2501.08120 | translate | read | null |
| 2025-01-14 | Data-driven inventory management for new products: A warm-start and adjusted Dyna-$Q$ approach | Xinyu Qu et al. | 2501.08109 | translate | read | null |
| 2025-01-14 | Hybrid Action Based Reinforcement Learning for Multi-Objective Compatible Autonomous Driving | Guizhe Jin et al. | 2501.08096 | translate | read | null |
| 2025-01-14 | CuAsmRL: Optimizing GPU SASS Schedules via Deep Reinforcement Learning | Guoliang He et al. | 2501.08071 | translate | read | null |
| 2025-01-14 | Continual Reinforcement Learning for Digital Twin Synchronization Optimization | Haonan Tong et al. | 2501.08045 | translate | read | null |
| 2025-01-14 | READ: Reinforcement-based Adversarial Learning for Text Classification with Limited Labeled Data | Rohit Sharma et al. | 2501.08035 | translate | read | null |
| 2025-01-14 | Cooperative Patrol Routing: Optimizing Urban Crime Surveillance through Multi-Agent Reinforcement Learning | Juan Palma-Borda et al. | 2501.08020 | translate | read | null |
| 2025-01-13 | SafeSwarm: Decentralized Safe RL for the Swarm of Drones Landing in Dense Crowds | Grik Tadevosyan et al. | 2501.07566 | translate | read | null |
| 2025-01-13 | Improving DeFi Accessibility through Efficient Liquidity Provisioning with Deep Reinforcement Learning | Haonan Xu et al. | 2501.07508 | translate | read | null |
| 2025-01-13 | RbRL2.0: Integrated Reward and Policy Learning for Rating-based Reinforcement Learning | Mingkang Wu et al. | 2501.07502 | translate | read | null |
| 2025-01-13 | Online inductive learning from answer sets for efficient reinforcement learning exploration | Celeste Veronese et al. | 2501.07445 | translate | read | null |
| 2025-01-13 | Attention when you need | Lokesh Boominathan et al. | 2501.07440 | translate | read | null |
| 2025-01-13 | Enhancing Online Reinforcement Learning with Meta-Learned Objective from Offline Data | Shilong Deng et al. | 2501.07346 | translate | read | link |
| 2025-01-13 | Foundation Models at Work: Fine-Tuning for Fairness in Algorithmic Hiring | Buse Sibel Korkmaz et al. | 2501.07324 | translate | read | link |
| 2025-01-13 | Mining Intraday Risk Factor Collections via Hierarchical Reinforcement Learning based on Transferred Options | Wenyan Xu et al. | 2501.07274 | translate | read | null |
| 2025-01-13 | Future-Conditioned Recommendations with Multi-Objective Controllable Decision Transformer | Chongming Gao et al. | 2501.07212 | translate | read | null |
| 2025-01-13 | Generalizable Graph Neural Networks for Robust Power Grid Topology Control | Matthijs de Jong et al. | 2501.07186 | translate | read | null |
| 2025-01-10 | From discrete-time policies to continuous-time diffusion samplers: Asymptotic equivalences and faster training | Julius Berner et al. | 2501.06148 | translate | read | link |
| 2025-01-10 | Vehicle-in-Virtual-Environment (VVE) Based Autonomous Driving Function Development and Evaluation Methodology for Vulnerable Road User Safety | Haochong Chen et al. | 2501.06113 | translate | read | null |
| 2025-01-10 | Learning Flexible Heterogeneous Coordination with Capability-Aware Shared Hypernetworks | Kevin Fu et al. | 2501.06058 | translate | read | null |
| 2025-01-10 | Investigating the Impact of Observation Space Design Choices On Training Reinforcement Learning Solutions for Spacecraft Problems | Nathaniel Hamilton et al. | 2501.06016 | translate | read | null |
| 2025-01-10 | The Safe Trusted Autonomy for Responsible Space Program | Kerianne L. Hobbs et al. | 2501.05984 | translate | read | null |
| 2025-01-10 | A Practical Demonstration of DRL-Based Dynamic Resource Allocation xApp Using OpenAirInterface | Onur Sever et al. | 2501.05879 | translate | read | null |
| 2025-01-10 | Diffusion Models for Smarter UAVs: Decision-Making and Modeling | Yousef Emami et al. | 2501.05819 | translate | read | null |
| 2025-01-10 | Real-Time Integrated Dispatching and Idle Fleet Steering with Deep Reinforcement Learning for A Meal Delivery Platform | Jingyi Cheng et al. | 2501.05808 | translate | read | null |
| 2025-01-10 | Understanding Impact of Human Feedback via Influence Functions | Taywon Min et al. | 2501.05790 | translate | read | link |
| 2025-01-09 | Session-Level Dynamic Ad Load Optimization using Offline Robust Reinforcement Learning | Tao Liu et al. | 2501.05591 | translate | read | null |
| 2025-01-09 | TimeRL: Efficient Deep Reinforcement Learning with Polyhedral Dependence Graphs | Pedro F. Silvestre et al. | 2501.05408 | translate | read | null |
| 2025-01-09 | Search-o1: Agentic Search-Enhanced Large Reasoning Models | Xiaoxi Li et al. | 2501.05366 | translate | read | link |
| 2025-01-09 | Knowledge Transfer in Model-Based Reinforcement Learning Agents for Efficient Multi-Task Learning | Dmytro Kuzmenko et al. | 2501.05329 | translate | read | null |
| 2025-01-09 | Design and Control of a Bipedal Robotic Character | Ruben Grandia et al. | 2501.05204 | translate | read | null |
| 2025-01-09 | Constrained Optimization of Charged Particle Tracking with Multi-Agent Reinforcement Learning | Tobias Kortus et al. | 2501.05113 | translate | read | null |
| 2025-01-09 | LearningFlow: Automated Policy Learning Workflow for Urban Driving with Large Language Models | Zengqi Peng et al. | 2501.05057 | translate | read | null |
| 2025-01-09 | CuRLA: Curriculum Learning Based Deep Reinforcement Learning for Autonomous Driving | Bhargava Uppuluri et al. | 2501.04982 | translate | read | null |
| 2025-01-09 | Promoting Shared Energy Storage Aggregation among High Price-Tolerance Prosumer: An Incentive Deposit and Withdrawal Service | Xin Lu et al. | 2501.04964 | translate | read | null |
| 2025-01-09 | Balancing Exploration and Cybersickness: Investigating Curiosity-Driven Behavior in Virtual Environments | Tangyao Li et al. | 2501.04905 | translate | read | null |
| 2025-01-08 | Multilinear Tensor Low-Rank Approximation for Policy-Gradient Methods in Reinforcement Learning | Sergio Rozada et al. | 2501.04879 | translate | read | null |
| 2025-01-08 | Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought | Violet Xiang et al. | 2501.04682 | translate | read | null |
| 2025-01-08 | Framework for Integrating Machine Learning Methods for Path-Aware Source Routing | Anees Al-Najjar et al. | 2501.04624 | translate | read | null |
| 2025-01-08 | MobileH2R: Learning Generalizable Human to Mobile Robot Handover Exclusively from Scalable and Diverse Synthetic Data | Zifan Wang et al. | 2501.04595 | translate | read | null |
| 2025-01-08 | HypeRL: Parameter-Informed Reinforcement Learning for Parametric PDEs | Nicolò Botteghi et al. | 2501.04538 | translate | read | null |
| 2025-01-08 | Safe Reinforcement Learning with Minimal Supervision | Alexander Quessy et al. | 2501.04481 | translate | read | null |
| 2025-01-08 | Research on environment perception and behavior prediction of intelligent UAV based on semantic communication | Kechong Ren et al. | 2501.04480 | translate | read | null |
| 2025-01-08 | Hybrid Artificial Intelligence Strategies for Drone Navigation | Rubén San-Segundo et al. | 2501.04472 | translate | read | null |
| 2025-01-08 | Risk-averse policies for natural gas futures trading using distributional reinforcement learning | Félicien Hêche et al. | 2501.04421 | translate | read | null |
| 2025-01-08 | Constraints as Rewards: Reinforcement Learning for Robots without Reward Functions | Yu Ishihara et al. | 2501.04228 | translate | read | null |
| 2025-01-07 | Explainable Reinforcement Learning via Temporal Policy Decomposition | Franco Ruggeri et al. | 2501.03902 | translate | read | null |
| 2025-01-07 | Neural DNF-MT: A Neuro-symbolic Approach for Learning Interpretable and Editable Policies | Kexin Gu Baugh et al. | 2501.03888 | translate | read | null |
| 2025-01-07 | AlphaPO – Reward shape matters for LLM alignment | Aman Gupta et al. | 2501.03884 | translate | read | null |
| 2025-01-07 | Online Reinforcement Learning-Based Dynamic Adaptive Evaluation Function for Real-Time Strategy Tasks | Weilong Yang et al. | 2501.03824 | translate | read | null |
| 2025-01-07 | Run-and-tumble chemotaxis using reinforcement learning | Ramesh Pramanik et al. | 2501.03687 | translate | read | null |
| 2025-01-07 | IEEE 802.11bn Multi-AP Coordinated Spatial Reuse with Hierarchical Multi-Armed Bandits | Maksymilian Wojnar et al. | 2501.03680 | translate | read | null |
| 2025-01-07 | SALE-Based Offline Reinforcement Learning with Ensemble Q-Networks | Zheng Chun et al. | 2501.03676 | translate | read | null |
| 2025-01-07 | Imitation Learning of MPC with Neural Networks: Error Guarantees and Sparsification | Hendrik Alsmeier et al. | 2501.03671 | translate | read | null |
| 2025-01-07 | Rethinking Adversarial Attacks in Reinforcement Learning from Policy Distribution Perspective | Tianyang Duan et al. | 2501.03562 | translate | read | null |
| 2025-01-07 | Align-Pro: A Principled Approach to Prompt Optimization for LLM Alignment | Prashant Trivedi et al. | 2501.03486 | translate | read | null |
| 2025-01-06 | Turn-based Multi-Agent Reinforcement Learning Model Checking | Dennis Gross et al. | 2501.03187 | translate | read | null |
| 2025-01-06 | Co-Activation Graph Analysis of Safety-Verified and Explainable Deep Reinforcement Learning Policies | Dennis Gross et al. | 2501.03142 | translate | read | null |
| 2025-01-06 | CALM: Curiosity-Driven Auditing for Large Language Models | Xiang Zheng et al. | 2501.02997 | translate | read | null |
| 2025-01-06 | CAMP: Collaborative Attention Model with Profiles for Vehicle Routing Problems | Chuanbo Hua et al. | 2501.02977 | translate | read | null |
| 2025-01-06 | Sim-to-Real Transfer for Mobile Robots with Reinforcement Learning: from NVIDIA Isaac Sim to Gazebo and Real ROS 2 Robots | Sahar Salimpour et al. | 2501.02902 | translate | read | link |
| 2025-01-06 | Revisiting Communication Efficiency in Multi-Agent Reinforcement Learning from the Dimensional Analysis Perspective | Chuxiong Sun et al. | 2501.02888 | translate | read | null |
| 2025-01-06 | First-place Solution for Streetscape Shop Sign Recognition Competition | Bin Wang et al. | 2501.02811 | translate | read | null |
| 2025-01-06 | Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model | Yueqin Yin et al. | 2501.02790 | translate | read | null |
| 2025-01-06 | Joint Optimization of UAV-Carried IRS for Urban Low Altitude mmWave Communications with Deep Reinforcement Learning | Wenwen Xie et al. | 2501.02787 | translate | read | null |
| 2025-01-06 | Learn A Flexible Exploration Model for Parameterized Action Markov Decision Processes | Zijian Wang et al. | 2501.02774 | translate | read | null |
| 2025-01-03 | Evaluating Scenario-based Decision-making for Interactive Autonomous Driving Using Rational Criteria: A Survey | Zhen Tian et al. | 2501.01886 | translate | read | null |
| 2025-01-03 | Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models | Yanjiang Liu et al. | 2501.01830 | translate | read | null |
| 2025-01-03 | Genetic algorithm enhanced Solovay-Kitaev algorithm for quantum compiling | Jiangwei Long et al. | 2501.01746 | translate | read | null |
| 2025-01-03 | Proposing Hierarchical Goal-Conditioned Policy Planning in Multi-Goal Reinforcement Learning | Gavin B. Rens et al. | 2501.01727 | translate | read | null |
| 2025-01-03 | Inversely Learning Transferable Rewards via Abstracted States | Yikang Gui et al. | 2501.01669 | translate | read | null |
| 2025-01-03 | BLAST: A Stealthy Backdoor Leverage Attack against Cooperative Multi-Agent Deep Reinforcement Learning based Systems | Yinbo Yu et al. | 2501.01593 | translate | read | null |
| 2025-01-02 | Reinforcement-learning-based control of turbulent channel flows at high Reynolds numbers | Zisong Zhou et al. | 2501.01573 | translate | read | null |
| 2025-01-02 | Reinforcement Learning for Respondent-Driven Sampling | Justin Weltz et al. | 2501.01505 | translate | read | null |
| 2025-01-02 | Decoding Knowledge in Large Language Models: A Framework for Categorization and Comprehension | Yanbo Fang et al. | 2501.01332 | translate | read | null |
| 2025-01-02 | Towards Intelligent Antenna Positioning: Leveraging DRL for FAS-Aided ISAC Systems | Shunxing Yang et al. | 2501.01281 | translate | read | null |
| 2025-01-02 | PIMAEX: Multi-Agent Exploration through Peer Incentivization | Michael Kölle et al. | 2501.01266 | translate | read | null |
| 2025-01-02 | Embodied AI-Enhanced Vehicular Networks: An Integrated Large Language Models and Reinforcement Learning Method | Ruichen Zhang et al. | 2501.01141 | translate | read | null |
| 2025-01-02 | Communicating Unexpectedness for Out-of-Distribution Multi-Agent Reinforcement Learning | Min Whoo Lee et al. | 2501.01140 | translate | read | null |
| 2025-01-02 | Symmetries-enhanced Multi-Agent Reinforcement Learning | Nikolaos Bousias et al. | 2501.01136 | translate | read | null |
| 2025-01-02 | Noise-Resilient Symbolic Regression with Dynamic Gating Reinforcement Learning | Chenglu Sun et al. | 2501.01085 | translate | read | null |
| 2025-01-02 | Enhancing Neural Adaptive Wireless Video Streaming via Lower-Layer Information Exposure and Online Tuning | Lingzhi Zhao et al. | 2501.01044 | translate | read | null |
| 2025-01-02 | Energy-Efficient and Intelligent ISAC in V2X Networks with Spiking Neural Networks-Driven DRL | Chen Shang et al. | 2501.01038 | translate | read | null |
| 2025-01-02 | Deep Reinforcement Learning for Job Scheduling and Resource Management in Cloud Computing: An Algorithm-Level Review | Yan Gu et al. | 2501.01007 | translate | read | null |

(<a href=../Reinforcement_Learning.md>back to Reinforcement Learning</a>)