Reinforcement Learning - 2025-01
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2025-01-31 | Vintix: Action Model via In-Context Reinforcement Learning | Andrey Polubarov et.al. | 2501.19400 | translate | read | link |
| 2025-01-31 | The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking | Yuchun Miao et.al. | 2501.19358 | translate | read | null |
| 2025-01-31 | Jackpot! Alignment as a Maximal Lottery | Roberto-Rafael Maura-Rivero et.al. | 2501.19266 | translate | read | null |
| 2025-01-31 | Objective Metrics for Human-Subjects Evaluation in Explainable Reinforcement Learning | Balint Gyevnar et.al. | 2501.19256 | translate | read | null |
| 2025-01-31 | Linear $Q$-Learning Does Not Diverge: Convergence Rates to a Bounded Set | Xinyu Liu et.al. | 2501.19254 | translate | read | null |
| 2025-01-31 | An Empirical Game-Theoretic Analysis of Autonomous Cyber-Defence Agents | Gregory Palmer et.al. | 2501.19206 | translate | read | null |
| 2025-01-31 | APEX: Automated Parameter Exploration for Low-Power Wireless Protocols | Mohamed Hassaan M. Hydher et.al. | 2501.19194 | translate | read | null |
| 2025-01-31 | Test-Time Training Scaling for Chemical Exploration in Drug Design | Morgan Thomas et.al. | 2501.19153 | translate | read | null |
| 2025-01-31 | Decorrelated Soft Actor-Critic for Efficient Deep Reinforcement Learning | Burcu Küçükoğlu et.al. | 2501.19133 | translate | read | null |
| 2025-01-30 | Design and Validation of Learning Aware HMI For Learning-Enabled Increasingly Autonomous Systems | Parth Ganeriwala et.al. | 2501.18506 | translate | read | null |
| 2025-01-30 | Curriculum-based Sample Efficient Reinforcement Learning for Robust Stabilization of a Quadrotor | Fausto Mauricio Lagos Suarez et.al. | 2501.18490 | translate | read | null |
| 2025-01-30 | Model-Free RL Agents Demonstrate System 1-Like Intentionality | Hal Ashton et.al. | 2501.18299 | translate | read | null |
| 2025-01-30 | Neural Operator based Reinforcement Learning for Control of first-order PDEs with Spatially-Varying State Delay | Jiaqi Hu et.al. | 2501.18201 | translate | read | null |
| 2025-01-30 | QNN-QRL: Quantum Neural Network Integrated with Quantum Reinforcement Learning for Quantum Key Distribution | Bikash K. Behera et.al. | 2501.18188 | translate | read | null |
| 2025-01-30 | Investigating Tax Evasion Emergence Using Dual Large Language Model and Deep Reinforcement Learning Powered Agent-based Simulation | Teddy Lazebnik et.al. | 2501.18177 | translate | read | null |
| 2025-01-30 | B3C: A Minimalist Approach to Offline Multi-Agent Reinforcement Learning | Woojun Kim et.al. | 2501.18138 | translate | read | null |
| 2025-01-30 | Diverse Preference Optimization | Jack Lanchantin et.al. | 2501.18101 | translate | read | null |
| 2025-01-30 | Reward Prediction Error Prioritisation in Experience Replay: The RPE-PER Method | Hoda Yamani et.al. | 2501.18093 | translate | read | null |
| 2025-01-30 | DIAL: Distribution-Informed Adaptive Learning of Multi-Task Constraints for Safety-Critical Systems | Se-Wook Yoo et.al. | 2501.18086 | translate | read | null |
| 2025-01-29 | From Sparse to Dense: Toddler-inspired Reward Transition in Goal-Oriented Reinforcement Learning | Junseok Park et.al. | 2501.17842 | translate | read | null |
| 2025-01-29 | Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning | Haque Ishfaq et.al. | 2501.17827 | translate | read | null |
| 2025-01-29 | Consensus Based Stochastic Control | Liyao Lyu et.al. | 2501.17801 | translate | read | null |
| 2025-01-29 | CAMP in the Odyssey: Provably Robust Reinforcement Learning with Certified Radius Maximization | Derui Wang et.al. | 2501.17667 | translate | read | link |
| 2025-01-29 | Accelerated DC loadflow solver for topology optimization | Nico Westerbeck et.al. | 2501.17529 | translate | read | null |
| 2025-01-29 | Human-Aligned Skill Discovery: Balancing Behaviour Exploration and Alignment | Maxence Hussonnois et.al. | 2501.17431 | translate | read | null |
| 2025-01-29 | Certificated Actor-Critic: Hierarchical Reinforcement Learning with Control Barrier Functions for Safe Navigation | Junjun Xie et.al. | 2501.17424 | translate | read | null |
| 2025-01-29 | Value Function Decomposition in Markov Recommendation Process | Xiaobei Wang et.al. | 2501.17409 | translate | read | null |
| 2025-01-29 | A Dual-Agent Adversarial Framework for Robust Generalization in Deep Reinforcement Learning | Zhengpeng Xie et.al. | 2501.17384 | translate | read | null |
| 2025-01-29 | ASAP: Learning Generalizable Online Bin Packing via Adaptive Selection After Pruning | Han Fang et.al. | 2501.17377 | translate | read | null |
| 2025-01-28 | SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training | Tianzhe Chu et.al. | 2501.17161 | translate | read | null |
| 2025-01-28 | Evidence on the Regularisation Properties of Maximum-Entropy Reinforcement Learning | Rémy Hosseinkhan Boucher et.al. | 2501.17115 | translate | read | null |
| 2025-01-28 | Unlocking Transparent Alignment Through Enhanced Inverse Constitutional AI for Principle Extraction | Carl-Leander Henneking et.al. | 2501.17112 | translate | read | null |
| 2025-01-28 | COS(M+O)S: Curiosity and RL-Enhanced MCTS for Exploring Story Space via Language Models | Tobias Materzok et.al. | 2501.17104 | translate | read | null |
| 2025-01-28 | Learning Mean Field Control on Sparse Graphs | Christian Fabian et.al. | 2501.17079 | translate | read | null |
| 2025-01-28 | Induced Modularity and Community Detection for Functionally Interpretable Reinforcement Learning | Anna Soligo et.al. | 2501.17077 | translate | read | null |
| 2025-01-28 | Challenges in Ensuring AI Safety in DeepSeek-R1 Models: The Shortcomings of Reinforcement Learning Strategies | Manojkumar Parmar et.al. | 2501.17030 | translate | read | null |
| 2025-01-28 | Network Slice-based Low-Altitude Intelligent Network for Advanced Air Mobility | Kai Xiong et.al. | 2501.17014 | translate | read | null |
| 2025-01-28 | Heterogeneity-aware Personalized Federated Learning via Adaptive Dual-Agent Reinforcement Learning | Xi Chen et.al. | 2501.16966 | translate | read | null |
| 2025-01-28 | On Rollouts in Model-Based Reinforcement Learning | Bernd Frauenknecht et.al. | 2501.16918 | translate | read | link |
| 2025-01-27 | Upside Down Reinforcement Learning with Policy Generators | Jacopo Di Ventura et.al. | 2501.16288 | translate | read | link |
| 2025-01-27 | Accelerating Quantum Reinforcement Learning with a Quantum Natural Policy Gradient Based Approach | Yang Xu et.al. | 2501.16243 | translate | read | null |
| 2025-01-27 | Towards General-Purpose Model-Free Reinforcement Learning | Scott Fujimoto et.al. | 2501.16142 | translate | read | link |
| 2025-01-27 | Quantifying the Self-Interest Level of Markov Social Dilemmas | Richard Willis et.al. | 2501.16138 | translate | read | null |
| 2025-01-27 | ReFill: Reinforcement Learning for Fill-In Minimization | Elfarouk Harb et.al. | 2501.16130 | translate | read | null |
| 2025-01-27 | Multi-Agent Meta-Offline Reinforcement Learning for Timely UAV Path Planning and Data Collection | Eslam Eldeeb et.al. | 2501.16098 | translate | read | null |
| 2025-01-27 | Flexible Blood Glucose Control: Offline Reinforcement Learning from Human Feedback | Harry Emerson et.al. | 2501.15972 | translate | read | null |
| 2025-01-27 | REINFORCE-ING Chemical Language Models in Drug Design | Morgan Thomas et.al. | 2501.15971 | translate | read | null |
| 2025-01-27 | Inverse Reinforcement Learning via Convex Optimization | Hao Zhu et.al. | 2501.15957 | translate | read | null |
| 2025-01-27 | Generative AI for Lyapunov Optimization Theory in UAV-based Low-Altitude Economy Networking | Zhang Liu et.al. | 2501.15928 | translate | read | null |
| 2025-01-24 | An Attentive Graph Agent for Topology-Adaptive Cyber Defence | Ilya Orson Sandoval et.al. | 2501.14700 | translate | read | link |
| 2025-01-24 | ACT-JEPA: Joint-Embedding Predictive Architecture Improves Policy Representation Learning | Aleksandar Vujinovic et.al. | 2501.14622 | translate | read | null |
| 2025-01-24 | COMIX: Generalized Conflict Management in O-RAN xApps – Architecture, Workflow, and a Power Control case | Anastasios Giannopoulos et.al. | 2501.14619 | translate | read | null |
| 2025-01-24 | Age and Power Minimization via Meta-Deep Reinforcement Learning in UAV Networks | Sankani Sarathchandra et.al. | 2501.14603 | translate | read | null |
| 2025-01-24 | Reducing Action Space for Deep Reinforcement Learning via Causal Effect Estimation | Wenzhang Liu et.al. | 2501.14543 | translate | read | link |
| 2025-01-24 | Breaking the Pre-Planning Barrier: Real-Time Adaptive Coordination of Mission and Charging UAVs Using Graph Reinforcement Learning | Yuhan Hu et.al. | 2501.14488 | translate | read | null |
| 2025-01-24 | MARL-OT: Multi-Agent Reinforcement Learning Guided Online Fuzzing to Detect Safety Violation in Autonomous Driving Systems | Linfeng Liang et.al. | 2501.14451 | translate | read | null |
| 2025-01-24 | Learning more with the same effort: how randomization improves the robustness of a robotic deep reinforcement learning agent | Lucía Güitta-López et.al. | 2501.14443 | translate | read | null |
| 2025-01-24 | SKIL: Semantic Keypoint Imitation Learning for Generalizable Data-efficient Manipulation | Shengjie Wang et.al. | 2501.14400 | translate | read | null |
| 2025-01-24 | Reinforcement Learning for Efficient Returns Management | Pascal Linden et.al. | 2501.14394 | translate | read | null |
| 2025-01-23 | CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation | Guofeng Cui et.al. | 2501.13927 | translate | read | null |
| 2025-01-23 | Improving Video Generation with Human Feedback | Jie Liu et.al. | 2501.13918 | translate | read | link |
| 2025-01-23 | GUI-Bee: Align GUI Action Grounding to Novel Environments via Autonomous Exploration | Yue Fan et.al. | 2501.13896 | translate | read | null |
| 2025-01-23 | Utilizing Evolution Strategies to Train Transformers in Reinforcement Learning | Matyáš Lorenc et.al. | 2501.13883 | translate | read | link |
| 2025-01-23 | A space-decoupling framework for optimization on bounded-rank matrices with orthogonally invariant constraints | Yan Yang et.al. | 2501.13830 | translate | read | null |
| 2025-01-23 | Large Language Model driven Policy Exploration for Recommender Systems | Jie Wang et.al. | 2501.13816 | translate | read | null |
| 2025-01-23 | Integrating Causality with Neurochaos Learning: Proposed Approach and Research Agenda | Nanjangud C. Narendra et.al. | 2501.13763 | translate | read | null |
| 2025-01-23 | Scalable Safe Multi-Agent Reinforcement Learning for Multi-Agent System | Haikuo Du et.al. | 2501.13727 | translate | read | null |
| 2025-01-23 | WFCRL: A Multi-Agent Reinforcement Learning Benchmark for Wind Farm Control | Claire Bizon Monroc et.al. | 2501.13592 | translate | read | link |
| 2025-01-23 | Explainable AI-aided Feature Selection and Model Reduction for DRL-based V2X Resource Allocation | Nasir Khan et.al. | 2501.13552 | translate | read | null |
| 2025-01-22 | Which Sensor to Observe? Timely Tracking of a Joint Markov Source with Model Predictive Control | Ismail Cosandal et.al. | 2501.13099 | translate | read | null |
| 2025-01-22 | Attention-Driven Hierarchical Reinforcement Learning with Particle Filtering for Source Localization in Dynamic Fields | Yiwei Shi et.al. | 2501.13084 | translate | read | null |
| 2025-01-22 | Evolution and The Knightian Blindspot of Machine Learning | Joel Lehman et.al. | 2501.13075 | translate | read | null |
| 2025-01-22 | AdaWM: Adaptive World Model based Planning for Autonomous Driving | Hang Wang et.al. | 2501.13072 | translate | read | null |
| 2025-01-22 | Optimizing Return Distributions with Distributional Dynamic Programming | Bernardo Ávila Pires et.al. | 2501.13028 | translate | read | null |
| 2025-01-22 | MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking | Sebastian Farquhar et.al. | 2501.13011 | translate | read | null |
| 2025-01-22 | An Offline Multi-Agent Reinforcement Learning Framework for Radio Resource Management | Eslam Eldeeb et.al. | 2501.12991 | translate | read | null |
| 2025-01-22 | DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning | DeepSeek-AI et.al. | 2501.12948 | translate | read | link |
| 2025-01-22 | Offline Critic-Guided Diffusion Policy for Multi-User Delay-Constrained Scheduling | Zhuoran Li et.al. | 2501.12942 | translate | read | null |
| 2025-01-22 | Reinforcement learning Based Automated Design of Differential Evolution Algorithm for Black-box Optimization | Xu Yang et.al. | 2501.12881 | translate | read | null |
| 2025-01-21 | InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model | Yuhang Zang et.al. | 2501.12368 | translate | read | link |
| 2025-01-21 | ARM-IRL: Adaptive Resilience Metric Quantification Using Inverse Reinforcement Learning | Abhijeet Sahu et.al. | 2501.12362 | translate | read | null |
| 2025-01-21 | Sum Rate Enhancement using Machine Learning for Semi-Self Sensing Hybrid RIS-Enabled ISAC in THz Bands | Sara Farrag Mobarak et.al. | 2501.12353 | translate | read | null |
| 2025-01-21 | Towards neural reinforcement learning for large deviations in nonequilibrium systems with memory | Venkata D. Pamulaparthy et.al. | 2501.12333 | translate | read | null |
| 2025-01-21 | Heuristic Deep Reinforcement Learning for Phase Shift Optimization in RIS-assisted Secure Satellite Communication Systems with RSMA | Tingnan Bao et.al. | 2501.12311 | translate | read | null |
| 2025-01-21 | RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression | Uri Gadot et.al. | 2501.12216 | translate | read | null |
| 2025-01-21 | Experience-replay Innovative Dynamics | Tuo Zhang et.al. | 2501.12199 | translate | read | null |
| 2025-01-21 | Extend Adversarial Policy Against Neural Machine Translation via Unknown Token | Wei Zou et.al. | 2501.12183 | translate | read | null |
| 2025-01-21 | DNRSelect: Active Best View Selection for Deferred Neural Rendering | Dongli Wu et.al. | 2501.12150 | translate | read | null |
| 2025-01-21 | Tackling Uncertainties in Multi-Agent Reinforcement Learning through Integration of Agent Termination Dynamics | Somnath Hazra et.al. | 2501.12061 | translate | read | link |
| 2025-01-17 | DexForce: Extracting Force-informed Actions from Kinesthetic Demonstrations for Dexterous Manipulation | Claire Chen et.al. | 2501.10356 | translate | read | null |
| 2025-01-17 | Enhancing AI Transparency: XRL-Based Resource Management and RAN Slicing for 6G ORAN Architecture | Suvidha Mhatre et.al. | 2501.10292 | translate | read | null |
| 2025-01-17 | Enhancing UAV Path Planning Efficiency Through Accelerated Learning | Joseanne Viana et.al. | 2501.10141 | translate | read | null |
| 2025-01-17 | Spatio-temporal Graph Learning on Adaptive Mined Key Frames for High-performance Multi-Object Tracking | Futian Wang et.al. | 2501.10129 | translate | read | null |
| 2025-01-17 | PaSa: An LLM Agent for Comprehensive Academic Paper Search | Yichen He et.al. | 2501.10120 | translate | read | link |
| 2025-01-17 | GAWM: Global-Aware World Model for Multi-Agent Reinforcement Learning | Zifeng Shi et.al. | 2501.10116 | translate | read | null |
| 2025-01-17 | Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics | Chenhao Li et.al. | 2501.10100 | translate | read | null |
| 2025-01-17 | ForestProtector: An IoT Architecture Integrating Machine Vision and Deep Reinforcement Learning for Efficient Wildfire Monitoring | Kenneth Bonilla-Ormachea et.al. | 2501.09926 | translate | read | null |
| 2025-01-17 | SLIM: Sim-to-Real Legged Instructive Manipulation via Long-Horizon Visuomotor Learning | Haichao Zhang et.al. | 2501.09905 | translate | read | null |
| 2025-01-16 | From Explainability to Interpretability: Interpretable Policies in Reinforcement Learning Via Model Explanation | Peilang Li et.al. | 2501.09858 | translate | read | null |
| 2025-01-16 | Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models | Fengli Xu et.al. | 2501.09686 | translate | read | null |
| 2025-01-16 | Optimizing hypergraph product codes with random walks, simulated annealing and reinforcement learning | Bruno C. A. Freire et.al. | 2501.09622 | translate | read | null |
| 2025-01-16 | Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment | Chaoqi Wang et.al. | 2501.09620 | translate | read | null |
| 2025-01-16 | EVaDE : Event-Based Variational Thompson Sampling for Model-Based Reinforcement Learning | Siddharth Aravindan et.al. | 2501.09611 | translate | read | null |
| 2025-01-16 | RE-POSE: Synergizing Reinforcement Learning-Based Partitioning and Offloading for Edge Object Detection | Jianrui Shi et.al. | 2501.09465 | translate | read | null |
| 2025-01-16 | ADAGE: A generic two-layer framework for adaptive agent based modelling | Benjamin Patrick Evans et.al. | 2501.09429 | translate | read | null |
| 2025-01-16 | Fast Searching of Extreme Operating Conditions for Relay Protection Setting Calculation Based on Graph Neural Network and Reinforcement Learning | Yan Li et.al. | 2501.09399 | translate | read | null |
| 2025-01-16 | Contract-Inspired Contest Theory for Controllable Image Generation in Mobile Edge Metaverse | Guangyuan Liu et.al. | 2501.09391 | translate | read | null |
| 2025-01-16 | Adaptive Contextual Caching for Mobile Edge Large Language Model Service | Guangyuan Liu et.al. | 2501.09383 | translate | read | null |
| 2025-01-16 | Solving Infinite-Player Games with Player-to-Strategy Networks | Carlos Martin et.al. | 2501.09330 | translate | read | null |
| 2025-01-15 | Computing Approximated Fixpoints via Dampened Mann Iteration | Paolo Baldan et.al. | 2501.08950 | translate | read | null |
| 2025-01-15 | A Reinforcement Learning Approach to Quiet and Safe UAM Traffic Management | Surya Murthy et.al. | 2501.08941 | translate | read | null |
| 2025-01-15 | Reinforcement learning-based adaptive time-integration for nonsmooth dynamics | David Riley et.al. | 2501.08934 | translate | read | null |
| 2025-01-15 | Projection Implicit Q-Learning with Support Constraint for Offline Reinforcement Learning | Xinchen Han et.al. | 2501.08907 | translate | read | null |
| 2025-01-15 | Deep Learning Meets Queue-Reactive: A Framework for Realistic Limit Order Book Simulation | Hamza Bodor et.al. | 2501.08822 | translate | read | null |
| 2025-01-15 | Multi-visual modality micro drone-based structural damage detection | Isaac Osei Agyemanga et.al. | 2501.08807 | translate | read | null |
| 2025-01-15 | Networked Agents in the Dark: Team Value Learning under Partial Observability | Guilherme S. Varela et.al. | 2501.08778 | translate | read | null |
| 2025-01-15 | SPEQ: Stabilization Phases for Efficient Q-Learning in High Update-To-Data Ratio Reinforcement Learning | Carlo Romeo et.al. | 2501.08669 | translate | read | null |
| 2025-01-15 | Application of Deep Reinforcement Learning to UAV Swarming for Ground Surveillance | Raúl Arranz et.al. | 2501.08655 | translate | read | null |
| 2025-01-15 | RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation | Kaiqu Liang et.al. | 2501.08617 | translate | read | null |
| 2025-01-14 | FDPP: Fine-tune Diffusion Policy with Human Preference | Yuxin Chen et.al. | 2501.08259 | translate | read | null |
| 2025-01-14 | Dynamic Pricing in High-Speed Railways Using Multi-Agent Reinforcement Learning | Enrique Adrian Villarrubia-Martin et.al. | 2501.08234 | translate | read | null |
| 2025-01-14 | Optimization of Link Configuration for Satellite Communication Using Reinforcement Learning | Tobias Rohe et.al. | 2501.08220 | translate | read | null |
| 2025-01-14 | In-situ graph reasoning and knowledge expansion using Graph-PReFLexOR | Markus J. Buehler et.al. | 2501.08120 | translate | read | null |
| 2025-01-14 | Data-driven inventory management for new products: A warm-start and adjusted Dyna-$Q$ approach | Xinyu Qu et.al. | 2501.08109 | translate | read | null |
| 2025-01-14 | Hybrid Action Based Reinforcement Learning for Multi-Objective Compatible Autonomous Driving | Guizhe Jin et.al. | 2501.08096 | translate | read | null |
| 2025-01-14 | CuAsmRL: Optimizing GPU SASS Schedules via Deep Reinforcement Learning | Guoliang He et.al. | 2501.08071 | translate | read | null |
| 2025-01-14 | Continual Reinforcement Learning for Digital Twin Synchronization Optimization | Haonan Tong et.al. | 2501.08045 | translate | read | null |
| 2025-01-14 | READ: Reinforcement-based Adversarial Learning for Text Classification with Limited Labeled Data | Rohit Sharma et.al. | 2501.08035 | translate | read | null |
| 2025-01-14 | Cooperative Patrol Routing: Optimizing Urban Crime Surveillance through Multi-Agent Reinforcement Learning | Juan Palma-Borda et.al. | 2501.08020 | translate | read | null |
| 2025-01-13 | SafeSwarm: Decentralized Safe RL for the Swarm of Drones Landing in Dense Crowds | Grik Tadevosyan et.al. | 2501.07566 | translate | read | null |
| 2025-01-13 | Improving DeFi Accessibility through Efficient Liquidity Provisioning with Deep Reinforcement Learning | Haonan Xu et.al. | 2501.07508 | translate | read | null |
| 2025-01-13 | RbRL2.0: Integrated Reward and Policy Learning for Rating-based Reinforcement Learning | Mingkang Wu et.al. | 2501.07502 | translate | read | null |
| 2025-01-13 | Online inductive learning from answer sets for efficient reinforcement learning exploration | Celeste Veronese et.al. | 2501.07445 | translate | read | null |
| 2025-01-13 | Attention when you need | Lokesh Boominathan et.al. | 2501.07440 | translate | read | null |
| 2025-01-13 | Enhancing Online Reinforcement Learning with Meta-Learned Objective from Offline Data | Shilong Deng et.al. | 2501.07346 | translate | read | link |
| 2025-01-13 | Foundation Models at Work: Fine-Tuning for Fairness in Algorithmic Hiring | Buse Sibel Korkmaz et.al. | 2501.07324 | translate | read | link |
| 2025-01-13 | Mining Intraday Risk Factor Collections via Hierarchical Reinforcement Learning based on Transferred Options | Wenyan Xu et.al. | 2501.07274 | translate | read | null |
| 2025-01-13 | Future-Conditioned Recommendations with Multi-Objective Controllable Decision Transformer | Chongming Gao et.al. | 2501.07212 | translate | read | null |
| 2025-01-13 | Generalizable Graph Neural Networks for Robust Power Grid Topology Control | Matthijs de Jong et.al. | 2501.07186 | translate | read | null |
| 2025-01-10 | From discrete-time policies to continuous-time diffusion samplers: Asymptotic equivalences and faster training | Julius Berner et.al. | 2501.06148 | translate | read | link |
| 2025-01-10 | Vehicle-in-Virtual-Environment (VVE) Based Autonomous Driving Function Development and Evaluation Methodology for Vulnerable Road User Safety | Haochong Chen et.al. | 2501.06113 | translate | read | null |
| 2025-01-10 | Learning Flexible Heterogeneous Coordination with Capability-Aware Shared Hypernetworks | Kevin Fu et.al. | 2501.06058 | translate | read | null |
| 2025-01-10 | Investigating the Impact of Observation Space Design Choices On Training Reinforcement Learning Solutions for Spacecraft Problems | Nathaniel Hamilton et.al. | 2501.06016 | translate | read | null |
| 2025-01-10 | The Safe Trusted Autonomy for Responsible Space Program | Kerianne L. Hobbs et.al. | 2501.05984 | translate | read | null |
| 2025-01-10 | A Practical Demonstration of DRL-Based Dynamic Resource Allocation xApp Using OpenAirInterface | Onur Sever et.al. | 2501.05879 | translate | read | null |
| 2025-01-10 | Diffusion Models for Smarter UAVs: Decision-Making and Modeling | Yousef Emami et.al. | 2501.05819 | translate | read | null |
| 2025-01-10 | Real-Time Integrated Dispatching and Idle Fleet Steering with Deep Reinforcement Learning for A Meal Delivery Platform | Jingyi Cheng et.al. | 2501.05808 | translate | read | null |
| 2025-01-10 | Understanding Impact of Human Feedback via Influence Functions | Taywon Min et.al. | 2501.05790 | translate | read | link |
| 2025-01-09 | Session-Level Dynamic Ad Load Optimization using Offline Robust Reinforcement Learning | Tao Liu et.al. | 2501.05591 | translate | read | null |
| 2025-01-09 | TimeRL: Efficient Deep Reinforcement Learning with Polyhedral Dependence Graphs | Pedro F. Silvestre et.al. | 2501.05408 | translate | read | null |
| 2025-01-09 | Search-o1: Agentic Search-Enhanced Large Reasoning Models | Xiaoxi Li et.al. | 2501.05366 | translate | read | link |
| 2025-01-09 | Knowledge Transfer in Model-Based Reinforcement Learning Agents for Efficient Multi-Task Learning | Dmytro Kuzmenko et.al. | 2501.05329 | translate | read | null |
| 2025-01-09 | Design and Control of a Bipedal Robotic Character | Ruben Grandia et.al. | 2501.05204 | translate | read | null |
| 2025-01-09 | Constrained Optimization of Charged Particle Tracking with Multi-Agent Reinforcement Learning | Tobias Kortus et.al. | 2501.05113 | translate | read | null |
| 2025-01-09 | LearningFlow: Automated Policy Learning Workflow for Urban Driving with Large Language Models | Zengqi Peng et.al. | 2501.05057 | translate | read | null |
| 2025-01-09 | CuRLA: Curriculum Learning Based Deep Reinforcement Learning for Autonomous Driving | Bhargava Uppuluri et.al. | 2501.04982 | translate | read | null |
| 2025-01-09 | Promoting Shared Energy Storage Aggregation among High Price-Tolerance Prosumer: An Incentive Deposit and Withdrawal Service | Xin Lu et.al. | 2501.04964 | translate | read | null |
| 2025-01-09 | Balancing Exploration and Cybersickness: Investigating Curiosity-Driven Behavior in Virtual Environments | Tangyao Li et.al. | 2501.04905 | translate | read | null |
| 2025-01-08 | Multilinear Tensor Low-Rank Approximation for Policy-Gradient Methods in Reinforcement Learning | Sergio Rozada et.al. | 2501.04879 | translate | read | null |
| 2025-01-08 | Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought | Violet Xiang et.al. | 2501.04682 | translate | read | null |
| 2025-01-08 | Framework for Integrating Machine Learning Methods for Path-Aware Source Routing | Anees Al-Najjar et.al. | 2501.04624 | translate | read | null |
| 2025-01-08 | MobileH2R: Learning Generalizable Human to Mobile Robot Handover Exclusively from Scalable and Diverse Synthetic Data | Zifan Wang et.al. | 2501.04595 | translate | read | null |
| 2025-01-08 | HypeRL: Parameter-Informed Reinforcement Learning for Parametric PDEs | Nicolò Botteghi et.al. | 2501.04538 | translate | read | null |
| 2025-01-08 | Safe Reinforcement Learning with Minimal Supervision | Alexander Quessy et.al. | 2501.04481 | translate | read | null |
| 2025-01-08 | Research on environment perception and behavior prediction of intelligent UAV based on semantic communication | Kechong Ren et.al. | 2501.04480 | translate | read | null |
| 2025-01-08 | Hybrid Artificial Intelligence Strategies for Drone Navigation | Rubén San-Segundo et.al. | 2501.04472 | translate | read | null |
| 2025-01-08 | Risk-averse policies for natural gas futures trading using distributional reinforcement learning | Félicien Hêche et.al. | 2501.04421 | translate | read | null |
| 2025-01-08 | Constraints as Rewards: Reinforcement Learning for Robots without Reward Functions | Yu Ishihara et.al. | 2501.04228 | translate | read | null |
| 2025-01-07 | Explainable Reinforcement Learning via Temporal Policy Decomposition | Franco Ruggeri et.al. | 2501.03902 | translate | read | null |
| 2025-01-07 | Neural DNF-MT: A Neuro-symbolic Approach for Learning Interpretable and Editable Policies | Kexin Gu Baugh et.al. | 2501.03888 | translate | read | null |
| 2025-01-07 | AlphaPO – Reward shape matters for LLM alignment | Aman Gupta et.al. | 2501.03884 | translate | read | null |
| 2025-01-07 | Online Reinforcement Learning-Based Dynamic Adaptive Evaluation Function for Real-Time Strategy Tasks | Weilong Yang et.al. | 2501.03824 | translate | read | null |
| 2025-01-07 | Run-and-tumble chemotaxis using reinforcement learning | Ramesh Pramanik et.al. | 2501.03687 | translate | read | null |
| 2025-01-07 | IEEE 802.11bn Multi-AP Coordinated Spatial Reuse with Hierarchical Multi-Armed Bandits | Maksymilian Wojnar et.al. | 2501.03680 | translate | read | null |
| 2025-01-07 | SALE-Based Offline Reinforcement Learning with Ensemble Q-Networks | Zheng Chun et.al. | 2501.03676 | translate | read | null |
| 2025-01-07 | Imitation Learning of MPC with Neural Networks: Error Guarantees and Sparsification | Hendrik Alsmeier et.al. | 2501.03671 | translate | read | null |
| 2025-01-07 | Rethinking Adversarial Attacks in Reinforcement Learning from Policy Distribution Perspective | Tianyang Duan et.al. | 2501.03562 | translate | read | null |
| 2025-01-07 | Align-Pro: A Principled Approach to Prompt Optimization for LLM Alignment | Prashant Trivedi et.al. | 2501.03486 | translate | read | null |
| 2025-01-06 | Turn-based Multi-Agent Reinforcement Learning Model Checking | Dennis Gross et.al. | 2501.03187 | translate | read | null |
| 2025-01-06 | Co-Activation Graph Analysis of Safety-Verified and Explainable Deep Reinforcement Learning Policies | Dennis Gross et.al. | 2501.03142 | translate | read | null |
| 2025-01-06 | CALM: Curiosity-Driven Auditing for Large Language Models | Xiang Zheng et.al. | 2501.02997 | translate | read | null |
| 2025-01-06 | CAMP: Collaborative Attention Model with Profiles for Vehicle Routing Problems | Chuanbo Hua et.al. | 2501.02977 | translate | read | null |
| 2025-01-06 | Sim-to-Real Transfer for Mobile Robots with Reinforcement Learning: from NVIDIA Isaac Sim to Gazebo and Real ROS 2 Robots | Sahar Salimpour et.al. | 2501.02902 | translate | read | link |
| 2025-01-06 | Revisiting Communication Efficiency in Multi-Agent Reinforcement Learning from the Dimensional Analysis Perspective | Chuxiong Sun et.al. | 2501.02888 | translate | read | null |
| 2025-01-06 | First-place Solution for Streetscape Shop Sign Recognition Competition | Bin Wang et.al. | 2501.02811 | translate | read | null |
| 2025-01-06 | Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model | Yueqin Yin et.al. | 2501.02790 | translate | read | null |
| 2025-01-06 | Joint Optimization of UAV-Carried IRS for Urban Low Altitude mmWave Communications with Deep Reinforcement Learning | Wenwen Xie et.al. | 2501.02787 | translate | read | null |
| 2025-01-06 | Learn A Flexible Exploration Model for Parameterized Action Markov Decision Processes | Zijian Wang et.al. | 2501.02774 | translate | read | null |
| 2025-01-03 | Evaluating Scenario-based Decision-making for Interactive Autonomous Driving Using Rational Criteria: A Survey | Zhen Tian et.al. | 2501.01886 | translate | read | null |
| 2025-01-03 | Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models | Yanjiang Liu et.al. | 2501.01830 | translate | read | null |
| 2025-01-03 | Genetic algorithm enhanced Solovay-Kitaev algorithm for quantum compiling | Jiangwei Long et.al. | 2501.01746 | translate | read | null |
| 2025-01-03 | Proposing Hierarchical Goal-Conditioned Policy Planning in Multi-Goal Reinforcement Learning | Gavin B. Rens et.al. | 2501.01727 | translate | read | null |
| 2025-01-03 | Inversely Learning Transferable Rewards via Abstracted States | Yikang Gui et.al. | 2501.01669 | translate | read | null |
| 2025-01-03 | BLAST: A Stealthy Backdoor Leverage Attack against Cooperative Multi-Agent Deep Reinforcement Learning based Systems | Yinbo Yu et.al. | 2501.01593 | translate | read | null |
| 2025-01-02 | Reinforcement-learning-based control of turbulent channel flows at high Reynolds numbers | Zisong Zhou et.al. | 2501.01573 | translate | read | null |
| 2025-01-02 | Reinforcement Learning for Respondent-Driven Sampling | Justin Weltz et.al. | 2501.01505 | translate | read | null |
| 2025-01-02 | Decoding Knowledge in Large Language Models: A Framework for Categorization and Comprehension | Yanbo Fang et.al. | 2501.01332 | translate | read | null |
| 2025-01-02 | Towards Intelligent Antenna Positioning: Leveraging DRL for FAS-Aided ISAC Systems | Shunxing Yang et.al. | 2501.01281 | translate | read | null |
| 2025-01-02 | PIMAEX: Multi-Agent Exploration through Peer Incentivization | Michael Kölle et.al. | 2501.01266 | translate | read | null |
| 2025-01-02 | Embodied AI-Enhanced Vehicular Networks: An Integrated Large Language Models and Reinforcement Learning Method | Ruichen Zhang et.al. | 2501.01141 | translate | read | null |
| 2025-01-02 | Communicating Unexpectedness for Out-of-Distribution Multi-Agent Reinforcement Learning | Min Whoo Lee et.al. | 2501.01140 | translate | read | null |
| 2025-01-02 | Symmetries-enhanced Multi-Agent Reinforcement Learning | Nikolaos Bousias et.al. | 2501.01136 | translate | read | null |
| 2025-01-02 | Noise-Resilient Symbolic Regression with Dynamic Gating Reinforcement Learning | Chenglu Sun et.al. | 2501.01085 | translate | read | null |
| 2025-01-02 | Enhancing Neural Adaptive Wireless Video Streaming via Lower-Layer Information Exposure and Online Tuning | Lingzhi Zhao et.al. | 2501.01044 | translate | read | null |
| 2025-01-02 | Energy-Efficient and Intelligent ISAC in V2X Networks with Spiking Neural Networks-Driven DRL | Chen Shang et.al. | 2501.01038 | translate | read | null |
| 2025-01-02 | Deep Reinforcement Learning for Job Scheduling and Resource Management in Cloud Computing: An Algorithm-Level Review | Yan Gu et.al. | 2501.01007 | translate | read | null |
(<a href=../Reinforcement_Learning.md>back to Reinforcement Learning</a>)