Reinforcement Learning - 2025-01

Publish Date	Title	Authors	PDF	Translate	Read	Code
2025-01-31	Vintix: Action Model via In-Context Reinforcement Learning	Andrey Polubarov et.al.	2501.19400	translate	read	link
2025-01-31	The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking	Yuchun Miao et.al.	2501.19358	translate	read	null
2025-01-31	Jackpot! Alignment as a Maximal Lottery	Roberto-Rafael Maura-Rivero et.al.	2501.19266	translate	read	null
2025-01-31	Objective Metrics for Human-Subjects Evaluation in Explainable Reinforcement Learning	Balint Gyevnar et.al.	2501.19256	translate	read	null
2025-01-31	Linear $Q$-Learning Does Not Diverge: Convergence Rates to a Bounded Set	Xinyu Liu et.al.	2501.19254	translate	read	null
2025-01-31	An Empirical Game-Theoretic Analysis of Autonomous Cyber-Defence Agents	Gregory Palmer et.al.	2501.19206	translate	read	null
2025-01-31	APEX: Automated Parameter Exploration for Low-Power Wireless Protocols	Mohamed Hassaan M. Hydher et.al.	2501.19194	translate	read	null
2025-01-31	Test-Time Training Scaling for Chemical Exploration in Drug Design	Morgan Thomas et.al.	2501.19153	translate	read	null
2025-01-31	Decorrelated Soft Actor-Critic for Efficient Deep Reinforcement Learning	Burcu Küçükoğlu et.al.	2501.19133	translate	read	null
2025-01-30	Design and Validation of Learning Aware HMI For Learning-Enabled Increasingly Autonomous Systems	Parth Ganeriwala et.al.	2501.18506	translate	read	null
2025-01-30	Curriculum-based Sample Efficient Reinforcement Learning for Robust Stabilization of a Quadrotor	Fausto Mauricio Lagos Suarez et.al.	2501.18490	translate	read	null
2025-01-30	Model-Free RL Agents Demonstrate System 1-Like Intentionality	Hal Ashton et.al.	2501.18299	translate	read	null
2025-01-30	Neural Operator based Reinforcement Learning for Control of first-order PDEs with Spatially-Varying State Delay	Jiaqi Hu et.al.	2501.18201	translate	read	null
2025-01-30	QNN-QRL: Quantum Neural Network Integrated with Quantum Reinforcement Learning for Quantum Key Distribution	Bikash K. Behera et.al.	2501.18188	translate	read	null
2025-01-30	Investigating Tax Evasion Emergence Using Dual Large Language Model and Deep Reinforcement Learning Powered Agent-based Simulation	Teddy Lazebnik et.al.	2501.18177	translate	read	null
2025-01-30	B3C: A Minimalist Approach to Offline Multi-Agent Reinforcement Learning	Woojun Kim et.al.	2501.18138	translate	read	null
2025-01-30	Diverse Preference Optimization	Jack Lanchantin et.al.	2501.18101	translate	read	null
2025-01-30	Reward Prediction Error Prioritisation in Experience Replay: The RPE-PER Method	Hoda Yamani et.al.	2501.18093	translate	read	null
2025-01-30	DIAL: Distribution-Informed Adaptive Learning of Multi-Task Constraints for Safety-Critical Systems	Se-Wook Yoo et.al.	2501.18086	translate	read	null
2025-01-29	From Sparse to Dense: Toddler-inspired Reward Transition in Goal-Oriented Reinforcement Learning	Junseok Park et.al.	2501.17842	translate	read	null
2025-01-29	Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning	Haque Ishfaq et.al.	2501.17827	translate	read	null
2025-01-29	Consensus Based Stochastic Control	Liyao Lyu et.al.	2501.17801	translate	read	null
2025-01-29	CAMP in the Odyssey: Provably Robust Reinforcement Learning with Certified Radius Maximization	Derui Wang et.al.	2501.17667	translate	read	link
2025-01-29	Accelerated DC loadflow solver for topology optimization	Nico Westerbeck et.al.	2501.17529	translate	read	null
2025-01-29	Human-Aligned Skill Discovery: Balancing Behaviour Exploration and Alignment	Maxence Hussonnois et.al.	2501.17431	translate	read	null
2025-01-29	Certificated Actor-Critic: Hierarchical Reinforcement Learning with Control Barrier Functions for Safe Navigation	Junjun Xie et.al.	2501.17424	translate	read	null
2025-01-29	Value Function Decomposition in Markov Recommendation Process	Xiaobei Wang et.al.	2501.17409	translate	read	null
2025-01-29	A Dual-Agent Adversarial Framework for Robust Generalization in Deep Reinforcement Learning	Zhengpeng Xie et.al.	2501.17384	translate	read	null
2025-01-29	ASAP: Learning Generalizable Online Bin Packing via Adaptive Selection After Pruning	Han Fang et.al.	2501.17377	translate	read	null
2025-01-28	SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training	Tianzhe Chu et.al.	2501.17161	translate	read	null
2025-01-28	Evidence on the Regularisation Properties of Maximum-Entropy Reinforcement Learning	Rémy Hosseinkhan Boucher et.al.	2501.17115	translate	read	null
2025-01-28	Unlocking Transparent Alignment Through Enhanced Inverse Constitutional AI for Principle Extraction	Carl-Leander Henneking et.al.	2501.17112	translate	read	null
2025-01-28	COS(M+O)S: Curiosity and RL-Enhanced MCTS for Exploring Story Space via Language Models	Tobias Materzok et.al.	2501.17104	translate	read	null
2025-01-28	Learning Mean Field Control on Sparse Graphs	Christian Fabian et.al.	2501.17079	translate	read	null
2025-01-28	Induced Modularity and Community Detection for Functionally Interpretable Reinforcement Learning	Anna Soligo et.al.	2501.17077	translate	read	null
2025-01-28	Challenges in Ensuring AI Safety in DeepSeek-R1 Models: The Shortcomings of Reinforcement Learning Strategies	Manojkumar Parmar et.al.	2501.17030	translate	read	null
2025-01-28	Network Slice-based Low-Altitude Intelligent Network for Advanced Air Mobility	Kai Xiong et.al.	2501.17014	translate	read	null
2025-01-28	Heterogeneity-aware Personalized Federated Learning via Adaptive Dual-Agent Reinforcement Learning	Xi Chen et.al.	2501.16966	translate	read	null
2025-01-28	On Rollouts in Model-Based Reinforcement Learning	Bernd Frauenknecht et.al.	2501.16918	translate	read	link
2025-01-27	Upside Down Reinforcement Learning with Policy Generators	Jacopo Di Ventura et.al.	2501.16288	translate	read	link
2025-01-27	Accelerating Quantum Reinforcement Learning with a Quantum Natural Policy Gradient Based Approach	Yang Xu et.al.	2501.16243	translate	read	null
2025-01-27	Towards General-Purpose Model-Free Reinforcement Learning	Scott Fujimoto et.al.	2501.16142	translate	read	link
2025-01-27	Quantifying the Self-Interest Level of Markov Social Dilemmas	Richard Willis et.al.	2501.16138	translate	read	null
2025-01-27	ReFill: Reinforcement Learning for Fill-In Minimization	Elfarouk Harb et.al.	2501.16130	translate	read	null
2025-01-27	Multi-Agent Meta-Offline Reinforcement Learning for Timely UAV Path Planning and Data Collection	Eslam Eldeeb et.al.	2501.16098	translate	read	null
2025-01-27	Flexible Blood Glucose Control: Offline Reinforcement Learning from Human Feedback	Harry Emerson et.al.	2501.15972	translate	read	null
2025-01-27	REINFORCE-ING Chemical Language Models in Drug Design	Morgan Thomas et.al.	2501.15971	translate	read	null
2025-01-27	Inverse Reinforcement Learning via Convex Optimization	Hao Zhu et.al.	2501.15957	translate	read	null
2025-01-27	Generative AI for Lyapunov Optimization Theory in UAV-based Low-Altitude Economy Networking	Zhang Liu et.al.	2501.15928	translate	read	null
2025-01-24	An Attentive Graph Agent for Topology-Adaptive Cyber Defence	Ilya Orson Sandoval et.al.	2501.14700	translate	read	link
2025-01-24	ACT-JEPA: Joint-Embedding Predictive Architecture Improves Policy Representation Learning	Aleksandar Vujinovic et.al.	2501.14622	translate	read	null
2025-01-24	COMIX: Generalized Conflict Management in O-RAN xApps – Architecture, Workflow, and a Power Control case	Anastasios Giannopoulos et.al.	2501.14619	translate	read	null
2025-01-24	Age and Power Minimization via Meta-Deep Reinforcement Learning in UAV Networks	Sankani Sarathchandra et.al.	2501.14603	translate	read	null
2025-01-24	Reducing Action Space for Deep Reinforcement Learning via Causal Effect Estimation	Wenzhang Liu et.al.	2501.14543	translate	read	link
2025-01-24	Breaking the Pre-Planning Barrier: Real-Time Adaptive Coordination of Mission and Charging UAVs Using Graph Reinforcement Learning	Yuhan Hu et.al.	2501.14488	translate	read	null
2025-01-24	MARL-OT: Multi-Agent Reinforcement Learning Guided Online Fuzzing to Detect Safety Violation in Autonomous Driving Systems	Linfeng Liang et.al.	2501.14451	translate	read	null
2025-01-24	Learning more with the same effort: how randomization improves the robustness of a robotic deep reinforcement learning agent	Lucía Güitta-López et.al.	2501.14443	translate	read	null
2025-01-24	SKIL: Semantic Keypoint Imitation Learning for Generalizable Data-efficient Manipulation	Shengjie Wang et.al.	2501.14400	translate	read	null
2025-01-24	Reinforcement Learning for Efficient Returns Management	Pascal Linden et.al.	2501.14394	translate	read	null
2025-01-23	CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation	Guofeng Cui et.al.	2501.13927	translate	read	null
2025-01-23	Improving Video Generation with Human Feedback	Jie Liu et.al.	2501.13918	translate	read	link
2025-01-23	GUI-Bee: Align GUI Action Grounding to Novel Environments via Autonomous Exploration	Yue Fan et.al.	2501.13896	translate	read	null
2025-01-23	Utilizing Evolution Strategies to Train Transformers in Reinforcement Learning	Matyáš Lorenc et.al.	2501.13883	translate	read	link
2025-01-23	A space-decoupling framework for optimization on bounded-rank matrices with orthogonally invariant constraints	Yan Yang et.al.	2501.13830	translate	read	null
2025-01-23	Large Language Model driven Policy Exploration for Recommender Systems	Jie Wang et.al.	2501.13816	translate	read	null
2025-01-23	Integrating Causality with Neurochaos Learning: Proposed Approach and Research Agenda	Nanjangud C. Narendra et.al.	2501.13763	translate	read	null
2025-01-23	Scalable Safe Multi-Agent Reinforcement Learning for Multi-Agent System	Haikuo Du et.al.	2501.13727	translate	read	null
2025-01-23	WFCRL: A Multi-Agent Reinforcement Learning Benchmark for Wind Farm Control	Claire Bizon Monroc et.al.	2501.13592	translate	read	link
2025-01-23	Explainable AI-aided Feature Selection and Model Reduction for DRL-based V2X Resource Allocation	Nasir Khan et.al.	2501.13552	translate	read	null
2025-01-22	Which Sensor to Observe? Timely Tracking of a Joint Markov Source with Model Predictive Control	Ismail Cosandal et.al.	2501.13099	translate	read	null
2025-01-22	Attention-Driven Hierarchical Reinforcement Learning with Particle Filtering for Source Localization in Dynamic Fields	Yiwei Shi et.al.	2501.13084	translate	read	null
2025-01-22	Evolution and The Knightian Blindspot of Machine Learning	Joel Lehman et.al.	2501.13075	translate	read	null
2025-01-22	AdaWM: Adaptive World Model based Planning for Autonomous Driving	Hang Wang et.al.	2501.13072	translate	read	null
2025-01-22	Optimizing Return Distributions with Distributional Dynamic Programming	Bernardo Ávila Pires et.al.	2501.13028	translate	read	null
2025-01-22	MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking	Sebastian Farquhar et.al.	2501.13011	translate	read	null
2025-01-22	An Offline Multi-Agent Reinforcement Learning Framework for Radio Resource Management	Eslam Eldeeb et.al.	2501.12991	translate	read	null
2025-01-22	DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning	DeepSeek-AI et.al.	2501.12948	translate	read	link
2025-01-22	Offline Critic-Guided Diffusion Policy for Multi-User Delay-Constrained Scheduling	Zhuoran Li et.al.	2501.12942	translate	read	null
2025-01-22	Reinforcement learning Based Automated Design of Differential Evolution Algorithm for Black-box Optimization	Xu Yang et.al.	2501.12881	translate	read	null
2025-01-21	InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model	Yuhang Zang et.al.	2501.12368	translate	read	link
2025-01-21	ARM-IRL: Adaptive Resilience Metric Quantification Using Inverse Reinforcement Learning	Abhijeet Sahu et.al.	2501.12362	translate	read	null
2025-01-21	Sum Rate Enhancement using Machine Learning for Semi-Self Sensing Hybrid RIS-Enabled ISAC in THz Bands	Sara Farrag Mobarak et.al.	2501.12353	translate	read	null
2025-01-21	Towards neural reinforcement learning for large deviations in nonequilibrium systems with memory	Venkata D. Pamulaparthy et.al.	2501.12333	translate	read	null
2025-01-21	Heuristic Deep Reinforcement Learning for Phase Shift Optimization in RIS-assisted Secure Satellite Communication Systems with RSMA	Tingnan Bao et.al.	2501.12311	translate	read	null
2025-01-21	RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression	Uri Gadot et.al.	2501.12216	translate	read	null
2025-01-21	Experience-replay Innovative Dynamics	Tuo Zhang et.al.	2501.12199	translate	read	null
2025-01-21	Extend Adversarial Policy Against Neural Machine Translation via Unknown Token	Wei Zou et.al.	2501.12183	translate	read	null
2025-01-21	DNRSelect: Active Best View Selection for Deferred Neural Rendering	Dongli Wu et.al.	2501.12150	translate	read	null
2025-01-21	Tackling Uncertainties in Multi-Agent Reinforcement Learning through Integration of Agent Termination Dynamics	Somnath Hazra et.al.	2501.12061	translate	read	link
2025-01-17	DexForce: Extracting Force-informed Actions from Kinesthetic Demonstrations for Dexterous Manipulation	Claire Chen et.al.	2501.10356	translate	read	null
2025-01-17	Enhancing AI Transparency: XRL-Based Resource Management and RAN Slicing for 6G ORAN Architecture	Suvidha Mhatre et.al.	2501.10292	translate	read	null
2025-01-17	Enhancing UAV Path Planning Efficiency Through Accelerated Learning	Joseanne Viana et.al.	2501.10141	translate	read	null
2025-01-17	Spatio-temporal Graph Learning on Adaptive Mined Key Frames for High-performance Multi-Object Tracking	Futian Wang et.al.	2501.10129	translate	read	null
2025-01-17	PaSa: An LLM Agent for Comprehensive Academic Paper Search	Yichen He et.al.	2501.10120	translate	read	link
2025-01-17	GAWM: Global-Aware World Model for Multi-Agent Reinforcement Learning	Zifeng Shi et.al.	2501.10116	translate	read	null
2025-01-17	Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics	Chenhao Li et.al.	2501.10100	translate	read	null
2025-01-17	ForestProtector: An IoT Architecture Integrating Machine Vision and Deep Reinforcement Learning for Efficient Wildfire Monitoring	Kenneth Bonilla-Ormachea et.al.	2501.09926	translate	read	null
2025-01-17	SLIM: Sim-to-Real Legged Instructive Manipulation via Long-Horizon Visuomotor Learning	Haichao Zhang et.al.	2501.09905	translate	read	null
2025-01-16	From Explainability to Interpretability: Interpretable Policies in Reinforcement Learning Via Model Explanation	Peilang Li et.al.	2501.09858	translate	read	null
2025-01-16	Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models	Fengli Xu et.al.	2501.09686	translate	read	null
2025-01-16	Optimizing hypergraph product codes with random walks, simulated annealing and reinforcement learning	Bruno C. A. Freire et.al.	2501.09622	translate	read	null
2025-01-16	Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment	Chaoqi Wang et.al.	2501.09620	translate	read	null
2025-01-16	EVaDE : Event-Based Variational Thompson Sampling for Model-Based Reinforcement Learning	Siddharth Aravindan et.al.	2501.09611	translate	read	null
2025-01-16	RE-POSE: Synergizing Reinforcement Learning-Based Partitioning and Offloading for Edge Object Detection	Jianrui Shi et.al.	2501.09465	translate	read	null
2025-01-16	ADAGE: A generic two-layer framework for adaptive agent based modelling	Benjamin Patrick Evans et.al.	2501.09429	translate	read	null
2025-01-16	Fast Searching of Extreme Operating Conditions for Relay Protection Setting Calculation Based on Graph Neural Network and Reinforcement Learning	Yan Li et.al.	2501.09399	translate	read	null
2025-01-16	Contract-Inspired Contest Theory for Controllable Image Generation in Mobile Edge Metaverse	Guangyuan Liu et.al.	2501.09391	translate	read	null
2025-01-16	Adaptive Contextual Caching for Mobile Edge Large Language Model Service	Guangyuan Liu et.al.	2501.09383	translate	read	null
2025-01-16	Solving Infinite-Player Games with Player-to-Strategy Networks	Carlos Martin et.al.	2501.09330	translate	read	null
2025-01-15	Computing Approximated Fixpoints via Dampened Mann Iteration	Paolo Baldan et.al.	2501.08950	translate	read	null
2025-01-15	A Reinforcement Learning Approach to Quiet and Safe UAM Traffic Management	Surya Murthy et.al.	2501.08941	translate	read	null
2025-01-15	Reinforcement learning-based adaptive time-integration for nonsmooth dynamics	David Riley et.al.	2501.08934	translate	read	null
2025-01-15	Projection Implicit Q-Learning with Support Constraint for Offline Reinforcement Learning	Xinchen Han et.al.	2501.08907	translate	read	null
2025-01-15	Deep Learning Meets Queue-Reactive: A Framework for Realistic Limit Order Book Simulation	Hamza Bodor et.al.	2501.08822	translate	read	null
2025-01-15	Multi-visual modality micro drone-based structural damage detection	Isaac Osei Agyemanga et.al.	2501.08807	translate	read	null
2025-01-15	Networked Agents in the Dark: Team Value Learning under Partial Observability	Guilherme S. Varela et.al.	2501.08778	translate	read	null
2025-01-15	SPEQ: Stabilization Phases for Efficient Q-Learning in High Update-To-Data Ratio Reinforcement Learning	Carlo Romeo et.al.	2501.08669	translate	read	null
2025-01-15	Application of Deep Reinforcement Learning to UAV Swarming for Ground Surveillance	Raúl Arranz et.al.	2501.08655	translate	read	null
2025-01-15	RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation	Kaiqu Liang et.al.	2501.08617	translate	read	null
2025-01-14	FDPP: Fine-tune Diffusion Policy with Human Preference	Yuxin Chen et.al.	2501.08259	translate	read	null
2025-01-14	Dynamic Pricing in High-Speed Railways Using Multi-Agent Reinforcement Learning	Enrique Adrian Villarrubia-Martin et.al.	2501.08234	translate	read	null
2025-01-14	Optimization of Link Configuration for Satellite Communication Using Reinforcement Learning	Tobias Rohe et.al.	2501.08220	translate	read	null
2025-01-14	In-situ graph reasoning and knowledge expansion using Graph-PReFLexOR	Markus J. Buehler et.al.	2501.08120	translate	read	null
2025-01-14	Data-driven inventory management for new products: A warm-start and adjusted Dyna-$Q$ approach	Xinyu Qu et.al.	2501.08109	translate	read	null
2025-01-14	Hybrid Action Based Reinforcement Learning for Multi-Objective Compatible Autonomous Driving	Guizhe Jin et.al.	2501.08096	translate	read	null
2025-01-14	CuAsmRL: Optimizing GPU SASS Schedules via Deep Reinforcement Learning	Guoliang He et.al.	2501.08071	translate	read	null
2025-01-14	Continual Reinforcement Learning for Digital Twin Synchronization Optimization	Haonan Tong et.al.	2501.08045	translate	read	null
2025-01-14	READ: Reinforcement-based Adversarial Learning for Text Classification with Limited Labeled Data	Rohit Sharma et.al.	2501.08035	translate	read	null
2025-01-14	Cooperative Patrol Routing: Optimizing Urban Crime Surveillance through Multi-Agent Reinforcement Learning	Juan Palma-Borda et.al.	2501.08020	translate	read	null
2025-01-13	SafeSwarm: Decentralized Safe RL for the Swarm of Drones Landing in Dense Crowds	Grik Tadevosyan et.al.	2501.07566	translate	read	null
2025-01-13	Improving DeFi Accessibility through Efficient Liquidity Provisioning with Deep Reinforcement Learning	Haonan Xu et.al.	2501.07508	translate	read	null
2025-01-13	RbRL2.0: Integrated Reward and Policy Learning for Rating-based Reinforcement Learning	Mingkang Wu et.al.	2501.07502	translate	read	null
2025-01-13	Online inductive learning from answer sets for efficient reinforcement learning exploration	Celeste Veronese et.al.	2501.07445	translate	read	null
2025-01-13	Attention when you need	Lokesh Boominathan et.al.	2501.07440	translate	read	null
2025-01-13	Enhancing Online Reinforcement Learning with Meta-Learned Objective from Offline Data	Shilong Deng et.al.	2501.07346	translate	read	link
2025-01-13	Foundation Models at Work: Fine-Tuning for Fairness in Algorithmic Hiring	Buse Sibel Korkmaz et.al.	2501.07324	translate	read	link
2025-01-13	Mining Intraday Risk Factor Collections via Hierarchical Reinforcement Learning based on Transferred Options	Wenyan Xu et.al.	2501.07274	translate	read	null
2025-01-13	Future-Conditioned Recommendations with Multi-Objective Controllable Decision Transformer	Chongming Gao et.al.	2501.07212	translate	read	null
2025-01-13	Generalizable Graph Neural Networks for Robust Power Grid Topology Control	Matthijs de Jong et.al.	2501.07186	translate	read	null
2025-01-10	From discrete-time policies to continuous-time diffusion samplers: Asymptotic equivalences and faster training	Julius Berner et.al.	2501.06148	translate	read	link
2025-01-10	Vehicle-in-Virtual-Environment (VVE) Based Autonomous Driving Function Development and Evaluation Methodology for Vulnerable Road User Safety	Haochong Chen et.al.	2501.06113	translate	read	null
2025-01-10	Learning Flexible Heterogeneous Coordination with Capability-Aware Shared Hypernetworks	Kevin Fu et.al.	2501.06058	translate	read	null
2025-01-10	Investigating the Impact of Observation Space Design Choices On Training Reinforcement Learning Solutions for Spacecraft Problems	Nathaniel Hamilton et.al.	2501.06016	translate	read	null
2025-01-10	The Safe Trusted Autonomy for Responsible Space Program	Kerianne L. Hobbs et.al.	2501.05984	translate	read	null
2025-01-10	A Practical Demonstration of DRL-Based Dynamic Resource Allocation xApp Using OpenAirInterface	Onur Sever et.al.	2501.05879	translate	read	null
2025-01-10	Diffusion Models for Smarter UAVs: Decision-Making and Modeling	Yousef Emami et.al.	2501.05819	translate	read	null
2025-01-10	Real-Time Integrated Dispatching and Idle Fleet Steering with Deep Reinforcement Learning for A Meal Delivery Platform	Jingyi Cheng et.al.	2501.05808	translate	read	null
2025-01-10	Understanding Impact of Human Feedback via Influence Functions	Taywon Min et.al.	2501.05790	translate	read	link
2025-01-09	Session-Level Dynamic Ad Load Optimization using Offline Robust Reinforcement Learning	Tao Liu et.al.	2501.05591	translate	read	null
2025-01-09	TimeRL: Efficient Deep Reinforcement Learning with Polyhedral Dependence Graphs	Pedro F. Silvestre et.al.	2501.05408	translate	read	null
2025-01-09	Search-o1: Agentic Search-Enhanced Large Reasoning Models	Xiaoxi Li et.al.	2501.05366	translate	read	link
2025-01-09	Knowledge Transfer in Model-Based Reinforcement Learning Agents for Efficient Multi-Task Learning	Dmytro Kuzmenko et.al.	2501.05329	translate	read	null
2025-01-09	Design and Control of a Bipedal Robotic Character	Ruben Grandia et.al.	2501.05204	translate	read	null
2025-01-09	Constrained Optimization of Charged Particle Tracking with Multi-Agent Reinforcement Learning	Tobias Kortus et.al.	2501.05113	translate	read	null
2025-01-09	LearningFlow: Automated Policy Learning Workflow for Urban Driving with Large Language Models	Zengqi Peng et.al.	2501.05057	translate	read	null
2025-01-09	CuRLA: Curriculum Learning Based Deep Reinforcement Learning for Autonomous Driving	Bhargava Uppuluri et.al.	2501.04982	translate	read	null
2025-01-09	Promoting Shared Energy Storage Aggregation among High Price-Tolerance Prosumer: An Incentive Deposit and Withdrawal Service	Xin Lu et.al.	2501.04964	translate	read	null
2025-01-09	Balancing Exploration and Cybersickness: Investigating Curiosity-Driven Behavior in Virtual Environments	Tangyao Li et.al.	2501.04905	translate	read	null
2025-01-08	Multilinear Tensor Low-Rank Approximation for Policy-Gradient Methods in Reinforcement Learning	Sergio Rozada et.al.	2501.04879	translate	read	null
2025-01-08	Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought	Violet Xiang et.al.	2501.04682	translate	read	null
2025-01-08	Framework for Integrating Machine Learning Methods for Path-Aware Source Routing	Anees Al-Najjar et.al.	2501.04624	translate	read	null
2025-01-08	MobileH2R: Learning Generalizable Human to Mobile Robot Handover Exclusively from Scalable and Diverse Synthetic Data	Zifan Wang et.al.	2501.04595	translate	read	null
2025-01-08	HypeRL: Parameter-Informed Reinforcement Learning for Parametric PDEs	Nicolò Botteghi et.al.	2501.04538	translate	read	null
2025-01-08	Safe Reinforcement Learning with Minimal Supervision	Alexander Quessy et.al.	2501.04481	translate	read	null
2025-01-08	Research on environment perception and behavior prediction of intelligent UAV based on semantic communication	Kechong Ren et.al.	2501.04480	translate	read	null
2025-01-08	Hybrid Artificial Intelligence Strategies for Drone Navigation	Rubén San-Segundo et.al.	2501.04472	translate	read	null
2025-01-08	Risk-averse policies for natural gas futures trading using distributional reinforcement learning	Félicien Hêche et.al.	2501.04421	translate	read	null
2025-01-08	Constraints as Rewards: Reinforcement Learning for Robots without Reward Functions	Yu Ishihara et.al.	2501.04228	translate	read	null
2025-01-07	Explainable Reinforcement Learning via Temporal Policy Decomposition	Franco Ruggeri et.al.	2501.03902	translate	read	null
2025-01-07	Neural DNF-MT: A Neuro-symbolic Approach for Learning Interpretable and Editable Policies	Kexin Gu Baugh et.al.	2501.03888	translate	read	null
2025-01-07	AlphaPO – Reward shape matters for LLM alignment	Aman Gupta et.al.	2501.03884	translate	read	null
2025-01-07	Online Reinforcement Learning-Based Dynamic Adaptive Evaluation Function for Real-Time Strategy Tasks	Weilong Yang et.al.	2501.03824	translate	read	null
2025-01-07	Run-and-tumble chemotaxis using reinforcement learning	Ramesh Pramanik et.al.	2501.03687	translate	read	null
2025-01-07	IEEE 802.11bn Multi-AP Coordinated Spatial Reuse with Hierarchical Multi-Armed Bandits	Maksymilian Wojnar et.al.	2501.03680	translate	read	null
2025-01-07	SALE-Based Offline Reinforcement Learning with Ensemble Q-Networks	Zheng Chun et.al.	2501.03676	translate	read	null
2025-01-07	Imitation Learning of MPC with Neural Networks: Error Guarantees and Sparsification	Hendrik Alsmeier et.al.	2501.03671	translate	read	null
2025-01-07	Rethinking Adversarial Attacks in Reinforcement Learning from Policy Distribution Perspective	Tianyang Duan et.al.	2501.03562	translate	read	null
2025-01-07	Align-Pro: A Principled Approach to Prompt Optimization for LLM Alignment	Prashant Trivedi et.al.	2501.03486	translate	read	null
2025-01-06	Turn-based Multi-Agent Reinforcement Learning Model Checking	Dennis Gross et.al.	2501.03187	translate	read	null
2025-01-06	Co-Activation Graph Analysis of Safety-Verified and Explainable Deep Reinforcement Learning Policies	Dennis Gross et.al.	2501.03142	translate	read	null
2025-01-06	CALM: Curiosity-Driven Auditing for Large Language Models	Xiang Zheng et.al.	2501.02997	translate	read	null
2025-01-06	CAMP: Collaborative Attention Model with Profiles for Vehicle Routing Problems	Chuanbo Hua et.al.	2501.02977	translate	read	null
2025-01-06	Sim-to-Real Transfer for Mobile Robots with Reinforcement Learning: from NVIDIA Isaac Sim to Gazebo and Real ROS 2 Robots	Sahar Salimpour et.al.	2501.02902	translate	read	link
2025-01-06	Revisiting Communication Efficiency in Multi-Agent Reinforcement Learning from the Dimensional Analysis Perspective	Chuxiong Sun et.al.	2501.02888	translate	read	null
2025-01-06	First-place Solution for Streetscape Shop Sign Recognition Competition	Bin Wang et.al.	2501.02811	translate	read	null
2025-01-06	Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model	Yueqin Yin et.al.	2501.02790	translate	read	null
2025-01-06	Joint Optimization of UAV-Carried IRS for Urban Low Altitude mmWave Communications with Deep Reinforcement Learning	Wenwen Xie et.al.	2501.02787	translate	read	null
2025-01-06	Learn A Flexible Exploration Model for Parameterized Action Markov Decision Processes	Zijian Wang et.al.	2501.02774	translate	read	null
2025-01-03	Evaluating Scenario-based Decision-making for Interactive Autonomous Driving Using Rational Criteria: A Survey	Zhen Tian et.al.	2501.01886	translate	read	null
2025-01-03	Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models	Yanjiang Liu et.al.	2501.01830	translate	read	null
2025-01-03	Genetic algorithm enhanced Solovay-Kitaev algorithm for quantum compiling	Jiangwei Long et.al.	2501.01746	translate	read	null
2025-01-03	Proposing Hierarchical Goal-Conditioned Policy Planning in Multi-Goal Reinforcement Learning	Gavin B. Rens et.al.	2501.01727	translate	read	null
2025-01-03	Inversely Learning Transferable Rewards via Abstracted States	Yikang Gui et.al.	2501.01669	translate	read	null
2025-01-03	BLAST: A Stealthy Backdoor Leverage Attack against Cooperative Multi-Agent Deep Reinforcement Learning based Systems	Yinbo Yu et.al.	2501.01593	translate	read	null
2025-01-02	Reinforcement-learning-based control of turbulent channel flows at high Reynolds numbers	Zisong Zhou et.al.	2501.01573	translate	read	null
2025-01-02	Reinforcement Learning for Respondent-Driven Sampling	Justin Weltz et.al.	2501.01505	translate	read	null
2025-01-02	Decoding Knowledge in Large Language Models: A Framework for Categorization and Comprehension	Yanbo Fang et.al.	2501.01332	translate	read	null
2025-01-02	Towards Intelligent Antenna Positioning: Leveraging DRL for FAS-Aided ISAC Systems	Shunxing Yang et.al.	2501.01281	translate	read	null
2025-01-02	PIMAEX: Multi-Agent Exploration through Peer Incentivization	Michael Kölle et.al.	2501.01266	translate	read	null
2025-01-02	Embodied AI-Enhanced Vehicular Networks: An Integrated Large Language Models and Reinforcement Learning Method	Ruichen Zhang et.al.	2501.01141	translate	read	null
2025-01-02	Communicating Unexpectedness for Out-of-Distribution Multi-Agent Reinforcement Learning	Min Whoo Lee et.al.	2501.01140	translate	read	null
2025-01-02	Symmetries-enhanced Multi-Agent Reinforcement Learning	Nikolaos Bousias et.al.	2501.01136	translate	read	null
2025-01-02	Noise-Resilient Symbolic Regression with Dynamic Gating Reinforcement Learning	Chenglu Sun et.al.	2501.01085	translate	read	null
2025-01-02	Enhancing Neural Adaptive Wireless Video Streaming via Lower-Layer Information Exposure and Online Tuning	Lingzhi Zhao et.al.	2501.01044	translate	read	null
2025-01-02	Energy-Efficient and Intelligent ISAC in V2X Networks with Spiking Neural Networks-Driven DRL	Chen Shang et.al.	2501.01038	translate	read	null
2025-01-02	Deep Reinforcement Learning for Job Scheduling and Resource Management in Cloud Computing: An Algorithm-Level Review	Yan Gu et.al.	2501.01007	translate	read	null
