Reinforcement Learning - 2026-04

| Publish Date | Title | Authors | PDF | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2026-04-01 | Embarrassingly Simple Self-Distillation Improves Code Generation | Ruixiang Zhang et al. | 2604.01193 | translate | read | null |
| 2026-04-01 | Deep Reinforcement Learning for Robotic Manipulation under Distribution Shift with Bounded Extremum Seeking | Shaifalee Saxena et al. | 2604.01142 | translate | read | null |
| 2026-04-01 | Multi-Agent LLM Governance for Safe Two-Timescale Reinforcement Learning in SDN-IoT Defense | Saeid Jamshidi et al. | 2604.01127 | translate | read | null |
| 2026-04-01 | BAT: Balancing Agility and Stability via Online Policy Switching for Long-Horizon Whole-Body Humanoid Control | Donghoon Baek et al. | 2604.01064 | translate | read | null |
| 2026-04-01 | Adversarial Attacks in AI-Driven RAN Slicing: SLA Violations and Recovery | Deemah H. Tashman et al. | 2604.01049 | translate | read | null |
| 2026-04-01 | Query-Conditioned Evidential Keyframe Sampling for MLLM-Based Long-Form Video Understanding | Yiheng Wang et al. | 2604.01002 | translate | read | null |
| 2026-04-01 | Focal plane wavefront control with model-based reinforcement learning | Jalo Nousiainen et al. | 2604.00993 | translate | read | null |
| 2026-04-01 | Flow-based Policy With Distributional Reinforcement Learning in Trajectory Optimization | Ruijie Hao et al. | 2604.00977 | translate | read | null |
| 2026-04-01 | Policy Improvement Reinforcement Learning | Huaiyang Wang et al. | 2604.00860 | translate | read | null |
| 2026-04-01 | Disentangling to Re-couple: Resolving the Similarity-Controllability Paradox in Subject-Driven Text-to-Image Generation | Shuang Li et al. | 2604.00849 | translate | read | null |
| 2026-04-01 | Bridging RL and MPC for mixed-integer optimal control with application to Formula 1 race strategies | Joschua Wüthrich et al. | 2604.00826 | translate | read | null |
| 2026-04-01 | RefineRL: Advancing Competitive Programming with Self-Refinement Reinforcement Learning | Shaopeng Fu et al. | 2604.00790 | translate | read | null |
| 2026-04-01 | LangMARL: Natural Language Multi-Agent Reinforcement Learning | Huaiyuan Yao et al. | 2604.00722 | translate | read | null |
| 2026-04-01 | Learning to Hint for Reinforcement Learning | Yu Xia et al. | 2604.00698 | translate | read | null |
| 2026-04-01 | TTA-Vid: Generalized Test-Time Adaptation for Video Reasoning | Soumya Shamarao Jahagirdar et al. | 2604.00696 | translate | read | null |
| 2026-04-01 | Full-Gradient Successor Feature Representations | Ritish Shrirao et al. | 2604.00686 | translate | read | null |
| 2026-04-01 | A Survey of On-Policy Distillation for Large Language Models | Mingyang Song et al. | 2604.00626 | translate | read | null |
| 2026-04-01 | A Physical Imitation Learning Pipeline for Energy-Efficient Quadruped Locomotion Assisted by Parallel Elastic Joint | Huyue Ma et al. | 2604.00611 | translate | read | null |
| 2026-04-01 | Toward Efficient Deployment and Synchronization in Digital Twins-Empowered Networks | Hossam Farag et al. | 2604.00566 | translate | read | null |
| 2026-04-01 | Multi-Camera View Scaling for Data-Efficient Robot Imitation Learning | Yichen Xie et al. | 2604.00557 | translate | read | null |
| 2026-04-01 | Optimsyn: Influence-Guided Rubrics Optimization for Synthetic Data Generation | Zhiting Fan et al. | 2604.00536 | translate | read | null |
| 2026-04-01 | AceTone: Bridging Words and Colors for Conditional Image Grading | Tianren Ma et al. | 2604.00530 | translate | read | null |
| 2026-04-01 | MOON3.0: Reasoning-aware Multimodal Representation Learning for E-commerce Product Understanding | Junxian Wu et al. | 2604.00513 | translate | read | null |
| 2026-04-01 | A Reasoning-Enabled Vision-Language Foundation Model for Chest X-ray Interpretation | Yabin Zhang et al. | 2604.00493 | translate | read | null |
| 2026-04-01 | All Roads Lead to Rome: Incentivizing Divergent Thinking in Vision-Language Models | Xinyu Tian et al. | 2604.00479 | translate | read | null |
| 2026-04-01 | Execution-Verified Reinforcement Learning for Optimization Modeling | Runda Guan et al. | 2604.00442 | translate | read | null |
| 2026-04-01 | TR-ICRL: Test-Time Rethinking for In-Context Reinforcement Learning | Wenxuan Jiang et al. | 2604.00438 | translate | read | null |
| 2026-04-01 | Internal State-Based Policy Gradient Methods for Partially Observable Markov Potential Games | Wonseok Yang et al. | 2604.00433 | translate | read | null |
| 2026-04-01 | GUIDE: Reinforcement Learning for Behavioral Action Support in Type 1 Diabetes | Saman Khamesian et al. | 2604.00385 | translate | read | null |
| 2026-04-01 | Agent Q-Mix: Selecting the Right Action for LLM Multi-Agent Systems through Reinforcement Learning | Eric Hanchen Jiang et al. | 2604.00344 | translate | read | null |

(<a href=../Reinforcement_Learning.md>back to Reinforcement Learning</a>)