Reinforcement Learning - 2026-04
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2026-04-01 | Embarrassingly Simple Self-Distillation Improves Code Generation | Ruixiang Zhang et al. | 2604.01193 | translate | read | null |
| 2026-04-01 | Deep Reinforcement Learning for Robotic Manipulation under Distribution Shift with Bounded Extremum Seeking | Shaifalee Saxena et al. | 2604.01142 | translate | read | null |
| 2026-04-01 | Multi-Agent LLM Governance for Safe Two-Timescale Reinforcement Learning in SDN-IoT Defense | Saeid Jamshidi et al. | 2604.01127 | translate | read | null |
| 2026-04-01 | BAT: Balancing Agility and Stability via Online Policy Switching for Long-Horizon Whole-Body Humanoid Control | Donghoon Baek et al. | 2604.01064 | translate | read | null |
| 2026-04-01 | Adversarial Attacks in AI-Driven RAN Slicing: SLA Violations and Recovery | Deemah H. Tashman et al. | 2604.01049 | translate | read | null |
| 2026-04-01 | Query-Conditioned Evidential Keyframe Sampling for MLLM-Based Long-Form Video Understanding | Yiheng Wang et al. | 2604.01002 | translate | read | null |
| 2026-04-01 | Focal plane wavefront control with model-based reinforcement learning | Jalo Nousiainen et al. | 2604.00993 | translate | read | null |
| 2026-04-01 | Flow-based Policy With Distributional Reinforcement Learning in Trajectory Optimization | Ruijie Hao et al. | 2604.00977 | translate | read | null |
| 2026-04-01 | Policy Improvement Reinforcement Learning | Huaiyang Wang et al. | 2604.00860 | translate | read | null |
| 2026-04-01 | Disentangling to Re-couple: Resolving the Similarity-Controllability Paradox in Subject-Driven Text-to-Image Generation | Shuang Li et al. | 2604.00849 | translate | read | null |
| 2026-04-01 | Bridging RL and MPC for mixed-integer optimal control with application to Formula 1 race strategies | Joschua Wüthrich et al. | 2604.00826 | translate | read | null |
| 2026-04-01 | RefineRL: Advancing Competitive Programming with Self-Refinement Reinforcement Learning | Shaopeng Fu et al. | 2604.00790 | translate | read | null |
| 2026-04-01 | LangMARL: Natural Language Multi-Agent Reinforcement Learning | Huaiyuan Yao et al. | 2604.00722 | translate | read | null |
| 2026-04-01 | Learning to Hint for Reinforcement Learning | Yu Xia et al. | 2604.00698 | translate | read | null |
| 2026-04-01 | TTA-Vid: Generalized Test-Time Adaptation for Video Reasoning | Soumya Shamarao Jahagirdar et al. | 2604.00696 | translate | read | null |
| 2026-04-01 | Full-Gradient Successor Feature Representations | Ritish Shrirao et al. | 2604.00686 | translate | read | null |
| 2026-04-01 | A Survey of On-Policy Distillation for Large Language Models | Mingyang Song et al. | 2604.00626 | translate | read | null |
| 2026-04-01 | A Physical Imitation Learning Pipeline for Energy-Efficient Quadruped Locomotion Assisted by Parallel Elastic Joint | Huyue Ma et al. | 2604.00611 | translate | read | null |
| 2026-04-01 | Toward Efficient Deployment and Synchronization in Digital Twins-Empowered Networks | Hossam Farag et al. | 2604.00566 | translate | read | null |
| 2026-04-01 | Multi-Camera View Scaling for Data-Efficient Robot Imitation Learning | Yichen Xie et al. | 2604.00557 | translate | read | null |
| 2026-04-01 | Optimsyn: Influence-Guided Rubrics Optimization for Synthetic Data Generation | Zhiting Fan et al. | 2604.00536 | translate | read | null |
| 2026-04-01 | AceTone: Bridging Words and Colors for Conditional Image Grading | Tianren Ma et al. | 2604.00530 | translate | read | null |
| 2026-04-01 | MOON3.0: Reasoning-aware Multimodal Representation Learning for E-commerce Product Understanding | Junxian Wu et al. | 2604.00513 | translate | read | null |
| 2026-04-01 | A Reasoning-Enabled Vision-Language Foundation Model for Chest X-ray Interpretation | Yabin Zhang et al. | 2604.00493 | translate | read | null |
| 2026-04-01 | All Roads Lead to Rome: Incentivizing Divergent Thinking in Vision-Language Models | Xinyu Tian et al. | 2604.00479 | translate | read | null |
| 2026-04-01 | Execution-Verified Reinforcement Learning for Optimization Modeling | Runda Guan et al. | 2604.00442 | translate | read | null |
| 2026-04-01 | TR-ICRL: Test-Time Rethinking for In-Context Reinforcement Learning | Wenxuan Jiang et al. | 2604.00438 | translate | read | null |
| 2026-04-01 | Internal State-Based Policy Gradient Methods for Partially Observable Markov Potential Games | Wonseok Yang et al. | 2604.00433 | translate | read | null |
| 2026-04-01 | GUIDE: Reinforcement Learning for Behavioral Action Support in Type 1 Diabetes | Saman Khamesian et al. | 2604.00385 | translate | read | null |
| 2026-04-01 | Agent Q-Mix: Selecting the Right Action for LLM Multi-Agent Systems through Reinforcement Learning | Eric Hanchen Jiang et al. | 2604.00344 | translate | read | null |