Reinforcement Learning - 2024-12
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-12-30 | Advances in Multi-agent Reinforcement Learning: Persistent Autonomy and Robot Learning Lab Report 2024 | Reza Azadeh et.al. | 2412.21088 | translate | read | null |
| 2024-12-30 | Learning Epidemiological Dynamics via the Finite Expression Method | Jianda Du et.al. | 2412.21049 | translate | read | null |
| 2024-12-30 | Weber-Fechner Law in Temporal Difference learning derived from Control as Inference | Keiichiro Takahashi et.al. | 2412.21004 | translate | read | null |
| 2024-12-30 | LEASE: Offline Preference-based Reinforcement Learning with High Sample Efficiency | Xiao-Yin Liu et.al. | 2412.21001 | translate | read | link |
| 2024-12-30 | UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI | Fangwei Zhong et.al. | 2412.20977 | translate | read | null |
| 2024-12-30 | Data-Based Efficient Off-Policy Stabilizing Optimal Control Algorithms for Discrete-Time Linear Systems via Damping Coefficients | Dongdong Li et.al. | 2412.20845 | translate | read | null |
| 2024-12-30 | Isoperimetry is All We Need: Langevin Posterior Sampling for RL with Sublinear Regret | Emilio Jorge et.al. | 2412.20824 | translate | read | null |
| 2024-12-29 | The intrinsic motivation of reinforcement and imitation learning for sequential tasks | Sao Mai Nguyen et.al. | 2412.20573 | translate | read | null |
| 2024-12-29 | Diminishing Return of Value Expansion Methods | Daniel Palenicek et.al. | 2412.20537 | translate | read | link |
| 2024-12-29 | Game Theory and Multi-Agent Reinforcement Learning : From Nash Equilibria to Evolutionary Dynamics | Neil De La Fuente et.al. | 2412.20523 | translate | read | null |
| 2024-12-27 | From Ceilings to Walls: Universal Dynamic Perching of Small Aerial Robots on Surfaces with Variable Orientations | Bryan Habas et.al. | 2412.19765 | translate | read | null |
| 2024-12-27 | Adaptive Context-Aware Multi-Path Transmission Control for VR/AR Content: A Deep Reinforcement Learning Approach | Shakil Ahmed et.al. | 2412.19737 | translate | read | null |
| 2024-12-27 | Goal-oriented Communications based on Recursive Early Exit Neural Networks | Jary Pomponi et.al. | 2412.19587 | translate | read | null |
| 2024-12-27 | Graph-attention-based Casual Discovery with Trust Region-navigated Clipping Policy Optimization | Shixuan Liu et.al. | 2412.19578 | translate | read | null |
| 2024-12-27 | Reinforced Label Denoising for Weakly-Supervised Audio-Visual Video Parsing | Yongbiao Gao et.al. | 2412.19563 | translate | read | null |
| 2024-12-27 | Scalable Hierarchical Reinforcement Learning for Hyper Scale Multi-Robot Task Planning | Xuan Zhou et.al. | 2412.19538 | translate | read | null |
| 2024-12-27 | An Overview of Machine Learning-Driven Resource Allocation in IoT Networks | Zhengdong Li et.al. | 2412.19478 | translate | read | null |
| 2024-12-27 | DeepSeek-V3 Technical Report | DeepSeek-AI et.al. | 2412.19437 | translate | read | link |
| 2024-12-27 | Low-Rank Contextual Reinforcement Learning from Heterogeneous Human Feedback | Seong Jin Lee et.al. | 2412.19436 | translate | read | null |
| 2024-12-27 | Comparing Few to Rank Many: Active Human Preference Learning using Randomized Frank-Wolfe | Kiran Koshy Thekumparampil et.al. | 2412.19396 | translate | read | null |
| 2024-12-24 | Modeling the Centaur: Human-Machine Synergy in Sequential Decision Making | David Shoresh et.al. | 2412.18593 | translate | read | null |
| 2024-12-24 | Dynamic Optimization of Portfolio Allocation Using Deep Reinforcement Learning | Gang Huang et.al. | 2412.18563 | translate | read | link |
| 2024-12-24 | Large Language Model guided Deep Reinforcement Learning for Decision Making in Autonomous Driving | Hao Pang et.al. | 2412.18511 | translate | read | null |
| 2024-12-24 | Joint Adaptive OFDM and Reinforcement Learning Design for Autonomous Vehicles: Leveraging Age of Updates | Mamady Delamou et.al. | 2412.18500 | translate | read | null |
| 2024-12-24 | Contrastive Representation for Interactive Recommendation | Jingyu Li et.al. | 2412.18396 | translate | read | link |
| 2024-12-24 | Navigating Data Corruption in Machine Learning: Balancing Quality, Quantity, and Imputation Strategies | Qi Liu et.al. | 2412.18296 | translate | read | null |
| 2024-12-24 | Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage Policy Optimization | Jiacai Liu et.al. | 2412.18279 | translate | read | null |
| 2024-12-24 | Accelerating AIGC Services with Latent Action Diffusion Scheduling in Edge Networks | Changfu Xu et.al. | 2412.18212 | translate | read | link |
| 2024-12-24 | Quantum framework for Reinforcement Learning: integrating Markov Decision Process, quantum arithmetic, and trajectory search | Thet Htar Su et.al. | 2412.18208 | translate | read | null |
| 2024-12-24 | Token Highlighter: Inspecting and Mitigating Jailbreak Prompts for Large Language Models | Xiaomeng Hu et.al. | 2412.18171 | translate | read | null |
| 2024-12-23 | HyperQ-Opt: Q-learning for Hyperparameter Optimization | Md. Tarek Hasan et.al. | 2412.17765 | translate | read | null |
| 2024-12-23 | Mimicking-Bench: A Benchmark for Generalizable Humanoid-Scene Interaction Learning via Human Mimicking | Yun Liu et.al. | 2412.17730 | translate | read | null |
| 2024-12-23 | SMAC-Hard: Enabling Mixed Opponent Strategy Script and Self-play on SMAC | Yue Deng et.al. | 2412.17707 | translate | read | link |
| 2024-12-23 | Towards Intrinsic Self-Correction Enhancement in Monte Carlo Tree Search Boosted Reasoning via Iterative Preference Learning | Huchen Jiang et.al. | 2412.17397 | translate | read | null |
| 2024-12-23 | Reinforcement Learning with a Focus on Adjusting Policies to Reach Targets | Akane Tsuboya et.al. | 2412.17344 | translate | read | null |
| 2024-12-23 | Multimodal Deep Reinforcement Learning for Portfolio Optimization | Sumit Nawathe et.al. | 2412.17293 | translate | read | null |
| 2024-12-23 | LMD-PGN: Cross-Modal Knowledge Distillation from First-Person-View Images to Third-Person-View BEV Maps for Universal Point Goal Navigation | Riku Uemura et.al. | 2412.17282 | translate | read | null |
| 2024-12-23 | ACECode: A Reinforcement Learning Framework for Aligning Code Efficiency and Correctness in Code Language Models | Chengran Yang et.al. | 2412.17264 | translate | read | null |
| 2024-12-23 | A Coalition Game for On-demand Multi-modal 3D Automated Delivery System | Farzan Moosavi et.al. | 2412.17252 | translate | read | null |
| 2024-12-23 | Model-free stochastic linear quadratic design by semidefinite programming | Jing Guo et.al. | 2412.17230 | translate | read | null |
| 2024-12-20 | Offline Reinforcement Learning for LLM Multi-Step Reasoning | Huaijie Wang et.al. | 2412.16145 | translate | read | null |
| 2024-12-20 | APIRL: Deep Reinforcement Learning for REST API Fuzzing | Myles Foley et.al. | 2412.15991 | translate | read | link |
| 2024-12-20 | Active Flow Control for Bluff Body under High Reynolds Number Turbulent Flow Conditions Using Deep Reinforcement Learning | Jingbo Chen et.al. | 2412.15975 | translate | read | null |
| 2024-12-20 | From General to Specific: Tailoring Large Language Models for Personalized Healthcare | Ruize Shi et.al. | 2412.15957 | translate | read | null |
| 2024-12-20 | What Are Step-Level Reward Models Rewarding? Counterintuitive Findings from MCTS-Boosted Mathematical Reasoning | Yiran Ma et.al. | 2412.15904 | translate | read | null |
| 2024-12-20 | Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback | Jiaming Ji et.al. | 2412.15838 | translate | read | link |
| 2024-12-20 | MacLight: Multi-scene Aggregation Convolutional Learning for Traffic Signal Control | Sunbowen Lee et.al. | 2412.15703 | translate | read | link |
| 2024-12-20 | AIR: Unifying Individual and Cooperative Exploration in Collective Multi-Agent Reinforcement Learning | Guangchong Zhou et.al. | 2412.15700 | translate | read | link |
| 2024-12-20 | Tacit Learning with Adaptive Information Selection for Cooperative Multi-Agent Reinforcement Learning | Lunjun Liu et.al. | 2412.15639 | translate | read | null |
| 2024-12-20 | Dexterous Manipulation Based on Prior Dexterous Grasp Pose Knowledge | Hengxu Yan et.al. | 2412.15587 | translate | read | null |
| 2024-12-19 | Qwen2.5 Technical Report | Qwen et.al. | 2412.15115 | translate | read | null |
| 2024-12-19 | Dream to Manipulate: Compositional World Models Empowering Robot Imitation Learning with Imagination | Leonardo Barcellona et.al. | 2412.14957 | translate | read | null |
| 2024-12-19 | Effective Method with Compression for Distributed and Federated Cocoercive Variational Inequalities | Daniil Medyakov et.al. | 2412.14935 | translate | read | null |
| 2024-12-19 | Hierarchical Subspaces of Policies for Continual Offline Reinforcement Learning | Anthony Kobanda et.al. | 2412.14865 | translate | read | null |
| 2024-12-19 | Entropy Regularized Task Representation Learning for Offline Meta-Reinforcement Learning | Mohammadreza Nakhaei et.al. | 2412.14834 | translate | read | link |
| 2024-12-19 | Agent-Temporal Credit Assignment for Optimal Policy Preservation in Sparse Multi-Agent Reinforcement Learning | Aditya Kapoor et.al. | 2412.14779 | translate | read | null |
| 2024-12-19 | Learning to Generate Research Idea with Dynamic Control | Ruochen Li et.al. | 2412.14626 | translate | read | null |
| 2024-12-19 | Simulation-Free Hierarchical Latent Policy Planning for Proactive Dialogues | Tao He et.al. | 2412.14584 | translate | read | null |
| 2024-12-19 | Single-Loop Federated Actor-Critic across Heterogeneous Environments | Ye Zhu et.al. | 2412.14555 | translate | read | null |
| 2024-12-18 | Implementing TD3 to train a Neural Network to fly a Quadcopter through an FPV Gate | Patrick Thomas et.al. | 2412.14367 | translate | read | null |
| 2024-12-18 | Learning from Massive Human Videos for Universal Humanoid Pose Control | Jiageng Mao et.al. | 2412.14172 | translate | read | null |
| 2024-12-18 | Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective | Zhiyuan Zeng et.al. | 2412.14135 | translate | read | null |
| 2024-12-18 | Alignment faking in large language models | Ryan Greenblatt et.al. | 2412.14093 | translate | read | link |
| 2024-12-18 | Spatio-Temporal SIR Model of Pandemic Spread During Warfare with Optimal Dual-use Healthcare System Administration using Deep Reinforcement Learning | Adi Shuchami et.al. | 2412.14039 | translate | read | null |
| 2024-12-18 | Robust Optimal Safe and Stability Guaranteeing Reinforcement Learning Control for Quadcopter | Sanghyoup Gu et.al. | 2412.14003 | translate | read | null |
| 2024-12-18 | Harvesting energy from turbulent winds with Reinforcement Learning | Lorenzo Basile et.al. | 2412.13961 | translate | read | null |
| 2024-12-18 | RoboMIND: Benchmark on Multi-embodiment Intelligence Normative Data for Robot Manipulation | Kun Wu et.al. | 2412.13877 | translate | read | null |
| 2024-12-18 | AI-Powered Algorithm-Centric Quantum Processor Topology Design | Tian Li et.al. | 2412.13805 | translate | read | link |
| 2024-12-18 | Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN | Pengxiang Li et.al. | 2412.13795 | translate | read | link |
| 2024-12-18 | A hybrid learning agent for episodic learning tasks with unknown target distance | Oliver Sefrin et.al. | 2412.13686 | translate | read | null |
| 2024-12-17 | ExBody2: Advanced Expressive Humanoid Whole-Body Control | Mazeyu Ji et.al. | 2412.13196 | translate | read | null |
| 2024-12-17 | Tilted Quantile Gradient Updates for Quantile-Constrained Reinforcement Learning | Chenglin Li et.al. | 2412.13184 | translate | read | link |
| 2024-12-17 | Learning Visuotactile Estimation and Control for Non-prehensile Manipulation under Occlusions | Juan Del Aguila Ferrandis et.al. | 2412.13157 | translate | read | null |
| 2024-12-17 | Practicable Black-box Evasion Attacks on Link Prediction in Dynamic Graphs – A Graph Sequential Embedding Method | Jiate Li et.al. | 2412.13134 | translate | read | link |
| 2024-12-17 | Active Reinforcement Learning Strategies for Offline Policy Improvement | Ambedkar Dukkipati et.al. | 2412.13106 | translate | read | null |
| 2024-12-17 | Reservoir Computing for Fast, Simplified Reinforcement Learning on Memory Tasks | Kevin McKee et.al. | 2412.13093 | translate | read | null |
| 2024-12-17 | SMOSE: Sparse Mixture of Shallow Experts for Interpretable Reinforcement Learning in Continuous Control Tasks | Mátyás Vincze et.al. | 2412.13053 | translate | read | null |
| 2024-12-17 | Relational Neurosymbolic Markov Models | Lennert De Smet et.al. | 2412.13023 | translate | read | null |
| 2024-12-17 | Future Aspects in Human Action Recognition: Exploring Emerging Techniques and Ethical Influences | Antonios Gasteratos et.al. | 2412.12990 | translate | read | null |
| 2024-12-17 | Guiding Generative Protein Language Models with Reinforcement Learning | Filippo Stocco et.al. | 2412.12979 | translate | read | null |
| 2024-12-16 | MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization | Bhavya Sukhija et.al. | 2412.12098 | translate | read | null |
| 2024-12-16 | Stabilizing Reinforcement Learning in Differentiable Multiphysics Simulation | Eliot Xing et.al. | 2412.12089 | translate | read | null |
| 2024-12-16 | Artificial Intelligence in Traffic Systems | Ritwik Raj Saxena et.al. | 2412.12046 | translate | read | null |
| 2024-12-16 | Learning to Navigate in Mazes with Novel Layouts using Abstract Top-down Maps | Linfeng Zhao et.al. | 2412.12024 | translate | read | null |
| 2024-12-16 | Agentic AI-Driven Technical Troubleshooting for Enterprise Systems: A Novel Weighted Retrieval-Augmented Generation Paradigm | Rajat Khanda et.al. | 2412.12006 | translate | read | null |
| 2024-12-16 | AlphaZero Neural Scaling and Zipf’s Law: a Tale of Board Games and Power Laws | Oren Neumann et.al. | 2412.11979 | translate | read | link |
| 2024-12-16 | Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning | Qi Sun et.al. | 2412.11974 | translate | read | link |
| 2024-12-16 | Hierarchical Meta-Reinforcement Learning via Automated Macro-Action Discovery | Minjae Cho et.al. | 2412.11930 | translate | read | null |
| 2024-12-16 | Generalized Bayesian deep reinforcement learning | Shreya Sinha Roy et.al. | 2412.11743 | translate | read | null |
| 2024-12-16 | Learning UAV-based path planning for efficient localization of objects using prior knowledge | Rick van Essen et.al. | 2412.11717 | translate | read | null |
| 2024-12-13 | A Novel Framework Using Deep Reinforcement Learning for Join Order Selection | Chang Liu et.al. | 2412.10253 | translate | read | null |
| 2024-12-13 | Physics Instrument Design with Reinforcement Learning | Shah Rukh Qasim et.al. | 2412.10237 | translate | read | null |
| 2024-12-13 | Scaling Combinatorial Optimization Neural Improvement Heuristics with Online Search and Adaptation | Federico Julian Camerota Verdù et.al. | 2412.10163 | translate | read | null |
| 2024-12-13 | AMUSE: Adaptive Model Updating using a Simulated Environment | Louis Chislett et.al. | 2412.10119 | translate | read | null |
| 2024-12-13 | Reward Machine Inference for Robotic Manipulation | Mattijs Baert et.al. | 2412.10096 | translate | read | null |
| 2024-12-13 | Optimized Coordination Strategy for Multi-Aerospace Systems in Pick-and-Place Tasks By Deep Neural Network | Ye Zhang et.al. | 2412.09877 | translate | read | null |
| 2024-12-13 | RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning | Charles Xu et.al. | 2412.09858 | translate | read | null |
| 2024-12-13 | ScaleOT: Privacy-utility-scalable Offsite-tuning with Dynamic LayerReplace and Selective Rank Compression | Kai Yao et.al. | 2412.09812 | translate | read | null |
| 2024-12-12 | GainAdaptor: Learning Quadrupedal Locomotion with Dual Actors for Adaptable and Energy-Efficient Walking on Various Terrains | Mincheol Kim et.al. | 2412.09520 | translate | read | null |
| 2024-12-12 | Distributional Reinforcement Learning based Integrated Decision Making and Control for Autonomous Surface Vehicles | Xi Lin et.al. | 2412.09466 | translate | read | link |
| 2024-12-12 | Learning to Adapt: Bio-Inspired Gait Strategies for Versatile Quadruped Locomotion | Joseph Humphreys et.al. | 2412.09440 | translate | read | null |
| 2024-12-12 | Reinforcement Learning Within the Classical Robotics Stack: A Case Study in Robot Soccer | Adam Labiosa et.al. | 2412.09417 | translate | read | null |
| 2024-12-12 | Does Low Spoilage Under Cold Conditions Foster Cultural Complexity During the Foraging Era? – A Theoretical and Computational Inquiry | Minhyeok Lee et.al. | 2412.09335 | translate | read | null |
| 2024-12-12 | Learning to be Indifferent in Complex Decisions: A Coarse Payoff-Assessment Model | Philippe Jehiel et.al. | 2412.09321 | translate | read | null |
| 2024-12-12 | Learning Novel Skills from Language-Generated Demonstrations | Ao-Qun Jin et.al. | 2412.09286 | translate | read | null |
| 2024-12-12 | Student-Informed Teacher Training | Nico Messikommer et.al. | 2412.09149 | translate | read | null |
| 2024-12-12 | Reconfigurable Intelligent Surface for Internet of Robotic Things | Wanli Ni et.al. | 2412.09117 | translate | read | null |
| 2024-12-12 | In-Dataset Trajectory Return Regularization for Offline Preference-based Reinforcement Learning | Songjun Tu et.al. | 2412.09104 | translate | read | null |
| 2024-12-11 | Learning Sketch Decompositions in Planning via Deep Reinforcement Learning | Michael Aichmüller et.al. | 2412.08574 | translate | read | null |
| 2024-12-11 | GenPlan: Generative sequence models as adaptive planners | Akash Karthikeyan et.al. | 2412.08565 | translate | read | null |
| 2024-12-11 | An End-to-End Collaborative Learning Approach for Connected Autonomous Vehicles in Occluded Scenarios | Leandro Parada et.al. | 2412.08562 | translate | read | null |
| 2024-12-11 | MaestroMotif: Skill Design from Artificial Intelligence Feedback | Martin Klissarov et.al. | 2412.08542 | translate | read | null |
| 2024-12-11 | Subspace-wise Hybrid RL for Articulated Object Manipulation | Yujin Kim et.al. | 2412.08522 | translate | read | null |
| 2024-12-11 | Multi-perspective Alignment for Increasing Naturalness in Neural Machine Translation | Huiyuan Lai et.al. | 2412.08473 | translate | read | null |
| 2024-12-11 | IRL for Restless Multi-Armed Bandits with Applications in Maternal and Child Health | Gauri Jain et.al. | 2412.08463 | translate | read | link |
| 2024-12-11 | SINERGYM – A virtual testbed for building energy optimization with Reinforcement Learning | Alejandro Campoy-Nieves et.al. | 2412.08293 | translate | read | link |
| 2024-12-11 | Coarse-to-Fine: A Dual-Phase Channel-Adaptive Method for Wireless Image Transmission | Hanlei Li et.al. | 2412.08211 | translate | read | null |
| 2024-12-11 | Learn How to Query from Unlabeled Data Streams in Federated Learning | Yuchang Sun et.al. | 2412.08138 | translate | read | link |
| 2024-12-10 | Mobile-TeleVision: Predictive Motion Priors for Humanoid Whole-Body Control | Chenhao Lu et.al. | 2412.07773 | translate | read | null |
| 2024-12-10 | Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data | Zhiyuan Zhou et.al. | 2412.07762 | translate | read | null |
| 2024-12-10 | Optimizing Sensor Redundancy in Sequential Decision-Making Problems | Jonas Nüßlein et.al. | 2412.07686 | translate | read | null |
| 2024-12-10 | Offline Multi-Agent Reinforcement Learning via In-Sample Sequential Policy Optimization | Zongkai Liu et.al. | 2412.07639 | translate | read | null |
| 2024-12-10 | Swarm Behavior Cloning | Jonas Nüßlein et.al. | 2412.07617 | translate | read | null |
| 2024-12-10 | Contractive Dynamical Imitation Policies for Efficient Out-of-Sample Recovery | Amin Abyaneh et.al. | 2412.07544 | translate | read | null |
| 2024-12-10 | ConfigX: Modular Configuration for Evolutionary Algorithms via Multitask Reinforcement Learning | Hongshu Guo et.al. | 2412.07507 | translate | read | null |
| 2024-12-10 | Optimizing pulsed blowing parameters for active separation control in a one-sided diffuser using reinforcement learning | Alexandra Müller et.al. | 2412.07480 | translate | read | null |
| 2024-12-10 | Progressive-Resolution Policy Distillation: Leveraging Coarse-Resolution Simulation for Time-Efficient Fine-Resolution Policy Learning | Yuki Kadokawa et.al. | 2412.07477 | translate | read | null |
| 2024-12-10 | RLT4Rec: Reinforcement Learning Transformer for User Cold Start and Item Recommendation | Dilina Chandika Rajapakse et.al. | 2412.07403 | translate | read | null |
| 2024-12-09 | Partially Observed Optimal Stochastic Control: Regularity, Optimality, Approximations, and Learning | Ali Devran Kara et.al. | 2412.06735 | translate | read | null |
| 2024-12-09 | Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone | Max Sobol Mark et.al. | 2412.06685 | translate | read | null |
| 2024-12-09 | Off-Policy Maximum Entropy RL with Future State and Action Visitation Measures | Adrien Bolland et.al. | 2412.06655 | translate | read | null |
| 2024-12-09 | Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation | Egor Cherepanov et.al. | 2412.06531 | translate | read | null |
| 2024-12-09 | SimuDICE: Offline Policy Optimization Through World Model Updates and DICE Estimation | Catalin E. Brita et.al. | 2412.06486 | translate | read | link |
| 2024-12-09 | Edge Delayed Deep Deterministic Policy Gradient: efficient continuous control for edge scenarios | Alberto Sinigaglia et.al. | 2412.06390 | translate | read | null |
| 2024-12-09 | Tracking control of latent dynamic systems with application to spacecraft attitude control | Congxi Zhang et.al. | 2412.06342 | translate | read | null |
| 2024-12-09 | Augmenting the action space with conventions to improve multi-agent cooperation in Hanabi | F. Bredell et.al. | 2412.06333 | translate | read | null |
| 2024-12-09 | Vision-Based Deep Reinforcement Learning of UAV Autonomous Navigation Using Privileged Information | Junqiao Wang et.al. | 2412.06313 | translate | read | null |
| 2024-12-09 | A Scalable Decentralized Reinforcement Learning Framework for UAV Target Localization Using Recurrent PPO | Leon Fernando et.al. | 2412.06231 | translate | read | null |
| 2024-12-06 | Reinforcement Learning: An Overview | Kevin Murphy et.al. | 2412.05265 | translate | read | null |
| 2024-12-06 | TeamCraft: A Benchmark for Multi-Modal Multi-Agent Systems in Minecraft | Qian Long et.al. | 2412.05255 | translate | read | link |
| 2024-12-06 | LIAR: Leveraging Alignment (Best-of-N) to Jailbreak LLMs in Seconds | James Beetham et.al. | 2412.05232 | translate | read | null |
| 2024-12-06 | FlowPolicy: Enabling Fast and Robust 3D Flow-based Policy via Consistency Flow Matching for Robot Manipulation | Qinglun Zhang et.al. | 2412.04987 | translate | read | null |
| 2024-12-06 | Putting the Iterative Training of Decision Trees to the Test on a Real-World Robotic Task | Raphael C. Engelhardt et.al. | 2412.04974 | translate | read | null |
| 2024-12-06 | DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling | Minzheng Wang et.al. | 2412.04905 | translate | read | link |
| 2024-12-06 | Maximizing Alignment with Minimal Feedback: Efficiently Learning Rewards for Visuomotor Robot Policy Alignment | Ran Tian et.al. | 2412.04835 | translate | read | null |
| 2024-12-06 | Learning-based Control for Tendon-Driven Continuum Robotic Arms | Nima Maghooli et.al. | 2412.04829 | translate | read | null |
| 2024-12-06 | A Temporally Correlated Latent Exploration for Reinforcement Learning | SuMin Oh et.al. | 2412.04775 | translate | read | null |
| 2024-12-06 | Measuring Goal-Directedness | Matt MacDermott et.al. | 2412.04758 | translate | read | null |
| 2024-12-05 | Marvel: Accelerating Safe Online Reinforcement Learning with Finetuned Offline Policy | Keru Chen et.al. | 2412.04426 | translate | read | null |
| 2024-12-05 | Intersection-Aware Assessment of EMS Accessibility in NYC: A Data-Driven Approach | Haoran Su et.al. | 2412.04369 | translate | read | null |
| 2024-12-05 | Finer Behavioral Foundation Models via Auto-Regressive Features and Advantage Weighting | Edoardo Cetin et.al. | 2412.04368 | translate | read | null |
| 2024-12-05 | Reinforcement Learning for Freeway Lane-Change Regulation via Connected Vehicles | Ke Sun et.al. | 2412.04341 | translate | read | null |
| 2024-12-05 | Action Mapping for Reinforcement Learning in Continuous Environments with Constraints | Mirco Theile et.al. | 2412.04327 | translate | read | null |
| 2024-12-05 | GRAM: Generalization in Deep RL with a Robust Adaptation Module | James Queeney et.al. | 2412.04323 | translate | read | link |
| 2024-12-05 | Reinforcement Learning from Wild Animal Videos | Elliot Chane-Sane et.al. | 2412.04273 | translate | read | null |
| 2024-12-05 | HyperMARL: Adaptive Hypernetworks for Multi-Agent RL | Kale-ab Abebe Tessera et.al. | 2412.04233 | translate | read | null |
| 2024-12-05 | A Dynamic Safety Shield for Safe and Efficient Reinforcement Learning of Navigation Tasks | Murad Dawood et.al. | 2412.04153 | translate | read | null |
| 2024-12-05 | Towards Generalizable Autonomous Penetration Testing via Domain Randomization and Meta-Reinforcement Learning | Shicheng Zhou et.al. | 2412.04078 | translate | read | link |
| 2024-12-04 | AI-Driven Day-to-Day Route Choice | Leizhen Wang et.al. | 2412.03338 | translate | read | null |
| 2024-12-04 | Rotograb: Combining Biomimetic Hands with Industrial Grippers using a Rotating Thumb | Arnaud Bersier et.al. | 2412.03279 | translate | read | null |
| 2024-12-04 | Learning on One Mode: Addressing Multi-Modality in Offline Reinforcement Learning | Mianchu Wang et.al. | 2412.03258 | translate | read | null |
| 2024-12-04 | Alignment at Pre-training! Towards Native Alignment for Arabic LLMs | Juhao Liang et.al. | 2412.03253 | translate | read | link |
| 2024-12-04 | Variable-Speed Teaching-Playback as Real-World Data Augmentation for Imitation Learning | Nozomu Masuya et.al. | 2412.03252 | translate | read | null |
| 2024-12-04 | Using Deep Reinforcement Learning to Enhance Channel Sampling Patterns in Integrated Sensing and Communication | Federico Mason et.al. | 2412.03157 | translate | read | null |
| 2024-12-04 | Experience-driven discovery of planning strategies | Ruiqi He et.al. | 2412.03111 | translate | read | null |
| 2024-12-04 | Less is More: A Stealthy and Efficient Adversarial Attack Method for DRL-based Autonomous Driving Policies | Junchao Fan et.al. | 2412.03051 | translate | read | null |
| 2024-12-04 | Learning Whole-Body Loco-Manipulation for Omni-Directional Task Space Pose Tracking with a Wheeled-Quadrupedal-Manipulator | Kaiwen Jiang et.al. | 2412.03012 | translate | read | null |
| 2024-12-04 | Data Acquisition for Improving Model Fairness using Reinforcement Learning | Jahid Hasan et.al. | 2412.03009 | translate | read | null |
| 2024-12-03 | UniGraspTransformer: Simplified Policy Distillation for Scalable Dexterous Robotic Grasping | Wenbo Wang et.al. | 2412.02699 | translate | read | link |
| 2024-12-03 | Preliminary Investigation into Data Scaling Laws for Imitation Learning-Based End-to-End Autonomous Driving | Yupeng Zheng et.al. | 2412.02689 | translate | read | null |
| 2024-12-03 | T-REG: Preference Optimization with Token-Level Reward Regularization | Wenxuan Zhou et.al. | 2412.02685 | translate | read | link |
| 2024-12-03 | AI-Driven Resource Allocation Framework for Microservices in Hybrid Cloud Platforms | Biman Barua et.al. | 2412.02610 | translate | read | null |
| 2024-12-03 | Explainable CTR Prediction via LLM Reasoning | Xiaohan Yu et.al. | 2412.02588 | translate | read | null |
| 2024-12-03 | Mobile Cell-Free Massive MIMO with Multi-Agent Reinforcement Learning: A Scalable Framework | Ziheng Liu et.al. | 2412.02581 | translate | read | null |
| 2024-12-03 | Generating Critical Scenarios for Testing Automated Driving Systems | Trung-Hieu Nguyen et.al. | 2412.02574 | translate | read | link |
| 2024-12-03 | Cooperative Cruising: Reinforcement Learning based Time-Headway Control for Increased Traffic Efficiency | Yaron Veksler et.al. | 2412.02520 | translate | read | null |
| 2024-12-03 | Reinforcement learning to learn quantum states for Heisenberg scaling accuracy | Jeongwoo Jae et.al. | 2412.02334 | translate | read | null |
| 2024-12-03 | Optimizing Plastic Waste Collection in Water Bodies Using Heterogeneous Autonomous Surface Vehicles with Deep Reinforcement Learning | Alejandro Mendoza Barrionuevo et.al. | 2412.02316 | translate | read | null |