Reinforcement Learning - 2024-09
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-09-30 | Upper and Lower Bounds for Distributionally Robust Off-Dynamics Reinforcement Learning | Zhishuai Liu et.al. | 2409.20521 | translate | read | null |
| 2024-09-30 | Opt2Skill: Imitating Dynamically-feasible Whole-Body Trajectories for Versatile Humanoid Loco-Manipulation | Fukang Liu et.al. | 2409.20514 | translate | read | null |
| 2024-09-30 | The Perfect Blend: Redefining RLHF with Mixture of Judges | Tengyu Xu et.al. | 2409.20370 | translate | read | null |
| 2024-09-30 | MARLadona – Towards Cooperative Team Play Using Multi-Agent Reinforcement Learning | Zichong Li et.al. | 2409.20326 | translate | read | null |
| 2024-09-30 | RL-GSBridge: 3D Gaussian Splatting Based Real2Sim2Real Method for Robotic Manipulation Learning | Yuxuan Wu et.al. | 2409.20291 | translate | read | null |
| 2024-09-30 | Inferring Preferences from Demonstrations in Multi-objective Reinforcement Learning | Junlin Lu et.al. | 2409.20258 | translate | read | link |
| 2024-09-30 | Professor X: Manipulating EEG BCI with Invisible and Robust Backdoor Attack | Xuan-Hao Liu et.al. | 2409.20158 | translate | read | null |
| 2024-09-30 | GravMAD: Grounded Spatial Value Maps Guided Action Diffusion for Generalized 3D Manipulation | Yangtao Chen et.al. | 2409.20154 | translate | read | null |
| 2024-09-30 | DRLinSPH: An open-source platform using deep reinforcement learning and SPHinXsys for fluid-structure-interaction problems | Mai Ye et.al. | 2409.20134 | translate | read | null |
| 2024-09-27 | Robust Deep Reinforcement Learning for Volt-VAR Optimization in Active Distribution System under Uncertainty | Zhengrong Chen et.al. | 2409.18937 | translate | read | null |
| 2024-09-27 | HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models | Yu Zhou et.al. | 2409.18893 | translate | read | null |
| 2024-09-27 | ARLBench: Flexible and Efficient Benchmarking for Hyperparameter Optimization in Reinforcement Learning | Jannis Becktepe et.al. | 2409.18827 | translate | read | link |
| 2024-09-27 | LLMs4Synthesis: Leveraging Large Language Models for Scientific Synthesis | Hamed Babaei Giglou et.al. | 2409.18812 | translate | read | null |
| 2024-09-27 | Autoregressive Policy Optimization for Constrained Allocation Tasks | David Winkel et.al. | 2409.18735 | translate | read | link |
| 2024-09-27 | Enhancing Spectrum Efficiency in 6G Satellite Networks: A GAIL-Powered Policy Learning via Asynchronous Federated Inverse Reinforcement Learning | Sheikh Salman Hassan et.al. | 2409.18718 | translate | read | null |
| 2024-09-27 | Refutation of Spectral Graph Theory Conjectures with Search Algorithms | Milo Roucairol et.al. | 2409.18626 | translate | read | null |
| 2024-09-27 | TemporalPaD: a reinforcement-learning framework for temporal feature representation and dimension reduction | Xuechen Mu et.al. | 2409.18597 | translate | read | null |
| 2024-09-27 | Climate Adaptation with Reinforcement Learning: Experiments with Flooding and Transportation in Copenhagen | Miguel Costa et.al. | 2409.18574 | translate | read | null |
| 2024-09-27 | Cost-Aware Dynamic Cloud Workflow Scheduling using Self-Attention and Evolutionary Reinforcement Learning | Ya Shen et.al. | 2409.18444 | translate | read | null |
| 2024-09-26 | Inverse Reinforcement Learning with Multiple Planning Horizons | Jiayu Yao et.al. | 2409.18051 | translate | read | null |
| 2024-09-26 | Role-RL: Online Long-Context Processing with Role Reinforcement Learning for Distinct LLMs in Their Optimal Roles | Lewei He et.al. | 2409.18014 | translate | read | null |
| 2024-09-26 | LoopSR: Looping Sim-and-Real for Lifelong Policy Adaptation of Legged Robots | Peilin Wu et.al. | 2409.17992 | translate | read | null |
| 2024-09-26 | Navigation in a simplified Urban Flow through Deep Reinforcement Learning | Federica Tonti et.al. | 2409.17922 | translate | read | null |
| 2024-09-26 | Model-Free versus Model-Based Reinforcement Learning for Fixed-Wing UAV Attitude Control Under Varying Wind Conditions | David Olivares et.al. | 2409.17896 | translate | read | null |
| 2024-09-26 | Self-supervised Preference Optimization: Enhance Your Language Model with Preference Degree Awareness | Jian Li et.al. | 2409.17791 | translate | read | link |
| 2024-09-26 | Robust Ladder Climbing with a Quadrupedal Robot | Dylan Vogel et.al. | 2409.17731 | translate | read | null |
| 2024-09-26 | Cross-lingual Human-Preference Alignment for Neural Machine Translation with Direct Quality Optimization | Kaden Uhlig et.al. | 2409.17673 | translate | read | null |
| 2024-09-26 | Hierarchical End-to-End Autonomous Driving: Integrating BEV Perception with Deep Reinforcement Learning | Siyi Lu et.al. | 2409.17659 | translate | read | null |
| 2024-09-26 | FactorSim: Generative Simulation via Factorized Representation | Fan-Yun Sun et.al. | 2409.17652 | translate | read | null |
| 2024-09-25 | Learning with Dynamics: Autonomous Regulation of UAV Based Communication Networks with Dynamic UAV Crew | Ran Zhang et.al. | 2409.17139 | translate | read | null |
| 2024-09-25 | Landscape of Policy Optimization for Finite Horizon MDPs with General State and Action | Xin Chen et.al. | 2409.17138 | translate | read | null |
| 2024-09-25 | On-orbit Servicing for Spacecraft Collision Avoidance With Autonomous Decision Making | Susmitha Patnala et.al. | 2409.17125 | translate | read | null |
| 2024-09-25 | AI-Driven Risk-Aware Scheduling for Active Debris Removal Missions | Antoine Poupon et.al. | 2409.17012 | translate | read | null |
| 2024-09-25 | Multi-Robot Informative Path Planning for Efficient Target Mapping using Deep Reinforcement Learning | Apoorva Vashisth et.al. | 2409.16967 | translate | read | link |
| 2024-09-25 | Dynamic Obstacle Avoidance through Uncertainty-Based Adaptive Planning with Diffusion | Vineet Punyamoorty et.al. | 2409.16950 | translate | read | null |
| 2024-09-25 | Enhancing Temporal Sensitivity and Reasoning for Time-Sensitive Question Answering | Wanqi Yang et.al. | 2409.16909 | translate | read | null |
| 2024-09-25 | Revisiting Space Mission Planning: A Reinforcement Learning-Guided Approach for Multi-Debris Rendezvous | Agni Bandyopadhyay et.al. | 2409.16882 | translate | read | null |
| 2024-09-25 | Behavior evolution-inspired approach to walking gait reinforcement training for quadruped robots | Yu Wang et.al. | 2409.16862 | translate | read | null |
| 2024-09-25 | Asynchronous Fractional Multi-Agent Deep Reinforcement Learning for Age-Minimal Mobile Edge Computing | Lyudong Jin et.al. | 2409.16832 | translate | read | null |
| 2024-09-24 | A Critical Review of Safe Reinforcement Learning Techniques in Smart Grid Applications | Van-Hai Bui et.al. | 2409.16256 | translate | read | null |
| 2024-09-24 | Context-Based Meta Reinforcement Learning for Robust and Adaptable Peg-in-Hole Assembly Tasks | Ahmed Shokry et.al. | 2409.16208 | translate | read | null |
| 2024-09-24 | Microsecond-Latency Feedback at a Particle Accelerator by Online Reinforcement Learning on Hardware | Luca Scomparin et.al. | 2409.16177 | translate | read | null |
| 2024-09-24 | The Digital Transformation in Health: How AI Can Improve the Performance of Health Systems | África Periáñez et.al. | 2409.16098 | translate | read | null |
| 2024-09-24 | Whole-body end-effector pose tracking | Tifanny Portela et.al. | 2409.16048 | translate | read | null |
| 2024-09-24 | Bridging Environments and Language with Rendering Functions and Vision-Language Models | Theo Cachet et.al. | 2409.16024 | translate | read | null |
| 2024-09-24 | Provably Efficient Exploration in Inverse Constrained Reinforcement Learning | Bo Yue et.al. | 2409.15963 | translate | read | null |
| 2024-09-24 | Overcoming Reward Model Noise in Instruction-Guided Reinforcement Learning | Sukai Huang et.al. | 2409.15922 | translate | read | null |
| 2024-09-24 | Multi-UAV Pursuit-Evasion with Online Planning in Unknown Environments by Deep Reinforcement Learning | Jiayu Chen et.al. | 2409.15866 | translate | read | null |
| 2024-09-24 | Adaptive Learn-then-Test: Statistically Valid and Efficient Hyperparameter Selection | Matteo Zecchin et.al. | 2409.15844 | translate | read | null |
| 2024-09-18 | DynaMo: In-Domain Dynamics Pretraining for Visuo-Motor Control | Zichen Jeff Cui et.al. | 2409.12192 | translate | read | null |
| 2024-09-18 | Robots that Learn to Safely Influence via Prediction-Informed Reach-Avoid Dynamic Games | Ravi Pandya et.al. | 2409.12153 | translate | read | null |
| 2024-09-18 | Almost Sure Convergence of Linear Temporal Difference Learning with Arbitrary Features | Jiuqi Wang et.al. | 2409.12135 | translate | read | null |
| 2024-09-18 | Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement | An Yang et.al. | 2409.12122 | translate | read | null |
| 2024-09-18 | IMRL: Integrating Visual, Physical, Temporal, and Geometric Representations for Enhanced Food Acquisition | Rui Liu et.al. | 2409.12092 | translate | read | null |
| 2024-09-18 | Generalized Robot Learning Framework | Jiahuan Yan et.al. | 2409.12061 | translate | read | null |
| 2024-09-23 | Handling Long-Term Safety and Uncertainty in Safe Reinforcement Learning | Jonas Günster et.al. | 2409.12045 | translate | read | link |
| 2024-09-18 | Putting Data at the Centre of Offline Multi-Agent Reinforcement Learning | Claude Formanek et.al. | 2409.12001 | translate | read | null |
| 2024-09-18 | Data-Efficient Quadratic Q-Learning Using LMIs | J. S. van Hulst et.al. | 2409.11986 | translate | read | null |
| 2024-09-18 | Reinforcement Learning with Lie Group Orientations for Robotics | Martin Schuck et.al. | 2409.11935 | translate | read | null |
| 2024-09-17 | UniLCD: Unified Local-Cloud Decision-Making via Reinforcement Learning | Kathakoli Sengupta et.al. | 2409.11403 | translate | read | null |
| 2024-09-17 | Integrating Reinforcement Learning and Model Predictive Control with Applications to Microgrids | Caio Fabio Oliveira da Silva et.al. | 2409.11267 | translate | read | null |
| 2024-09-17 | Attacking Slicing Network via Side-channel Reinforcement Learning Attack | Wei Shao et.al. | 2409.11258 | translate | read | null |
| 2024-09-17 | LLM-as-a-Judge & Reward Model: What They Can and Cannot Do | Guijin Son et.al. | 2409.11239 | translate | read | null |
| 2024-09-17 | Leveraging Symmetry to Accelerate Learning of Trajectory Tracking Controllers for Free-Flying Robotic Systems | Jake Welde et.al. | 2409.11238 | translate | read | null |
| 2024-09-17 | Linear Jamming Bandits: Learning to Jam 5G-based Coded Communications Systems | Zachary Schutz et.al. | 2409.11191 | translate | read | null |
| 2024-09-17 | Preventing Unconstrained CBF Safety Filters Caused by Invalid Relative Degree Assumptions | Lukas Brunke et.al. | 2409.11171 | translate | read | null |
| 2024-09-17 | Co-Designing Tools and Control Policies for Robust Manipulation | Yifei Dong et.al. | 2409.11113 | translate | read | null |
| 2024-09-17 | Reactive Environments for Active Inference Agents with RxEnvironments.jl | Wouter W. L. Nuijten et.al. | 2409.11087 | translate | read | link |
| 2024-09-17 | A Reinforcement Learning Environment for Automatic Code Optimization in the MLIR Compiler | Nazim Bendib et.al. | 2409.11068 | translate | read | null |
| 2024-09-16 | Instigating Cooperation among LLM Agents Using Adaptive Information Modulation | Qiliang Chen et.al. | 2409.10372 | translate | read | null |
| 2024-09-16 | Catch It! Learning to Catch in Flight with Mobile Dexterous Hands | Yuanhang Zhang et.al. | 2409.10319 | translate | read | null |
| 2024-09-16 | ReflectDiffu: Reflect between Emotion-intent Contagion and Mimicry for Empathetic Response Generation via a RL-Diffusion Framework | Jiahao Yuan et.al. | 2409.10289 | translate | read | null |
| 2024-09-16 | Safety-Oriented Pruning and Interpretation of Reinforcement Learning Policies | Dennis Gross et.al. | 2409.10218 | translate | read | null |
| 2024-09-16 | Enhancing RL Safety with Counterfactual LLM Reasoning | Dennis Gross et.al. | 2409.10188 | translate | read | null |
| 2024-09-16 | Safe and Stable Closed-Loop Learning for Neural-Network-Supported Model Predictive Control | Sebastian Hirt et.al. | 2409.10171 | translate | read | null |
| 2024-09-16 | Quantile Regression for Distributional Reward Models in RLHF | Nicolai Dorka et.al. | 2409.10164 | translate | read | link |
| 2024-09-16 | Robust Reinforcement Learning with Dynamic Distortion Risk Measures | Anthony Coache et.al. | 2409.10096 | translate | read | null |
| 2024-09-16 | Audio-Driven Reinforcement Learning for Head-Orientation in Naturalistic Environments | Wessel Ledder et.al. | 2409.10048 | translate | read | null |
| 2024-09-16 | Reinforcement learning-based statistical search strategy for an axion model from flavor | Satsuki Nishimura et.al. | 2409.10023 | translate | read | null |
| 2024-09-13 | The unknotting number, hard unknot diagrams, and reinforcement learning | Taylor Applebaum et.al. | 2409.09032 | translate | read | null |
| 2024-09-13 | Modeling Rational Adaptation of Visual Search to Hierarchical Structures | Saku Sourulahti et.al. | 2409.08967 | translate | read | null |
| 2024-09-13 | Average-Reward Maximum Entropy Reinforcement Learning for Underactuated Double Pendulum Tasks | Jean Seong Bjorn Choe et.al. | 2409.08938 | translate | read | null |
| 2024-09-13 | AnyBipe: An End-to-End Framework for Training and Deploying Bipedal Robots Guided by Large Language Models | Yifei Yao et.al. | 2409.08904 | translate | read | null |
| 2024-09-13 | Deep reinforcement learning for tracking a moving target in jellyfish-like swimming | Yihao Chen et.al. | 2409.08815 | translate | read | null |
| 2024-09-13 | DexSim2Real$^{2}$: Building Explicit World Model for Precise Articulated Object Dexterous Manipulation | Taoran Jiang et.al. | 2409.08750 | translate | read | null |
| 2024-09-13 | Quasimetric Value Functions with Dense Rewards | Khadichabonu Valieva et.al. | 2409.08724 | translate | read | null |
| 2024-09-13 | Secure Offloading in NOMA-Aided Aerial MEC Systems Based on Deep Reinforcement Learning | Hongjiang Lei et.al. | 2409.08579 | translate | read | null |
| 2024-09-13 | Batch Ensemble for Variance Dependent Regret in Stochastic Bandits | Asaf Cassel et.al. | 2409.08570 | translate | read | null |
| 2024-09-13 | OIDM: An Observability-based Intelligent Distributed Edge Sensing Method for Industrial Cyber-Physical Systems | Shigeng Wang et.al. | 2409.08549 | translate | read | null |
| 2024-09-12 | Hand-Object Interaction Pretraining from Videos | Himanshu Gaurav Singh et.al. | 2409.08273 | translate | read | null |
| 2024-09-12 | Multi-Model based Federated Learning Against Model Poisoning Attack: A Deep Learning Based Model Selection for MEC Systems | Somayeh Kianpisheh et.al. | 2409.08237 | translate | read | null |
| 2024-09-12 | Towards Online Safety Corrections for Robotic Manipulation Policies | Ariana Spalter et.al. | 2409.08233 | translate | read | null |
| 2024-09-12 | Design Optimization of Nuclear Fusion Reactor through Deep Reinforcement Learning | Jinsu Kim et.al. | 2409.08231 | translate | read | null |
| 2024-09-12 | Adaptive Language-Guided Abstraction from Contrastive Explanations | Andi Peng et.al. | 2409.08212 | translate | read | null |
| 2024-09-12 | Optimal Management of Grid-Interactive Efficient Buildings via Safe Reinforcement Learning | Xiang Huo et.al. | 2409.08132 | translate | read | null |
| 2024-09-12 | Linear Complementary Dual Codes Constructed from Reinforcement Learning | Yansheng Wu et.al. | 2409.08114 | translate | read | null |
| 2024-09-12 | Q-value Regularized Decision ConvFormer for Offline Reinforcement Learning | Teng Yan et.al. | 2409.08062 | translate | read | null |
| 2024-09-12 | Learning Causally Invariant Reward Functions from Diverse Demonstrations | Ivan Ovinnikov et.al. | 2409.08012 | translate | read | null |
| 2024-09-12 | Digital Twin for Autonomous Guided Vehicles based on Integrated Sensing and Communications | Van-Phuc Bui et.al. | 2409.08005 | translate | read | null |
| 2024-09-11 | Autonomous loading of ore piles with Load-Haul-Dump machines using Deep Reinforcement Learning | Rodrigo Salas et.al. | 2409.07449 | translate | read | null |
| 2024-09-11 | Hierarchical Reinforcement Learning for Temporal Abstraction of Listwise Recommendation | Luo Ji et.al. | 2409.07416 | translate | read | null |
| 2024-09-11 | Learning Robotic Manipulation Policies from Point Clouds with Conditional Flow Matching | Eugenio Chisari et.al. | 2409.07343 | translate | read | null |
| 2024-09-11 | Online Decision MetaMorphFormer: A Casual Transformer-Based Reinforcement Learning Framework of Universal Embodied Intelligence | Luo Ji et.al. | 2409.07341 | translate | read | null |
| 2024-09-11 | A Framework for Predicting the Impact of Game Balance Changes through Meta Discovery | Akash Saravanan et.al. | 2409.07340 | translate | read | null |
| 2024-09-11 | Multi-Type Preference Learning: Empowering Preference-Based Reinforcement Learning with Equal Preferences | Ziang Liu et.al. | 2409.07268 | translate | read | null |
| 2024-09-11 | Perceptive Pedipulation with Local Obstacle Avoidance | Jonas Stolle et.al. | 2409.07195 | translate | read | null |
| 2024-09-11 | A Perspective on AI-Guided Molecular Simulations in VR: Exploring Strategies for Imitation Learning in Hyperdimensional Molecular Systems | Mohamed Dhouioui et.al. | 2409.07189 | translate | read | null |
| 2024-09-11 | Learning Efficient Recursive Numeral Systems via Reinforcement Learning | Jonathan D. Thomas et.al. | 2409.07170 | translate | read | null |
| 2024-09-11 | DCMAC: Demand-aware Customized Multi-Agent Communication via Upper Bound Training | Dongkun Huo et.al. | 2409.07127 | translate | read | null |
| 2024-09-10 | DemoStart: Demonstration-led auto-curriculum applied to sim-to-real with multi-fingered robots | Maria Bauza et.al. | 2409.06613 | translate | read | null |
| 2024-09-10 | Advancements in Gesture Recognition Techniques and Machine Learning for Enhanced Human-Robot Interaction: A Comprehensive Review | Sajjad Hussain et.al. | 2409.06503 | translate | read | null |
| 2024-09-10 | Superior Computer Chess with Model Predictive Control, Reinforcement Learning, and Rollout | Atharva Gundawar et.al. | 2409.06477 | translate | read | null |
| 2024-09-10 | Learning Generative Interactive Environments By Trained Agent Exploration | Naser Kazemi et.al. | 2409.06445 | translate | read | link |
| 2024-09-10 | Length Desensitization in Directed Preference Optimization | Wei Liu et.al. | 2409.06411 | translate | read | null |
| 2024-09-10 | One Policy to Run Them All: an End-to-end Learning Approach to Multi-Embodiment Locomotion | Nico Bohlinger et.al. | 2409.06366 | translate | read | null |
| 2024-09-10 | Double Successive Over-Relaxation Q-Learning with an Extension to Deep Reinforcement Learning | Shreyas S R et.al. | 2409.06356 | translate | read | null |
| 2024-09-10 | Learning Augmentation Policies from A Model Zoo for Time Series Forecasting | Haochen Yuan et.al. | 2409.06282 | translate | read | null |
| 2024-09-09 | Robot Utility Models: General Policies for Zero-Shot Deployment in New Environments | Haritheja Etukuru et.al. | 2409.05865 | translate | read | link |
| 2024-09-09 | An Introduction to Quantum Reinforcement Learning (QRL) | Samuel Yen-Chi Chen et.al. | 2409.05846 | translate | read | null |
| 2024-09-09 | Learning control of underactuated double pendulum with Model-Based Reinforcement Learning | Niccolò Turcato et.al. | 2409.05811 | translate | read | null |
| 2024-09-09 | Markov Chain Variance Estimation: A Stochastic Approximation Approach | Shubhada Agrawal et.al. | 2409.05733 | translate | read | null |
| 2024-09-09 | Cooperative Decision-Making for CAVs at Unsignalized Intersections: A MARL Approach with Attention and Hierarchical Game Priors | Jiaqi Liu et.al. | 2409.05712 | translate | read | null |
| 2024-09-09 | Interactive incremental learning of generalizable skills with local trajectory modulation | Markus Knauer et.al. | 2409.05655 | translate | read | null |
| 2024-09-09 | Forward KL Regularized Preference Optimization for Aligning Diffusion Policies | Zhao Shan et.al. | 2409.05622 | translate | read | null |
| 2024-09-09 | Adaptive Multi-Layer Deployment for A Digital Twin Empowered Satellite-Terrestrial Integrated Network | Yihong Tao et.al. | 2409.05480 | translate | read | null |
| 2024-09-09 | Reinforcement Learning for Variational Quantum Circuits Design | Simone Foderà et.al. | 2409.05475 | translate | read | null |
| 2024-09-09 | Semifactual Explanations for Reinforcement Learning | Jasmina Gajcin et.al. | 2409.05435 | translate | read | null |
| 2024-09-06 | RLPF: Reinforcement Learning from Prediction Feedback for User Summarization with LLMs | Jiaxing Wu et.al. | 2409.04421 | translate | read | null |
| 2024-09-06 | Gaussian-Mixture-Model Q-Functions for Reinforcement Learning by Riemannian Optimization | Minh Vu et.al. | 2409.04374 | translate | read | null |
| 2024-09-06 | Refined Bounds on Near Optimality Finite Window Policies in POMDPs and Their Reinforcement Learning | Yunus Emre Demirci et.al. | 2409.04351 | translate | read | null |
| 2024-09-06 | Advancing Multi-Organ Disease Care: A Hierarchical Multi-Agent Reinforcement Learning Framework | Daniel J. Tan et.al. | 2409.04224 | translate | read | null |
| 2024-09-06 | The Prevalence of Neural Collapse in Neural Multivariate Regression | George Andriopoulos et.al. | 2409.04180 | translate | read | null |
| 2024-09-06 | Prompt-based Personality Profiling: Reinforcement Learning for Relevance Filtering | Jan Hofmann et.al. | 2409.04122 | translate | read | null |
| 2024-09-05 | DRAL: Deep Reinforcement Adaptive Learning for Multi-UAVs Navigation in Unknown Indoor Environment | Kangtong Mo et.al. | 2409.03930 | translate | read | null |
| 2024-09-05 | Asynchronous Stochastic Approximation and Average-Reward Reinforcement Learning | Huizhen Yu et.al. | 2409.03915 | translate | read | null |
| 2024-09-05 | On the Convergence Rates of Federated Q-Learning across Heterogeneous Environments | Muxing Wang et.al. | 2409.03897 | translate | read | null |
| 2024-09-05 | Multi-agent Path Finding for Mixed Autonomy Traffic Coordination | Han Zheng et.al. | 2409.03881 | translate | read | null |
| 2024-09-05 | Dynamics of Supervised and Reinforcement Learning in the Non-Linear Perceptron | Christian Schmid et.al. | 2409.03749 | translate | read | null |
| 2024-09-05 | Differentiable Discrete Event Simulation for Queuing Network Control | Ethan Che et.al. | 2409.03740 | translate | read | null |
| 2024-09-05 | On the Limited Generalization Capability of the Implicit Reward Model Induced by Direct Preference Optimization | Yong Lin et.al. | 2409.03650 | translate | read | null |
| 2024-09-05 | 1 Modular Parallel Manipulator for Long-Term Soft Robotic Data Collection | Kiyn Chin et.al. | 2409.03614 | translate | read | null |
| 2024-09-05 | CHIRPs: Change-Induced Regret Proxy metrics for Lifelong Reinforcement Learning | John Birkbeck et.al. | 2409.03577 | translate | read | null |
| 2024-09-05 | Sparsifying Parametric Models with L0 Regularization | Nicolò Botteghi et.al. | 2409.03489 | translate | read | null |
| 2024-09-05 | Reinforcement Learning Approach to Optimizing Profilometric Sensor Trajectories for Surface Inspection | Sara Roos-Hoefgeest et.al. | 2409.03429 | translate | read | null |
| 2024-09-05 | Game On: Towards Language Models as RL Experimenters | Jingwei Zhang et.al. | 2409.03402 | translate | read | null |
| 2024-09-05 | ELO-Rated Sequence Rewards: Advancing Reinforcement Learning Models | Qi Ju et.al. | 2409.03301 | translate | read | link |
| 2024-09-05 | Robust synchronization and policy adaptation for networked heterogeneous agents | Miguel F. Arevalo-Castiblanco et.al. | 2409.03273 | translate | read | null |
| 2024-09-04 | Hybrid Imitation-Learning Motion Planner for Urban Driving | Cristian Gariboldi et.al. | 2409.02871 | translate | read | null |
| 2024-09-04 | Knowledge Transfer for Collaborative Misbehavior Detection in Untrusted Vehicular Environments | Roshan Sedar et.al. | 2409.02844 | translate | read | null |
| 2024-09-04 | Tractable Offline Learning of Regular Decision Processes | Ahana Deb et.al. | 2409.02747 | translate | read | null |
| 2024-09-04 | Surgical Task Automation Using Actor-Critic Frameworks and Self-Supervised Imitation Learning | Jingshuai Liu et.al. | 2409.02724 | translate | read | null |
| 2024-09-04 | Decision Transformer for Enhancing Neural Local Search on the Job Shop Scheduling Problem | Constantin Waubert de Puiseau et.al. | 2409.02697 | translate | read | null |
| 2024-09-04 | Causality-Aware Transformer Networks for Robotic Navigation | Ruoyu Wang et.al. | 2409.02669 | translate | read | null |
| 2024-09-04 | A Survey on Emergent Language | Jannik Peters et.al. | 2409.02645 | translate | read | null |
| 2024-09-04 | Mamba as a motion encoder for robotic imitation learning | Toshiaki Tsuji et.al. | 2409.02636 | translate | read | null |
| 2024-09-04 | Continual Diffuser (CoD): Mastering Continual Offline Reinforcement Learning with Experience Rehearsal | Jifeng Hu et.al. | 2409.02512 | translate | read | null |
| 2024-09-04 | USV-AUV Collaboration Framework for Underwater Tasks under Extreme Sea Conditions | Jingzehua Xu et.al. | 2409.02444 | translate | read | null |