Reinforcement Learning - 2024-08
| Publish Date | Title | Authors | arXiv ID | Translate | Read | Code |
|---|---|---|---|---|---|---|
| 2024-08-30 | Traffic expertise meets residual RL: Knowledge-informed model-based residual reinforcement learning for CAV trajectory control | Zihao Sheng et.al. | 2408.17380 | translate | read | link |
| 2024-08-30 | Stationary Policies are Optimal in Risk-averse Total-reward MDPs with EVaR | Xihong Su et.al. | 2408.17286 | translate | read | null |
| 2024-08-30 | Using Quantum Solved Deep Boltzmann Machines to Increase the Data Efficiency of RL Agents | Daniel Kent et.al. | 2408.17240 | translate | read | null |
| 2024-08-30 | MaFeRw: Query Rewriting with Multi-Aspect Feedbacks for Retrieval-Augmented Large Language Models | Yujing Wang et.al. | 2408.17072 | translate | read | null |
| 2024-08-30 | Efficient Camera Exposure Control for Visual Odometry via Deep Reinforcement Learning | Shuyang Zhang et.al. | 2408.17005 | translate | read | link |
| 2024-08-30 | A Tighter Convergence Proof of Reverse Experience Replay | Nan Jiang et.al. | 2408.16999 | translate | read | link |
| 2024-08-30 | Discovery of False Data Injection Schemes on Frequency Controllers with Reinforcement Learning | Romesh Prasad et.al. | 2408.16958 | translate | read | null |
| 2024-08-29 | FlowRetrieval: Flow-Guided Data Retrieval for Few-Shot Imitation Learning | Li-Heng Lin et.al. | 2408.16944 | translate | read | null |
| 2024-08-29 | Manipulating OpenFlow Link Discovery Packet Forwarding for Topology Poisoning | Mingming Chen et.al. | 2408.16940 | translate | read | null |
| 2024-08-29 | Coverage Analysis of Multi-Environment Q-Learning Algorithms for Wireless Network Optimization | Talha Bozkus et.al. | 2408.16882 | translate | read | null |
| 2024-08-29 | Reinforcement Learning without Human Feedback for Last Mile Fine-Tuning of Large Language Models | Alec Solway et.al. | 2408.16753 | translate | read | null |
| 2024-08-29 | A GREAT Architecture for Edge-Based Graph Problems Like TSP | Attila Lischka et.al. | 2408.16717 | translate | read | null |
| 2024-08-29 | RLCP: A Reinforcement Learning-based Copyright Protection Method for Text-to-Image Diffusion Model | Zhuan Shi et.al. | 2408.16634 | translate | read | null |
| 2024-08-29 | Optimizing Automated Picking Systems in Warehouse Robots Using Machine Learning | Keqin Li et.al. | 2408.16633 | translate | read | null |
| 2024-08-29 | Phase Optimization and Relay Selection for Joint Relay and IRS-Assisted Communication | Uyoata E. Uyoata et.al. | 2408.16399 | translate | read | null |
| 2024-08-29 | EasyChauffeur: A Baseline Advancing Simplicity and Efficiency on Waymax | Lingyu Xiao et.al. | 2408.16375 | translate | read | null |
| 2024-08-29 | Efficient Multi-agent Navigation with Lightweight DRL Policy | Xingrong Diao et.al. | 2408.16370 | translate | read | null |
| 2024-08-29 | On Convergence of Average-Reward Q-Learning in Weakly Communicating Markov Decision Processes | Yi Wan et.al. | 2408.16262 | translate | read | null |
| 2024-08-28 | DECAF: a Discrete-Event based Collaborative Human-Robot Framework for Furniture Assembly | Giulio Giacomuzzo et.al. | 2408.16125 | translate | read | null |
| 2024-08-28 | RAIN: Reinforcement Algorithms for Improving Numerical Weather and Climate Models | Pritthijit Nath et.al. | 2408.16118 | translate | read | link |
| 2024-08-28 | In-Context Imitation Learning via Next-Token Prediction | Letian Fu et.al. | 2408.15980 | translate | read | link |
| 2024-08-28 | Atari-GPT: Investigating the Capabilities of Multimodal Large Language Models as Low-Level Policies for Atari Games | Nicholas R. Waytowich et.al. | 2408.15950 | translate | read | null |
| 2024-08-28 | DeMoBot: Deformable Mobile Manipulation with Vision-based Sub-goal Retrieval | Yuying Zhang et.al. | 2408.15919 | translate | read | null |
| 2024-08-28 | Adaptive Traffic Signal Control Using Reinforcement Learning | Muhammad Tahir Rafique et.al. | 2408.15751 | translate | read | null |
| 2024-08-28 | Deep Reinforcement Learning for Radiative Heat Transfer Optimization Problems | Eva Ortiz-Mansilla et.al. | 2408.15727 | translate | read | null |
| 2024-08-28 | Comparison of Model Predictive Control and Proximal Policy Optimization for a 1-DOF Helicopter System | Georg Schäfer et.al. | 2408.15633 | translate | read | null |
| 2024-08-28 | Structural Optimization of Lightweight Bipedal Robot via SERL | Yi Cheng et.al. | 2408.15632 | translate | read | null |
| 2024-08-28 | Statistical QoS Provision in Business-Centric Networks | Chang Wu et.al. | 2408.15609 | translate | read | null |
| 2024-08-28 | Skills Regularized Task Decomposition for Multi-task Offline Reinforcement Learning | Minjong Yoo et.al. | 2408.15593 | translate | read | null |
| 2024-08-28 | Improving Thompson Sampling via Information Relaxation for Budgeted Multi-armed Bandits | Woojin Jeong et.al. | 2408.15535 | translate | read | null |
| 2024-08-27 | SpecGuard: Specification Aware Recovery for Robotic Autonomous Vehicles from Physical Attacks | Pritam Dash et.al. | 2408.15200 | translate | read | null |
| 2024-08-27 | Exploiting Approximate Symmetry for Efficient Multi-Agent Reinforcement Learning | Batuhan Yardim et.al. | 2408.15173 | translate | read | null |
| 2024-08-27 | Applications in CityLearn Gym Environment for Multi-Objective Control Benchmarking in Grid-Interactive Buildings and Districts | Kingsley Nweye et.al. | 2408.15170 | translate | read | null |
| 2024-08-27 | muPRL: A Mutation Testing Pipeline for Deep Reinforcement Learning based on Real Faults | Deepak-George Thomas et.al. | 2408.15150 | translate | read | null |
| 2024-08-27 | No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery | Alexander Rutherford et.al. | 2408.15099 | translate | read | link |
| 2024-08-27 | MiWaves Reinforcement Learning Algorithm | Susobhan Ghosh et.al. | 2408.15076 | translate | read | null |
| 2024-08-27 | Earth Observation Satellite Scheduling with Graph Neural Networks | Antoine Jacquet et.al. | 2408.15041 | translate | read | null |
| 2024-08-27 | Inverse-Q*: Token Level Reinforcement Learning for Aligning Large Language Models Without Preference Data | Han Xia et.al. | 2408.14874 | translate | read | null |
| 2024-08-27 | Robo-GS: A Physics Consistent Spatial-Temporal Model for Robotic Arm with Hybrid Representation | Haozhe Lou et.al. | 2408.14873 | translate | read | null |
| 2024-08-27 | Learning Robust Reward Machines from Noisy Labels | Roko Parac et.al. | 2408.14871 | translate | read | link |
| 2024-08-26 | Advancing Humanoid Locomotion: Mastering Challenging Terrains with Denoising World Model Learning | Xinyang Gu et.al. | 2408.14472 | translate | read | null |
| 2024-08-26 | Equivariant Reinforcement Learning under Partial Observability | Hai Nguyen et.al. | 2408.14336 | translate | read | null |
| 2024-08-26 | Efficient Active Flow Control Strategy for Confined Square Cylinder Wake Using Deep Learning-Based Surrogate Model and Reinforcement Learning | Meng Zhang et.al. | 2408.14232 | translate | read | null |
| 2024-08-26 | DynamicRouteGPT: A Real-Time Multi-Vehicle Dynamic Navigation Framework Based on Large Language Models | Ziai Zhou et.al. | 2408.14185 | translate | read | null |
| 2024-08-26 | Robot Navigation with Entity-Based Collision Avoidance using Deep Reinforcement Learning | Yury Kolomeytsev et.al. | 2408.14183 | translate | read | null |
| 2024-08-26 | ReLExS: Reinforcement Learning Explanations for Stackelberg No-Regret Learners | Xiangge Huang et.al. | 2408.14086 | translate | read | null |
| 2024-08-26 | Bridging the gap between Learning-to-plan, Motion Primitives and Safe Reinforcement Learning | Piotr Kicki et.al. | 2408.14063 | translate | read | null |
| 2024-08-26 | Re-Mix: Optimizing Data Mixtures for Large Scale Imitation Learning | Joey Hejna et.al. | 2408.14037 | translate | read | link |
| 2024-08-26 | Optimizing TD3 for 7-DOF Robotic Arm Grasping: Overcoming Suboptimality with Exploration-Enhanced Contrastive Learning | Wen-Han Hsieh et.al. | 2408.14009 | translate | read | null |
| 2024-08-26 | Quantitative Representation of Scenario Difficulty for Autonomous Driving Based on Adversarial Policy Search | Shuo Yang et.al. | 2408.14000 | translate | read | null |
| 2024-08-23 | Optimally Solving Simultaneous-Move Dec-POMDPs: The Sequential Central Planning Approach | Johan Peralez et.al. | 2408.13139 | translate | read | null |
| 2024-08-23 | Diffusion-based Episodes Augmentation for Offline Multi-Agent Reinforcement Learning | Jihwan Oh et.al. | 2408.13092 | translate | read | null |
| 2024-08-23 | Guiding IoT-Based Healthcare Alert Systems with Large Language Models | Yulan Gao et.al. | 2408.13071 | translate | read | null |
| 2024-08-23 | cc-DRL: a Convex Combined Deep Reinforcement Learning Flight Control Design for a Morphing Quadrotor | Tao Yang et.al. | 2408.13054 | translate | read | null |
| 2024-08-23 | In-Context Learning with Reinforcement Learning for Incomplete Utterance Rewriting | Haowei Du et.al. | 2408.13028 | translate | read | null |
| 2024-08-23 | Robust Iterative Value Conversion: Deep Reinforcement Learning for Neurochip-driven Edge Robots | Yuki Kadokawa et.al. | 2408.13018 | translate | read | null |
| 2024-08-23 | SUMO: Search-Based Uncertainty Estimation for Model-Based Offline Reinforcement Learning | Zhongjian Qiao et.al. | 2408.12970 | translate | read | null |
| 2024-08-23 | SAMBO-RL: Shifts-aware Model-based Offline Reinforcement Learning | Wang Luo et.al. | 2408.12830 | translate | read | null |
| 2024-08-23 | DutyTTE: Deciphering Uncertainty in Origin-Destination Travel Time Estimation | Xiaowei Mao et.al. | 2408.12809 | translate | read | null |
| 2024-08-23 | Intelligent OPC Engineer Assistant for Semiconductor Manufacturing | Guojin Chen et.al. | 2408.12775 | translate | read | null |
| 2024-08-22 | Controllable Text Generation for Large Language Models: A Survey | Xun Liang et.al. | 2408.12599 | translate | read | link |
| 2024-08-22 | Automating Deformable Gasket Assembly | Simeon Adebola et.al. | 2408.12593 | translate | read | null |
| 2024-08-22 | Human-In-The-Loop Machine Learning for Safe and Ethical Autonomous Vehicles: Principles, Challenges, and Opportunities | Yousef Emami et.al. | 2408.12548 | translate | read | null |
| 2024-08-22 | PCGRL+: Scaling, Control and Generalization in Reinforcement Learning Level Generators | Sam Earle et.al. | 2408.12525 | translate | read | null |
| 2024-08-22 | EX-DRL: Hedging Against Heavy Losses with EXtreme Distributional Reinforcement Learning | Parvin Malekzadeh et.al. | 2408.12446 | translate | read | null |
| 2024-08-22 | Leveraging Unlabeled Data Sharing through Kernel Function Approximation in Offline Reinforcement Learning | Yen-Ru Lai et.al. | 2408.12307 | translate | read | null |
| 2024-08-22 | Domino-cooling Oscillator Networks with Deep Reinforcement Learning | Sampreet Kalita et.al. | 2408.12271 | translate | read | null |
| 2024-08-22 | UNCO: Towards Unifying Neural Combinatorial Optimization through Large Language Model | Xia Jiang et.al. | 2408.12214 | translate | read | null |
| 2024-08-22 | A Safety-Oriented Self-Learning Algorithm for Autonomous Driving: Evolution Starting from a Basic Model | Shuo Yang et.al. | 2408.12190 | translate | read | null |
| 2024-08-22 | A Safe and Efficient Self-evolving Algorithm for Decision-making and Control of Autonomous Driving Systems | Shuo Yang et.al. | 2408.12187 | translate | read | null |
| 2024-08-21 | Efficient Exploration and Discriminative World Model Learning with an Object-Centric Abstraction | Anthony GX-Chen et.al. | 2408.11816 | translate | read | null |
| 2024-08-21 | ACE: A Cross-Platform Visual-Exoskeletons System for Low-Cost Dexterous Teleoperation | Shiqi Yang et.al. | 2408.11805 | translate | read | null |
| 2024-08-21 | Critique-out-Loud Reward Models | Zachary Ankner et.al. | 2408.11791 | translate | read | link |
| 2024-08-21 | Deviations from the Nash equilibrium and emergence of tacit collusion in a two-player optimal execution game with reinforcement learning | Fabrizio Lillo et.al. | 2408.11773 | translate | read | null |
| 2024-08-21 | Bayesian Optimization Framework for Efficient Fleet Design in Autonomous Multi-Robot Exploration | David Molina Concha et.al. | 2408.11751 | translate | read | null |
| 2024-08-21 | Optimizing Interpretable Decision Tree Policies for Reinforcement Learning | Daniël Vos et.al. | 2408.11632 | translate | read | link |
| 2024-08-21 | A Survey of Embodied Learning for Object-Centric Robotic Manipulation | Ying Zheng et.al. | 2408.11537 | translate | read | link |
| 2024-08-22 | Using Part-based Representations for Explainable Deep Reinforcement Learning | Manos Kirtas et.al. | 2408.11455 | translate | read | null |
| 2024-08-21 | Subgoal-based Hierarchical Reinforcement Learning for Multi-Agent Collaboration | Cheng Xu et.al. | 2408.11416 | translate | read | link |
| 2024-08-21 | Reflex-Based Open-Vocabulary Navigation without Prior Knowledge Using Omnidirectional Camera and Multiple Vision-Language Models | Kento Kawaharazuka et.al. | 2408.11380 | translate | read | null |
| 2024-08-20 | Accelerating Goal-Conditioned RL Algorithms and Research | Michał Bortkiewicz et.al. | 2408.11052 | translate | read | link |
| 2024-08-20 | RP1M: A Large-Scale Motion Dataset for Piano Playing with Bi-Manual Dexterous Robot Hands | Yi Zhao et.al. | 2408.11048 | translate | read | null |
| 2024-08-20 | Quantum Machine Learning Algorithms for Anomaly Detection: a Survey | Sebastiano Corli et.al. | 2408.11047 | translate | read | null |
| 2024-08-20 | Deep Reinforcement Learning for Network Energy Saving in 6G and Beyond Networks | Dinh-Hieu Tran et.al. | 2408.10974 | translate | read | null |
| 2024-08-20 | The Evolution of Reinforcement Learning in Quantitative Finance | Nikolaos Pippas et.al. | 2408.10932 | translate | read | null |
| 2024-08-20 | Knowledge Sharing and Transfer via Centralized Reward Agent for Multi-Task Reinforcement Learning | Haozhe Ma et.al. | 2408.10858 | translate | read | link |
| 2024-08-20 | Offline Model-Based Reinforcement Learning with Anti-Exploration | Padmanaba Srinivasan et.al. | 2408.10713 | translate | read | null |
| 2024-08-20 | Minor SFT loss for LLM fine-tune to increase performance and reduce model deviation | Shiming Xie et.al. | 2408.10642 | translate | read | null |
| 2024-08-20 | Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search | Jonathan Light et.al. | 2408.10635 | translate | read | link |
| 2024-08-20 | Hologram Reasoning for Solving Algebra Problems with Geometry Diagrams | Litian Huang et.al. | 2408.10592 | translate | read | link |
| 2024-08-19 | LEAD: Towards Learning-Based Equity-Aware Decarbonization in Ridesharing Platforms | Mahsa Sahebdel et.al. | 2408.10201 | translate | read | null |
| 2024-08-19 | Physics-Aware Combinatorial Assembly Planning using Deep Reinforcement Learning | Ruixuan Liu et.al. | 2408.10162 | translate | read | null |
| 2024-08-19 | $R^2$-Mesh: Reinforcement Learning Powered Mesh Reconstruction via Geometry and Appearance Refinement | Haoyang Wang et.al. | 2408.10135 | translate | read | null |
| 2024-08-19 | Enhancing Reinforcement Learning Through Guided Search | Jérôme Arjonilla et.al. | 2408.10113 | translate | read | null |
| 2024-08-19 | Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning | Sriyash Poddar et.al. | 2408.10075 | translate | read | null |
| 2024-08-19 | Efficient Exploration in Deep Reinforcement Learning: A Novel Bayesian Actor-Critic Algorithm | Nikolai Rozanov et.al. | 2408.10055 | translate | read | null |
| 2024-08-19 | Adaptive BESS and Grid Setpoints Optimization: A Model-Free Framework for Efficient Battery Management under Dynamic Tariff Pricing | Alaa Selim et.al. | 2408.09989 | translate | read | null |
| 2024-08-19 | The Exploration-Exploitation Dilemma Revisited: An Entropy Perspective | Renye Yan et.al. | 2408.09974 | translate | read | null |
| 2024-08-19 | GINO-Q: Learning an Asymptotically Optimal Index Policy for Restless Multi-armed Bandits | Gongpu Chen et.al. | 2408.09882 | translate | read | null |
| 2024-08-19 | ShortCircuit: AlphaZero-Driven Circuit Design | Dimitrios Tsaras et.al. | 2408.09858 | translate | read | null |
| 2024-08-16 | HistoGym: A Reinforcement Learning Environment for Histopathological Image Analysis | Zhi-Bo Liu et.al. | 2408.08847 | translate | read | link |
| 2024-08-16 | CAT: Caution Aware Transfer in Reinforcement Learning via Distributional Risk | Mohamad Fares El Hajj Chehade et.al. | 2408.08812 | translate | read | null |
| 2024-08-16 | Evaluating the Evaluator: Measuring LLMs’ Adherence to Task Evaluation Instructions | Bhuvanashree Murugadoss et.al. | 2408.08781 | translate | read | null |
| 2024-08-16 | SYMPOL: Symbolic Tree-Based On-Policy Reinforcement Learning | Sascha Marton et.al. | 2408.08761 | translate | read | link |
| 2024-08-16 | Efficient Multi-Policy Evaluation for Reinforcement Learning | Shuze Liu et.al. | 2408.08706 | translate | read | null |
| 2024-08-16 | Neural Reward Machines | Elena Umili et.al. | 2408.08677 | translate | read | link |
| 2024-08-16 | Fine-tuning LLMs for Autonomous Spacecraft Control: A Case Study Using Kerbal Space Program | Alejandro Carrasco et.al. | 2408.08676 | translate | read | link |
| 2024-08-16 | DeepREST: Automated Test Case Generation for REST APIs Exploiting Deep Reinforcement Learning | Davide Corradini et.al. | 2408.08594 | translate | read | null |
| 2024-08-16 | Multilevel Graph Reinforcement Learning for Consistent Cognitive Decision-making in Heterogeneous Mixed Autonomy | Xin Gao et.al. | 2408.08516 | translate | read | null |
| 2024-08-16 | Deep multi-intentional inverse reinforcement learning for cognitive multi-function radar inverse cognition | Hancong Feng et.al. | 2408.08478 | translate | read | null |
| 2024-08-15 | A Conflicts-free, Speed-lossless KAN-based Reinforcement Learning Decision System for Interactive Driving in Roundabouts | Zhihao Lin et.al. | 2408.08242 | translate | read | null |
| 2024-08-15 | Explaining an Agent’s Future Beliefs through Temporally Decomposing Future Reward Estimators | Mark Towers et.al. | 2408.08230 | translate | read | link |
| 2024-08-15 | DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search | Huajian Xin et.al. | 2408.08152 | translate | read | link |
| 2024-08-15 | Independent Policy Mirror Descent for Markov Potential Games: Scaling to Large Number of Players | Pragnya Alatur et.al. | 2408.08075 | translate | read | null |
| 2024-08-15 | An Efficient Continuous Control Perspective for Reinforcement-Learning-based Sequential Recommendation | Jun Wang et.al. | 2408.08047 | translate | read | null |
| 2024-08-15 | Adaptive User Journeys in Pharma E-Commerce with Reinforcement Learning: Insights from SwipeRx | Ana Fernández del Río et.al. | 2408.08024 | translate | read | null |
| 2024-08-15 | Experimental evaluation of offline reinforcement learning for HVAC control in buildings | Jun Wang et.al. | 2408.07986 | translate | read | link |
| 2024-08-15 | Meta SAC-Lag: Towards Deployable Safe Reinforcement Learning via MetaGradient-based Hyperparameter Tuning | Homayoun Honari et.al. | 2408.07962 | translate | read | null |
| 2024-08-15 | Solving a Rubik’s Cube Using its Local Graph Structure | Shunyu Yao et.al. | 2408.07945 | translate | read | null |
| 2024-08-15 | IReCa: Intrinsic Reward-enhanced Context-aware Reinforcement Learning for Human-AI Coordination | Xin Hao et.al. | 2408.07877 | translate | read | null |
| 2024-08-14 | Off-Policy Reinforcement Learning with High Dimensional Reward | Dong Neuck Lee et.al. | 2408.07660 | translate | read | null |
| 2024-08-14 | Adaptive Behavioral AI: Reinforcement Learning to Enhance Pharmacy Services | Ana Fernández del Río et.al. | 2408.07647 | translate | read | null |
| 2024-08-14 | SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning | Jianye Xu et.al. | 2408.07644 | translate | read | link |
| 2024-08-14 | Optimizing HIV Patient Engagement with Reinforcement Learning in Resource-Limited Settings | África Periáñez et.al. | 2408.07629 | translate | read | null |
| 2024-08-14 | A Nested Graph Reinforcement Learning-based Decision-making Strategy for Eco-platooning | Xin Gao et.al. | 2408.07578 | translate | read | null |
| 2024-08-14 | Large Language Models Know What Makes Exemplary Contexts | Quanyu Long et.al. | 2408.07505 | translate | read | null |
| 2024-08-14 | Large Language Models Prompting With Episodic Memory | Dai Do et.al. | 2408.07465 | translate | read | null |
| 2024-08-14 | Real-world validation of safe reinforcement learning, model predictive control and decision tree-based home energy management systems | Julian Ruddick et.al. | 2408.07435 | translate | read | null |
| 2024-08-14 | Bridging Training and Execution via Dynamic Directed Graph-Based Communication in Cooperative Multi-Agent Systems | Zhuohui Zhang et.al. | 2408.07397 | translate | read | null |
| 2024-08-14 | Improving Global Parameter-sharing in Physically Heterogeneous Multi-agent Reinforcement Learning with Unified Action Space | Xiaoyang Yu et.al. | 2408.07395 | translate | read | null |
| 2024-08-13 | LLMs can Schedule | Henrik Abgaryan et.al. | 2408.06993 | translate | read | link |
| 2024-08-13 | IRS-Assisted Lossy Communications Under Correlated Rayleigh Fading: Outage Probability Analysis and Optimization | Guanchang Li et.al. | 2408.06969 | translate | read | null |
| 2024-08-13 | Heavy-Ball Momentum Accelerated Actor-Critic With Function Approximation | Yanjie Dong et.al. | 2408.06945 | translate | read | null |
| 2024-08-13 | Multi-Agent Continuous Control with Generative Flow Networks | Shuang Luo et.al. | 2408.06920 | translate | read | link |
| 2024-08-13 | Personalized Dynamic Difficulty Adjustment – Imitation Learning Meets Reinforcement Learning | Ronja Fuchs et.al. | 2408.06818 | translate | read | link |
| 2024-08-13 | Integrating Saliency Ranking and Reinforcement Learning for Enhanced Object Detection | Matthias Bartolo et.al. | 2408.06803 | translate | read | link |
| 2024-08-13 | Residual Deep Reinforcement Learning for Inverter-based Volt-Var Control | Qiong Liu et.al. | 2408.06790 | translate | read | null |
| 2024-08-13 | Deep reinforcement learning for the management of the wall regeneration cycle in wall-bounded turbulent flows | Giorgio Maria Cavallazzi et.al. | 2408.06783 | translate | read | null |
| 2024-08-13 | Robust Deep Reinforcement Learning for Inverter-based Volt-Var Control in Partially Observable Distribution Networks | Qiong Liu et.al. | 2408.06776 | translate | read | null |
| 2024-08-13 | MAPPO-PIS: A Multi-Agent Proximal Policy Optimization Method with Prior Intent Sharing for CAVs’ Cooperative Decision-Making | Yicheng Guo et.al. | 2408.06656 | translate | read | link |
| 2024-08-12 | Body Transformer: Leveraging Robot Embodiment for Policy Learning | Carmelo Sferrazza et.al. | 2408.06316 | translate | read | link |
| 2024-08-12 | Inverse designing metamaterials with programmable nonlinear functional responses in graph space | Marco Maurizi et.al. | 2408.06300 | translate | read | null |
| 2024-08-12 | EyeSight Hand: Design of a Fully-Actuated Dexterous Robot Hand with Integrated Vision-Based Tactile Sensors and Compliant Actuation | Branden Romero et.al. | 2408.06265 | translate | read | null |
| 2024-08-12 | Stable-BC: Controlling Covariate Shift with Stable Behavior Cloning | Shaunak A. Mehta et.al. | 2408.06246 | translate | read | null |
| 2024-08-12 | Building Decision Making Models Through Language Model Regime | Yu Zhang et.al. | 2408.06087 | translate | read | null |
| 2024-08-12 | Sequential sampling without comparison to boundary through model-free reinforcement learning | Jamal Esmaily et.al. | 2408.06080 | translate | read | null |
| 2024-08-12 | Online Optimization of Curriculum Learning Schedules using Evolutionary Optimization | Mohit Jiwatode et.al. | 2408.06068 | translate | read | null |
| 2024-08-12 | GFlowNet Training by Policy Gradients | Puhua Niu et.al. | 2408.05885 | translate | read | link |
| 2024-08-12 | Multi-Agent Deep Reinforcement Learning Framework for Wireless MAC Protocol Design and Optimization | Navid Keshtiarast et.al. | 2408.05884 | translate | read | null |
| 2024-08-11 | Root Cause Attribution of Delivery Risks via Causal Discovery with Reinforcement Learning | Shi Bo et.al. | 2408.05860 | translate | read | null |
| 2024-08-09 | Deterministic remote entanglement using a chiral quantum interconnect | Aziza Almanakly et.al. | 2408.05164 | translate | read | null |
| 2024-08-09 | Kolmogorov-Arnold Network for Online Reinforcement Learning | Victor Augusto Kich et.al. | 2408.04841 | translate | read | null |
| 2024-08-09 | Multi-User MISO with Stacked Intelligent Metasurfaces: A DRL-Based Sum-Rate Optimization Approach | Hao Liu et.al. | 2408.04837 | translate | read | null |
| 2024-08-09 | Next-Generation Wi-Fi Networks with Generative AI: Design and Insights | Jingyu Wang et.al. | 2408.04835 | translate | read | null |
| 2024-08-08 | Learning Fair Cooperation in Mixed-Motive Games with Indirect Reciprocity | Martin Smit et.al. | 2408.04549 | translate | read | link |
| 2024-08-08 | Hybrid Reinforcement Learning Breaks Sample Size Barriers in Linear MDPs | Kevin Tan et.al. | 2408.04526 | translate | read | null |
| 2024-08-08 | Model-Based Transfer Learning for Contextual Reinforcement Learning | Jung-Hoon Cho et.al. | 2408.04498 | translate | read | link |
| 2024-08-08 | Reinforcement Learning from Human Feedback for Lane Changing of Autonomous Vehicles in Mixed Traffic | Yuting Wang et.al. | 2408.04447 | translate | read | null |
| 2024-08-08 | Non-maximizing policies that fulfill multi-criterion aspirations in expectation | Simon Dima et.al. | 2408.04385 | translate | read | null |
| 2024-08-08 | Deep Generative Models in Robotics: A Survey on Learning from Multimodal Demonstrations | Julen Urain et.al. | 2408.04380 | translate | read | null |
| 2024-08-08 | Deep Reinforcement Learning for the Design of Metamaterial Mechanisms with Functional Compliance Control | Yejun Choi et.al. | 2408.04376 | translate | read | null |
| 2024-08-08 | Goal-Oriented UAV Communication Design and Optimization for Target Tracking: A Machine Learning Approach | Wenchao Wu et.al. | 2408.04358 | translate | read | null |
| 2024-08-08 | KnowPC: Knowledge-Driven Programmatic Reinforcement Learning for Zero-shot Coordination | Yin Gu et.al. | 2408.04336 | translate | read | null |
| 2024-08-08 | Assigning Credit with Partial Reward Decoupling in Multi-Agent Proximal Policy Optimization | Aditya Kapoor et.al. | 2408.04295 | translate | read | null |
| 2024-08-07 | Traffic and Obstacle-aware UAV Positioning in Urban Environments Using Reinforcement Learning | Kamran Shafafi et.al. | 2408.03894 | translate | read | null |
| 2024-08-07 | Navigating the Human Maze: Real-Time Robot Pathfinding with Generative Imitation Learning | Martin Moder et.al. | 2408.03807 | translate | read | null |
| 2024-08-07 | HDPlanner: Advancing Autonomous Deployments in Unknown Environments through Hierarchical Decision Networks | Jingsong Liang et.al. | 2408.03768 | translate | read | null |
| 2024-08-07 | Asynchronous Credit Assignment Framework for Multi-Agent Reinforcement Learning | Yongheng Liang et.al. | 2408.03692 | translate | read | null |
| 2024-08-07 | RL-ADN: A High-Performance Deep Reinforcement Learning Environment for Optimal Energy Storage Systems Dispatch in Active Distribution Networks | Shengren Hou et.al. | 2408.03685 | translate | read | null |
| 2024-08-07 | AI-Driven approach for sustainable extraction of earth’s subsurface renewable energy while minimizing seismic activity | Diego Gutierrez-Oribio et.al. | 2408.03664 | translate | read | null |
| 2024-08-07 | A Comparison of LLM Finetuning Methods & Evaluation Metrics with Travel Chatbot Use Case | Sonia Meyer et.al. | 2408.03562 | translate | read | null |
| 2024-08-07 | Deep Reinforcement Learning for Robotics: A Survey of Real-World Successes | Chen Tang et.al. | 2408.03539 | translate | read | null |
| 2024-08-06 | Spacecraft inertial parameters estimation using time series clustering and reinforcement learning | Konstantinos Platanitis et.al. | 2408.03445 | translate | read | null |
| 2024-08-06 | Communication-Aware Consistent Edge Selection for Mobile Users and Autonomous Vehicles | Nazish Tahir et.al. | 2408.03435 | translate | read | null |
| 2024-08-07 | Adversarial Safety-Critical Scenario Generation using Naturalistic Human Driving Priors | Kunkun Hao et.al. | 2408.03200 | translate | read | null |
| 2024-08-06 | RELIEF: Reinforcement Learning Empowered Graph Feature Prompt Tuning | Jiapeng Zhu et.al. | 2408.03195 | translate | read | link |
| 2024-08-06 | Integrated Intention Prediction and Decision-Making with Spectrum Attention Net and Proximal Policy Optimization | Xiao Zhou et.al. | 2408.03191 | translate | read | null |
| 2024-08-06 | CADRL: Category-aware Dual-agent Reinforcement Learning for Explainable Recommendations over Knowledge Graphs | Shangfei Zheng et.al. | 2408.03166 | translate | read | null |
| 2024-08-06 | QADQN: Quantum Attention Deep Q-Network for Financial Market Prediction | Siddhant Dutta et.al. | 2408.03088 | translate | read | null |
| 2024-08-06 | Research on Autonomous Driving Decision-making Strategies based Deep Reinforcement Learning | Zixiang Wang et.al. | 2408.03084 | translate | read | null |
| 2024-08-06 | Model-free optimal controller for discrete-time Markovian jump linear systems: A Q-learning approach | Ehsan Badfar et.al. | 2408.03077 | translate | read | null |
| 2024-08-06 | Learning to Turn: Diffusion Imitation for Robust Row Turning in Under-Canopy Robots | Arun N. Sivakumar et.al. | 2408.03059 | translate | read | null |
| 2024-08-06 | A Course in Dynamic Optimization | Bar Light et.al. | 2408.03034 | translate | read | null |
| 2024-08-07 | Highly Efficient Self-Adaptive Reward Shaping for Reinforcement Learning | Haozhe Ma et.al. | 2408.03029 | translate | read | null |
| 2024-08-05 | Integrating Model-Based Footstep Planning with Model-Free Reinforcement Learning for Dynamic Legged Locomotion | Ho Jae Lee et.al. | 2408.02662 | translate | read | null |
| 2024-08-05 | Context-aware Mamba-based Reinforcement Learning for social robot navigation | Syed Muhammad Mustafa et.al. | 2408.02661 | translate | read | null |
| 2024-08-05 | Can Reinforcement Learning Unlock the Hidden Dangers in Aligned Large Language Models? | Mohammad Bahrami Karkevandi et.al. | 2408.02651 | translate | read | null |
| 2024-08-05 | Backward explanations via redefinition of predicates | Léo Saulières et.al. | 2408.02606 | translate | read | null |
| 2024-08-05 | Progressively Selective Label Enhancement for Language Model Alignment | Biao Liu et.al. | 2408.02599 | translate | read | null |
| 2024-08-05 | Evaluating and Enhancing LLMs Agent based on Theory of Mind in Guandan: A Multi-Player Cooperative Game under Imperfect Information | Yauwai Yim et.al. | 2408.02559 | translate | read | null |
| 2024-08-05 | Counterfactual Shapley Values for Explaining Reinforcement Learning | Yiwei Shi et.al. | 2408.02529 | translate | read | null |
| 2024-08-05 | Fair Resource Allocation For Hierarchical Federated Edge Learning in Space-Air-Ground Integrated Networks via Deep Reinforcement Learning with Hybrid Control | Chong Huang et.al. | 2408.02501 | translate | read | null |
| 2024-08-05 | Full error analysis of policy gradient learning algorithms for exploratory linear quadratic mean-field control problem in continuous time with common noise | Noufel Frikha et.al. | 2408.02489 | translate | read | null |
| 2024-08-05 | Terracorder: Sense Long and Prosper | Josh Millar et.al. | 2408.02407 | translate | read | null |
| 2024-08-02 | Pre-trained Language Models Improve the Few-shot Prompt Ability of Decision Transformer | Yu Yang et.al. | 2408.01402 | translate | read | null |
| 2024-08-02 | NOLO: Navigate Only Look Once | Bohan Zhou et.al. | 2408.01384 | translate | read | null |
| 2024-08-02 | Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation | Ruoxuan Feng et.al. | 2408.01366 | translate | read | null |
| 2024-08-02 | Jacta: A Versatile Planner for Learning Dexterous and Whole-body Manipulation | Jan Brüdigam et.al. | 2408.01258 | translate | read | null |
| 2024-08-02 | Deep progressive reinforcement learning-based flexible resource scheduling framework for IRS and UAV-assisted MEC system | Li Dong et.al. | 2408.01248 | translate | read | null |
| 2024-08-02 | Multi-Objective Deep Reinforcement Learning for Optimisation in Autonomous Systems | Juan C. Rosero et.al. | 2408.01188 | translate | read | null |
| 2024-08-02 | Optimizing Variational Quantum Circuits Using Metaheuristic Strategies in Reinforcement Learning | Michael Kölle et.al. | 2408.01187 | translate | read | null |
| 2024-08-02 | TCR-GPT: Integrating Autoregressive Model and Reinforcement Learning for T-Cell Receptor Repertoires Generation | Yicheng Lin et.al. | 2408.01156 | translate | read | null |
| 2024-08-02 | Actra: Optimized Transformer Architecture for Vision-Language-Action Models in Robot Learning | Yueen Ma et.al. | 2408.01147 | translate | read | null |
| 2024-08-02 | A Survey on Self-play Methods in Reinforcement Learning | Ruize Zhang et.al. | 2408.01072 | translate | read | null |
| 2024-08-01 | A Policy-Gradient Approach to Solving Imperfect-Information Games with Iterate Convergence | Mingyang Liu et.al. | 2408.00751 | translate | read | null |
| 2024-08-01 | Insurance Portfolio Pursuit with Reinforcement Learning | Edward James Young et.al. | 2408.00713 | translate | read | null |
| 2024-08-01 | Learning in Multi-Objective Public Goods Games with Non-Linear Utilities | Nicole Orzan et.al. | 2408.00682 | translate | read | null |
| 2024-08-01 | Discretizing Continuous Action Space with Unimodal Probability Distributions for On-Policy Reinforcement Learning | Yuanyang Zhu et.al. | 2408.00309 | translate | read | null |
| 2024-08-01 | A Reinforcement Learning Based Motion Planner for Quadrotor Autonomous Flight in Dense Environment | Zhaohong Liu et.al. | 2408.00275 | translate | read | null |
| 2024-08-01 | Large Language Model (LLM)-enabled In-context Learning for Wireless Network Optimization: A Case Study of Power Control | Hao Zhou et.al. | 2408.00214 | translate | read | null |