Boundary Data Augmentation for Offline Reinforcement Learning

Release Date: 2023-09-27    Authors: SHEN Jiahao, JIANG Ke, TAN Xiaoyang

Abstract: Offline reinforcement learning (ORL) aims to learn a rational agent purely from behavior data, without any online interaction. One of the major challenges in ORL is distribution shift, i.e., the mismatch between the knowledge of the learned policy and the reality of the underlying environment. Recent works usually handle this in an overly pessimistic manner, avoiding out-of-distribution (OOD) queries as much as possible, which can hurt the robustness of the agent at unseen states. In this paper, we propose a simple but effective method to address this issue. The key idea is to enhance the robustness of the newly learned offline policy by weakening its confidence in highly uncertain regions. We locate those regions by simulating them with a modified Generative Adversarial Net (GAN), so that the generated data follow the same distribution as the old experience yet are intrinsically difficult to handle for the behavior policy or some other reference policy. We then use this information to regularize the ORL algorithm, penalizing overconfident behavior in these regions. Extensive experiments on several publicly available offline RL benchmarks demonstrate the feasibility and effectiveness of the proposed method.
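
To make the described idea more concrete, the following is a minimal sketch (in PyTorch) of one way such a scheme could look: a generator is trained adversarially so that its samples resemble the behavior data while a reference Q-function scores them as difficult, and the resulting synthetic "boundary" states are used to penalize overconfident action distributions in the offline policy objective. All module and variable names (Generator, q_ref, policy_ref, hardness_coef, lam) are illustrative assumptions for exposition, not the paper's actual implementation.

# Hypothetical sketch: GAN-style boundary-state generation plus an
# uncertainty penalty on the offline policy loss. Names are illustrative,
# not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, NOISE_DIM = 17, 32  # assumed dimensions for illustration

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NOISE_DIM, 128), nn.ReLU(),
            nn.Linear(128, STATE_DIM))
    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, 1))
    def forward(self, s):
        return self.net(s)

gen, disc = Generator(), Discriminator()
opt_g = torch.optim.Adam(gen.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)

def generator_step(real_states, q_ref, policy_ref, hardness_coef=0.1):
    """One GAN update: generated states should look in-distribution
    (fool the discriminator) yet be 'hard', i.e. have low reference value."""
    batch = real_states.size(0)
    z = torch.randn(batch, NOISE_DIM)
    fake = gen(z)

    # Discriminator update: real vs. generated states (standard GAN loss).
    d_loss = (F.binary_cross_entropy_with_logits(disc(real_states), torch.ones(batch, 1))
              + F.binary_cross_entropy_with_logits(disc(fake.detach()), torch.zeros(batch, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update: fool the discriminator AND minimize the reference
    # value Q_ref(s, pi_ref(s)), pushing samples toward hard boundary regions.
    g_adv = F.binary_cross_entropy_with_logits(disc(fake), torch.ones(batch, 1))
    g_hard = q_ref(fake, policy_ref(fake)).mean()
    g_loss = g_adv + hardness_coef * g_hard
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return fake.detach()

def regularized_policy_loss(base_loss, policy, boundary_states, lam=0.5):
    """Penalize over-confident (low-entropy) action distributions on the
    generated boundary states; `policy` is assumed to return a torch
    distribution over actions."""
    entropy = policy(boundary_states).entropy().mean()
    return base_loss - lam * entropy

In this sketch, the hardness term and the entropy regularizer stand in for the paper's uncertainty-weakening mechanism; the actual method may use a different adversarial objective or confidence penalty.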


Keywords: offline reinforcement learning; out-of-distribution state; robustness; uncertainty
