Optimizing Long-term Predictions for Model-based Policy Search
2017
Conference Paper
We propose a novel long-term optimization criterion to improve the robustness of model-based reinforcement learning in real-world scenarios. Learning a dynamics model to derive a solution promises much greater data-efficiency and reusability compared to model-free alternatives. In practice, however, model-based RL suffers from various imperfections such as noisy input and output data, delays, and unmeasured (latent) states. To achieve higher resilience against such effects, we propose to optimize a generative long-term prediction model directly with respect to the likelihood of observed trajectories, as opposed to the common approach of optimizing a dynamics model for one-step-ahead predictions. We evaluate the proposed method on several artificial and real-world benchmark problems and compare it to PILCO, a model-based RL framework, in experiments on a manipulation robot. The results show that the proposed method is competitive with state-of-the-art model learning methods. In contrast to these more involved models, our model can directly be employed for policy search and outperforms a baseline method in the robot experiment.
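The difference between the two training criteria named in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a hypothetical scalar linear model x[t+1] = a*x[t] + b*u[t], noise-free data, and a mean squared error that stands in for a negative Gaussian log-likelihood with fixed noise variance. All function and variable names are made up for illustration.

```python
import numpy as np

def simulate(a, b, x0, controls):
    """Roll the scalar linear model x[t+1] = a*x[t] + b*u[t] forward from x0."""
    xs = [x0]
    for u in controls:
        xs.append(a * xs[-1] + b * u)
    return np.array(xs)

def one_step_loss(a, b, observed, controls):
    """Mean squared error when every prediction restarts from the observed state."""
    predicted = a * observed[:-1] + b * controls
    return float(np.mean((predicted - observed[1:]) ** 2))

def rollout_loss(a, b, observed, controls):
    """Mean squared error of a free-running rollout started at observed[0].

    The model is fed its own predictions, so errors accumulate over the horizon.
    """
    predicted = simulate(a, b, observed[0], controls)
    return float(np.mean((predicted[1:] - observed[1:]) ** 2))

# Hypothetical "true" system (a=0.9, b=0.5) driven by a constant control input.
controls = np.ones(20)
observed = simulate(0.9, 0.5, 0.0, controls)

# A slightly wrong model (a=0.8) is scored under both criteria.
loss_one_step = one_step_loss(0.8, 0.5, observed, controls)
loss_rollout = rollout_loss(0.8, 0.5, observed, controls)
```

In this toy setting the mis-specified model is penalized more heavily by the rollout criterion than by the one-step criterion, because the rollout compounds the model's own prediction errors instead of resetting to the measured state at every step.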
Author(s): | Andreas Doerr and Christian Daniel and Duy Nguyen-Tuong and Alonso Marco and Stefan Schaal and Marc Toussaint and Sebastian Trimpe |
Book Title: | Proceedings of 1st Annual Conference on Robot Learning (CoRL) |
Volume: | 78 |
Pages: | 227-238 |
Year: | 2017 |
Month: | November |
Editors: | Sergey Levine and Vincent Vanhoucke and Ken Goldberg |
Department(s): | Autonomous Motion, Intelligent Control Systems |
Research Project(s): | Learning Probabilistic Dynamics Models |
Bibtex Type: | Conference Paper (conference) |
Paper Type: | Conference |
Event Name: | 1st Annual Conference on Robot Learning |
Event Place: | Mountain View, CA, USA |
State: | Published |
Links: | PDF |
BibTex: |
@conference{doerr2017optimizing,
  title = {Optimizing Long-term Predictions for Model-based Policy Search},
  author = {Doerr, Andreas and Daniel, Christian and Nguyen-Tuong, Duy and Marco, Alonso and Schaal, Stefan and Toussaint, Marc and Trimpe, Sebastian},
  booktitle = {Proceedings of 1st Annual Conference on Robot Learning (CoRL)},
  volume = {78},
  pages = {227-238},
  editors = {Sergey Levine and Vincent Vanhoucke and Ken Goldberg},
  month = nov,
  year = {2017},
  doi = {},
  month_numeric = {11}
}
|