Carnegie Mellon University
TL;DR
We present Soft Analytic Policy Optimization (SAPO), a maximum entropy first-order
model-based RL algorithm, alongside Rewarped, a parallel differentiable multiphysics
simulation platform for RL that supports simulating various materials beyond rigid bodies.
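For readers unfamiliar with the terminology, the textbook form of a maximum entropy RL objective over a differentiable simulator can be written as below. This is a sketch for orientation only and not necessarily the exact objective used in this work; the symbols (policy \pi_\theta, discount \gamma, temperature \alpha, entropy \mathcal{H}, simulator step f, horizon T) are assumptions introduced here for illustration.

```latex
J(\theta) = \mathbb{E}_{a_t \sim \pi_\theta(\cdot \mid s_t)}
  \left[ \sum_{t=0}^{T-1} \gamma^{t}
    \Big( r(s_t, a_t) + \alpha \, \mathcal{H}\big[\pi_\theta(\cdot \mid s_t)\big] \Big)
  \right],
\qquad s_{t+1} = f(s_t, a_t)
```

In a first-order method, \nabla_\theta J is estimated by backpropagating through the rollout, which uses the simulator Jacobians \partial f / \partial s_t and \partial f / \partial a_t. This is why the simulator needs to be differentiable.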
Results on Rewarped tasks
Figure. Visualizations of trajectories from policies learned by SAPO in Rewarped tasks.
BibTeX
@article{xing2024stabilizing,
  title={Stabilizing Reinforcement Learning in Differentiable Multiphysics Simulation},
  author={Eliot Xing and Vernon Luk and Jean Oh},
  journal={arXiv preprint arXiv:2412.12089},
  year={2024}
}