Eliot Xing
Vernon Luk
Jean Oh

Carnegie Mellon University


TL;DR: We present SAPO, a maximum entropy first-order model-based RL algorithm, alongside Rewarped, a parallel differentiable multiphysics simulation platform for RL that supports simulating a range of materials beyond rigid bodies.

Results on Rewarped tasks

Figure. Visualizations of trajectories from policies learned by SAPO in Rewarped tasks. Panels: AntRun, HandReorient, RollingFlat, SoftJumper, HandFlip, FluidTransport.

BibTeX
@article{xing2024stabilizing,
    title={Stabilizing Reinforcement Learning in Differentiable Multiphysics Simulation},
    author={Eliot Xing and Vernon Luk and Jean Oh},
    journal={arXiv preprint arXiv:2412.12089},
    year={2024}
}