MotioNet: 3D Human Motion Reconstruction from Monocular Video with Skeleton Consistency

Abstract

We introduce MotioNet, a deep neural network that directly reconstructs the motion of a 3D human skeleton from monocular video. While previous methods rely on either rigging or inverse kinematics (IK) to associate a consistent skeleton with temporally coherent joint rotations, our method is the first data-driven approach that directly outputs a kinematic skeleton, which is a complete and commonly used motion representation. At the crux of our approach lies a deep neural network with embedded kinematic priors, which decomposes sequences of 2D joint positions into two separate attributes: a single, symmetric skeleton, encoded by bone lengths, and a sequence of 3D joint rotations associated with global root positions and foot contact labels.
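The two-branch decomposition described above can be sketched as follows. This is a minimal illustrative stand-in, not the paper's actual architecture: the function name `decompose_motion`, the temporal mean-pooling for the static branch, and the identity-quaternion placeholder for the dynamic branch are all assumptions made for the sketch; in MotioNet both branches are learned encoders.

```python
import numpy as np

def decompose_motion(joints_2d):
    """Hypothetical sketch of the two-branch decomposition.

    joints_2d: (T, J, 2) array of 2D joint positions over T frames.
    Returns one bone-length vector for the whole clip (static branch)
    and per-frame rotation parameters (dynamic branch).
    """
    T, J, _ = joints_2d.shape

    # Static branch: pool over time, so the entire clip yields a SINGLE
    # skeleton. (Placeholder for the paper's learned skeleton encoder.)
    clip_feature = joints_2d.reshape(T, -1).mean(axis=0)   # (J*2,)
    bone_lengths = np.abs(clip_feature[: J - 1])           # (J-1,) placeholder

    # Dynamic branch: keep the temporal axis, producing one rotation per
    # joint per frame. (Identity quaternions stand in for the learned
    # rotation encoder.)
    rotations = np.tile([1.0, 0.0, 0.0, 0.0], (T, J, 1))   # (T, J, 4)
    return bone_lengths, rotations
```

The key design point the sketch mirrors is the asymmetry between the branches: the skeleton is a per-clip quantity (temporal pooling collapses T frames into one bone-length vector), while rotations, root positions, and foot contacts are per-frame quantities.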

Publication
ToG 2020
Mingyi Shi
PhD, since Nov. 2020.
Taku Komura
Professor
