We address the problem of accurately capturing and expressively modelling the interactive behaviors between two persons in daily scenarios. Unlike previous works that either consider only a single person or focus on conversational gestures, we simultaneously model the activities of both persons and target objective-driven, dynamic, and coherent interactions that often span long durations and cover large spaces. To this end, we capture a new dataset, dubbed InterAct, consisting of 241 motion sequences in which two persons perform a realistic, dynamic, and coherent scenario throughout each sequence. The audio, body motions, and facial expressions of both persons are captured. We also demonstrate the first diffusion-model-based approach that directly estimates the interactive motions of two persons from their audio alone. All data and code will be made available for research purposes upon acceptance of the paper.
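Since the model itself is not detailed on this page, the following is only a minimal sketch (not the authors' implementation) of what a diffusion-style denoiser conditioned on both persons' audio might look like. All module names, feature dimensions, frame rates, and the per-frame sample layout are assumptions made for illustration and are not taken from the InterAct release.

# A hypothetical sketch of a two-person audio-conditioned motion denoiser.
# Dimensions and layout are assumed, not from the InterAct code.

import torch
import torch.nn as nn


class TwoPersonAudioToMotionDenoiser(nn.Module):
    """Predicts clean two-person motion from a noisy motion sequence,
    conditioned on both persons' audio features and the diffusion timestep."""

    def __init__(self, audio_dim=128, motion_dim=135, hidden_dim=256):
        super().__init__()
        # Both persons' streams are stacked per frame so the backbone can
        # attend to the interaction between them.
        self.audio_proj = nn.Linear(2 * audio_dim, hidden_dim)
        self.motion_proj = nn.Linear(2 * motion_dim, hidden_dim)
        self.time_embed = nn.Sequential(nn.Linear(1, hidden_dim), nn.SiLU(),
                                        nn.Linear(hidden_dim, hidden_dim))
        layer = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=4,
                                           batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)
        self.out = nn.Linear(hidden_dim, 2 * motion_dim)

    def forward(self, noisy_motion, audio, t):
        # noisy_motion: (B, T, 2 * motion_dim) -- both persons stacked per frame
        # audio:        (B, T, 2 * audio_dim)  -- per-frame audio features of both persons
        # t:            (B,)                   -- diffusion timestep
        h = self.motion_proj(noisy_motion) + self.audio_proj(audio)
        h = h + self.time_embed(t.float().unsqueeze(-1)).unsqueeze(1)
        return self.out(self.backbone(h))  # predicted clean two-person motion


if __name__ == "__main__":
    model = TwoPersonAudioToMotionDenoiser()
    B, T = 2, 120  # e.g. a 4-second window at an assumed 30 fps
    noisy = torch.randn(B, T, 2 * 135)
    audio = torch.randn(B, T, 2 * 128)
    t = torch.randint(0, 1000, (B,))
    print(model(noisy, audio, t).shape)  # torch.Size([2, 120, 270])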
To be written
To be written
@article{huang2024interact,
  title={InterAct: Capture and Modelling of Realistic, Expressive and Interactive Activities between Two Persons in Daily Scenarios},
  author={Yinghao Huang and Leo Ho and Dafei Qin and Mingyi Shi and Taku Komura},
  year={2024},
  eprint={2405.11690},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}