Music Driven Dance Motion Synthesis


We introduce a novel method for synthesizing dance motions that follow the emotions and contents of a piece of music. Our method takes a learning-based approach to modeling the music-to-motion mapping embodied in example dance motions and their accompanying background music. A key step is to train a rating function for music-to-motion matching quality by learning the mapping exhibited in synchronized music and dance motion data captured from professional human dance performances. To generate an optimal sequence of dance motion segments that matches a piece of music, we introduce a constraint-based dynamic programming procedure. This procedure considers both the music-to-motion matching quality and the visual smoothness of the resulting dance motion sequence. We also introduce a two-way evaluation strategy, coupled with a GPU-based implementation, that lets us execute the dynamic programming process in parallel, resulting in a significant speedup. To evaluate the effectiveness of our method, we quantitatively compare the dance motions it synthesizes with the results of several peer methods, using motions captured from professional human dancers' performances as the gold standard. We also conducted several medium-scale user studies to explore how well, perceptually, our method outperforms existing methods in synthesizing dance motions that match a piece of music. These user studies produced very positive results for our music-driven dance motion synthesis experiments on several Asian dance genres, confirming the advantages of our method.
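As an illustration of the sequence-selection idea described above, the following is a minimal sketch of a Viterbi-style dynamic program that picks one motion segment per music segment while trading off matching quality against transition smoothness. It is not the paper's implementation: the function name `select_motion_sequence`, the callables `match_score` and `transition_cost`, and the parameter `smoothness_weight` are all hypothetical, the learned rating function is replaced by a caller-supplied callable, and the paper's constraints, two-way evaluation strategy, and GPU parallelization are not reproduced.

```python
from typing import Callable, List


def select_motion_sequence(
    n_music_segments: int,
    n_motion_candidates: int,
    match_score: Callable[[int, int], float],
    transition_cost: Callable[[int, int], float],
    smoothness_weight: float = 1.0,
) -> List[int]:
    """Pick one motion candidate per music segment (illustrative sketch).

    match_score(t, j): assumed stand-in for a learned rating of how well
        motion candidate j matches music segment t (higher is better).
    transition_cost(i, j): assumed visual-discontinuity penalty for
        playing candidate j right after candidate i (lower is better).
    """
    if n_music_segments == 0 or n_motion_candidates == 0:
        return []

    candidates = range(n_motion_candidates)
    # best[j]: best total score of any sequence ending in candidate j.
    best = [match_score(0, j) for j in candidates]
    back_pointers: List[List[int]] = []

    for t in range(1, n_music_segments):
        new_best = []
        pointers = []
        for j in candidates:
            # Best predecessor for candidate j at music segment t.
            prev = max(
                candidates,
                key=lambda i: best[i] - smoothness_weight * transition_cost(i, j),
            )
            score = (
                best[prev]
                - smoothness_weight * transition_cost(prev, j)
                + match_score(t, j)
            )
            new_best.append(score)
            pointers.append(prev)
        best = new_best
        back_pointers.append(pointers)

    # Trace the optimal path backwards from the best final candidate.
    j = max(candidates, key=lambda i: best[i])
    path = [j]
    for pointers in reversed(back_pointers):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path
```

With T music segments and K motion candidates this sketch runs in O(T·K²) time. Raising `smoothness_weight` favors sequences with fewer abrupt transitions, while setting it to zero reduces the procedure to picking the best-matching candidate for each music segment independently.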

Paper: Example-Based Automatic Music-Driven Conventional Dance Motion Synthesis

We have implemented a prototype demo system. Users import a music piece from our CAPG music dance database, and the corresponding dance motion is automatically loaded and displayed on the left. Users then choose a segmentation method, a music feature, and a motion feature. Finally, clicking the generate button invokes the generation process by calling the pre-trained model. Once the generated motion is ready, it is displayed to the right of the original motion.

Supported by the National 863 High-Tech Program

Related papers:

[1] Jin B, Feng L, Liu G, et al. A hybrid approach to animating the murals with Dunhuang style[C]//2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP). IEEE, 2014: 1–6
[2] Jin B, Geng W. Correspondence specification learned from master frames for automatic inbetweening[J]. Multimedia Tools and Applications, Springer US, 2015, 74(13): 4873–4889
[3] Du Y, Wong Y, Liu Y, et al. Marker-Less 3D Human Motion Capture with Monocular Image Sequence and Height-Maps[G]//Springer, Cham, 2016: 20–36
[4] Du Y, Wong Y, Jin W, et al. Semi-supervised learning for surface EMG-based gesture recognition[J]. IJCAI International Joint Conference on Artificial Intelligence, California: International Joint Conferences on Artificial Intelligence Organization, 2017: 1624–1630
[5] Wang Z, Han F, Geng W. Image mosaicking for oversized documents with a multi-camera rig[C]//2017 IEEE 15th International Conference on Software Engineering Research, Management and Applications (SERA). IEEE, 2017: 161–167
[6] Wang Z, Geng W. Generation of view-dependent textures for an inaccurate model[C]//2017 IEEE/ACIS 16th International Conference on Computer and Information Science (ICIS). IEEE, 2017: 85–90
[7] Wang Z, Jin B, Geng W. Estimation of Antenna Pose in the Earth Frame Using Camera and IMU Data from Mobile Phones[J]. Sensors, Multidisciplinary Digital Publishing Institute, 2017, 17(4): 806
[8] Du Y, Jin W, Wei W, et al. Surface EMG-Based Inter-Session Gesture Recognition Enhanced by Deep Domain Adaptation[J]. Sensors, Multidisciplinary Digital Publishing Institute, 2017, 17(3): 458


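[1] Jin Bingwen. Research on Example-Based Learning Methods for Stylized Facial Animation Generation
[2] Du Yu. Research on Key Technologies of Deep Machine Learning-Based Perceptual Computing for Body Posture and Hand Gestures
[3] Jiang Jinzheng. Technology and System for Non-Contact Measurement of Antenna Pose
[4] Li Xia. Example-Based Technology and System for Simulating Dunhuang Mural Painting Techniques
[5] Lan Heng. An Integrated Music and Dance Choreography System Incorporating Deep Learning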


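[1] Zhejiang University. Method for recovering 3D human pose from markerless monocular images using height maps. Application No.: CN201510970682.3, filing date: 2015.12.21, publication No.: CN105631861A, publication date: 2016.06.01.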