DeepDance: Music-to-Dance Motion Choreography with Adversarial Learning


    The creation of improvised dance choreography is an important research field in cross-modal analysis. A key challenge is how to effectively correlate music and dance through a probabilistic one-to-many mapping, which is essential for creating realistic dances of various genres. To address this issue, we propose a GAN-based cross-modal association framework, DeepDance, which correlates two different modalities (dance motion and music), aiming to create the desired dance sequence for the input music. Its generator predictively produces the dance movements that best fit the current music piece by learning from examples. Its discriminator, in turn, acts as an external audience that judges the whole performance: the generated dance movements and the corresponding input music are considered well-matched if the discriminator cannot distinguish the generated movements from the training samples according to the estimated probability. By adding motion-consistency constraints to our loss function, the proposed framework is able to create long, realistic dance sequences. To alleviate the expensive and inefficient process of data collection, we propose an effective approach to build a large-scale dataset, YouTube-Dance3D, from an open data source. Extensive experiments on currently available music-dance datasets and our YouTube-Dance3D dataset demonstrate that our approach effectively captures the correlation between music and dance and can be used to choreograph appropriate dance sequences.
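The exact loss formulation is given in the paper; the sketch below only illustrates, in plain numpy, how a motion-consistency penalty (discouraging abrupt frame-to-frame acceleration) might be combined with a non-saturating adversarial generator loss. The function names, the weight `lam`, and the acceleration-based penalty are our illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def motion_consistency_loss(motion):
    """motion: (T, J) array of T pose frames with J joint coordinates.

    Penalizes large frame-to-frame velocity changes (i.e. acceleration)
    so that generated sequences stay smooth over long horizons.
    """
    velocity = np.diff(motion, axis=0)   # (T-1, J) per-frame displacement
    accel = np.diff(velocity, axis=0)    # (T-2, J) change in displacement
    return float(np.mean(accel ** 2))

def generator_loss(d_score_fake, motion, lam=1.0):
    """Non-saturating adversarial term plus the consistency constraint.

    d_score_fake: discriminator's probability that the generated
    sequence is real; the generator wants to drive it toward 1.
    """
    adv = -np.log(d_score_fake + 1e-8)
    return float(adv + lam * motion_consistency_loss(motion))

# A linearly moving pose has zero acceleration, hence zero penalty;
# injecting a sudden jump into one frame makes the penalty positive.
smooth = np.outer(np.arange(5, dtype=float), np.ones(3))
jerky = smooth.copy()
jerky[2] += 5.0
```

With this decomposition, `lam` trades off realism as judged by the discriminator against temporal smoothness of the generated motion.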


We release a semi-automatic dance acquisition tool for creating large-scale music-dance datasets from open data sources (i.e., dance videos from YouTube). Our GAN-based cross-modal association model captures the correlation between music and dance and is trained via adversarial learning. The learned model can create realistic dance sequences according to the input music and a starting pose.


Supported by National Natural Science Foundation of China under Grant 61379067
Supported by the National Research Foundation, Prime Minister’s Office, Singapore under its Strategic Capability Research Centres Funding Initiative


Our code is available on GitHub.


Our datasets
EA-MUD: consists of music-dance pairs of multiple east Asian dancing genres.
YouTube-Dance3D: consists of skeletal dance sequences extracted from dance videos in YouTube-8M.
Other datasets
Cyprus: contains 10 dance sequences for Laban analysis.
Alemi: contains 4 dance sequences.
Yalta: contains about 10 dance sequences of 2 genres.
Tang: contains about 60 dance sequences of 4 genres.

For all datasets, we provide the retargeted dance sequences as well as the musical data used in this paper. The download link will be available soon.


@ARTICLE{9042236,
    author={G. {Sun} and Y. {Wong} and Z. {Cheng} and M. S. {Kankanhalli} and W. {Geng} and X. {Li}},
    journal={IEEE Transactions on Multimedia},
    title={DeepDance: Music-to-Dance Motion Choreography with Adversarial Learning},
}

Related work:

Example-Based Automatic Music-Driven Conventional Dance Motion Synthesis.


[1] 樊儒昆 (Fan Rukun), Music-Driven Dance Motion Synthesis
[2] 兰恒 (Lan Heng), An Integrated Music-Dance Choreography System Combining Deep Learning
[3] 赖章炯 (Lai Zhangjiong), Research on Automatic Music-Dance Choreography Based on Deep Learning