MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment



SLIDE 1

MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment

Hao-Wen Dong*, Wen-Yi Hsiao*, Li-Chia Yang, Yi-Hsuan Yang

Research Center for IT Innovation, Academia Sinica

Demo Page: https://salu133445.github.io/musegan/

*These authors contributed equally to this work.

SLIDE 2

Outline

  • Goals & Challenges
  • Data
  • Proposed Model
  • Results & Evaluation
  • Future Work

Source Code: https://github.com/salu133445/musegan
Demo Page: https://salu133445.github.io/musegan/

SLIDE 3

Goals

Generate pop music
  • of multiple tracks
  • in piano-roll format
  • using GAN with CNNs

[Source Code] https://github.com/salu133445/musegan
[Demo Page] https://salu133445.github.io/musegan/

SLIDE 4

Challenge I

Multitrack Interdependency

Tracks: vocal, piano, bass, drums, strings

(music & clip by phycause)

Approach: Multi-track GAN

SLIDE 5

Challenge II

Music Texture

melody, chord (harmony)

Approach: Convolutional Neural Networks

SLIDE 6

Challenge III

Temporal Structure

[Diagram: hierarchical temporal structure in 4/4 time: song → paragraphs (1 to 3 shown) → phrases (1 to 4 shown) → bars (1 to 4) → beats (1 to 4) → steps (1 to 24), with phrase 2 labeled separately]

SLIDE 7

Challenge III

Temporal Structure

[Diagram: bars 1 to 4 in 4/4 time; each bar has beats 1 to 4 and each beat has steps 1 to 24]

Fixed Structure

Approach: Convolutional Neural Networks
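
The fixed structure on this slide (4 bars, 4 beats per bar in 4/4 time, 24 steps per beat) means a position can be found with integer arithmetic alone. A minimal sketch; the function and constant names are illustrative and are not taken from the MuseGAN code base.

```python
STEPS_PER_BEAT = 24   # "step 1 ... step 24" on the slide
BEATS_PER_BAR = 4     # 4/4 time
BARS_PER_PHRASE = 4   # "bar 1 ... bar 4"

STEPS_PER_BAR = STEPS_PER_BEAT * BEATS_PER_BAR        # 96
STEPS_PER_PHRASE = STEPS_PER_BAR * BARS_PER_PHRASE    # 384


def locate(step_index):
    """Map a 0-based time step within four bars to 0-based (bar, beat, step)."""
    if not 0 <= step_index < STEPS_PER_PHRASE:
        raise ValueError("step_index is outside the four bars")
    bar, within_bar = divmod(step_index, STEPS_PER_BAR)
    beat, step = divmod(within_bar, STEPS_PER_BEAT)
    return bar, beat, step
```

Because every bar has the same length, a convolution kernel can be aligned with beats and bars, which is what makes a CNN a natural fit here.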

SLIDE 8

Data Representation

Piano-roll

[Figure: a piano-roll with pitch and time axes covering Bar 1 to Bar 4, with labels A3, t0 and t1. Annotations: polyphonic; multi-track; time step (with symbolic timing).]
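
A piano-roll can be stored as a binary time-by-pitch grid in which a cell is 1 while that pitch sounds. A minimal sketch, assuming notes are given as (pitch index, start step, end step) tuples; the function name, the tuple layout and the pitch indices are hypothetical.

```python
def notes_to_pianoroll(notes, n_steps, n_pitches):
    """Return a time-by-pitch grid of 0/1 values.

    notes: iterable of (pitch_index, start_step, end_step), end exclusive.
    """
    roll = [[0] * n_pitches for _ in range(n_steps)]
    for pitch, start, end in notes:
        for t in range(start, end):
            roll[t][pitch] = 1
    return roll


# Two notes that overlap in time: the grid can hold both (polyphonic).
roll = notes_to_pianoroll([(45, 0, 4), (49, 2, 6)], n_steps=8, n_pitches=84)
```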

SLIDE 9

Data Representation

Multi-track Piano-roll

[Figure: a multi-track piano-roll with pitch, time and track axes. Annotations: polyphonic; multi-track (with symbolic timing).]

SLIDE 10

Data Representation

a 4×96×84×5 tensor
  • 4 bars
  • 96 time steps (per bar)
  • 84 pitches
  • 5 tracks: Drums, Guitar, Piano, Strings, Bass
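
To make the 4×96×84×5 layout concrete, here is a minimal sketch that builds a silent tensor with plain nested lists and switches on one note. The axis order (bar, time step, pitch, track) follows the slide; the track order, the helper names and the pitch index are assumptions for illustration.

```python
N_BARS, N_STEPS, N_PITCHES, N_TRACKS = 4, 96, 84, 5
TRACKS = ["Drums", "Guitar", "Piano", "Strings", "Bass"]  # order assumed


def empty_phrase():
    """Build a silent phrase indexed as phrase[bar][step][pitch][track]."""
    return [[[[0] * N_TRACKS for _ in range(N_PITCHES)]
             for _ in range(N_STEPS)]
            for _ in range(N_BARS)]


def shape(nested):
    """Return the dimensions of a regular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


phrase = empty_phrase()
# Switch on pitch index 40 of the bass track for the first beat of the first bar.
for step in range(24):
    phrase[0][step][40][TRACKS.index("Bass")] = 1
```

In practice an array library would hold this tensor; nested lists are used here only to keep the sketch dependency-free.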

SLIDE 11

Data

LPD (Lakh Pianoroll Dataset)
  • >170,000 multi-track piano-rolls
  • Derived from the Lakh MIDI Dataset
  • Mainly pop songs

Pypianoroll (Python package)
  • Manipulation & Visualization
  • Efficient Save/Load
  • Parse/Write MIDI files
  • On PyPI (pip installable)

[Dataset] https://salu133445.github.io/musegan/dataset
[Pypianoroll] https://salu133445.github.io/pypianoroll/

SLIDE 12

Generative Adversarial Networks

[Diagram: random noise z ~ p(z) → G → fake data G(z); real data X and fake data G(z) → D → 1/0]

SLIDE 13

Generative Adversarial Networks

[Diagram: random noise z ~ p(z) → G → fake data G(z); real data X and fake data G(z) → D → 1/0]

Goal of G: make G(z) indistinguishable from real data for D
  • loss: log(1 - D(G(z)))

Goal of D: distinguish G(z) being fake from X being real
  • loss: log(1 - D(X)) + log(D(G(z)))
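
The two expressions on this slide can be evaluated directly. The sketch below reads both as losses to be minimized, with D(·) taken as the probability that its input is real (1 = real, 0 = fake); that reading and the function names are assumptions, chosen because they match the stated goals of G and D.

```python
import math


def discriminator_loss(d_real, d_fake):
    """D's expression on the slide: log(1 - D(X)) + log(D(G(z)))."""
    return math.log(1.0 - d_real) + math.log(d_fake)


def generator_loss(d_fake):
    """G's expression on the slide: log(1 - D(G(z)))."""
    return math.log(1.0 - d_fake)


# A discriminator that scores real data 0.9 and fake data 0.1 ...
confident = discriminator_loss(d_real=0.9, d_fake=0.1)
# ... gets a lower loss than one that cannot tell them apart.
unsure = discriminator_loss(d_real=0.5, d_fake=0.5)
```

Under the same reading, the generator's loss falls as D(G(z)) moves toward 1, that is, as D is fooled.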

SLIDE 14

Generative Adversarial Networks

[Diagram: random noise z ~ p(z) → Generator G → fake data G(z); real data X and fake data G(z) → Discriminator D → real/fake]

  • Real data X: 4-bar phrases of 5 tracks
  • Discriminator D: a critic (WGAN-GP)
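
In WGAN-GP the critic outputs an unbounded score rather than a probability, and its loss adds a gradient penalty that pushes the gradient norm toward 1 at points between real and fake samples. A minimal sketch of that loss, assuming a critic given as a plain Python function, gradients estimated by finite differences, and a penalty weight of 10; none of these choices come from the slides, and a real implementation would use automatic differentiation.

```python
import math
import random


def gradient_norm(f, x, h=1e-5):
    """L2 norm of the gradient of f at x, by central differences."""
    total = 0.0
    for i in range(len(x)):
        up = x[:i] + [x[i] + h] + x[i + 1:]
        down = x[:i] + [x[i] - h] + x[i + 1:]
        total += ((f(up) - f(down)) / (2 * h)) ** 2
    return math.sqrt(total)


def critic_loss_gp(f, real_batch, fake_batch, lam=10.0, seed=0):
    """mean f(fake) - mean f(real) + lam * mean (||grad f(x_hat)|| - 1)^2.

    Assumes real_batch and fake_batch have the same length.
    """
    rng = random.Random(seed)
    n = len(real_batch)
    mean_real = sum(f(x) for x in real_batch) / n
    mean_fake = sum(f(x) for x in fake_batch) / n
    penalty = 0.0
    for real, fake in zip(real_batch, fake_batch):
        eps = rng.random()
        # Random point on the line between a real and a fake sample.
        x_hat = [eps * r + (1.0 - eps) * k for r, k in zip(real, fake)]
        penalty += (gradient_norm(f, x_hat) - 1.0) ** 2
    return mean_fake - mean_real + lam * penalty / n
```

For a linear critic whose weight vector has unit norm the penalty vanishes, so the loss reduces to the difference of the two mean scores.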

SLIDE 15

MuseGAN – An Overview

[Diagram: 1 random noise → temporal generator (Gtemp) → 4 latent variables → bar generator (Gbar) → 4 piano-roll matrices]
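
The two-stage flow on this slide (one noise vector, four latent variables, four piano-roll matrices) can be sketched with toy stand-ins. Both generators below are simple placeholder functions, not neural networks, and the latent size, note density and function names are hypothetical; only the data flow is meant to mirror the slide.

```python
import random

N_BARS, N_STEPS, N_PITCHES = 4, 96, 84
LATENT_DIM = 8  # hypothetical latent size


def temporal_generator(noise, rng):
    """Toy stand-in for Gtemp: one noise vector -> N_BARS latent vectors."""
    return [[v + rng.gauss(0.0, 0.1) for v in noise] for _ in range(N_BARS)]


def bar_generator(latent):
    """Toy stand-in for Gbar: one latent vector -> one 96x84 binary bar."""
    rng = random.Random(repr(latent))  # same latent, same bar
    return [[1 if rng.random() < 0.02 else 0 for _ in range(N_PITCHES)]
            for _ in range(N_STEPS)]


def generate_phrase(seed=0):
    """Run the full chain: noise -> 4 latents -> 4 piano-roll matrices."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
    latents = temporal_generator(noise, rng)
    return [bar_generator(z) for z in latents]
```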

SLIDE 16

Bar Generator

Generator

[Diagram: random vectors z feeding five generators G]

SLIDE 17

Generator

Bar Generator

[Diagram: random vectors z feeding five generators G. Labels: No Coordination, Coordination; track-dependent, track-independent.]

SLIDE 18

Generator

Bar Generator

[Diagram: random vectors z feeding several generators G]

SLIDE 19

Generator

Bar Generator

[Diagram: random vectors z feeding several generators G]

SLIDE 20

Generator

Bar Generator

                     Time-dependent   Time-independent
  Track-dependent    Melody           Groove
  Track-independent  Chords           Style

[Diagram: random vectors z, labeled Chords, Style, Melody and Groove, feeding several generators G]
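
The four labels on this slide (Style, Chords, Groove, Melody) correspond to random vectors that are shared or not shared across tracks and bars. A minimal sketch of how the input for each track and bar could be assembled, assuming the four vectors are concatenated; the latent size and all names are hypothetical.

```python
import random

N_TRACKS, N_BARS, DIM = 5, 4, 4  # DIM is a hypothetical latent size


def sample(rng, dim=DIM):
    """Draw one random vector."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def bar_generator_inputs(seed=0):
    """Return inputs[track][bar], each a concatenation of four vectors."""
    rng = random.Random(seed)
    style = sample(rng)                               # one for everything
    chords = [sample(rng) for _ in range(N_BARS)]     # one per bar
    groove = [sample(rng) for _ in range(N_TRACKS)]   # one per track
    melody = [[sample(rng) for _ in range(N_BARS)]    # one per track and bar
              for _ in range(N_TRACKS)]
    return [[style + chords[t] + groove[i] + melody[i][t]
             for t in range(N_BARS)]
            for i in range(N_TRACKS)]


inputs = bar_generator_inputs()
```

Sharing a vector across tracks is what lets the tracks coordinate, while the track-dependent vectors leave room for each track's own behavior.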

SLIDE 21

MuseGAN

SLIDE 22

Results

Sample 1, Sample 2

[Figure: generated piano-rolls of the Bass, Drums, Guitar, Strings and Piano tracks at training steps 0, 700, 2500, 6000 and 7900. Annotations: drum pattern, chords, bass line.]

More Samples on Demo Page: https://salu133445.github.io/musegan/

SLIDE 23

Monitor the Training

[Plot: negative critic loss vs. step (step ticks 2000 to 8000; loss ticks 10^4 to 10^12)]

Objective Metrics
  • UPC: number of used pitch classes per bar
  • QN: ratio of qualified notes

[Plots: UPC vs. step; QN vs. step]
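
Both objective metrics can be computed from a single binary bar (time steps by pitches). A minimal sketch, assuming a note counts as qualified when it lasts at least three time steps; that threshold is not stated on the slide, and the function names are hypothetical. Consecutive active steps on one pitch are treated as a single note, since a binary piano-roll cannot separate repeated notes.

```python
def used_pitch_classes(bar):
    """UPC: number of distinct pitch classes (0 to 12) active in the bar."""
    classes = set()
    for step in bar:
        for pitch, on in enumerate(step):
            if on:
                classes.add(pitch % 12)
    return len(classes)


def note_lengths(bar):
    """Lengths (in time steps) of all runs of consecutive active steps."""
    lengths = []
    for pitch in range(len(bar[0])):
        run = 0
        for step in bar:
            if step[pitch]:
                run += 1
            elif run:
                lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return lengths


def qualified_note_ratio(bar, min_steps=3):
    """QN: share of notes lasting at least min_steps (assumed threshold)."""
    lengths = note_lengths(bar)
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n >= min_steps) / len(lengths)
```

Tracking both values during training gives a cheap check that the output is moving from scattered noise toward longer notes on fewer pitch classes.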

SLIDE 24

User Study

  • H: harmonious
  • R: rhythmic
  • MS: musically structured
  • C: coherent
  • OR: overall rating

Models: composer, jamming, hybrid

SLIDE 25

Accompaniment System

Conditional GAN

Generation from Scratch: nothing → 5-track
Accompaniment System: single-track → 5-track

SLIDE 26

Summary

  • MuseGAN
      • a novel GAN for multi-track sequence generation
      • multi-track, polyphonic music
      • human-AI cooperative scenario
  • Lakh Pianoroll Dataset (LPD) (new dataset!!)
  • Pypianoroll (new package!!)

SLIDE 27

Future Work

Full Song Generation

Hierarchical Temporal Structure

[Diagram: song → paragraphs (1 to 3 shown) → phrases (1 to 4 shown) → bars (1 to 4) → beats (1 to 4) → steps (1 to 24), with phrase 2 labeled separately]

SLIDE 28

Future Work

Cross-modal Generation
  • Music + Video
  • Music + Lyrics
  • Video + Text

SLIDE 29

MIR (Music Information Research)

Analysis
  • music → features
  • e.g. chord recognition, beat/downbeat detection, music transcription, source separation

Retrieval
  • query → music
  • e.g. query by humming, similarity search, music recommendation, playlist generation

Generation
  • X → music
  • e.g. generation, accompaniment, style transfer, mashup, remix

SLIDE 30

MACLab

Research Center for IT Innovation, Academia Sinica

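Music generation; vocal separation

Music highlight extraction

Accompaniment system; composition system; multi-track/instrument models; MIDI music format

Search for: MuseGAN, MidiNet

Music puzzle game
  • application: music medley generation
  • demo: https://remyhuang.github.io/

Separated vocals; separated music
  • uses machine learning techniques to extract the vocal part and the music part from a song

Lab Director
  • Dr. Yi-Hsuan Yang

Music and Audio Computing Lab

[Lab Website] http://mac.citi.sinica.edu.tw/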

SLIDE 31

AAAI 2018

SLIDE 32

New Orleans

SLIDE 33

Mardi Gras

SLIDE 34

Q&A

MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment

Source Code: https://github.com/salu133445/musegan
Demo Page: https://salu133445.github.io/musegan/