Let's try the TensorFlow music generation project "Magenta", from development environment setup to song generation.

Introduction

Following Magenta's GitHub README, I will set up Magenta's development environment without using Docker and generate a song with the Basic RNN model.

We will download and use a pre-trained model.

If you want the easier Docker route, please refer here: Try making a song with TensorFlow's art / music generation project "magenta".

Development environment

To prepare the development environment for Magenta, you need three things: Magenta's GitHub repository, Bazel, and TensorFlow.

Also, Magenta seems to support only Python 2.7 at the moment, so I created a Python 2.7 environment with virtualenv on my Mac and set things up there.

Clone Magenta's GitHub repository

git clone https://github.com/tensorflow/magenta.git

Install Bazel

https://www.bazel.io/versions/master/docs/install.html

$ brew install bazel

Install TensorFlow

https://www.tensorflow.org/versions/master/get_started/os_setup.html

When installing TensorFlow, check whether your machine is CPU-only or can use a GPU. TensorFlow supports GPUs through NVIDIA's CUDA, so if your Mac has one, you may want to use it. Reference: OSX GPU is supported by Tensorflow

I proceeded with CPU only.

$ export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-0.10.0-py2-none-any.whl
$ pip install --upgrade $TF_BINARY_URL

After setup, let's run the tests. Move into the cloned magenta/ directory and run the following command.

bazel test //magenta/...

If a message like the following appears, the tests succeeded and setup is complete.

.........
INFO: Found 65 targets and 27 test targets...
INFO: Elapsed time: 43.359s, Critical Path: 35.97s
//magenta/common:concurrency_test                                        PASSED in 5.4s
//magenta/interfaces/midi:midi_hub_test                                  PASSED in 14.8s
(Omission)

Executed 27 out of 27 tests: 27 tests pass.
There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.

Let's make a song with Basic RNN.

Magenta has three types of melody models: Basic RNN, Lookback RNN, and Attention RNN. This time, I will try Basic RNN.

This model generates monophonic melodies: only one note sounds at any given time. I won't do it in this post, but if you train on your own MIDI files, the model apparently keeps only one note even where the MIDI has several notes sounding at the same time.
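The monophonic constraint can be sketched as follows. This is my own illustration of the idea, not Magenta's actual preprocessing code; I assume note events are given as (step, pitch) pairs and keep only the highest pitch at each step:

```python
# Hypothetical sketch: reduce polyphonic events to a monophonic melody by
# keeping only the highest-sounding pitch at each time step.
# (An illustration of the idea, not Magenta's actual preprocessing.)

def to_monophonic(events):
    """events: list of (step, midi_pitch) pairs, possibly polyphonic."""
    melody = {}
    for step, pitch in events:
        # When several notes share a step, keep the highest one.
        if step not in melody or pitch > melody[step]:
            melody[step] = pitch
    return sorted(melody.items())

# A C-major chord (C-E-G) at step 0 collapses to a single note, G.
print(to_monophonic([(0, 60), (0, 64), (0, 67), (1, 62)]))
# → [(0, 67), (1, 62)]
```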

Download trained model

You can download it from here.

After downloading, please place it anywhere.

Path setting

Set the path of the downloaded trained model as an absolute path.

BUNDLE_PATH=<absolute path of basic_rnn.mag>

Song generation

It can be generated with the following command.

bazel run //magenta/models/basic_rnn:basic_rnn_generate -- \
--bundle_file=${BUNDLE_PATH} \
--output_dir=/tmp/basic_rnn/generated \
--num_outputs=10 \
--num_steps=128 \
--primer_melody="[60]"

The options seem to have the following meanings:

Song generation destination

--output_dir=/tmp/basic_rnn/generated \

You can decide where to generate the song.

Number of songs

--num_outputs=10 \

You can decide how many songs to make.

Song length

--num_steps=128 \

Sets the length of the song in steps. With 128 steps, an 8-bar song is produced; there seem to be 16 steps per bar. (My grasp of terms like "bar" is a bit shaky...)
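As a sanity check on those numbers, assuming 16 steps per bar (i.e. 16th-note resolution in 4/4 time, which is my reading, not something stated in Magenta's flags):

```python
# num_steps is measured in steps; assuming 16 steps per bar
# (16th-note resolution in 4/4 time), 128 steps works out to 8 bars.
STEPS_PER_BAR = 16  # assumption: 16th-note resolution, 4/4 time

def bars_for(num_steps):
    return num_steps // STEPS_PER_BAR

print(bars_for(128))  # → 8
```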

Song intro

--primer_melody="[60]"

This sets which notes the song starts with. For example, --primer_melody="[60, -2, 60, -2, 67, -2, 67, -2]" stands for "do do sol sol", so it will generate songs that start with the intro of "Twinkle Twinkle Little Star". (Since the above is 8 events, it fills the first half of the first bar.)

The special numbers are -2 (no event) and -1 (note-off event). The words alone didn't make it clear to me, but listening to the output: with [60, -2, 60, -2, 67, -2, 67, -2], each note of "do do sol sol" sustains until the next one, whereas with [60, -1, 60, -1, 67, -1, 67, -1], each note is cut short, like "do. do. sol. sol.". So the difference seems to be "do nothing (let the note ring)" versus "explicitly turn the note off".
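The effect of -2 versus -1 can be illustrated by decoding a primer sequence into notes with durations. This is my own interpretation of the event encoding, not Magenta's decoder:

```python
# Decode a Melody RNN-style event list into (pitch, duration_in_steps) notes.
# -2 = no event (the previous note keeps sounding), -1 = note-off.
# (My own interpretation of the encoding, not Magenta's decoder.)

def decode(events):
    notes = []
    current = None  # [pitch, duration] of the note currently sounding
    for e in events:
        if e >= 0:  # new note-on: close the previous note, start a new one
            if current:
                notes.append(current)
            current = [e, 1]
        elif e == -2 and current:  # no event: the note sustains one more step
            current[1] += 1
        elif e == -1 and current:  # note-off: the note ends here
            notes.append(current)
            current = None
    if current:
        notes.append(current)
    return [tuple(n) for n in notes]

# With -2, each note rings for 2 steps; with -1, each lasts only 1 step.
print(decode([60, -2, 60, -2, 67, -2, 67, -2]))
# → [(60, 2), (60, 2), (67, 2), (67, 2)]
print(decode([60, -1, 60, -1, 67, -1, 67, -1]))
# → [(60, 1), (60, 1), (67, 1), (67, 1)]
```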

60 is do (C) and 67 is sol (G); the numbers appear to be MIDI note numbers, which correspond to piano keys (60 is middle C).
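A small sketch of the note-number-to-name conversion, using the common convention that MIDI note 60 is C4 (middle C):

```python
# Convert a MIDI note number to a note name.
# Octave convention: MIDI 60 = C4 (middle C), as in most MIDI software.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def midi_to_name(note):
    octave = note // 12 - 1  # MIDI octave numbering: 60 // 12 - 1 = 4
    return NOTE_NAMES[note % 12] + str(octave)

# The primer [60, -2, 60, -2, 67, -2, 67, -2] spells "do do sol sol":
for n in [60, 67]:
    print(n, midi_to_name(n))
# 60 → C4, 67 → G4
```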

When executed, a song will be generated.

result

The following songs were made. Starting from the "do do sol sol" intro, a 16-bar song was generated. Sample

Here is the first one starting with "do do sol sol". Sample

The atmosphere changes quite a bit, and you can feel that the primer setting affects the melody that follows.

Thank you very much.
