[PYTHON] I tried using magenta / TensorFlow

Automatic melody generation using magenta / TensorFlow

Background

I wanted to manipulate MIDI data with Python, and while browsing various sites I came across a tool called Magenta.

What is Magenta?

The other day, Google opened a new project called Magenta on GitHub: https://github.com/tensorflow/magenta

Magenta is a project that uses neural networks to generate art and music.

It aims to advance the creative capabilities of machine learning and to build a community of artists and machine learning researchers.

A recurrent neural network that composes

As the first installment of Magenta, a recurrent neural network (RNN) model that composes music has been released. It incorporates a technique called LSTM (Long Short-Term Memory). According to the official Magenta website (http://magenta.tensorflow.org/):

It's purposefully a simple model, so don't expect stellar music results. We'll post more complex models soon.

Magenta was released in 2016. Properly speaking I should study RNNs and LSTMs first, but for now I would like to get a feel for the tool by simply using it.

Environment

References

- Extract only the notes of a specific instrument from a MIDI file and save them as a separate file

Extract a specific part

I examined MIDI data purchased from the YAMAHA online shop (four Hinatazaka46 title songs).

(Screenshot: 2020-04-17 22.53.40.png, the track list of the purchased MIDI data)

As shown above, each file is divided into various tracks (drum part, accompaniment, and so on). As described later, I first ran training on this data as-is, without any preprocessing. Judging from the results, training does not go well when the data mixes many kinds of parts, so it is necessary to focus on the melody alone.

So let's extract only the melody part using the Python package pretty_midi.

Terminal


pip install pretty_midi

For reference, the files are laid out as follows.

(Screenshot: 2020-04-17 22.56.58.png, the file layout)
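The layout assumed by the script below is roughly the following (a sketch reconstructed from the code; the resource*.mid names come from the script):

~/WorkSpace/MIDI_data/cut_out/
├── cut_out.py
├── resource00.mid … resource03.mid   # the purchased MIDI files
└── result/                           # single-track files written by the script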

cut_out.py



import os

import pretty_midi

# os.chdir() does not expand "~", so expand the home directory explicitly
os.chdir(os.path.expanduser("~/WorkSpace/MIDI_data/cut_out"))

def func():
    midi_names = ["00", "01", "02", "03"]

    # The output directory must exist before writing into it
    os.makedirs("./result", exist_ok=True)

    for m_name in midi_names:
        midi_data = pretty_midi.PrettyMIDI("resource" + m_name + ".mid")

        # List the instrument tracks contained in the file
        midi_tracks = midi_data.instruments
        for m_track in midi_tracks:
            print("midi_track = {0}".format(m_track))

        print("select program number => ")
        i = int(input())

        # Take out the track whose program number matches the input
        # (if several tracks share the number, the last one wins)
        ins = None
        for instrument in midi_tracks:
            if instrument.program == i:
                ins = instrument

        # Create a new PrettyMIDI object and add only the selected track
        rev_en_chord = pretty_midi.PrettyMIDI()
        rev_en_chord.instruments.append(ins)

        # Save as a single-track MIDI file
        rev_en_chord.write("./result/ins_" + m_name + ".mid")

func()

This script reads each MIDI file, asks for an instrument (program) number, and saves only that track to a new file.

Correspondence tables for MIDI note numbers: rhythm instrument names in GM, GS, and XG; program numbers, instrument names, and playable ranges.

See the site above for the instrument codes. Most people who handle MIDI data from a program seem to work with these instrument codes.
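pretty_midi can also map General MIDI program numbers (0-127) to instrument names directly, which is a quick way to check the codes without leaving Python. A minimal sketch (the script and file names are placeholders):

list_programs.py

import pretty_midi

# Placeholder file name; point this at one of the purchased files
midi_data = pretty_midi.PrettyMIDI("resource00.mid")

for instrument in midi_data.instruments:
    # program_to_instrument_name() returns the General MIDI name
    # for a program number in the range 0-127
    name = pretty_midi.program_to_instrument_name(instrument.program)
    drum = " (drum track)" if instrument.is_drum else ""
    print("{0}: {1}{2}".format(instrument.program, name, drum))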

Finally, Learning

Start Docker and pull the tensorflow/magenta image. Since TensorBoard will also be used, map the port number.

Terminal


docker run -it -p 6006:6006 -v /tmp/magenta:/magenta-data tensorflow/magenta

Basically, I followed the procedure on the site below, so I will not spell it out here.

PyDataOkinawa/meetup017

However, when I ran the shell script inside the image, it stopped with an error. I had never touched a shell script before, so it took me a long time to decipher it. Probably because my environment differed, the script referred to a program by a different name and some parameter settings were wrong, which caused the error. (It was a good lesson in shell scripts.)

For reference, training worked in my environment after making the following change.

build_dataset.sh



convert_midi_dir_to_note_sequences \
    --midi_dir=$MIDI_DIR \
    --output_file=$SEQUENCES_TFRECORD \
    --recursive

###### Changed to ######

convert_dir_to_note_sequences \
    --midi_dir=$MIDI_DIR \
    --output_file=$SEQUENCES_TFRECORD \
    --recursive
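For context, $MIDI_DIR and $SEQUENCES_TFRECORD are variables defined earlier in build_dataset.sh. With the volume mounted by the docker run command above, they would point at paths along these lines (both values are assumptions, not the script's actual defaults):

MIDI_DIR=/magenta-data/midi
SEQUENCES_TFRECORD=/magenta-data/notesequences.tfrecord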

Training begins... This was also my first time seeing TensorBoard. I have no idea what the charts mean yet, so I need to study more.

(Screenshot: 2020-04-16 17.29.32.png, TensorBoard during training)
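For reference, TensorBoard is started by pointing it at the training log directory. The path below is an assumption (it depends on where the training script writes its checkpoints); port 6006 matches the mapping in the docker run command above.

Terminal

tensorboard --logdir=/magenta-data/logdir --port=6006

After that, it can be viewed in a browser at http://localhost:6006.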

(Screenshot: 2020-04-15 18.37.07.png)

I tried running it on a MacBook for the time being, but it looks tough... I need to get it running on Deep Infinity next week...

With the number of steps set to 1200, training took 1 hour and 30 minutes on a MacBook Pro. The model used is Basic RNN.

(Screenshot: 2020-04-16 18.50.52.png, training output)

Accuracy increased to 0.8.

Generate

MIDI files were then generated from the trained model. The number of steps was set to 500, and the number of generated files to five.
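For reference, in recent Magenta releases the generation step for the melody models is the melody_rnn_generate script; a sketch with the settings above (the run and output directories and the primer melody are placeholders, and the exact entry point and flags may differ in the version used here):

Terminal

melody_rnn_generate \
    --config=basic_rnn \
    --run_dir=/magenta-data/logdir \
    --output_dir=/magenta-data/generated \
    --num_outputs=5 \
    --num_steps=500 \
    --primer_melody="[60]"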

The generated MIDI data was converted to WAV using timidity and uploaded to Google Drive.
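For reference, a single file can be converted like this (the -Ow option selects RIFF WAVE output; the file names are examples):

Terminal

timidity generated_00.mid -Ow -o generated_00.wav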

Google Drive

It took some time to get this far, but listening to the output, I feel that melody lines slightly reminiscent of the original songs have been generated.
