[PYTHON] Create noise-filled audio data with SoX

Our machine learning project had too few voice samples, so we mass-produced additional voice data by mixing noise into the existing recordings. This post describes the procedure we used.

Everything below assumes that the audio files are wav files, so please bear with that limitation.

SoX (Sound eXchange) is a command-line tool that can process audio in many ways. Here I use a few of its features to mass-produce voice data with noise mixed in.

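First, install it. On macOS with Homebrew, the command below worked when this was written; recent Homebrew versions no longer accept formula options such as --with-lame, so there you would drop that flag.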

brew install sox --with-lame

Mix noise with audio data

Let's mix in some noise right away. Noise data can be taken from free sound libraries: searching for something like "free material voice" turns up recordings, such as everyday household sounds, that work well as noise.

sox -m sound.wav -v 0.1 noise.wav noise_mix.wav trim 0 3

The command above mixes sound.wav and noise.wav and writes the result to noise_mix.wav.

-m tells SoX to mix the input files together. -v sets the volume factor of the file that follows it, here the noise (the second input); a value of 1 keeps the original volume, so 0.1 makes the noise much quieter. trim cuts the output; here it keeps the audio from second 0 to second 3.
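As a rough sketch of how the parts of that command fit together, the helper below assembles the same argument list in Python. Note that build_mix_command is a hypothetical name of my own, not something SoX provides, and it only builds the list without running anything.

```python
def build_mix_command(sound, noise, output, noise_volume=0.1, start=0, length=3):
    """Return the SoX argument list for mixing `noise` into `sound`.

    Hypothetical helper for illustration only; it does not run SoX.
    """
    return [
        "sox", "-m",                      # -m: mix the input files together
        sound,                            # first input, at its original volume
        "-v", str(noise_volume),          # -v applies to the file that follows it
        noise,                            # second input, scaled by noise_volume
        output,                           # the last file name is the output
        "trim", str(start), str(length),  # keep `length` seconds from `start`
    ]


cmd = build_mix_command("sound.wav", "noise.wav", "noise_mix.wav")
print(" ".join(cmd))
# prints: sox -m sound.wav -v 0.1 noise.wav noise_mix.wav trim 0 3
```

Keeping the arguments as a list rather than one string also makes it easy to pass them to subprocess later without going through the shell.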

Mass production by running the command from Python

Mass production means running the command in a loop. Python's subprocess module can execute external commands, so I wrote a script that calls SoX inside nested for statements. The code will change depending on your directory structure and file names, so treat it only as an example, but it looks like the following.


import subprocess

# Mix each of 10 voice files with each of 10 noise files at 10 noise volumes.
for sound_idx in range(1, 11):
    for volume in range(1, 11):
        for noise_idx in range(1, 11):
            noise_volume = volume / 10  # 0.1, 0.2, ..., 1.0
            cmd = [
                'sox', '-m', f'sound_{sound_idx}.wav',
                '-v', str(noise_volume), f'noise_{noise_idx}.wav',
                f'{sound_idx}_{volume}_{noise_idx}.wav',
                'trim', '0', '3',
            ]
            # check=True raises CalledProcessError if sox exits with an error.
            subprocess.run(cmd, check=True)

There is surely a cleaner way to write this, so please treat it as a reference only ...
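One possible tidier variant, offered only as a sketch: generate the commands with itertools.product and keep execution separate, so the full list can be inspected before SoX is actually run. The function names (iter_mix_commands, run_all) and the dry_run switch are my own assumptions; the file naming follows the example above.

```python
import itertools
import subprocess


def iter_mix_commands(n_sounds=10, n_volumes=10, n_noises=10):
    """Yield one SoX argument list per (sound, volume, noise) combination."""
    indices = itertools.product(
        range(1, n_sounds + 1), range(1, n_volumes + 1), range(1, n_noises + 1)
    )
    for s, v, n in indices:
        yield [
            "sox", "-m", f"sound_{s}.wav",
            "-v", str(v / 10), f"noise_{n}.wav",  # noise volume 0.1 ... 1.0
            f"{s}_{v}_{n}.wav",
            "trim", "0", "3",
        ]


def run_all(dry_run=True):
    """Print every command when dry_run is True; otherwise run it with SoX."""
    for cmd in iter_mix_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling run_all() prints the commands without touching any files, and run_all(dry_run=False) executes them, which requires SoX to be installed and the input files to exist.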


If the files being mixed have different sample rates, SoX reports an error. I won't cover it in detail here, but you may need to align the sample rates beforehand, for example with SoX itself.
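A minimal sketch of such a check, under a few assumptions: the files are PCM wav files that Python's standard wave module can open, the noise file is the one to be converted, and the helper names (sample_rate, resample_command, plan_resampling) are hypothetical. The conversion itself relies on SoX's -r option placed before the output file, which sets the output sample rate.

```python
import wave


def sample_rate(path):
    """Return the sample rate in Hz of a PCM wav file."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def resample_command(src, dst, rate):
    """Build a SoX argument list that writes `src` to `dst` at `rate` Hz."""
    return ["sox", src, "-r", str(rate), dst]


def plan_resampling(sound_path, noise_path, resampled_noise_path):
    """Return a resample command for the noise file, or None if rates match."""
    target = sample_rate(sound_path)
    if sample_rate(noise_path) == target:
        return None
    return resample_command(noise_path, resampled_noise_path, target)
```

If plan_resampling returns a list, it can be passed to subprocess.run(cmd, check=True) before mixing, and the resampled file is then used in place of the original noise file.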

Reference: http://webdatareport.hatenablog.com/entry/2016/11/06/161304
