A detailed explanation of how to make sounds with Python

Human beings are creatures who want to build speech-synthesis (text-to-speech) software by themselves. It can't be helped. As Pascal said, man is a thinking reed. You see? (See what?)

So, as a kind of preliminary investigation toward that goal, I will try an implementation that makes sounds with Python.

Chapter 0. Language / Modules

- Language: Python

python


import numpy as np
import matplotlib.pyplot as pl
import wave
import struct
import pyaudio

Jupyter Notebook might make playing the sound a little easier (I don't know the details), but I won't use it here.

Chapter 1. First, the sound has to be expressed as a formula ...

Do you all know what sound is? Sound is something like a periodic (?) change in air density. In short, it's a wave. And speaking of waves, that means sin and cos. Hooray! So this time we will use a sine wave given by the following formula: sin(2πnt/s), where n = note_hz and s = sample_hz.

python


sec = 1 #Duration: 1 second
note_hz = 440 #Frequency of the note A ("la")
sample_hz = 44100 #Sampling frequency
t = np.arange(0, sample_hz * sec) #Sample indices covering 1 second (0 to 44099)
wv = np.sin(2 * np.pi * note_hz * t/sample_hz) #Sine wave sin(2πnt/s)

t represents one second of time; in the case above it is a one-dimensional array of 44100 elements. The information in the world we live in is continuous (analog), but unfortunately computers can only handle discrete (digital) data. Therefore, one second is divided into 44100 pieces. [Figure: sampling frequency] (By the way, 44100 Hz is the standard sampling frequency for CDs, and it is about twice the upper limit of the human audible range, roughly 20 kHz. Why twice? Try searching for "Nyquist frequency".)

The argument of sin is 2πnt/s. Start with t/sample_hz = t/s. t increases as 0, 1, 2, ..., 44099, so dividing t by s = 44100 gives 0, 1/44100, 2/44100, ..., 44099/44100, which expresses "one second that advances little by little (1/44100 at a time)".
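As a small sketch of the time axis described here, the following converts the sample indices to seconds and prints the spacing between samples. The name t_sec is introduced only for this illustration and does not appear in the article's code.

```python
import numpy as np

sample_hz = 44100  # sampling frequency (CD standard)
sec = 1            # duration in seconds

t = np.arange(0, sample_hz * sec)  # sample indices 0, 1, ..., 44099
t_sec = t / sample_hz              # the same instants expressed in seconds (hypothetical name)

print(len(t))               # number of samples in one second
print(t_sec[1] - t_sec[0])  # spacing between neighbouring samples, 1/44100 s
print(t_sec[-1])            # last instant, just short of 1 second
print(sample_hz / 2)        # Nyquist frequency: half the sampling frequency
```

Note that the last element is 44099/44100 rather than exactly 1, because np.arange excludes its end value.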

Ignore note_hz = n for a moment and look at np.sin(2 * np.pi * t / sample_hz) = sin(2πt/s). Since t/s can be regarded as a variable that increases from 0 to 1 (or rather, as time in seconds?), 2πt/s inside sin increases from 0 to 2π. In other words, sin(2πt/s) goes around the unit circle exactly once in one second (a wave that oscillates once per second). [Figure: sine wave] Oscillating once per second means that the frequency of this wave is 1 Hz (Hz = 1/s). At a frequency of 1 Hz, however, the wave is inaudible. That's where note_hz = n comes in.

You can freely change the frequency of the wave simply by multiplying 2πt/s by n. For example, if n = 440, sin(2πnt/s) becomes a wave that oscillates 440 times per second (the note "la", i.e. A).
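To see that n really sets the frequency, here is a minimal sketch that wraps the formula in a function and checks the result. The helpers sine_wave and dominant_hz are hypothetical names for this example, and the FFT is used purely as a checking tool; it is not part of the article's method.

```python
import numpy as np

def sine_wave(note_hz, sec=1, sample_hz=44100):
    """Return sin(2*pi*n*t/s) sampled sample_hz times per second."""
    t = np.arange(0, sample_hz * sec)
    return np.sin(2 * np.pi * note_hz * t / sample_hz)

def dominant_hz(wv, sample_hz=44100):
    """Estimate the strongest frequency in wv from the peak of its FFT magnitude."""
    spectrum = np.abs(np.fft.rfft(wv))
    freqs = np.fft.rfftfreq(len(wv), d=1 / sample_hz)
    return freqs[np.argmax(spectrum)]

la = sine_wave(440)       # the "la" from the article
la_high = sine_wave(880)  # doubling n gives the same note one octave higher

print(dominant_hz(la), dominant_hz(la_high))
```

Changing only the first argument is enough to produce a different pitch; the sampling frequency and the duration stay the same.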

This completes the expression of sound in the program. I'll copy the program pasted above again.

python


sec = 1 #Duration: 1 second
note_hz = 440 #Frequency of the note A ("la")
sample_hz = 44100 #Sampling frequency
t = np.arange(0, sample_hz * sec) #Sample indices covering 1 second (0 to 44099)
wv = np.sin(2 * np.pi * note_hz * t/sample_hz) #Sine wave sin(2πnt/s)

Chapter 2. Let's output the sound expressed in the program to a .wav file.

The flow from here is as follows.

  1. Output the created sound as a .wav file.
     1. Convert the sound data to binary with the struct module.
     2. Write the binary data to a .wav file with the wave module.
  2. Play the created sound from the program. (Optional)
     1. Open the created .wav file with the wave module.
     2. Play it with the pyaudio module.
  3. Display the sound waveform as a graph with the matplotlib.pyplot module. (Optional)

As for 3., if you don't care about the waveform, you can skip it entirely. 2. uses a module called pyaudio, but it is troublesome to install on the Python 3.7 series (if you want to install it, please refer to the reference sites at the end of this page), so you can simply play the .wav file created in 1. with Windows Media Player instead.

Now, I will explain how to output as .wav.

1. Binarization

First, binarization. Binarization here means converting the data into binary data (a sequence of bytes). The wave module's writeframes() expects a bytes-like object, so the sample values cannot be written to a .wav file as they are. So let's make them binary!

Let's paste the answer first.

python


max_num = 32767.0 / max(wv) #Scale factor that maps the peak of wv to 32767
wv16 = [int(x * max_num) for x in wv] #Scale each sample and truncate it to an integer
bi_wv = struct.pack("h" * len(wv16), *wv16) #Pack the integers as 16-bit binary data

That's how it looks. (To be honest, it's almost a copy of the site I referred to; is there any etiquette that prohibits copying ...?)

Let's look at the contents of [int(x * max_num) for x in wv], writing wv = W and x = "each element of W" = w. For each element w of W, x * max_num = w * 32767.0 / max(W).

"What is this 32767?!" is probably what you are thinking. It is there because 16-bit data (data expressed as a 16-digit binary number) can take values from -32768 to 32767. (2 to the 16th power is 65536, and half of that is 32768.) The values that w / max(W) can take are -1 to 1, and multiplying by 32767 makes 32767 · (w / max(W)) take values from -32767 to 32767. The waveform data of the sound now fits neatly (or rather, perfectly?) into 16 bits. That is what wv16 does. Whew ...
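The same scaling can also be sketched with NumPy array operations instead of a list comprehension. This is an alternative illustration, not the article's code: it divides by the largest absolute value (the name peak is introduced here) rather than by max(wv), which also covers waves whose negative peak is the larger one.

```python
import numpy as np

sample_hz = 44100
t = np.arange(0, sample_hz)
wv = np.sin(2 * np.pi * 440 * t / sample_hz)

peak = np.max(np.abs(wv))                    # largest absolute amplitude (hypothetical name)
wv16 = (wv / peak * 32767).astype(np.int16)  # scale to -32767..32767, then truncate to 16-bit integers

print(wv16.dtype, wv16.min(), wv16.max())
```

astype(np.int16) drops the fractional part toward zero, just like int() in the list comprehension.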

Then comes the binarization code, bi_wv = struct.pack("h" * len(wv16), *wv16). To be honest, I don't really understand this part; it is a copy. For the time being: struct.pack from the struct module converts the values to binary format, and the "h" in the first argument is the format character for a 2-byte (16-bit) signed integer, repeated once per sample. Hmm.
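To make struct.pack a little less mysterious, here is a small sketch that packs a handful of values and unpacks them again. The "<" prefix (little-endian) is added only so that the byte layout in this example is the same on every machine; the article's plain "h" uses the machine's native byte order. The sample values are arbitrary.

```python
import struct
import numpy as np

samples = [0, 1, -1, 32767, -32768]  # arbitrary values within the 16-bit range

# "<" fixes the byte order to little-endian; each "h" is one signed 16-bit integer
packed = struct.pack("<" + "h" * len(samples), *samples)
print(len(packed))   # 2 bytes per sample
print(packed.hex())  # the raw bytes, shown as hexadecimal

# NumPy can produce the same bytes from an array of 16-bit integers
same = np.array(samples, dtype="<i2").tobytes()
print(packed == same)

# struct.unpack reverses the conversion
print(struct.unpack("<5h", packed))
```

Each value becomes exactly two bytes, which is why the data has to fit into -32768 to 32767 beforehand.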

Yes, binarization is complete!

2. Output .wav file with wave module

I will paste the answer again first.

python


file = wave.open('sin_wave.wav', mode='wb') #Open sin_wave.wav in write mode. (If the file does not exist, it is created.)
param = (1,2,sample_hz,len(wv16),'NONE','not compressed') #Parameters (the 4th is the number of frames, i.e. samples, not bytes)
file.setparams(param) #Set the parameters
file.writeframes(bi_wv) #Write the data
file.close() #Close the file (the parentheses are needed to actually call close)

That's how it looks. Open the file with wave.open(). Specify the file name with the first argument, and choose write mode ('wb') or read mode ('rb') with the second argument, mode=.

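Set the parameters of the .wav file with setparams(). The parameters (param) are, in order from the left:

  1. Number of channels (1 = mono)
  2. Sample width in bytes (2 bytes = 16 bits)
  3. Sampling frequency
  4. Number of frames
  5. Compression type ('NONE' is the only supported type)
  6. Compression name

Then write the binary data (bi_wv) and close the file. It's easy to forget to close the file ...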
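Since forgetting to close the file is easy, one alternative sketch (not the article's code) is to open the file in a with block, which closes it automatically, and to set the parameters one at a time. The output path and the four sample values below are made up for the example.

```python
import os
import struct
import tempfile
import wave

sample_hz = 44100
frames = struct.pack("h" * 4, 0, 1000, 0, -1000)  # four arbitrary 16-bit samples

path = os.path.join(tempfile.mkdtemp(), "example.wav")  # hypothetical output path

# The with block closes the file automatically, even if an error occurs
with wave.open(path, mode="wb") as w:
    w.setnchannels(1)          # mono
    w.setsampwidth(2)          # 2 bytes = 16 bits per sample
    w.setframerate(sample_hz)  # sampling frequency
    w.writeframes(frames)      # the frame count in the header is filled in on write

# Read the header back to confirm what was stored
with wave.open(path, mode="rb") as r:
    params = r.getparams()

print(params)
```

Reading the parameters back with getparams() is a quick way to check that the file was written as intended.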

Alright, it's done!! (Run the script from a terminal or command prompt and check that a .wav file is generated!)

Chapter 3. Opening a player every time is a hassle, so I want to play the sound from the program!

So, first open the file you created earlier with the wave module.

python


file = wave.open('sin_wave.wav', mode='rb')

Now the file is open, properly in read mode. The file part of file = wave.open('sin_wave.wav', mode='rb') is just a variable, so you can use a different name: fairu, wave_no_kiwami_otome, whatever. I only mention it because, when I was a beginner, I mistakenly thought it had to be called file.

Then play the sound with the pyaudio module.

python


p = pyaudio.PyAudio() #Create a PyAudio instance
stream = p.open(
    format = p.get_format_from_width(file.getsampwidth()),
    channels = file.getnchannels(),
    rate = file.getframerate(),
    output = True
    ) #Create an output stream for playing sound.
file.rewind() #Move the read pointer back to the beginning.
chunk = 1024 #I'm not sure why, but the official documentation did this.
data = file.readframes(chunk) #Read one chunk (1024 frames) of waveform data.
while data:
    stream.write(data) #Make a sound by writing data to the stream.
    data = file.readframes(chunk) #Read the next chunk of frames.
stream.close() #Close the stream.
p.terminate() #Terminate PyAudio.

As shown above, the procedure is: 1. create a PyAudio instance, 2. open a stream, 3. write data to the stream to make sound, 4. close the stream, 5. terminate PyAudio.
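The chunked read loop can be tried even without pyaudio installed. The sketch below is only an illustration under that assumption: it writes one second of silence to an in-memory buffer that stands in for sin_wave.wav, then runs the same loop and records the size of each chunk where a real player would call stream.write(data).

```python
import io
import wave

sample_hz = 44100
n_frames = sample_hz  # one second of audio

buf = io.BytesIO()  # in-memory stand-in for sin_wave.wav
with wave.open(buf, mode="wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(sample_hz)
    w.writeframes(bytes(2 * n_frames))  # all-zero bytes = silent 16-bit samples

buf.seek(0)  # go back to the start before reading
chunk = 1024
chunk_sizes = []
with wave.open(buf, mode="rb") as r:
    data = r.readframes(chunk)
    while data:                        # readframes returns b"" once the end is reached
        chunk_sizes.append(len(data))  # a real player would call stream.write(data) here
        data = r.readframes(chunk)

print(len(chunk_sizes), chunk_sizes[0], chunk_sizes[-1])
```

The loop ends on its own because an empty bytes object is falsy, so no explicit frame counting is needed.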

Chapter 4. Waveform display

Well, this article has gotten long. I'm tired, Patrasche. So I'll just drop the code here.

python


pl.plot(t,wv)
pl.show()

How simple! There are plenty of articles about matplotlib.pyplot, so I won't say anything in particular.

In closing

Thank you for reading this far! I did my best on this, the second Qiita article of my life ...

Even so, can I really make speech synthesis software by myself?

Reference site / literature

- [Notes] python wave module - Qiita: It was very helpful.
- Sound programming with python (http://samuiui.com/2019/03/11/python%E3%81%A7%E9%9F%B3%E3%83%97%E3%83%AD%E3%82%B0%E3%83%A9%E3%83%9F%E3%83%B3%E3%82%B0/): A site that seems to use Jupyter notebook. I haven't looked at it much.

Final code

python


import numpy as np
import matplotlib.pyplot as pl
import wave
import struct
import pyaudio

#Chapter1
sec = 1 #Duration: 1 second
note_hz = 440 #Frequency of the note A ("la")
sample_hz = 44100 #Sampling frequency
t = np.arange(0, sample_hz * sec) #Sample indices covering 1 second (0 to 44099)
wv = np.sin(2 * np.pi * note_hz * t/sample_hz) #Sine wave sin(2πnt/s)


#Chapter2
max_num = 32767.0 / max(wv) #Scale factor that maps the peak of wv to 32767
wv16 = [int(x * max_num) for x in wv] #Scale each sample and truncate it to an integer
bi_wv = struct.pack("h" * len(wv16), *wv16) #Pack the integers as 16-bit binary data

file = wave.open('sin_wave.wav', mode='wb') #Open sin_wave.wav in write mode. (If the file does not exist, it is created.)
param = (1,2,sample_hz,len(wv16),'NONE','not compressed') #Parameters (the 4th is the number of frames, i.e. samples, not bytes)
file.setparams(param) #Set the parameters
file.writeframes(bi_wv) #Write the data
file.close() #Close the file (the parentheses are needed to actually call close)

#Chapter3
file = wave.open('sin_wave.wav', mode='rb')

p = pyaudio.PyAudio()
stream = p.open(
    format = p.get_format_from_width(file.getsampwidth()),
    channels = file.getnchannels(),
    rate = file.getframerate(),
    output = True
    )
chunk = 1024
file.rewind()
data = file.readframes(chunk)
while data:
    stream.write(data)
    data = file.readframes(chunk)
stream.close()
p.terminate()

#Chapter4
pl.plot(t,wv)
pl.show()
