8/1: Added a plot of the Fourier transform of the bass sound source and other detailed explanations.
In this post, I compare the Fourier transform of a mix of two sound sources with the sum of the Fourier transforms of the individual sources.
Most waveforms can be expressed as a sum of sine waves of various frequencies. The Fourier transform tells us which sine waves a wave contains and in what proportions; this is often described as converting from the time domain to the frequency domain.
Since we are working with digital signals, we use the discrete Fourier transform (DFT), which, put simply, is the Fourier transform applied to a sampled signal. I used Python 3.7 and NumPy.
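For reference (my addition; the original describes this only in words), the discrete Fourier transform of a length-N sampled signal x_n is

X_k = \sum_{n=0}^{N-1} x_n e^{-2\pi i k n / N}, \quad k = 0, 1, \ldots, N-1

and the magnitude |X_k| tells you how strongly the frequency corresponding to bin k is present in the signal. With 44100 samples per second, bin k corresponds to k Hz, which is why the code below gets 1 Hz resolution.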
import numpy as np
import matplotlib.pyplot as plt
import wave

def wave_open(file):
    wf = wave.open(file, "rb")
    nc = wf.getnchannels()  # Number of channels (mono or stereo)
    qn = wf.getsampwidth()  # Quantization width in bytes (1 byte = 8 bits)
    sr = wf.getframerate()  # Sampling frequency
    fl = wf.getnframes()  # Total frames (fl/sr = duration in seconds)
    data = wf.readframes(fl)  # Read all frames (binary)
    data = np.frombuffer(data, dtype="int16")  # Convert to an integer numpy array
    if nc == 2:
        data = data[::2]  # Stereo: use only one channel
    wf.close()
    return nc, qn, sr, fl, data

w_file_gt = wave_open("DisGtr-F.wav")
nc_gt = w_file_gt[0]  # Mono or stereo
qn_gt = w_file_gt[1]  # Quantization bytes
sr_gt = w_file_gt[2]  # Sampling frequency
fl_gt = w_file_gt[3]  # Total frames

# Use 1 second of data, from x to x+1 [s]
timex = 5
data_size_s = sr_gt * timex
data_size_e = sr_gt * (timex + 1)
data_gt = w_file_gt[4][data_size_s:data_size_e]
The sound data and the sampling frequency were acquired using the wave module. The sound source is a 16-bit, 44.1 kHz guitar track made with a DAW, playing an F major chord. Its waveform is shown below.
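As a quick check, the extracted 1-second slice can be plotted like this (a minimal sketch using the variables above; the original shows the figure but not the code):

# Plot the extracted 1-second guitar waveform, time axis in seconds
t = np.arange(len(data_gt)) / sr_gt + timex
plt.plot(t, data_gt)
plt.xlabel("Time (s)")
plt.ylabel("Amplitude")
plt.show()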
Now let's do the Fourier transform. Before that, though, a word about the "window function". If a segment of the signal is simply cut out between two points in time, the abrupt edges produce spurious frequency components that are not actually there (spectral leakage). Multiplying the data by a window function suppresses this noise, at the cost of slightly lower frequency resolution. Here I use the common Hamming window. For the Fourier transform itself I use NumPy's FFT (fast Fourier transform). Since exactly 1 second was cut out, the data size is 44100 samples, and the frequency resolution is 1 Hz.
# Data size: 1 second of samples equals the sampling frequency
data_size = sr_gt

# Hamming window
h_win = np.hamming(data_size)
data_gt_h = data_gt * h_win

# Fourier transform
fft_data_gt = np.fft.fft(data_gt_h)  # Fast Fourier transform of the windowed data
amp_gt = np.abs(fft_data_gt / (data_size / 2))  # Amplitude spectrum
amp_gt = amp_gt[:data_size // 2]  # Real input gives a symmetric complex spectrum, so keep only the first half

# Frequency axis
freq = np.fft.fftfreq(data_size, d=1/sr_gt)  # d is the sampling period
freq = freq[:data_size // 2]  # Match the truncated FFT data
Now that the Fourier transform has been completed, let's plot it.
plt.plot(freq, amp_gt)
plt.axis([0, 5000, 0, 3000])  # Show 0-5000 Hz, amplitude up to 3000
plt.xlabel("Frequency (Hz)")
plt.ylabel("Amplitude")
plt.show()
The result looks like this. (Everything above 10000 Hz was almost zero, so that region is cut off.)
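If you want to see what the Hamming window actually does, here is a small extra sketch (my addition, not part of the original workflow) that plots the spectrum with and without it:

# Compare windowed vs. unwindowed spectra to see the leakage suppression
amp_raw = np.abs(np.fft.fft(data_gt) / (data_size / 2))[:data_size // 2]
plt.plot(freq, amp_raw, label="No window")
plt.plot(freq, amp_gt, label="Hamming window")
plt.axis([0, 5000, 0, 3000])
plt.legend()
plt.show()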
Now for the main subject: adding the two Fourier transform results element by element. The second sound source is a bass track, also made with a DAW, playing eighth notes of F at the same BPM (tempo) as the guitar track. The FFT procedure is the same as for the guitar. The Fourier transform of the bass source is shown below (frequencies above 3000 Hz are cut off, for the same reason as with the guitar).
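Loading and windowing the bass track follows exactly the same steps as the guitar. A minimal sketch (the file name "Bass-F.wav" is my placeholder; the original does not show this code):

# Load the bass track the same way as the guitar (file name is illustrative)
w_file_ba = wave_open("Bass-F.wav")
data_ba_raw = w_file_ba[4][data_size_s:data_size_e]  # The same 1-second slice
data_ba = data_ba_raw * np.hamming(data_size)  # Hamming window, as before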
Now, let's add them together.
fft_data_ba = np.fft.fft(data_ba)  # FFT of the bass track
amp_ba = np.abs(fft_data_ba / (data_size / 2))[:data_size // 2]  # Same data size as the guitar

# Combine: element-wise addition of the two amplitude spectra
amp_mix = amp_gt + amp_ba

# Plot
plt.plot(freq, amp_mix)
plt.axis([0, 10000, 0, 8000])
plt.xlabel("Frequency (Hz)")
plt.ylabel("Amplitude")
plt.show()
This produces the figure below. The top plot is the Fourier transform of the mixed sound source (the two tracks summed in the time domain); the bottom plot is the sum of the two individual Fourier transforms. In both cases the data were added as-is, without any volume or balance adjustment. Although the peak values show small differences, the overall shapes are almost identical. This means that if you know which frequencies each source contains and in what proportions, you can estimate the frequency distribution of the mix without actually mixing the sources.
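For completeness, here is a sketch of how the top plot can be produced (the original does not show this code): mix the raw 1-second slices in the time domain, then apply the same window and FFT.

# FFT of the time-domain mix, for comparison with amp_mix
data_sum = data_gt.astype(np.float64) + data_ba_raw.astype(np.float64)  # Widen the type to avoid int16 overflow
data_sum_h = data_sum * np.hamming(data_size)  # Same Hamming window
amp_sum = np.abs(np.fft.fft(data_sum_h) / (data_size / 2))[:data_size // 2]

plt.plot(freq, amp_sum)
plt.axis([0, 10000, 0, 8000])
plt.xlabel("Frequency (Hz)")
plt.ylabel("Amplitude")
plt.show()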
In conclusion, the result may seem obvious, but it looks useful when, for example, adjusting the frequency bands of a guitar and a bass so that their sounds don't mask each other. Next, I would like to investigate the cause of the small differences between the Fourier transform of the mixed source and the sum of the individual Fourier transforms. Maybe it is a phase issue: the Fourier transform itself is linear, so the complex spectra add exactly, but what we added here were the magnitudes, which discard phase information. That's all.
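As a quick illustration of that phase point (my addition, using synthetic signals instead of the sound sources), two sine waves of the same frequency but opposite phase cancel completely in the mix, even though each one alone has a nonzero magnitude spectrum:

# Two 440 Hz sines, 180 degrees out of phase over a 1-second window
n = np.arange(data_size)
a = np.sin(2 * np.pi * 440 * n / data_size)
b = np.sin(2 * np.pi * 440 * n / data_size + np.pi)
mag_sum = np.abs(np.fft.fft(a)) + np.abs(np.fft.fft(b))  # Sum of the individual magnitudes
mag_mix = np.abs(np.fft.fft(a + b))  # Magnitude of the mixed signal
print(mag_sum.max(), mag_mix.max())  # The first is large, the second is essentially zero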