When I was reading "Detailed Explanation: Deep Learning with TensorFlow/Keras for Time Series Data Processing", published around May of this year, I came across a part where random numbers are generated.
It said that the seed was set so that the same random numbers would be generated every time, but I didn't fully understand the explanation around here, so I looked into it briefly.
The random numbers generated by random functions are pseudo-random numbers ([Pseudo-random numbers - Wikipedia](https://ja.wikipedia.org/wiki/Pseudo-random numbers)). A pseudo-random number looks like an irregular, random number, but it is actually produced by a deterministic calculation (a fixed algorithm).
When calculating pseudo-random numbers, the value set as the initial state is called the seed.
Since setting this seed to the same number should generate the same pseudo-random numbers every time, I tried generating pseudo-random numbers repeatedly with the seed fixed.
rand_generation.py

```python
import numpy as np

rng = np.random.RandomState(100)
for i in range(10):
    print(rng.randn(5))  # randn: sampling from the standard normal distribution
```

Generation result:

```
[-1.74976547  0.3426804   1.1530358  -0.25243604  0.98132079]
[ 0.51421884  0.22117967 -1.07004333 -0.18949583  0.25500144]
[-0.45802699  0.43516349 -0.58359505  0.81684707  0.67272081]
[-0.10441114 -0.53128038  1.02973269 -0.43813562 -1.11831825]
[ 1.61898166  1.54160517 -0.25187914 -0.84243574  0.18451869]
[ 0.9370822   0.73100034  1.36155613 -0.32623806  0.05567601]
[ 0.22239961 -1.443217   -0.75635231  0.81645401  0.75044476]
[-0.45594693  1.18962227 -1.69061683 -1.35639905 -1.23243451]
[-0.54443916 -0.66817174  0.00731456 -0.61293874  1.29974807]
[-1.73309562 -0.9833101   0.35750775 -1.6135785   1.47071387]
```
Oh, I had thought it would produce a different random number each time it was generated, but it turns out that the second and subsequent generations are newly generated based on the previously generated random numbers. https://teratail.com/questions/15388 (I used this as a reference.)
In the above script, the seed is set to 100 and a pseudo-random list of length 5 is generated 10 times (that is, 50 random numbers in total?), so the output differs each time within a single run.
When I ran the same script again, it generated the same random numbers.
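This can be checked directly without re-running the script: two generators constructed with the same seed produce an identical sequence. A minimal sketch:

```python
import numpy as np

# Two generators seeded with the same value (100, as in the script above)
rng_a = np.random.RandomState(100)
rng_b = np.random.RandomState(100)

a = rng_a.randn(5)
b = rng_b.randn(5)

# The sequences match element for element
print(np.array_equal(a, b))  # True
```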
Setting the same seed always generates the same first pseudo-random number with the same algorithm ↓ The second pseudo-random number is generated based on the first, so if the first is always the same, the second is always the same ↓ ... and so on. The fact that the same pseudo-random numbers are generated every time means that the same pseudo-random number sequence is produced, starting from the point where the seed is set. (If you think about it, it's not really a random number anymore if the same number is always generated.)
Even in machine learning, when evaluating performance using random numbers, if different random numbers are used each time, you may not be able to tell whether a difference in performance comes from the difference in random numbers or from an improvement in the parameters, etc. So it seems fixing the seed has various uses like this.
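As a small illustration of that (a hypothetical data shuffle, not an example from the book): fixing the seed makes a permutation of the data identical across runs, so differences in results can be attributed to the model rather than to the shuffle.

```python
import numpy as np

# Hypothetical dataset: just the indices 0..9
data = np.arange(10)

# Run 1: shuffle with a fixed seed
rng = np.random.RandomState(0)
perm1 = rng.permutation(data)

# Run 2: same fixed seed -> same shuffle
rng = np.random.RandomState(0)
perm2 = rng.permutation(data)

print(np.array_equal(perm1, perm2))  # True
```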
By the way, if you don't set a seed, the seed is apparently made to differ each time by deriving it from a source such as the system time. So that was the mechanism behind it. https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.RandomState.html (I used this as a reference.)
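A quick sketch of the unseeded case (the exact entropy source is an implementation detail):

```python
import numpy as np

# Generators created without an explicit seed are seeded from a fresh
# entropy source (e.g. the OS entropy pool or system time), so two of
# them almost certainly start from different states
rng_a = np.random.RandomState()
rng_b = np.random.RandomState()

a = rng_a.randn(5)
b = rng_b.randn(5)
print(a)
print(b)  # differs from `a` with overwhelming probability
```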
Does that mean that the (n-1)th pseudo-random number is the seed when generating the nth pseudo-random number?
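Not exactly the previous number itself: with NumPy's default Mersenne Twister, the generator carries a larger internal state (624 32-bit words), and the next numbers are a function of that state. This can be probed with RandomState's get_state/set_state methods; a minimal sketch:

```python
import numpy as np

rng = np.random.RandomState(100)
_ = rng.randn(5)          # advance the generator a bit
saved = rng.get_state()   # snapshot the internal state

x = rng.randn(5)          # draw from the saved state...
rng.set_state(saved)      # ...rewind to that state...
y = rng.randn(5)          # ...and the same numbers come out again

print(np.array_equal(x, y))  # True: the next draws depend only on the state
```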