[PYTHON] The value returned by PyTorch's torch.var() is not the (population) variance

Posted on April 13, 2020

0. Who this article is for

- People who have used Python and have a working execution environment
- People who have used PyTorch to some extent
- People who use PyTorch's torch.var()

1. Introduction

Machine learning research today is done mostly in Python, because Python has many libraries (called modules) for fast data analysis and computation. This article uses one of them, **PyTorch**, and looks at its **torch.var()** function. To state the conclusion first: by default, **torch.var()** computes not the population variance (division by the number of samples) but the **unbiased variance (sample variance)** (division by the number of samples minus one). In fact, many statistical libraries seem to mean the **unbiased variance** when they say **variance** (I did not know this, although statisticians apparently take it for granted). The rest of this article demonstrates this with a short program.

Note that this article is essentially a personal memo, so please use it only as a reference. For brevity I may use imprecise expressions or wording in places; I ask for your understanding.

2. Prerequisites

This article assumes that you can already use Python's NumPy and PyTorch to some extent, and proceeds on that basis. For background, see the following article on PyTorch's Tensor type.

What is the Tensor type of pyTorch

3. Trying out torch.var()

Before writing the program, here are the formulas for the mean $\mu$, the variance $\sigma^2$, and the unbiased variance $s^2$.

\mu = \frac{1}{n}\sum_{i=1}^{n} x_i\\
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i-\mu)^2\\
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i-\mu)^2

Here $x_i$ is the $i$-th input sample and $n$ is the number of samples.
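As a rough sketch, the three formulas can be written in plain Python. The function names `mean`, `population_variance`, and `unbiased_variance` are hypothetical helpers introduced only for this illustration, and the sample values are illustrative. The standard library's `statistics` module is used as a cross-check: it draws the same distinction, with `pvariance` dividing by n and `variance` dividing by n - 1.

```python
import statistics


def mean(xs):
    """mu: the sum of the samples divided by n."""
    return sum(xs) / len(xs)


def population_variance(xs):
    """sigma^2: squared deviations from the mean, divided by n."""
    mu = mean(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)


def unbiased_variance(xs):
    """s^2: squared deviations from the mean, divided by n - 1."""
    mu = mean(xs)
    return sum((x - mu) ** 2 for x in xs) / (len(xs) - 1)


samples = [1.0, 2.0, 3.0, 4.0, 5.0]  # illustrative values

print(population_variance(samples))  # 2.0
print(unbiased_variance(samples))    # 2.5

# Cross-check against the standard library:
# pvariance divides by n, variance divides by n - 1.
print(statistics.pvariance(samples))  # 2.0
print(statistics.variance(samples))   # 2.5
```

The only difference between the two variance functions is the divisor, which is exactly the difference this article is about.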

The sample data is defined as follows.

import torch

# Five sample values stored as a 1-D float tensor
a = torch.tensor([1., 2., 3., 4., 5.])
print(a)

------'''Output result below'''--------
tensor([1., 2., 3., 4., 5.])

First, let's compute the variance directly from its definition.

# Mean of all elements
mu = torch.mean(a)
# Variance by definition: mean of the squared deviations from the mean
var = torch.mean((a - mu)**2)
print(var)

------'''Output result below'''--------
tensor(2.)

Here, **torch.mean()** computes the mean of all elements of the input. The variance therefore comes out as 2.0.
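The same result can be checked by hand. Substituting the five sample values 1, 2, 3, 4, 5 into the formulas for $\mu$ and $\sigma^2$ gives:

```latex
\mu = \frac{1 + 2 + 3 + 4 + 5}{5} = 3 \\
\sigma^2 = \frac{(1-3)^2 + (2-3)^2 + (3-3)^2 + (4-3)^2 + (5-3)^2}{5}
         = \frac{4 + 1 + 0 + 1 + 4}{5}
         = \frac{10}{5} = 2
```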

Now let's use PyTorch's **torch.var()**.

# torch.var() called with its default arguments
var = torch.var(a)
print(var)

------'''Output result below'''--------
tensor(2.5000)

The value is different. The reason is that **torch.var()** does not compute the variance defined above: by default it computes the **unbiased variance (sample variance)** of all the input elements, dividing by $n - 1 = 4$ instead of $n = 5$.
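If the variance that divides by $n$ is what you want, one option is the `unbiased` argument of `torch.var()`. The snippet below is a minimal sketch using the same five values; the sum of squared deviations is 10, so the default gives 10 / 4 = 2.5 and `unbiased=False` gives 10 / 5 = 2.0. Newer PyTorch releases also document a `correction` argument for the same purpose, so check the documentation for the version you are using.

```python
import torch

a = torch.tensor([1., 2., 3., 4., 5.])

# Default behaviour: divide by n - 1 (unbiased variance)
unbiased = torch.var(a)

# unbiased=False: divide by n (the variance computed by hand earlier)
biased = torch.var(a, unbiased=False)

print(unbiased)  # tensor(2.5000)
print(biased)    # tensor(2.)
```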

4. Caveats when using torch.var()

One caveat about **torch.var()**: this does not mean you must always avoid it when what you want is the variance. As the formulas show, when the number of samples is very large the two values are almost identical (for $n = 1000$, dividing by 1000 and dividing by 999 give nearly the same result). You do need to be careful when **the number of samples is small**, as in this example.
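To see how quickly the gap closes, the following sketch compares the two variances for increasing sample sizes. The helper `variance_ratio` and the choice of the values 0, 1, ..., n-1 as test data are assumptions made only for this illustration. For any data with non-zero variance, the ratio of the unbiased variance to the variance divided by $n$ is $n / (n - 1)$, which approaches 1 as $n$ grows.

```python
import torch


def variance_ratio(n):
    """Return (unbiased variance) / (variance divided by n) for 0, 1, ..., n-1."""
    x = torch.arange(n, dtype=torch.float64)
    unbiased = torch.var(x)                    # divides by n - 1
    population = torch.var(x, unbiased=False)  # divides by n
    return (unbiased / population).item()


for n in (5, 100, 1000):
    # Print the measured ratio next to the theoretical value n / (n - 1)
    print(n, variance_ratio(n), n / (n - 1))
```

With five samples the two values differ by 25%, while with 1000 samples they differ by about 0.1%.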

5. Closing remarks

This article summarized what I learned about torch.var(). It may be obvious to many readers, but it surprised me, so I wrote it up. I do not have a deep understanding of the precise meaning of variance and unbiased variance, so please be kind when pointing out any mistakes in my wording. There were probably many parts that were hard to read; thank you for reading to the end.
