The story of an ordinary backend engineer, who is a little good at math, studying statistics
Calculate various statistics using Jupyter Notebook.
・ Total
・ Mean
・ Sample variance
・ Unbiased variance
・ Standard deviation
・ Minimum
・ Median
・ Maximum
import numpy as np
import scipy as sp
# Try descriptive statistics using fish data as the example
fish_data = np.array([2,3,3,4,4,4,4,5,5,6])
Pass the array fish_data to NumPy's sum function and store the result in sum_value. (The scipy.sum alias seen in older code was deprecated in SciPy 1.4 and may be missing from recent releases.)
import numpy as np
fish_data = np.array([2,3,3,4,4,4,4,5,5,6])
# Compute the total (sum_value)
sum_value = np.sum(fish_data)
Because the class could not do physical education on a rainy day, the teacher instead told the boys to add up all the numbers from 1 to 100. One boy took only a few tens of seconds to calculate the result.
As you all know, this boy was Gauss, who later became a great figure in the fields of mathematics and physics.
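The anecdote does not say how the boy did it. The trick usually attributed to him is pairing the numbers (1 + 100, 2 + 99, and so on), which gives the closed form n(n + 1) / 2. The following is only an illustrative sketch that compares that formula with a brute-force sum; the variable names are made up for this example.

```python
# Sum of 1..n computed two ways: term by term, and with n * (n + 1) / 2.
n = 100
brute_force = sum(range(1, n + 1))   # add every number from 1 to n
closed_form = n * (n + 1) // 2       # 50 pairs that each add up to 101

print(brute_force, closed_form)
```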
The sample size can be obtained by measuring the length of the array.
import numpy as np
import scipy as sp
fish_data = np.array([2,3,3,4,4,4,4,5,5,6])
# Count the sample size N
N = len(fish_data)
The mean is the total divided by the sample size. It can also be calculated directly with NumPy's mean function.
import numpy as np
fish_data = np.array([2,3,3,4,4,4,4,5,5,6])
# Mean = total / N
avg = np.mean(fish_data)
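To make the definition concrete, here is a minimal sketch that computes the mean as total divided by sample size and compares it with np.mean. It uses NumPy directly; the names total, n, mean_manual, and mean_numpy are illustrative and do not come from the original notebook.

```python
import numpy as np

fish_data = np.array([2, 3, 3, 4, 4, 4, 4, 5, 5, 6])

total = np.sum(fish_data)         # sum of all observations
n = len(fish_data)                # sample size
mean_manual = total / n           # definition: total / sample size
mean_numpy = np.mean(fish_data)   # the same value in one call

print(mean_manual, mean_numpy)
```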
The sample variance is an index of how far the data lie from the mean. It is the average of the squared deviations from the mean.
import numpy as np
fish_data = np.array([2,3,3,4,4,4,4,5,5,6])
# Mean = total / N
avg = np.mean(fish_data)
# Sample size N
N = len(fish_data)
# Sample variance: sum of squared deviations divided by N
sigma = np.sum((fish_data - avg)**2) / N
You can also calculate the sample variance in one step with NumPy's var function.
import numpy as np
fish_data = np.array([2,3,3,4,4,4,4,5,5,6])
# Calculate the sample variance sigma
# (var with ddof = 0 divides by N)
sigma = np.var(fish_data , ddof = 0)
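The intermediate steps of the sample variance are easier to follow when printed out. The sketch below, written against NumPy directly, shows the deviations, their squares, and the final division by N. The names deviations, squared, and sample_var are illustrative only.

```python
import numpy as np

fish_data = np.array([2, 3, 3, 4, 4, 4, 4, 5, 5, 6])

deviations = fish_data - np.mean(fish_data)   # distance of each value from the mean
squared = deviations ** 2                     # squaring removes the sign
sample_var = squared.sum() / len(fish_data)   # divide by the sample size N

print(squared)
print(sample_var, np.var(fish_data, ddof=0))
```

Both printed variances should agree, since np.var with ddof=0 uses the same divisor N.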
The sample variance tends to underestimate the population variance. The unbiased variance removes this bias by dividing by N - 1 instead of N.
import numpy as np
fish_data = np.array([2,3,3,4,4,4,4,5,5,6])
# Mean = total / N
avg = np.mean(fish_data)
# Count the sample size N
N = len(fish_data)
# Unbiased variance: sum of squared deviations divided by N - 1
unb_dist = np.sum((fish_data - avg)**2) / (N-1)
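As a quick check on the effect of the divisor, this sketch computes both variances from the same sum of squared deviations. It is an illustration under the assumption that only the divisor differs; the names squared_dev_sum, sample_var, and unbiased_var are not from the original notebook.

```python
import numpy as np

fish_data = np.array([2, 3, 3, 4, 4, 4, 4, 5, 5, 6])
n = len(fish_data)

squared_dev_sum = np.sum((fish_data - np.mean(fish_data)) ** 2)
sample_var = squared_dev_sum / n           # divides by N
unbiased_var = squared_dev_sum / (n - 1)   # divides by N - 1

# The two estimates differ only by the factor N / (N - 1).
print(sample_var, unbiased_var, unbiased_var / sample_var)
```

Because N - 1 is smaller than N, the unbiased variance is always the larger of the two, and the gap shrinks as the sample grows.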
You can also calculate the unbiased variance in one step with NumPy's var function by passing ddof = 1.
import numpy as np
fish_data = np.array([2,3,3,4,4,4,4,5,5,6])
unb_dist = np.var(fish_data , ddof = 1)
The standard deviation is the quantity behind the "deviation value" (a standardized score) in remarks like "What is Mr. XX's deviation value?" or "Wow, that deviation value is low!"
It is calculated by taking the square root of the unbiased variance.
import numpy as np
fish_data = np.array([2,3,3,4,4,4,4,5,5,6])
# Mean = total / N
avg = np.mean(fish_data)
# Count the sample size N
N = len(fish_data)
# Unbiased variance
unb_dist = np.sum((fish_data - avg)**2) / (N-1)
# Standard deviation: square root of the unbiased variance
std_dev = np.sqrt(unb_dist)
You can also calculate the standard deviation in one step with NumPy's std function.
import numpy as np
fish_data = np.array([2,3,3,4,4,4,4,5,5,6])
# Calculate the standard deviation based on the unbiased variance (ddof = 1)
np.std(fish_data,ddof = 1)
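The two routes to the standard deviation can be compared side by side. The sketch below uses NumPy directly and checks that the square root of the unbiased variance matches np.std with ddof=1; the names std_from_var and std_direct are illustrative.

```python
import numpy as np

fish_data = np.array([2, 3, 3, 4, 4, 4, 4, 5, 5, 6])

unbiased_var = np.var(fish_data, ddof=1)
std_from_var = np.sqrt(unbiased_var)     # square root of the unbiased variance
std_direct = np.std(fish_data, ddof=1)   # the same quantity in one call

print(std_from_var, std_direct)
```

Note that np.std uses ddof=0 by default, so ddof=1 has to be passed explicitly to get the value based on the unbiased variance.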
The minimum is the smallest value in the sample. You can get it with NumPy's min function.
import numpy as np
fish_data = np.array([2,3,3,4,4,4,4,5,5,6])
np.min(fish_data)
The median is the value exactly in the middle of the sorted sample. When the sample size is even, it is the mean of the two middle values. You can calculate it with NumPy's median function.
import numpy as np
fish_data = np.array([2,3,3,4,4,4,4,5,5,6])
np.median(fish_data)
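Since fish_data has an even number of values, there is no single middle element. The sketch below shows one common way to compute the median by hand (sort, then average the two middle values for an even-sized sample) and compares it with np.median. The names sorted_data and median_manual are illustrative.

```python
import numpy as np

fish_data = np.array([2, 3, 3, 4, 4, 4, 4, 5, 5, 6])

sorted_data = np.sort(fish_data)
n = len(sorted_data)
if n % 2 == 1:
    # Odd sample size: take the single middle value.
    median_manual = float(sorted_data[n // 2])
else:
    # Even sample size: average the two middle values.
    median_manual = (sorted_data[n // 2 - 1] + sorted_data[n // 2]) / 2

print(median_manual, np.median(fish_data))
```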
The maximum is the largest value in the sample. You can get it with NumPy's max function.
import numpy as np
fish_data = np.array([2,3,3,4,4,4,4,5,5,6])
np.max(fish_data)
If you can handle numpy and scipy, you can calculate most of the statistics commonly used in descriptive statistics. Even so, make sure you understand how each statistic is obtained.
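As a recap, the statistics covered here can be gathered into one small helper. This is only a sketch: describe is a hypothetical function name invented for this example, and it relies on NumPy array methods rather than on anything specific to the original notebook.

```python
import numpy as np

def describe(data):
    """Return the statistics covered in this article as a dict (illustrative helper)."""
    data = np.asarray(data)
    return {
        "total": data.sum(),
        "n": data.size,
        "mean": data.mean(),
        "sample_variance": data.var(ddof=0),     # divides by N
        "unbiased_variance": data.var(ddof=1),   # divides by N - 1
        "standard_deviation": data.std(ddof=1),  # sqrt of the unbiased variance
        "min": data.min(),
        "median": np.median(data),
        "max": data.max(),
    }

summary = describe([2, 3, 3, 4, 4, 4, 4, 5, 5, 6])
for name, value in summary.items():
    print(f"{name}: {value}")
```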