[Python3] Save the mean and covariance matrix as JSON with pandas

Preface

I have written code that computes and outputs the mean and covariance many times, always ad hoc, so this time I am making a note of it. The content is really simple, with no tricks. Sorry about that.

Situation: I have data in a pandas DataFrame and want the mean and covariance of specific columns X, Y, Z.

Approach

  1. Create a DataFrame containing only the columns X, Y, Z.
  2. Find the mean and covariance of the created DataFrame.
  3. Write the result to a JSON file.

Implementation

The key points are:

  - Use .loc to extract columns from the DataFrame by label.
  - DataFrame provides mean(), cov(), and corr(). The results are pandas objects (a Series for mean(), a DataFrame for cov() and corr()), so take the underlying ndarray via values.
  - To store the result as a list in the dictionary, use the ndarray's tolist().
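
As a small illustration of why tolist() is needed, the sketch below uses hypothetical toy data (the column values are made up so the results are easy to check by hand). It shows that passing an ndarray straight to json raises a TypeError, while the nested lists returned by tolist() serialize without trouble.

```python
import json

from pandas import DataFrame

# Hypothetical toy data, chosen so the results are easy to check by hand.
df = DataFrame({"X": [1.0, 2.0, 3.0], "Y": [2.0, 4.0, 6.0], "W": [0.0, 1.0, 0.0]})

df_xy = df.loc[:, ["X", "Y"]]   # label-based column selection
mean = df_xy.mean()             # pandas Series, indexed by column name
cov = df_xy.cov()               # pandas DataFrame, labeled on both axes

# A raw ndarray is not JSON serializable ...
try:
    json.dumps({"mean": mean.values})
    ndarray_ok = True
except TypeError:
    ndarray_ok = False

# ... but the nested Python lists returned by tolist() are.
mean_list = mean.values.tolist()
cov_list = cov.values.tolist()
encoded = json.dumps({"mean": mean_list, "covariance": cov_list})
```

The full script for the X, Y, Z case follows.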

from pandas import DataFrame
from numpy import random
import json

df = DataFrame(random.randint(0,100,size=(252, 4)), columns=list('XYZW'))
output_data = dict()

# 1. extract XYZ
df_xyz = df.loc[:,list("XYZ")]

# 2-1 mean vector
u = df_xyz.mean()
output_data["mean"] = u.values.tolist()

# 2-2 covariance
s = df_xyz.cov()
output_data["covariance"] = s.values.tolist()

# 3. write the dictionary to a JSON file
with open("out.json", 'w') as f:
    json.dump(output_data, f, indent=2)

The output JSON file looks like this (the exact values differ on every run because the input is random):

{
  "mean": [
    48.34126984126984,
    50.52777777777778,
    51.492063492063494
  ],
  "covariance": [
    [
      877.6360589388478,
      -44.88202744577245,
      -71.94548788971099
    ],
    [
      -44.88202744577245,
      876.4733289065962,
      -32.312527667109336
    ],
    [
      -71.94548788971099,
      -32.312527667109336,
      784.7768291911716
    ]
  ]
}
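
Reading the file back is the mirror image of the steps above. The following is a minimal sketch, not part of the original note: it writes a hypothetical stand-in file with made-up values into a temporary directory, then rebuilds NumPy arrays and a labeled DataFrame from it.

```python
import json
import os
import tempfile

import numpy as np
from pandas import DataFrame

# Hypothetical stand-in for the file written above, with small made-up values.
saved = {"mean": [1.0, 2.0, 3.0],
         "covariance": [[1.0, 0.1, 0.2], [0.1, 2.0, 0.3], [0.2, 0.3, 3.0]]}
path = os.path.join(tempfile.mkdtemp(), "out.json")
with open(path, "w") as f:
    json.dump(saved, f, indent=2)

# Read the file back and rebuild NumPy arrays from the nested lists.
with open(path) as f:
    loaded = json.load(f)
mean = np.array(loaded["mean"])          # shape (3,)
cov = np.array(loaded["covariance"])     # shape (3, 3)

# Column labels are not stored in the JSON, so they have to be supplied again.
labels = list("XYZ")
cov_df = DataFrame(cov, index=labels, columns=labels)
```

Because only the raw values are saved, the order of the columns (X, Y, Z here) is a convention that the reader and writer have to agree on.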

Conclusion

It took some research to arrive at this implementation. The covariance method of DataFrame (DataFrame.cov) is described in the pandas API documentation.

(2020/05/11)

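Addendum

If you want to handle NaN (not a number), you can check for it with an if statement like this, here replacing the result with zeros:

   from numpy import isnan, zeros

   # x: mean vector, S: covariance matrix (both as ndarray)
   x = u.values
   S = s.values
   if isnan(x).any():
       x = zeros(3)
   if isnan(S).any():
       S = zeros((3, 3))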
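
The check matters for the JSON step as well. By default the standard json module writes NaN as a bare NaN token, which is not valid JSON and may be rejected by strict parsers. The sketch below uses a hypothetical mean vector with one missing entry to show the default output, the allow_nan=False option that raises an error instead, and one possible alternative to zero-filling (writing null).

```python
import json
import math

# Hypothetical mean vector with a missing entry.
mean_with_nan = [1.5, float("nan"), 3.0]

# Default behavior: NaN is emitted as a bare NaN token, which is not valid JSON.
default_text = json.dumps(mean_with_nan)

# allow_nan=False turns the silent output into an explicit error.
try:
    json.dumps(mean_with_nan, allow_nan=False)
    strict_ok = True
except ValueError:
    strict_ok = False

# One alternative to zero-filling: map NaN to None, which is written as null.
cleaned = [None if math.isnan(v) else v for v in mean_with_nan]
cleaned_text = json.dumps(cleaned)
```

Whether zeros or null is the better replacement depends on what the consumer of the file expects, so treat both as options rather than a recommendation.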
