Summary of how to read numerical data with Python [CSV, NetCDF, Fortran binary]

Python is a very convenient language for analyzing numerical data, but the first step in any analysis is loading the data. This article therefore summarizes how to read numerical data in several formats into a NumPy array.

In every example below, the contents of the file end up in the variable 'data'.

** Read csv (text) file **

filename.csv


year,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec
2001,-0.4,-0.3,-0.2,-0.1,0,0,-0.1,-0.2,-0.3,-0.4,-0.5,-0.4
2002,-0.3,-0.1,0,0.3,0.4,0.5,0.6,0.7,0.8,1,1,1
2003,0.8,0.5,0,-0.2,-0.2,-0.2,-0.1,0.1,0.3,0.4,0.4,0.4
2004,0.4,0.3,0.1,0,-0.1,0,0,0.2,0.3,0.4,0.4,0.3
2005,0.2,0.1,0.1,0.1,0.2,0.3,0.2,0.1,-0.2,-0.5,-0.7,-0.8
2006,-0.8,-0.7,-0.5,-0.3,-0.2,0.1,0.3,0.4,0.6,0.8,0.9,0.8
2007,0.5,0.2,-0.2,-0.5,-0.6,-0.7,-0.9,-1.1,-1.3,-1.4,-1.5,-1.5
2008,-1.4,-1.1,-0.8,-0.5,-0.1,0.1,0.2,0.2,0.2,0,-0.2,-0.4
2009,-0.5,-0.5,-0.3,0,0.3,0.5,0.7,0.8,0.9,1,1,1.1
2010,1.1,0.9,0.7,0.3,0,-0.4,-0.8,-1.1,-1.3,-1.4,-1.5,-1.4
2011,-1.2,-0.9,-0.7,-0.4,-0.2,-0.2,-0.2,-0.4,-0.6,-0.8,-0.8,-0.7
2012,-0.6,-0.3,-0.1,0.1,0.3,0.4,0.5,0.5,0.4,0.2,0,-0.2
2013,-0.2,-0.3,-0.4,-0.4,-0.5,-0.6,-0.6,-0.5,-0.3,-0.2,-0.1,-0.2
2014,-0.2,-0.1,0,0.2,0.4,0.5,0.5,0.5,0.6,0.7,0.7,0.6
2015,0.5,0.5,0.6,0.8,1.2,99.9,99.9,99.9,99.9,99.9,99.9,99.9

The code below reads text data that looks like the above when opened in Notepad.

import numpy as np
data = np.loadtxt('filename.csv', comments='year', delimiter=',', dtype='float')

** Explanation **

--comments specifies the string that marks the start of a comment. Everything from that string to the end of the line is ignored, so passing 'year' skips the header line, which begins with that string.
--delimiter specifies the field separator. If the fields are separated by whitespace, the delimiter=... argument is not necessary.
--dtype specifies the type the data is read as. The default is float (floating point number). To read the values as integers, use int.
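As a complement to the explanation above, here is a minimal sketch of an alternative way to skip the header, using the skiprows argument of np.loadtxt instead of comments, and then splitting the year column from the monthly values. The text is embedded with io.StringIO only so that the sketch runs on its own, and just three rows of the sample file are used. Treating 99.9 as a missing-value marker is an assumption; the original article does not say what 99.9 means.

```python
import io
import numpy as np

# Header plus three rows copied from the sample file above.
csv_text = """year,Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec
2001,-0.4,-0.3,-0.2,-0.1,0,0,-0.1,-0.2,-0.3,-0.4,-0.5,-0.4
2014,-0.2,-0.1,0,0.2,0.4,0.5,0.5,0.5,0.6,0.7,0.7,0.6
2015,0.5,0.5,0.6,0.8,1.2,99.9,99.9,99.9,99.9,99.9,99.9,99.9
"""

# skiprows=1 drops the first line regardless of what it contains.
data = np.loadtxt(io.StringIO(csv_text), delimiter=',', skiprows=1)

# Column 0 holds the year, columns 1..12 hold the monthly values.
years = data[:, 0].astype(int)
values = data[:, 1:]

# Assumption (not stated in the article): 99.9 marks a missing value.
values = np.where(np.isclose(values, 99.9), np.nan, values)

print(data.shape)    # one row per year, 13 columns
print(years)
print(values[-1])
```

Whether to use comments or skiprows is largely a matter of taste here; skiprows does not depend on the text of the header line.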

** Read NetCDF file **

import netCDF4
nc = netCDF4.Dataset('filename.nc', 'r')
data = nc.variables['varname'][:]

** Explanation **

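--The data is returned as a NumPy array without importing numpy yourself. Depending on the netCDF4 version and settings, it may be a NumPy masked array.
--Replace varname with the name of the variable you want to read.
--Regardless of the number of dimensions, the [:] at the end of the third line reads the whole variable.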

** Reading Fortran binary files **

write_binary_2D.f90


program main

  implicit none

  integer,parameter::N=10,M=20
  integer::i,j
  real,dimension(1:N,1:M)::x

  open(10,file='filename.out',form='unformatted',access='direct',recl=N*4)

  do i = 1,N
     do j = 1,M
        x(i,j) = i+j*2
     end do
  end do

  do j = 1,M
     write(10,rec=j)(x(i,j),i=1,N)
  end do

  close(10)

end program main

Let's read the contents of filename.out created by the above program. The file is a little-endian, 4-byte floating point binary with no record headers, which is commonly called GrADS format.

import numpy as np
N = 10  # Number of values stored in each record.
M = 20  # Total number of records.
# One record holds N little-endian 4-byte floats.
dty = np.dtype([('data', '<' + str(N) + 'f')])
# Open in binary mode so the bytes are read unmodified.
with open('filename.out', 'rb') as f:
    chunk = np.fromfile(f, dtype=dty, count=M)
data = np.array([chunk[j]['data'] for j in range(M)])

** Explanation **

--The last line is a one-line version of the following loop:

data = []
for j in range(M):
    data.append(chunk[j]['data'])

data = np.array(data)

--chunk[k-1] corresponds to the data of record number k in Fortran. So, for example, to retrieve only the data of record number 6, replace the last line with:

   data = chunk[5]['data']
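The record layout described above can also be checked without running the Fortran program. The sketch below writes the same values with NumPy (x(i,j) = i + j*2, one record per j, little-endian 4-byte floats, no record headers) and reads them back with a plain float dtype and reshape, which is an alternative to the structured dtype used in the article rather than the author's method. The temporary file name is hypothetical.

```python
import os
import tempfile

import numpy as np

N = 10  # values per record
M = 20  # number of records

# Build the values the Fortran program writes: record j holds x(1..N, j).
i = np.arange(1, N + 1, dtype='<f4')
j = np.arange(1, M + 1, dtype='<f4')
expected = i[np.newaxis, :] + 2.0 * j[:, np.newaxis]  # shape (M, N)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'filename.out')  # hypothetical location

    # Raw dump: no headers, records stored one after another.
    expected.astype('<f4').tofile(path)

    # Read all N*M floats and give each record its own row.
    data = np.fromfile(path, dtype='<f4', count=N * M).reshape(M, N)

# data[k-1] is Fortran record k; transposing gives x[i-1, j-1].
record6 = data[5]
x = data.T

print(data.shape)
print(record6)
```

Here data[5] plays the same role as chunk[5]['data'] in the article, and x has the Fortran index order with zero-based indices.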
