Reading and writing CSV and JSON files in Python

O'Reilly's book "Data Visualization Beginning with Python and JavaScript" has a well-organized summary of how to read and write CSV and JSON files in Python, so I am making a note of it here. When I checked the code in Jupyter Notebook, a few places raised errors, so the code below differs slightly from the code in the book.

Book official website: https://www.oreilly.co.jp/books/9784873118086/#toc

Operating environment

- Windows 10 Pro 64bit
- Python: 3.6.1
- Anaconda: 4.4.0
- Jupyter Notebook: 1.0.0

Sample data

The following data is used for input and output.

nobel_winners = [
{ 'category': 'Physics',
  'name': 'Albert Einstein',
  'nationality': 'Swiss',
  'sex': 'male',
  'year': 1921},
{ 'category': 'Physics',
  'name': 'Paul Dirac',
  'nationality': 'British',
  'sex': 'male',
  'year': 1933},
{ 'category': 'Chemistry',
  'name': 'Marie Curie',
  'nationality': 'Polish',
  'sex': 'female',
  'year': 1911},
]

Code example

Reading and writing CSV files

To write the list of dictionaries to a CSV file in Python, proceed as follows. nobel_winners[0].keys() takes the first element of the list and returns the keys of that dictionary. To sort the keys, pass them to sorted(cols) and assign the result back to cols. The book sorts them with cols.sort(), but that raised an error in my environment, so the code is written as follows.

cols = nobel_winners[0].keys()
cols = sorted(cols)

with open('data/nobel_winners.csv', 'w') as f:
    f.write(','.join(cols) + '\n')
    
    for o in nobel_winners:
        row = [str(o[col]) for col in cols]
        f.write(','.join(row) + '\n')
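
The error mentioned above is what Python 3 produces, because dict.keys() returns a view object rather than a list, and a view has no sort() method. A minimal sketch of the difference follows; the short `winner` dictionary is only a stand-in for one entry of the sample data.

```python
# In Python 3, dict.keys() returns a view object, not a list.
winner = {'name': 'Albert Einstein', 'category': 'Physics', 'year': 1921}

keys = winner.keys()
has_sort = hasattr(keys, 'sort')  # a dict_keys view has no sort() method
print(type(keys).__name__, has_sort)

# sorted() accepts any iterable and returns a new, sorted list.
cols = sorted(keys)
print(cols)

# Converting to a list first also makes the in-place sort() available.
cols_in_place = list(keys)
cols_in_place.sort()
print(cols_in_place == cols)
```

Either form works; sorted() is simply the shorter one when the sorted list is all that is needed.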

To read the exported nobel_winners.csv, execute the following.

with open('data/nobel_winners.csv', 'r') as f:
    for line in f:
        # Each line already ends with '\n', so suppress print's own newline
        print(line, end='')

Out


category,name,nationality,sex,year
Physics,Albert Einstein,Swiss,male,1921
Physics,Paul Dirac,British,male,1933
Chemistry,Marie Curie,Polish,female,1911

Reading and writing using the csv module

Next, read and write using Python's csv module. If newline='' is not passed to open, running this on Jupyter Notebook leaves an extra blank line after each row written by writer.writerow, so I added it. The csv module documentation also recommends opening files with newline=''.

import csv

with open('data/nobel_winners.csv', 'w', newline='') as f:
    fieldnames = nobel_winners[0].keys()
    fieldnames = sorted(fieldnames)
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    for w in nobel_winners:
        writer.writerow(w)
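
The extra line break can be traced to the csv writer itself, which by default ends every row with '\r\n'. The sketch below uses an in-memory io.StringIO buffer (chosen here only so that no file is needed) to make that visible. The replace() call is a rough simulation, not the real file layer, of what text mode on Windows does when newline='' is not given.

```python
import csv
import io

# StringIO does not translate newlines on write by default,
# so the buffer shows exactly what the csv writer emits.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(['category', 'name'])
raw = buf.getvalue()
print(repr(raw))

# Rough simulation of Windows text mode without newline='':
# every '\n' is written as '\r\n', which turns '\r\n' into '\r\r\n'.
translated = raw.replace('\n', '\r\n')
print(repr(translated))
```

With newline='' the file layer leaves the writer's '\r\n' untouched, which is why the blank lines disappear.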

When reading a file using the csv module, it will be as follows.

import csv

with open('data/nobel_winners.csv', 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

Out


['category', 'name', 'nationality', 'sex', 'year']
['Physics', 'Albert Einstein', 'Swiss', 'male', '1921']
['Physics', 'Paul Dirac', 'British', 'male', '1933']
['Chemistry', 'Marie Curie', 'Polish', 'female', '1911']
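
One practical difference from the hand-written ','.join() version shows up when a field itself contains a comma. The sketch below uses a made-up row (the value 'Curie, Marie' is purely illustrative and not part of the sample data) and an in-memory buffer instead of a file.

```python
import csv
import io

# Hypothetical row whose second field contains a comma
row = ['Chemistry', 'Curie, Marie', '1911']

# Hand-written approach: the embedded comma looks like a separator.
manual_line = ','.join(row)
manual_fields = manual_line.split(',')
print(len(manual_fields))

# csv module: the writer quotes the field and the reader restores it.
buf = io.StringIO()
csv.writer(buf).writerow(row)
buf.seek(0)
parsed = next(csv.reader(buf))
print(parsed)
```

The hand-written version splits the row into one field too many, while the csv round trip returns the original three fields.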

You can also read the CSV data with each row converted to a Python dictionary. The following shows the result.

import csv

with open('data/nobel_winners.csv') as f:
    reader = csv.DictReader(f)
    nobel_winners = list(reader)

for w in nobel_winners:
    w['year'] = int(w['year'])
    
nobel_winners

Out


[OrderedDict([('category', 'Physics'),
              ('name', 'Albert Einstein'),
              ('nationality', 'Swiss'),
              ('sex', 'male'),
              ('year', 1921)]),
 OrderedDict([('category', 'Physics'),
              ('name', 'Paul Dirac'),
              ('nationality', 'British'),
              ('sex', 'male'),
              ('year', 1933)]),
 OrderedDict([('category', 'Chemistry'),
              ('name', 'Marie Curie'),
              ('nationality', 'Polish'),
              ('sex', 'female'),
              ('year', 1911)])]

The csv reader does not infer data types when reading a CSV file. Every value is treated as a string, so year has to be cast to int.
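
When more than one column needs converting, the cast can be driven by a small mapping. This is only a sketch: the `converters` dictionary is a hypothetical helper, not something from the book or the csv module, and the CSV text is an in-memory stand-in for data/nobel_winners.csv.

```python
import csv
import io

# In-memory stand-in for data/nobel_winners.csv
csv_text = (
    "category,name,nationality,sex,year\n"
    "Physics,Albert Einstein,Swiss,male,1921\n"
    "Chemistry,Marie Curie,Polish,female,1911\n"
)

# Hypothetical mapping: column name -> function used to convert its value
converters = {'year': int}

winners = []
for record in csv.DictReader(io.StringIO(csv_text)):
    row = dict(record)  # plain dict copy of the row
    for col, convert in converters.items():
        row[col] = convert(row[col])
    winners.append(row)

print(winners[0]['year'], type(winners[0]['year']).__name__)
```

Columns that are not listed in the mapping stay as strings.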

Reading and writing JSON files

A list of Python dictionaries can be saved to a JSON file with the json module. To save, use the json module's dump function.

import json

with open('data/nobel_winners.json', 'w') as f:
    json.dump(nobel_winners, f)

To read a JSON file, use the json module's load function.

import json

with open('data/nobel_winners.json') as f:
    nobel_winners = json.load(f)

nobel_winners

Out


[{'category': 'Physics',
  'name': 'Albert Einstein',
  'nationality': 'Swiss',
  'sex': 'male',
  'year': 1921},
 {'category': 'Physics',
  'name': 'Paul Dirac',
  'nationality': 'British',
  'sex': 'male',
  'year': 1933},
 {'category': 'Chemistry',
  'name': 'Marie Curie',
  'nationality': 'Polish',
  'sex': 'female',
  'year': 1911}]

When loading with the json module, unlike csv, no casting is required because the integer type of year is preserved.
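
A quick round trip through json.dumps and json.loads illustrates which types survive. The `record` dictionary below is an illustrative example, not part of the sample data. It also shows two conversions that are easy to miss: tuples come back as lists, and non-string keys come back as strings.

```python
import json

# Illustrative record mixing several Python types
record = {'name': 'Paul Dirac', 'year': 1933, 'share': 0.5,
          'alive': False, 'fields': ('Physics',), 1: 'one'}

restored = json.loads(json.dumps(record))

print(type(restored['year']).__name__)   # int survives the round trip
print(restored['fields'])                # the tuple comes back as a list
print(sorted(restored.keys()))           # the integer key 1 is now the string '1'
```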

JSON encoding of datetime values

To encode Python data that contains datetime values, create a custom encoder like the one below.

import datetime
import json

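class JSONDateTimeEncoder(json.JSONEncoder):
    def default(self, obj):
        # Serialize date and datetime objects as ISO 8601 strings
        if isinstance(obj, (datetime.date, datetime.datetime)):
            return obj.isoformat()
        # Let the base class raise TypeError for other unsupported types
        return super().default(obj)

def dumps(obj):
    # Wrapper around json.dumps that always uses the custom encoder
    return json.dumps(obj, cls=JSONDateTimeEncoder)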

now_str = dumps({'time': datetime.datetime.now()})
now_str

Out


'{"time": "2017-09-03T01:03:32.634095"}'

First, subclass JSONEncoder to create a custom encoder that handles dates. Its default method returns the ISO 8601 string from isoformat() when the passed argument obj is a date or datetime object, and otherwise defers to the base class. Then pass the custom encoder as the cls argument of json.dumps.
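
Decoding goes the other way: json.loads returns the ISO string unchanged, so it has to be converted back explicitly. This is a minimal sketch, assuming Python 3.7 or later for datetime.fromisoformat (newer than the 3.6.1 listed in the operating environment). The fixed timestamp is used only so that the result is reproducible.

```python
import datetime
import json

class JSONDateTimeEncoder(json.JSONEncoder):
    def default(self, obj):
        # Serialize date and datetime objects as ISO 8601 strings
        if isinstance(obj, (datetime.date, datetime.datetime)):
            return obj.isoformat()
        return super().default(obj)

# Fixed timestamp (an assumption for illustration) instead of datetime.now()
original = datetime.datetime(2017, 9, 3, 1, 3, 32, 634095)
encoded = json.dumps({'time': original}, cls=JSONDateTimeEncoder)

decoded = json.loads(encoded)
# decoded['time'] is a plain str here; convert it back by hand
restored = datetime.datetime.fromisoformat(decoded['time'])

print(encoded)
print(restored == original)
```

On Python 3.6, datetime.datetime.strptime with an explicit format string would be one alternative for the conversion step.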

References

- Data Visualization Beginning with Python and JavaScript https://www.oreilly.co.jp/books/9784873118086/#toc
- Reading and writing CSV files https://docs.python.jp/3/library/csv.html
