[PYTHON] I convert AWS JSON data to CSV like this

For projects using AWS, there are quite a few times when you want to document a list of AMIs.

In such a case, there are times when you want to get the data in JSON with the AWS official CLI, convert it to CSV, and then compile it into a document. Tabular data, which is common in documents, is very incompatible with JSON, which is a problem.

In such a case, do this. For example, suppose you can get the JSON of the AMI list like this.

images.json


{
    "Images": [
        {
            "VirtualizationType": "hvm", 
            "Name": "hogehoge_1", 
            "Hypervisor": "xen", 
            "SriovNetSupport": "simple", 
            "ImageId": "ami-99999991", 
            "State": "available", 
            "BlockDeviceMappings": [
                {
                    "DeviceName": "/dev/xvda", 
                    "Ebs": {
                        "DeleteOnTermination": true, 
                        "SnapshotId": "snap-9999999d", 
                        "VolumeSize": 100, 
                        "VolumeType": "standard", 
                        "Encrypted": false
                    }
                }
            ], 
            "Architecture": "x86_64", 
            "ImageLocation": "999999999993/hogehoge1", 
            "RootDeviceType": "ebs", 
            "OwnerId": "999999999999", 
            "RootDeviceName": "/dev/xvda", 
            "CreationDate": "2014-12-17T06:35:39.000Z", 
            "Public": false, 
            "ImageType": "machine", 
            "Description": null
        }, 
        {
            "VirtualizationType": "hvm", 
            "Name": "hogehoge_2", 
            "Hypervisor": "xen", 
            "SriovNetSupport": "simple", 
            "ImageId": "ami-99999991", 
            "State": "available", 
            "BlockDeviceMappings": [
                {
                    "DeviceName": "/dev/xvda", 
                    "Ebs": {
                        "DeleteOnTermination": true, 
                        "SnapshotId": "snap-9999999d", 
                        "VolumeSize": 100, 
                        "VolumeType": "standard", 
                        "Encrypted": false
                    }
                }
            ], 
            "Architecture": "x86_64", 
            "ImageLocation": "999999999993/hogehoge1", 
            "RootDeviceType": "ebs", 
            "OwnerId": "999999999999", 
            "RootDeviceName": "/dev/xvda", 
            "CreationDate": "2014-12-17T06:35:39.000Z", 
            "Public": false, 
            "ImageType": "machine", 
            "Description": null
        }, 
    ]
}

Change this json to CSV using python. I made it improvised, so I think there are various rushes, but I'm sorry. I'm really sorry.

sample.py


# coding:UTF-8
import json
import codecs

#Variable definition. Kitto
sourcefilename = "images.json"
outfilename = "outfile.csv"
targetDataName = "Images"
#Function definition
def list2str(srclist,startStr,endStr):
    #If it's not a list, don't convert it.
    if not (isinstance(srclist,list)):
        return srclist
    #Change list to string
    resultStr = ""
    resultStr += startStr
    for item in srclist:
        tmp = ""
        if isinstance(item,str):
            tmp += item +","
        elif isinstance(item,list):
            tmp += list2str(item,"","") + ","
        else:
            tmp += str(item) + ","
        resultStr += tmp
    resultStr += endStr
    return resultStr

#File reading
sourceFile = open(sourcefilename,"r")
sourceData = json.load(sourceFile)
sourceFile.close()
#Get the desired data from json data
targetData = sourceData.get(targetDataName)

#Get header
headerSet = set()
for row in targetData:
    colList = row.keys()
    for colName in colList:
        headerSet.add(colName)
headerstr = ""
for headerName in headerSet:
    headerstr += headerName + ","
#Header acquisition completed

#Data acquisition
datalist = list()
for row in targetData:
    rowstr = ""
    for colName in headerSet:
        if isinstance(row.get(colName),list):
            rowstr += list2str(row.get(colName),"\"","\"") + ","
        elif isinstance(row.get(colName),bool):
            tmpStr = str(row.get(colName))
            rowstr += tmpStr + ","
        elif isinstance(row.get(colName),unicode):
            rowstr += row.get(colName) + ","
        elif isinstance(row.get(colName),type(None)):
            rowstr += "None,"
    datalist.append(rowstr)

#Open the burn file
outfile = codecs.open(outfilename,"w", "shift_jis")
outfile.write(headerstr + "\n")
for rowstr in datalist:
    outfile.write(rowstr + "\n")
outfile.close()

Now you can create CSV data. Once converted to CSV data, it can be opened in Excel and edited in various ways.

I didn't know Python before, so I used to make tools in Java, but Python is very convenient. The AWS CLI requires Python, so there's no need to add new Python. I think I'll study Python in earnest.

I hope you find it useful.

Recommended Posts

I convert AWS JSON data to CSV like this
[Python] Convert CSV file uploaded to S3 to JSON file with AWS Lambda
Convert Excel data to JSON with python
Anyway, I want to check JSON data easily
Convert json format data to txt (using yolo)
Convert json to excel
How to convert JSON file to CSV file with Python Pandas
How to convert Json file to CSV format or EXCEL format
Convert SDF to CSV quickly
Convert Tweepy Status object to JSON
I want to convert vertically held data (long type) to horizontally held data (wide type)
[Data science basics] I tried saving from csv to mysql with python
I want to convert a table converted to PDF in Python back to CSV
I tried to create CSV upload, data processing, download function with Django
[Python] How to convert db file to csv
Convert xml format data to txt format data (yolov3)
Data input / output in Python (CSV, JSON)
How to convert csv to tsv in CLI
[Python] Convert csv file delimiters to tab delimiters
Convert from PDF to CSV with pdfplumber
Convert (compress) formatted JSON string to 1-line JSON
[MNIST] Convert data to PNG for keras
Automatically update CSV files to AWS DynamoDB
Use pandas to convert grid data to row-holding (?) Data
Convert FX 1-minute data to 5-minute data with Python
I want to play with aws with python
I saved the scraped data in CSV!
Convert array (struct) to json with golang
I stumbled on the character code when converting CSV to JSON in Python
Convert csv, tsv data to matrix with python --using MovieLens as an example
Try using django-import-export to add csv data to django
Convert data with shape (number of data, 1) to (number of data,) with numpy.
Just add the python array to the json data
[Python] Convert from DICOM to PNG or CSV
[Introduction to json] No, I was addicted to it. .. .. ♬
I tried to save the data with discord
I want to knock 100 data sciences with Colaboratory
I want to visualize csv files using Vega-Lite!
I tried to get CloudWatch data with Python
[Introduction to Python] How to handle JSON format data
Convert UTF-8 CSV files to read in Excel
I want to get League of Legends data ③
I want to get League of Legends data ②
Write CSV data to AWS-S3 with AWS-Lambda + Python
Convert XLSX to CSV on the command line
I want to get League of Legends data ①
I would like to know about Django pagination.
[AWS] Migrate data from DynamoDB to Aurora MySQL