[PYTHON] Export pandas dataframe to excel

Use pandas to_excel to export the data frame to excel.

First, make sure you can import the Python modules you need.

import pandas as pd
import matplotlib.pyplot as plt
from pandas.io import wb
import xlsxwriter

Data reading

Use the World Bank API to get US and Japanese population and GDP. The indicator string is obtained from the World Bank API.

df_gdp = wb.download(indicator='NY.GDP.PCAP.KD', country=['US', 'JP'], start=1960, end=2013)
df_population = wb.download(indicator='SP.POP.TOTL', country=['US', 'JP'], start=1960, end=2013)

Data confirmation

df_gdp.head(3)
NY.GDP.PCAP.KD
country year
Japan 2013 37432.840747
2012 36800.922307
2011 36203.430066
df_gdp.dtypes
NY.GDP.PCAP.KD    float64
dtype: object
df_gdp.index

MultiIndex(levels=[['Japan', 'United States'], ['1960', '1961', '1962', '1963', '1964', '1965', '1966', '1967', '1968', '1969', '1970', '1971', '1972', '1973', '1974', '1975', '1976', '1977', '1978', '1979', '1980', '1981', '1982', '1983', '1984', '1985', '1986', '1987', '1988', '1989', '1990', '1991', '1992', '1993', '1994', '1995', '1996', '1997', '1998', '1999', '2000', '2001', '2002', '2003', '2004', '2005', '2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013']],
           labels=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...], [53, 52, 51, 50, 49, 48, 47, 46, 45, 44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 53, 52, 51, 50, 49, 48, 47, 46, 45, 44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, ...]],
           names=['country', 'year'])
df_gdp.describe()
NY.GDP.PCAP.KD
count 108.000000
mean 28015.188967
std 10061.123534
min 7079.439251
25% 20152.131354
50% 28858.039661
75% 35165.327764
max 45863.019564

The GDP data frame is:

The population data frame has a similar structure.

Format data

Since it is difficult to use as it is, format the data.

df_gdp.unstack(level=0).head(3)
NY.GDP.PCAP.KD
country Japan United States
year
1960 7079.439251 15469.072967
1961 7728.000388 15564.690585
1962 8338.409056 16262.092906
df_gdp.unstack(level=0).describe()
NY.GDP.PCAP.KD
Japan United States
count 54.000000 54.000000
mean 25134.970999 30895.406935
std 9716.583553 9646.035006
min 7079.439251 15469.072967
25% 17457.921985 22981.450242
50% 26005.632842 30462.082595
75% 33991.192095 40658.654684
max 37432.840747 45863.019564
df_gdp.unstack(level=0).plot(figsize=(16, 4), colormap='seismic')

worldbank_to_excel_16_1.png

Population data is converted in the same way.

df_population.unstack(level=0).head(3)
SP.POP.TOTL
country Japan United States
year
1960 92500572 180671000
1961 94943000 183691000
1962 95832000 186538000
ax = df_population.unstack(level=0).plot(figsize=(16, 4), colormap='seismic')

worldbank_to_excel_19_0.png

When calculating the five-year moving average, it is a little clear that the population of the United States is increasing and that of Japan is flat.

ax = pd.stats.moments.rolling_mean(df_population.unstack(level=0), 5).plot(figsize=(16, 4), colormap='seismic')

worldbank_to_excel_21_0.png

ax = pd.stats.moments.rolling_std(df_population.unstack(level=0)['SP.POP.TOTL'], 5).plot(figsize=(16, 4), colormap='seismic')

worldbank_to_excel_22_0.png

Data combination

Combines two data frames.

pd.concat([df_gdp, df_population], axis=1).unstack(level=0).head(3)
NY.GDP.PCAP.KD SP.POP.TOTL
country Japan United States Japan United States
year
1960 7079.439251 15469.072967 92500572 180671000
1961 7728.000388 15564.690585 94943000 183691000
1962 8338.409056 16262.092906 95832000 186538000
df = pd.concat([df_gdp, df_population], axis=1).unstack(level=0)
df.describe()
NY.GDP.PCAP.KD SP.POP.TOTL
Japan United States Japan United States
count 54.000000 54.000000 5.400000e+01 5.400000e+01
mean 25134.970999 30895.406935 1.171442e+08 2.460156e+08
std 9716.583553 9646.035006 1.115320e+07 4.038823e+07
min 7079.439251 15469.072967 9.250057e+07 1.806710e+08
25% 17457.921985 22981.450242 1.085998e+08 2.123952e+08
50% 26005.632842 30462.082595 1.217915e+08 2.412110e+08
75% 33991.192095 40658.654684 1.268150e+08 2.813818e+08
max 37432.840747 45863.019564 1.278173e+08 3.161288e+08

Let's draw a graph with GDP on the left axis and population on the right axis.

ax = df.plot(figsize=(16, 6), colormap='seismic',
                 secondary_y=[('SP.POP.TOTL', 'Japan'), ('SP.POP.TOTL', 'United States')])
    ax.set_ylabel('GDP')
    _ = ax.right_ax.set_ylabel('Population')

worldbank_to_excel_28_0.png

It can now be handled in a single data frame.

Calculating the variance in 5-year increments, it can be said that Japan's population has been sluggish since around 1980 and GDP has been flat since around 1995. In the case of the United States, there was a wave of population growth around 1990, and the increase in GDP peaked around 2000, and it can be said that there are regular waves.

ax = pd.stats.moments.rolling_var(df, 5).plot(subplots=True, layout=(2, 2), figsize=(16, 6))

worldbank_to_excel_31_0.png

Data output

Output to Excel. Call a method in the data frame, but xlsxwriter Excel export module such as must be installed.

df.to_excel('/data/sample.xlsx', sheet_name='Japan_US')
%ls /data
sample.xlsx

Please open the completed data in Excel. You can also use LibreOffice.

When exporting multiple data frames to separate sheets, the argument is a writer object (ʻExcelWriter`) instead of a filename. Is specified. There is an example in the official documentation, but look for more information on Stack Overflow.

Operating environment

SoftwareVersion
Python3.4.2 64bit [GCC 4.9.1]
IPython2.3.1
OSLinux 3.13.0 24 generic x86_64 with debian 8.0
numpy1.9.1
pandas0.15.1
matplotlib1.4.2
xlsxwriter0.6.4
Mon Dec 08 15:50:50 2014 UTC
$ ipython nbconvert --to markdown /data/worldbank_to_excel.ipynb

Recommended Posts

Export pandas dataframe to excel
Excel-> pandas-> sqlite
[Pandas] Expand the character string to DataFrame
[Python] Add total rows to Pandas DataFrame
Create a dataframe from excel using pandas
[Python] How to read excel file with pandas
[python] Create table from pandas DataFrame to postgres
Convert pandas dataframe elements to regular string type
3D plot Pandas DataFrame
How to use Pandas 2
Convert 202003 to 2020-03 with pandas
Convert json to excel
Python application: Pandas # 3: Dataframe
Convert comma-separated numeric strings to numbers in Pandas DataFrame
Convert from Pandas DataFrame to System.Data.DataTable using Python for .NET
Try to operate an Excel file using Python (Pandas / XlsxWriter) ①
Try to operate an Excel file using Python (Pandas / XlsxWriter) ②
Formatted display of pandas DataFrame
Key additions to pandas 1.1.0 and 1.0.0
Put the lists together in pandas to make a DataFrame
How to use Pandas Rolling
A handy function to add a column anywhere in a Pandas DataFrame
[Python] How to output a pandas table to an excel file
How to read an Excel file (.xlsx) with Pandas [Python]
7rep --Insert Dataframe To Elasitcsearch
How to paste a CSV file into an Excel file using Pandas
Points to note when making pandas read csv of excel output
python / pandas / dataframe / How to get the simplest row / column / index / column
[Python] How to add rows and columns to a table (pandas DataFrame)
How to find the memory address of a Pandas dataframe value
DataFrame of pandas From creating a DataFrame from two lists to writing a file
Python hand play (Pandas / DataFrame beginning)
Pandas / DataFrame Tips for practical use
Function to convert Excel column to number
[Python] Operation memo of pandas DataFrame
[Pandas] Save DataFrame as JSON, load JSON as DataFrame
[Python] How to use Pandas Series
Create a pandas Dataframe from a string.
Save pandas.DataFrame to Excel by sheet
Export Google Analytics Standard to BigQuery
Convert list to DataFrame with python
[Introduction to Python] Let's use pandas
Bulk Insert Pandas DataFrame with psycopg2
I want to do ○○ with Pandas
[Introduction to Python] Let's use pandas
Excel, csv import, export with Django
[Introduction to Python] Let's use pandas
Excel aggregation with Python pandas Part 1
How to get a specific column name and index name in pandas DataFrame
[Python / Tkinter] Search for Pandas DataFrame → Create a simple search form to display