How to read an Excel file (.xlsx) with Pandas [Python]

In this article How to read an Excel file (.xlsx) with Pandas I will write an article about.

How to read an Excel file (.xlsx)

From here on, we will proceed using the jupyter lab. If you haven't downloaded it yet, please do so first.

First, import Pandas.

import pandas as pd

Next is reading the Excel file. Excel is similar to reading a CSV file. See the article below for CSV files. How to read CSV files with Pandas --Qiita

df = pd.read_excel('excel/data.xlsx')
df

I read a file called data.xlsx in a folder called excel.

In Excel, the upper line may be blank. For example, if the first two lines are blank, the columns name will be'unnamed' and the first line will be'NaN'. So, let's pass 2 to the argument of "skiprows", skip the first two lines, and read from the third line.

df = pd.read_excel('excel/data.xlsx', skiprows = 2)

If you change the argument of skiprows, you can read from your favorite line.

You can also explicitly specify the header if you have a columns name. For example, when specifying the third line as the columns name,

df = pd.read_excel('excel/data.xlsx', skiprows = 2, header = [2])

will do. It should be noted here that when specifying the third line, do not write'[3]'. The programming array starts with '0', so when you specify the third line, write'[2]'.

By the way, the initial value of the header is 0, so if you do not write anything, the first line will be the columns name.

You can also give two headers. Pass it in list format as an argument, such as [1,2].

df = pd.read_excel('excel/data.xlsx', skiprows = 2, header = [1,2])

For files without a header, if you give None to the header, the columns name will be automatically numbered.

df = pd.read_excel('excel/data02.xlsx', header = None)

** Specify index ** You can also specify an index. Give index_col the number of the column you want to specify. When specifying the first column as index,

df = pd.read_excel('excel/data02.xlsx', index_col = 0)

Now you can specify the first column as index. Again, don't forget that the array starts at 0.

index_col can also be specified by column name instead of column number. For example, if you want to specify the column'date' as index,

df = pd.read_excel('excel/data02.xlsx', index_col = 'date')

will do.

By the way, as it is, the data type is read as Index, but to read as date,

df = pd.read_excel('excel/data02.xlsx', index_col = 'date', parse_dates = True)

Then you can read it as a date. If you check the data type, it will be “DatatimeIndex”.

Summary

This time, How to read an Excel file (.xlsx) with Pandas I wrote an article about.

There are a lot of things that I don't understand, but I think I managed to grasp them to some extent.

I think that it will be necessary to automate Excel in the future, so I would like to remember it.

Thank you very much.

Note

This article was written by a programming beginner and may be incorrect. Thank you for your understanding. Also, if you notice any mistakes, we would appreciate it if you could point them out. Thank you.

Recommended Posts

How to read an Excel file (.xlsx) with Pandas [Python]
[Python] How to read excel file with pandas
[Python] How to output a pandas table to an excel file
How to read a CSV file with Python 2/3
How to convert JSON file to CSV file with Python Pandas
[Python] How to deal with pandas read_html read error
Create an Excel file with Python3
Try to operate an Excel file using Python (Pandas / XlsxWriter) ①
Try to operate an Excel file using Python (Pandas / XlsxWriter) ②
[Python] How to change EXCEL file saved in xlsb to xlsx
[Python Kivy] How to create an exe file with pyinstaller
How to convert Python to an exe file
Quickly create an excel file with Python #python
How to paste a CSV file into an Excel file using Pandas
How to crop an image with Python + OpenCV
How to read an array with Python's ConfigParser
[Python] How to read a csv file (read_csv method of pandas module)
Read csv with python pandas
How to measure mp3 file playback time with python
High resolution acoustic signal processing (1) --How to read 24-bit wav file with Python
Python: How to use async with
How to switch the configuration file to be read by Python
How to convert an array to a dictionary with Python [Application]
[Python] Write to csv file with Python
Reading, summarizing, visualizing, and exporting time series data to an Excel file with Python
[Python] How to use Pandas Series
Output to csv file with Python
How to get started with Python
How to use FTP with Python
How to calculate date with python
Excel aggregation with Python pandas Part 1
How to make a command to read the configuration file with pyramid
How to create a heatmap with an arbitrary domain in Python
[Python] Explains how to use the format function with an example
[Django] How to read variables / constants defined in an external file
How to use an external editor for Python development with Grasshopper
[ROS2] How to play a bag file with python format launch
[Python] How to convert db file to csv
[Python] Summary of how to use pandas
Read CSV file with python (Download & parse CSV file)
python3 How to install an external module
Convert Excel data to JSON with python
How to work with BigQuery in Python
Let's read the RINEX file with Python ①
How to do portmanteau test with python
Excel aggregation with Python pandas Part 2 Variadic
How to display python Japanese with lolipop
How to enter Japanese with Python curses
[Easy Python] Reading Excel files with pandas
[Python] How to deal with module errors
How to read CSV files in Pandas
How to read problem data with paiza
How to install python3 with docker centos
[Automation] Read mail (msg file) with Python
Excel with Python
Read and format a csv file mixed with comma tabs with Python pandas
Read CSV file with Python and convert it to DataFrame as it is
How to enjoy programming with Minecraft (Ruby, Python)
[REAPER] How to play with Reascript in Python
How to do multi-core parallel processing with python
Strategy on how to monetize with Python Java