I am a liberal arts student, but I was interested in the possibilities of AI, so I enrolled in the AI-specialized school "Aidemy" to study. I am summarizing what I learned there on Qiita so that I can share it with you.

What to learn this time
- Introduction of formats that can be converted with pandas
- Data format conversion using pandas
- Graphing a CSV file

Data format analysis

File input / output using pandas

- HTML, JSON, CSV, and Excel serve different purposes, such as web pages, web APIs, and data organization. You can convert between these data formats using __pandas__.

HTML scraping with pandas

- Basically, HTML tag elements are scraped with BeautifulSoup, but __table elements (<table>)__ are scraped with pandas.
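As a rough illustration of reading a table element with pandas, here is a minimal sketch. The HTML snippet and its contents are made up for this example. It also assumes that an optional HTML parser (lxml, or bs4 together with html5lib) is installed, which "pd.read_html()" needs. If none is available, the sketch simply reports that.

```python
import io

import pandas as pd

# Hypothetical sample page; the table contents are made up for illustration.
html = """
<html><body>
  <h1>Price list</h1>
  <table>
    <tr><th>name</th><th>price</th></tr>
    <tr><td>apple</td><td>120</td></tr>
    <tr><td>banana</td><td>80</td></tr>
  </table>
</body></html>
"""

try:
    # read_html returns a list with one DataFrame per <table> element found.
    tables = pd.read_html(io.StringIO(html))
except ImportError:
    # read_html needs an optional HTML parser (lxml, or bs4 with html5lib).
    tables = []
    print("No HTML parser is installed, so the table could not be read.")

for table in tables:
    print(table)
```

Note that the result is a list, because a page may contain several tables. The heading in the sample page is ignored, since only table elements are picked up.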

About JSON

- JSON is an abbreviation of "JavaScript Object Notation" and is used to exchange data between different programming languages.
- The structure of a JSON file is basically the same as that of a Python dictionary, and is expressed in the form {key: value,}.
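To make the similarity between JSON and Python dictionaries concrete, here is a small sketch using the standard "json" module. The sample record is made up for illustration.

```python
import json

# Hypothetical JSON text, as it might arrive from a web API.
json_text = '{"name": "apple", "price": 120, "tags": ["fruit", "red"]}'

# Parsing a JSON object gives an ordinary Python dictionary.
record = json.loads(json_text)
print(type(record).__name__)
print(record["price"])

# A dictionary can be turned back into JSON text for another program to read.
round_trip = json.dumps(record)
print(round_trip)
```

One difference worth remembering is that JSON keys are always strings, while Python dictionary keys can be other types.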

About CSV files

- CSV stands for "Comma Separated Values". Because of its lightweight and simple data structure, it has long been used for data exchange.
- A CSV file consists only of values separated by commas, such as "a,b,c".
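The following sketch shows how such comma-separated text maps to a table in pandas. The CSV text is made up for illustration and is read from memory rather than from a file. It also assumes the first line holds the column names, which is the default for "pd.read_csv()".

```python
import io

import pandas as pd

# Hypothetical CSV text: one header line, then one line per record.
csv_text = "name,price\napple,120\nbanana,80\n"

# StringIO lets read_csv treat the text like a file.
df = pd.read_csv(io.StringIO(csv_text))

print(df)
print(df["price"].sum())
```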

About Excel

- Excel is, needless to say, spreadsheet software. Since it is widely used, the range of data you can analyze expands once you are able to scrape Excel files.
- As for Excel terminology, the file is called a __"book"__, a table in the file is a __"sheet"__, the vertical direction of a sheet is a __"column"__, the horizontal direction is a __"row"__, and each item is called a __"cell"__.

Data format conversion

Read the file with DataFrame

- Now let's actually convert between the data formats described above. To read a file, use __pd.read_<data format>("file name")__. For example, HTML is "pd.read_html()" and Excel is "pd.read_excel()".

- To write a file, use __df.to_<data format>("file name")__. Note that this is called on a DataFrame object rather than on "pd". For example, to write the DataFrame object "df" to an HTML file, use "df.to_html()".
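The read and write calls described here can be combined into a simple format conversion. The sketch below is only one possible arrangement: the sample data, the file names "items.json" and "items.html", and the temporary output directory are all assumptions made for illustration.

```python
import io
import pathlib
import tempfile

import pandas as pd

# Hypothetical CSV data, read from memory instead of a real file.
csv_text = "name,price\napple,120\nbanana,80\n"
df = pd.read_csv(io.StringIO(csv_text))

# Write into a temporary directory so the example leaves no files behind
# in the working directory.
out_dir = pathlib.Path(tempfile.mkdtemp())
json_path = out_dir / "items.json"
html_path = out_dir / "items.html"

# Writing is done with methods of the DataFrame object.
df.to_json(json_path, orient="records")
df.to_html(html_path, index=False)

# Reading is done with functions of the pandas module.
df_from_json = pd.read_json(json_path, orient="records")
print(df_from_json)
```

Here the CSV data ends up as both a JSON file and an HTML table, and reading the JSON file back gives the same records as the original DataFrame.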

Graph the data in the CSV file

Graphing procedure

- The procedure has three steps: "Read the CSV file (read_csv)", "Create the graph with pandas", and "Draw the graph with matplotlib (plt.show)".
- Of these, "Create the graph with pandas" is new. This is done with __"df.plot()"__.
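The three steps can be sketched as follows. The sales figures are made up for illustration. Also, as an assumption so that the example runs without a display, it selects the non-interactive "Agg" backend and saves the figure to a temporary file where an interactive session would call "plt.show()".

```python
import io
import pathlib
import tempfile

import matplotlib

# Assumption: headless run, so a non-interactive backend is selected.
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical CSV data; the numbers are made up for illustration.
csv_text = "month,sales,cost\n1,100,60\n2,150,80\n3,130,90\n"

# Step 1: read the CSV file, using the month column as the x axis.
df = pd.read_csv(io.StringIO(csv_text), index_col="month")

# Step 2: create the graph with pandas; each numeric column becomes one line.
ax = df.plot()

# Step 3: draw the graph with matplotlib.
# In an interactive session, drop the matplotlib.use("Agg") line above
# and call plt.show() here instead of saving to a file.
out_path = pathlib.Path(tempfile.mkdtemp()) / "sales.png"
plt.savefig(out_path)
```

Because "df.plot()" returns a matplotlib Axes object, titles and axis labels can be added to it before the graph is drawn.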




Summary

- pandas lets you convert data between various data formats.
- Reading another data format into Python and writing one out are expressed as, for example, __"pd.read_csv()"__ and __"df.to_html()"__.
- A CSV file that has been read can be graphed with __df.plot()__.

Thank you for reading until the end.

