Basic data frame operations written by a beginner after a week of learning Python

This is a personal memo about the part where I stumbled while writing the code. Beyond simply reading a data frame in and writing it out, it shows the kind of processing you need when you want to create a new column B that contains 1 if column A contains a circle mark and 2 if it does not.

Code example

The data frame named "dfCsv" in this code is conventionally named "df".

dfex.py


import datetime
import pandas as pd

CSVFILE="Nanna.csv"

def main():
    print(str(datetime.datetime.now())+"\t"+"Start reading the target data.")

    # Read the CSV file into the data frame dfCsv.
    dfCsv = pd.read_csv(CSVFILE, encoding='cp932', header=0)
    print(str(datetime.datetime.now())+"\t"+CSVFILE+":Loading is complete.")
    
    
    #When you add a new column, you can do it like this.
    dfCsv=textSearch(dfCsv)  
    
    # Export the result to result.csv (to_csv creates or overwrites the file).
    dfCsv.to_csv("result.csv")

# Add a new column to an existing data frame.
def textSearch(dfTmp):
    # Start with an empty list.
    # Appending one item per row of the data frame gives a list with as many entries as the data frame has rows.
    profList = []
    for profTxt in dfTmp['profile']:
        profList.append(profTxt)

    retList = []
    for prof in profList:
        if "Japan" in str(prof):
            ret = "Japanese"
        else:
            ret = "not Japanese"
        retList.append(ret)

    # Attach the list built in this subroutine to the passed data frame as a new column.
    dfTmp['Japanese?'] = retList
    return dfTmp

if __name__ == "__main__":
    main()
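
To try the same pattern without Nanna.csv, here is a minimal sketch that runs on a small in-memory data frame. The sample rows and the name text_search are made up for illustration; only the 'profile' and 'Japanese?' column names come from the listing above.

```python
import pandas as pd


def text_search(df_tmp):
    """Add a 'Japanese?' column based on the text in the 'profile' column."""
    ret_list = []
    for prof in df_tmp["profile"]:
        # str() lets the check run on missing values as well as text
        if "Japan" in str(prof):
            ret_list.append("Japanese")
        else:
            ret_list.append("not Japanese")
    # One entry was appended per row, so the list fits as a new column
    df_tmp["Japanese?"] = ret_list
    return df_tmp


# Hypothetical sample data standing in for Nanna.csv
df = pd.DataFrame({"profile": ["Born in Japan", "Lives in Canada", None]})
df = text_search(df)
print(df)
```

Because exactly one value is appended per row, the list always has the same length as the data frame, which is what makes the final column assignment work.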

Commentary

This is the key point of this memo.

    #When you add a new column, you can do it like this.
    dfCsv=textSearch(dfCsv)  

This does not mean "just call a function named textSearch!". textSearch is not provided by pandas; it is defined in this program. If you pass the data frame to a subroutine and process it in this way, you can add a new column to the data frame that stores the processing results.
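
For the column A / column B case mentioned at the top, the same result can probably be obtained without an explicit loop. The following is only a sketch under assumptions: the columns are literally named "A" and "B", the circle mark is the character "○", and the sample rows are invented. It uses pandas' Series.str.contains and NumPy's where.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "○" is assumed to be the circle mark
df = pd.DataFrame({"A": ["○", "×", "○ (confirmed)", None]})

# True where column A contains the mark; na=False treats missing values as "no mark"
has_circle = df["A"].str.contains("○", regex=False, na=False)

# 1 where the mark is present, 2 otherwise
df["B"] = np.where(has_circle, 1, 2)
print(df)
```

The loop-and-append version is easier to follow when each row needs more complicated branching, while this form is shorter when the condition is a simple text match.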
