[PYTHON] Memorandum of methods useful for organizing columns in DataFrame

Methods I found useful after reading Kaggle kernels

I recently started working on Kaggle, and in the kernels I found methods that simplify column processing I had been struggling to write by hand, so I am summarizing them here as a memo. I only briefly cover the usage I needed in the competition I am currently working on, so please see the referenced articles for detailed usage.

When you want to display the value you want

In the competition I'm working on this time, the data was provided as train_data and train_label, and the two CSV files contained duplicate items. In the end these two datasets have to be merged and fed to the model, so the duplicated content must be removed before merging.
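As a rough illustration of that clean-up step, here is a minimal sketch. It assumes that "duplicate items" means columns that appear in both files and that the two tables share a key column; the column names (`id`, `title`, `price`, `label`) and the sample values are hypothetical and are not taken from the competition data.

```python
import pandas as pd

# Hypothetical stand-ins for the contents of train_data and train_label.
train_data = pd.DataFrame({
    "id": [1, 2, 3],
    "title": ["a", "b", "c"],
    "price": [100, 200, 300],
})
train_label = pd.DataFrame({
    "id": [1, 2, 3],
    "title": ["a", "b", "c"],  # same column also exists in train_data
    "label": [0, 1, 0],
})

key = "id"  # assumed join key

# Columns present in both tables, excluding the join key.
shared = [c for c in train_label.columns if c in train_data.columns and c != key]

# Drop the shared columns from one side, then merge on the key.
merged = train_data.merge(train_label.drop(columns=shared), on=key)

print(merged.columns.tolist())
```

Dropping the shared columns from one side before merging avoids the `_x` / `_y` suffixed copies that `merge` would otherwise create for them.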

When you want to apply the same processing to multiple targets, such as grouping by column

-- groupby(['first column name to group by', 'second column name to group by']).<process you want to apply>, for example .mean(). Use it, for instance, to calculate the average price of each group B within group A. Each combination of values in the specified columns appears only once in the result.

-- agg({'column name to process': ['process 1 (min, max, etc.)', 'process 2']}). Convenient to use after groupby when you want several aggregations of the same column at once.
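The two methods above can be sketched as follows. The DataFrame and its column names (`group_a`, `group_b`, `price`) are made up for illustration only.

```python
import pandas as pd

# Hypothetical data: group_b is a sub-group inside group_a.
df = pd.DataFrame({
    "group_a": ["x", "x", "x", "y", "y"],
    "group_b": ["p", "p", "q", "p", "q"],
    "price": [100, 200, 300, 400, 500],
})

# Average price of each group_b within each group_a.
# Each (group_a, group_b) pair appears once, as the row index.
mean_price = df.groupby(["group_a", "group_b"]).mean()

# Several aggregations of the same column at once.
# The resulting columns are a MultiIndex: ("price", "min"), ("price", "max").
stats = df.groupby(["group_a", "group_b"]).agg({"price": ["min", "max"]})

print(mean_price)
print(stats)
```

If you would rather keep the grouping keys as ordinary columns instead of an index, passing `as_index=False` to `groupby` or calling `.reset_index()` on the result are two common options.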

Referenced articles

note.nkmk.me, CUBE SUGAR CONTAINER
