[PYTHON] Convert numeric variables to categorical with thresholds in pandas

I'm writing this down as a note to myself.

This is a fairly narrow topic for a first article, but in feature engineering for machine learning there are times when you want to bin a numeric feature by thresholds, for example:

・ 0 to 1 becomes 1
・ 1 to 3 becomes 2
・ 3 to 10 becomes 3
・ 10 to 20 becomes 4
・ …

A very easy-to-understand example is the [Beaufort scale](https://ja.wikipedia.org/wiki/Beaufort scale), which assigns a number to each range of wind speed.

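```python
# Each tuple is (Beaufort number, lower bound inclusive, upper bound exclusive), wind speed in m/s.
beaufort = [(0, 0, 0.3), (1, 0.3, 1.6), (2, 1.6, 3.4), (3, 3.4, 5.5), (4, 5.5, 8), (5, 8, 10.8), (6, 10.8, 13.9),
            (7, 13.9, 17.2), (8, 17.2, 20.8), (9, 20.8, 24.5), (10, 24.5, 28.5), (11, 28.5, 33), (12, 33, 200)]

# `train` is assumed to be an existing DataFrame with a numeric 'wind_speed' column.
# Rows that fall outside every range (for example NaN) are left as NaN in 'beaufort_scale'.
for number, lower, upper in beaufort:
    in_range = (train['wind_speed'] >= lower) & (train['wind_speed'] < upper)
    train.loc[in_range, 'beaufort_scale'] = number
```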


This sample code is simple, powerful, and easy to use, so keep it in your toolbox!
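
As an alternative to the loop, the same binning can probably be done in one call with `pd.cut`. The following is only a minimal sketch, not the method used in this article: the sample `train` DataFrame and the `edges` list are made up here for illustration, with the edges copied from the thresholds in the `beaufort` list.

```python
import pandas as pd

# Hypothetical sample data standing in for the article's `train` DataFrame.
train = pd.DataFrame({"wind_speed": [0.0, 0.3, 2.5, 8.0, 12.0, 33.0, 150.0]})

# Bin edges copied from the thresholds in the `beaufort` list (14 edges -> 13 bins).
edges = [0, 0.3, 1.6, 3.4, 5.5, 8, 10.8, 13.9, 17.2, 20.8, 24.5, 28.5, 33, 200]

# right=False makes every bin [lower, upper), matching `>= lower` and `< upper` in the loop.
# labels=False returns the integer bin index, which here coincides with the Beaufort number.
train["beaufort_scale"] = pd.cut(
    train["wind_speed"], bins=edges, right=False, labels=False
)

print(train)
```

Values outside the outermost edges come back as NaN, just as in the loop version, so the widest upper bound (200 here) still has to be chosen with the data in mind.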
