[PYTHON] The new plotting library "HiPlot" turned out to be pretty good

Introduction

A few days ago, Facebook Research announced HiPlot, a new data visualization library.

https://github.com/facebookresearch/hiplot

The README on GitHub was so simple that I wondered how good it could really be, but when I actually tried it I felt it had a future, so I'm sharing it here.

Features

HiPlot is a plotting tool that specializes in discovering correlations and patterns in data. That description alone may not make much sense, so please watch the following animation.

titanic_hiplot2.gif

As you can see, it does more than just draw the data: you can interactively select, filter, and exclude data.

The official documentation includes an interactive sample, so please try it out.

Official doc

How to use

You can install it with pip.

pip install hiplot

To use it, just pass HiPlot either a list of dictionaries (one per row) or the path to a CSV file.

import pandas as pd
import hiplot as hip

# pandas → list of dicts → HiPlot
train = pd.read_csv('../input/titanic/train.csv')

# orient='records' must be specified so that each row becomes one dict.
import_dict = train.to_dict(orient='records')
dict_hip = hip.Experiment.from_iterable(import_dict)
dict_hip.display()

# Directly from a CSV file
csv_hip = hip.Experiment.from_csv('../input/titanic/train.csv')
csv_hip.display()
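
For reference, the "dictionary type data" that from_iterable takes is simply a sequence with one dict per row. The sketch below builds such records with only the standard library. The helper names to_number, load_records, and show_records, as well as the inline sample rows, are made up for illustration and are not part of HiPlot.

```python
import csv
import io

# Tiny inline sample shaped like the Titanic CSV (values are illustrative only).
SAMPLE_CSV = """Survived,Pclass,Sex,Age
0,3,male,22
1,1,female,38
1,3,female,26
"""


def to_number(value):
    """Convert numeric strings to int or float; leave other strings unchanged."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def load_records(csv_text):
    """Return one dict per CSV row, with numeric fields converted to numbers."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{key: to_number(val) for key, val in row.items()} for row in reader]


def show_records(records):
    """Hand the records to HiPlot (needs hiplot installed and a notebook to display)."""
    import hiplot as hip  # imported here so the helpers above work without hiplot
    hip.Experiment.from_iterable(records).display()


records = load_records(SAMPLE_CSV)
print(records[0])
```

In a notebook, calling show_records(records) should then draw the same kind of plot as the pandas version above.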

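In addition, the created graph can be saved as HTML. As far as I can tell, to_html() returns the HTML as a string, and passing a file path also writes it to that file (the file name below is just an example).

dict_hip.to_html('titanic_hiplot.html')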

Good points

It is intuitive and easy to use. I think it can be used not only for initial data analysis but in many other situations, such as tuning hyperparameters during training.

Also, because it is extremely lightweight, it can be used without stress.
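
To make the hyperparameter tuning idea a little more concrete, here is a minimal sketch that records one dict per trial and hands the list to HiPlot. Everything in it is hypothetical: fake_validation_loss stands in for a real training run, and the parameter names and ranges are arbitrary choices for illustration.

```python
import random


def fake_validation_loss(lr, dropout, rng):
    """Stand-in for a real training run; returns a made-up loss value."""
    return (lr - 0.01) ** 2 * 1000 + (dropout - 0.3) ** 2 + rng.uniform(0, 0.05)


def run_sweep(n_trials=20, seed=0):
    """Sample random hyperparameters and record one dict per trial."""
    rng = random.Random(seed)
    trials = []
    for trial_id in range(n_trials):
        lr = 10 ** rng.uniform(-4, -1)  # log-uniform learning rate
        dropout = rng.uniform(0.0, 0.6)
        trials.append({
            "trial": trial_id,
            "lr": lr,
            "dropout": dropout,
            "optimizer": rng.choice(["sgd", "adam"]),
            "val_loss": fake_validation_loss(lr, dropout, rng),
        })
    return trials


def show_sweep(trials):
    """Display the trials in HiPlot (needs hiplot installed and a notebook)."""
    import hiplot as hip  # imported here so the sweep itself runs without hiplot
    hip.Experiment.from_iterable(trials).display()


trials = run_sweep()
best = min(trials, key=lambda t: t["val_loss"])
print(best)
```

Calling show_sweep(trials) in a notebook would then let you brush the val_loss axis and see which lr and dropout ranges the low-loss trials fall into.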

Weak points

I have the impression that the feature set is not yet complete, probably because it has only just been released. In particular, it is a little inconvenient that there is no way to undo the previous operation.

In closing

It's still a new tool, so it feels functionally lacking in places, but I think it's a useful one.

When doing EDA on Kaggle, why not feed your data into this library first?
