[Python] Create parallel coordinate plots color-coded by category in Plotly Express

Parallel coordinate plots are hard to color-code with categorical data

Plotly Express, the godlike plotting library (a wrapper library, to be precise) that I worship, makes this kind of plot easy:

import plotly.express as px

px.parallel_categories(
    px.data.tips(), color="size"
)

It's easy to draw parallel coordinate plots for categorical data, but unlike other plots such as `scatter` and `line`, `parallel_categories` does not accept categorical data (a column name, `pandas.Series`, or list) for the `color` parameter.

px.parallel_categories(
    px.data.tips(), color="time"
)

Changing `color` from the `size` column, which holds `int` values, to the categorical `time` column raises the following error:

ValueError: 
    Invalid element(s) received for the 'color' property of parcats.line
        Invalid elements include: ['Dinner', 'Dinner', 'Dinner', 'Dinner', 'Dinner', 'Dinner', 'Dinner', 'Dinner', 'Dinner', 'Dinner']

    The 'color' property is a color and may be specified as:
      - A hex string (e.g. '#ff0000')
      - An rgb/rgba string (e.g. 'rgb(255,0,0)')
      - An hsl/hsla string (e.g. 'hsl(0,100%,50%)')
      - An hsv/hsva string (e.g. 'hsv(0,100%,100%)')
      - A named CSS color:
            aliceblue, antiquewhite, aqua, aquamarine, azure,
            beige, bisque, black, blanchedalmond, blue,
            blueviolet, brown, burlywood, cadetblue,
            chartreuse, chocolate, coral, cornflowerblue,
            cornsilk, crimson, cyan, darkblue, darkcyan,
            darkgoldenrod, darkgray, darkgrey, darkgreen,
            darkkhaki, darkmagenta, darkolivegreen, darkorange,
            darkorchid, darkred, darksalmon, darkseagreen,
            darkslateblue, darkslategray, darkslategrey,
            darkturquoise, darkviolet, deeppink, deepskyblue,
            dimgray, dimgrey, dodgerblue, firebrick,
            floralwhite, forestgreen, fuchsia, gainsboro,
            ghostwhite, gold, goldenrod, gray, grey, green,
            greenyellow, honeydew, hotpink, indianred, indigo,
            ivory, khaki, lavender, lavenderblush, lawngreen,
            lemonchiffon, lightblue, lightcoral, lightcyan,
            lightgoldenrodyellow, lightgray, lightgrey,
            lightgreen, lightpink, lightsalmon, lightseagreen,
            lightskyblue, lightslategray, lightslategrey,
            lightsteelblue, lightyellow, lime, limegreen,
            linen, magenta, maroon, mediumaquamarine,
            mediumblue, mediumorchid, mediumpurple,
            mediumseagreen, mediumslateblue, mediumspringgreen,
            mediumturquoise, mediumvioletred, midnightblue,
            mintcream, mistyrose, moccasin, navajowhite, navy,
            oldlace, olive, olivedrab, orange, orangered,
            orchid, palegoldenrod, palegreen, paleturquoise,
            palevioletred, papayawhip, peachpuff, peru, pink,
            plum, powderblue, purple, red, rosybrown,
            royalblue, rebeccapurple, saddlebrown, salmon,
            sandybrown, seagreen, seashell, sienna, silver,
            skyblue, slateblue, slategray, slategrey, snow,
            springgreen, steelblue, tan, teal, thistle, tomato,
            turquoise, violet, wheat, white, whitesmoke,
            yellow, yellowgreen
      - A number that will be interpreted as a color
        according to parcats.line.colorscale
      - A list or array of any of the above

With `size`, the values matched the case `A number that will be interpreted as a color according to parcats.line.colorscale`, but the strings in `time` match none of the accepted forms.

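However, if you first map each category to an integer and pass the result to `color`:

tips = px.data.tips()
# Assign an integer code to each distinct value of the time column
time_color_map = {t: i for i, t in enumerate(tips["time"].unique())}
colors = tips["time"].map(time_color_map)
px.parallel_categories(
    tips, color=colors
)

the plot is drawn without an error: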

qiita-sample.png

However, the column that I only wanted to use as the color scale value is also displayed as an axis in the plot. That is redundant and ugly ... Is there no way to draw a parallel coordinate plot that is cleanly color-coded by categorical data?

Narrow down the columns to be displayed with dimensions

tips = px.data.tips()
# Map each distinct time value to an integer so it can be used as a color
time_color_map = {t: i for i, t in enumerate(tips["time"].unique())}
colors = tips["time"].map(time_color_map)
px.parallel_categories(
    tips, color=colors, dimensions=["sex", "smoker", "day", "time", "size"]
)

qiita-sample-2.png

The `dimensions` parameter selects which columns are displayed, so you can get the plot you want by leaving the unwanted color column out here!

Summary

Hmmm, but I would still prefer an interface that accepts categorical data such as a `pandas.Series` directly, without any tricks.
