[PYTHON] A must-see for those involved in Materials Informatics! Visualize compound data with a periodic table heat map.

What this article introduces

**INPUT**: Data that assigns an arbitrary value to each element of the periodic table

{'H': 772, 'He': 4, 
'Li': 1822, 'Be': 129, 'B': 511, 'C': 458, 'N': 755, 'F': 1756,
'Na': 1214, 'Mg': 905, ..., 'Np': 58, 'Pu': 57}

For input like this, this article introduces a tool that produces the following:

**OUTPUT**: A periodic table heatmap like the one below

periodic_table_heatmap.png

Who is this for?


This article is mainly for people working in Materials Informatics (MI) who routinely handle large-scale compound data. I think it will be useful for visualizing that data.

What I used


Drawing a periodic table heatmap is implemented in pymatgen (Python Materials Genomics), an open-source Python library for materials analysis developed by the Materials Project. I used the periodic_table_heatmap function in pymatgen.util.plotting.

I tried using it

So I tried it right away.

Environment

$ python -V
Python 3.7.4

Execution code

I based the code on the test code for pymatgen.util.plotting.

periodic_table_heatmap_example.py


#!/usr/bin/env python3

from pymatgen.util.plotting import periodic_table_heatmap

# The following code is based on: https://github.com/materialsproject/pymatgen/blob/master/pymatgen/util/tests/test_plotting.py
def main():
    random_data = {'Te': 0.11083818874391202,
                   'Au': 0.7575629917425387,
                   'Th': 1.2475885304040335,
                   'Ni': -2.0354391922547705}

    plt = periodic_table_heatmap(random_data, cmap="plasma")
    plt.savefig("periodic_table_heatmap.png")


if __name__ == '__main__':
    main()

Execution result

periodic_table_heatmap.png

As you can see, a heatmap reflecting the value of each element in the given data (random_data) was generated in one shot. How about that!

I want a more elaborate output diagram

Personally, I wanted to adjust the following points of the default output (to get a format suitable for papers and presentations), so I wrote my own code with reference to the source code of periodic_table_heatmap.

- Adjustable with the arguments of the original periodic_table_heatmap function:
-- Elements absent from the data are drawn in gray → adjust with blank_color
-- I want to change the color map and the font size of the color bar → adjust with cmap and cbar_label_size
-- I want to label the color bar → adjust with cbar_label

- Not adjustable with the arguments of the original periodic_table_heatmap function (code modification required):
-- I want to draw ruled lines on the periodic table
-- I want to place the element symbol in the center of each cell
-- I want to change the text color of elements absent from the data to something other than black
-- I want to label the lanthanoids and actinoids (maybe this is unnecessary?)
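As a side note on the blank_color item: in the code used in this article, elements absent from the data are given a placeholder value slightly below the data minimum, and the lower bound of the color scale is set just above that placeholder. matplotlib then treats those cells as "under range" and paints them with the color passed to set_under. The following is only a minimal pure-Python sketch of that value assignment; the function name fill_blank_cells and the sample values are hypothetical and are not part of pymatgen.

```python
def fill_blank_cells(elemental_data, symbols):
    """Assign a value to every symbol; absent ones get a placeholder.

    Hypothetical helper for illustration only (not a pymatgen function).
    """
    min_val = min(elemental_data.values())
    blank_value = min_val - 0.01  # placeholder for elements absent from the data
    vmin = min_val - 0.001        # lower bound of the color scale
    cells = {s: elemental_data.get(s, blank_value) for s in symbols}
    return cells, vmin


# Made-up sample data: Cu and Th are deliberately missing.
data = {"Te": 0.11, "Au": 0.76, "Ni": -2.04}
cells, vmin = fill_blank_cells(data, ["Ni", "Cu", "Au", "Te", "Th"])

# Cells below vmin are the ones that would be painted with blank_color.
under = sorted(s for s, v in cells.items() if v < vmin)
print(under)
```

Because the placeholder sits below the lower bound while every real value stays at or above it, even the element with the smallest real value keeps a regular color from the color map.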

The example code below plots a heatmap of how many compounds in an arbitrary compound dataset contain each element.

periodic_table_heatmap.py


#!/usr/bin/env python3

import numpy as np
import collections

from pymatgen.ext.matproj import MPRester
from pymatgen.core.periodic_table import Element
from pymatgen.core.composition import Composition


def mp_query(YOUR_API_KEY):
    mp = MPRester(YOUR_API_KEY)

    # Properties you need: mp-id;
    # spacegroup number; composition formula; band gap
    basic_properties = ['task_id', 'spacegroup.number', 'pretty_formula']
    electronic_properties = ['band_gap']

    all_properties = basic_properties + electronic_properties

    # Query criteria: must include O element; at most 3 types of elements;
    # band gap value exists
    criteria = {"elements": {"$all": ["O"]},
                "nelements": {"$lte": 3},
                "band_gap": {"$exists": True}}

    # Retrieve material property data which satisfy query criteria
    data = mp.query(criteria=criteria, properties=all_properties)
    return data


# The following code is based on: https://pymatgen.org/pymatgen.util.plotting.html#pymatgen.util.plotting.periodic_table_heatmap
def plot_periodic_table_heatmap(elemental_data, cbar_label="",
                                cbar_label_size=14,
                                cmap="YlOrRd", cmap_range=None,
                                blank_color="grey", value_format=None,
                                max_row=9):

    # Convert primitive_elemental data in the form of numpy array for plotting.
    if cmap_range is not None:
        max_val = cmap_range[1]
        min_val = cmap_range[0]
    else:
        max_val = max(elemental_data.values())
        min_val = min(elemental_data.values())

    max_row = min(max_row, 9)

    if max_row <= 0:
        raise ValueError("The input argument 'max_row' must be positive!")

    value_table = np.empty((max_row, 18)) * np.nan
    blank_value = min_val - 0.01

    for el in Element:
        if el.row > max_row:
            continue
        value = elemental_data.get(el.symbol, blank_value)
        value_table[el.row - 1, el.group - 1] = value

    # Initialize the plt object
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots()
    plt.gcf().set_size_inches(12, 8)

    # We set nan type values to masked values (ie blank spaces)
    data_mask = np.ma.masked_invalid(value_table.tolist())
    # changed edgecolors from 'w' to 'k' to draw ruled lines (linewidths stays 2)
    heatmap = ax.pcolor(data_mask, cmap=cmap, edgecolors='k', linewidths=2,
                        vmin=min_val - 0.001, vmax=max_val + 0.001)
    cbar = fig.colorbar(heatmap)

    # Grey out missing elements in input data
    cbar.cmap.set_under(blank_color)

    # Set the colorbar label and tick marks
    cbar.set_label(cbar_label, rotation=270, labelpad=25, size=cbar_label_size)
    cbar.ax.tick_params(labelsize=cbar_label_size)

    # Refine and make the table look nice
    ax.axis('off')
    ax.invert_yaxis()

    # Label each block with corresponding element and value
    for i, row in enumerate(value_table):
        for j, el in enumerate(row):
            if not np.isnan(el):
                symbol = Element.from_row_and_group(i + 1, j + 1).symbol

                # changed from i + 0.25 to i + 0.5
                # fixed symbol color if the element is absent from data
                if el != blank_value:
                    plt.text(j + 0.5, i + 0.5, symbol,
                             horizontalalignment='center',
                             verticalalignment='center', fontsize=16)
                else:
                    plt.text(j + 0.5, i + 0.5, symbol,
                             color="gray",
                             horizontalalignment='center',
                             verticalalignment='center', fontsize=16)

                # draw the value below the centered symbol so they don't overlap
                if el != blank_value and value_format is not None:
                    plt.text(j + 0.5, i + 0.8, value_format % el,
                             horizontalalignment='center',
                             verticalalignment='center', fontsize=10)

            # added special symbols for Lanthanoid & Actinoid elements
            elif (i == 5 and j == 2) or (i == 7 and j == 1):
                plt.text(j + 0.5, i + 0.5, "*",
                         horizontalalignment='center',
                         verticalalignment='center', fontsize=16)
            elif (i == 6 and j == 2) or (i == 8 and j == 1):
                plt.text(j + 0.5, i + 0.5, "†",
                         horizontalalignment='center',
                         verticalalignment='center', fontsize=16)

    plt.tight_layout()
    plt.savefig("periodic_table_heatmap.png")

    return plt


def main():
    # get your API_KEY from here: https://materialsproject.org/open
    YOUR_API_KEY = "YOUR_API_KEY"

    data = mp_query(YOUR_API_KEY=YOUR_API_KEY)

    # collecting total # of each element
    elems = []
    for d in data:
        comp = d["pretty_formula"]
        tmp = list(Composition(comp).as_dict().keys())
        elems = elems + tmp

    # get dictionary of {"each element": total #}
    elem_data = collections.Counter(elems)
    elem_data.pop("O")
    
    plot_periodic_table_heatmap(elem_data,
                                cbar_label_size=16,
                                cbar_label="# of data",
                                cmap="autumn_r",
                                blank_color="white")


if __name__ == '__main__':
    main()

Running the above code produces periodic_table_heatmap.png, the periodic table heatmap shown at the beginning of this article.

periodic_table_heatmap.png
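The element-counting step in main() can also be tried without an API key. The sketch below is only an illustration under a few assumptions: the formulas are a made-up sample standing in for the pretty_formula values returned by the query, and a simple regular expression is used in place of pymatgen's Composition to extract element symbols. The helper name count_elements is hypothetical.

```python
import re
from collections import Counter

# Made-up sample standing in for the "pretty_formula" values from the query.
SAMPLE_FORMULAS = ["Fe2O3", "LiCoO2", "SrTiO3", "Li2O", "TiO2"]


def count_elements(formulas, exclude=("O",)):
    """Count how many formulas contain each element symbol.

    Hypothetical helper: a regex stands in for Composition here, which is
    enough for simple formulas like the ones above.
    """
    counts = Counter()
    for formula in formulas:
        # One uppercase letter plus an optional lowercase letter is a symbol.
        # Using a set counts each element once per compound.
        symbols = set(re.findall(r"[A-Z][a-z]?", formula))
        counts.update(symbols)
    for symbol in exclude:
        counts.pop(symbol, None)  # every compound contains O, so drop it
    return dict(counts)


elem_data = count_elements(SAMPLE_FORMULAS)
print(elem_data)
```

The resulting dictionary has the same shape as the INPUT shown at the top of this article, so it can be passed to the plotting function as is.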

Impressions

I wish I had known about this sooner. Until now I was unaware that periodic_table_heatmap existed and drew heat maps with a rather crude method (too embarrassing to introduce), so I struggled every time the compound data was updated.

I sincerely hope that everyone working in MI will see this article and boost their work efficiency.
