[Python] Calculate entropy for arrays with zero elements in NumPy

When the array contains elements equal to 0, a naive entropy calculation returns nan: np.log(0) is -inf, and 0 * -inf is nan.

>>> import numpy as np
>>> a = np.array([0.1,0.3,0,0.05,0.15,0.6,0])
>>> np.log(a)
array([-2.30258509, -1.2039728 ,        -inf, -2.99573227, -1.89711998,
       -0.51082562,        -inf])
>>> a*np.log(a)
array([-0.23025851, -0.36119184,         nan, -0.14978661, -0.284568  ,
       -0.30649537,         nan])
>>> -sum(a*np.log(a))
nan

In that case, use masked arrays. np.ma.log masks the elements where the logarithm is undefined, and the masked sum skips them.

>>> import numpy as np
>>> a = np.array([0.1,0.3,0,0.05,0.15,0.6,0])
>>> np.ma.log(a)
masked_array(data = [-2.3025850929940455 -1.2039728043259361 -- -2.995732273553991
 -1.8971199848858813 -0.5108256237659907 --],
             mask = [False False  True False False False  True],
       fill_value = 1e+20)

>>> a*np.ma.log(a)
masked_array(data = [-0.23025850929940456 -0.3611918412977808 -- -0.14978661367769955
 -0.28456799773288216 -0.30649537425959444 --],
             mask = [False False  True False False False  True],
       fill_value = 1e+20)

>>> -(a*np.ma.log(a)).sum()
1.3323003362673613
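
The masked-array steps above can be wrapped in a small helper. This is only a sketch: the function name entropy_masked is made up for illustration, and filling the masked entries with 0 before summing is my own choice (it encodes the usual convention that 0 * log(0) counts as 0), not something from the session above.

```python
import numpy as np

def entropy_masked(p):
    """Return -sum(p * log(p)) with natural log, counting 0 * log(0) as 0.

    Hypothetical helper name; the input is not checked or normalized.
    """
    p = np.asarray(p, dtype=float)
    # np.ma.log masks the entries where the logarithm is undefined (here, p == 0).
    terms = p * np.ma.log(p)
    # Replace masked entries with 0 so they contribute nothing to the sum.
    return float(-terms.filled(0.0).sum())

a = np.array([0.1, 0.3, 0, 0.05, 0.15, 0.6, 0])
print(entropy_masked(a))
```

For the array used in this post, this should agree with the value shown above up to floating-point rounding.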

By the way, you can also do it with a plain list comprehension:

>>> import math
>>> import numpy as np
>>> a = np.array([0.1,0.3,0,0.05,0.15,0.6,0])
>>> -sum([v*math.log(v) if v > 0 else 0 for v in a])
1.3323003362673613

Which one is better?
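
One way to approach that question is to time the candidates on your own data. The script below is a rough sketch, not a benchmark: the function names, the random test array, and the repeat count are all assumptions chosen for illustration, and the third variant (boolean indexing to drop the zeros before taking the log) is an extra alternative that does not appear in the post above.

```python
import math
import timeit

import numpy as np

def entropy_masked(p):
    # Masked-array version from this post.
    return -(p * np.ma.log(p)).sum()

def entropy_listcomp(p):
    # List-comprehension version from this post.
    return -sum([v * math.log(v) if v > 0 else 0 for v in p])

def entropy_boolean_index(p):
    # Extra alternative (assumption): keep only the positive entries.
    nz = p[p > 0]
    return -np.sum(nz * np.log(nz))

# Hypothetical test data: 10,000 values, every 10th one set to zero,
# then normalized so the values sum to 1.
rng = np.random.default_rng(0)
p = rng.random(10_000)
p[::10] = 0.0
p /= p.sum()

results = {}
for fn in (entropy_masked, entropy_listcomp, entropy_boolean_index):
    results[fn.__name__] = float(fn(p))
    seconds = timeit.timeit(lambda: fn(p), number=20)
    print(f"{fn.__name__}: value={results[fn.__name__]:.6f}, 20 runs took {seconds:.4f} s")
```

All three should print the same value up to rounding. The timings depend on array size and machine; on large arrays the vectorized versions will likely be faster than the Python-level loop, but it is worth measuring rather than assuming.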
