[PYTHON] I tried to redo the non-negative matrix factorization (NMF)

This is an article that I tried to reimplement and confirm the non-negative matrix factorization (NMF) that I understood about a year ago. Support Vector Machine (SVM) and recently Deep Learning are popular, but I would like to introduce that there is also such a method ... I implemented it in C ++ and Python 2.7 for the first time in a while.

[Updated on 2020/02/05]

I created a mock for real-time sound source separation using NMF. Here

[Updated on March 23, 2018]

Some texts have been revised. In addition, I implemented it in Julia, so I posted the link there.

Development environment

Mac OS X 10.11.6 Elcapitan

Python 2.7.11 (I have to upgrade to 3 system soon ...)

numpy 1.11.1

cmake 3.6.2

Non-negative Matrix Factorization

NMF is an algorithm that approximates the nonnegative (> = 0) matrix V by the product of the other two nonnegative matrices W and H. This algorithm itself is used as a feature extraction, natural language processing, and clustering method.

Also, due to its non-negative value, there is also one that uses the power spectrum in acoustic signal processing, and it is also used to find the pronunciation time of the timbre of a specific musical instrument from the spectrogram.

Mathematical expression

The NMF formula for the matrix under i * j is


k \in R \\

V \approx WH\\

V \in R^{i \times j}\\

W \in R^{i \times k} \qquad H \in R^{k \times j}

It becomes.

Since the contents of the input matrix V are known, but the contents of W and H are unknown, NMF minimizes the distance between V and WH.

Update algorithm

A well-known (and simple) algorithm in NMF is the multiplicative update algorithm. As mentioned above, we must define the distance (error) between X and WH and minimize it. There are three types of distances used in NMF that I know of.

D_{EUC}(V, WH) = (V-WH)^2
D_{KL}(V, WH) = V log \frac{V}{WH} - V + WH

https://en.wikipedia.org/wiki/Itakura–Saito_distance

D_{IS}(V,WH) = \frac{V}{WH} - log \frac{V}{WH} -1

There is also β divergence, which expresses these three distance functions. Use these algorithms as multiplicative update algorithms. Derivation and transformation of detailed expressions are omitted.

The update formula after transformation is


H_{n+1} \gets H_{n} \frac{W^{T}_{n}V_{n}}{W^{T}_{n}W_{n}H_{n}}, \qquad W_{n+1} \gets W_{n} \frac{V_{n}H^{T}_{n+1}}{W_{n}H_{n+1}H^{T}_{n+1}}


H_{n+1} \gets H_{n} \frac{W^{T}_{n}\frac{V_{n}}{W_{n}H_{n}}}{W^{T}_{n}}, \qquad W_{n+1} \gets W_{n} \frac{\frac{V_{n}}{W_{n}H_{n+1}}H^{T}_{n+1}}{H^{T}_{n+1}}


H_{n+1} \gets H_{n} \sqrt{\frac{\frac{W_{n}}{W_{n}H_{n}}^{T}\frac{V_{n}}{W_{n}H_{n}}}{\frac{W_{n}}{W_{n}H_{n}}^{T}}}, \qquad W_{n+1} \gets W_{n} \sqrt{\frac{\frac{V_{n}}{W_{n}H_{n+1}}\frac{H_{n+1}}{W_{n}H_{n+1}}^{T}}{\frac{H_{n+1}}{W_{n}H_{n+1}}^{T}}}

I wondered if it was like this ... (I remember a little bit. I'm sorry if I made a mistake) What you have to be careful about is that H in the formula is {n + 1}! If you forget this, you will end up in a situation where it does not converge at all. By repeating this, the distance between V and WH will be reduced. ..

Reference materials / URL

http://papers.nips.cc/paper/1861-algorithms-for-non-negative-matrix-factorization.pdf http://www.mitpressjournals.org/doi/pdf/10.1162/neco.2008.04-08-771

Implementation

NMF.py



# coding:utf-8

import numpy as np


class NMF():
    """
A class that does simple NMF
    """
    def setParam(self, k, row, column):
        """
        :param k
        :param row:Column
        :param column:line
        :return:
        """
        self.__k = k
        self.__row = row
        self.__column = column

        self.__dictionary = np.random.random_sample([self.__row, self.__k])
        self.__activation = np.random.random_sample([self.__k, self.__column])

    def setDictionary(self, index, data):
        """
If there is a template, set the template (default is random number)
        :param index:Index of which template(0 <= index < k)
        :param data:Template data
        :return:
        """
        if index >= self.__k and len(data) != self.__row:
            print "Please NMF.setParam(k,row,column)"
            print "k = " + str(self.__k)
            print "row = " + str(self.__row)
            return

        self.__dictionary[:, index] = np.array(data[:self.__row], np.float32)

    def setAnalyzData(self, data, k):
        """
Call this method at the very beginning
        :param data:Data to decompose
        :param k
        :return:
        """
        if len(np.shape(data)) == 1:
            self.__data = np.ones([np.shape(data)[0], 1], np.float32)
            self.setParam(k, np.shape(data)[0], 1)
        else:
            self.__data = data
            self.setParam(k, np.shape(data)[0], np.shape(data)[1])


    def separate_euc_with_template(self,iter=200):
        """
Performs separation processing of EUC specifications with templates
        :param iter:
        :return:
        """
        counter = 0

        while counter < iter:
            approx = np.dot(self.__dictionary , self.__activation)

            wh = np.dot(np.transpose(self.__dictionary) , self.__data)
            wt = np.dot(np.transpose(self.__dictionary) , approx)

            bias = wh/wt
            bias[np.isnan(bias)] = 0

            self.__activation = self.__activation * bias
            counter += 1
        return self.__dictionary,self.__activation


    def separate_euc_without_template(self,iter=200):
        """
Separation of EUC specifications without template
        :param iter:
        :return:
        """
        counter = 0

        while counter < iter:
            approx = np.dot(self.__dictionary , self.__activation)

            wh = np.dot(np.transpose(self.__dictionary) , self.__data)
            wt = np.dot(np.transpose(self.__dictionary) , approx)

            bias = wh/wt
            bias[np.isnan(bias)] = 0

            self.__activation = self.__activation * bias

            approx = np.dot(self.__dictionary,self.__activation)
            wh = np.dot(self.__data,np.transpose(self.__activation))
            wt = np.dot(approx,np.transpose(self.__activation))

            bias = wh/wt
            bias[np.isnan(bias)] = 0
            self.__dictionary = self.__dictionary * bias
            counter += 1

        return self.__dictionary,self.__activation

The class that executes NMF itself is implemented this time like this. Due to the problem of sentence volume, only the EUC specifications are listed this time. The code is quite troublesome, but if you don't mind ...

test

I also made a test code (including how to use it), so I ran it.

main.py



#coding:utf-8

import numpy as np

"""
It's a hassle, but I have to import these two
"""
from NMF import NMF


"""
Test script
How to use
"""
if __name__ == "__main__":

    """
Basically a troublesome constructor,row,You don't have to pass the column
I just made it properly ...
    """
    nmf = NMF()

    """
After creating the constructor, call it first!
Where k,row,Since the column is set, setDictionary will not work unless this is called!
    """
    nmf.setAnalyzData([[1,2,3,4],[2,3,4,5]],k=3)

    """
Set the template here
    """
    nmf.setDictionary(0,[0.0,2.0])
    nmf.setDictionary(1,[1.0,6.0])
    nmf.setDictionary(2,[11.0,10.0])

    """
Start NMF
Pass the algorithm and the number of iterative updates as arguments
    """
    dic,act = nmf.separate_euc_with_template(iter=200)
    # dic,act = nmf.separate_euc_without_template(iter=200)
    """
Result display
    """
    print "=========================== Dictionary ==========================="
    print dic
    print "=========================== Activation ==========================="
    print act

    """
Check if it can be disassembled properly
    """
    print "=========================== Approx ==========================="
    print np.dot(dic,act)



In this test


V = \begin{pmatrix}
1 & 2 & 3 & 4\\
2 & 3 & 4 & 5
\end{pmatrix}
\qquad k=3

NMF is performed on a matrix with such conditions. For this matrix, the dictionary matrix set by setDictionary ()


W = \begin{pmatrix}
0 & 1 & 11 \\
2 & 6 & 10
\end{pmatrix}

Update the excitation matrix using.

test results

When you run the main.py script ...


$ python main.py
=========================== Dictionary ===========================
[[  0.   1.  11.]
 [  2.   6.  10.]]
=========================== Activation ===========================
[[ 0.11776982  0.05902263  0.00976704  0.33375605]
 [ 0.168019    0.20895539  0.24616295  0.1367387 ]
 [ 0.07563464  0.16282224  0.25034882  0.35120557]]
=========================== Approx ===========================
[[ 1.  2.  3.  4.]
 [ 2.  3.  4.  5.]]

Properly, the excitation matrix was updated so that the distance between V and WH became 0.

Source code

I omitted this time, including the KL / IS version, which is available on GitHub. In addition, the C ++ version source code (the structure is almost the same as the Python version), which is omitted this time, is also released. Click here for Python version (https://github.com/T-Sumida/Simple_NMF) Click here for C ++ version (https://github.com/T-Sumida/NMF_for_cpp) Julia version is here

Finally···

It was a good review to write sentences and mathematical formulas after a long time of studying about a year ago.

As I wrote at the beginning, DNN etc. are attracting attention these days, but there is also such a simple algorithm! I hope it will be an introduction.

Next time, I would like to introduce a practical example using this.

Recommended Posts

I tried to redo the non-negative matrix factorization (NMF)
I tried non-negative matrix factorization (NMF) with TensorFlow
Non-negative Matrix Factorization (NMF) with scikit-learn
I tried to move the ball
I tried to estimate the interval.
I tried to summarize the umask command
I tried to recognize the wake word
I tried to summarize the graphical modeling.
I tried to estimate the pi stochastically
I tried to touch the COTOHA API
Visualization of NMF (Nonnegative Matrix Factorization) Learning
I tried web scraping to analyze the lyrics.
I tried to touch the API of ebay
I tried to correct the keystone of the image
Qiita Job I tried to analyze the job offer
LeetCode I tried to summarize the simple ones
I tried to implement the traveling salesman problem
Initial value problem of NMF (Nonnegative Matrix Factorization)
I tried to predict the price of ETF
I tried to vectorize the lyrics of Hinatazaka46!
I tried to debug.
I tried to paste
I tried to learn the sin function with chainer
I tried to graph the packages installed in Python
I tried to detect the iris from the camera image
I tried to summarize the basic form of GPLVM
I tried to predict the J-League match (data analysis)
I tried to solve the soma cube with python
I tried to approximate the sin function using chainer
I tried to put pytest into the actual battle
[Python] I tried to graph the top 10 eyeshadow rankings
I tried to visualize the spacha information of VTuber
I tried to solve the problem with Python Vol.1
I tried to simulate the dollar cost averaging method
I tried to identify the language using CNN + Melspectogram
I tried to notify the honeypot report on LINE
I tried to complement the knowledge graph using OpenKE
I tried to classify the voices of voice actors
I tried to compress the image using machine learning
I tried to summarize the string operations of Python
I tried to learn PredNet
I tried to organize SVM.
I tried to implement PCANet
I tried the changefinder library!
I tried to reintroduce Linux
I tried to introduce Pylint
I tried to summarize SparseMatrix
I tried to touch jupyter
I tried to implement StarGAN (1)
I tried to find the entropy of the image with python
I tried to find out the outline about Big Gorilla
I tried to introduce the block diagram generation tool blockdiag
I tried porting the code written for TensorFlow to Theano
I tried to simulate how the infection spreads with Python
I tried to analyze the whole novel "Weathering with You" ☔️
[First COTOHA API] I tried to summarize the old story
I tried to get the location information of Odakyu Bus
I tried to find the average of the sequence with TensorFlow
I tried to notify the train delay information with LINE Notify
I tried to simulate ad optimization using the bandit algorithm.
I tried to illustrate the time and time in C language