[Python] Extracting features from sensor data with a CNN

I tried convolution on sensor data (one-dimensional vectors)

Convolutional neural networks are generally used for image processing, but this time I applied one to one-dimensional vectors of the kind found in sensor data. The key point is to use reshape to convert each vector into the same three-dimensional layout as an image (channel, Y, X), where an image would have its RGB channels. I think convolution is effective for anomaly detection because it can extract features even with few learnable parameters.

1. Create sample data

I used numpy to create one cycle of a sine wave and multiplied each point by a pseudo-random factor from np.random.rand() to get 100 waveforms with some variation. 99 of them are used as training waveforms and the remaining one as the validation waveform.

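import numpy as np

# 100 waveforms: one sine cycle (100 points), each point scaled by a random factor in [1, 2)
data=[]
for i in range(100):
    data.append([np.sin(np.pi * n /50)*(1+np.random.rand())for n in range(100)])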
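As a side note, the same kind of data can be generated without Python loops. The following is a minimal sketch, not the code used in this article: `make_waveforms` and its `seed` argument are names made up for illustration, and a seeded `np.random.RandomState` is used so that repeated runs give the same data.

```python
import numpy as np

def make_waveforms(n_waves=100, n_points=100, seed=0):
    """Return an (n_waves, n_points) float32 array of noisy one-cycle sine waves."""
    rng = np.random.RandomState(seed)          # fixed seed so runs are repeatable
    n = np.arange(n_points)
    base = np.sin(2 * np.pi * n / n_points)    # one sine cycle; same as sin(pi*n/50) for 100 points
    scale = 1 + rng.rand(n_waves, n_points)    # point-wise factor in [1, 2)
    return (base * scale).astype(np.float32)

data = make_waveforms()
print(data.shape)  # (100, 100)
```

Because every point gets its own factor, each waveform is a sine wave whose amplitude wobbles between 1 and 2 from point to point.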

2. Create the CNN model

I built the model with Chainer's Convolution2D. The structure restores the convolved data to the original input in the last layer, so feature extraction is performed as an AutoEncoder.

~~I set the activation function to tanh because, with ReLU, the intermediate convolution outputs would have no negative side and I thought they would look bad when visualized.~~ (Addition) The data for visualization is now taken out before it passes through the activation function. I think this is more correct.

Switching to ReLU did not change the training result for the sine wave data.

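import chainer
import chainer.functions as F
import chainer.links as L

class MyChain(chainer.Chain):

    def __init__(self,n_out):
        super(MyChain, self).__init__()
        with self.init_scope():
            # Three 1x4 convolutions with 2 output channels each
            self.l1 = L.Convolution2D(None,2, ksize=(1,4),stride=(1,1))
            self.l2 = L.Convolution2D(None,2, ksize=(1,4),stride=(1,1))
            self.l3 = L.Convolution2D(None,2, ksize=(1,4),stride=(1,1))
            # Fully connected layer that restores the original length
            self.l4 = L.Linear(None, n_out)

    def __call__(self,x,y):
        # Loss: mean squared error between the reconstruction and the target
        return F.mean_squared_error(self.fwd(x),y)

    def fwd(self, x):
        # convolution -> max pooling -> tanh, three times
        h1 = F.tanh(F.max_pooling_2d(self.l1(x),2))
        h2 = F.tanh(F.max_pooling_2d(self.l2(h1),2))
        h3 = F.tanh(F.max_pooling_2d(self.l3(h2),2))
        h3 = h3.reshape(h3.shape[0],-1)  # flatten to (batch, features)
        return self.l4(h3)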
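To see how many values reach the last Linear layer, the width of the data can be traced through the three convolution and pooling steps. This is a sketch under two assumptions that the article does not state: that `max_pooling_2d` is left at its Chainer default `cover_all=True` (so the output size is rounded up), and that the output-size formulas below match the ones Chainer uses. `conv_out` and `pool_out` are helper names introduced here for illustration.

```python
def conv_out(size, ksize, stride=1, pad=0):
    """Output width of a convolution layer."""
    return (size + 2 * pad - ksize) // stride + 1

def pool_out(size, ksize, pad=0):
    """Output width of pooling with stride == ksize and cover_all=True (assumed)."""
    stride = ksize
    return (size + 2 * pad - ksize + stride - 1) // stride + 1

width = 100
widths = []
for layer in range(3):
    width = conv_out(width, ksize=4)   # ksize=(1,4), stride=(1,1)
    width = pool_out(width, ksize=2)   # max_pooling_2d(..., 2)
    widths.append(width)

print(widths)          # [49, 23, 10]
print(2 * widths[-1])  # 20 values (2 channels x 10) go into l4
```

Under these assumptions the three layers compress the 100 input points to 20 values before the Linear layer expands them back to 100.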

3. Train

Convolution2D cannot take one-dimensional vector data as it is, so the training data is converted with reshape into (batch, channel, height, width). The target (teacher) data is the original vector data.

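model = MyChain(100)
optimizer = chainer.optimizers.Adam()
optimizer.setup(model)

# Reshape to (batch, channel, height, width) = (100, 1, 1, 100)
TrainData = np.array(data,dtype=np.float32).reshape(100,1,1,100)
x=chainer.Variable(TrainData[:99])  # first 99 waveforms for training

for epoch in range(201):
    model.cleargrads()
    loss=model(x,x.reshape(99,100))  # the target is the input itself (AutoEncoder)
    loss.backward()
    optimizer.update()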
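The shapes involved in this step are easy to get wrong, so here is a NumPy-only sketch that checks them. It uses zeros as a stand-in for the real waveforms, and the lowercase variable names are illustrative, not the article's.

```python
import numpy as np

data = np.zeros((100, 100), dtype=np.float32)   # stand-in for the 100 waveforms

# (batch, channel, height, width): one channel, height 1, width 100
train_data = data.reshape(100, 1, 1, 100)

x = train_data[:99]                  # training input for the convolution layers
t = x.reshape(99, 100)               # target: the same waveforms as flat vectors
validation = train_data[99:100]      # the held-out waveform, still 4-D

print(x.shape, t.shape, validation.shape)
# (99, 1, 1, 100) (99, 100) (1, 1, 1, 100)
```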

4. Result

First, the reconstruction of the input waveform. It is reconstructed without problems.

200-Epoch Validation Graph.png

For reference, let's look at the waveforms in the middle of the convolution layers. Each value is repeated so that the plotted length roughly matches the input.

200-Epoch Convolution Graph.png

Honestly, I don't really understand these. Layer3 could be described as a waveform with the features narrowed down. What are the features of a sine wave in the first place? Is it capturing the shape of the peak? I ran the training several times, and the waveforms were different each time, which I find interesting.

5. Extension to anomaly detection

I have seen anomaly detection with an AutoEncoder before. It expresses the degree of anomaly as the difference between the input and the reconstructed output. In the same way, I created anomalous data and tested it.

The first is a phase shift. I shifted the input by 5 points (5/100 of a cycle).
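The degree of anomaly described here can be written as a reconstruction error. The following is a minimal sketch, assuming mean squared error as the score; the article itself only compares the waveforms visually. `anomaly_score` is a hypothetical helper, and the "reconstruction" is simply a clean sine wave standing in for the model output.

```python
import numpy as np

def anomaly_score(original, reconstructed):
    """Mean squared difference between an input and its reconstruction."""
    original = np.asarray(original, dtype=np.float32)
    reconstructed = np.asarray(reconstructed, dtype=np.float32)
    return float(np.mean((original - reconstructed) ** 2))

n = np.arange(100)
normal = np.sin(np.pi * n / 50)

# Stand-in reconstruction: the clean sine wave (illustration only;
# in the article this would be the output of model.fwd)
reconstruction = normal.copy()

spiked = normal.copy()
spiked[30] += 3.0                     # one-point spike

print(anomaly_score(normal, reconstruction))   # 0.0
print(anomaly_score(spiked, reconstruction))   # about 0.09 (3**2 / 100)
```

In practice a threshold on this score would be needed to decide what counts as anomalous; the article does not specify one.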

Shift Error Input Graph.png

The Predict waveform stays close to the original phase, and its line is jagged. This looks easy to detect as an anomaly.

Next, the case where a single point protrudes like a spike.
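The phase shift itself is a circular rotation of the waveform. As a sketch, `np.roll` with a negative shift gives the same result as moving the first 5 points to the end by hand; the variable names here are illustrative.

```python
import numpy as np

n = np.arange(100)
wave = np.sin(np.pi * n / 50).astype(np.float32)

# Manual version: drop the first 5 points and append them at the end
manual = [wave[i + 5] for i in range(len(wave) - 5)]
for i in range(5):
    manual.append(wave[i])

# Equivalent one-liner: rotate left by 5 points
shifted = np.roll(wave, -5)

print(np.array_equal(shifted, np.array(manual, dtype=np.float32)))  # True
```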

Spike Error Input Graph.png

This one is jagged too. With this much jaggedness, features such as the differences between neighboring points could probably be used for anomaly detection.
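One way to turn this jaggedness into a number is to look at the differences between neighboring points. The sketch below is an illustration only, not something the article implements: `roughness` is a hypothetical helper that averages the absolute second difference, which stays small for a smooth sine wave and grows when the line zigzags. The jagged waveform is synthetic, not an actual model output.

```python
import numpy as np

def roughness(wave):
    """Mean absolute second difference; larger means a more jagged line."""
    return float(np.mean(np.abs(np.diff(wave, n=2))))

n = np.arange(100)
smooth = np.sin(np.pi * n / 50)

# Illustrative jagged waveform: the same sine with an alternating +/-0.2 offset
jagged = smooth + 0.2 * (-1.0) ** n

print(roughness(smooth) < roughness(jagged))  # True
```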

6. Conclusion

This is my first post. I regularly read everyone's posts on this site to study, and I decided to post this to give something back. I hope it is of some help.

(Addition) Actual sample code used

Environment: Python 3.6.1, Anaconda 4.4.0 (64-bit), Chainer 2.0.2

import chainer
import chainer.functions as F
import chainer.links as L
import chainer.optimizers
import numpy as np
import matplotlib.pyplot as plt

class MyChain(chainer.Chain):
    def __init__(self,n_out):

        super(MyChain, self).__init__()

        with self.init_scope():

            self.l1 = L.Convolution2D(None,2, ksize=(1,4),stride=(1,1))
            self.l2 = L.Convolution2D(None,2, ksize=(1,4),stride=(1,1))
            self.l3 = L.Convolution2D(None,2, ksize=(1,4),stride=(1,1))
            self.l4 = L.Linear(None, n_out)

    def __call__(self,x,y):
        return F.mean_squared_error(self.fwd(x),y)

    
    def fwd(self, x):

        h1 = F.tanh(F.max_pooling_2d(self.l1(x),2))
        h2 = F.tanh(F.max_pooling_2d(self.l2(h1),2))
        h3 = F.tanh(F.max_pooling_2d(self.l3(h2),2))
        h3 = h3.reshape(h3.shape[0],-1)
        return self.l4(h3)

    def Layaer1(self, x):
        return F.max_pooling_2d(self.l1(x),2)
    def Layaer2(self, x):
        h1=F.tanh(F.max_pooling_2d(self.l1(x),2))
        return F.max_pooling_2d(self.l2(h1),2)
    def Layaer3(self, x):
        h1 = F.tanh(F.max_pooling_2d(self.l1(x),2))
        h2 = F.tanh(F.max_pooling_2d(self.l2(h1),2))
        return F.max_pooling_2d(self.l3(h2),2)

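def CreatePlotData(arr,n1):
    # Stretch the two channel outputs for plotting: pad n1 leading zeros,
    # then repeat every value n1 times so the length roughly matches the input.
    Buf1,Buf2=[],[]
    for j in range(n1):
        Buf1.append(0)
        Buf2.append(0)
    for i in range(arr.shape[1]):
        for j in range(n1):
            Buf1.append(arr[0][i].real)
            Buf2.append(arr[1][i].real)

    return np.array(Buf1,dtype=np.float32),np.array(Buf2,dtype=np.float32)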


data=[]

for i in range(100):
    data.append([np.sin(np.pi * n /50)*(1+np.random.rand())for n in range(100)])

model = MyChain(100)
optimizer = chainer.optimizers.Adam()
optimizer.setup(model)

TrainData = np.array(data,dtype=np.float32).reshape(100,1,1,100)
x=chainer.Variable(TrainData[:99])
ValidationData=TrainData[99].reshape(1,1,1,100)
PlotInput = ValidationData.reshape(100)

for epoch in range(201):
    model.cleargrads()
    loss=model(x,x.reshape(99,100))
    loss.backward()
    optimizer.update()

    if epoch%20==0:

        Layer1Arr = np.array(model.Layaer1(ValidationData).data).reshape(2,-1)
        Layer1Arr1,Layer1Arr2 = CreatePlotData(Layer1Arr,2)

        Layer2Arr = np.array(model.Layaer2(ValidationData).data).reshape(2,-1)
        Layer2Arr1,Layer2Arr2 = CreatePlotData(Layer2Arr,4)
        
        Layer3Arr = np.array(model.Layaer3(ValidationData).data).reshape(2,-1)
        Layer3Arr1,Layer3Arr2 = CreatePlotData(Layer3Arr,8)
         
        plt.plot(PlotInput,label='Input')
        plt.plot(Layer1Arr1,label='Layer1-1')
        plt.plot(Layer1Arr2,label='Layer1-2')
        plt.plot(Layer2Arr1,label='Layer2-1')
        plt.plot(Layer2Arr2,label='Layer2-2')
        plt.plot(Layer3Arr1,label='Layer3-1')
        plt.plot(Layer3Arr2,label='Layer3-2')
        plt.legend()
        plt.savefig(str(epoch)+'-Epoch Convolution Graph.png')
        plt.close()

        predict = model.fwd(ValidationData) 
        predict=np.array(predict.data).reshape(100)
        plt.plot(predict,label='Predict')
        plt.plot(PlotInput,label='Input')
        plt.legend()
        plt.savefig(str(epoch)+'-Epoch Validation Graph.png')   
        plt.close()

# Phase-shift anomaly: rotate the validation waveform left by 5 points
ErrorPlot = [PlotInput[i+5]for i in range(len(PlotInput)-5)]
for i in range(5):
    ErrorPlot.append(PlotInput[i])

predict = model.fwd(chainer.Variable(np.array(ErrorPlot,dtype=np.float32)).reshape(1,1,1,100)) 
predict=np.array(predict.data).reshape(100)
plt.plot(predict,label='Predict')
plt.plot(ErrorPlot,label='Error Input')    
plt.legend()
plt.savefig('Shift Error Input Graph.png')    
plt.close()

# Spike anomaly: add 3 to one randomly chosen point (index 0 to 98)
Rnd = np.random.randint(0,99)

ErrorPlot2=np.array(PlotInput)
ErrorPlot2[Rnd]=ErrorPlot2[Rnd]+3

predict = model.fwd(chainer.Variable(np.array(ErrorPlot2,dtype=np.float32)).reshape(1,1,1,100)) 
predict=np.array(predict.data).reshape(100)
plt.plot(predict,label='Predict')
plt.plot(ErrorPlot2,label='Error Input')    
plt.legend()
plt.savefig('Spike Error Input Graph.png')    
plt.close()
