This article is day 18 of the DoCoMo Advanced Technology Research Institute Advent Calendar. I'm Ishioka, a second-year employee of NTT DoCoMo Advanced Technology Research Laboratories. I am a beginner in AI / machine learning, and in this article I explain a model I built as a hobby that predicts human movements from sensor data.

Overview

-- Detect a person's movement using data from an acceleration / gyro sensor and a neural network (CNN).
-- Sports sensing seems to be popular in recent years, so the target movement is the Harki step.

What is the Harki step?

It is a training method used in sports. You may have done it in a physical education class. While standing, lower your center of gravity and step quickly in place with small steps (this is the Harki step). While you keep stepping, the instructor (a physical education teacher or club coach) calls out directions such as upper right or lower left. I did this training every day as a student, but I never knew what it was good for. When I looked it up, it apparently trains agility (?) and quickness (?).

What to use

-- Acceleration / gyro sensor (an MPU6050 this time; as of 2019 it costs about 300 yen on Amazon, which is surprisingly cheap)
-- PC (a Raspberry Pi is used for both data acquisition and learning)

Acquisition of sensor data

The sensor I bought looks like the picture below. Because it is cheap, you have to solder the pin header yourself. The pins are spaced widely enough that even someone like me who is not good at soldering can do it easily. I am also happy that it comes with two types of pin headers.

Many articles already explain how to acquire sensor data from the MPU6050, so I will omit the details here. However, there are two points to note. The first is that the MPU6050 delivers its sensor values over I2C communication. You do not read the sensor's output voltage directly as an analog signal, so it can be confusing at first. Roughly speaking, the reading side (Raspberry Pi, Arduino, etc.) addresses a specific block (register address) inside the sensor (MPU6050, etc.) and communicates only with that block. There are only two lines, a data line and a clock line, yet you can read all three accelerometer axes and all three gyro axes.
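As a rough illustration of the I2C access described above, here is a minimal sketch. It assumes the `smbus2` package, the sensor at its default address 0x68 on I2C bus 1, and the default ranges (+/-2 g, +/-250 deg/s); the function names are made up for this example and this is not necessarily the code used for the article.

```python
import struct

# Register addresses and default scale factors from the MPU6050 register map.
MPU6050_ADDR = 0x68          # default I2C address (AD0 pin low)
PWR_MGMT_1 = 0x6B            # power management register; writing 0 wakes the chip
ACCEL_XOUT_H = 0x3B          # first of 14 consecutive data registers
ACCEL_LSB_PER_G = 16384.0    # default +/-2 g range
GYRO_LSB_PER_DPS = 131.0     # default +/-250 deg/s range


def decode_sample(raw):
    """Convert 14 raw bytes into [ax, ay, az, gx, gy, gz].

    The block holds seven big-endian signed 16-bit values:
    accel x/y/z, temperature, gyro x/y/z. Temperature is discarded.
    """
    ax, ay, az, _temp, gx, gy, gz = struct.unpack(">7h", bytes(raw))
    accel = [v / ACCEL_LSB_PER_G for v in (ax, ay, az)]    # in g
    gyro = [v / GYRO_LSB_PER_DPS for v in (gx, gy, gz)]    # in deg/s
    return accel + gyro


def read_sample(bus):
    """Read one 6-axis sample; `bus` is an open smbus2.SMBus object."""
    raw = bus.read_i2c_block_data(MPU6050_ADDR, ACCEL_XOUT_H, 14)
    return decode_sample(raw)


def main():
    # Needs real hardware and the smbus2 package, so it is only run on request.
    from smbus2 import SMBus
    with SMBus(1) as bus:                                   # I2C bus 1 (assumption)
        bus.write_byte_data(MPU6050_ADDR, PWR_MGMT_1, 0)    # leave sleep mode
        print(read_sample(bus))
```

Calling `main()` on a Raspberry Pi with the sensor wired to the I2C pins should print one sample. `decode_sample` is kept separate so that the byte handling can be checked without hardware.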

The second is the input / output voltage of the MPU6050. According to the data sheet, the maximum input / output voltage of the MPU6050 is 3.46 V. An Arduino, on the other hand, uses 5 V input / output, so connecting its input / output pins directly to the MPU6050 will not work properly. The input / output pins of the Raspberry Pi are 3.3 V, so they can be connected directly to the MPU6050 without any problem.

The MPU6050 happens to be closer to the Raspberry Pi's specifications, but with another sensor the opposite may be true, so be sure to check the data sheet before use.

Data visualization

To create the data, I asked three people from the Advanced Technology Research Institute to perform the Harki step and collected 300 samples (each one second of 3-axis acceleration and 3-axis gyro values). The figure below shows the readings for each movement of the Harki step, with the sensor fixed at the waist. They do seem to reflect the movement somehow, but to be honest they all look like noise ...

This time, I would like to input this sensor data into a neural network to predict the movement of the human body.

Creating a model

This time I used a CNN (Convolutional Neural Network), for two reasons.
-- A single instant of the movement does not tell you which movement it is, so I wanted the model to learn from a chunk of time.
-- If the data is fed in as a matrix, I figured existing CNN code could be reused as is.

The acceleration data $a = [a_x, a_y, a_z]$ and the gyro data $g = [g_x, g_y, g_z]$ obtained from the MPU6050 are sampled at 40 Hz. One second of data therefore contains 40 samples of 6 values, which is used as a 6x40 input.
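To make the 6x40 shape concrete, here is a minimal sketch that cuts a recording into one-second windows with NumPy. The function name, the non-overlapping windows, and the random dummy recording are assumptions for illustration; the article does not say how its windows were actually cut.

```python
import numpy as np

SAMPLE_RATE_HZ = 40   # sampling rate stated in the article
N_CHANNELS = 6        # a_x, a_y, a_z, g_x, g_y, g_z


def make_windows(samples, window_len=SAMPLE_RATE_HZ):
    """Cut a (n_samples, 6) recording into non-overlapping (6, window_len) windows."""
    samples = np.asarray(samples, dtype=np.float32)
    n_windows = len(samples) // window_len           # drop an incomplete tail
    trimmed = samples[: n_windows * window_len]
    windows = trimmed.reshape(n_windows, window_len, N_CHANNELS)
    return windows.transpose(0, 2, 1)                # -> (n_windows, 6, window_len)


# Dummy recording: 3.5 seconds of random values standing in for sensor readings.
recording = np.random.default_rng(0).normal(size=(140, N_CHANNELS))
x = make_windows(recording)
print(x.shape)   # (3, 6, 40)
```

Each row of a window is one sensor axis and each column is one time step, so a window can be treated like a small single-channel image.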

The Harki step basically consists of 4 movements, but this time I also count "step only", so the data is classified into 5 movements (up, down, right, left, and none).

An outline of the entire program and the network configuration diagram are shown below. The network is largely the same as the image classification models used in CNN tutorials. The 6x40 acceleration / gyro sensor data is fed to the CNN as if it were an image, and a value from 0 to 1 is output for each of the five movements. The closer a value is to 1, the higher the probability of that movement; the closer to 0, the lower. The movement with the largest of the five values is taken as the estimate.
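The data flow can be sketched as a single forward pass in plain NumPy. This is not the network from the article: the layer sizes, the softmax output, the label names and their order are all assumptions, and the weights are random and untrained. In practice you would build and train the model with a deep learning framework.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

LABELS = ["up", "down", "right", "left", "step_only"]   # hypothetical names and order


def init_params(rng, n_filters=8):
    """Random (untrained) weights; all sizes are illustrative assumptions."""
    flat = n_filters * 4 * 19                      # size after conv + pooling below
    return {
        "conv_w": rng.normal(0, 0.1, size=(n_filters, 3, 3)),
        "conv_b": np.zeros(n_filters),
        "fc_w": rng.normal(0, 0.1, size=(flat, len(LABELS))),
        "fc_b": np.zeros(len(LABELS)),
    }


def forward(x, p):
    """x: one (6, 40) window -> five probabilities that sum to 1."""
    patches = sliding_window_view(x, (3, 3))                      # (4, 38, 3, 3)
    conv = np.einsum("ijkl,fkl->fij", patches, p["conv_w"])       # (F, 4, 38)
    conv = np.maximum(conv + p["conv_b"][:, None, None], 0.0)     # bias + ReLU
    pooled = conv.reshape(conv.shape[0], 4, 19, 2).max(axis=-1)   # max-pool over time
    logits = pooled.reshape(-1) @ p["fc_w"] + p["fc_b"]           # dense layer
    exp = np.exp(logits - logits.max())                           # stable softmax
    return exp / exp.sum()


rng = np.random.default_rng(0)
params = init_params(rng)
window = rng.normal(size=(6, 40))      # stand-in for one second of sensor data
probs = forward(window, params)
predicted = LABELS[int(np.argmax(probs))]
print(probs.round(3), predicted)
```

The last two steps mirror the description above: five values between 0 and 1 come out, and the index of the largest one selects the estimated movement.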

Learning results

The figure below shows the results of training on the 300 samples from the three people, split 2:1 into training and validation data. The accuracy is abnormally high. The same person doing the same movement probably ended up in both the training and the test data, so information likely leaked between them. When separating training and test data, I probably should have split by person rather than randomly. (The data was anonymized and I did not record who performed each sample, so I will add that information and verify this later.)
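A split by person could look like the following minimal sketch. It assumes a per-sample person ID is available, which the current data set does not have. The 100 samples per person and the function name are also assumptions for illustration.

```python
import numpy as np


def split_by_person(person_ids, test_person):
    """Return (train_idx, test_idx) so that test_person appears only in the test set."""
    person_ids = np.asarray(person_ids)
    test_mask = person_ids == test_person
    return np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


# Hypothetical bookkeeping: 300 samples, 100 per person, labelled 0, 1, 2.
person_ids = np.repeat([0, 1, 2], 100)
train_idx, test_idx = split_by_person(person_ids, test_person=2)
print(len(train_idx), len(test_idx))   # 200 100
```

Repeating this with each person held out in turn gives an accuracy estimate for people the model has never seen, which is the case that matters for a real application.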

Summary

I created a human motion prediction model using a "one-coin" sensor that can be bought on Amazon. I had the stereotype that "CNN = image recognition", but it turned out to work for sensor data as well. On the other hand, the training data is small, and the accuracy is very likely higher than it really is. This time the model was used with a sensor connected to a Raspberry Pi, but the same prediction should be possible from the acceleration and gyro data of a smartphone, so I would like to develop an application that uses this motion prediction model.

[PYTHON] Prediction model of human body movement using one coin sensor