[PYTHON] I tried to predict the deterioration of a lithium-ion battery using the Qore SDK

Introduction

Hello, this is kon2. I am a student who usually studies negative electrode materials for lithium batteries at university.

Lithium-ion batteries are widely used in everyday life, from mobile phones to electric vehicles. Since the Nobel Prize was recently awarded for their development, you have probably been hearing about them a lot.

Although lithium-ion batteries have a research and development history of more than 40 years, they are still an active research field, and the use of machine learning has recently attracted attention: for example, the development of new electrode materials and electrolytes, optimization of battery configurations (electrode film thickness, electrolyte concentration, additives, ...), and prediction of battery life.

This time, I will use the Qore SDK, provided by Quantum Core Co., Ltd. for a limited time, to predict how the discharge capacity of a lithium-ion battery changes over repeated cycles.

How to set up and use the Qore SDK is explained in the following articles:

The world of reservoir computing ~ with Qore ~
Introduction of the Qore SDK and arrhythmia detection with Qore

Since I was at it, I also trained an LSTM, a kind of deep learning, on the same task and compared the results.

Acquisition of charge / discharge data

For the charge/discharge data, I use the dataset published by CALCE (Center for Advanced Life Cycle Engineering) at the University of Maryland on the following page.

https://calce.umd.edu/data#CS2

In this analysis, the charge/discharge data of the CS2 batteries (CS2-35, CS2-36, CS2-37, CS2-38) is used. After downloading the Zip archives, the xlsx files in the unzipped folders contain the charge/discharge data.

Since each battery's data is split across multiple files and the number of cycles differs between them, the data first needs to be formatted and preprocessed.

I will not explain the preprocessing in this article, but I posted it on GitHub in Jupyter Notebook format, so please refer to it:

https://github.com/konkon3249/BatteryDatasetPreprocessing/blob/master/Preprocessing_CS2_35.ipynb
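For reference, here is a minimal sketch of the kind of step the notebook performs: reading one of the downloaded xlsx files and extracting the maximum discharge capacity per cycle. The file name and the column names (Cycle_Index, Discharge_Capacity(Ah)) are assumptions based on the typical CALCE file layout, so check the actual files before reusing this.

import pandas as pd

# Read every sheet of one measurement file and stack them (assumed file name / layout)
sheets = pd.read_excel('CS2_35_1_10_11.xlsx', sheet_name=None)
raw = pd.concat(sheets.values(), ignore_index=True)

# Maximum discharge capacity reached in each cycle (assumed column names)
capacity = raw.groupby('Cycle_Index')['Discharge_Capacity(Ah)'].max()
print(capacity.head())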

Here are the charge/discharge curves of battery CS2-35 after preprocessing. The total number of cycles was 887.

[Figure: charge/discharge curves of CS2-35 after preprocessing]

The charge curves extend to the upper right and the discharge curves to the lower left. The color shift from blue to purple indicates the progression of cycles.

What we want to know here is the battery's maximum discharge capacity per cycle. The figure below plots this against the cycle number.

[Figure: maximum discharge capacity vs. cycle number for CS2-35]

You can see that the discharge capacity shrinks steadily with cycling. A common end-of-life criterion is a 20% drop from the initial capacity (here 1.1 Ah x 0.8 = 0.88 Ah), so this battery has a useful life of about 580 cycles.
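As a quick check, the end-of-life cycle can be read off the preprocessed capacity series like this (a minimal sketch; df_35 and the 'capacity' column follow the CSV format used in the next section, and the 1.1 Ah initial capacity is taken from the figure):

import numpy as np
import pandas as pd

df_35 = pd.read_csv('CS2_35.csv', index_col=0).dropna()
capacity = df_35['capacity'].to_numpy()

threshold = 1.1 * 0.8  # 80% of the roughly 1.1 Ah initial capacity
eol_cycle = int(np.argmax(capacity < threshold))  # first cycle below the threshold
print(eol_cycle)  # around 580 for CS2-35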

Time series analysis using Qore

Since the relationship between a secondary battery's discharge capacity and the number of cycles can be regarded as time-series data, predictions have previously been made with ARIMA (not deep learning) and LSTM (deep learning).

Lithium-ion batteries remaining useful life prediction based on a mixture of empirical mode decomposition and ARIMA model (paper using ARIMA)
Long Short-Term Memory Recurrent Neural Network for Remaining Useful Life Prediction of Lithium-Ion Batteries (paper using LSTM)
Assessing the Health of LiFePO4 Traction Batteries through Monotonic Echo State Networks (paper using an Echo State Network)

The last paper uses an Echo State Network, a type of reservoir computing, for the prediction. I didn't expect someone had already done this.

In any case, the reservoir computing provided by the Qore SDK is said to be good at time-series analysis, so I will use it to perform a similar time-series analysis.

Here is an overview of the data used this time.

[Figure: maximum discharge capacity vs. cycle number for batteries CS2-35 to CS2-38]

Of the charge/discharge data from these four lithium-ion batteries, three (#35, #37, #38) are used as training data and one (#36) as test data for the time-series analysis.

The time-series analysis is also available as a Jupyter Notebook, so please have a look:

https://github.com/konkon3249/BatteryLifePrediction

Below is a brief explanation.

Reading data and creating training data

First, read the CSV-format data. Missing values are also dropped, just in case.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df_35 = pd.read_csv('CS2_35.csv',index_col=0).dropna()
df_36 = pd.read_csv('CS2_36.csv',index_col=0).dropna()
df_37 = pd.read_csv('CS2_37.csv',index_col=0).dropna()
df_38 = pd.read_csv('CS2_38.csv',index_col=0).dropna()

Create training data for the time-series analysis using the following function. I also referred to the article below, which uses the Qore SDK:

Regression task using Qore SDK

def ConvertData(dataset, t_width):
    """Slice each capacity series into sliding windows of length t_width;
    the target is the capacity of the cycle right after each window."""
    X_trains = []
    y_trains = []

    for df in dataset:
        capacity = np.array(df['capacity'])
        t_length = len(capacity)

        for i in range(t_length - t_width):
            X_trains.append(capacity[i:i + t_width])  # past t_width cycles
            y_trains.append(capacity[i + t_width])    # next cycle's capacity

    return np.array(X_trains), np.array(y_trains)

X_train,y_train = ConvertData([df_35,df_37,df_38],50)
X_test,y_test = ConvertData([df_36],50)

Check the dimensions of the training data obtained by this process.

print(X_train.shape,X_test.shape,y_train.shape,y_test.shape)
>> (2588, 50) (873, 50) (2588,) (873,)

There is too much data to throw at the API all at once, so the training set is reduced to 500 samples, selected at random.

idx = np.arange(0,X_train.shape[0],1)
idx = np.random.permutation(idx)
idx_lim = idx[:500]

X_train = X_train[idx_lim]
y_train = y_train[idx_lim]

Finally, reshape the training data to the shape (number of samples, time steps, features). Since the data is not multivariate, the last dimension is 1.

X_train = X_train.reshape([X_train.shape[0], X_train.shape[1], 1])
X_test = X_test.reshape([X_test.shape[0], X_test.shape[1], 1])
print(X_train.shape,X_test.shape,y_train.shape,y_test.shape)
>> (500, 50, 1) (873, 50, 1) (500,) (873,)

Something like this.

Training on the time-series data using the Qore SDK

Training requires no tuning of the network structure or parameters; it is very easy because the data is simply thrown at the API.

%%time
client = WebQoreClient(username, password, endpoint=endpoint)
time_ = client.regression_train(X_train, y_train)
print('Time:', time_['train_time'], 's')

>> Time: 1.784491777420044 s
>> Wall time: 2.26 s

Moreover, the result comes back immediately. It is so fast that I worry whether it is really learning anything.

Let's check, starting with the training data.

# Inference on the training data
res = client.regression_predict(X_train)

# Plot predictions against the true values
fig = plt.figure(figsize=(12, 4), dpi=150)
plt.plot(res['Y'], alpha=0.7, label='Prediction')
plt.plot(y_train, alpha=0.7, label='True')
plt.legend(loc='upper right', fontsize=12)

[Figure: predicted vs. true discharge capacity on the training data]

Looking at the predictions on the training data, the model seems to have learned properly. Next, the predicted discharge capacity of battery #36 (the test data) for cycles 300 to 800 is shown.

[Figure: predicted vs. true discharge capacity of CS2-36, cycles 300-800]

Blue is the predicted value and orange is the experimental value. The prediction looks pretty good, doesn't it?
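The exact prediction call behind this figure is not shown in the article, but it was presumably produced with the same regression_predict API applied to the test windows for cycles 300 to 800; a minimal sketch:

# Predict on the test windows for cycles 300-800 (sketch; the original call is not shown)
res = client.regression_predict(X_test[300:800])

fig = plt.figure(figsize=(12, 4), dpi=150)
plt.plot(res['Y'], alpha=0.7, label='Prediction')
plt.plot(y_test[300:800], alpha=0.7, label='True')
plt.legend(loc='upper right', fontsize=12)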

I used scikit-learn to compute the error (MSE), for later comparison with the LSTM.

from sklearn.metrics import mean_squared_error
print(mean_squared_error(y_test[300:800], res['Y']))
>> 0.00025365167122575003

Now, let's actually forecast the discharge capacity from cycle 500 to 550, feeding each prediction back in as the next input, as follows.


# Future prediction of discharge capacity (recursive, one cycle at a time)
import time
from tqdm import tqdm

initial = X_test[500]
results = []
for i in tqdm(range(50)):
    if i != 0:
        # drop the oldest cycle and append the previous prediction
        initial = np.vstack((initial[1:], np.array(res)))
    res = client.regression_predict([initial])['Y']
    results.append(res[0])
    time.sleep(1)  # be gentle with the API

#plot
fig=plt.figure(figsize=(12,4),dpi=150)
plt.plot(np.linspace(501,550,50),results,'o-',ms=4,lw=1,label='predict')
plt.plot(np.linspace(401,550,150),y_test[400:550],'o-',lw=1,ms=4,label='true')
plt.legend(loc='upper right',fontsize=12)
plt.xlabel('Number of Cycle',fontsize=13)
plt.ylabel('Discharge Capacity (Ah)',fontsize=13)

The result is as follows.

[Figure: recursive Qore forecast of discharge capacity, cycles 501-550]

The forecast tracks the true curve nicely!

Learning with LSTM

Since I'm at it, let's compare with an LSTM. First, reshape the target data from (N,) to (N, 1).

X_train = X_train.reshape([X_train.shape[0], X_train.shape[1], 1])
X_test = X_test.reshape([X_test.shape[0], X_test.shape[1], 1])
y_train = y_train.reshape([y_train.shape[0], 1])
y_test = y_test.reshape([y_test.shape[0], 1])
print(X_train.shape,X_test.shape,y_train.shape,y_test.shape)
>> (500, 50, 1) (873, 50, 1) (500, 1) (873, 1)

The parameters of the LSTM are as follows.

hidden units: 3
optimizer: rmsprop
loss function: mean squared error
batch size: 50
epochs: 100

Building and training the model looks like this.

# LSTM model construction
from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation
from keras.callbacks import EarlyStopping
from livelossplot import PlotLossesKeras  # live loss plot during training

length_of_sequence = X_train.shape[1]
in_out_neurons = 1
n_hidden = 3

model = Sequential()
model.add(LSTM(n_hidden, batch_input_shape=(None, length_of_sequence, in_out_neurons),
               return_sequences=False))
model.add(Dense(1))
model.add(Activation("linear"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")

%%time
# Model training (separate cell, so that %%time comes first)
early_stopping = EarlyStopping(monitor='val_loss', mode='auto', patience=20)
history = model.fit(X_train, y_train,
          batch_size=50,
          epochs=100,
          validation_split=0.1,
          callbacks=[early_stopping, PlotLossesKeras()]
          )

Using a GPU (GTX 1070), the training time was 1 min 27 s. The loss during training evolved as follows.

[Figure: training and validation loss of the LSTM]

Here is the result of predicting the discharge capacity of the test data (CS2-36) with this model, as in the Qore case.

[Figure: predicted vs. true discharge capacity of CS2-36 with the LSTM]
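The MSE quoted below was presumably computed on the same cycles 300 to 800 of the test data as in the Qore case; the exact call is not shown in the article, so this is just a sketch:

# MSE of the LSTM on the same test window (sketch)
y_pred = model.predict(X_test[300:800])
print(mean_squared_error(y_test[300:800], y_pred))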

It predicts in much the same way. The MSE is 0.00021017. Now, as before, forecast the discharge capacity from cycle 500 to 550 as follows.

initial = X_test[500].reshape(50, 1)
results = []
for i in tqdm(range(50)):
    if i != 0:
        # drop the oldest cycle and append the previous prediction
        initial = np.vstack((initial[1:], np.array(res)))
    # Keras expects input of shape (batch, time steps, features)
    res = model.predict(initial.reshape(1, 50, 1))
    results.append(res[0][0])

[Figure: recursive LSTM forecast of discharge capacity, cycles 501-550]

It works, but the forecast looks better when using Qore.

Finally, here is a plot showing both time-series forecasts, the Qore case and the LSTM case, together.

[Figure: comparison of the Qore and LSTM forecasts against the true capacity]

Summary

In this article, we made a time-series prediction of the deterioration process (the decrease in discharge capacity) of a lithium-ion battery using the Qore SDK, and compared it with a commonly used LSTM.

The resulting error (MSE) on the test data and the training time are as follows.

                 QoreSDK      LSTM
MSE              2.5365e-4    2.1017e-4
Learning time    1.78 s       87 s

The error is slightly smaller with the LSTM, but training with Qore is overwhelmingly faster. It is impressive that it gets this far without any tuning of the training parameters.

Finally

The battery deterioration prediction made here is, honestly, still insufficient. This seems to be because only the discharge capacity was used for the time-series prediction.

In fact, there are many other parameters related to battery capacity. For example, the following data can be obtained from this dataset.

[Figure: discharge capacity, internal resistance, constant-current charging time, and constant-voltage charging time vs. charge/discharge cycle]

From the upper left: discharge capacity, internal resistance, constant-current charging time, and constant-voltage charging time, each plotted against the charge/discharge cycle. If you perform a multivariate time-series analysis that also includes these quantities, you should be able to predict deterioration more accurately (I didn't get that far this time...). A rough sketch of how the windowing could be extended is shown below.
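Purely as an illustration, here is a minimal sketch of how the windowing function above could be extended to several features. The column names internal_resistance, cc_time, and cv_time are hypothetical (the preprocessed CSVs used here contain only the capacity), I did not actually run this, and whether the Qore API accepts multivariate input this way would need to be checked.

import numpy as np

def ConvertDataMulti(dataset, t_width, columns):
    """Sliding windows over several columns at once; the target is still
    the next cycle's discharge capacity."""
    X, y = [], []
    for df in dataset:
        features = df[columns].to_numpy()      # shape: (cycles, n_features)
        capacity = df['capacity'].to_numpy()
        for i in range(len(df) - t_width):
            X.append(features[i:i + t_width])  # past t_width cycles, all features
            y.append(capacity[i + t_width])    # next cycle's capacity
    return np.array(X), np.array(y)

# Hypothetical column names; these would have to be added during preprocessing
cols = ['capacity', 'internal_resistance', 'cc_time', 'cv_time']
# X_train, y_train = ConvertDataMulti([df_35, df_37, df_38], 50, cols)
# -> X_train would have shape (number of samples, 50, 4)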

Thank you, Quantum Core, for providing this API. I was an amateur at time-series analysis, but it was very interesting! The SDK is available for a limited time, so if you want to give it a try:

Machine learning and applied technologies other than deep learning by QuantumCore Advent Calendar 2019

Well then ~~.
