Hello, this is kon2. I am a student who usually studies negative electrode materials for lithium batteries at university.
Lithium-ion batteries are widely used in everyday life, for example in mobile phones and electric vehicles. Since they recently won the Nobel Prize, you have probably been hearing the name a lot.
Although lithium-ion batteries were commercialized about 30 years ago (in 1991), they are still an active research field, and the use of machine learning has recently attracted attention. Examples include the development of new electrode materials and electrolytes, the optimization of battery configurations (electrode film thickness, electrolyte concentration, additives, ...), and the prediction of battery life.
This time, I use the Qore SDK, which Quantum Core Co., Ltd. is providing for a limited time, to predict how the discharge capacity of lithium-ion batteries changes over time.
How to install and use the Qore SDK is explained in the following articles.
The world of reservoir computing ~ with Qore ~
Introduction of Qore SDK and detection of arrhythmia with Qore
While I was at it, I also trained an LSTM, a type of deep learning model, on the same task and compared the results.
For the charge / discharge data, I use the data published on the following page by CALCE (Center for Advanced Life Cycle Engineering) at the University of Maryland.
https://calce.umd.edu/data#CS2
In this analysis, the charge / discharge data of the CS2 batteries (CS2-35, CS2-36, CS2-37, CS2-38) is used. The data is downloaded as Zip archives; the xlsx files in the unzipped folders contain the charge / discharge data.
Since the data for each battery is split across multiple files and the number of cycles differs between batteries, I formatted and preprocessed it accordingly.
I will not explain the preprocessing in this article, but I have posted it on GitHub as a Jupyter notebook, so please refer to it.
https://github.com/konkon3249/BatteryDatasetPreprocessing/blob/master/Preprocessing_CS2_35.ipynb
Here are the charge / discharge curves of battery CS2-35 after preprocessing. The total number of cycles was 887.
The charge curves extend toward the upper right, and the discharge curves extend toward the lower left. The color changes from blue to purple as the cycles progress.
What we want to know here is the maximum discharge capacity of the battery in each cycle. The figure below plots it against the number of cycles.
You can see that the discharge capacity keeps decreasing as the cycles progress. A battery is generally considered to have reached the end of its life when its capacity has dropped by 20% (1.1 x 0.8 = 0.88 Ah in this case), so this battery has a useful life of about 580 cycles.
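The end-of-life criterion above can be sketched in a few lines. This is only an illustration: the function name `cycles_to_end_of_life` is hypothetical, and the capacity curve is synthetic (a straight line), not the CALCE data.

```python
import numpy as np

def cycles_to_end_of_life(capacity, rated_capacity=1.1, threshold=0.8):
    """Return the first cycle (1-indexed) whose capacity falls below
    threshold * rated_capacity, or None if it never does."""
    capacity = np.asarray(capacity, dtype=float)
    below = np.flatnonzero(capacity < threshold * rated_capacity)
    return int(below[0]) + 1 if below.size else None

# Synthetic, linearly fading capacity curve (an assumption for illustration).
synthetic = 1.1 - 0.0003 * np.arange(900)
eol = cycles_to_end_of_life(synthetic)
print(eol)  # 735
```

Measured capacity is noisy, so on real data the first crossing can come a little earlier than the trend suggests; smoothing the curve first may give a more stable estimate.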
Since the relationship between the discharge capacity of a secondary battery and the number of cycles can be treated as time-series data, it has been predicted with ARIMA (not deep learning) and LSTM (deep learning).
Lithium-ion batteries remaining useful life prediction based on a mixture of empirical mode decomposition and ARIMA model (paper using ARIMA)
Long Short-Term Memory Recurrent Neural Network for Remaining Useful Life Prediction of Lithium-Ion Batteries (paper using LSTM)
Assessing the Health of LiFePO4 Traction Batteries through Monotonic Echo State Networks (paper using Echo State Network)
The last paper uses Echo State Networks, a type of reservoir computing, for prediction. I had not expected that someone had already done this.
Anyway, the reservoir computing provided by the Qore SDK is said to be good at time-series analysis, so I will use it to perform a similar analysis.
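For readers new to reservoir computing, here is a minimal echo state network in NumPy. It is a generic textbook-style sketch and not how the Qore SDK works internally (that is not described here); all names and hyperparameters (`make_reservoir`, 50 units, spectral radius 0.9, ridge 1e-4) are assumptions chosen for illustration. The key idea is that the recurrent weights stay random and fixed, and only a linear readout is fitted, which is why training is fast.

```python
import numpy as np

def make_reservoir(n_units=50, spectral_radius=0.9, seed=0):
    """Create fixed random input and recurrent weights (never trained)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=n_units)
    w = rng.uniform(-0.5, 0.5, size=(n_units, n_units))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(u, w_in, w):
    """Drive the reservoir with the scalar input sequence u and collect states."""
    x = np.zeros(w.shape[0])
    states = np.empty((len(u), w.shape[0]))
    for t, u_t in enumerate(u):
        x = np.tanh(w_in * u_t + w @ x)
        states[t] = x
    return states

def fit_readout(states, targets, ridge=1e-4):
    """Fit the linear readout by ridge regression (the only trained part)."""
    gram = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)

# Toy task: predict the next value of a sine wave.
signal = np.sin(0.2 * np.arange(601))
u, y = signal[:-1], signal[1:]
w_in, w = make_reservoir()
states = run_reservoir(u, w_in, w)
washout, split = 50, 400  # discard the initial transient, hold out the tail
w_out = fit_readout(states[washout:split], y[washout:split])
test_mse = float(np.mean((states[split:] @ w_out - y[split:]) ** 2))
print(f"held-out MSE: {test_mse:.2e}")
```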
Here is a list of the training data used this time.
From the charge / discharge data of these four lithium-ion batteries, three cells (#35, #37, #38) are used as training data, and one cell (#36) is used as test data for the time-series analysis.
The time-series analysis is also available as a Jupyter notebook, so please have a look.
https://github.com/konkon3249/BatteryLifePrediction
Below is a brief explanation.
First, read the CSV data. Rows with missing values are dropped just in case.
import numpy as np
import pandas as pd

df_35 = pd.read_csv('CS2_35.csv',index_col=0).dropna()
df_36 = pd.read_csv('CS2_36.csv',index_col=0).dropna()
df_37 = pd.read_csv('CS2_37.csv',index_col=0).dropna()
df_38 = pd.read_csv('CS2_38.csv',index_col=0).dropna()
Create the training data for the time-series analysis with the following function. I referred to the article below, which also uses the Qore SDK.
Regression task using Qore SDK
def ConvertData(dataset, t_width):
    # Build sliding windows: t_width past capacities -> the next capacity
    X_trains = []
    y_trains = []
    for df in dataset:
        capacity = np.array(df['capacity'])
        t_length = len(capacity)
        for i in range(t_length - t_width):
            X_trains.append(capacity[i:i + t_width])
            y_trains.append(capacity[i + t_width])
    X_trains = np.array(X_trains)
    y_trains = np.array(y_trains)
    return X_trains, y_trains

X_train, y_train = ConvertData([df_35, df_37, df_38], 50)
X_test, y_test = ConvertData([df_36], 50)
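To see what this windowing produces, here is a tiny self-contained example on a toy series. The helper `make_windows` is hypothetical and uses NumPy's `sliding_window_view` (NumPy 1.20 or later) instead of the explicit loop; it is meant as an equivalent sketch for a single series, not a replacement for the function above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(series, width):
    """Return (X, y): each row of X holds `width` consecutive values,
    and y holds the value that immediately follows each window."""
    series = np.asarray(series)
    # The last window has no following value, so it is dropped.
    X = sliding_window_view(series, width)[:-1].copy()
    y = series[width:].copy()
    return X, y

X_toy, y_toy = make_windows(np.arange(6), 3)
print(X_toy)  # [[0 1 2] [1 2 3] [2 3 4]]
print(y_toy)  # [3 4 5]
```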
Check the dimensions of the training and test data obtained by this process.
print(X_train.shape,X_test.shape,y_train.shape,y_test.shape)
>> (2588, 50) (873, 50) (2588,) (873,)
There is too much data to send to the API, so I reduce the training data to 500 samples, selected at random.
idx = np.arange(0,X_train.shape[0],1)
idx = np.random.permutation(idx)
idx_lim = idx[:500]
X_train = X_train[idx_lim]
y_train = y_train[idx_lim]
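The selection above changes on every run. If you want a reproducible subset, one option (an assumption on my part, not what the notebook does) is a seeded NumPy `Generator`. The arrays below are toy stand-ins for `X_train` and `y_train`.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed -> same subset every run

# Toy stand-ins: row i of X_all is [2*i, 2*i + 1], and y_all[i] = i.
X_all = np.arange(20).reshape(10, 2)
y_all = np.arange(10)

# Draw 4 distinct row indices and apply them to X and y together,
# so that inputs and targets stay aligned.
idx = rng.choice(len(X_all), size=4, replace=False)
X_sub, y_sub = X_all[idx], y_all[idx]
print(X_sub.shape, y_sub.shape)  # (4, 2) (4,)
```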
Finally, reshape the training data to (number of samples, time steps, features). The data is univariate, so the last dimension is 1.
X_train = X_train.reshape([X_train.shape[0], X_train.shape[1], 1])
X_test = X_test.reshape([X_test.shape[0], X_test.shape[1], 1])
print(X_train.shape,X_test.shape,y_train.shape,y_test.shape)
>> (500, 50, 1) (873, 50, 1) (500,) (873,)
It looks like this.
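As a small aside, the same reshape can be written by appending a new axis, which avoids repeating the shape. This is a sketch on a dummy array of the same shape as the training windows, not on the real data.

```python
import numpy as np

# Dummy array with the same shape as the subsampled training windows.
windows = np.zeros((500, 50))

# Append a trailing feature axis: (samples, time) -> (samples, time, 1).
windows_3d = windows[..., np.newaxis]
same_as_reshape = windows.reshape([windows.shape[0], windows.shape[1], 1])
print(windows_3d.shape)  # (500, 50, 1)
```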
Training does not require tuning the network structure or parameters. It is very easy because you just send the data to the API.
%%time
client = WebQoreClient(username, password, endpoint=endpoint)
time_ = client.regression_train(X_train, y_train)
print('Time:', time_['train_time'], 's')
>> Time: 1.784491777420044 s
>> Wall time: 2.26 s
Moreover, the result comes back almost immediately. It is so fast that I wondered whether it was really learning.
Let's check, starting with the training data.
#inference
res = client.regression_predict(X_train)
#plot
fig=plt.figure(figsize=(12, 4),dpi=150)
plt.plot(res['Y'],alpha=0.7,label='Prediction')
plt.plot(y_train,alpha=0.7,label='True')
plt.legend(loc='upper right',fontsize=12)
Judging from the predictions on the training data, the training seems to have worked. Next, here is the predicted discharge capacity of the test battery (#36) for cycles 300 to 800.
Blue is the predicted value and orange is the experimental value. The prediction looks pretty good, doesn't it?
I used scikit-learn to compute the error (MSE) for later comparison with the LSTM.
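from sklearn.metrics import mean_squared_error
# res must hold the predictions for the same test slice (cycles 300 to 800)
res = client.regression_predict(X_test[300:800])
print(mean_squared_error(y_test[300:800], res['Y']))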
>> 0.00025365167122575003
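For reference, the MSE is simply the mean of the squared differences, so an MSE of about 2.5e-4 corresponds to a root-mean-square error of roughly 0.016 Ah. Below is a minimal NumPy version with made-up numbers; the helper `mse` is hypothetical. Flattening both arrays first matters: subtracting an (N, 1) array from an (N,) array would silently broadcast to (N, N).

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error; both inputs are flattened so that shapes
    like (N,) and (N, 1) are compared element by element."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.mean((y_true - y_pred) ** 2))

# Made-up values for illustration: squared errors are 0, 0.01 and 0.04.
example = mse([1.0, 0.9, 0.8], [[1.0], [1.0], [0.6]])
print(example)
```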
Next, let's actually forecast the discharge capacity for cycles 500 to 550, feeding each prediction back in as the newest input.
#Future prediction of discharge capacity
initial = X_test[500]
results = []
for i in tqdm(range(50)):
    if(i == 0):
        res = client.regression_predict([initial])['Y']
        results.append(res[0])
        time.sleep(1)
    else:
        # Drop the oldest value and append the latest prediction
        initial = np.vstack((initial[1:],np.array(res)))
        res = client.regression_predict([initial])['Y']
        results.append(res[0])
        time.sleep(1)
#plot
fig=plt.figure(figsize=(12,4),dpi=150)
plt.plot(np.linspace(501,550,50),results,'o-',ms=4,lw=1,label='predict')
plt.plot(np.linspace(401,550,150),y_test[400:550],'o-',lw=1,ms=4,label='true')
plt.legend(loc='upper right',fontsize=12)
plt.xlabel('Number of Cycle',fontsize=13)
plt.ylabel('Discharge Capacity (Ah)',fontsize=13)
The result is as follows.
The forecast looks quite good!
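The loop above is a recursive (autoregressive) forecast: predict one step, slide the window, and repeat. Here is a model-agnostic sketch of the same idea. The function `recursive_forecast` and the stand-in predictor `linear_extrapolation` are hypothetical; in practice the predictor would wrap a call to the trained model.

```python
import numpy as np

def recursive_forecast(predict_one, window, n_steps):
    """Forecast n_steps ahead by feeding each prediction back into the window.

    predict_one: callable that maps a 1-D window to the next value.
    """
    window = np.asarray(window, dtype=float).ravel().copy()
    forecasts = []
    for _ in range(n_steps):
        next_value = float(predict_one(window))
        forecasts.append(next_value)
        # Drop the oldest value and append the newest prediction.
        window = np.append(window[1:], next_value)
    return np.array(forecasts)

def linear_extrapolation(window):
    """Stand-in predictor: continue the slope of the last two points."""
    return 2 * window[-1] - window[-2]

forecast = recursive_forecast(linear_extrapolation, [1.00, 0.99, 0.98], 3)
print(forecast)  # approximately [0.97 0.96 0.95]
```

Because every step consumes earlier predictions, errors accumulate with the horizon, so a model with a slightly lower one-step MSE does not necessarily give the better multi-step forecast.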
While I'm at it, I'll compare it with an LSTM. First, change the shape of the target data from (N,) to (N, 1).
X_train = X_train.reshape([X_train.shape[0], X_train.shape[1], 1])
X_test = X_test.reshape([X_test.shape[0], X_test.shape[1], 1])
y_train = y_train.reshape([y_train.shape[0], 1])
y_test = y_test.reshape([y_test.shape[0], 1])
print(X_train.shape,X_test.shape,y_train.shape,y_test.shape)
>> (500, 50, 1) (873, 50, 1) (500, 1) (873, 1)
The parameters of the LSTM are as follows.
hidden units: 3
optimizer: rmsprop
loss function: mean squared error
batch size: 50
epochs: 100
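To get a feeling for how small this model is, the number of trainable weights can be counted by hand. The sketch below assumes the standard LSTM layout (four gates, each with input weights, recurrent weights and a bias) followed by a dense layer with one output; the function names are hypothetical.

```python
def lstm_param_count(n_input, n_hidden):
    """4 gates, each with input weights, recurrent weights and a bias."""
    return 4 * (n_hidden * (n_input + n_hidden) + n_hidden)

def dense_param_count(n_input, n_output):
    """One weight per input-output pair plus one bias per output."""
    return n_input * n_output + n_output

total_params = lstm_param_count(1, 3) + dense_param_count(3, 1)
print(total_params)  # 64
```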
Building and training the model looks like this.
#LSTM model construction
length_of_sequence = X_train.shape[1]
in_out_neurons = 1
n_hidden = 3
model = Sequential()
model.add(LSTM(n_hidden, batch_input_shape=(None, length_of_sequence, in_out_neurons),
               return_sequences=False))
model.add(Dense(1))
model.add(Activation("linear"))
model.compile(loss="mean_squared_error", optimizer="rmsprop")

%%time
#Model learning (the %%time cell magic must be the first line of its cell)
early_stopping = EarlyStopping(monitor='val_loss', mode='auto', patience=20)
history = model.fit(X_train, y_train,
                    batch_size=50,
                    epochs=100,
                    validation_split=0.1,
                    callbacks=[early_stopping,PlotLossesKeras()]
                    )
Using a GPU (GTX 1070), training took 1 min 27 s. The loss during training evolved as follows.
Here is the result of predicting the discharge capacity of the test data (CS2-36) with this model, as in the Qore case.
The prediction is similarly good. The MSE is 0.00021017. Next, as before, forecast the discharge capacity for cycles 500 to 550.
initial = X_test[500]
results = []
for i in tqdm(range(50)):
    if(i == 0):
        initial = initial.reshape(1,50,1)
        res = model.predict(initial)
        results.append(res[0][0])
    else:
        # Drop the oldest value and append the latest prediction
        initial = initial.reshape(50,1)
        initial = np.vstack((initial[1:],np.array(res)))
        initial = initial.reshape(1,50,1)
        res = model.predict([initial])
        results.append(res[0][0])
The forecast works, but Qore seems to predict better in this case.
Finally, here is a plot of the results of the time series forecast, along with the Qore case.
In this article, we made a time-series prediction of the deterioration process (decrease in discharge capacity) of a lithium-ion battery using the Qore SDK, and compared it with the case of using the commonly used LSTM.
The error (MSE) on the test data and the training time are as follows.
| | Qore SDK | LSTM |
|---|---|---|
| MSE | 2.5365e-4 | 2.1017e-4 |
| Learning time | 1.78 s | 87 s |
The error is slightly smaller with the LSTM, but training is overwhelmingly faster with Qore (roughly 50 times faster here). It is impressive that it learns this well without any tuning of the learning parameters.
The battery deterioration prediction made here is actually still insufficient, probably because only the discharge capacity is used for the time-series prediction.
In fact, many other parameters are related to battery capacity. For example, the following data can be obtained from this dataset.
From the upper left: the discharge capacity, internal resistance, constant-current charging time, and constant-voltage charging time, each plotted against the charge / discharge cycle. A multivariate time-series analysis that includes these data should predict deterioration more accurately. (I ran out of steam this time ...)
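As a starting point for such a multivariate analysis, the windowing could be extended so that each time step carries several features. This is only a sketch under assumptions: `make_multivariate_windows` is a hypothetical helper, and the two toy columns stand in for quantities such as capacity and internal resistance (with pandas, the feature array could come from something like `df[columns].to_numpy()`).

```python
import numpy as np

def make_multivariate_windows(features, target, width):
    """Build windows of shape (samples, width, n_features) and the
    target value that follows each window.

    features: array of shape (n_cycles, n_features)
    target:   array of shape (n_cycles,)
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    n_windows = len(features) - width
    X = np.stack([features[i:i + width] for i in range(n_windows)])
    y = target[width:]
    return X, y

# Toy data: 8 cycles, 2 features (stand-ins for capacity and resistance).
capacity = np.linspace(1.10, 1.03, 8)
resistance = np.linspace(0.070, 0.077, 8)
X_multi, y_multi = make_multivariate_windows(
    np.column_stack([capacity, resistance]), capacity, width=3)
print(X_multi.shape, y_multi.shape)  # (5, 3, 2) (5,)
```

The resulting array already has the (number of samples, time steps, features) layout used above, with the last dimension now larger than 1.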
Thank you to Quantum Core for providing this API. I am an amateur at time-series analysis, but it was very interesting! The API is available for a limited time, so give it a try if you are interested.
See you next time!