[PYTHON] Try to forecast power demand by machine learning

Introduction

I tried to see how well power demand can be predicted with machine learning, so I will summarize the procedure here.

Postscript

Half a year has passed since I wrote this article and I have learned a lot in the meantime, so I wrote a Part 2: Predicting power demand with machine learning Part 2

Postscript

The following article, written on 2017/12/22, is also available, so please refer to it as well: Power usage forecast with TensorFlow with Keras

Environment used

Machine: MacBook Air Mid 2012
Language: Python 3.5.1
Run: Jupyter notebook 4.2.1
Library: scikit-learn on Anaconda 4.1.0

Data collection

Power demand

First, download the power demand data from the TEPCO website.

http://www.tepco.co.jp/forecast/html/download-j.html

The file name was "juyo-2016.csv". The downloaded data is a CSV containing the date, time, and actual power demand, but it also contains some extra text, so remove that and save the file as UTF-8.

The format of the saved data is as follows.

DATE TIME Performance(10,000 kW)
2016/04/01 00:00 234
2016/04/01 01:00 235
...

Temperature

Next, while looking for reference data, I found that past weather data can be downloaded from the Japan Meteorological Agency, so download the temperature data for the same period and interval as the power demand data.

http://www.data.jma.go.jp/gmd/risk/obsdl/index.php

The file name was "data.csv". The downloaded data is a CSV containing the date and time, temperature, quality information, and homogeneity number, but it also contains some extra text, so remove that and save the file as UTF-8.

The format of the saved data is as follows.

Date and time temperature(℃)
2016-04-01 00:00 15
2016-04-01 01:00 16
...

Data capture

python


import pandas as pd
import numpy as np

#Reading power data
kw_df = pd.read_csv("juyo-2016.csv")

#Reading temperature data
temp_df = pd.read_csv("data.csv")

python


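import pandas as pd

#Read the power data directly from the TEPCO site (Shift_JIS encoded, skipping the first 2 lines)
url = "http://www.tepco.co.jp/forecast/html/images/juyo-2016.csv"

kw_df = pd.read_csv(url, encoding="shift_jis", skiprows=2)
kw_df.head()

#Read the downloaded temperature file as it is (Shift_JIS encoded, skipping the first 4 lines)
file = "data.csv"
temp_df = pd.read_csv(file, encoding="shift_jis", skiprows=4)
#Keep only the first two columns and rename them
temp_df = temp_df[temp_df.columns[:2]]
temp_df.columns = ["Date and time","temperature(℃)"]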

Data processing

Before applying machine learning, combine the data and convert it into numerical features. First, the date is not usable as it is, so convert it to the day of the week.

[Example] Monday -> 0, Tuesday -> 1, and so on (the numbering returned by weekday())

Next, since the data is hourly, convert the time to the hour as a number.

[Example] 0:00 -> 0, 1:00 -> 1, and so on

python


#Data combination
df = kw_df
df["temperature"] = temp_df["temperature(℃)"]

#Acquisition of day of the week data

import datetime

pp = df["DATE"]
tmp = []

for i in range(len(pp)):
    d = datetime.datetime.strptime(pp[i], "%Y/%m/%d")
    tmp.append(d.weekday())
    
df["weekday"] = tmp

#Acquisition of time data

pp = df["TIME"]
tmp = []

for i in range(len(pp)):
    d = datetime.datetime.strptime(pp[i], "%H:%M")
    tmp.append(d.hour)
    
df["hour"] = tmp
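
As an alternative sketch, the same features can be derived without explicit loops, and the two tables can be joined on their timestamps instead of by row position. The small frames below are made-up stand-ins that follow the saved formats shown earlier, and the column name `datetime` is an assumption introduced only for this illustration.

```python
import pandas as pd

# Tiny made-up samples in the saved formats shown earlier (not real data)
kw_df = pd.DataFrame({
    "DATE": ["2016/04/01", "2016/04/01", "2016/04/02"],
    "TIME": ["00:00", "01:00", "00:00"],
    "Performance(10,000 kW)": [234, 235, 240],
})
temp_df = pd.DataFrame({
    "Date and time": ["2016-04-01 00:00", "2016-04-01 01:00", "2016-04-02 00:00"],
    "temperature(℃)": [15, 16, 14],
})

# Build a common timestamp key ("datetime" is a name chosen here for illustration)
kw_df["datetime"] = pd.to_datetime(kw_df["DATE"] + " " + kw_df["TIME"], format="%Y/%m/%d %H:%M")
temp_df["datetime"] = pd.to_datetime(temp_df["Date and time"], format="%Y-%m-%d %H:%M")

# Join on the timestamp so that missing or reordered rows cannot misalign the data
df = kw_df.merge(temp_df[["datetime", "temperature(℃)"]], on="datetime", how="inner")
df = df.rename(columns={"temperature(℃)": "temperature"})

# Vectorized feature extraction: Monday = 0 ... Sunday = 6, hour = 0 ... 23
df["weekday"] = df["datetime"].dt.weekday
df["hour"] = df["datetime"].dt.hour

print(df[["DATE", "TIME", "temperature", "weekday", "hour"]])
```

Joining on the timestamp is slightly more work than copying the column, but it does not rely on both files having exactly the same rows in the same order.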

Creation of training data and test data

Create training data and test data from the processed data. Here, the input variables are "temperature", "day of the week", and "time", and the output variable is the power demand.

The columns of the processed data are, in order, "DATE", "TIME", "Performance(10,000 kW)", "temperature", "weekday", and "hour". Counting from 0, the inputs (explanatory variables) are columns 3, 4, and 5, and the output (target variable) is "Performance(10,000 kW)" in column 2. The input data is also standardized for machine learning.

python


#input
pp = df[["temperature","weekday","hour"]]
#.values works on both old and new pandas (as_matrix() was removed in pandas 1.0)
X = pp.values.astype('float')

#output
pp = df["Performance(10,000 kW)"]
y = pp.values.flatten()

#Load the train/test split function
try:
    from sklearn.model_selection import train_test_split  # scikit-learn 0.18 and later
except ImportError:
    from sklearn.cross_validation import train_test_split  # older releases

#Split the labeled data into a training set (X_train, y_train) and a test set (X_test, y_test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.2, random_state=42)

#Load the standardization module
from sklearn.preprocessing import StandardScaler

#Fit the scaler on the training data only, then apply it to both sets
scaler = StandardScaler()
scaler.fit(X_train)

X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
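
To make the standardization step concrete, here is a minimal sketch of what StandardScaler computes. The feature rows are made-up values, not the real data. The scaler learns the mean and standard deviation of each column from the training set and then applies (x - mean) / std to any data passed to transform().

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up feature rows: temperature, weekday, hour (not real data)
X_train = np.array([[15.0, 4, 0], [16.0, 4, 1], [14.0, 5, 0], [20.0, 5, 12]])
X_test = np.array([[18.0, 6, 6]])

scaler = StandardScaler()
scaler.fit(X_train)  # learns the per-column mean and standard deviation

# transform() applies (x - mean) / std using the training statistics only
manual = (X_test - X_train.mean(axis=0)) / X_train.std(axis=0)
scaled = scaler.transform(X_test)

print(np.allclose(scaled, manual))
```

Fitting on the training set only keeps information about the test set out of the preprocessing step.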

Learning

Let's try machine learning with SVM.

python


#Module loading
from sklearn import svm
model = svm.SVC()

#Learning
model.fit(X_train, y_train)
#Use model.score(X, y) to get the prediction accuracy
print(model.score(X_test,y_test))

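The score was "0.00312989045383"!! I was surprised at how low it was. You might think this is no good... Note, however, that svm.SVC is a classifier, so this score is the fraction of test samples whose predicted value exactly matches the actual value.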
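
As a point of comparison, a regression model such as svm.SVR predicts a continuous value instead of a class label. The following is a minimal sketch on synthetic data, not the method used in this article. The data-generating formula, the simple split, and the value of C are all assumptions made up for illustration.

```python
import numpy as np
from sklearn import svm
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (not TEPCO data): demand depends on hour and temperature
rng = np.random.RandomState(42)
n = 400
temperature = rng.uniform(0, 35, n)
weekday = rng.randint(0, 7, n)
hour = rng.randint(0, 24, n)
X = np.column_stack([temperature, weekday, hour]).astype(float)
y = (3000 + 20 * np.abs(temperature - 18)
     + 300 * np.sin(hour / 24 * 2 * np.pi) + rng.normal(0, 30, n))

# Simple split: first 320 rows for training, the rest for testing
X_train, X_test = X[:320], X[320:]
y_train, y_test = y[:320], y[320:]

scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# SVR predicts a continuous value; C is an arbitrary choice for this sketch
model = svm.SVR(kernel="rbf", C=1000)
model.fit(X_train_s, y_train)

# For regressors, score() returns the coefficient of determination (R^2)
r2 = model.score(X_test_s, y_test)
result = model.predict(X_test_s)
print(r2)
```

With a regressor, the score measures how much of the variation in demand is explained, so it is not directly comparable to the accuracy reported above.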

Confirmation of forecast results

To see what the predictions actually look like, I drew a graph and checked it.

python


#Prediction on the test data
result = model.predict(X_test)

#Error calculation
pp = pd.DataFrame({'kw': np.array(y_test), "result": np.array(result)})
pp["err"] = pp["kw"] - pp["result"]

pp.plot()

(Figure: line plot of kw, result, and err for the test data)

Somehow, the predictions seem to follow the actual values fairly well (^-^) Next, check the prediction errors as numbers.

python


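#Find the maximum, minimum, and average of the signed error
err_max = 0
err_min = 50000
err_ave = 0

for i in range(len(pp)):
    if err_max < pp["err"][i]:
        err_max = pp["err"][i]
    if err_min > pp["err"][i]:
        err_min = pp["err"][i]
    err_ave += pp["err"][i]

print(err_max)
print(err_min)
#Note: after the loop, i equals len(pp) - 1, so this divides by one less than the number of samples.
#Dividing by len(pp) instead would give the exact mean.
print(err_ave / i)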

The execution result is as follows.

1571
-879
114.81661442

Well, how should this result be judged? I can't tell whether it is good or bad... (-_-;)
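
One way to make the result easier to judge is to summarize the errors with measures that do not let positive and negative errors cancel out. The sketch below uses made-up actual and predicted values, not the results above, and computes the mean absolute error, the root mean squared error, and the mean absolute percentage error.

```python
import numpy as np

# Made-up actual and predicted demand values in units of 10,000 kW (not real results)
actual = np.array([3000.0, 3200.0, 2800.0, 3500.0])
predicted = np.array([2900.0, 3300.0, 2800.0, 3150.0])

err = actual - predicted

mae = np.mean(np.abs(err))                   # mean absolute error
rmse = np.sqrt(np.mean(err ** 2))            # root mean squared error
mape = np.mean(np.abs(err) / actual) * 100   # mean absolute percentage error

print(mae, rmse, mape)
```

A percentage error in particular relates the size of the error to the level of demand, which may be easier to interpret than the raw maximum and minimum.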

I need to think about this more, but if you had to forecast power demand with only a small amount of past data, I think this would be a reasonably good result.

By the way, the Jupyter Notebook file and the processed CSV file have been released on GitHub, so please refer to them as well.

https://github.com/shinob/predict_kw
