[PYTHON] Predict power demand with machine learning Part 2

Introduction

It has been a long time since I posted the earlier article with the same title, and that one is getting a little stale, so here is a new version.

Postscript

Please also refer to the following article, written on 2017/12/22: Power usage forecast with TensorFlow with Keras

Data collection

Power demand

First, download the power demand data from the TEPCO website.

http://www.tepco.co.jp/forecast/html/download-j.html

http://www.tepco.co.jp/forecast/html/images/juyo-2016.csv

Changing the year in the URL gives you the 2014 data as well.

http://www.tepco.co.jp/forecast/html/images/juyo-2014.csv

The downloaded data is a CSV file containing the date, the time, and the actual power demand.

Alternatively, you can fetch the files from the command line.

shell


$ curl -O http://www.tepco.co.jp/forecast/html/images/juyo-2014.csv
$ curl -O http://www.tepco.co.jp/forecast/html/images/juyo-2016.csv
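
For readers who prefer to stay in Python, the same download could be sketched with the standard library. The helper names `tepco_csv_url` and `download_tepco_csv` are hypothetical, the URL pattern is simply the one from the links above, and whether TEPCO still serves files at these addresses is an assumption.

```python
from urllib.request import urlretrieve

# URL pattern copied from the links above; assumed to hold for each listed year.
BASE_URL = "http://www.tepco.co.jp/forecast/html/images/juyo-{year}.csv"


def tepco_csv_url(year):
    """Build the download URL for one year of demand data."""
    return BASE_URL.format(year=year)


def download_tepco_csv(year):
    """Download one year of data into the current directory and return the file name."""
    filename = "juyo-{}.csv".format(year)
    urlretrieve(tepco_csv_url(year), filename)
    return filename


# Only print the URLs here; no network access happens until
# download_tepco_csv() is actually called.
for year in (2014, 2016):
    print(tepco_csv_url(year))
```

Calling `download_tepco_csv(2014)` would then save the file as juyo-2014.csv, the name the loading code expects.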

Temperature

Like last time, we will download the past weather data from the Japan Meteorological Agency.

http://www.data.jma.go.jp/gmd/risk/obsdl/index.php

Select "Tokyo" as the location and "hourly values" and "temperature" as the items. Download two periods, 2013/12/31 to 2015/1/1 and 2015/12/31 to 2017/1/1, and save them as data-2014.csv and data-2016.csv respectively. The periods are deliberately a little longer than needed because they are narrowed down later to match the power data.

The downloaded data is a CSV file with columns such as the date and time, the temperature, quality information, and a homogeneity number.

Unlike the power data, this one apparently has to be downloaded from the site through the browser in the usual way.

Data reading

Library

python


import pandas as pd
import numpy as np
import datetime as dt
import math

Power demand

First, load the 2014 power data.

python


filename = "juyo-2014.csv"

#The file is encoded in Shift JIS; skip the leading lines that are not data
df = pd.read_csv(filename,encoding="SHIFT-JIS",skiprows=2)

#Rename the columns
df.columns = ["DATE","TIME","KW"]

#The date and time are in separate columns, so join them, parse them as a datetime, and use the result as the index
df.index = df.index.map(lambda x: dt.datetime.strptime(df.loc[x].DATE + " " + df.loc[x].TIME,"%Y/%m/%d %H:%M"))

#Month (1-12)
df["MONTH"] = df.index.month

#Day of the week (Monday = 0)
df["WEEK"] = df.index.weekday

#Hour of the day (0-23)
df["HOUR"] = df.index.hour

df_kw = df
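
The row-by-row `strptime` call above works, but it can get slow on a full year of hourly data. As an alternative sketch, `pd.to_datetime` can parse the combined column in one pass. The three sample rows below are made up for illustration and only mimic the DATE, TIME, KW layout described earlier; `df_sample` is a hypothetical name.

```python
import io

import pandas as pd

# Made-up sample rows in the DATE, TIME, KW layout (not real TEPCO values).
sample_csv = io.StringIO(
    "DATE,TIME,KW\n"
    "2014/4/1,0:00,2500\n"
    "2014/4/1,1:00,2400\n"
    "2014/4/2,13:00,3300\n"
)
df_sample = pd.read_csv(sample_csv)

# Join the date and time strings and parse the whole column at once.
df_sample.index = pd.to_datetime(
    df_sample["DATE"] + " " + df_sample["TIME"], format="%Y/%m/%d %H:%M"
)

# Derive the same calendar features as in the article.
df_sample["MONTH"] = df_sample.index.month
df_sample["WEEK"] = df_sample.index.weekday  # Monday = 0
df_sample["HOUR"] = df_sample.index.hour

print(df_sample[["KW", "MONTH", "WEEK", "HOUR"]])
```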

Temperature

Next, load the temperature data for 2014.

python


filename = "data-2014.csv"

#The file is encoded in Shift JIS; skip the leading lines and keep only the first 2 columns (selected by position)
df = pd.read_csv(filename,encoding="SHIFT-JIS",skiprows=4).iloc[:, [0, 1]]

#Convert column names
df.columns = ["DATE","TEMP"]

#Convert date and time data to date and time type and specify it as an index
df.index = df.index.map(lambda x: dt.datetime.strptime(df.loc[x].DATE,"%Y/%m/%d %H:%M:%S"))

df_temp = df

Combine power demand and temperature data

python


d1 = df_kw.index.min()
d2 = df_kw.index.max()

#.ix was removed in pandas 1.0; .loc does the same label-based slicing
df_kw["TEMP"] = df_temp.loc[d1:d2].TEMP
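
The assignment above relies on pandas aligning the two frames on their datetime index: each power row picks up the temperature with the same timestamp, and rows without a match become NaN. A toy illustration with made-up values (the frame names `kw` and `temp` are hypothetical):

```python
import pandas as pd

# Made-up hourly power values for three timestamps.
kw = pd.DataFrame(
    {"KW": [2500, 2400, 2350]},
    index=pd.to_datetime(["2014-04-01 00:00", "2014-04-01 01:00", "2014-04-01 02:00"]),
)

# Made-up temperatures covering a slightly different period, with 02:00 missing.
temp = pd.DataFrame(
    {"TEMP": [9.5, 9.0, 8.7]},
    index=pd.to_datetime(["2014-03-31 23:00", "2014-04-01 00:00", "2014-04-01 01:00"]),
)

# Restrict the temperatures to the power data's period, then assign by index.
d1, d2 = kw.index.min(), kw.index.max()
kw["TEMP"] = temp.loc[d1:d2, "TEMP"]

print(kw)
# Rows that found no matching temperature show up as NaN.
print("missing temperatures:", kw["TEMP"].isna().sum())
```

Running the same `isna().sum()` check on the real data before training is a cheap way to confirm that the two files really cover the same hours.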

Data processing

Extract the input and output data used for machine learning. Since we are predicting power demand, the KW column is the output and the MONTH, WEEK, HOUR, and TEMP columns are the inputs.

python


#Columns used as input
X_cols = ["MONTH","WEEK","HOUR","TEMP"]

#Column used as output
y_cols = ["KW"]

#Extract the input/output data as NumPy arrays (as_matrix() was removed in pandas 1.0)
X = df_kw[X_cols].to_numpy().astype('float')
y = df_kw[y_cols].to_numpy().astype('int').flatten()

Divide into training data and validation data.

python


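#sklearn.cross_validation was removed in scikit-learn 0.20; train_test_split now lives in model_selection
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.1, random_state=42)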

Standardize the input data (zero mean and unit variance per column).

python


from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
scaler.fit(X_train)

X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
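
What `StandardScaler` does here can be written out by hand: it learns the per-column mean and standard deviation from the training data only, and reuses those statistics for any later data. A minimal NumPy sketch with made-up numbers (all variable names are hypothetical):

```python
import numpy as np

# Made-up feature rows: [HOUR, TEMP].
X_train_demo = np.array([[0.0, 10.0], [12.0, 20.0], [23.0, 30.0]])
X_test_demo = np.array([[6.0, 15.0]])

# "fit": the statistics come from the training data only.
mean = X_train_demo.mean(axis=0)
std = X_train_demo.std(axis=0)  # population standard deviation, as StandardScaler uses

# "transform": apply the same statistics to both sets.
X_train_scaled = (X_train_demo - mean) / std
X_test_scaled = (X_test_demo - mean) / std

print(X_train_scaled.mean(axis=0))  # approximately 0 for each column
print(X_train_scaled.std(axis=0))   # approximately 1 for each column
print(X_test_scaled)
```

Reusing the training statistics keeps the test inputs on the same scale the model was trained on.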

Learning

Train a regression model, here a random forest.

python


from sklearn.ensemble import RandomForestRegressor

model = RandomForestRegressor()
model.fit(X_train, y_train)
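
A random forest is built with random sampling, so without a fixed seed the score changes slightly from run to run. The sketch below shows how the result could be pinned down with `random_state`, using synthetic data. The data-generating formula is invented purely for illustration and is not a model of real demand, and all `demo` names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic inputs: hour of day and temperature.
hour = rng.integers(0, 24, size=400)
temp = rng.uniform(0.0, 35.0, size=400)
X_demo = np.column_stack([hour, temp]).astype(float)

# Invented target: demand rises as the temperature moves away from 18 degrees.
y_demo = (
    3000
    + 40 * np.abs(temp - 18)
    + 300 * np.sin(np.pi * hour / 24)
    + rng.normal(0, 20, size=400)
)

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.1, random_state=42)

# Fixing random_state makes the fitted forest reproducible.
demo_model = RandomForestRegressor(n_estimators=100, random_state=0)
demo_model.fit(X_tr, y_tr)

demo_score = demo_model.score(X_te, y_te)
print(demo_score)
```

Note that the default number of trees differs between scikit-learn versions, so setting `n_estimators` explicitly also helps when comparing scores across environments.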

Forecast

Calculate the score (the coefficient of determination, R^2) on the held-out test data.

python


print(model.score(X_test,y_test))

The score was "0.91601162513664502" (^-^)
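
For a regressor, `score` returns the coefficient of determination R^2, where 1.0 means a perfect fit. It can be computed by hand as in the sketch below; the helper name `r2` and the numbers are made up for illustration.

```python
import numpy as np


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Made-up demand values and predictions.
actual = [3000, 3200, 3400, 3600]
predicted = [3050, 3150, 3450, 3550]

print(r2(actual, predicted))
```

Because R^2 is unitless, it says how much of the variation is explained but not how many kW the forecast is off by.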

Confirmation of forecast results

Let's graph the prediction result and the actual data and check it.

python


#Prediction result
result = model.predict(X_test)

#Convert to data frame
df_result = pd.DataFrame({
    "y_test":y_test,
    "result":result
})

#Graph library
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt

#Graph drawing
df_result.plot(figsize=(15, 3))

[Figure: line plot of the predicted values (result) against the actual values (y_test) for the test data]

The prediction seems to track the actual values, but with this many points it is hard to tell how well.

Plot only the first 20 points and check again.

python


#Graph drawing
df_result[:20].plot(figsize=(15, 3))

[Figure: the same plot restricted to the first 20 test points]

That looks pretty good!

Forecast using 2016 data

Data reading

Load the 2016 data using the same procedure as the 2014 data.

python


#Power demand
filename = "juyo-2016.csv"

df = pd.read_csv(filename,encoding="SHIFT-JIS",skiprows=2)
df.columns = ["DATE","TIME","KW"]
df.index = df.index.map(lambda x: dt.datetime.strptime(df.loc[x].DATE + " " + df.loc[x].TIME,"%Y/%m/%d %H:%M"))
df["MONTH"] = df.index.month
df["WEEK"] = df.index.weekday
df["HOUR"] = df.index.hour

#Use only the April data (copy so that adding the TEMP column later does not modify a view of df)
df_kw = df[df.index.month == 4].copy()

#Temperature
filename = "data-2016.csv"

df = pd.read_csv(filename,encoding="SHIFT-JIS",skiprows=4).iloc[:, [0, 1]]
df.columns = ["DATE","TEMP"]
df.index = df.index.map(lambda x: dt.datetime.strptime(df.loc[x].DATE,"%Y/%m/%d %H:%M:%S"))

df_temp = df

#Data join
d1 = df_kw.index.min()
d2 = df_kw.index.max()
df_kw["TEMP"] = df_temp.loc[d1:d2].TEMP

Data processing

python


#Acquisition of input / output data
X = df_kw[X_cols].to_numpy().astype('float')
y = df_kw[y_cols].to_numpy().astype('int').flatten()

X_test = scaler.transform(X)
y_test = y

Forecast

Predict and calculate scores using a model trained with 2014 data.

python


model.score(X_test,y_test)

The result was "0.82435418225963963", which was a little lower.

Confirmation of forecast results

python


#Prediction result
result = model.predict(X_test)

#Convert to data frame
df_result = pd.DataFrame({
    "y_test":y_test,
    "result":result
})

#Graph drawing
df_result.plot(figsize=(15, 3))

[Figure: line plot of the predicted values (result) against the actual values (y_test) for April 2016]

This needs a little more work (-_-;)
