[PYTHON] Predict the amount of electricity used in 2 days and publish it in CSV

Introduction

I have given lectures in various places on predicting power consumption with machine learning, and I often hear reactions like "the predicted values aren't actually that accurate, are they?" and "it's just a PoC, right?". Since we received opinions like these, I decided to publish the actual predicted values as CSV files and let their accuracy be verified ...

Prediction method

I have previously published examples of forecasting with several methods in a Qiita article, so please have a look:

Power consumption forecast with Keras (TensorFlow)

Prediction target

Predict the power usage two days ahead in the Chugoku Electric Power area.

Verification of forecast results

The forecasts for two days ahead are published at the URL below so that they can be compared against the electricity-usage figures released by the Chugoku Electric Power Company.

https://blueomega.jp/20200811_power_prediction_challenge/yyyy-mm-dd_.csv

For September 2, 2020, for example, the URL would be: https://blueomega.jp/20200811_power_prediction_challenge/2020-09-02_.csv
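The date-stamped URL can be assembled with a small helper (the function name `forecast_csv_url` is just an illustrative choice, not part of the published code):

```python
import datetime as dt

def forecast_csv_url(date):
    """Build the URL of the published forecast CSV for a given date."""
    base = "https://blueomega.jp/20200811_power_prediction_challenge/"
    return base + date.strftime("%Y-%m-%d") + "_.csv"

print(forecast_csv_url(dt.date(2020, 9, 2)))
# → https://blueomega.jp/20200811_power_prediction_challenge/2020-09-02_.csv
```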

You can compare by running the following script on Colaboratory.

```python
import datetime as dt
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import r2_score

# Fetch actual demand data up to the previous day for the Chugoku Electric Power area
url = "https://www.energia.co.jp/nw/jukyuu/sys/juyo-2020.csv"
df_juyo = pd.read_csv(url, skiprows=2, encoding="Shift_JIS")
df_juyo.index = pd.to_datetime(df_juyo["DATE"] + " " + df_juyo["TIME"])

# Fetch today's actual demand data (Colaboratory runs in UTC, so add 9 hours for JST)
d = dt.datetime.now() + dt.timedelta(hours=9)
url = "https://www.energia.co.jp/nw/jukyuu/sys/juyo_07_" + d.strftime("%Y%m%d") + ".csv"
df_tmp = pd.read_csv(url, skiprows=13, encoding="Shift_JIS", nrows=24)
df_tmp.index = pd.to_datetime(df_tmp.DATE + " " + df_tmp.TIME)

# Fetch the published forecast data from August 31 onward
df = pd.DataFrame()
d = dt.datetime(2020, 8, 31)
while d < dt.datetime.now() + dt.timedelta(days=3):
  try:
    url = "https://blueomega.jp/20200811_power_prediction_challenge/" + d.strftime("%Y-%m-%d") + "_.csv"
    df = pd.concat([df, pd.read_csv(url)])
  except Exception:
    print("No file.")
  d += dt.timedelta(days=1)

df.index = pd.to_datetime(df.pop("datetime"))

# Fill in the actual values up to the previous day
df["act"] = df_juyo["実績(万kW)"]

# Fill in today's actual values
for idx in df_tmp[df_tmp["当日実績(万kW)"] > 0].index:
  df.loc[idx, "act"] = df_tmp.loc[idx, "当日実績(万kW)"]

# Plot forecasts against actuals
df_plot = df.copy()
df_plot = df_plot[["act", "y2"]]
df_plot.columns = ["act", "pred tuned"]

df_plot["2020-08-31":].plot(figsize=(15, 5), ylim=(300, 1200))
plt.show()

# Coefficient of determination
df_scr = df[df.act > 0]
print("Coefficient of determination (R2 score):", r2_score(df_scr.act, df_scr.y2))
```

This is the execution result as of 5:00 on September 3 (plot of forecast vs. actual omitted). Coefficient of determination (R2 score): 0.9494716417755021
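As a sanity check on the metric, the coefficient of determination can be computed by hand and compared with scikit-learn's `r2_score`. The numbers below are toy values for illustration, not the actual demand data:

```python
import numpy as np
from sklearn.metrics import r2_score

act = np.array([500.0, 700.0, 900.0, 800.0])   # toy actual values (10,000 kW)
pred = np.array([520.0, 690.0, 880.0, 810.0])  # toy predicted values

# R2 = 1 - SS_res / SS_tot
ss_res = ((act - pred) ** 2).sum()             # sum of squared residuals
ss_tot = ((act - act.mean()) ** 2).sum()       # total sum of squares
r2_manual = 1 - ss_res / ss_tot

# Matches the library implementation
assert abs(r2_manual - r2_score(act, pred)) < 1e-12
print(round(r2_manual, 4))
# → 0.9886
```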

Forecasts are updated daily at 1:00 and 12:00, so please take a look if you are interested. I would also love to hear from anyone who needs this kind of forecasting capability.

Postscript

This is the execution result as of 6:00 on September 4 (plot omitted). Coefficient of determination (R2 score): 0.9454478929760703

It seems to have gotten slightly worse ...
