[PYTHON] Brute-force semi-optimization of forecasting methods in time series data analysis (SARIMA) [Memo]

Overview

While analyzing some power data, I was asked to "predict power demand using AR, MA, ARIMA, and SARIMA models." Since this was my first time using these forecasting methods in Python, I decided to brute-force the parameters.

The parameters are order = (p, d, q) and seasonal_order = (P, D, Q, s), and I adopted the combination with the smallest AIC.
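
For reference, AIC trades goodness of fit against model size: AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood. The following is a minimal sketch of the "smallest AIC wins" rule. The log-likelihoods and parameter counts are made-up numbers for illustration only, not results from the power data.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical candidates:
# (order, seasonal_order) -> (log-likelihood, number of estimated parameters)
candidates = {
    ((1, 0, 0), (0, 0, 0, 24)): (-520.0, 2),
    ((1, 0, 1), (1, 0, 0, 24)): (-505.0, 4),
    ((2, 0, 2), (1, 0, 1, 24)): (-504.5, 7),
}

# Score every candidate and pick the one with the smallest AIC.
scores = {params: aic(llf, k) for params, (llf, k) in candidates.items()}
best = min(scores, key=scores.get)
print(best, scores[best])
```

In this toy example the largest model has the best likelihood, but the middle one wins because the extra parameters are penalized.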

Source

I adapted a snippet I found on Stack Overflow. In the snippet I picked up, s was 12; perhaps it was written for monthly data? This time, one week of power data is predicted from one month of training data, and s = 24 gave the smallest AIC. I am not sure why.
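
Since the right value of s is unclear, one option is to treat it as one more value to search over. Here is a rough sketch that builds the seasonal candidates for several periods. The candidate list (12 and 24) is an assumption taken from the two values mentioned above, and the name `season_lengths` is introduced here only for illustration.

```python
import itertools

# Candidate seasonal periods: 12 (from the borrowed snippet) and 24 (used here).
season_lengths = [12, 24]

# P, D, Q each range over 0..2, the same range as the non-seasonal orders.
PDQ = list(itertools.product(range(3), repeat=3))

# One (P, D, Q, s) tuple per combination and per candidate period.
seasonal_pdq = [(P, D, Q, s) for s in season_lengths for (P, D, Q) in PDQ]

print(len(seasonal_pdq))  # 27 combinations x 2 periods = 54
```

Each tuple can be passed as seasonal_order to SARIMAX. Note that every extra period adds another full set of fits to an already slow search.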


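import itertools
import time

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Candidate values (0, 1, 2) for each of p, d, q
p = d = q = range(0, 3)

# All (p, d, q) combinations for the non-seasonal part
pdq = list(itertools.product(p, d, q))

# All (P, D, Q, s) combinations for the seasonal part.
# s = 12 is what the snippet I borrowed used; change it to try another period.
seasonal_pdq = [(x[0], x[1], x[2], 12) for x in itertools.product(p, d, q)]

# TargetData is the time series to fit (defined elsewhere)
y = TargetData

# Best AIC found so far and the parameters that produced it
minaic = float('inf')
minparam = None
minparams = None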

starttime = time.time()
for param in pdq:
    for param_seasonal in seasonal_pdq:
        try:
            print('Start fitting')
            mod = sm.tsa.statespace.SARIMAX(y,
                                            order=param,
                                            seasonal_order=param_seasonal,
                                            enforce_stationarity=False,
                                            enforce_invertibility=False)

            results = mod.fit()

            print('SARIMA{}x{} - AIC:{}'.format(param, param_seasonal, results.aic))

            # Keep the parameters with the smallest AIC seen so far
            if minaic > results.aic:
                minaic = results.aic
                minparam = param
                minparams = param_seasonal
            print('min AIC: {} - (p,d,q) at min AIC: {} - (P,D,Q,s) at min AIC: {}'.format(minaic, minparam, minparams))

        except Exception as e:
            # Skip combinations that fail to fit, but report them
            print('Fit failed for {}x{}: {}'.format(param, param_seasonal, e))
            continue

duration = time.time() - starttime
print('Finished estimating p, d, q\nElapsed time: {}'.format(duration))

# Save the best parameters to a text file
with open('results_param(SARIMA).txt', 'w') as f:
    f.write('min AIC: {} - (p,d,q) at min AIC: {} - (P,D,Q,s) at min AIC: {}'.format(minaic, minparam, minparams))

Remaining issues

- I don't know whether the brute-force search actually found the optimum, so the optimality still needs to be evaluated.
- Brute force makes the search dramatically slower, so the speed needs to be improved.
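
One possible way to approach both points, sketched under assumptions rather than tested on the power data: keep every candidate's AIC instead of only the minimum, so you can see how close the runners-up are, and shrink the grid with a heuristic cap on the total order. The names `candidate_grid`, `rank_candidates`, `max_total_order`, and `fit_aic` are hypothetical. In practice `fit_aic` would wrap the SARIMAX fit and return results.aic; here a stand-in scorer is used so the sketch runs without data.

```python
import itertools
import math

def candidate_grid(max_order=2, s=24, max_total_order=4):
    """Yield (order, seasonal_order) pairs with p + q + P + Q <= max_total_order."""
    rng = range(max_order + 1)
    triples = itertools.product(rng, repeat=3)
    for (p, d, q), (P, D, Q) in itertools.product(triples, repeat=2):
        if p + q + P + Q <= max_total_order:
            yield (p, d, q), (P, D, Q, s)

def rank_candidates(grid, fit_aic):
    """Score every candidate and return (aic, order, seasonal_order) sorted by AIC."""
    ranked = []
    for order, seasonal in grid:
        try:
            score = fit_aic(order, seasonal)
        except Exception as exc:
            # A failed fit is reported instead of being silently skipped.
            print('fit failed for', order, seasonal, '-', exc)
            continue
        if math.isfinite(score):
            ranked.append((score, order, seasonal))
    return sorted(ranked)

# Stand-in scorer (NOT a real model fit); replace with a function that fits
# SARIMAX for the given orders and returns results.aic.
def fake_aic(order, seasonal):
    return 1000.0 + 3 * sum(order) + 2 * sum(seasonal[:3])

grid = list(candidate_grid())
top = rank_candidates(grid, fake_aic)[:5]
print(len(grid), 'candidates instead of', 27 * 27)
print(top[0])
```

The cap is only a heuristic: it cuts the number of fits but can exclude the true minimum, so the ranked list is worth inspecting before trusting the winner.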
