[PYTHON] How to deal with problems that occur in future prediction using the SARIMA model (seasonal autoregressive integrated moving average model)

I have never written an article before, but I couldn't find a solution anywhere to a problem I ran into this time, so I'm recording it here for reference.

The problem occurred when making predictions with the SARIMA model, which is used in time series analysis. (This article only covers the part where I got stuck, so please learn how to use the SARIMA model itself elsewhere.)

I fit the SARIMA model to the given time series data as follows:


import statsmodels.api as sm
SARIMA_2_1_3_011 = sm.tsa.SARIMAX(train.y, order=(2,1,3), seasonal_order=(0,1,1,20)).fit()

I then tried to predict the future with this model as follows. (Indexes 169 to 206 lie beyond the range of the original data, and those are the points I wanted to predict.)


pred3 = SARIMA_2_1_3_011.predict(start=169,end=206)

KeyError: 'The `start` argument could not be matched to a location related to the index of the data.'

This error was raised. My English is weak, so I only understood the rough meaning, but the problem seems to be that `start` falls outside the range of the index. However, the site I was using as a reference was able to predict the future in the same way without any problems.

From there I tried various changes, and I pasted the error message into a search and looked through overseas Q&A sites. I think I struggled with it for more than 30 minutes.

The hypothesis I eventually arrived at is that the dates used as the index of the original data are irregularly spaced, so the index for the future periods cannot be determined. (For example, if the index of the original data is 1/1, 1/2, 1/3, the next entries are clearly 1/4, 1/5, and so on. But if it is 1/1, 1/2, 1/4, the next index (date) is ambiguous.)
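As a rough way to check this hypothesis on your own data, the sketch below tests whether a date index advances in equal steps. The example dates and the helper name `has_regular_step` are made up for illustration and are not the article's data.

```python
import pandas as pd

# Hypothetical example indexes, mirroring the 1/1, 1/2, 1/3 vs 1/1, 1/2, 1/4 case.
regular = pd.DatetimeIndex(["2021-01-01", "2021-01-02", "2021-01-03"])
irregular = pd.DatetimeIndex(["2021-01-01", "2021-01-02", "2021-01-04"])


def has_regular_step(index):
    """Return True if every consecutive pair of dates is the same distance apart."""
    steps = pd.Series(index).diff().dropna()  # gaps between neighbouring dates
    return bool(steps.nunique() == 1)         # exactly one distinct gap size


print(has_regular_step(regular))    # True
print(has_regular_step(irregular))  # False
```

If this returns False for the index of `train`, the spacing of the dates is a likely suspect.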

Therefore, I think there are two solutions to this problem.

① Adjust the index somehow so that it becomes regular.
② Append empty rows (index only) to the original data so that the future index already exists (the target variable is, of course, left empty for those rows).

In practice the second method worked for me, so I think this diagnosis and these solutions are probably correct.
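The two approaches can be sketched with pandas alone, as below. This is a minimal illustration under assumptions: the series `y`, its dates, and the horizon `n_future` are hypothetical, and the article does not show the exact code it used. The resulting series would then be passed to `sm.tsa.SARIMAX` in place of `train.y`.

```python
import pandas as pd

# Hypothetical training series with a gap: 2021-01-03 is missing.
y = pd.Series(
    [10.0, 11.0, 13.0, 14.0],
    index=pd.DatetimeIndex(
        ["2021-01-01", "2021-01-02", "2021-01-04", "2021-01-05"]
    ),
    name="y",
)

# Approach 1: put the data on a regular daily grid.
# The missing day appears as a NaN row and the index gets a daily frequency.
y_regular = y.asfreq("D")

# Approach 2: append future index entries with an empty (NaN) target,
# so the rows to be predicted already exist in the index.
n_future = 3  # hypothetical forecast horizon
future_index = pd.date_range(
    y.index[-1] + pd.Timedelta(days=1), periods=n_future, freq="D"
)
y_extended = y.reindex(y.index.append(future_index))

print(y_regular)
print(y_extended)
```

With the second approach the future rows are part of the data given to the model, so `predict` can be called with `start` and `end` values that fall inside the extended index.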

This is my first article, so it may be hard to read; thank you for bearing with me. There are probably points I have missed or gotten wrong, and I would be grateful if anyone who notices them could point them out. I hope this helps anyone who is stuck on the same problem.
