[PYTHON] A study method for beginners to learn time series analysis

For the 2019 Advent Calendar of a personal engineers' study group, I wrote an article on how statistics beginners can study their way up to an introduction to Bayesian statistics, and it was very well received.

So this time I would like to write about how statistics beginners can study time series analysis.

Purpose and target audience of this article

This article is for readers who have studied the basics of statistics to some extent and want to be able to talk confidently about time series analysis, one of the big topics in statistics.

The goal is for you to come away with an overview of time series analysis.

Within time series analysis, the specific goal is to understand the class of models called the **state space model**.

The books introduced here teach through hands-on work, so programming knowledge is essential.

What is time series analysis?

What is time series data?

Time series data, as the name implies, is a series of data that incorporates the concept of time.

The world is full of time series data. I sometimes even wonder whether there is any data at all that does not involve a time axis. After all, even when you throw a die 10,000 times (one throw after another, not all at once), time passes. However, data for which the time axis can safely be ignored, as with 10,000 dice throws, is not treated as time series data.
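To make the contrast concrete, here is a minimal sketch in Python. It compares the lag-1 autocorrelation (how strongly each value is related to the one just before it) of dice throws with that of a random walk. The random walk as a stand-in for "real" time series data, the function name `lag1_autocorr`, and the seed are my own assumptions for illustration.

```python
import random


def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[t] - mean) * (xs[t - 1] - mean) for t in range(1, n))
    return cov / var


rng = random.Random(0)  # fixed seed so the run is reproducible

# Dice throws: each value is independent of the previous one.
dice = [rng.randint(1, 6) for _ in range(10_000)]

# Random walk: each value is the previous value plus noise,
# so the order of the data carries information.
walk = [0.0]
for _ in range(9_999):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))

dice_r1 = lag1_autocorr(dice)
walk_r1 = lag1_autocorr(walk)
print(f"dice lag-1 autocorrelation: {dice_r1:.3f}")
print(f"walk lag-1 autocorrelation: {walk_r1:.3f}")
```

The dice value should come out close to zero and the random walk value close to one. Shuffling the dice throws would lose nothing, while shuffling the random walk would destroy its structure.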

A typical example of time series data is stock prices.

Figure: Nikkei 225 (1970-). Source: [Nikkei Stock Average, Wikipedia](https://ja.wikipedia.org/wiki/%E6%97%A5%E7%B5%8C%E5%B9%B3%E5%9D%87%E6%A0%AA%E4%BE%A1) (You can feel it just by looking at the chart: **the period of high economic growth** ...)

Some people may think, "Time series analysis? That means natural language processing!" Certainly, natural language processing is one of the most prominent applications of time series analysis, thanks to recent advances in machine learning. When you say "Alexa, what is time series analysis?" to that stylish cylindrical piece of interior decor and it answers "Wow, ◯ su !!", that reply is possible thanks to advances in natural language processing technology.

However, this article deliberately leaves natural language processing aside and covers only more general time series data such as stock prices and sales data.

Time Series Analysis Topics and State Space Models

For the big picture of what time series analysis is, see Logics of Blue, the blog of Mr. Baba, author of the book "Basics of Time Series Analysis and State Space Model: Theory and Implementation Learned with R and Stan" (referred to below as "Hayabusa"). It is described there in detail. (It is no exaggeration to say that everything in this article is already contained in that blog ...)

A state space model is, very roughly speaking, a type of statistical model that assumes an invisible "state." The observed values are merely results generated from that state. The state x<sub>t</sub> at time t is generated from the state x<sub>t-1</sub> at the previous time t-1, and the observed value y<sub>t</sub> is generated from the state x<sub>t</sub> at that same time t.
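As a rough sketch of that generating process, the Python snippet below simulates a local level model, one of the simplest state space models: each state is the previous state plus noise, and each observation is the current state plus noise. The choice of this particular model, the function name `simulate_local_level`, and the noise settings are assumptions I made for illustration. They are not taken from the books discussed in this article.

```python
import random


def simulate_local_level(n, state_sd, obs_sd, x0=0.0, seed=0):
    """Simulate n steps of a local level model (illustrative assumption).

    State equation:       x_t = x_{t-1} + w_t,  w_t ~ Normal(0, state_sd^2)
    Observation equation: y_t = x_t + v_t,      v_t ~ Normal(0, obs_sd^2)
    """
    rng = random.Random(seed)
    states, observations = [], []
    x = x0
    for _ in range(n):
        x = x + rng.gauss(0.0, state_sd)  # the state evolves from the previous state
        y = x + rng.gauss(0.0, obs_sd)    # the observation is generated from the state
        states.append(x)
        observations.append(y)
    return states, observations


states, observations = simulate_local_level(n=200, state_sd=1.0, obs_sd=3.0)
print(states[:3])
print(observations[:3])
```

In a real analysis only `observations` would be available. `states` is exactly the invisible part that the model tries to recover.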

In the fishing example in Hayabusa (p.179), the number of fish in the lake on a given day is the state, and the number of fish caught that day is the observed value.

As this example shows, an observed value exists only because there is a state behind it. However, since we cannot see the state, we have no choice but to infer it from the observed values.

Unlike general machine learning, the state space model treats states and observed values statistically, so it is a topic in statistics.

It has a fairly long history, but in recent years it is often discussed within the framework of Bayesian statistics. (For how to study Bayesian statistics, please refer to the article mentioned at the beginning.)
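Inferring the state from the observed values can be sketched with a Kalman filter. The snippet below is a minimal illustration under my own assumptions: a local level model (state = previous state + noise, observation = state + noise) with known noise variances, simulated data instead of real fishing records, and hypothetical names such as `kalman_filter_local_level`. It is not the implementation used in the book.

```python
import math
import random


def kalman_filter_local_level(ys, state_var, obs_var, x0=0.0, p0=1e6):
    """Filtered state estimates for a local level model (illustrative sketch)."""
    x, p = x0, p0  # current state estimate and its variance (p0 large = vague start)
    filtered = []
    for y in ys:
        # Predict: the state is expected to stay put, but uncertainty grows.
        x_pred = x
        p_pred = p + state_var
        # Update: move toward the observation in proportion to the Kalman gain.
        gain = p_pred / (p_pred + obs_var)
        x = x_pred + gain * (y - x_pred)
        p = (1.0 - gain) * p_pred
        filtered.append(x)
    return filtered


def rmse(estimates, truth):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth))


# Simulated data: in practice only ys would be observed.
rng = random.Random(1)
true_states, ys = [], []
x = 0.0
for _ in range(500):
    x += rng.gauss(0.0, 1.0)            # hidden state
    true_states.append(x)
    ys.append(x + rng.gauss(0.0, 3.0))  # noisy observation

estimates = kalman_filter_local_level(ys, state_var=1.0, obs_var=9.0)
raw_rmse = rmse(ys, true_states)
kf_rmse = rmse(estimates, true_states)
print(f"error of raw observations: {raw_rmse:.2f}")
print(f"error of filtered estimates: {kf_rmse:.2f}")
```

Because the filter combines each new observation with what the past observations already suggested, its estimates should sit closer to the hidden state than the raw observations do.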

Study method

Basics of time series analysis

Within time series analysis, the state space model is a relatively difficult model and is not one of the "classical" methods.

As a foundation for time series analysis, I recommend first learning "classical" methods such as the ARMA model to get an intuitive picture.
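To give a taste of the classical approach, here is a minimal Python sketch of an AR(1) model, the simplest special case of ARMA, in which the current value is a multiple of the previous value plus noise. The coefficient 0.7, the function names `simulate_ar1` and `fit_ar1`, and the use of plain least squares (rather than a statistics library) are assumptions I made to keep the example self-contained.

```python
import random


def simulate_ar1(n, phi, noise_sd=1.0, seed=0):
    """Simulate x_t = phi * x_{t-1} + e_t with Gaussian noise e_t."""
    rng = random.Random(seed)
    xs = [0.0]
    for _ in range(n - 1):
        xs.append(phi * xs[-1] + rng.gauss(0.0, noise_sd))
    return xs


def fit_ar1(xs):
    """Least-squares estimate of phi (no intercept) from consecutive pairs."""
    num = sum(xs[t] * xs[t - 1] for t in range(1, len(xs)))
    den = sum(xs[t - 1] ** 2 for t in range(1, len(xs)))
    return num / den


series = simulate_ar1(n=5000, phi=0.7)
phi_hat = fit_ar1(series)

# One-step-ahead forecast: the fitted coefficient times the latest value.
forecast = phi_hat * series[-1]
print(f"estimated phi: {phi_hat:.3f}")
print(f"next-step forecast: {forecast:.3f}")
```

The estimate should land near the 0.7 used in the simulation. A full ARMA model adds moving-average terms on top of this, which is where a dedicated library becomes the practical choice.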

To get a rough feel for the field, I recommend 『現場ですぐ使える時系列データ分析 ~データサイエンティストのための基礎知識』 ("Time Series Data Analysis You Can Use Right Away in the Field: Basic Knowledge for Data Scientists").

The formulas are also easy, so you can read it quickly and get a feel for it.

Grasp the image of the state space model

Once you have a feel for time series analysis, move on to Hayabusa: ["Basics of Time Series Analysis and State Space Model: Theory and Implementation Learned with R and Stan"](https://www.amazon.co.jp/dp/4903814874/ref=cm_sw_em_r_mt_dp_U_2qr2EbRBTCTC5).

The great thing about Hayabusa is that it explains the intuition behind the state space model without relying on mathematical formulas. This is something great books on statistics and machine learning have in common: by conveying the intuition in words, they bring to life theory that tends to be dry.

The highlight of the book is Part 5, which covers the Kalman filter, smoothing, and estimation by maximum likelihood. Its explanation is a masterpiece. Finally, Part 6 covers the estimation of state space models by Bayesian inference.

Reading Hayabusa, I am also in awe of Mr. Baba's abstract thinking and his ability to put ideas into words. I would love to meet him someday.
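To hint at how the Kalman filter and maximum likelihood fit together, here is a hedged Python sketch. The filter's one-step-ahead prediction errors give the likelihood of the data for a given pair of noise variances, and the pair with the highest likelihood is chosen. The local level model, the simulated data, the coarse grid search (a real analysis would use a numerical optimizer), and the name `local_level_loglik` are all assumptions for illustration, not the book's R code.

```python
import math
import random


def local_level_loglik(ys, state_var, obs_var, x0=0.0, p0=1e6):
    """Gaussian log-likelihood of ys under a local level model, via the Kalman filter."""
    x, p = x0, p0
    loglik = 0.0
    for y in ys:
        p_pred = p + state_var   # variance of the predicted state
        f = p_pred + obs_var     # variance of the one-step-ahead prediction error
        v = y - x                # one-step-ahead prediction error
        loglik += -0.5 * (math.log(2.0 * math.pi * f) + v * v / f)
        gain = p_pred / f        # Kalman gain
        x = x + gain * v
        p = (1.0 - gain) * p_pred
    return loglik


# Simulated observations with state variance 1 and observation variance 9.
rng = random.Random(2)
ys, x = [], 0.0
for _ in range(1000):
    x += rng.gauss(0.0, 1.0)
    ys.append(x + rng.gauss(0.0, 3.0))

# Coarse grid of candidate (state_var, obs_var) pairs.
grid = [(q, r) for q in (0.1, 0.5, 1.0, 2.0, 5.0) for r in (1.0, 4.0, 9.0, 16.0, 25.0)]
best_q, best_r = max(grid, key=lambda qr: local_level_loglik(ys, *qr))
print(f"grid maximum: state_var={best_q}, obs_var={best_r}")
```

The grid only shows the principle: parameters that explain the prediction errors well get a higher log-likelihood than parameters that do not.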

A more detailed mathematical understanding of the state-space model

Once Hayabusa has left you feeling that you have "completely understood the state space model," move on to the hard-boiled world of mathematics.

I recommend "Time Series Analysis from the Basics: Kalman Filter, MCMC, and Particle Filter Practiced in R."

It's such a good book that I don't understand why some people give it a low rating. However, it may be a high hurdle as a first book on state space models. Also, since it develops the discussion in terms of Bayesian statistics from the very beginning, it may be difficult if you are not familiar with Bayesian methods.

If you take it on while feeling you have "completely understood" everything, it will carry you off to the world of "I don't understand anything." However, if you go back and forth between this book and Hayabusa, you will come to understand the state space model deeply, in the true sense of the word. Looking back at the Kalman filter learned in Hayabusa from the Bayesian standpoint of this book, you find a very deep connection ...

Summary

Time series analysis is a big topic in statistics, but I think it is still unfamiliar to many people, even though the world is so full of time series. Is that because it is difficult?

The key to time series analysis is that the future is created from the past. And the past and the future are somehow similar.

Round and round it goes, the times go round.
