[PYTHON] I have read 10 books related to time series data, so I will write a book review.

Where each book stands

I was unsure which axes to use, but this time I positioned the books, based purely on my own judgment and bias, along two axes: "target reader level" and "relevance to time series".

[Figure: positioning of each book by target reader level and relevance to time series]

Introduction

At work I apply various data analysis methods to various kinds of data, and when I look at what I actually analyze, a surprising amount of it is time-series data.

The targets of numerical prediction and anomaly detection are often recorded together with a timestamp. To build up my knowledge of time series data, I decided to read through a broad set of books. In this article I review them in terms of "what is written in which book, and how does it relate to the other books?" in the hope that it helps you decide which book to buy.

If you have requests such as "please check where this book stands" or "a list of time-series books has to include this one", please let me know.

This article has a similar concept to Book Guide for Time Series Analysis.

"Which is better, python or R?" For time series data

- Both Python and R have abundant data analysis libraries and packages, and new papers are implemented quickly in both.
- Many visualizations and functions are available in similar form in both Python and R.
- Python has the advantage for deep learning.
- Machine learning is led by Python, with its excellent scikit-learn.
- R also has attractive supporting libraries such as tidymodels and recipes.
- As a "statistical language", R stands out for how easily tests and visualizations can be run.
- Stan for MCMC is available from both R and Python.
- **R leads in time series modeling.**

Given the above, if you want to learn time series analysis, I recommend choosing R.

Book title and overview

11 books, as of April 2020

The links go to Amazon, but they earn me no advertising revenue, so please feel free to use them.

| index | title | Description | code | Author / Publishing |
|---|---|---|---|---|
| 1st book | VAR empirical analysis learned in R | A book for understanding time series models, especially vector autoregressive (VAR) models, using R's VAR function and related tools. | R | Hiroshi Murao: Ohmsha |
| 2nd book | Finance machine learning | A good book that explains analysis methods and ideas related to finance while implementing functions in Python. | python | Marcos Lopez de Prado: Financial and Financial Situation Study Group |
| 3rd book | Econometrics | A book that combines "time series modeling" with a "knowledge dictionary for data analysis". It explains things carefully with figures and formulas, so it is friendly not only for time series analysis and modeling but also for fledgling data scientists. | None | Yoshihiko Nishiyama: Yuhikaku |
| 4th book | Basics of time series analysis and state space model | Learn state space models using R and Stan. Starting with the simplest model, it proceeds step by step through VAR and on to state space models, with code. Written by the author of the easy-to-understand statistics and time-series blog "Logics of Blue". | R | Shinya Baba: Pleiades Publishing |
| 5th book | Measurement time series analysis of economic and finance data | A theory-heavy book. I chose it as my first book, but that did not work out for me. It is best used to reinforce the theory after learning from an introductory time series modeling book. | None | Tatsuyoshi Okimoto: Asakura Shoten |
| 6th book | Time-series data analysis ready to use in the field | A book where you can learn simple, basic time series modeling techniques using R. Because it is so simple, its content is admittedly covered by other time-series books. | R | Daisuke Yokouchi: Technical Review Company |
| 7th book | Time series analysis of point process | A book on the probability distributions and some of the modeling used in time series analysis. It may be better to use it as a dictionary. The priority is low. | None | Takahiro Omi: Kyoritsu Shuppan |
| 8th book | Econometric analysis by R | Starting with how to operate R, it introduces models from autoregressive models through GARCH and VAR. It is a short book, so it is recommended for those who want to get started with time series modeling for now. | R | Junichiro Fukuchi: Asakura Shoten |
| 9th book | Econometrics for empirical analysis | The other books cover predictive modeling, but this book reminded me that in data science it is important first of all to verify "Is it effective? Is there bias? Is the coefficient meaningful?" You learn "empirical analysis" through real-world problems. It is not a time-series modeling book, but it is recommended for beginners in data analysis. | None | Isamu Yamamoto: Chuokeizai-sha |
| 10th book | Time series analysis from the basics | As the mention of filters and MCMC in the subtitle suggests, this book specializes in state space models. Autoregressive modeling is covered only as an introduction. It explains with plenty of figures and code, and suits hands-on work more than formulas and theory. | R | Junichiro Hagiwara: Technical Review Company |

About time series prediction methods and types

The methods used to predict time-series data, including time-series modeling, can be broadly divided into four groups (**this classification is my own arbitrary one**).

  1. Autoregressive system
  2. State space system
  3. Machine learning regression
  4. Deep learning

1. Autoregressive system

AR, ARMA, ARIMA, SARIMA, ARIMAX, SARIMAX, ARCH, GARCH, VAR

This is a group of time-series modeling methods that has been around for a long time, yet it is far from obsolete and is still used today.

These methods express time-series data as a function by decomposing it into several components such as "trend", "autocorrelation (cycle)", "seasonality (periodic cycle)", and "error".

[Figure: a time series decomposed into its components. The figure is output from Prophet.]
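To make the idea of decomposition concrete, here is a minimal sketch in Python using only NumPy. It is not what Prophet does internally. It is a classical additive decomposition with a centered moving average, applied to synthetic monthly data that I made up for illustration (the period of 12, the trend slope, and the helper name `classical_decompose` are all assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical monthly series: linear trend + yearly seasonality + noise.
period = 12
n = 10 * period
t = np.arange(n)
true_trend = 0.5 * t
true_seasonal = 10.0 * np.sin(2 * np.pi * t / period)
y = true_trend + true_seasonal + rng.normal(scale=1.0, size=n)


def classical_decompose(y, period):
    """Additive decomposition: y = trend + seasonal + residual."""
    n = len(y)
    # Centered moving average over one full cycle
    # (a 2 x period-term average when the period is even).
    if period % 2 == 0:
        weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        weights = np.ones(period) / period
    half = len(weights) // 2
    trend = np.full(n, np.nan)  # the edges cannot be averaged and stay NaN
    trend[half:n - half] = np.convolve(y, weights, mode="valid")

    # Seasonal component: mean detrended value at each position in the cycle.
    detrended = y - trend
    seasonal_means = np.array(
        [np.nanmean(detrended[i::period]) for i in range(period)]
    )
    seasonal_means -= seasonal_means.mean()  # center so one cycle sums to zero
    seasonal = np.tile(seasonal_means, n // period + 1)[:n]

    residual = y - trend - seasonal
    return trend, seasonal, residual


trend, seasonal, residual = classical_decompose(y, period)
print("estimated seasonal pattern:", np.round(seasonal[:period], 1))
```

The trend is NaN for the first and last half cycle because a centered average needs data on both sides. Libraries handle these edges in more refined ways.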

To decompose the data into these components, techniques such as **autoregression, unit root tests, d-th order differencing, and model selection by AIC** are required.
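As a rough illustration of two of these steps, the sketch below takes a first difference (d = 1) and then selects the order of an AR model by AIC. It uses only NumPy and a plain least-squares fit, so it is a simplified stand-in for what the R and Python packages do. The synthetic data, the helper name `fit_ar`, and the maximum order of 6 are my own assumptions, and the unit root test is skipped because the data is built so that one difference is known to be enough.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: the increments follow a stationary AR(2) process,
# and the observed series is their cumulative sum (so d = 1).
n = 600
phi_true = np.array([0.6, -0.3])
x = np.zeros(n)
for i in range(2, n):
    x[i] = phi_true[0] * x[i - 1] + phi_true[1] * x[i - 2] + rng.normal()
level = np.cumsum(x)          # non-stationary series that we observe

diffed = np.diff(level, n=1)  # d-th order differencing with d = 1


def fit_ar(y, p, max_p):
    """Least-squares AR(p) fit on a common sample so AIC values are comparable."""
    y = y - y.mean()
    target = y[max_p:]
    if p == 0:
        coef = np.array([])
        resid = target
    else:
        # Column k holds the series lagged by k steps.
        lags = np.column_stack(
            [y[max_p - k:len(y) - k] for k in range(1, p + 1)]
        )
        coef, *_ = np.linalg.lstsq(lags, target, rcond=None)
        resid = target - lags @ coef
    sigma2 = np.mean(resid ** 2)
    # Gaussian AIC up to an additive constant:
    # n * log(sigma^2) + 2 * (number of estimated parameters).
    aic = len(target) * np.log(sigma2) + 2 * (p + 1)
    return coef, aic


max_p = 6
results = {p: fit_ar(diffed, p, max_p) for p in range(max_p + 1)}
best_p = min(results, key=lambda p: results[p][1])

print("AIC by order:", {p: round(float(aic), 1) for p, (_, aic) in results.items()})
print("selected order:", best_p)
print("AR(2) coefficients:", np.round(results[2][0], 2))
```

AIC tends to prefer slightly larger models, so the selected order is not guaranteed to equal the true order of 2. This is one reason the result of automatic selection still needs to be examined.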

It may seem difficult, but these days the hurdle is low because autoregressive modeling is largely automated. However, theoretical knowledge is still required to understand and interpret the modeling results.

If you do not know what theory the calculation is based on, you will overlook important questions such as:

- Is the automatically selected model actually wrong?
- Is this data something that can be modeled by autoregression in the first place?

(I read several books and realized the importance of understanding the theory of machine learning.)

2. State space system

Here we model time series data based on the idea that each observation is obtained by "probabilistically adding a value" to a quantity called the "state". Two equations are considered: the "observation equation" and the "state equation". When trying to model the temperature in Japan, we can first think of a periodic waveform of hot summers and cold winters.

However, suppose that one summer there was a lot of rain and the average temperature fell.

If the periodic waveform alone is treated as the model of Japan's temperature, such changes cannot be predicted. But if the waveform is corrected using a record of "whether it rained", the drop in temperature may be reflected. The state space approach builds a model based on this idea.

Here, "did it rain?" is formulated as an equation expressing the "state". Since the "state" is assumed to affect the data, there is an advantage when asking "why did the value turn out this way?": checking the value of the state makes the change easier to explain.

The word "filter" that appears with state space models refers to the algorithm used to estimate and correct the "state". When building a state space model, the probability density function corresponding to the posterior can become complicated, and in that case random numbers are generated by MCMC in order to estimate the parameters.
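As a minimal sketch of what a "filter" does, here is a Kalman filter for the simplest state space model, the local level model, written with NumPy only. The noise variances `q` and `r` are hypothetical values that I treat as known. Estimating them is where maximum likelihood or MCMC would come in, and that part is not shown.

```python
import numpy as np

rng = np.random.default_rng(2)

# Local level model (hypothetical parameters):
#   state equation:       state[t] = state[t-1] + w[t],  w ~ N(0, q)
#   observation equation: y[t]     = state[t]   + v[t],  v ~ N(0, r)
n, q, r = 200, 0.1, 4.0
state = np.cumsum(rng.normal(scale=np.sqrt(q), size=n))
y = state + rng.normal(scale=np.sqrt(r), size=n)


def kalman_filter_local_level(y, q, r, m0=0.0, p0=1e6):
    """Return filtered state means and variances for the local level model."""
    m, p = m0, p0  # a large p0 expresses a vague initial belief
    means = np.empty(len(y))
    variances = np.empty(len(y))
    for t, obs in enumerate(y):
        # Predict: the state carries over and its uncertainty grows by q.
        m_pred, p_pred = m, p + q
        # Update: correct the prediction using the new observation.
        gain = p_pred / (p_pred + r)
        m = m_pred + gain * (obs - m_pred)
        p = (1.0 - gain) * p_pred
        means[t], variances[t] = m, p
    return means, variances


means, variances = kalman_filter_local_level(y, q, r)
print("RMSE of raw observations:", np.sqrt(np.mean((y - state) ** 2)))
print("RMSE of filtered state:  ", np.sqrt(np.mean((means - state) ** 2)))
```

Each step first predicts the state and then corrects it with the new observation, which is the "create and correct" cycle described above.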

3. Machine learning regression

Machine learning also offers regression models. Instead of time-series modeling, this approach predicts numerical values using surrounding explanatory variables and values from past time points. Methods such as SVM (SVR) in the past, and more recently GBDT and LightGBM, have been prominent in data analysis competitions. Because they can also solve classification problems, compute quickly, and handle multivariate data, they have a wide range of applications, and I feel they are used quite casually.

If you know time series modeling and its theory, and the problem is easy to formulate, choose time series modeling. If the problem is multivariate or hard to formulate, try regression with a machine learning model instead.

Isn't that a good approach?
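The sketch below shows how "past time points plus surrounding explanatory variables" can be turned into an ordinary regression table. To keep it dependency-free I use ordinary least squares from NumPy as a stand-in for SVR or GBDT, so treat the regressor as a placeholder. The synthetic data, the `rain` variable, and the helper name `make_lag_features` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data: the target depends on its own previous value and on an
# external explanatory variable (for example, "did it rain?").
n = 500
rain = rng.integers(0, 2, size=n).astype(float)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.7 * y[t - 1] - 2.0 * rain[t] + rng.normal(scale=0.5)


def make_lag_features(y, exog, n_lags):
    """Build a supervised-learning table: lags of y, current exog, intercept."""
    rows = [
        np.r_[y[t - n_lags:t][::-1], exog[t], 1.0]  # lag 1..n_lags, exog, 1
        for t in range(n_lags, len(y))
    ]
    return np.array(rows), y[n_lags:]


X, target = make_lag_features(y, rain, n_lags=3)

# Keep the time order when splitting: train on the past, test on the future.
split = int(len(target) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = target[:split], target[split:]

# Ordinary least squares stands in for any regressor (SVR, GBDT, ...).
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
pred = X_test @ coef

rmse_model = np.sqrt(np.mean((pred - y_test) ** 2))
rmse_naive = np.sqrt(np.mean((X_test[:, 0] - y_test) ** 2))  # "same as last value"
print("model RMSE:", rmse_model, "naive RMSE:", rmse_naive)
```

The split is made by time rather than at random, because shuffling would let the model see the future during training.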

Read the time series books to help you decide whether you should choose time series modeling. If you want to learn about machine learning (classification and regression), I recommend the following book.

| index | title | Description | code | Author / Publishing |
|---|---|---|---|---|
| 11th book | Data analysis technology that wins with Kaggle | A well-organized book on how to use the techniques used in Kaggle competitions. | python | Daisuke Kadowaki: Technical Review Company |

4. Deep learning

When deep learning is applied to time series data, RNNs and their improved variant, LSTMs, are often used. In the multivariate case, the problem is sometimes solved as a regression with a multi-layer neural network, following the idea introduced in "3. Machine learning regression". As is common with deep learning in general, difficulties can arise when you are asked to explain the predicted values.

I do not yet know of any books I can recommend here, so I will leave this out. (Would a book like Mr. Sugomori's be a good choice? Book recommendations are welcome.)

Advice I would give my past self, after reading 11 books

If I were to recommend a path to a complete beginner who is about to start time series modeling and analysis, it would be the following. Here I did not consider "prediction (regression) with a machine learning model" and only thought about how to understand "time series modeling".

- Basics of time series analysis and state space model (the "Hayabusa" cover)
  - Simple formulas and simple explanations of the basics of time series and autoregressive models
  - **AR, ARMA, ARIMA, SARIMA, ARIMAX, SARIMAX, VAR (partial), state space, Kalman filter, MCMC**
  - **You can learn all the basics, such as unit roots and the ADF test**
  - Written as a story and easy to read
  - Personally, I found the text easy to understand (easy to read)
- Econometric analysis by R
  - Also explains how to use and operate the R language
  - Covers the autoregressive models understood from the Hayabusa book, plus **ARCH and GARCH**
- Time series analysis from the basics
  - Further deepen your knowledge of state space models
  - **Kalman filter, particle filter, MCMC**
  - Read a supplementary text on the Kalman filter if you want to pursue time series modeling
- VAR empirical analysis learned in R
  - Use the VAR function in R
  - Understand multivariate time series modeling techniques (**VAR**)

I think that, having come this far, you will have enough ability to hold your own with time series modeling methods.

For further progress and growth

- Measurement time series analysis of economic and finance data
  - Understand the mathematical background of time series modeling

Reading as a beginner, by the direction in which you want to grow

- Econometrics
  - Understand the concepts of data analysis, statistics, and machine learning from the basics
- If you want to move on to finance and economics: "**Finance machine learning**"
- If you want to improve as a machine learning engineer: "**Data analysis technology that wins with Kaggle**"

That's all

In this article I read through the time-series books and set out my impressions, reviews, and recommendations.

・ "Time-series data analysis ready to use in the field" is aimed at the basics, and, probably because it is old, it introduces few time-series modeling methods, so I gave it a low priority.
・ "Econometric analysis by R" is also an old book, but it covers many time-series modeling methods and also describes basic R operations, so I thought there was a lot to absorb from it and raised its priority a little.

I hope it helps you in your learning.
