[PYTHON] Let's try analysis! ~ Data scientists also started coding ~ By Fringe81

Introduction

Before getting into the first part, I will briefly explain the purpose and composition of the entire series. The main purpose of this series is to provide the following knowledge and skills:

--Data collection / processing / aggregation / visualization technology (from tool installation to usage).
--Knowledge of machine learning, statistics, and simple practical examples (some mathematical explanations are also available).

By presenting this knowledge in practical use cases, we hope to offer hints for automating, speeding up, stabilizing, and scaling business processes.

We will also try to explain things as carefully as possible. One of our aims is for readers to be able to use the tools and analysis methods themselves.

As shown in the table of contents below (subject to change), this series consists of an introductory edition, a practical edition, and advanced topics (planned).


Introductory edition

Part 1 Introduction

-Chapter 1: Utilization of Analytical Technology: From a Business Perspective
-Chapter 2: Let's take a look at the analysis method!

Part 2 Introduction to Analysis Learned in Excel

-Chapter 3: Cooperation between Excel and MySQL
-Chapter 4: Linking Excel and MySQL 2 importing csv data
-Chapter 5: Utilization of Excel analysis tools and solvers (regression, least squares method)
-Chapter 6: Application of Excel Solver (Portfolio Optimization)
-Bonus 2-1: Reinforcement learning with Excel VBA (Q-learning, ε-greedy / softmax action selection)
-Bonus 2-2: Monte Carlo simulation with Excel VBA (Metropolis method: Example of 2D Ising model)

Part 3 Data visualization (not limited to big data)

-Chapter 7: Visualize data using Google Charts Sankey Diagram

Practical edition

Part 4 Beginning with analysis with Python / Scala

-Chapter 8: Analysis environment created with Python and Eclipse (PyDev) for Windows

Part 5 Let's use the numerical calculation library

Part 6 Let's use Hadoop (Streaming)

Part 7 Play with Scala's Spark / Shark

Part 8 Let's play with the twitter API

Part 9 Let's play with the Facebook API

Part 10 Optimization Analysis

Part 11 Time Series Analysis (Autocorrelation Analysis, Cross Correlation Analysis)

Part 12 Network Analysis

Part 13 Machine learning Supervised learning (clustering)

Part 14 Machine learning Unsupervised learning (reinforcement learning)

Part 15 Monte Carlo Simulation

Advanced topic (provisional)

Part 16 Financial Engineering and Particle Physics (Theory)

Part 17 Econophysics (Theory)

Appendix

-A-1: Install CUDA 6.5 on Windows 7 Professional + Visual Studio Express 2013


What comes next

In the introductory edition, we start with Excel, which is familiar to many people. From there we gradually step up to its analysis tools and Solver, and then to MySQL integration.

After explaining how to use the tools, we will explain the analysis methods that use them. At the end of the introductory edition, we will also introduce visualization tools. The practical edition that follows introduces analysis techniques that use scripting languages and APIs to handle more complex data processing.

Unlike the introductory and practical editions, the advanced topics cover the theoretical side of stochastic processes. Real-world data that changes over time behaves probabilistically, and knowledge of this concept may be helpful in business settings as well.

The above is the overall picture of this series.

The purpose of this series is to introduce specific analytical skills, but before we dive into the details, we will briefly discuss the following in the remaining chapters of Part 1.

--Advantages of using analysis methods for the business cycle.
--Overview of various analytical methods.

From Part 2 onward, the articles explain how to use each tool in detail and go into the individual analysis methods.
