[PyCPA] Python Data Science Practical Course 2nd: Casual Takeaways

Python Data Science Practical Course 2nd

1. Introduction

This is a summary of my own casual takeaways after attending a study session held by PyCPA, a community for CPAs (Certified Public Accountants). The event was Python Data Science Practical Course 2nd Powered by PyCPA. I actually took part in the LT (lightning talk) slot, where I talked about my path in learning programming and my goals for this course, but since the talk ran on nothing more than the momentum of "I will do my best!", I will not publish the slides.

As with last time, this post has little to do with the content of the lecture itself, so if you are interested in that, please see the book used in the lecture and the free content on which the book is based. Book: The University of Tokyo Data Scientist Training Course. Free content: GCI Data Scientist Training Course Exercise Content Public Page.

2. I've heard of the Gini coefficient!

(1) Comprehensive problem 3-2

When I reread the textbook for review, I found that comprehensive problem 3-2 was "Lorenz curve and Gini coefficient". This problem is also included in the free content from the University of Tokyo, so I will quote from that.

Gini coefficient: this seems to be the "Gini value" that appeared in a book I read a long time ago, which said that there is a risk of riots once this measure of income inequality exceeds 0.4.

The numerical measure of the degree of inequality is called the Gini coefficient. It is defined as twice the area of the region enclosed by the Lorenz curve and the 45-degree line, and takes a value from 0 to 1. The higher the value, the greater the degree of inequality.
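To make the definition concrete for myself, here is a minimal sketch in Python. It is not the textbook's answer to comprehensive problem 3-2; the function names `lorenz_curve` and `gini_coefficient` and the sample incomes are my own assumptions for illustration. It builds the Lorenz curve from a list of incomes and takes twice the area between the curve and the 45-degree line, using the trapezoid rule.

```python
import numpy as np


def lorenz_curve(incomes):
    """Return (population_share, income_share) points of the Lorenz curve.

    Hypothetical helper for illustration: incomes are sorted in ascending
    order and both axes are expressed as cumulative shares from 0 to 1.
    """
    x = np.sort(np.asarray(incomes, dtype=float))
    # Prepend 0 so the curve starts at the origin (0% of people, 0% of income).
    cumulative = np.concatenate(([0.0], np.cumsum(x)))
    income_share = cumulative / cumulative[-1]
    population_share = np.linspace(0.0, 1.0, len(x) + 1)
    return population_share, income_share


def gini_coefficient(incomes):
    """Twice the area between the 45-degree line and the Lorenz curve."""
    p, l = lorenz_curve(incomes)
    # Area under the Lorenz curve by the trapezoid rule.
    area_under_curve = np.sum((l[1:] + l[:-1]) * np.diff(p)) / 2.0
    # The area under the 45-degree line is 0.5.
    return 2.0 * (0.5 - area_under_curve)


if __name__ == "__main__":
    # Made-up sample data, not taken from the lecture or the book.
    print(gini_coefficient([1, 1, 1, 1]))  # everyone earns the same
    print(gini_coefficient([0, 0, 0, 1]))  # one person earns everything
```

With four equal incomes the Lorenz curve lies on the 45-degree line, so the result is 0 (up to floating-point rounding). When one of four people earns everything, the result is 0.75, which approaches 1 as the number of people grows.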

Setting the comprehensive problem itself aside, I'm curious about the Gini coefficient, so let's take a look (I hope these posts don't turn into a "digression" every time ...)

(2) Global Note data

First, I searched Global Note, a site I use when looking up public data such as GDP comparisons. Apologies to Global Note, but I only use the data that can be viewed for free. The page is World Gini Coefficient Country Ranking / Transition. Japan is 0.34. The stable Nordic countries have low Gini coefficients, while countries with high economic growth rates, such as China, the United States, and those in South America, seem to have high ones.

(3) OECD data

Global Note only cites the OECD as the source of the original data, so I checked the OECD data to look for the matching dataset. I think this is it: Income inequality.

This site seems to be very convenient. Japan and the United States show little change over time, but Estonia, which has rapidly become an e-state by building IT into its social infrastructure, appears to show a sharp drop in its Gini coefficient.

(4) Bonus

There were easy-to-understand explanations of the Lorenz curve and the Gini coefficient on various sites, including Wikipedia: Wikipedia: Gini coefficient; Gini coefficient in 5 minutes; Indicators for measuring income inequality: Gini coefficient and Lorenz curve.

3. The comprehensive problems may be interesting!

The answers to the comprehensive problems can be found in Appendix 2 of The University of Tokyo Data Scientist Training Course. They look useful for studying Python code (although I haven't worked through them yet). I won't quote them here, so let's do our best!

Unfortunately, I can't attend the next session on February 29th due to a prior engagement, but I would like to keep writing this series alongside PyCPA. Python Data Science Practical Course 3rd Powered by PyCPA

I'm looking forward to the 4th session!!
