With zero years of programming experience, I take on data processing with Python

First, a brief self-introduction. I started studying data science in May 2020.

・ Until May 2020, I had never touched a programming language.
・ I use Excel often at work, so I can handle simple functions.

While studying data science, one thing struck me: there are few places to practice data preprocessing, which seems to be the most burdensome part of real-world work!!

Then, around June, the Data Scientist Society published an ideal set of exercises on GitHub!
Source: The Japan DataScientist Society (General Incorporated Association), "Data Science 100 Knocks (Structured Data Processing)" https://github.com/The-Japan-DataScientist-Society/100knocks-preprocess

As a first step, I would like to work through these 100 knocks in Python, SQL, and R without looking at the answer code. As mentioned above, I am a complete beginner at programming, so there may be a lot of clumsy code, but please be kind.


P-001: Display the first 10 rows, with all columns, of the receipt details data frame (df_receipt), and visually check what kind of data it contains.

In:

# Show the first 10 rows of the receipt details
df_receipt.head(10)

Output result: Screenshot 2020-09-05 18.40.20.png
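For readers who do not have the exercise data loaded, here is a minimal sketch of what `head(10)` does. The data frame below is a toy stand-in: the column names come from the problem statements, but every value is made up for illustration and is not the real receipt data.

```python
import pandas as pd

# Toy stand-in for df_receipt (assumption: values are invented;
# only the column names are taken from the problem statements).
df_receipt = pd.DataFrame({
    "sales_ymd": [20181103 + i for i in range(12)],
    "customer_id": [f"CS{i:06d}" for i in range(12)],
    "product_cd": [f"P{i:09d}" for i in range(12)],
    "amount": [100 * (i + 1) for i in range(12)],
})

# head(n) returns the first n rows with all columns kept.
first_ten = df_receipt.head(10)

print(first_ten)
print(first_ten.shape)    # (number of rows, number of columns)
print(df_receipt.dtypes)  # column types, handy to check alongside head()
```

Checking `shape` and `dtypes` next to `head()` is one way to do the "visual check" the problem asks for, since it also shows how large the table is and how each column is stored.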

P-002: From the receipt details data frame (df_receipt), select the columns sales date (sales_ymd), customer ID (customer_id), product code (product_cd), and sales amount (amount), in that order, and display 10 rows.

In:

# Select the four columns in the requested order, then show the first 10 rows
df_clms = df_receipt[["sales_ymd", "customer_id", "product_cd", "amount"]]
df_clms.head(10)

Output result: Screenshot 2020-09-05 18.43.40.png
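The double brackets in the cell above are easy to misread, so here is a small sketch of how column selection with a list behaves. Again the data frame is a toy stand-in with invented values, and `extra_col` is a hypothetical column added only to show that unselected columns are dropped.

```python
import pandas as pd

# Toy stand-in for df_receipt (assumption: invented values;
# "extra_col" is hypothetical and exists only for this illustration).
df_receipt = pd.DataFrame({
    "amount": [100 * (i + 1) for i in range(12)],
    "extra_col": ["x"] * 12,
    "product_cd": [f"P{i:09d}" for i in range(12)],
    "customer_id": [f"CS{i:06d}" for i in range(12)],
    "sales_ymd": [20181103 + i for i in range(12)],
})

# Passing a list of names returns a DataFrame whose columns
# follow the order of the list, not the order in the original table.
cols = ["sales_ymd", "customer_id", "product_cd", "amount"]
selected = df_receipt[cols].head(10)
print(selected)

# Single brackets with one name give a Series;
# double brackets with one name give a one-column DataFrame.
print(type(df_receipt["amount"]))
print(type(df_receipt[["amount"]]))
```

Chaining `head(10)` directly onto the selection avoids the intermediate `df_clms` variable, though keeping the variable as in the original cell is fine when the selection is reused later.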

I will update it when I have time.
