[PYTHON] How strong is your Qiita presence? Statistics on Contribution counts, as seen in the data

Overview

- I analyzed Qiita's statistics on Contribution counts.
- I made a quick-reference table so you can gauge your own "Qiita strength".
- Users with fewer than 20 Contributions make up about 48% of the users who write articles.
- My own Qiita strength was in about the top 17%.

Preface

Recently I've been programming less at home, and I started to wonder how good my technical skills really are. So, to improve them, I set myself a goal: reach 1000 Contributions on Qiita by the end of this year. Right now, though, I have about 157 Contributions. Given that, how hard is the goal, and is it realistic? To find out, I collected statistics on Qiita Contribution counts and looked at how difficult it would be.

Method

You could crawl users one by one, but instead I extracted the data I wanted from a site called QiitaUserRanking.

https://qiita-user-ranking.herokuapp.com/

The site summarizes how many users have each number of Contributions. First, a quick scrape:

wget -O - https://qiita-user-ranking.herokuapp.com/chart | grep Bar | grep -o "\[.*\]" | sed "s/\],\[/\n/g" | grep -o "[0-9][0-9]*,[0-9][0-9]*" > data.csv
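For readers who prefer Python to a shell pipeline, the same extraction steps can be sketched roughly as follows. This is only an illustration: the function name `extract_pairs` and the sample line are made up, and the actual markup of the chart page is assumed to be whatever the `grep`/`sed` chain above relies on (a line containing "Bar" with a bracketed list of number pairs).

```python
import re


def extract_pairs(html):
    """Return (contributions, users) pairs found in chart-page text.

    Mirrors the shell pipeline step by step; `html` is the page text.
    """
    pairs = []
    for line in html.splitlines():
        if "Bar" not in line:                      # grep Bar
            continue
        match = re.search(r"\[.*\]", line)         # grep -o "\[.*\]"
        if match is None:
            continue
        for chunk in match.group(0).split("],["):  # sed "s/\],\[/\n/g"
            # grep -o "[0-9][0-9]*,[0-9][0-9]*"
            for a, b in re.findall(r"([0-9]+),([0-9]+)", chunk):
                pairs.append((int(a), int(b)))
    return pairs


# Hypothetical sample line, NOT copied from the real page.
sample = "new Chart.Bar([[1,5000],[2,1500],[10,300]]);"
print(extract_pairs(sample))  # [(1, 5000), (2, 1500), (10, 300)]
```

Writing each pair as `a,b` on its own line would then give the same data.csv layout that the pipeline produces.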

Each line of data.csv has the format

number of Contributions,number of users with that many Contributions

This data is then converted into percentiles by number of Contributions. The following Python script does that and prints the result in Qiita's table notation.

qiita_stats.py


if __name__ == "__main__":
    # Each line of data.csv is "contributions,number_of_users"
    with open("data.csv") as f:
        data = [
            tuple(map(int, l.split(",")))
            for l
            in f
            if l.strip()
        ]

    # Total number of users in the data
    all_num = sum(l[1] for l in data)

    # One table row per threshold: threshold, users below it, top percentile
    for num in [10,20,30,40,50,60,70,80,90,100,200,300,400,500,600,700,800,900,1000,2000,3000,4000,5000]:
        under = sum(l[1] for l in data if l[0] < num)
        print("| %5d | %5d | %5.2f |" % (num, under, (1 - under / all_num) * 100))

    # For ranking yourself
    num = 159
    under = sum(l[1] for l in data if l[0] < num)

    print("num %5d : %5d / %5d : %5.2f%%" % (num, under, all_num, (1 - under / all_num) * 100))

Here is the output table.

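| Number of Contributions | Number of users below it | Top percentile (%) |
|---|---|---|
| 10 | 7767 | 67.19 |
| 20 | 11156 | 52.88 |
| 30 | 13046 | 44.89 |
| 40 | 14307 | 39.57 |
| 50 | 15275 | 35.48 |
| 60 | 16083 | 32.06 |
| 70 | 16735 | 29.31 |
| 80 | 17231 | 27.22 |
| 90 | 17686 | 25.29 |
| 100 | 18065 | 23.69 |
| 200 | 20174 | 14.78 |
| 300 | 21092 | 10.91 |
| 400 | 21684 | 8.41 |
| 500 | 22063 | 6.80 |
| 600 | 22347 | 5.61 |
| 700 | 22556 | 4.72 |
| 800 | 22718 | 4.04 |
| 900 | 22840 | 3.52 |
| 1000 | 22943 | 3.09 |
| 2000 | 23367 | 1.30 |
| 3000 | 23515 | 0.67 |
| 4000 | 23573 | 0.43 |
| 5000 | 23606 | 0.29 |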

There are 7767 users with fewer than 10 Contributions, so I take the view that a user with 10 Contributions is in the top 67.19% of Qiita users. As another example, there are 22063 users with fewer than 500 Contributions, so a user with 500 Contributions is in the top 6.80%.
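The two examples above can be checked by hand with the same formula the script uses. The sketch below is a small illustration: `top_percent` is a name introduced here, and the total of 23,674 users is the figure QiitaUserRanking reports for users with at least 1 Contribution.

```python
def top_percent(under, total):
    """Share of users (in %) who are NOT below the threshold."""
    return (1 - under / total) * 100


total = 23674  # users with at least 1 Contribution, per QiitaUserRanking

print("%5.2f" % top_percent(7767, total))   # 67.19 (threshold: 10 Contributions)
print("%5.2f" % top_percent(22063, total))  #  6.80 (threshold: 500 Contributions)
```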

Consideration

According to QiitaUserRanking, there are 23,674 users with at least 1 Contribution. As the table above shows, a user with 20 Contributions is in the top 52.88%, so the upper and lower halves split at roughly 20 Contributions. My own count is currently 159. Running the script for that number gave:

num   159 : 19496 / 23674 : 17.65%

So there are 19,496 users with fewer than 159 Contributions, and **my ranking is in the top 17.65%.**
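The lookup can also be run in the opposite direction: from a target percentile to the Contribution count needed to reach it. The following is a minimal sketch, not part of the original script. `min_count_for_top` is a hypothetical helper, the toy data is invented, and it assumes each Contribution count appears at most once in the data, as in data.csv. It only considers counts that actually occur in the data.

```python
def min_count_for_top(data, target_percent):
    """Smallest Contribution count present in `data` that lies in the
    top `target_percent` % of users, or None if no count qualifies.

    `data` is a list of (contributions, users) pairs as in data.csv.
    """
    total = sum(users for _, users in data)
    under = 0  # users with strictly fewer Contributions than `count`
    for count, users in sorted(data):
        top = (total - under) * 100 / total
        if top <= target_percent:
            return count
        under += users
    return None


# Invented toy distribution, only to show the behavior.
toy = [(1, 50), (5, 30), (20, 15), (100, 5)]
print(min_count_for_top(toy, 50))  # 5
```

With the real data.csv loaded into `data`, calling it with a target of 50 should land near the 20-Contribution split discussed above.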

Impressions

The goal I was aiming for means this:

**Users with 1000 Contributions are in the top 3% of Qiita.**

That looks pretty tough. On a personal note, my articles average about 14 Contributions each, so **reaching 1000 Contributions takes 72 articles.** Counting 7 months for 2017, **that is 10 to 11 articles a month, so writing about 3 articles a week would make it in time.** No, that is painful. If anything, I would rather write articles of somewhat higher quality or with stronger appeal and raise the number of Contributions per article.
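The article-count arithmetic in the paragraph above can be reproduced in a few lines. This is just a sketch of that back-of-the-envelope calculation; the variable names are introduced here, and the figures (1000, 14, 7) are the ones given in the text.

```python
import math

target = 1000     # Contribution goal
per_article = 14  # average Contributions per article
months = 7        # months counted for 2017

articles = math.ceil(target / per_article)  # whole articles needed
per_month = articles / months

print(articles, round(per_month, 1))  # 72 10.3
```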
