[Python] I tried to verify the Big Bang theorem [Does everything really come back to it?]

What is the Big Bang theorem?

The theory that any word, if you keep tracing its meaning, eventually leads to "Big Bang".

Quote

"I tried to verify the theory that every word reaches the Big Bang when you keep tracing its meaning back through the dictionary." https://www.youtube.com/watch?v=CN7q1thA7mU

Implementation

This time, we use the MediaWiki API to get the list of articles that link to a given article (its backlinks), and check how many articles can be traced back to the Big Bang article this way.

Source

python


import pandas as pd
import requests

# Example inputs, assumed here so the snippet runs on its own.
word = "ビッグバン"  # title of the article whose backlinks are fetched
floor = 1  # n: number of link hops from the Big Bang article

url = "https://ja.wikipedia.org/w/api.php"
payload = {"format": "json", "action": "query", "list": "backlinks", "blnamespace": "0"}
payload["bltitle"] = word
r = requests.get(url, params=payload)

# Decode the JSON response (r.json() already returns Python objects)
json_load = r.json()

# Keep only the list of backlinks
json_load = json_load["query"]["backlinks"]

theList = []
# Loop over the articles that link to `word`
for value in json_load:
    theDict = {}
    theDict["id"] = value["pageid"]
    theDict["title"] = value["title"]
    theDict["blTitle"] = word
    theDict["url"] = "https://ja.wikipedia.org/wiki/" + value["title"]
    theDict["floor"] = floor
    theDict["ns"] = value["ns"]
    theList.append(theDict)

# pd.io.json.json_normalize is deprecated; pd.json_normalize replaces it (pandas 1.0+)
dataFrame = pd.json_normalize(theList)
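The request above asks for a single page of backlinks, and without `bllimit` the API returns at most 10 entries per request. A minimal sketch of following the API's `continue` block to collect every backlink is shown here. The names `fetch_json` and `iter_backlinks` are hypothetical helpers introduced for illustration, and nothing here is taken from the repository.

```python
API_URL = "https://ja.wikipedia.org/w/api.php"


def fetch_json(params):
    """Send one API request and return the decoded JSON (hypothetical helper)."""
    # Imported lazily so the pagination logic can be tried offline with a stub.
    import requests

    return requests.get(API_URL, params=params, timeout=10).json()


def iter_backlinks(title, fetch=fetch_json):
    """Yield every namespace-0 backlink of `title`, following continuation."""
    params = {
        "format": "json",
        "action": "query",
        "list": "backlinks",
        "blnamespace": "0",
        "bllimit": "max",  # ask for the largest page size the API allows
        "bltitle": title,
    }
    while True:
        data = fetch(params)
        yield from data["query"]["backlinks"]
        if "continue" not in data:
            break  # no more pages
        # Pass the continuation values back unchanged with the next request.
        params = {**params, **data["continue"]}
```

For example, `[b["title"] for b in iter_backlinks("ビッグバン")]` would collect the titles of all linking articles. Passing a stub as `fetch` lets the loop be exercised without network access.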

Reference

https://qiita.com/yubessy/items/16d2a074be84ee67c01f#記事へリンクしている記事の一覧を取得

Verification results

The total number of articles is taken to be 1227198, from https://ja.wikipedia.org/wiki/Wikipedia:日本語版の統計


n = number of link hops needed to get back to the Big Bang article

Representative values

n=0 · Big Bang

n=1 · Physics · Chronology · Cosmology

n=2 · Geography · Biology · biology
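The depth n can be read as the level of a breadth-first search over backlinks, starting from the Big Bang article at n=0. The sketch below is one way to compute it and to notice when the links only loop back to articles already seen. It is an assumption about the approach, not code from the repository; `trace_depths`, `get_backlinks`, and the toy graph are hypothetical and do not reflect real Wikipedia data.

```python
def trace_depths(start, get_backlinks, max_depth=10):
    """Return {title: n}, where n is the first depth at which title is reached."""
    depth_of = {start: 0}
    frontier = [start]
    for n in range(1, max_depth + 1):
        next_frontier = []
        for title in frontier:
            for linked in get_backlinks(title):
                if linked not in depth_of:  # skip articles already reached
                    depth_of[linked] = n
                    next_frontier.append(linked)
        if not next_frontier:
            break  # only loops remain, so coverage cannot grow further
        frontier = next_frontier
    return depth_of


# Toy graph for illustration: each key maps to the articles that link to it.
toy_graph = {
    "Big Bang": ["Physics", "Cosmology"],
    "Physics": ["Geography", "Big Bang"],
    "Cosmology": ["Physics"],
    "Geography": ["Physics"],
}
depths = trace_depths("Big Bang", lambda title: toy_graph.get(title, []))
print(depths)
```

In a real run, `get_backlinks` would call the MediaWiki API for each title instead of looking it up in a dictionary.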

Output CSV

All results (including duplicate articles) https://github.com/Syogo-Suganoya/bigBanete/blob/master/downloads/record.csv

All results (duplicates removed) https://github.com/Syogo-Suganoya/bigBanete/blob/master/downloads/uniqueRecord.csv

Conclusion

At the 10th iteration the article links began to loop, and the coverage rate stopped growing. The share of Japanese Wikipedia articles that lead back to the Big Bang article (the "Big Bang rate") is 0.0993%. The proposition "any word leads to the Big Bang" is therefore false.
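As a quick sanity check, the reported rate and the total article count quoted earlier together imply roughly how many distinct articles were reached. This is only arithmetic on those two numbers. The variable and function names are mine, and the result is approximate because the rate is rounded; the exact count would come from the de-duplicated results.

```python
TOTAL_ARTICLES = 1227198   # total Japanese Wikipedia articles, as quoted in this post
COVERAGE_PERCENT = 0.0993  # reported "Big Bang rate", in percent


def coverage_percent(reached, total=TOTAL_ARTICLES):
    """Share of all articles that were reached, in percent."""
    return reached / total * 100


# Article count implied by the reported rate (approximate, since the rate is rounded).
implied_reached = round(TOTAL_ARTICLES * COVERAGE_PERCENT / 100)
print(implied_reached)
print(f"{coverage_percent(implied_reached):.4f}%")
```

On these figures, about 1,200 of more than 1.2 million articles lead back to the Big Bang article.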


GitHub https://github.com/Syogo-Suganoya/bigBanete
