I tried to explain how to get the article content with MediaWiki API in an easy-to-understand manner with examples (Python 3)

Background

I am building a wiki site with MediaWiki and went looking for an API to fetch update information and the like from Python. All I could find were a handful of Qiita articles, the **Japanese version of the MediaWiki documentation, whose translation is questionable**, and the English MediaWiki documentation, so I decided to write an article for people who are thinking about using the API in the future. The API generally **works on MediaWiki sites other than Wikipedia as well**, so it is quite versatile. Please note that some of the detailed explanations may be wrong, and that the article may be a little hard to read because this is my first post on Qiita.

Parameters to use

These are the main parameters used in this article to fetch article content. The Qiita article "Get Wikipedia information using MediaWiki API" covers the parameters in detail, so refer to it if you want to do something other than fetching article content. Note: the API can also be called directly from a URL, but I use `requests` for readability.

| Parameter | Description | Values | Example |
| --- | --- | --- | --- |
| format | Output format | json, xml, ... | "format":"json" → output as JSON |
| action | Operation | query, edit, ... | "action":"query" → perform a query (retrieval) operation |
| prop | Article components to get | revisions, links, images, ... | "prop":"revisions" → get revisions |
| titles | Article title | Article title | "titles":"Cat" → get the article titled "Cat" |
| rvprop | Revision elements to get | content, ... | "rvprop":"content" → get the article text |
| list | Get a list of articles | categorymembers, search, ... | "list":"categorymembers" → get category members |
| cmtitle | Category title | Category name (with the "Category:" prefix) | "cmtitle":"Category:Nuko" → get the "Nuko" category |
| cmlimit | Maximum number of results | 1 to 500 | "cmlimit":"100" → get up to 100 |
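
As a rough sketch of how these parameters fit together, the snippet below encodes a parameter dictionary into the query string that would be sent to `api.php`. The helper name `build_api_url` is hypothetical and exists only for illustration; the endpoint is the one used in the implementation examples of this article, and the category name is the placeholder from the table.

```python
from urllib.parse import urlencode


def build_api_url(endpoint, params):
    """Return the full GET URL for an API request (illustrative helper)."""
    return endpoint + "?" + urlencode(params)


# Parameters from the table: list the members of a category as JSON.
params = {
    "action": "query",
    "list": "categorymembers",
    "cmtitle": "Category:Nuko",  # placeholder category name
    "cmlimit": "100",
    "format": "json",
}

url = build_api_url("https://kousokuwiki.org/w/api.php", params)
print(url)
```

Pasting a URL built this way into a browser is the "call it directly from a URL" approach; `requests` does the same encoding when the dictionary is passed as `params`.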

Implementation example 1

What you want to do

Get the names of the articles in "Category: School Regulations in Chiba Prefecture" on kousokuwiki.org, a site built with MediaWiki.

code

# coding: UTF-8
import requests

S = requests.Session()

# Declare the API URL. For Japanese Wikipedia, it would be "https://ja.wikipedia.org/w/api.php".
URL = "https://kousokuwiki.org/w/api.php"

# Parameter settings.
PARAMS = {
    "action": "query",
    "cmtitle": "Category: School Regulations in Chiba Prefecture",
    "cmlimit": "500",
    "list": "categorymembers",
    "format": "json"
}

# Send the GET request and parse the JSON response
R = S.get(url=URL, params=PARAMS)
DATA = R.json()

# Extract the required data from the JSON
PAGES = DATA['query']['categorymembers']

# Keep only the titles and store them in a list
titles = [page['title'] for page in PAGES]
print(titles)

output

['Chiba Prefectural Chiba Higashi High School Regulations', 'Chiba Prefectural Inba Akira Makoto High School Regulations', 'Chiba Prefectural Kokubun High School Regulations', 'Chiba Prefectural Narita International High School Regulations', 'Chiba Prefectural Matsudo International High School Regulations', 'Chiba Prefectural Kashiwanoha High School School rules', ... (rest omitted)]
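
Since `cmlimit` caps a single response at 500 entries, a larger category needs more than one request. The following is a minimal sketch of following the API's continuation values, assuming a MediaWiki version that returns a top-level `continue` object when more results remain. The function name `iter_category_titles` and the injectable `fetch` callable are assumptions made for illustration and are not part of the code above; `fetch` stands for anything that takes a parameter dictionary and returns the parsed JSON, for example `lambda p: S.get(url=URL, params=p).json()`.

```python
def iter_category_titles(fetch, category, limit="500"):
    """Yield every page title in a category, following 'continue' values."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmlimit": limit,
        "format": "json",
    }
    while True:
        data = fetch(params)
        for page in data["query"]["categorymembers"]:
            yield page["title"]
        if "continue" not in data:
            break  # no more batches
        # Merge the continuation values (e.g. cmcontinue) into the next request.
        params = {**params, **data["continue"]}


# Stand-in for a real HTTP call so the sketch runs offline (made-up data).
def fake_fetch(params):
    if "cmcontinue" not in params:
        return {
            "continue": {"cmcontinue": "page|2", "continue": "-||"},
            "query": {"categorymembers": [{"title": "A"}, {"title": "B"}]},
        }
    return {"query": {"categorymembers": [{"title": "C"}]}}


titles = list(iter_category_titles(fake_fetch, "Category:Example"))
print(titles)
```

Passing the fetch function in as an argument keeps the paging logic separate from the HTTP layer, which also makes it easy to try out without hitting a live wiki.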

Implementation example 2

What you want to do

Get the content of the article "School rules of Chiba Prefectural Kashiwanoha High School" on kousokuwiki.org, a site built with MediaWiki.

code

# coding: UTF-8
import requests

S = requests.Session()

# Declare the API URL. For Japanese Wikipedia, it would be "https://ja.wikipedia.org/w/api.php".
URL = "https://kousokuwiki.org/w/api.php"

# Parameter settings.
PARAMS = {
    "action": "query",
    "prop": "revisions",
    "titles": "School rules of Chiba Prefectural Kashiwanoha High School",
    "rvprop": "content",
    "format": "json"
}

# Send the GET request and parse the JSON response
R = S.get(url=URL, params=PARAMS)
DATA = R.json()

# Extract the required data from the JSON
CONTENT = DATA['query']['pages']

print(CONTENT)

output

{'61': {'pageid': 61, 'ns': 0, 'title': 'School rules of Chiba Prefectural Kashiwanoha High School', 'revisions': [{'contentformat': 'text/x-wiki', 'contentmodel': 'wikitext', '*': 'Chiba Prefectural Kashiwanoha High School (abbreviation: Kashiwanoha High School) is a [[Public school|public]] [[high school]] in [[Chiba]].<br>\nPlease check the [[Notation classification]] and [[Disclaimer]] before use.\n\n=Basic rules=\n\n{| class="wikitable"\n!item\n!School rules\n!Actual situation\n!Remarks\n|-\n|Use smartphone\n|Possible\n|Possible\n|PC is also possible\n|-\n|Use SNS\n|Possible\n|Possible\n| Covered by the code of conduct\n|-\n|part time job\n|Limited\n|Possible\n| Permission request required. In practice, possible as long as grades are not poor. Shifts are basically until 8 p.m., on condition of being home by 10 p.m. (as written on the form). It does not matter if you do not write it on the form (if the employer does not contact the school to confirm), and even falsely reported hours went unnoticed.\n|-\n|Driver\'s license\n|Limited\n|Limited\n|It is possible to request permission only after deciding the course\n|-\n|Revising school rules\n|Unspecified\n|impossible\n|Staff decided\n|}\n\n=Hair and dress code=\n{| class="wikitable"\n!item\n!School rules\n!Actual situation\n!Remarks\n|-\n|Men\'s uniform\n| colspan="3" |blazer\n|-\n|Women\'s uniform\n| colspan="3" |blazer + skirt; slacks allowed; necktie allowed only with slacks\n|-\n|Shoes (off-campus)\n| colspan="3" |black or brown leather shoes, or sneakers\n|-\n|Shoes (school)\n| colspan="3" |designated indoor shoes in the grade color\n|-\n|Men\'s socks\n| colspan="3" |navy or white socks\n|-\n|Women\'s socks\n|colspan="3" |black or navy knee-high socks\n|-\n|back\n| colspan="3" |not flashy, functional, and suitable for a high school student\n|-\n|hair\n| colspan="3" |Perms, bleaching, dyeing, curling, etc. prohibited\n|-\n|Coats\n|Possible\n|Possible\n|Nothing flashy; take off inside the school building\n|-\n|Mufflers\n|Possible\n|Possible\n| Nothing flashy; take off inside the school building\n|-\n|Sweaters\n|Possible\n|Possible\n| School-designated items\n|-\n|make up\n|impossible\n|Escape\n|Make-up, nail polish, piercings, colored contacts, and accessories are all prohibited\n|}\n\n=Other rules=\n\n=Article excerpt=\n[[Category:School Regulations in Chiba Prefecture]]'}]}}

If necessary, slice out only the information you need from this result.
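
One way to do that slicing is sketched below, assuming the response has the same layout as the output above, where the wikitext of the latest revision sits under the `'*'` key. The function name `extract_wikitext` and the sample dictionary are made up for illustration; with the real response you would pass `DATA` instead of `sample`.

```python
def extract_wikitext(data):
    """Return {title: wikitext} from a prop=revisions, rvprop=content response."""
    result = {}
    for page in data["query"]["pages"].values():
        revisions = page.get("revisions")
        if revisions:  # pages that do not exist have no 'revisions' key
            result[page["title"]] = revisions[0]["*"]
    return result


# Made-up sample shaped like the output above.
sample = {
    "query": {
        "pages": {
            "61": {
                "pageid": 61,
                "ns": 0,
                "title": "Sample page",
                "revisions": [
                    {
                        "contentformat": "text/x-wiki",
                        "contentmodel": "wikitext",
                        "*": "=Basic rules=\nsome text",
                    }
                ],
            }
        }
    }
}

texts = extract_wikitext(sample)
print(texts["Sample page"])
```

Iterating over `pages.values()` avoids hard-coding the page ID (`'61'` in the output above), which differs for every article.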

Conclusion

I explained the MediaWiki API, for which Japanese-language information is scarce, with examples. I may add explanations of other parameters and more implementation examples in the future. I hope you find it useful.
