Parse the Researchmap API in Python and automatically create a Word file for the achievement list

Dear researchers, thank you for your hard work today. Copying and pasting the achievements I had entered in researchmap by hand was tedious, so I tried parsing the API response instead. We will go as far as creating a Word file from Python.

Environment

Install python-docx

See the python-docx installation instructions for details, but it takes only one line:

bash


pip install python-docx

JSON parsing

I'm not good with JSON, so I was left wondering what @id and @type were, and I got through this part in tears. Please let me know if there is a more efficient method. (By the way, I'm not sure about the scope of the word "parse". I take it to mean fetching the JSON and pulling out the information you need, but please correct me if I'm wrong.)

Get JSON

For this step I followed an article listed under Reference.

python


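import requests

# researchmap API URL for one account ('kage' is the account used in this article)
url = "https://api.researchmap.jp/kage"
response = requests.get(url)
# Decode the JSON response body into Python objects
jsonData = response.json()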

You get the data back as a nested dictionary. (With print(), everything came out packed together with no line breaks, so I did not use it.)
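If the one-line output of print() is hard to read, one option is to re-serialize the dictionary with the standard json module. The sketch below uses a small made-up dictionary (sample is hypothetical and stands in for jsonData) so that it runs without network access. indent adds line breaks, and ensure_ascii=False keeps Japanese text readable.

```python
import json

# Hypothetical stand-in for jsonData; the real response is much larger.
sample = {
    "@id": "https://api.researchmap.jp/kage",
    "@graph": [
        {"@type": "published_papers", "items": [{"paper_title": {"ja": "論文題目"}}]},
    ],
}

# indent=2 puts each key on its own line; ensure_ascii=False keeps
# non-ASCII characters as-is instead of \uXXXX escapes.
pretty = json.dumps(sample, indent=2, ensure_ascii=False)
print(pretty)
```

The same call should work on the real data, for example print(json.dumps(jsonData, indent=2, ensure_ascii=False)).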

Extract the required data from JSON

Locating the achievement data

Since I'm not good with JSON, I scrolled through the displayed JSON and took a close look at its structure. The achievement data appears to be stored in the part called @graph.

Let's display the @type of each element of @graph.

python


# Print the index and @type of every element in @graph
for i, entry in enumerate(jsonData['@graph']):
    print(i)
    print(entry['@type'])

Output result


0
research_interests
1
research_areas
2
research_experience
3
published_papers
4
books_etc
5
misc
6
presentations
7
awards
8
research_projects
9
education
10
teaching_experience
11
committee_memberships

The achievement data is in here. Let's extract published_papers, which is element 3.
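The position of published_papers presumably depends on which sections an account has filled in, so index 3 may not hold for other accounts. Here is a minimal sketch of looking the section up by its @type instead. find_section and the sample data are hypothetical names introduced for illustration.

```python
def find_section(json_data, type_name):
    """Return the first @graph entry whose @type equals type_name, or None."""
    for entry in json_data.get('@graph', []):
        if entry.get('@type') == type_name:
            return entry
    return None


# Hypothetical miniature response with the same shape as the output above.
sample = {
    '@graph': [
        {'@type': 'research_interests', 'items': []},
        {'@type': 'published_papers', 'items': [{'paper_title': {'en': 'A paper'}}]},
    ]
}

section = find_section(sample, 'published_papers')
print(len(section['items']))
```

With the real data, find_section(jsonData, 'published_papers')['items'] would take the place of jsonData['@graph'][3]['items'].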

Display published_papers like an achievement list

The script below prints the papers in that style.

python


# All entries under published_papers (element 3 of @graph)
items = jsonData['@graph'][3]['items']

for item in items:
    authors = item['authors']['en']
    title = item['paper_title']['en']
    journal = item['publication_name']['en']
    date = item['publication_date']
    vol = item['volume']
    doi = item['identifiers']['doi'][0]

    # Each author is a dict; join its values to get the displayed name
    author_list = [''.join(author.values()) for author in authors]

    print(', '.join(author_list))
    print(title)
    print(journal, vol, date)
    print('doi:', doi)
    print('')

You could copy and paste this into a Word file, but since we have come this far, let's go the extra step.
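The script above assumes that every paper has a volume and a DOI. If an entry lacks one of these keys, the loop would stop with a KeyError. Below is a minimal sketch of a more forgiving version. format_paper is a hypothetical helper, and the author key 'name' and the sample item are assumptions made for illustration.

```python
def format_paper(item):
    """Build the display lines for one published_papers item, skipping missing fields."""
    # Each author is assumed to be a dict such as {'name': '...'}; join its values.
    authors = [''.join(a.values()) for a in item.get('authors', {}).get('en', [])]
    title = item.get('paper_title', {}).get('en', '')
    journal = item.get('publication_name', {}).get('en', '')
    vol = item.get('volume')
    date = item.get('publication_date')
    dois = item.get('identifiers', {}).get('doi', [])

    lines = [', '.join(authors), title]
    # Keep only the parts of the journal line that are present.
    lines.append(' '.join(str(x) for x in (journal, vol, date) if x))
    if dois:
        lines.append('doi: ' + dois[0])
    # Drop lines that ended up empty.
    return [line for line in lines if line]


# Hypothetical item with no volume and no DOI.
item = {
    'authors': {'en': [{'name': 'Taro Yamada'}, {'name': 'Hanako Suzuki'}]},
    'paper_title': {'en': 'An example title'},
    'publication_name': {'en': 'Journal of Examples'},
    'publication_date': '2020-11',
}
print('\n'.join(format_paper(item)))
```

Returning a list of lines rather than printing directly means the same helper can be reused later when the text is written to Word.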

Writing to a Word file

Numbered lists in python-docx

I want a numbered list of achievements. First, let's test writing a list.

python


from docx import Document

doc = Document()
docx_file = 'test.docx'

p0 = doc.add_paragraph('Item 1', style='List Number')
p1 = doc.add_paragraph('Item 2', style='List Number')

doc.save(docx_file)

A Word file has been created, and the numbered list has been written to it.

Format the achievement data for writing

This is a small change from the print version above.

python


# List for storing all paper data
papers = []

for item in jsonData['@graph'][3]['items']:
    authors = item['authors']['en']
    title = item['paper_title']['en']
    journal = item['publication_name']['en']
    date = item['publication_date']
    vol = item['volume']
    doi = item['identifiers']['doi'][0]

    # Each author is a dict; join its values to get the displayed name
    author_list = [''.join(author.values()) for author in authors]

    # Format the paper as four lines and store it in the list
    paper = [', '.join(author_list), title, ', '.join([journal, vol, date]), ' '.join(['doi:', doi])]
    papers.append('\n'.join(paper))

Next, write the papers list to Word.
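One caveat before writing: the loop above reads every field with the 'en' key, which would raise a KeyError for a paper registered only in Japanese. Below is a minimal sketch of a fallback, assuming the other language key is 'ja'. The helper name pick_lang and the sample items are hypothetical.

```python
def pick_lang(field, preferred=('en', 'ja'), default=''):
    """Return the first available language variant of a multilingual field."""
    if not isinstance(field, dict):
        return default
    for lang in preferred:
        value = field.get(lang)
        if value:
            return value
    return default


# Hypothetical items: one with an English title, one with Japanese only.
items = [
    {'paper_title': {'en': 'An example title', 'ja': '例のタイトル'}},
    {'paper_title': {'ja': '日本語のみの論文'}},
]
titles = [pick_lang(item.get('paper_title')) for item in items]
print(titles)
```

In the loop, pick_lang(item.get('paper_title')) would then replace item['paper_title']['en'], and likewise for the journal name.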

Writing data to a Word file

Headings are added with add_heading.

python


doc = Document()
docx_file = 'publication.docx'

# Headings
heading = doc.add_heading('Research achievements', level=1)
subheading = doc.add_heading('Original treatise', level=2)

# Numbered list of papers
for paper in papers:
    p = doc.add_paragraph(paper, style='List Number')

doc.save(docx_file)

The data has been written to the Word file! All that remains is tidying up the appearance. For now, let's make all the font colors black.

python


from docx.shared import RGBColor

doc = Document()
docx_file = 'publication.docx'

# Headings, with the font color set to black
heading = doc.add_heading('Research achievements', level=1)
heading.runs[0].font.color.rgb = RGBColor(0, 0, 0)
subheading = doc.add_heading('Original treatise', level=2)
subheading.runs[0].font.color.rgb = RGBColor(0, 0, 0)

# Numbered list of papers
for paper in papers:
    p = doc.add_paragraph(paper, style='List Number')

doc.save(docx_file)

You can now create a Word file of your achievement list from researchmap without any manual copy and paste!

Reference

- About WebAPI
- Read the old Word file (.doc) of the Gakushin DC application form from Python and operate it
- Hit Web API in Python to parse JSON
