[PYTHON] Natural language processing for busy people

Overview for busy people

Inspired by the article "Ore Program Ugokas Omae Genshijin Naru", I tried some natural language processing myself. I like the old "Series for busy people" [^ 1], so I used the summary API of the COTOHA API to make "for busy people" versions of some famous songs.

Example: "Powder snow" for busy people [^ 2]

$ python3 youyaku.py < konayuki.txt
Lalarai. Powder snow. If your heart is dyed white.

It doesn't quite feel like "I'm not coming", but I'm convinced (?) that the frequently repeated parts show up!

Environment

Python 3.6.9

Implementation

The script summarizes the lyrics into 3 sentences (sent_len = 3) using the summary API of the COTOHA API. Most of the code follows the Genshijin article; I only rewrote BASE_URL and changed the request parameters for summarization. Also, when I passed the lyrics as they were, they were not summarized at all, so I inserted "." at some places in the lyrics, such as line breaks.
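The punctuation step is not part of youyaku.py, so here is a minimal sketch of how it might be automated. The helper name `add_sentence_breaks` is hypothetical, and it assumes that every non-empty lyric line should end with a full stop (the Japanese "。" by default). In the article the marks were only placed "in some places", so this approximates the idea rather than reproducing the exact preprocessing.

```python
def add_sentence_breaks(lyrics, stop="。"):
    """Return the lyrics as one string, ending each non-empty line with `stop`.

    Hypothetical helper (not from the original script): it treats every
    line break as a sentence boundary so that a summarizer sees sentences.
    """
    sentences = []
    for line in lyrics.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between verses
        # Do not add a second mark when the line already ends with punctuation
        if not line.endswith((stop, ".", "!", "?", "！", "？")):
            line += stop
        sentences.append(line)
    return "".join(sentences)


if __name__ == "__main__":
    # Placeholder text; replace it with the contents of a lyrics file
    sample = "first line\n\nsecond line!\nthird line"
    print(add_sentence_breaks(sample, stop="."))
```

The returned string can then be passed as `document` to the summary request instead of the raw lyrics.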

Code

youyaku.py



import requests
import json
import sys

BASE_URL = "https://api.ce-cotoha.com/api/dev/"
CLIENT_ID = "Enter the ID obtained by COTOHA API"
CLIENT_SECRET = "Enter the password obtained by COTOHA API"


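def auth(client_id, client_secret):
    """Request an OAuth access token from COTOHA and return it."""
    token_url = "https://api.ce-cotoha.com/v1/oauth/accesstokens"
    # charset is a parameter of Content-Type, not a separate header
    headers = {
        "Content-Type": "application/json;charset=UTF-8"
    }

    data = {
        "grantType": "client_credentials",
        "clientId": client_id,
        "clientSecret": client_secret
    }
    r = requests.post(token_url,
                      headers=headers,
                      data=json.dumps(data))
    # Fail early with an HTTP error instead of a KeyError below
    r.raise_for_status()
    return r.json()["access_token"]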


def summary(document, access_token, sent_len):
    """Ask the summary API to condense `document` into `sent_len` sentences."""
    base_url = BASE_URL
    # charset is a parameter of Content-Type, not a separate header
    headers = {
        "Content-Type": "application/json;charset=UTF-8",
        "Authorization": "Bearer {}".format(access_token)
    }
    data = {
        "document": document,
        "sent_len": sent_len
    }
    r = requests.post(base_url + "nlp/beta/summary",
                      headers=headers,
                      data=json.dumps(data))
    return r.json()


if __name__ == "__main__":
    document = "The lyrics are listed here"
    args = sys.argv
    if len(args) >= 2:
        # Lyrics passed directly as a command-line argument
        document = str(args[1])
    elif not sys.stdin.isatty():
        # Lyrics redirected from a file: python3 youyaku.py < konayuki.txt
        document = sys.stdin.read()

    access_token = auth(CLIENT_ID, CLIENT_SECRET)
    summary_document = summary(document, access_token, 3)
    # Join the pieces stored under "result" into one line of output
    print(''.join(summary_document['result']))
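
The last step of youyaku.py joins whatever is stored under `result`, and it fails with a bare KeyError when the API returns an error body instead. As a hedged sketch, that step can be isolated in a small function and tried without calling the API. The name `join_summary` and the sample dictionaries are made up for illustration; whether `result` arrives as a single string or as a list of strings is treated as unknown here, so both are handled.

```python
def join_summary(response):
    """Return the summary stored under "result" as one printable string.

    Hypothetical helper: accepts either a string or a list of strings,
    and raises a readable error when "result" is missing.
    """
    result = response.get("result")
    if result is None:
        raise ValueError("no 'result' in response: {!r}".format(response))
    if isinstance(result, str):
        return result
    return "".join(result)


if __name__ == "__main__":
    # Hand-written stand-ins for a parsed JSON response
    print(join_summary({"result": "one. two. three."}))
    print(join_summary({"result": ["one. ", "two. ", "three."]}))
```

In the script this would replace the final join, for example `print(join_summary(summary_document))`.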

Results

"Ondo" Mito Komon "Ah, there are tears in life" [^ 3]

$ python3 youyaku.py < mitokomon.txt
If you don't like crying, walk now. There are tears and a smile in my life. Let's live in search of something.

"Doraemon no Uta" [^ 4]

$ python3 youyaku.py < doraemon.txt
Everyone, everyone, will come true. Ann An Ann. I really like Doraemon.

"That's important" [^ 5]

$ python3 youyaku.py < soregadaiji.txt
Don't lose, don't throw, don't run away, believe. When it seems to be ruined. That is the most important.

"Gatchaman's Song" [^ 6]

$ python3 youyaku.py < gachaman.txt
Gatchaman. fly. go.

"Makafushigi Adventure" [^ 7]

$ python3 youyaku.py < makafushigi.txt
DRAGONBALL. try. fly.

"The Birth of the Hero King!" [^ 8]

$ python3 youyaku.py < yushaou.txt
Gagagatsu. Gaogaigar!.. Gagagaga.

Summary

I tried summarizing various classic songs. Good songs are deep even when they are short.

[^ 1]: [Nico Nico Pedia: Series for Busy People](https://dic.nicovideo.jp/a/%E5%BF%99%E3%81%97%E3%81%84%E4%BA%BA%E5%90%91%E3%81%91%E3%82%B7%E3%83%AA%E3%83%BC%E3%82%BA)
[^ 2]: "Konayuki" Lyrics: Ryota Fujimaki
[^ 3]: "Otodo" Mito Komon "Ah, there are tears in life" Lyrics: Michio Yamagami
[^ 4]: "Doraemon no Uta" Lyrics: Takumi Kusube
[^ 5]: "That's important" Lyrics: Toshiyuki Tachikawa
[^ 6]: "Gatchaman no Uta" Lyrics: Tatsunoko Production Literary Department
[^ 7]: "Makafushigi Adventure" Lyrics: Yuriko Mori
[^ 8]: "Birth of the Hero King!" Lyrics: Yoshitomo Yonetani