[PYTHON] To people who are "recruiting but not recruiting"

On Twitter I share posts about artificial intelligence and articles I have written for other media, so if you want to know more about AI, **feel free to follow!**

<!-- Updated on 2/25 (Tue): added the Shinjiro Koizumi syntax (https://qiita.com/omiita/items/0f811f15e569bf2539b8#6-%E7%95%AA%E5%A4%96%E7%B7%A8%E5%B0%8F%E6%B3%89%E9%80%B2%E6%AC%A1%E9%83%8E%E6%A7%8B%E6%96%87). -->

1. I teach, but I don't explain

Inspired by Prime Minister Abe's remark **"I am recruiting, but not recruiting"**, **I made a program that automatically converts an input sentence into the "I am recruiting, but not recruiting" form!** (In the original Japanese, the two halves use different words with nearly the same meaning.)

For example, entering "Recruit people" produces "We are recruiting people, but we are not recruiting".

`Example of seeing cherry blossoms`


$ python abe.py "See the cherry blossoms"
I see the cherry blossoms, but I haven't seen them

2. I use it, but I don't use it

Tool	Purpose
Python 3.7.0	Implementation language
COTOHA API	Morphological analysis and similarity calculation
Japanese WordNet (wnjpn.db)	Synonym lookup
IPA dictionary	Verb conjugation

3. I teach you the mechanism in detail, but I do not explain it in detail.

3.1 Extract nouns and verbs

(Figure: Step 1)
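As a rough illustration of this step, the sketch below walks a parse result and pulls out the verb and the nouns. The `sample_result` dictionary is a hand-written stand-in that only mimics the fields the script reads (`result`, `tokens`, `pos`, `lemma`, `form`); it is not a captured COTOHA response. The helper name `extract_verb_and_nouns` is hypothetical, and the sketch assumes verb stems and nouns are labeled 動詞語幹 and 名詞.

```python
# Hand-written stand-in for a parse result of "桜を見る" ("see the cherry blossoms").
# Only the fields that abe.py reads are included; a real response carries more.
sample_result = {
    "result": [
        {"tokens": [
            {"form": "桜", "lemma": "桜", "pos": "名詞"},
            {"form": "を", "lemma": "を", "pos": "格助詞"},
        ]},
        {"tokens": [
            {"form": "見", "lemma": "見る", "pos": "動詞語幹"},
            {"form": "る", "lemma": "る", "pos": "動詞接尾辞"},
        ]},
    ]
}


def extract_verb_and_nouns(parsed):
    """Return (lemma of the first verb stem or None, list of noun forms)."""
    verb = None
    nouns = []
    for chunk in parsed["result"]:
        for token in chunk["tokens"]:
            if token["pos"] == "動詞語幹" and verb is None:
                verb = token["lemma"]
            elif token["pos"] == "名詞":
                nouns.append(token["form"])
    return verb, nouns


verb, nouns = extract_verb_and_nouns(sample_result)
print(verb, nouns)
```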

3.2 List synonyms

(Figure: Step 2)
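To make the synonym lookup concrete, here is a minimal sketch against a tiny in-memory SQLite database. The table and column names follow the ones used by the script's WordNet helpers, but the tables are simplified, the rows and synset IDs are invented for illustration, and the single-join query and the name `list_synonyms` are my own shorthand rather than the script's approach.

```python
import sqlite3

# Tiny in-memory stand-in for wnjpn.db with only the columns this lookup needs.
# All rows and synset IDs below are made up for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table word (wordid integer, lang text, lemma text, pron text, pos text);
    create table sense (synset text, wordid integer, lang text);
    insert into word values (1, 'jpn', '見る', null, 'v');
    insert into word values (2, 'jpn', '観察', null, 'n');
    insert into word values (3, 'jpn', '見物', null, 'n');
    insert into word values (4, 'eng', 'watch', null, 'v');
    insert into sense values ('s-001', 1, 'jpn');
    insert into sense values ('s-001', 2, 'jpn');
    insert into sense values ('s-002', 1, 'jpn');
    insert into sense values ('s-002', 3, 'jpn');
    insert into sense values ('s-002', 4, 'eng');
""")


def list_synonyms(lemma, lang="jpn"):
    """Return the lemmas in `lang` that share at least one synset with `lemma`."""
    rows = conn.execute(
        """
        select distinct w2.lemma
        from word w1
        join sense s1 on s1.wordid = w1.wordid
        join sense s2 on s2.synset = s1.synset
        join word w2 on w2.wordid = s2.wordid
        where w1.lemma = ? and w2.lang = ? and w2.lemma != w1.lemma
        """,
        (lemma, lang),
    )
    return [row[0] for row in rows]


synonyms = list_synonyms("見る")
print(synonyms)
```

The script itself reaches the same kind of result in several smaller queries (word, then senses, then synsets); the join above is only a compact way to show the relationship between the tables.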

3.3 Extract only nouns

(Figure: Step 3)
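The filtering in this step can be sketched as follows. In the script, the part of speech comes from the morphological analysis; here a small hand-written table stands in for it, so `POS_TABLE` and `keep_nouns` are hypothetical names and the entries are illustrative only.

```python
# Hypothetical part-of-speech table standing in for a morphological analyzer.
POS_TABLE = {"観察": "名詞", "見物": "名詞", "眺める": "動詞", "視る": "動詞"}


def keep_nouns(candidates, pos_of=POS_TABLE.get):
    """Keep only the candidates whose part of speech is 名詞 (noun)."""
    return [word for word in candidates if pos_of(word) == "名詞"]


nouns_only = keep_nouns(["観察", "眺める", "見物", "未知"])
print(nouns_only)
```

Words that the lookup does not know (such as "未知" here) are dropped along with the verbs, because `dict.get` returns `None` for them.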

3.4 Measurement of similarity between original text and synonyms

(Figure: Step 4)
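The selection logic of this step is a plain "keep the best score" loop. The sketch below shows it with a stand-in similarity function: `char_jaccard` is a crude character-overlap measure that I swapped in so the example runs offline, and it is not how the COTOHA similarity API scores sentences. As in the script, each noun is compared in the form "noun + する" ("to do noun"). The function names are hypothetical.

```python
def char_jaccard(a, b):
    """Stand-in similarity: Jaccard overlap of the two strings' character sets."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def most_similar(sentence, candidates, similarity=char_jaccard):
    """Return (best noun, best score); ("", 0.0) if nothing scores above zero."""
    best, best_score = "", 0.0
    for cand in candidates:
        score = similarity(sentence, cand + "する")
        if score > best_score:
            best, best_score = cand, score
    return best, best_score


best, score = most_similar("桜を見る", ["観察", "見物"])
print(best, score)
```

Passing the similarity function as a parameter keeps the loop testable without network access; in the real script that role is played by the API call.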

3.5 Convert to the continuative form (連用タ接続)

(Figure: Step 5)
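The dictionary lookup behind this step can be sketched with a few hand-written rows. The rows below imitate the IPA dictionary column layout (surface form in column 0, conjugation form in column 9, base form in column 10), but the numeric fields are placeholders and `to_continuative` is a hypothetical name.

```python
import csv
import io

# Hand-written rows in the IPA-dictionary column layout; the IDs and costs are dummies.
SAMPLE_ROWS = """\
見,0,0,0,動詞,自立,*,*,一段,連用形,見る,ミ,ミ
見る,0,0,0,動詞,自立,*,*,一段,基本形,見る,ミル,ミル
飲ん,0,0,0,動詞,自立,*,*,五段・マ行,連用タ接続,飲む,ノン,ノン
飲む,0,0,0,動詞,自立,*,*,五段・マ行,基本形,飲む,ノム,ノム
"""


def to_continuative(lemma, rows):
    """Return the surface form whose conjugation form is 連用タ接続, or None."""
    for row in csv.reader(io.StringIO(rows)):
        if row[10] == lemma and "連用タ接続" in row[9]:
            return row[0]
    return None


print(to_continuative("飲む", SAMPLE_ROWS))
print(to_continuative("見る", SAMPLE_ROWS))
```

In this sample, "見る" has no 連用タ接続 row, so the lookup returns `None`; the script handles that case by falling back to the token's own surface form.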

3.6 Combine

(Figure: Step 6)
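Putting the pieces together might look like the sketch below. The Japanese connective strings ("ているが、", "でいるが、", "はしていない") are my assumption about what the script concatenates, and `combine` is a hypothetical helper, so treat this as an illustration of the joining rule rather than the script's exact output.

```python
def combine(prefix, stem, noun):
    """Join the words before the verb, the converted verb stem, and the synonym noun.

    Stems ending in "ん" (for example 飲ん) take "で"; all others take "て".
    The connective strings are assumed, not taken from the original listing.
    """
    connective = "でいるが、" if stem.endswith("ん") else "ているが、"
    return prefix + stem + connective + noun + "はしていない"


print(combine("桜を", "見", "見物"))
print(combine("酒を", "飲ん", "飲酒"))
```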

4. Shows the code but does not show it

Code

Import

`abe.py`


# -*- coding:utf-8 -*-

import os
import urllib.request
import json
import configparser
import codecs
import csv
import sys
import sqlite3
from collections import namedtuple
import types

COTOHA

`abe.py`


#/_/_/_/_/_/_/_/_/_/_/_/_/COTOHA_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
#The code for this part is taken from here.
# https://qiita.com/gossy5454/items/83072418fb0c5f3e269f

class CotohaApi:
    #Initialization
    def __init__(self, client_id, client_secret, developer_api_base_url, access_token_publish_url):
        self.client_id = client_id
        self.client_secret = client_secret
        self.developer_api_base_url = developer_api_base_url
        self.access_token_publish_url = access_token_publish_url
        self.getAccessToken()

    #Get access token
    def getAccessToken(self):
        #Access token acquisition URL specification
        url = self.access_token_publish_url

        #Header specification
        headers={
            "Content-Type": "application/json;charset=UTF-8"
        }

        #Request body specification
        data = {
            "grantType": "client_credentials",
            "clientId": self.client_id,
            "clientSecret": self.client_secret
        }
        #Encode request body specification to JSON
        data = json.dumps(data).encode()

        #Request generation
        req = urllib.request.Request(url, data, headers)

        #Send a request and receive a response
        res = urllib.request.urlopen(req)

        #Get response body
        res_body = res.read()

        #Decode the response body from JSON
        res_body = json.loads(res_body)

        #Get an access token from the response body
        self.access_token = res_body["access_token"]


    #Parsing API
    def parse(self, sentence):
        #Parsing API URL specification
        url = self.developer_api_base_url + "v1/parse"
        #Header specification
        headers={
            "Authorization": "Bearer " + self.access_token,
            "Content-Type": "application/json;charset=UTF-8",
        }
        #Request body specification
        data = {
            "sentence": sentence
        }
        #Encode request body specification to JSON
        data = json.dumps(data).encode()
        #Request generation
        req = urllib.request.Request(url, data, headers)
        #Send a request and receive a response
        try:
            res = urllib.request.urlopen(req)
        #What to do if an error occurs in the request
        except urllib.request.HTTPError as e:
            #If the status code is 401 Unauthorized, get the access token again and request again
            if e.code == 401:
                print ("get access token")
                #getAccessToken() is a method and stores the new token in self.access_token
                self.getAccessToken()
                headers["Authorization"] = "Bearer " + self.access_token
                req = urllib.request.Request(url, data, headers)
                res = urllib.request.urlopen(req)
            #Show the cause for errors other than 401, then re-raise (res is undefined here)
            else:
                print ("<Error> " + str(e.reason))
                raise

        #Get response body
        res_body = res.read()
        #Decode the response body from JSON
        res_body = json.loads(res_body)
        #Get analysis result from response body
        return res_body


    #Similarity calculation API
    def similarity(self, s1, s2):
        #Similarity calculation API URL specification
        url = self.developer_api_base_url + "v1/similarity"
        #Header specification
        headers={
            "Authorization": "Bearer " + self.access_token,
            "Content-Type": "application/json;charset=UTF-8",
        }
        #Request body specification
        data = {
            "s1": s1,
            "s2": s2
        }
        #Encode request body specification to JSON
        data = json.dumps(data).encode()
        #Request generation
        req = urllib.request.Request(url, data, headers)
        #Send a request and receive a response
        try:
            res = urllib.request.urlopen(req)
        #What to do if an error occurs in the request
        except urllib.request.HTTPError as e:
            #If the status code is 401 Unauthorized, get the access token again and request again
            if e.code == 401:
                print ("get access token")
                #getAccessToken() is a method and stores the new token in self.access_token
                self.getAccessToken()
                headers["Authorization"] = "Bearer " + self.access_token
                req = urllib.request.Request(url, data, headers)
                res = urllib.request.urlopen(req)
            #Show the cause for errors other than 401, then re-raise (res is undefined here)
            else:
                print ("<Error> " + str(e.reason))
                raise

        #Get response body
        res_body = res.read()
        #Decode the response body from JSON
        res_body = json.loads(res_body)
        #Get analysis result from response body
        return res_body
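
Both `parse` and `similarity` repeat the same "retry once on 401" logic. One possible way to factor it out is sketched below. `call_with_token_retry` and its two callable parameters are hypothetical names, not part of the original script or of the COTOHA API; `urllib.error.HTTPError` is the same class the script catches as `urllib.request.HTTPError`.

```python
import urllib.error


def call_with_token_retry(send, refresh_token):
    """Call send(); on HTTP 401, refresh the token once and call send() again.

    `send` performs the request and returns its result; `refresh_token`
    obtains a new access token. Any other HTTP error is re-raised unchanged.
    """
    try:
        return send()
    except urllib.error.HTTPError as e:
        if e.code != 401:
            raise
        refresh_token()
        return send()
```

Inside the class, `send` could be a small closure that rebuilds the headers from `self.access_token` and calls `urllib.request.urlopen`, with `self.getAccessToken` passed as `refresh_token`.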

Convert to the continuative form

`abe.py`


#/_/_/_/_/_/_/_/_/_/_/_/_/CONVERSION_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

def convert(word):
    file_name = "./data/Verb.csv"
    with open(file_name,"r") as f:
        handler = csv.reader(f)
        for row in handler:
            if word == row[10]: #Column 10 holds the base form (lemma)
                if "連用タ接続" in row[9]: #Column 9 holds the conjugation form
                    return row[0]
    return None

Synonyms

`abe.py`


#/_/_/_/_/_/_/_/_/_/_/_/_/SYNONYM_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
#The code for this part is taken from here.
# https://www.yoheim.net/blog.php?q=20160201

conn = sqlite3.connect("./data/wnjpn.db")

Word = namedtuple('Word', 'wordid lang lemma pron pos')

def getWords(lemma):
  cur = conn.execute("select * from word where lemma=?", (lemma,))
  return [Word(*row) for row in cur]


Sense = namedtuple('Sense', 'synset wordid lang rank lexid freq src')

def getSenses(word):
  cur = conn.execute("select * from sense where wordid=?", (word.wordid,))
  return [Sense(*row) for row in cur]

Synset = namedtuple('Synset', 'synset pos name src')

def getSynset(synset):
  cur = conn.execute("select * from synset where synset=?", (synset,))
  return Synset(*cur.fetchone())

def getWordsFromSynset(synset, lang):
  cur = conn.execute("select word.* from sense, word where synset=? and word.lang=? and sense.wordid = word.wordid;", (synset,lang))
  return [Word(*row) for row in cur]

def getWordsFromSenses(sense, lang="jpn"):
  synonym = {}
  for s in sense:
    lemmas = []
    syns = getWordsFromSynset(s.synset, lang)
    for sy in syns:
      lemmas.append(sy.lemma)
    synonym[getSynset(s.synset).name] = lemmas
  return synonym

def getSynonym (word):
    synonym = {}
    words = getWords(word)
    if words:
        for w in words:
            sense = getSenses(w)
            s = getWordsFromSenses(sense)
            synonym = dict(list(synonym.items()) + list(s.items()))
    return synonym

Main

`abe.py`


#/_/_/_/_/_/_/_/_/_/_/_/_/MAIN_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
if __name__ == '__main__':
    #Get the location of the source file
    APP_ROOT = os.path.dirname(os.path.abspath( __file__)) + "/"

    #Get set value
    config = configparser.ConfigParser()
    config.read(APP_ROOT + "config.ini")
    CLIENT_ID = config.get("COTOHA API", "Developer Client id")
    CLIENT_SECRET = config.get("COTOHA API", "Developer Client secret")
    DEVELOPER_API_BASE_URL = config.get("COTOHA API", "Developer API Base URL")
    ACCESS_TOKEN_PUBLISH_URL = config.get("COTOHA API", "Access Token Publish URL")

    #COTOHA API instantiation
    cotoha_api = CotohaApi(CLIENT_ID, CLIENT_SECRET, DEVELOPER_API_BASE_URL, ACCESS_TOKEN_PUBLISH_URL)

    #Analysis target sentence
    if len(sys.argv) >= 2:
        sentence = sys.argv[1]
    else:
        #Exit with a usage message when no sentence is given
        sys.exit('Usage: python abe.py "<sentence>"')

    #Take the verb from the original sentence and convert it to the continuative form
    #(The Japanese literals below are restored from machine-translated text,
    # so they are assumptions; COTOHA labels parts of speech in Japanese)
    result = cotoha_api.parse(sentence)
    ret = ""
    verb = ""
    for chunk in result["result"]:
        for token in chunk["tokens"]:
            if token["pos"] == "動詞語幹": #verb stem
                verb = token["lemma"]
                form = token["form"]
                conv_verb = convert(verb)
                if conv_verb is None:
                    ret += form
                else:
                    ret += conv_verb

                #Stems ending in "ん" (e.g. 飲ん) take "で"; the others take "て"
                if ret[-1] == "ん":
                    ret += "でいるが、"
                else:
                    ret += "ているが、"
                break
            else:
                ret += token["form"]

    #Take synonyms for verbs
    synonym = getSynonym(verb)
    noun = ""
    sim = 0.

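    #Among the synonyms, pick the noun most similar to the original sentence
    #(The Japanese literals are restored from machine-translated text and are assumptions)
    for syns in synonym.values():
        for syn in syns:
            result = cotoha_api.parse(syn)['result'][0]['tokens'][0]
            if result['pos'] == '名詞': #noun
                cand = result['form']
                #Compare the sentence with "<noun>する" ("to do <noun>")
                cand_sim = cotoha_api.similarity(sentence, cand+'する')['result']['score']
                if cand_sim > sim:
                    noun = result['form']
                    sim = cand_sim
    ret += noun
    ret += "はしていない" #"... is not being done"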

    #Final output
    print(ret)

config.ini

`config.ini`


#To use the COTOHA API, register for it to obtain a client ID and secret,
#then create this config.ini file next to abe.py.
# https://api.ce-cotoha.com/contents/index.html

[COTOHA API]
Developer API Base URL: https://api.ce-cotoha.com/api/dev/nlp/
Developer Client id: IDIDIDIDIDIDIDIDIDIDIDIDIDIDIDI
Developer Client secret: SECRETSECRETSECRETSECRET
Access Token Publish URL: https://api.ce-cotoha.com/v1/oauth/accesstokens
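
As a quick sanity check of this file format, the sketch below parses an equivalent INI string with `configparser`. It shows that option names containing spaces work and that only the first colon on a line separates the key from the value, so the URLs survive intact. The credential values are placeholders and `SAMPLE_INI` is a hypothetical name.

```python
import configparser

# Same layout as config.ini, with placeholder credentials.
SAMPLE_INI = """\
[COTOHA API]
Developer API Base URL: https://api.ce-cotoha.com/api/dev/nlp/
Developer Client id: YOUR_CLIENT_ID
Developer Client secret: YOUR_CLIENT_SECRET
Access Token Publish URL: https://api.ce-cotoha.com/v1/oauth/accesstokens
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE_INI)
section = config["COTOHA API"]

base_url = section["Developer API Base URL"]
token_url = section["Access Token Publish URL"]

# Report any credential that is absent or empty before calling the API.
required = ("Developer Client id", "Developer Client secret")
missing = [key for key in required if not section.get(key)]

print(base_url, token_url, missing)
```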

5. I've tried it, but I haven't.

$ python abe.py "drink alcohol"
I'm drinking, but I'm not

$ python abe.py "go back home"
I'm home, but I'm not home

$ python abe.py "See the cherry blossoms"
I see the cherry blossoms, but I haven't seen them

$ python abe.py "Eat sushi"
I'm eating sushi, but I'm not eating

$ python abe.py "Invite to the eve"
I have been invited to the eve, but I have not.

**Other things I tried**

$ python abe.py "Stay at the hotel"
I'm staying at a hotel, but I'm not staying

$ python abe.py "answer the questions"
Answered the question but not answered

$ python abe.py "Sleep at night"
I sleep at night, but I don't sleep

$ python abe.py "Walk outside"
I'm walking outside, but I'm not walking

$ python abe.py "View the net"
I'm looking at the net, but I haven't checked

$ python abe.py "Buy meat"
I'm buying meat, but not

$ python abe.py "Burn the fire"
I'm burning a fire, but I'm not burning

6. Shinjiro Koizumi syntax

Inspired by Shinjiro Koizumi's remark **"I said that I am reflecting on it, but I am reflecting on it"**, **I made a program that automatically converts an input sentence into the Shinjiro Koizumi syntax!**

**If we call the syntax above the Shinzo Abe syntax**, it takes the form "affirmative sentence + similar negative sentence".

(Figure: Shinzo Abe syntax)

By contrast, **the Shinjiro Koizumi syntax is simply "affirmative sentence + similar affirmative sentence"**, so **I said that it is similar to the Shinzo Abe syntax, but it is similar**. (Only the joining in step 3.6 changes.)
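
Since the two syntaxes differ only in how the parts are joined, the difference can be pictured as a choice of template. The sketch below uses English placeholder templates because the article does not show the connective strings used by sexy.py; `TEMPLATES` and `join_parts` are hypothetical names, and the wording is illustrative only.

```python
# English placeholder templates; the script's actual Japanese connective strings
# are not shown in the article, so these are illustrative only.
TEMPLATES = {
    "abe": "{clause}, but not {noun}",       # affirmative + similar negative
    "koizumi": "I said {clause}, but {noun}",  # affirmative + similar affirmative
}


def join_parts(style, clause, noun):
    """Render the converted clause and the synonym noun with the chosen template."""
    return TEMPLATES[style].format(clause=clause, noun=noun)


print(join_parts("abe", "I am recruiting", "soliciting"))
print(join_parts("koizumi", "I am reflecting", "I am repenting"))
```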

(Figure: Shinjiro Koizumi syntax)

6.1 I said I'm trying, but I'm trying

$ python sexy.py 'I keep a promise'
I said I'm keeping my promise, but I'm keeping it.

$ python sexy.py 'Entertaining foreigners'
I said that I am entertaining foreigners, but I welcome them.

$ python sexy.py 'Take a break from the company'
I said I'm absent from work, but I'm resting.

$ python sexy.py 'Address environmental issues'
I said that I am working on environmental issues, but I am facing it.

$ python sexy.py 'Destroy NHK'
I said that I'm destroying nhk, but I'm destroying it.

7. Summarized but not summarized

I made a program that automatically converts a sentence into the "I am recruiting, but not recruiting" form! **I am soliciting likes and comments, but I am not soliciting them.** (If there is a sentence whose output you are curious about, feel free to comment and I will try it!)
