Natural language processing (NLP) is a set of techniques that let a computer process the natural language that humans use every day. It is a subfield of artificial intelligence and linguistics.
[Wikipedia: Natural Language Processing](https://ja.wikipedia.org/wiki/自然言語処理)
For the last few years (or maybe longer?), a Google search has shown the Wikipedia description in a boxed panel. That, too, amounts to "extracting the important part from Wikipedia", which can be considered a kind of natural language processing. (By the way, "extract the first sentence of the text" is a well-known rule of thumb for summarization tasks such as news and reports.)
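The "first sentence" rule of thumb mentioned above can be sketched in a few lines. This is only a minimal illustration: the function name `lead_summary` and the sample text are made up for this example, and the regular expression is a rough heuristic rather than a real sentence segmenter (abbreviations such as "Mr." will be split too).

```python
import re


def lead_summary(text, n_sentences=1):
    """Return the first n_sentences of text as a naive 'lead' summary."""
    # Each match is a run of non-terminator characters followed by an
    # optional sentence terminator (Japanese 。 or ASCII . ! ?).
    parts = re.findall(r"[^。.!?]+[。.!?]?", text)
    sentences = [p.strip() for p in parts if p.strip()]
    return " ".join(sentences[:n_sentences])


# Hypothetical sample text, not taken from any real article
article = ("The API was released today. It summarizes product reviews. "
           "Pricing is not yet announced.")
print(lead_summary(article))
```

Despite its simplicity, this kind of baseline works reasonably well for news-style text because such articles tend to put the key information first.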
Natural language processing is a deep field, and implementing the actual processing yourself is fun, but there are also APIs that you can use easily without doing that. This time I will introduce one of them, the goo labs API.
Several APIs are available:
- Product review summarization API
- Morphological analysis API
- Named entity extraction API
- Word similarity calculation API
- Hiragana conversion API
The APIs are called by sending a POST request, so any tool will do, but this time I'll try it with Python.
python
# -*- coding: utf-8 -*-
import json
import requests
# goo labs API
# Overview: review summarization, named entity extraction, and word similarity APIs
app_id = "YOUR_APP_ID"  # placeholder: replace with the ID issued when you register
url_shortsum = "https://labs.goo.ne.jp/api/shortsum"
url_entity = "https://labs.goo.ne.jp/api/entity"
url_similarity = "https://labs.goo.ne.jp/api/similarity"

def shortsum(length, review_list, request_id="record001"):
    # shortsum API interface: https://labs.goo.ne.jp/api/2015/1150/
    # length: summary length in characters; review_list: list of review texts
    payload = {"app_id": app_id, "request_id": request_id, "length": length, "review_list": review_list}
    headers = {'content-type': 'application/json'}
    r = requests.post(url_shortsum, data=json.dumps(payload), headers=headers)
    print(r.text)

def entity(sentence, class_filter, request_id="record002"):
    # entity API interface: https://labs.goo.ne.jp/api/2015/336/
    # class_filter: classes to extract, joined with "|":
    # "ORG" (organization name), "PSN" (person name), "LOC" (location name),
    # "DAT" (date expression) and "TIM" (time expression)
    payload = {"app_id": app_id, "request_id": request_id, "sentence": sentence, "class_filter": class_filter}
    headers = {'content-type': 'application/json'}
    r = requests.post(url_entity, data=json.dumps(payload), headers=headers)
    print(r.text)

def similarity(word1, word2, request_id="record003"):
    # similarity API interface: https://labs.goo.ne.jp/api/2015/330/
    # query_pair: the two words whose similarity score is requested
    payload = {"app_id": app_id, "request_id": request_id, "query_pair": [word1, word2]}
    headers = {'content-type': 'application/json'}
    r = requests.post(url_similarity, data=json.dumps(payload), headers=headers)
    print(r.text)

if __name__ == '__main__':
    # Summarize two product reviews in about 120 characters
    shortsum(120, [
        "I've bought it before, so this smoothie has a fruity taste and is very delicious. It was a great deal because I was able to purchase it at a low price this time. This smoothie is easy to drink, and the mango taste is refreshing without getting tired, so I would like to repeat it again.",
        "The long-awaited green smoothie has arrived. I don't like milk or soy milk, so I divided it with 180cc of water and drank it. It didn't smell as green as I expected, and the mango taste was faint and not delicious. I imagined something richer, but it's quite thin probably because of the water. It seems better to drink at room temperature, but I wanted to break it with ice. Perhaps because of the water, I don't feel hungry, and it seems impossible to replace one meal. Personally, I wanted to incorporate enzymes into my body, but when I told my acquaintance that I had purchased them, I received the following opinions. \"Metabolizing enzymes and digestive enzymes are completely different. Smoothies may help digestion, but they are not possible as metabolic enzymes. They are first broken down in the stomach, so they don't go where they are needed.\" Now, what about the effect? I want to take in enzymes."
    ])
    entity("Mr. Yamashita will go to Niigata at 10:30 tomorrow.", "ORG|PSN|LOC|DAT")
    similarity("Goodwill", "salesman")
The POST request is sent with requests.post(...). For the product review summarization API, I pass in two reviews of a product from a certain mail-order site.
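The three functions in the script differ only in the endpoint name and the payload fields, so the request could also be built in one place. The following is a minimal sketch under that assumption: `build_request` and `call_api` are hypothetical helper names, not part of the goo labs API, and the timeout value is an arbitrary choice. Only the base URL and the field names come from the script above.

```python
import json

API_BASE = "https://labs.goo.ne.jp/api/"  # base URL taken from the script above


def build_request(endpoint, app_id, **params):
    """Return (url, headers, body) for an API call without sending it."""
    payload = {"app_id": app_id}
    payload.update(params)
    headers = {"Content-Type": "application/json"}
    return API_BASE + endpoint, headers, json.dumps(payload)


def call_api(endpoint, app_id, timeout=10, **params):
    """Send the POST request and return the decoded JSON response as a dict."""
    import requests  # imported here so build_request can be used on its own

    url, headers, body = build_request(endpoint, app_id, **params)
    r = requests.post(url, data=body, headers=headers, timeout=timeout)
    r.raise_for_status()  # raise on HTTP errors instead of printing an error body
    return r.json()


# Example call (needs a real app_id and network access):
# result = call_api("similarity", "YOUR_APP_ID",
#                   request_id="record003", query_pair=["Goodwill", "salesman"])
```

Returning the decoded dictionary instead of printing the raw text makes it easier to use individual fields such as the summary or the score later on.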
python
(agile_env)nlp $ python test.py
{"request_id":"record001","length":60,"summary":"I've bought it before, so this smoothie has a fruity taste and is very delicious. It was a great deal because I was able to purchase it at a low price this time."}
{"request_id":"record002","class_filter":"ORG|PSN|LOC|DAT","ne_list":[["Yamashita","PSN"],["tomorrow","DAT"],["Niigata","LOC"]]}
{"request_id":"record003","score":0.4345982085070782}
(agile_env)nlp $
(agile_env)nlp $
(agile_env)nlp $
(agile_env)nlp $ python test.py
{"request_id":"record001","length":120,"summary":"It was a great deal because I was able to purchase it at a low price this time. I don't like milk or soy milk, so I divided it with 180cc of water and drank it. It didn't smell as green as I expected, and the mango taste was faint and not delicious. It seems better to drink at room temperature, but I wanted to break it with ice."}
{"request_id":"record002","class_filter":"ORG|PSN|LOC|DAT","ne_list":[["Yamashita","PSN"],["tomorrow","DAT"],["Niigata","LOC"]]}
{"request_id":"record003","score":0.4345982085070782}
"length":60 is the length (in characters) of the summary generated from the reviews. I ran the script twice, once with 60 and once with 120. Two cases are hardly enough to see a trend, but as the number of characters increases, more details are included accordingly. In addition, the sense of "good value for money" comes through at both 60 and 120 characters.
As for "Goodwill" and "salesman", it seemed interesting to look at the similarity between two words that are used in a mystery.
As you can see in the article "Possibilities of 'Yahoo Real-time Search' Expanding from 'Sentiment Analysis': Ask the Development Team", the mainstream approach seems to be to judge, for a given keyword, what kind of emotion people have toward it as positive or negative.
Is this tweet "positive" or "negative" toward the keyword "Anitube closed"?
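To make the positive/negative judgment concrete, here is a toy sketch that simply counts words from two hand-made word lists. Everything in it is an assumption for illustration: the word lists, the function name `polarity`, and the sample sentences are hypothetical, and this is not how Yahoo Real-time Search or any production system works.

```python
# Hypothetical, hand-made word lists for illustration only
POSITIVE = {"great", "love", "delicious", "happy"}
NEGATIVE = {"sad", "terrible", "hate", "awful"}


def polarity(text):
    """Classify text as positive, negative, or neutral by counting words."""
    # Lowercase each whitespace-separated token and strip common punctuation
    words = [w.strip('.,!?"').lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(polarity("I love this smoothie, it is delicious!"))
print(polarity("So sad, this is terrible news."))
```

Simple counting like this cannot tell whether a tweet about a site closing is written by someone who welcomes the closure or regrets it, which is exactly why judging sentiment toward a specific keyword is the harder and more interesting problem.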
Recently, Google introduced a technology that sparks the imagination: "Computer, respond to this email."
And even more recently, the algorithms that underpin this kind of technology were released as open-source software so that they can be used outside Google: "Google releases TensorFlow: Search giant makes its artificial intelligence software available to the public"