[PYTHON] A casual article for those who want to start natural language processing

Natural language processing

Natural language processing (NLP) is a set of technologies that let a computer process the natural language humans use every day. It is a field of artificial intelligence and linguistics.

[Wikipedia Natural Language Processing](https://ja.wikipedia.org/wiki/Natural Language Processing)

For the last few years (or perhaps longer), a Google search has shown the Wikipedia description in a boxed panel. That, too, amounts to "extracting the important part from Wikipedia" and can be considered a kind of natural language processing. (Incidentally, "extracting the first sentence of a text" is a well-known rule of thumb for summarization tasks such as news and reports.)
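The first-sentence rule of thumb can be sketched in a few lines. This is only a minimal illustration: the function name `lead_summary` and the sample text are made up for this example, and the regular expression is a rough sentence splitter for English, not a real tokenizer.

```python
import re


def lead_summary(text, n_sentences=1):
    """Return the first n_sentences of text as a naive summary."""
    # Split after '.', '!' or '?' followed by whitespace (rough heuristic).
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:n_sentences])


# Hypothetical sample text, written only for this example.
article = (
    "Natural language processing lets computers handle human language. "
    "It is used in search engines and translation. "
    "Many APIs make it easy to try."
)
print(lead_summary(article))
```

Japanese text would need a different splitting rule (for example on "。"), so treat this as a baseline idea rather than a summarizer.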

API

Natural language processing is deep, and implementing concrete processing yourself is fun, but there also seem to be APIs that are easy to use without going that far, so this time I will introduce one.

goo lab API

https://labs.goo.ne.jp/api/ supported by goo

It offers a variety of APIs:

- Product Review Summarization API
- Morphological Analysis API
- Named entity extraction API (Japanese named entity extraction API)
- Word similarity calculation API (Japanese word similarity API)
- Hiragana conversion API (Japanese hiragana conversion API)

Trying a few

The documentation says to send a POST request, so any method will do, but this time I'll try it with Python.

python


# -*- coding: utf-8 -*-

import json
import requests

# goo lab API
# Overview: review summarization, named entity extraction and word similarity
app_id = "YOUR_APP_ID"  # ID issued when you register

url_shortsum = "https://labs.goo.ne.jp/api/shortsum"
url_entity = "https://labs.goo.ne.jp/api/entity"
url_similarity = "https://labs.goo.ne.jp/api/similarity"


def shortsum(length, review_list, request_id="record001"):
	# shortsum API interface: https://labs.goo.ne.jp/api/2015/1150/
	payload = {"app_id": app_id, "request_id": request_id, "length": length, "review_list": review_list}
	headers = {'content-type': 'application/json'}
	r = requests.post(url_shortsum, data=json.dumps(payload), headers=headers)
	print(r.text)

def entity(sentence, class_filter, request_id="record002"):
	# entity API interface: https://labs.goo.ne.jp/api/2015/336/
	# “ORG”(organization name), “PSN”(person name), “LOC”(location name), “DAT”(date expression) and “TIM”(time expression)
	payload = {"app_id": app_id, "request_id": request_id, "sentence": sentence, "class_filter": class_filter}
	headers = {'content-type': 'application/json'}
	r = requests.post(url_entity, data=json.dumps(payload), headers=headers)
	print(r.text)

def similarity(word1, word2, request_id="record003"):
	# similarity API interface: https://labs.goo.ne.jp/api/2015/330/
	payload = {"app_id": app_id, "request_id": request_id, "query_pair": [word1, word2]}
	headers = {'content-type': 'application/json'}
	r = requests.post(url_similarity, data=json.dumps(payload), headers=headers)
	print(r.text)


if __name__ == '__main__':
	
	shortsum(120, [
		"I've bought it before, so this smoothie has a fruity taste and is very delicious. It was a great deal because I was able to purchase it at a low price this time. This smoothie is easy to drink, and the mango taste is refreshing without getting tired, so I would like to repeat it again.",
		"The long-awaited green smoothie has arrived. I don't like milk or soy milk, so I divided it with 180cc of water and drank it. It didn't smell as green as I expected, and the mango taste was faint and not delicious. I imagined something richer, but it's quite thin probably because of the water. It seems better to drink at room temperature, but I wanted to break it with ice. Perhaps because of the water, I don't feel hungry, and it seems impossible to replace one meal. Personally, I wanted to incorporate enzymes into my body, but when I told my acquaintance that I had purchased them, I received the following opinions. 'Metabolizing enzymes and digestive enzymes are completely different. Smoothies may help digestion, but they are not possible as metabolic enzymes. They are first broken down in the stomach, so they don't go where they are needed.' Now, what about the effect? I want to take in enzymes."
		]
	)
	entity("Mr. Yamashita will go to Niigata at 10:30 tomorrow.", "ORG|PSN|LOC|DAT")
	similarity("Goodwill", "salesman")

Each function sends a POST request with requests.post(...). For the product review summarization API, I passed in two reviews of a product from a certain mail-order site.
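The three functions above differ only in the endpoint and the parameters, so the shared part can be factored out. The following is a sketch under assumptions: `build_request` and `API_BASE` are names invented here, and the function only assembles the URL, headers and JSON body without sending anything.

```python
import json

# Base URL taken from the endpoints used in the script above.
API_BASE = "https://labs.goo.ne.jp/api/"


def build_request(endpoint, app_id, request_id, **params):
    """Return (url, headers, body) for an API call without sending it."""
    payload = {"app_id": app_id, "request_id": request_id}
    payload.update(params)  # endpoint-specific fields such as query_pair
    headers = {"Content-Type": "application/json"}
    return API_BASE + endpoint, headers, json.dumps(payload)


url, headers, body = build_request(
    "similarity", "YOUR_APP_ID", "record003",
    query_pair=["Goodwill", "salesman"],
)
print(url)
print(body)
```

The result can then be passed to `requests.post(url, data=body, headers=headers)`, which is the same call the script makes.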

Execution result

python


(agile_env)nlp $ python test.py
{"request_id":"record001","length":60,"summary":"I've bought it before, so this smoothie has a fruity taste and is very delicious. It was a great deal because I was able to purchase it at a low price this time."}
{"request_id":"record002","class_filter":"ORG|PSN|LOC|DAT","ne_list":[["Yamashita","PSN"],["tomorrow","DAT"],["Niigata","LOC"]]}
{"request_id":"record003","score":0.4345982085070782}
(agile_env)nlp $ 
(agile_env)nlp $ 
(agile_env)nlp $ 
(agile_env)nlp $ python test.py
{"request_id":"record001","length":120,"summary":"It was a great deal because I was able to purchase it at a low price this time. I don't like milk or soy milk, so I divided it with 180cc of water and drank it. It didn't smell as green as I expected, and the mango taste was faint and not delicious. It seems better to drink at room temperature, but I wanted to break it with ice."}
{"request_id":"record002","class_filter":"ORG|PSN|LOC|DAT","ne_list":[["Yamashita","PSN"],["tomorrow","DAT"],["Niigata","LOC"]]}
{"request_id":"record003","score":0.4345982085070782}

"length" is the length (in characters) of the summary generated from the reviews; the first run above used 60 and the second used 120. Two runs are not enough to see a clear trend, but as the character count increases, more detail is included. In both the 60-character and 120-character summaries, the "good value for money" point comes through.

"Goodwill" and "salesman" are simply two words I picked on a whim; it seemed interesting to see how similar they are judged to be.

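The responses in the execution result are JSON strings, so they can be turned into Python objects instead of just being printed. As a small sketch, the entity response shown above is copied into a string and grouped by class; the variable names are chosen only for this example.

```python
import json

# Response text copied from the execution result above (entity API).
raw = (
    '{"request_id":"record002","class_filter":"ORG|PSN|LOC|DAT",'
    '"ne_list":[["Yamashita","PSN"],["tomorrow","DAT"],["Niigata","LOC"]]}'
)

result = json.loads(raw)

# Group the extracted expressions by their named entity class.
by_class = {}
for surface, ne_class in result["ne_list"]:
    by_class.setdefault(ne_class, []).append(surface)

print(by_class)
```

In the script itself, `r.json()` from requests would give the same dictionary as `json.loads(r.text)`.
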
Sentiment analysis

As described in the article "Possibilities of 'Yahoo Real-time Search' Expanding from 'Sentiment Analysis': Ask the Development Team", the mainstream approach seems to be judging, for a given keyword, whether the feeling expressed toward it is positive or negative.

Difficulty

For the keyword "Anitube closed", is this tweet "positive" or "negative"?

[Image: 感情分析.png (sentiment analysis)]
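To see why this is hard, consider the simplest possible approach: counting positive and negative words. The sketch below is a toy, not how any real service works; the word lists and both sample sentences are hypothetical and written only for this example.

```python
# Hypothetical toy lexicons; a real system would use a far larger
# resource or a trained model.
POSITIVE = {"great", "delicious", "happy", "love"}
NEGATIVE = {"sad", "terrible", "closed", "hate"}


def polarity(text):
    """Classify text as positive, negative or neutral by lexicon hits."""
    words = [w.strip(".,!?\"'").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(polarity("This smoothie is delicious and a great deal!"))
print(polarity("Anitube closed. Great, now I can finally study."))
```

The first sentence is labeled positive, but the second comes out neutral because "closed" and "great" cancel each other. Word counts alone cannot tell whether the author is pleased, sarcastic or disappointed, which is exactly the ambiguity the tweet illustrates.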

Deep learning

Topic: Automatic email reply

Recently, Google introduced a technology that sparks the imagination: "Computer, respond to this email."

More recently still, the algorithms that underpin this kind of technology were released as OSS so that they can be used by people outside Google: "Google releases TensorFlow: Search giant makes its artificial intelligence software available to the public"

TensorFlow Get Started
