[PYTHON] How to quickly create a morphological analysis environment using Elasticsearch on macOS Sierra

Preparation

Pardon the self-promotion, but you will need it later, so please install Jupyter Notebook on macOS by following this page: http://qiita.com/mix_dvd/items/d915752215db67919c06

Check and install Java

Execute the following command to check if it is installed.

$ java -version

If it is not installed, the following dialog will be displayed. Click the "Detailed information ..." button.

(Screenshot 2016-07-20 11.23.23.png)

http://www.oracle.com/technetwork/java/javase/downloads/index.html

The above website will be displayed. Download and install the JDK.

(Screenshot 2016-07-20 11.26.24.png)

After installation, execute the command again to confirm that it is installed.

$ java -version
java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)
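
If you would rather check the version from Python than read it by eye, a small sketch like the one below may help. The function name `parse_java_version` is hypothetical, and the sketch assumes the output contains a `version "..."` part in the form shown above.

```python
import re


def parse_java_version(version_output):
    """Return the quoted version string from `java -version` output, or None."""
    match = re.search(r'version "([^"]+)"', version_output)
    return match.group(1) if match else None


# Sample text copied from the output shown above.
sample = '''java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)'''

print(parse_java_version(sample))  # → 1.8.0_101
```

Note that `java -version` writes to standard error, so when you capture the real output (for example with `subprocess.run(["java", "-version"], capture_output=True, text=True).stderr` on Python 3.7 or later), read that stream rather than standard output.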

Elasticsearch installation

[Official site] https://www.elastic.co/jp/products/elasticsearch

Program installation

Execute the following command

$ curl -O https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/zip/elasticsearch/2.3.4/elasticsearch-2.3.4.zip
$ unzip elasticsearch-2.3.4.zip
$ sudo mv elasticsearch-2.3.4 /usr/local/elasticsearch

Check version

$ /usr/local/elasticsearch/bin/elasticsearch --version
Version: 2.3.4, Build: e455fd0/2016-06-30T11:24:31Z, JVM: 1.8.0_101

Plugin installation

Execute the following command

$ cd /usr/local/elasticsearch
$ bin/plugin install analysis-kuromoji

Start-up

Execute the following command

$ /usr/local/elasticsearch/bin/elasticsearch

Operation check

Start another terminal and execute the following command

$ curl localhost:9200

Alternatively, access the following URL with a web browser

http://localhost:9200

Startup succeeded if you receive a response like the following:

{
  "name" : "Akasha",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "2.3.4",
    "build_hash" : "Xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "build_timestamp" : "2016-06-30T11:24:31Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.0"
  },
  "tagline" : "You Know, for Search"
}
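
The same check can be done programmatically. The following is only a sketch: `check_root_response` is a hypothetical helper, and it assumes the response has the `cluster_name` and `version.number` fields shown above.

```python
import json


def check_root_response(payload, expected_version="2.3.4"):
    """Parse the JSON returned by GET / and report whether the version matches."""
    info = json.loads(payload)
    number = info["version"]["number"]
    return {
        "cluster_name": info["cluster_name"],
        "version": number,
        "ok": number == expected_version,
    }


# Sample payload based on the response shown above (build_hash omitted).
sample = '''{
  "name" : "Akasha",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "2.3.4",
    "build_timestamp" : "2016-06-30T11:24:31Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.0"
  },
  "tagline" : "You Know, for Search"
}'''

result = check_root_response(sample)
print(result)  # → {'cluster_name': 'elasticsearch', 'version': '2.3.4', 'ok': True}
```

To run it against the live server, the payload could be fetched with `urllib.request.urlopen("http://localhost:9200").read().decode("utf-8")` while Elasticsearch is running.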

Installing the library for Python

Execute the following command

$ pip install elasticsearch
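
One thing to watch for: pip installs the newest client by default, while the official Python client is, as far as I know, meant to be paired with a server of the same major version. The sketch below shows one way to compare the two. `client_matches_server` is a hypothetical helper; it assumes the client version is a tuple (the form `elasticsearch.__version__` takes) and the server version is the `number` string from the operation check.

```python
def client_matches_server(client_version, server_version):
    """Return True when the client's major version equals the server's."""
    server_major = int(server_version.split(".")[0])
    return client_version[0] == server_major


# Example values for illustration only, not taken from a real installation.
print(client_matches_server((2, 3, 0), "2.3.4"))   # → True
print(client_matches_server((7, 17, 0), "2.3.4"))  # → False
```

If the versions differ, pinning the client at install time, for example `pip install "elasticsearch>=2.0.0,<3.0.0"`, is one possible way to line it up with Elasticsearch 2.3.4.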

Executing sample code

Save the following code as test.py

test.py



# coding: utf-8

# # Elasticsearch

# In[1]:

from elasticsearch import Elasticsearch
es = Elasticsearch("localhost:9200")
es


# #Variable initialization

# In[2]:

esIndex = "bot"
esType = "talks"


# #Add index

# - curl -X POST http://localhost:9200/bot/talks -d '{"mode":"Greetings", "words":"Good morning"}'

# In[3]:

es.index(index=esIndex, doc_type=esType, body={"mode":"Greetings", "words":"Good morning"})


# In[4]:

es.index(index=esIndex, doc_type=esType, body={"mode":"Greetings", "words":"Hello"})
es.index(index=esIndex, doc_type=esType, body={"mode":"Greetings", "words":"Good evening"})
es.index(index=esIndex, doc_type=esType, body={"mode":"Greetings", "words":"goodbye"})
es.index(index=esIndex, doc_type=esType, body={"mode":"Greetings", "words":"good night"})
es.index(index=esIndex, doc_type=esType, body={"mode":"Quotations", "words":"Nothing to die and pick up a corpse"})


# #Index modification

# - curl -X POST http://localhost:9200/bot/talks?id=AVYGQm6Q8mtRod8eIWiq -d '{"mode":"Greetings","words":"Good night"}'
# 
# Update if the id exists; add if it does not

# In[21]:

es.index(index=esIndex, doc_type=esType, id="AVYGQm6Q8mtRod8eIWiq", body={"mode":"Greetings", "words":"see you tomorrow"})


# #Data acquisition

# - curl -X GET http://localhost:9200/bot/talks/_search?pretty -d '{"query":{"match_all":{}}}'

# In[29]:

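# Refresh first so the documents indexed above are visible to the search right away
es.indices.refresh(index=esIndex)
res = es.search(index=esIndex, body={"query": {"match_all": {}}})
res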


# In[23]:

len(res["hits"]["hits"])

words = []
modes = []

# Collect the text and mode of every hit
for hit in res["hits"]["hits"]:
    row = hit["_source"]
    print(row)
    words.append(row["words"])
    modes.append(row["mode"])


# #Data deletion

# - curl -X DELETE http://localhost:9200/bot/

# In[8]:

#es.indices.delete(index="bot")


# #Use of plugins

# -Morphological analysis

# In[24]:

text = "It's nice weather today, is not it"


# In[25]:

def analyze(es, text):
    
    params = {"analyzer":"kuromoji"}
    body = {"text":text}
    
    http_status, data = es.indices.client.transport.perform_request(
        'GET',
        '/' + esIndex + '/_analyze',
        params=params,
        body=body
    )

    # Return a list so the result can be reused (map() is a one-shot iterator in Python 3)
    return [token.get('token') for token in data.get('tokens')]


# In[26]:

tokens = analyze(es, text)
print(' '.join(tokens))


# In[30]:

for word in words:
    print(' '.join(analyze(es, word)))

Execute the following command

$ python test.py

It worked if you see output like the following!

{'mode': 'Greetings', 'words': 'Good morning'}
{'mode': 'Greetings', 'words': 'Good evening'}
{'mode': 'Greetings', 'words': 'Hello'}
{'mode': 'Greetings', 'words': 'goodbye'}
{'mode': 'Greetings', 'words': 'good night'}
{'mode': 'Quotations', 'words': 'Nothing to die and pick up a corpse'}
{'mode': 'Greetings', 'words': 'see you tomorrow'}
Nice weather today
Good morning
Good evening
Hello
goodbye
good night
Pick up a dead corpse
tomorrow
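
The last lines of this output come from pulling the `token` field out of each entry in the `_analyze` response. That step can be sketched on its own as follows. `extract_tokens` is a hypothetical helper, and the sample response is hand-written for illustration, not real kuromoji output.

```python
def extract_tokens(analyze_response):
    """Return the surface form of each token in an _analyze response body."""
    return [entry["token"] for entry in analyze_response.get("tokens", [])]


# Hand-written sample in the shape of an _analyze response (an assumption for
# illustration; real offsets, types, and tokens depend on the analyzer and text).
sample_response = {
    "tokens": [
        {"token": "nice", "start_offset": 0, "end_offset": 4, "type": "word", "position": 0},
        {"token": "weather", "start_offset": 5, "end_offset": 12, "type": "word", "position": 1},
        {"token": "today", "start_offset": 13, "end_offset": 18, "type": "word", "position": 2},
    ]
}

print(" ".join(extract_tokens(sample_response)))  # → nice weather today
```

Keeping the extraction separate from the HTTP call makes it easy to try out without a running Elasticsearch.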

Now, what shall I do with it next? (^_^;)

Postscript

Oh, I didn't use Jupyter Notebook (sweat)
