First steps in AI: I wanted to try natural language processing, so I tried morphological analysis using MeCab with Python 3.

Background

I referred to various articles about setting up MeCab for Python 3. I am impatient, so I love pages where copying the commands from top to bottom just works. "Mendokusai" (what a hassle) is my catchphrase, and "if it doesn't exist, make it" is my motto. (The second half is beside the point.)

Environment

CentOS7

Install MeCab

Cloning the MeCab repository was the quickest way.

# git clone https://github.com/taku910/mecab.git
# cd mecab/mecab
# ./configure  --enable-utf8-only
# make
# make check
# make install

You can also download MeCab from the page below, but that is rather tedious because there are several different MeCab packages to choose from.

Reference: MeCab https://drive.google.com/drive/folders/0B4y35FiV1wh7fjQ5SkJETEJEYzlqcUY4WUlpZmR4dDlJMWI5ZUlXN2xZN2s2b0pqT3hMbTQ

Install the dictionary

MeCab cannot be used without a dictionary, so install one right away.

# cd ../mecab-ipadic
# ./configure --with-charset=utf8
# make
# make install

Try MeCab on the console

After installation, you can run it on the console, so let's try it.

# mecab
MeCab is free software

MeCab noun,Proper noun,Organization,*,*,*,*
Is a particle,Particle,*,*,*,*,Is,C,Wow
Free noun,General,*,*,*,*,free,free,free
Software noun,General,*,*,*,*,software,software,software
Auxiliary verb,*,*,*,Special Death,Uninflected word,is,death,death
EOS

It worked. It is a relief to see Japanese displayed without any problems.
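The same console check can also be driven from Python by piping text into the mecab command. This is only a minimal sketch: the run_mecab helper is a name made up for this illustration, and it assumes that mecab reads UTF-8 text from standard input, as in the console session above.

```python
import shutil
import subprocess


def run_mecab(text, command="mecab"):
    """Pipe `text` into the mecab CLI and return its output.

    Returns None when the command is not on PATH or exits with an error.
    (run_mecab is a hypothetical helper, not part of MeCab.)
    """
    if shutil.which(command) is None:
        return None
    completed = subprocess.run(
        [command],
        input=(text + "\n").encode("utf-8"),  # mecab reads one sentence per line
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if completed.returncode != 0:
        return None
    return completed.stdout.decode("utf-8")


# Prints the analysis when mecab is installed, otherwise None.
print(run_mecab("MeCab is free software"))
```

If this prints None even though the console run works, the Python process probably sees a different PATH than your shell.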

Try running MeCab from Python 3.5

This is where pip comes in.

# pip install mecab-python3

Some pages on other sites lead with this command out of nowhere, but it cannot work unless MeCab itself is installed as well. Needless to say, I believed that pip could do everything, and when I saw this command I walked straight into the pip trap thinking, "This is easier!"

With MeCab already in place, the installation goes through without any problems.
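Before running pip, it may help to confirm that the MeCab commands are actually visible. The following is a rough sketch, assuming that a source build of mecab-python3 looks for the mecab-config script that MeCab's make install puts on PATH; missing_mecab_tools is a name invented for this example.

```python
import shutil


def missing_mecab_tools(commands=("mecab", "mecab-config")):
    """Return the commands from `commands` that cannot be found on PATH.

    (missing_mecab_tools is a hypothetical helper for illustration.)
    """
    return [cmd for cmd in commands if shutil.which(cmd) is None]


missing = missing_mecab_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("mecab and mecab-config were found on PATH")
```

An empty result only says that the commands exist. It does not prove that a dictionary is installed.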

Now let's write a Python file, test.py.

#test.py

# coding: UTF-8
import MeCab

# -Ochasen selects the ChaSen-compatible output format
m = MeCab.Tagger("-Ochasen")
print(m.parse("Make it yourself because it's annoying"))

Let's run it.

# python test.py
Mendokusai Mendokusai Mendokusai adjective-Independent adjectives and uninflected words
From Kara to particles-Connection particle
Self Jibun Self Noun-General
De de de particle-Case particles-General
Make Tsukuru Make Verb-Independent five-stage, la line basic form
EOS
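The -Ochasen output is tab-separated, so it is easy to split back into fields. Here is a minimal sketch, assuming the usual ChaSen-style column order (surface, reading, base form, part of speech, conjugation type, conjugation form). ChasenToken and parse_chasen are names made up for this illustration, and the sample string is hand-written rather than captured from MeCab.

```python
from typing import List, NamedTuple


class ChasenToken(NamedTuple):
    """One line of -Ochasen output (field order is an assumption)."""
    surface: str
    reading: str
    base: str
    pos: str
    conj_type: str
    conj_form: str


def parse_chasen(output: str) -> List[ChasenToken]:
    """Split -Ochasen style text into tokens, stopping at the EOS line."""
    tokens = []
    for line in output.splitlines():
        if line == "EOS":
            break
        if not line:
            continue
        fields = line.split("\t")
        # Pad short lines so that every token has exactly six fields.
        fields += [""] * (6 - len(fields))
        tokens.append(ChasenToken(*fields[:6]))
    return tokens


# Hand-written sample in the assumed layout, not real MeCab output.
sample = (
    "自分\tジブン\t自分\t名詞-一般\t\t\n"
    "で\tデ\tで\t助詞-格助詞-一般\t\t\n"
    "EOS\n"
)

for token in parse_chasen(sample):
    print(token.surface, token.pos)
```

With MeCab installed, you could pass the string returned by m.parse(...) in test.py to parse_chasen instead of the sample.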

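You can change the output format by changing the argument passed to MeCab.Tagger. Besides -Ochasen (ChaSen-compatible format), there are -Owakati (words separated by spaces), -Oyomi (readings only), and mecabrc (which leaves the default format in place), among others.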

#test2.py

# coding: UTF-8
import MeCab

text = "Make it yourself because it's annoying"

# Parse the same sentence once with each output format
for option in ("-Ochasen", "-Owakati", "-Oyomi", "mecabrc"):
    m = MeCab.Tagger(option)
    print(m.parse(text))

I was curious, so let's print them all.

# python test2.py
Mendokusai Mendokusai Mendokusai adjective-Independent adjectives and uninflected words
From Kara to particles-Connection particle
Self Jibun Self Noun-General
De de de particle-Case particles-General
Make Tsukuru Make Verb-Independent five-stage, la line basic form
EOS

Make it yourself from annoyance

Mendoku Saikara Jibun Detsukuru

Annoying adjectives,Independence,*,*,Adjective, Auoudan,Uninflected word,Troublesome,Annoying,Annoying
From particles,Connection particle,*,*,*,*,From,Kara,Kara
My noun,General,*,*,*,*,myself,Jibun,Jibun
Particles,Case particles,General,*,*,*,so,De,De
Verbs to make,Independence,*,*,Five steps, La line,Uninflected word,create,Tsukuru,Tsukuru
EOS
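The last two formats are also easy to post-process. As a rough sketch, the default format can be read as a surface form, a tab, and a comma-separated feature string, while -Owakati output is just words separated by spaces. parse_default and split_wakati are names invented for this example, and both samples are hand-written in the assumed layout.

```python
def parse_default(output):
    """Turn default-format text into (surface, [features]) pairs.

    Naive comma split: a feature that itself contains a comma would
    need real CSV handling. (parse_default is a hypothetical helper.)
    """
    rows = []
    for line in output.splitlines():
        if line == "EOS":
            break
        if not line:
            continue
        surface, _, feature = line.partition("\t")
        rows.append((surface, feature.split(",")))
    return rows


def split_wakati(output):
    """Split -Owakati text into a list of words."""
    return output.split()


# Hand-written samples, not captured MeCab output.
default_sample = (
    "から\t助詞,接続助詞,*,*,*,*,から,カラ,カラ\n"
    "自分\t名詞,一般,*,*,*,*,自分,ジブン,ジブン\n"
    "EOS\n"
)
wakati_sample = "から 自分 で \n"

for surface, features in parse_default(default_sample):
    print(surface, features[0])
print(split_wakati(wakati_sample))
```

The first feature is the part of speech, which is often all you need for filtering nouns or verbs.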

Memo: an error I ran into when building MeCab myself

What to do when you are told that libmecab.so.2 cannot be found.

ImportError: libmecab.so.2: cannot open shared object file: No such file or directory

Fix

# vi /etc/ld.so.conf.d/lib.conf
/usr/local/lib  # <-- create the file with this line, or append it

# ldconfig  # <-- reload the linker cache
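To confirm that the fix took effect without importing MeCab, you can ask the dynamic loader directly. This is only a sketch: can_load_shared_library is a hypothetical helper that relies on ctypes.CDLL raising OSError when the library cannot be found.

```python
import ctypes


def can_load_shared_library(name="libmecab.so.2"):
    """Return True if the dynamic loader can open the shared library `name`.

    (can_load_shared_library is a hypothetical helper for illustration.)
    """
    try:
        ctypes.CDLL(name)
    except OSError:
        return False
    return True


print("libmecab.so.2 loadable:", can_load_shared_library())
```

If this still prints False after running ldconfig, check that libmecab.so.2 really is under /usr/local/lib.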

Reference: Extraction of important words from Wikipedia by TF / IDF using Mecab Python http://yut.hatenablog.com/entry/20130215/1360884220

Reference: Make the morphological analysis engine MeCab available in Python 3 (March 2016 version) http://qiita.com/grachro/items/4fbc9bf8174c5abb7bdd#_reference-f17313e8bc66cbbff3ef
