Make the morphological analysis engine MeCab available in Python 3 (March 2016 version)

Overview

This article makes the morphological analysis engine MeCab usable from Python 3 installed via pyenv on a Mac.

The content is essentially the same as the existing articles listed below. The patch has since been merged into the official GitHub repository, so the manual patching described in the original articles ~~is reduced to a one-line change in the Python binding code~~ (revised 2016/3/2) is no longer needed at all.

I have only compiled information from the original articles, but since I retried the installation several times, I am recording the steps here.

Official site: http://mecab.googlecode.com/svn/trunk/mecab/doc/index.html
Repository: https://github.com/taku910/mecab

Original articles:
- Make MeCab available from Python 3
- MeCab with Python3
- Use MeCab from Python3 (follow-up) (article by the person who submitted the patch)

Installation

Install MeCab

git clone https://github.com/taku910/mecab.git
cd mecab/mecab
./configure --enable-utf8-only
make
make check
sudo make install

After installation, the following files are in place:

/usr/local/etc/mecabrc
/usr/local/bin/mecab
/usr/local/bin/mecab-config
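
A quick way to confirm that the commands ended up on PATH is to look them up from Python. This is only a minimal sketch using the standard library; the helper name `find_tools` is hypothetical, and the /usr/local locations listed above are the defaults this article assumes.

```python
import shutil

def find_tools(names):
    """Map each command name to its full path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}

if __name__ == "__main__":
    for name, path in find_tools(["mecab", "mecab-config"]).items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If either command is reported as not found, the install prefix is probably not on PATH in the current shell.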

~~If you start mecab from the console and enter Japanese, the morphological analysis result is displayed.~~ (2016/3/2 postscript: the first edition explained how to use the mecab command here, but the command cannot be used until a dictionary is installed.)

Install the dictionary

~~Download the "IPA dictionary" from the official website.~~ ~~http://taku910.github.io/mecab/#install~~ ~~http://taku910.github.io/mecab/#download~~

tar zxfv mecab-ipadic-2.7.0-20070801.tar.gz
cd mecab-ipadic-2.7.0-20070801
./configure --with-charset=utf8
make
sudo make install

(2016/3/2 addendum 2: skip the tarball steps above as well. The dictionary is already included in the git repository, so there is no need to download it. Build it from there instead.)
cd ../mecab-ipadic
./configure --with-charset=utf8
make
sudo make install

(2016/3/2 addendum 2: the steps just above, which use the dictionary bundled in the repository, are the current ones.)

At this point, start mecab from the console and enter Japanese text; the morphological analysis result is displayed.

$ mecab
MeCab is free software
MeCab noun,Proper noun,Organization,*,*,*,*
Is a particle,Particle,*,*,*,*,Is,C,Wow
Free noun,General,*,*,*,*,free,free,free
Software noun,General,*,*,*,*,software,software,software
Auxiliary verb,*,*,*,Special Death,Uninflected word,is,death,death
EOS

Install Python3 bindings

(2016/3/2 postscript: there is an easier method than the one in the first edition. Skip ahead to the pip command at the end of this section.)

~~Next, prepare to use MeCab from Python. Bindings for various languages are provided in the directory that was git cloned earlier, so move to the python directory.~~

cd [MeCab git cloned directory]
cd mecab/mecab/python

(2016/3/2 addendum: skip this step.)

~~Next, one line of setup.py needs to be modified. Be careful not to delete the tab before return.~~

~~Reference: the article "MeCab with Python3"~~

vi setup.py

def cmd2(str):
    return string.split (cmd1(str))

Change it to:

def cmd2(str):
    return cmd1(str).split()

(2016/3/2 addendum: skip this step as well.)
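
For background on why that one-line change was needed: the module-level function `string.split` existed in Python 2 but was removed in Python 3, where splitting is a method on the string itself. The sketch below illustrates the difference. `fake_cmd1` is a hypothetical stand-in for the `cmd1` helper in setup.py, and its return value is made-up sample text, not real command output.

```python
import string

def fake_cmd1(command):
    """Hypothetical stand-in for cmd1(): pretend this is one line of command output."""
    return "-L/usr/local/lib -lmecab -lstdc++"

def cmd2(command):
    # Python 3 form: call split() on the returned string.
    return fake_cmd1(command).split()

# The Python 2 spelling string.split(...) is gone in Python 3.
print("string.split available:", hasattr(string, "split"))
print(cmd2("example-command"))
```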

~~After making the fix, install it.~~

python setup.py build
sudo python setup.py install

(2016/3/2 addendum: skip this step as well.)

(2016/3/2 postscript: there is a simpler procedure. As described in the article below, MeCab can be used from Python 3 by installing the binding with the pip command.)

- Using mecab with Python3

pip install mecab-python3
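
With pyenv it is easy for pip to install into a different Python version than the one that is currently active. A small sketch for checking whether the binding is importable from the running interpreter follows; the helper name `has_module` is hypothetical, and `MeCab` is the module name imported in the sample in this article.

```python
import importlib.util
import sys

def has_module(name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(name) is not None

if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("MeCab importable:", has_module("MeCab"))
```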

Try it out

Try running the Python sample from the official website. The original is Python 2 code, so only the print statement is changed.

import sys
import MeCab

# "-Ochasen" selects the ChaSen-compatible output format
m = MeCab.Tagger("-Ochasen")
print(m.parse("今日もしないとね"))

Execution result

今日	キョウ	今日	名詞-副詞可能
も	モ	も	助詞-係助詞
し	シ	する	動詞-自立	サ変・スル	未然形
ない	ナイ	ない	助動詞	特殊・ナイ	基本形
と	ト	と	助詞-接続助詞
ね	ネ	ね	助詞-終助詞
EOS
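
For further processing it is often handier to turn the text returned by parse into structured records. The sketch below assumes the ChaSen-style output is tab-separated (surface form, reading, base form, part of speech, then optional conjugation fields) and ends with an EOS line. The names `Token` and `parse_chasen` are hypothetical, and the sample string is hand-written rather than captured from a real run, so the code runs without MeCab installed.

```python
from typing import List, NamedTuple

class Token(NamedTuple):
    surface: str
    reading: str
    base: str
    pos: str

def parse_chasen(text: str) -> List[Token]:
    """Split ChaSen-style output into tokens, stopping at the EOS marker."""
    tokens = []
    for line in text.splitlines():
        if line == "EOS":
            break
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip blank or unexpected lines
        tokens.append(Token(*fields[:4]))
    return tokens

# Hand-written sample in the assumed format (not real MeCab output).
sample = "今日\tキョウ\t今日\t名詞-副詞可能\nも\tモ\tも\t助詞-係助詞\nEOS\n"
for tok in parse_chasen(sample):
    print(tok.surface, tok.pos)
```

With MeCab installed, the same function could presumably be applied to the string returned by the tagger's parse call, provided the output really is tab-separated as assumed here.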

Please let me know if any step in this procedure is wrong.
