This article shows how to install MeCab with UTF-8 support on a Sakura shared (rental) server, then call it from Python to run morphological analysis. The Sakura rental server does not grant root privileges, so everything is installed under the user directory.
Note: the commands in this procedure are written for bash. To switch your shell to bash, see the supplement "Standardize bash on Sakura rental server" at the end of this article.
What to install:
・ MeCab itself (mecab-0.996)
・ MeCab dictionary (mecab-ipadic)
・ pip, the Python package management system
・ mecab-python
(1) Download mecab from the following site. http://taku910.github.io/mecab/#download
(2) Extract the MeCab source archive with the tar command
tar xvfz ./mecab-0.996.tar.gz
(3) Move to the extracted directory
cd mecab-0.996
(4) Configure, build, and install
Run the following commands to install into the user directory. --prefix sets the install location, and the charset options build MeCab for UTF-8.
./configure --prefix=$HOME/local --with-charset=utf8 --enable-utf8-only
make
make install
In my environment, it was installed in the following location.
~/local/bin/mecab
(5) Check that it is installed
mecab -v
If the installation succeeded, a version line like the following is displayed. If the shell cannot find the command, check that ~/local/bin is on your PATH.
mecab of 0.996
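The same check can be scripted from Python. The following is a minimal sketch, assuming the ~/local prefix used in this article; the function name mecab_version is hypothetical and not part of MeCab.

```python
import os
import subprocess


def mecab_version(prefix="~/local"):
    """Return the text printed by `mecab -v`, or None if no binary is found.

    `prefix` is assumed to be the directory given to ./configure --prefix.
    """
    binary = os.path.join(os.path.expanduser(prefix), "bin", "mecab")
    if not os.path.isfile(binary):
        return None
    # check_output returns bytes, so decode them before returning.
    output = subprocess.check_output([binary, "-v"], stderr=subprocess.STDOUT)
    return output.decode("utf-8", "replace").strip()


if __name__ == "__main__":
    print(mecab_version() or "mecab not found under ~/local/bin")
```

Because it looks directly under the prefix, this works even when ~/local/bin is not on PATH.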
(1) Download the MeCab IPA dictionary (mecab-ipadic) from the following site. http://taku910.github.io/mecab/#download
(2) Extract the IPA dictionary
tar xvzf mecab-ipadic-2.7.0-20070801.tar.gz
(3) Move to the dictionary directory
cd mecab-ipadic-2.7.0-20070801
(4) Run configure with the following command
The dictionary character encoding is set to UTF-8, because the results will later be displayed on the web.
./configure --with-charset=utf8
However, even with utf8 specified here, the output of mecab may still come out in EUC-JP.
In that case, convert the files with the "csv" and "def" extensions in "mecab-ipadic-2.7.0-20070801" to UTF-8, overwriting them in place.
The first two commands convert the files to UTF-8, and the third confirms the result.
nkf -w --overwrite *.csv
nkf -w --overwrite *.def
nkf --guess *.*
Reference: How to use mecab with Sakura shared server, UTF-8 dictionary http://nymemo.com/sakura/258/
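If nkf is not available, a similar check and conversion can be approximated in Python. This is a rough sketch, not a replacement for nkf: it only distinguishes UTF-8 from EUC-JP by trying to decode the bytes, and the helper names guess_encoding and to_utf8 are hypothetical. Note that pure-ASCII data is reported as UTF-8.

```python
# -*- coding: utf-8 -*-


def guess_encoding(data):
    """Return 'utf-8' or 'euc_jp' for a byte string, or None if neither decodes."""
    for name in ("utf-8", "euc_jp"):
        try:
            data.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    return None


def to_utf8(data):
    """Re-encode a byte string as UTF-8; raise ValueError if the encoding is unknown."""
    name = guess_encoding(data)
    if name is None:
        raise ValueError("data is neither UTF-8 nor EUC-JP")
    return data.decode(name).encode("utf-8")
```

To apply it to a dictionary file, read the file in binary mode, pass the bytes to to_utf8, and write the result back in binary mode.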
(5) Build and install
Run the following commands to install into the user directory.
make
make install
In my environment, the dictionary was installed in the following location.
~/local/lib/mecab/dic/ipadic
Specify the dictionary as shown below and start mecab.
mecab -d ~/local/lib/mecab/dic/ipadic
Make sure the terminal encoding is set to UTF-8.
If successful, typing a sentence (here the Japanese tongue twister すもももももももものうち, roughly "plums and peaches are both kinds of peach") displays the following.
[home@www1635 ~/local/etc]$ mecab -d ~/local/lib/mecab/dic/ipadic
すもももももももものうち
すもも	名詞,一般,*,*,*,*,すもも,スモモ,スモモ
も	助詞,係助詞,*,*,*,*,も,モ,モ
もも	名詞,一般,*,*,*,*,もも,モモ,モモ
も	助詞,係助詞,*,*,*,*,も,モ,モ
もも	名詞,一般,*,*,*,*,もも,モモ,モモ
の	助詞,連体化,*,*,*,*,の,ノ,ノ
うち	名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
EOS
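In MeCab's default output format, each token is printed as a surface form, a tab, and a comma-separated feature list, and each sentence ends with an EOS line. A minimal sketch of turning that text into Python data follows; parse_mecab_output is a hypothetical helper, not part of MeCab.

```python
# -*- coding: utf-8 -*-


def parse_mecab_output(output):
    """Split MeCab's default output into (surface, [features]) pairs."""
    tokens = []
    for line in output.splitlines():
        # Skip blank lines and the end-of-sentence marker.
        if not line or line == "EOS":
            continue
        surface, _, feature = line.partition("\t")
        tokens.append((surface, feature.split(",")))
    return tokens


if __name__ == "__main__":
    sample = u"すもも\t名詞,一般,*,*,*,*,すもも,スモモ,スモモ\nEOS\n"
    for surface, features in parse_mecab_output(sample):
        # features[0] is the part of speech in the IPA dictionary layout.
        print(surface + u" / " + features[0])
```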
pip is installed so that it can be used to install mecab-python.
easy_install --prefix=~/.local pip
Check that pip is installed:
[home@www1635 ~/local/etc]$ pip --version
pip 9.0.1 from /home/homedir/.local/lib/python2.7/site-packages/pip-9.0.1-py2.7.egg (python 2.7)
Next, install mecab-python with pip.
pip install mecab-python --user
The --user option installs the package into the user directory, which is necessary because the Sakura server does not grant root privileges.
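To confirm that the package landed where Python can find it, a quick import check helps. The following is a minimal sketch; is_installed is a hypothetical helper name, and the module name MeCab is the one imported by the sample in this article.

```python
import importlib
import site


def is_installed(module_name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # pip install --user places packages under the user site-packages directory.
    print("user site-packages: " + site.getusersitepackages())
    print("MeCab importable: " + str(is_installed("MeCab")))
```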
Save the following source code as sample.py.
# coding: UTF-8
import MeCab

# Specify the location of the dictionary with the full path
# (a leading "~" is expanded by the shell, not by MeCab, so it does not work here)
userdic_path = "-d /home/homedir/local/lib/mecab/dic/ipadic"
t = MeCab.Tagger("-Ochasen " + userdic_path)
text = u'すもももももももものうち'
# Under Python 2, pass MeCab a UTF-8 encoded byte string
encoded_text = text.encode('utf-8')
meData = t.parse(encoded_text)
print meData
When executed, the following is output.
すもも	スモモ	すもも	名詞-一般
も	モ	も	助詞-係助詞
もも	モモ	もも	名詞-一般
も	モ	も	助詞-係助詞
もも	モモ	もも	名詞-一般
の	ノ	の	助詞-連体化
うち	ウチ	うち	名詞-非自立-副詞可能
EOS
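The sample hard-codes the full dictionary path. A minimal sketch of building the same option string from a ~-style path is shown here, assuming the dictionary location used in this article; tagger_args is a hypothetical helper, not part of mecab-python, and it does not handle paths that contain spaces.

```python
import os


def tagger_args(dicdir="~/local/lib/mecab/dic/ipadic", output_format="chasen"):
    """Build the option string for MeCab.Tagger with an expanded dictionary path."""
    # Expand "~" ourselves, since no shell is involved when Python calls MeCab.
    full_path = os.path.abspath(os.path.expanduser(dicdir))
    return "-O{0} -d {1}".format(output_format, full_path)


if __name__ == "__main__":
    # Intended use in sample.py: t = MeCab.Tagger(tagger_args())
    print(tagger_args())
```

This keeps the script independent of the account name in the home directory path.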
That's all! Next, I will write an article on how to display the execution result of MeCab in a web browser.
Supplement: Standardize bash on Sakura rental server http://note.sicafe.net/sakuraVPS/sakura_vimInstall.html