[PYTHON] Summary of stumbling blocks in installing CaboCha

What is CaboCha?

CaboCha is a Japanese dependency parser used in natural language processing.

Basic installation

As the official site explains, you basically download the source and build and install it with the following commands. You also need to install CRF++ and MeCab beforehand.

./configure
make
make check
sudo make install

If you want to use it from Python, also run the following in the python folder of the source tree.

python setup.py install
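
To confirm that the binding was actually installed, a small check like the one below may help. It is only a sketch: it assumes the binding is importable under the module name CaboCha and that it exposes Parser, parse, toString and FORMAT_TREE as in the sample script shipped with the source. The helper names binding_available and parse_to_tree are made up for this illustration.

```python
import importlib.util


def binding_available(module_name="CaboCha"):
    """Return True if the named module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


def parse_to_tree(sentence):
    """Parse one Japanese sentence and return the tree-format output."""
    import CaboCha  # imported lazily so the check above works without it

    parser = CaboCha.Parser()
    tree = parser.parse(sentence)
    return tree.toString(CaboCha.FORMAT_TREE)


if __name__ == "__main__":
    if binding_available():
        print(parse_to_tree("太郎は花子が読んでいる本を次郎に渡した"))
    else:
        print("CaboCha binding not found; run setup.py install first")
```

If the import fails even though setup.py finished, the library load error described later in this post is a likely cause.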

Likely stumbling blocks

Source URLs

The official download link jumps to a public Google Drive folder, so for a moment I did not know which URL to pass to wget when installing on CentOS. For the time being, I was able to download from the URLs below. (Only CRF++ did not seem to be hosted on googlecode, so its URL points at Google Drive.)

- URLs
MeCab http://mecab.googlecode.com/files/mecab-0.996.tar.gz
CRF++ https://googledrive.com/host/0B4y35FiV1wh7fngteFhHQUN2Y1B5eUJBNHZUemJYQV9VWlBUb3JlX0xBdWVZTWtSbVBneU0/CRF++-0.58.tar.gz
CaboCha http://cabocha.googlecode.com/files/cabocha-0.996.tar.bz2
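
If wget is not at hand, the same download can be sketched with the Python standard library. This is a minimal sketch, not a tested recipe: the helper names archive_name and download are invented here, only the CaboCha and CRF++ URLs from the list are used, and these hosts may no longer serve the files.

```python
import os
import posixpath
import urllib.parse
import urllib.request

# Two of the URLs from the list above; they may no longer resolve.
SOURCE_URLS = [
    "http://cabocha.googlecode.com/files/cabocha-0.996.tar.bz2",
    "https://googledrive.com/host/0B4y35FiV1wh7fngteFhHQUN2Y1B5eUJBNHZUemJYQV9VWlBUb3JlX0xBdWVZTWtSbVBneU0/CRF++-0.58.tar.gz",
]


def archive_name(url):
    """Derive the local file name from the last path component of the URL."""
    path = urllib.parse.urlsplit(url).path
    return posixpath.basename(urllib.parse.unquote(path))


def download(url, directory="."):
    """Fetch url into directory and return the path of the saved archive."""
    target = os.path.join(directory, archive_name(url))
    urllib.request.urlretrieve(url, target)
    return target


if __name__ == "__main__":
    # Only print the file names here; call download(url) to actually fetch.
    for url in SOURCE_URLS:
        print(archive_name(url))
```

Deriving the file name from the URL path matters mostly for the long Google Drive URL, where the archive name is only the last component.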

iconv conversion failed. skip this entry....

The following warning (?) may appear over and over when you run make.

iconv conversion failed. skip this entry....

It seems to be a character encoding problem. Pass the following options to ./configure:

./configure --with-charset=utf8 --enable-utf8-only

Library load error

An error like this appears when running make:

error while loading shared libraries: libcrfpp.so.0: cannot open shared object file: No such file or directory

It depends on the environment, but it is fixed by adding the directory that holds the library to the dynamic linker's search path and refreshing the cache.

echo "/usr/local/lib" | sudo tee -a /etc/ld.so.conf.d/lib.conf
sudo ldconfig
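
Whether the linker can now see the library can be checked roughly from Python. The sketch below relies on ctypes.util.find_library, which on Linux consults the ldconfig cache among other sources, so it is an approximation rather than an exact replay of what the loader does. The helper name find_shared_library is made up for this illustration, and "crfpp" is taken from the libcrfpp.so.0 name in the error message.

```python
import ctypes.util


def find_shared_library(name):
    """Return the library file name the system resolves for name, or None."""
    return ctypes.util.find_library(name)


if __name__ == "__main__":
    found = find_shared_library("crfpp")
    if found is None:
        print("libcrfpp was not found; check the linker configuration")
    else:
        print("libcrfpp resolved to", found)
```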

(Python 3) setup.py raises an error

The call string.split(cmd1(str)) in setup.py fails. This error occurs because the CaboCha code does not support Python 3. Someone has kindly written a patch file, so modify the source accordingly.

That is all for now. I may add more if I run into something else.
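
For reference, the underlying incompatibility is that the string module no longer has a split function in Python 3; the str.split method replaces it. The sketch below shows one way such helpers could be written for Python 3. It is not the patch mentioned above: cmd1 is the name that appears in the failing call, while cmd2 and the use of subprocess are assumptions made for this illustration.

```python
import string
import subprocess

# Python 2 provided string.split(s); in Python 3 this attribute is gone.
HAS_LEGACY_SPLIT = hasattr(string, "split")


def cmd1(command):
    """Run a shell command and return the first line of its output."""
    output = subprocess.check_output(
        command, shell=True, universal_newlines=True
    )
    return output.splitlines()[0]


def cmd2(command):
    """Split the first output line on whitespace using the str method."""
    return cmd1(command).split()


if __name__ == "__main__":
    # "echo alpha beta" is a stand-in for the real configuration command.
    print(cmd2("echo alpha beta"))
```

The only change that matters for the error is the last line of cmd2: calling split on the string itself instead of going through the string module.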
