Install MeCab and CaboCha on Ubuntu 16.04 LTS for use from Python 3

Introduction

I came across CaboCha, a library for dependency parsing, and decided to try it out. I expected a quick experiment, but even the installation turned out to be a struggle. This article records the steps that worked in my environment.

Environment

Main subject

Install MeCab and dictionary data

Run the following command in a terminal:

$ sudo apt-get -y install mecab libmecab-dev mecab-ipadic-utf8 mecab-jumandic-utf8

Python binding for MeCab

$ pip3 install mecab-python3

Installing the binding at this point seems like a good idea, just in case.
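To confirm that the binding can actually reach MeCab and its dictionary, a quick check along the following lines may help. This is only a minimal sketch: the helper name `tokenize` and the sample sentence are illustrative, and it assumes the default dictionary installed above is the one MeCab picks up.

```python
def tokenize(text):
    """Return MeCab's analysis of `text`, or None when MeCab is not usable."""
    try:
        import MeCab  # provided by the mecab-python3 package
    except ImportError:
        return None
    try:
        # Uses whatever default dictionary MeCab is configured with.
        tagger = MeCab.Tagger()
    except RuntimeError:
        # Raised when MeCab cannot find its configuration or dictionary.
        return None
    return tagger.parse(text)


if __name__ == "__main__":
    result = tokenize("すもももももももものうち")
    print(result if result is not None else "MeCab is not available")
```

If the function returns None, the binding or the dictionary is probably not installed where MeCab expects it.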

Download CRF++

Download the latest version (.tar.gz) from the CRF++ download page. At the time of writing it was version 0.58. Extract the archive in a suitable directory and move into it.

$ tar zxvf CRF++-0.58.tar.gz
$ cd CRF++-0.58

The source has a bug, so fix it before building

$ vim node.cpp
#include <time.h>      (add this line)

Save the file, close the editor, and install

$ ./configure
$ sudo make
$ sudo make install

Download CaboCha

Download CaboCha from its download page. The latest version (cabocha-0.69.tar.bz2) failed for me, so I had no choice but to use cabocha-0.60.tar.gz. Extract the archive in a suitable directory and move into it.

$ tar xzvf cabocha-0.60.tar.gz
$ cd cabocha-0.60

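Fix the library search path

CRF++ installs its shared library under /usr/local/lib, so make sure the dynamic linker searches that directory. Entries in ld.so.conf are plain directory paths.

$ sudo vim /etc/ld.so.conf
/usr/local/lib      (add this line)

Then reload the linker cache

$ sudo /sbin/ldconfig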

One more fix, this time in the CaboCha source

$ vim src/utils.cpp

utils.cpp


void Unlink(const char *filename) {
#if defined(_WIN32) && !defined(__CYGWIN__)
  ::DeleteFileA(filename);
#else
  // ::unlink(filename);   // delete this line
  ::remove(filename);      // add this line
#endif
}

Save the file, close the editor, and install

$ ./configure --with-mecab-config=`which mecab-config` --with-charset=utf8
$ sudo make clean
$ sudo make
$ sudo make install
$ sudo /sbin/ldconfig
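After `ldconfig` has run, it can be worth checking that the dynamic linker really sees the three shared libraries. The following is a rough sketch using the standard library's `ctypes.util.find_library`; the helper name `find_installed_libraries` is made up for illustration, and the library names assume the default `libmecab`, `libcrfpp` and `libcabocha` naming.

```python
from ctypes.util import find_library


def find_installed_libraries(names=("mecab", "crfpp", "cabocha")):
    """Map each library name to what the system resolves it to (or None)."""
    return {name: find_library(name) for name in names}


if __name__ == "__main__":
    for name, resolved in find_installed_libraries().items():
        print("lib{}: {}".format(name, resolved or "NOT FOUND"))
```

A "NOT FOUND" entry usually means the library directory is missing from the linker configuration or `ldconfig` has not been rerun.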

Try running cabocha

$ cabocha
(type any sentence you like)
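The same command-line check can be scripted from Python even before the binding is built. The sketch below pipes one sentence into the `cabocha` command and returns its output. The helper name `run_cabocha` is illustrative, `-f1` selects the lattice output format, and the code assumes the terminal locale is UTF-8 so that it matches the `--with-charset=utf8` build.

```python
import shutil
import subprocess


def run_cabocha(sentence, output_format="-f1"):
    """Parse `sentence` with the cabocha command; return None if it is missing."""
    if shutil.which("cabocha") is None:
        return None
    completed = subprocess.run(
        ["cabocha", output_format],
        input=sentence + "\n",
        stdout=subprocess.PIPE,
        universal_newlines=True,  # text mode; also works on Python 3.5
        check=True,               # raise if cabocha exits with an error
    )
    return completed.stdout


if __name__ == "__main__":
    output = run_cabocha("太郎は花子が読んでいる本を次郎に渡した")
    print(output if output is not None else "cabocha command not found")
```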

Python 3 binding

$ sudo apt-get install swig python3-dev
$ cd cabocha-0.60
$ swig -python -shadow -c++ swig/CaboCha.i
$ mv swig/CaboCha.py python/
$ mv swig/CaboCha_wrap.cxx python/

Edit setup.py so that it works with Python 3

$ cd python
$ vim setup.py

setup.py


def cmd2(str):
    # return string.split (cmd1(str))   # delete this line
    return cmd1(str).split()            # add this line
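The reason for this edit is that the `string.split` function no longer exists in Python 3, whereas the `str.split` method, called with no arguments, does the same job: it splits on any run of whitespace and ignores leading and trailing whitespace. A small sketch of that behaviour follows; the sample text is hypothetical and only stands in for whatever `cmd1` returns.

```python
def split_flags(output):
    """Split one line of command output into flags, as the patched cmd2 does."""
    return output.split()


# Hypothetical output, chosen only to illustrate the splitting behaviour.
sample = "-L/usr/local/lib  -lcabocha -lmecab\n"
flags = split_flags(sample)
print(flags)
```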

Save the file, close the editor, and run

$ sudo python3 setup.py build_ext
$ sudo python3 setup.py install
$ sudo /sbin/ldconfig
$ python3
>>> import CaboCha

If no error occurs, the installation is complete. Have fun experimenting with it.
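As a first experiment, something like the following should print a dependency tree for a sentence. It is a minimal sketch rather than a definitive usage guide: the helper name `parse_dependencies` and the sample sentence are illustrative, and it relies on the binding's `CaboCha.Parser`, `parse` and `toString(CaboCha.FORMAT_TREE)` calls.

```python
def parse_dependencies(sentence):
    """Return CaboCha's tree-format output for `sentence`, or None if unavailable."""
    try:
        import CaboCha  # the module built and installed by setup.py
    except ImportError:
        return None
    try:
        parser = CaboCha.Parser()
    except RuntimeError:
        # Raised when CaboCha cannot load its model or MeCab's dictionary.
        return None
    tree = parser.parse(sentence)
    # Convert the tree while the parser is still alive.
    return tree.toString(CaboCha.FORMAT_TREE)


if __name__ == "__main__":
    result = parse_dependencies("太郎は花子が読んでいる本を次郎に渡した")
    print(result if result is not None else "CaboCha is not available")
```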

Conclusion

Thank you for following along. I have only tried CaboCha briefly so far, but the accuracy seems quite good. The installation took some time, but it looks well worth the effort. I would like to write a separate article about what I find as I experiment with it. Enjoy your natural language processing life!

References

・ http://taku910.github.io/mecab/
・ https://taku910.github.io/cabocha/
・ http://qiita.com/nezuq/items/f481f07fc0576b38e81d
・ http://azwoo.hatenablog.com/entry/2015/10/01/234434
・ https://www.trifields.jp/install-cabocha-in-ubuntu-1038
