Use mecab-ipadic-neologd with igo-python

It is assumed that MeCab is already installed.

Procedure

  1. Install igo-python: `pip install igo-python`
  2. Download igo-0.4.5.jar
  3. Download mecab-ipadic-neologd (git clone)
  4. Go to the mecab-ipadic-neologd directory and run `./bin/install-mecab-ipadic-neologd`. This creates a build directory.
  5. Copy igo-0.4.5.jar to mecab-ipadic-neologd/build/mecab-ipadic-2.7.0-20070801-neologd-20150401 and run the following command there: `java -cp igo-0.4.5.jar net.reduls.igo.bin.BuildDic neologd . "utf-8"`
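The dated directory name in step 5 depends on which neologd release you built, so it may differ on your machine. The following is a minimal sketch for locating it, assuming the build output follows the `mecab-ipadic-*-neologd-*` naming seen above. The function name `find_neologd_build_dirs` is made up for this example.

```python
import glob
import os


def find_neologd_build_dirs(repo_dir):
    """Return the neologd dictionary source directories under <repo_dir>/build.

    Assumption: the directories are named like
    mecab-ipadic-2.7.0-20070801-neologd-20150401, as in step 5.
    """
    pattern = os.path.join(repo_dir, "build", "mecab-ipadic-*-neologd-*")
    # Keep directories only, so archives left in build/ are ignored.
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))


if __name__ == "__main__":
    # "mecab-ipadic-neologd" is the directory created by git clone in step 3.
    for path in find_neologd_build_dirs("mecab-ipadic-neologd"):
        print(path)
```

If more than one directory is listed, the last one is the newest by name, because the release date is part of the directory name.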

That's it. Let's check whether it worked.

Python 2.7.8 (default, Mar 31 2015, 12:51:47)
Type "copyright", "credits" or "license" for more information.

IPython 3.0.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: import igo

In [2]: t  = igo.tagger.Tagger('neologd')  # path to the directory created by the Java command above

In [3]: for i in t.parse(u'Apple will release the Apple Watch domestically on April 24th.'):
   ...:     print i.surface
   ...:
Apple
Is
Apple Watch
To
April 24
To
Domestic
Release
Shi
Masu
。

With the standard MeCab dictionary, "Apple Watch" is not extracted as a single token, but with mecab-ipadic-neologd it is. This time I ran the session in the directory where the neologd directory was created, so the relative path worked. As noted in the comment, in real use you need to pass the path to the generated neologd directory.

This is convenient because, once the dictionary has been built, you can run morphological analysis flexibly without installing MeCab.
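Because the Tagger needs the path to the generated neologd directory, it can help to resolve that path explicitly instead of relying on the current directory. Below is a minimal sketch under that assumption. The helper `resolve_dic_dir` and its parameters are hypothetical and not part of igo-python.

```python
import os


def resolve_dic_dir(base_dir, dic_name="neologd"):
    """Return the absolute path of the igo dictionary directory.

    base_dir: directory in which the BuildDic command was run (assumption).
    dic_name: name given to BuildDic as the output directory.
    Raises IOError if the directory does not exist.
    """
    path = os.path.abspath(os.path.join(base_dir, dic_name))
    if not os.path.isdir(path):
        raise IOError("igo dictionary directory not found: %s" % path)
    return path


# Usage, following the session above (requires igo-python and a built dictionary;
# "/path/to" is a placeholder for wherever you cloned the repository):
#
#   import igo
#   base = "/path/to/mecab-ipadic-neologd/build/mecab-ipadic-2.7.0-20070801-neologd-20150401"
#   t = igo.tagger.Tagger(resolve_dic_dir(base))
```

Failing early with a clear message is a design choice of this sketch. It makes a wrong path easier to spot than an error raised from inside the tagger.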

Notes

About the command for compiling a dictionary for igo

In the directory containing igo-0.4.5.jar, run `java -cp igo-0.4.5.jar net.reduls.igo.bin.BuildDic <directory to store the dictionary> <path to the dictionary source directory under build in mecab-ipadic-neologd> <character encoding>`.
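To make the argument order explicit, here is a small sketch that assembles the same command as a list for use with `subprocess`. The function name `build_dic_command` and its defaults are assumptions for illustration. Only the `java -cp ... net.reduls.igo.bin.BuildDic` invocation itself comes from this article.

```python
def build_dic_command(out_dir, src_dir, encoding="utf-8", jar="igo-0.4.5.jar"):
    """Build the argument list for igo's BuildDic command.

    out_dir:  directory in which the compiled igo dictionary is stored
    src_dir:  dictionary source directory under build in mecab-ipadic-neologd
    encoding: character encoding of the source dictionary files
    jar:      path to the igo jar file (assumed to be in the current directory)
    """
    return ["java", "-cp", jar, "net.reduls.igo.bin.BuildDic",
            out_dir, src_dir, encoding]


if __name__ == "__main__":
    # Same arguments as in step 5: output to ./neologd, source is the current directory.
    print(" ".join(build_dic_command("neologd", ".")))
```

To actually run the build, the returned list can be passed to `subprocess.check_call` from the directory that contains the jar.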

Errors when compiling the dictionary for igo

If you get an error like `Exception in thread "main" java.lang.OutOfMemoryError: Java heap space`, add `-Xmx1024m` to the options. I don't know the details, but the heap seems to be too small, so try specifying its size explicitly.

java -Xmx1024m -cp igo-0.4.5.jar net.reduls.igo.bin.BuildDic neologd . "utf-8"

I followed the referenced article, but I still got the same error with 1024, so I doubled the value to 2048, and the error went away.
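The trial and error with heap sizes can be sketched as a small retry loop. This is only an illustration: the function `build_with_growing_heap` is hypothetical, and it retries on any non-zero exit status without checking that the cause was really `OutOfMemoryError`.

```python
import subprocess


def build_with_growing_heap(java_args, heap_sizes_mb=(1024, 2048), run=subprocess.call):
    """Run `java -Xmx<N>m <java_args>` with each heap size until one succeeds.

    Returns the heap size in MB that worked, or None if every attempt failed.
    `run` takes the command list and returns the exit status; it is a parameter
    so that the loop can be tried out without Java installed.
    """
    for mb in heap_sizes_mb:
        cmd = ["java", "-Xmx%dm" % mb] + list(java_args)
        # Assumption: a non-zero exit status means the build failed and a larger
        # heap is worth trying.
        if run(cmd) == 0:
            return mb
    return None


# Usage (runs Java, so it is left as a comment):
#
#   args = ["-cp", "igo-0.4.5.jar", "net.reduls.igo.bin.BuildDic",
#           "neologd", ".", "utf-8"]
#   print(build_with_growing_heap(args))
```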

Reference

I referred to the following article. Thank you very much.
