The morphological analyzer janome finally [supports the NEologd dictionary](https://medium.com/@mocobeta/neologd-%E8%BE%9E%E6%9B%B8%E5%86%85%E5%8C%85%E3%81%AE-janome-%E3%83%91%E3%83%83%E3%82%B1%E3%83%BC%E3%82%B8%E3%81%AE%E3%83%80%E3%82%A6%E3%83%B3%E3%83%AD%E3%83%BC%E3%83%89%E3%81%A7%E3%81%8D%E3%82%8B%E3%82%88%E3%81%86%E3%81%AB%E3%81%97%E3%81%BE%E3%81%97%E3%81%9F-%E4%B8%8D%E5%AE%9A%E6%9C%9F%E6%9B%B4%E6%96%B0-71611ab66415), so I tried it. It installs with a simple `pip install`.
The OS is Lubuntu 14.04, and I am using Anaconda's Python 3.5.
Download the package file (hosted on Google Drive) and install it as described in the README:
$ pip install Janome-0.3.5.neologd20170814.tar.gz --no-compile
However, I got the following error.
OSError: [Errno 28] No space left on device
Looking into it, this error can occur for more than one reason: the disk may be out of space, or the filesystem may have run out of inodes (too many files). In this case, the cause was disk space.
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
...(Omitted)
/dev/zram1      1.5G  1.4G  4.0K 100% /tmp
...(Omitted)
As the output shows, /tmp is full. Since this janome package bundles a large dictionary file, I suspect the installation exceeded the space available there.
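The same check can be sketched in Python to tell the two causes apart (bytes versus inodes). This is a minimal sketch, not part of janome or pip: `describe_space` is a hypothetical helper name, and `os.statvfs` is available on POSIX systems only.

```python
import os
import shutil
import tempfile


def describe_space(path):
    """Return free bytes and free inodes for the filesystem that holds path.

    Hypothetical helper for illustration; POSIX only because of os.statvfs.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    stats = os.statvfs(path)         # f_favail: inodes free for unprivileged users
    return {"free_bytes": usage.free, "free_inodes": stats.f_favail}


if __name__ == "__main__":
    tmp = tempfile.gettempdir()      # the directory Python would use for temp files
    info = describe_space(tmp)
    print(f"{tmp}: {info['free_bytes'] / 2**20:.1f} MiB free, "
          f"{info['free_inodes']} inodes free")
```

If the free byte count is near zero while inodes remain, the situation matches the `df -h` output above. Some filesystems report inode counts as zero, so treat that field as a hint rather than proof.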
As a workaround, the easiest fix seems to be pointing the temporary directory somewhere else for the moment and then installing.
$ mkdir $HOME/tmp
$ export TMPDIR=$HOME/tmp
$ pip install Janome-0.3.5.neologd20170814.tar.gz --no-compile
The `export TMPDIR=$HOME/tmp` command sets the temporary directory. This setting applies only to the current shell session and is discarded when the session is closed.
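To see why `TMPDIR` helps, here is a small sketch showing that Python's `tempfile` module honors the variable. It assumes pip creates its build directories through `tempfile`, which I believe is the case but have not confirmed for this pip version. `temp_dir_for` is a hypothetical helper, and the directory created with `mkdtemp` merely stands in for `$HOME/tmp`.

```python
import os
import tempfile


def temp_dir_for(tmpdir_value):
    """Return the directory tempfile picks when TMPDIR is set to tmpdir_value.

    Hypothetical helper for illustration; the directory must exist and be
    writable. The previous environment and cache are restored afterwards.
    """
    old_env = os.environ.get("TMPDIR")
    old_cached = tempfile.tempdir
    try:
        os.environ["TMPDIR"] = tmpdir_value
        tempfile.tempdir = None  # drop the cached choice so TMPDIR is re-read
        return tempfile.gettempdir()
    finally:
        if old_env is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = old_env
        tempfile.tempdir = old_cached


if __name__ == "__main__":
    stand_in = tempfile.mkdtemp()  # stand-in for $HOME/tmp
    print("default temp dir :", tempfile.gettempdir())
    print("with TMPDIR set  :", temp_dir_for(stand_in))
    os.rmdir(stand_in)
```

A process started from a shell where `TMPDIR` is exported sees the new location from the beginning, so no cache reset is needed there. The reset in the sketch is required only because the variable is changed inside an already running interpreter.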
After this, janome can be used as follows:
>>> from janome.tokenizer import Tokenizer
>>> t = Tokenizer(mmap=True)
>>> for x in t.tokenize("メロンパンを食べる安倍総理"): print(x)  # "Prime Minister Abe eating melon bread"
# メロンパン	名詞,固有名詞,一般,*,*,*,メロンパン,メロンパン,メロンパン
# を	助詞,格助詞,一般,*,*,*,を,ヲ,ヲ
# 食べる	動詞,自立,*,*,一段,基本形,食べる,タベル,タベル
# 安倍総理	名詞,固有名詞,一般,*,*,*,安倍総理,アベソウリ,アベソーリ