What is ipadic-neologd? It is one of the dictionaries available for MeCab. It is updated twice a week or more, so it can handle new words and named entities.
# Without ipadic-neologd
import MeCab
m = MeCab.Tagger()
print(m.parse("COVID-19 caused an overshoot."))
>COVID COVID COVID noun-Proper noun-Organization
- - -noun-Change connection
19 19 19 noun-number
By grinning by particles-Case particles-Collocation
Over over over noun-Change connection
Shoot shoot shoot noun-Change connection
Ga ga ga particle-Case particles-General
Wake up ok wake up verb-Independent five-stage / la line continuous connection
Ta ta auxiliary verb special ta ta basic form
.. .. .. symbol-Kuten
EOS
# With ipadic-neologd
m = MeCab.Tagger("-d {Dictionary path}")
print(m.parse("COVID-19 caused an overshoot."))
>COVID-19 nouns,Proper noun,General,*,*,*,COVID-19,Covid Nine Teen,Covid Nine Teen
By particles,Case particles,Collocation,*,*,*,By,Grinning,Grinning
Overshoot noun,Proper noun,General,*,*,*,Overshoot,Overshoot,Overshoot
Is a particle,Case particles,General,*,*,*,But,Moth,Moth
Happening verb,Independence,*,*,Five steps, La line,Continuous connection,Occur,Oko,Oko
Auxiliary verb,*,*,*,Special,Uninflected word,Ta,Ta,Ta
.. symbol,Kuten,*,*,*,*,。,。,。
EOS
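The option string passed to `MeCab.Tagger` is just text, so it can be assembled with a small helper. The following is a minimal sketch: the function name `build_tagger_args` and the example directory are hypothetical, and the real ipadic-neologd path depends on your installation. It also assumes the path contains no spaces.

```python
def build_tagger_args(dic_dir=None, output_format=None):
    """Build an option string for MeCab.Tagger (hypothetical helper)."""
    parts = []
    if output_format:
        # For example, "wakati" becomes "-Owakati".
        parts.append(f"-O{output_format}")
    if dic_dir:
        # -d selects the system dictionary directory.
        parts.append(f"-d {dic_dir}")
    return " ".join(parts)


# Assumed install location, for illustration only; replace with your own path.
NEOLOGD_DIR = "/usr/lib/mecab/dic/mecab-ipadic-neologd"

print(repr(build_tagger_args()))             # no options: default dictionary
print(repr(build_tagger_args(NEOLOGD_DIR)))  # neologd, default output format
print(repr(build_tagger_args(NEOLOGD_DIR, "wakati")))
```

The result can then be passed straight to the tagger, for example `MeCab.Tagger(build_tagger_args(NEOLOGD_DIR))`.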
The following article was very easy to understand.
If you get an error, running the following first fixed it for me:
!sudo cp /etc/mecabrc /usr/local/etc/
import MeCab
m = MeCab.Tagger("{output format (see below)} -d {ipadic-neologd path}")
print(m.parse("Keep your social distance"))
Social distance noun,Proper noun,General,*,*,*,Social distance,Social distance,Social distance
Particles,Case particles,General,*,*,*,To,Wo,Wo
Tamotsu and verb,Independence,*,*,Five steps / Ta line,Connection,keep,Tamoto,Tamoto
Auxiliary verb,*,*,*,Immutable type,Uninflected word,U,C,C
EOS
The fields in each output line above are, in order:
Surface form: the text as segmented into morphemes
Part of speech: noun, verb, particle, auxiliary verb, etc.
Part-of-speech subcategory 1: e.g. noun → proper noun, verb → independent, particle → case particle
Part-of-speech subcategory 2: e.g. general, quotation
Part-of-speech subcategory 3
Conjugation type: e.g. verb → five-step (godan), ta row
Conjugation form: e.g. u-connection
Base form
Reading
Pronunciation
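To make the field order concrete, here is a minimal sketch that splits one output line into named fields. It assumes MeCab's default layout, where the surface form is followed by a tab and then nine comma-separated features, and that no field contains a comma. The names `Morpheme` and `parse_line` are mine, not part of MeCab, and the sample line is the translated output shown earlier with a tab restored.

```python
from typing import NamedTuple


class Morpheme(NamedTuple):
    surface: str
    pos: str
    pos_sub1: str
    pos_sub2: str
    pos_sub3: str
    conj_type: str
    conj_form: str
    base_form: str
    reading: str
    pronunciation: str


def parse_line(line):
    """Parse one 'surface<TAB>feature,feature,...' line into a Morpheme."""
    surface, features = line.rstrip("\n").split("\t", 1)
    fields = features.split(",")
    # Unknown words can come with fewer than nine features; pad with "*".
    fields += ["*"] * (9 - len(fields))
    return Morpheme(surface, *fields[:9])


sample = ("Social distance\tnoun,Proper noun,General,*,*,*,"
          "Social distance,Social distance,Social distance")
tok = parse_line(sample)
print(tok.surface, "|", tok.pos, "|", tok.pos_sub1, "|", tok.base_form)
```

When looping over real `m.parse(...)` output, skip the final `EOS` line before calling `parse_line`, since it has no tab.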
# -Ochasen
Social Distance Social Distance Social Distance Noun-Proper noun-General
Wo Wo particle-Case particles-General
Tamotsu and Tamoto Keep verbs-Independence 5 steps / Ta line connection
Uuu auxiliary verb invariant basic form
EOS
# -Owakati
Maintain social distance
# -Oyomi
Social Distance Otaku
# -Odump
0 BOS BOS/EOS,*,*,*,*,*,*,*,* 0 0 0 0 0 0 2 1 0.000000 0.000000 0.000000 0
6 Social distance nouns,Proper noun,General,*,*,*,Social distance,Social distance,Social distance 0 33 1288 1288 41 7 0 1 0.000000 0.000000 0.000000 -1987
213 particles,Case particles,General,*,*,*,To,Wo,Wo 33 36 156 156 13 6 0 1 0.000000 0.000000 0.000000 -1613
218 Ho and verb,Independence,*,*,Five steps / Ta line,Connection,keep,Tamoto,Tamoto 36 42 739 739 31 2 0 1 0.000000 0.000000 0.000000 3067
234 Auxiliary verb,*,*,*,Immutable type,Uninflected word,U,C,C 42 45 506 506 25 6 0 1 0.000000 0.000000 0.000000 3215
236 EOS BOS/EOS,*,*,*,*,*,*,*,* 45 45 0 0 0 0 3 1 0.000000 0.000000 0.000000 1300
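The numeric columns in this last format are not explained in the article. As far as I understand MeCab's dump writer, they are: node id, surface, feature, start and end byte offsets, right and left context ids, part-of-speech id, character type, node status, best-path flag, alpha, beta, probability, and cost. Treat those names as my assumption. The sketch below parses a line on that basis, and also assumes the feature string contains no spaces.

```python
from typing import NamedTuple


class DumpNode(NamedTuple):
    node_id: int
    surface: str
    feature: str
    start: int       # byte offset where the morpheme starts
    end: int         # byte offset where the morpheme ends
    rc_attr: int     # right context id (assumed name)
    lc_attr: int     # left context id (assumed name)
    pos_id: int
    char_type: int
    stat: int        # 2 and 3 appear on the BOS and EOS lines
    is_best: int
    alpha: float
    beta: float
    prob: float
    cost: int


def parse_dump_line(line):
    # The 12 trailing columns are numeric, so split them off from the right.
    head, *nums = line.rstrip().rsplit(" ", 12)
    node_id, rest = head.split(" ", 1)
    # What remains is "surface feature"; the feature is the last token.
    surface, feature = rest.rsplit(" ", 1)
    ints = [int(x) for x in nums[:8]]
    floats = [float(x) for x in nums[8:11]]
    return DumpNode(int(node_id), surface, feature, *ints, *floats, int(nums[11]))


bos_line = "0 BOS BOS/EOS,*,*,*,*,*,*,*,* 0 0 0 0 0 0 2 1 0.000000 0.000000 0.000000 0"
eos_line = "236 EOS BOS/EOS,*,*,*,*,*,*,*,* 45 45 0 0 0 0 3 1 0.000000 0.000000 0.000000 1300"

bos = parse_dump_line(bos_line)
eos = parse_dump_line(eos_line)
print(bos.surface, bos.stat, bos.cost)
print(eos.surface, eos.start, eos.end, eos.cost)
```

The start and end columns count bytes rather than characters, which is why a three-byte UTF-8 character advances them by 3.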