I looked into how to use Janome, so I made a note of it.
Janome is a morphological analyzer written in pure Python with the dictionary built in. It aims to be a morphological analysis library with a simple API that can be installed easily, has no dependent libraries, and is easy to incorporate into applications.
I just wanted to give morphological analysis a quick try, so I decided to use Janome, which seems to be the easiest to use from Python. Compared to MeCab, setup is simpler: a plain pip install is all it takes. For other Japanese morphological analysis tools, see the summary here.
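For reference, installation is a single pip command (no external dictionary or compiler is needed, since the dictionary is bundled):

pip install janome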
Excerpt from the official website.
from janome.tokenizer import Tokenizer

t = Tokenizer()
# the sample sentence from the official docs: すもももももももものうち
for token in t.tokenize(u'すもももももももものうち'):
    print(token)
Printing each token returned by Tokenizer.tokenize gives output like this (one comma-separated analysis per surface form):

すもも	名詞,一般,*,*,*,*,すもも,スモモ,スモモ
も	助詞,係助詞,*,*,*,*,も,モ,モ
もも	名詞,一般,*,*,*,*,もも,モモ,モモ
も	助詞,係助詞,*,*,*,*,も,モ,モ
もも	名詞,一般,*,*,*,*,もも,モモ,モモ
の	助詞,連体化,*,*,*,*,の,ノ,ノ
うち	名詞,非自立,副詞可能,*,*,*,うち,ウチ,ウチ
According to here, the fields from left to right are: surface form, part of speech, part-of-speech subcategory 1, subcategory 2, subcategory 3, conjugation type, conjugation form, base form, reading, and pronunciation.
Each token returned by tokenize exposes these values as string properties (an access example follows the list):

- surface: the surface form (the word as it appears in the text)
- part_of_speech: part of speech plus subcategories 1, 2, and 3, as one comma-separated string
- infl_type: conjugation type
- infl_form: conjugation form
- base_form: base (dictionary) form
- reading: reading
- phonetic: pronunciation
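A minimal sketch of reading those properties, using the same sample sentence as above (splitting part_of_speech into its four fields is just for illustration):

from janome.tokenizer import Tokenizer

t = Tokenizer()
for token in t.tokenize(u'すもももももももものうち'):
    # part_of_speech is one comma-separated string; split it to get the four POS fields
    pos = token.part_of_speech.split(',')
    print(token.surface, pos[0], token.base_form, token.reading, token.phonetic)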