Using the Japanese morphological analyzer "kuromoji"

  • kuromoji Home Page
  • kuromoji GitHub

Gradle

build.gradle

dependencies {
    // 'implementation' replaces the 'compile' configuration, which was removed in Gradle 7
    implementation group: 'com.atilika.kuromoji', name: 'kuromoji-ipadic', version: '0.9.0'
}

The "-ipadic" suffix in the artifact name identifies the dictionary bundled with that artifact (here, IPADIC).

Supported dictionaries
  • kuromoji-ipadic
  • kuromoji-ipadic-neologd (planned for a future version)
  • kuromoji-jumandic
  • kuromoji-naist-jdic
  • kuromoji-unidic
  • kuromoji-unidic-kanaaccent
  • kuromoji-unidic-neologd
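
Each dictionary ships as its own artifact, so switching dictionaries means changing the artifact name in build.gradle. The fragment below is a minimal sketch for UniDic. It assumes that the same 0.9.0 version is published for kuromoji-unidic and that the build uses the `implementation` configuration of current Gradle (older builds use `compile`).

```groovy
dependencies {
    // Assumption: kuromoji-unidic is available at the same version as kuromoji-ipadic
    implementation group: 'com.atilika.kuromoji', name: 'kuromoji-unidic', version: '0.9.0'
}
```

With this dependency the classes are imported from the matching package, com.atilika.kuromoji.unidic, instead of com.atilika.kuromoji.ipadic.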

Example

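import com.atilika.kuromoji.ipadic.Token;
import com.atilika.kuromoji.ipadic.Tokenizer;
import java.util.List;

String text = "I am a cat.";
Tokenizer tokenizer = new Tokenizer(); // loads the bundled IPADIC dictionary
List<Token> tokenList = tokenizer.tokenize(text); // one Token per morpheme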
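
The token list is usually inspected token by token. The following is a minimal sketch, assuming the kuromoji-ipadic 0.9.0 dependency shown above is on the classpath. The class name KuromojiExample, the helper surfaces, and the Japanese sample sentence are hypothetical and chosen only for illustration. getSurface() returns the text of a token, and getAllFeatures() returns the dictionary features (part of speech, reading, and so on) as a comma-separated string.

```java
import com.atilika.kuromoji.ipadic.Token;
import com.atilika.kuromoji.ipadic.Tokenizer;

import java.util.ArrayList;
import java.util.List;

public class KuromojiExample {

    // Hypothetical helper: returns the surface form (the original text) of each token.
    static List<String> surfaces(String text) {
        Tokenizer tokenizer = new Tokenizer(); // loads the bundled IPADIC dictionary
        List<String> result = new ArrayList<>();
        for (Token token : tokenizer.tokenize(text)) {
            result.add(token.getSurface());
        }
        return result;
    }

    public static void main(String[] args) {
        Tokenizer tokenizer = new Tokenizer();
        // Sample sentence is an assumption for illustration, not taken from the article.
        for (Token token : tokenizer.tokenize("お寿司が食べたい。")) {
            // Print the surface form followed by all dictionary features of the token.
            System.out.println(token.getSurface() + "\t" + token.getAllFeatures());
        }
    }
}
```

Constructing a Tokenizer loads the dictionary, so in real code it is probably better to create one instance and reuse it rather than building a new one per call as the helper does here.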
