Let's try knock 31 of the 100 language processing knocks with NLP4J.

31. Verb

Extract all surface forms of verbs.

Maven

This example uses the version currently under development (a SNAPSHOT build).

<dependency>
	<groupId>org.nlp4j</groupId>
	<artifactId>nlp4j-core</artifactId>
	<version>1.1.1.0-SNAPSHOT</version>
</dependency>

Text Data

The morphological analyzer used by default (the Japanese morphological analysis API from the Yahoo! Japan Developer Network) caps each request at 900 KB and also limits the number of requests, so this example uses a small text file.

Java Code

package nlp4j.nokku.chap4;

import java.util.List;
import java.util.stream.Collectors;

import nlp4j.Document;
import nlp4j.DocumentAnnotator;
import nlp4j.DocumentAnnotatorPipeline;
import nlp4j.Keyword;
import nlp4j.crawler.Crawler;
import nlp4j.crawler.TextFileLineSeparatedCrawler;
import nlp4j.impl.DefaultDocumentAnnotatorPipeline;
import nlp4j.index.DocumentIndex;
import nlp4j.index.SimpleDocumentIndex;
import nlp4j.yhoo_jp.YJpMaAnnotator;

public class Nokku31 {
	public static void main(String[] args) throws Exception {
		// Use the line-separated text file crawler provided by NLP4J
		Crawler crawler = new TextFileLineSeparatedCrawler();
		crawler.setProperty("file", "src/test/resources/nlp4j.crawler/neko_short_utf8.txt");
		crawler.setProperty("encoding", "UTF-8");
		crawler.setProperty("target", "text");
		// Crawl the documents
		List<Document> docs = crawler.crawlDocuments();
		// Define the NLP pipeline (multiple processing steps chained together)
		DocumentAnnotatorPipeline pipeline = new DefaultDocumentAnnotatorPipeline();
		{
			// Annotator that uses the Yahoo! Japan morphological analysis API
			DocumentAnnotator annotator = new YJpMaAnnotator();
			pipeline.add(annotator);
		}
		// Run the annotation
		pipeline.annotate(docs);
		// Use SimpleDocumentIndex to collect and count the keywords
		SimpleDocumentIndex index = new SimpleDocumentIndex();
		// Add the documents
		index.addDocuments(docs);
		List<Keyword> kwds = index.getKeywords();
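		// Keep only keywords whose part of speech is verb.
		// Assumption: the Yahoo! Japan API reports part-of-speech names in Japanese, so the verb facet is 動詞.
		kwds = kwds.stream() //
				.filter(o -> "動詞".equals(o.getFacet())) //
				.collect(Collectors.toList());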
		for (Keyword kwd : kwds) {
			System.err.println(kwd.getStr());
		}
	}
}
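
The same surface form can appear more than once in the output, because every occurrence of a verb is kept. If only the distinct surface forms are wanted, the filtering step could be followed by de-duplication. The following is a minimal, self-contained sketch of that idea and does not depend on NLP4J: the `VerbSurfaces` class and its `Kw` type are hypothetical stand-ins for `nlp4j.Keyword`, the sample data is made up, and the facet label is passed in as a parameter so that whichever label the analyzer reports can be used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.stream.Collectors;

public class VerbSurfaces {

	// Hypothetical stand-in for nlp4j.Keyword: a part-of-speech facet plus a surface string.
	static final class Kw {
		final String facet;
		final String str;

		Kw(String facet, String str) {
			this.facet = facet;
			this.str = str;
		}
	}

	// Keep the surface strings whose facet matches, dropping repeats
	// while preserving the order in which each form was first seen.
	static List<String> uniqueSurfaces(List<Kw> kwds, String facet) {
		LinkedHashSet<String> seen = kwds.stream()
				.filter(k -> facet.equals(k.facet))
				.map(k -> k.str)
				.collect(Collectors.toCollection(LinkedHashSet::new));
		return new ArrayList<>(seen);
	}

	public static void main(String[] args) {
		// Made-up sample data, not actual analyzer output.
		List<Kw> kwds = Arrays.asList(
				new Kw("verb", "say"),
				new Kw("noun", "cat"),
				new Kw("verb", "see"),
				new Kw("verb", "say"));
		for (String s : uniqueSurfaces(kwds, "verb")) {
			System.out.println(s);
		}
	}
}
```

A `LinkedHashSet` is used here rather than a plain `HashSet` so that the verbs stay in document order, which keeps the list comparable with the full output.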

Result

Born
Tsuka
Shi
Crying
start
Say
You see
listen
Say
Say
Catch
Boiled
Eat
Say
Thoughts
Loading
Lift
Shi
Ah
Calm down
You see
Say
Thoughts
Remaining
Mot
Shi
Meet
Meet
Shi
Blow
Throat
Ku
Weak
to drink
Say
Know

Summary

With NLP4J, you can easily process natural language in Java!

Project URL

https://www.nlp4j.org/

NLP4J [006-031] 100 language processing knocks with NLP4J # 31 verb
