A note about text mining.
Text mining uses natural language processing to divide a large amount of text data into words and phrases. ⇒ Previously, natural language processing was not mature enough, so this division was difficult.
It has the following three functions.
Remove noise from the text data and extract the information needed for mining.
Extract words at the morpheme level using a registered dictionary. The dictionary needs to be updated from time to time.
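As a rough illustration of dictionary-based extraction, here is a minimal sketch using greedy longest-match segmentation. This is a simplified stand-in, not how real morphological analyzers work (they are considerably more sophisticated). The function name `segment` and the toy dictionary are assumptions made for this example.

```python
def segment(text, dictionary):
    """Greedy longest-match segmentation against a word dictionary (toy sketch)."""
    max_len = max(len(word) for word in dictionary)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the dictionary matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in dictionary:
                tokens.append(piece)
                i += size
                break
        else:
            # Unknown character: emit it alone so the scan can continue.
            tokens.append(text[i])
            i += 1
    return tokens


# Hypothetical dictionary for illustration only.
toy_dictionary = {"text", "mining", "min", "ing"}
tokens = segment("textmining", toy_dictionary)

# Registering a new entry changes the result, which is why the
# dictionary has to be kept up to date.
updated_tokens = segment("textmining", toy_dictionary | {"textmining"})
```

With the toy dictionary the input is split into two entries; once the compound is registered, it is extracted as a single word.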
Absorb notation variants by creating and using synonym dictionaries. Whether two expressions are synonyms is determined by looking at the data. e.g.) Evaluation is "high" = evaluation is "good"; price is "high" ≠ evaluation is "good"
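The example above suggests that a synonym lookup has to take the target of the expression into account. A minimal sketch, assuming a hypothetical dictionary keyed by (attribute, surface form); the names `SYNONYMS` and `normalize` and the canonical forms are made up for illustration:

```python
# Hypothetical synonym dictionary: (attribute, surface form) -> canonical form.
SYNONYMS = {
    ("evaluation", "high"): "good",
    ("evaluation", "good"): "good",
    ("price", "high"): "expensive",
}


def normalize(attribute, value):
    """Return the canonical form, or the value unchanged if it is not registered."""
    return SYNONYMS.get((attribute, value), value)


same = normalize("evaluation", "high") == normalize("evaluation", "good")
different = normalize("price", "high") != normalize("evaluation", "good")
```

Keying on the attribute as well as the word is one way to keep "high" from being merged into "good" when it describes a price.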
By looking at the morphemes that appear near a word, interrogative forms, negative forms, and variations in expression can be identified. e.g.) "Are you there?" = "Are you there?" ⇒ verb + auxiliary verb + symbol
Morphemes are grouped into clauses, and the subject-predicate relations and modifier relations between clauses are determined.
From the set of extracted concepts, obtain new information and knowledge that match what you want to find out.
Calculate the relevance between words from their co-occurrence.
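One simple way to turn co-occurrence into a relevance score is the Jaccard coefficient over documents. The sketch below assumes already-tokenized documents; the function name `cooccurrence_jaccard` and the sample data are illustrative, and other measures could be used instead.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_jaccard(documents):
    """Score each word pair by (docs containing both) / (docs containing either)."""
    doc_freq = Counter()
    pair_freq = Counter()
    for tokens in documents:
        unique = sorted(set(tokens))  # count each word once per document
        doc_freq.update(unique)
        pair_freq.update(combinations(unique, 2))
    return {
        (a, b): n / (doc_freq[a] + doc_freq[b] - n)
        for (a, b), n in pair_freq.items()
    }


# Made-up sample documents for illustration.
docs = [
    ["price", "high", "battery"],
    ["price", "high"],
    ["battery", "good"],
]
scores = cooccurrence_jaccard(docs)
```

Pairs are stored with the two words in alphabetical order, so the score for "price" and "high" is found under `scores[("high", "price")]`.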
Divide text data into similar groups.
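A minimal sketch of such grouping, assuming tokenized documents and a greedy single-pass scheme: each document joins the first group whose first member is similar enough, otherwise it starts a new group. The threshold value and the names `jaccard` and `group_documents` are assumptions for this example; practical tools typically use more robust clustering methods.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_documents(documents, threshold=0.3):
    """Greedy single-pass grouping; each group is a list of document indices."""
    groups = []
    for idx, tokens in enumerate(documents):
        words = set(tokens)
        for group in groups:
            # Compare against the first document of the group (its "leader").
            leader = set(documents[group[0]])
            if jaccard(words, leader) >= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups


# Made-up sample documents for illustration.
docs = [
    ["price", "high"],
    ["price", "high", "shipping"],
    ["battery", "good"],
    ["battery", "life", "good"],
]
groups = group_documents(docs)
```

Here the two price-related documents end up in one group and the two battery-related documents in another.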
Analyze the context in which keywords are used. ⇒ Is this similar to a topic model?
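One common way to inspect how a keyword is used is a keyword-in-context (KWIC) listing, which shows the words on either side of each occurrence. Whether this is what the note has in mind is an assumption; the function name `kwic` and the window size are illustrative.

```python
def kwic(tokens, keyword, window=2):
    """Return (left context, keyword, right context) for each occurrence."""
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((left, token, right))
    return hits


# Made-up sample sentence for illustration.
tokens = "the price is high but the quality is high too".split()
hits = kwic(tokens, "high")
```

Comparing the left contexts shows that the same keyword is attached to "price" in one place and to "quality" in another.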
Helps you understand and examine the analysis results.