Precautions when using TextBlob sentiment analysis

This is a very basic point, but I ran into some caveats when using a library that performs sentiment analysis.

About TextBlob sentiment analysis

TextBlob is a Python library. Its .sentiment property lets you analyze the polarity and subjectivity of simple sentences (polarity: whether the speaker's attitude is positive or negative; subjectivity: how subjective, as opposed to objective, the text is). There is a report of such an analysis here, so please take a look.

TextBlob sentiment analysis is a black box

You can use .sentiment to analyze the tendencies of books or the stream of consciousness of fictional characters, but the problem is that TextBlob's sentiment analysis is a black box.

The official documentation describes implementations that use the text mining module and dataset of Pattern, from the Computational Linguistics and Psycholinguistics research center at the University of Antwerp, and the Naive Bayes classifier from the NLTK library. In other words, unless you know how these two sources work, you cannot tell why a particular polarity or subjectivity value came out. It seems I will need to verify what kind of classification results .sentiment actually produces.

Context-aware sentiment analysis is not possible

Most importantly, TextBlob's .sentiment does not support context-sensitive sentiment analysis. As a check, I wrote a text of 10 sentences containing ethical imperatives, in the style of the Ten Commandments of Moses, and compared it with a text stating the exact opposite.

We must be ethical. We must have the independence of will. We must be based on the concept of duty. We must think universally. We must not tell lies. We must not kill ourselves. We must cultivate our talent very arbitrary. We must be kind to each other. We must preserve our own lives. We must secure our happiness.

We must not be ethical. We must not have the independence of will. We must not be based on the concept of duty. We must not think universally. We must tell lies. We must kill ourselves. We must not cultivate our talent very arbitrary. We must not be kind to each other. We must not preserve our own lives. We must not secure our happiness.

The result is that both texts have almost the same polarity and subjectivity. I had predicted that first-person pronouns and modal auxiliary verbs such as “We” and “must” would increase subjectivity, but apparently this is not the case. Also note that adding "not" to completely reverse the meaning of a sentence does not change the polarity or subjectivity at all. In particular,

“We must not be ethical.”

“We must not preserve our own lives.”

Such sentences are also regarded as "positive and subjective texts".

Correct use of TextBlob sentiment analysis

TextBlob's .sentiment can only capture an abstract impression such as "Are the words used positive overall?" It turns out to be unsuitable for learning a specific profile such as "Is the person who wrote this really subjective and positive?" To use it for the latter purpose, you need either another library that can analyze sentiment from context, or a way to grasp the meaning of sentences to some extent by parsing and use that for labeling. My current goal is to use the characteristics of ethical imperatives in machine learning datasets, so I would like to write another report once I find a good solution.
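As a very rough illustration of the labeling idea, the sketch below tags each sentence by whether its "must" is negated. This is not a real parser and not a solution: the regular expressions, the label names, and the function names are all assumptions made for this example, and it only handles the simple "We must ..." pattern used in the test texts.

```python
import re

# Hypothetical patterns for this sketch: a negated and a plain "must".
NEGATED_MUST = re.compile(r"\bmust\s+not\b|\bmustn't\b", re.IGNORECASE)
PLAIN_MUST = re.compile(r"\bmust\b", re.IGNORECASE)


def label_sentence(sentence):
    """Label one sentence as 'prohibition', 'obligation', or 'other'."""
    if NEGATED_MUST.search(sentence):
        return "prohibition"
    if PLAIN_MUST.search(sentence):
        return "obligation"
    return "other"


def label_imperatives(text):
    """Split text on sentence-ending punctuation and label each sentence."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [(s, label_sentence(s)) for s in sentences]


labels = label_imperatives("We must be ethical. We must not tell lies.")
for sentence, label in labels:
    print(f"{label:12} {sentence}")
```

A label like this only records the presence of negation; it would still have to be combined with some judgment of what is being required or forbidden before it says anything about the writer's attitude.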
