Get Japanese synonyms in Python

I couldn't find many easy ways to get Japanese synonyms when doing natural language processing in Python, so here is a summary of one approach.

Advance preparation

This time, we will use NLTK's WordNet interface together with the Open Multilingual Wordnet data, which adds support for Japanese.

pip install nltk
python -c "import nltk;nltk.download('wordnet')"
python -c "import nltk;nltk.download('omw')"

Get Synset

A Synset is a unit of concept defined in WordNet. Let's get the Synsets for the Japanese word "米" (rice) and look at their definitions.

from nltk.corpus import wordnet

synsets = wordnet.synsets("米", lang='jpn')
for syn in synsets:
    print(syn,":",syn.definition())

# Synset('rice.n.01') : grains used as food either unpolished or more often polished
# Synset('united_states.n.01') : North American republic containing 50 states - 48 conterminous states in North America plus Alaska in northwest North America and the Hawaiian Islands in the Pacific Ocean; achieved independence in 1776
# Synset('meter.n.01') : the basic unit of length adopted under the Systeme International d'Unites (approximately 1.094 yards)

This confirms that three concepts are registered for "米": rice as a food, America, and the meter.
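
One way to pick out a single concept is to match on the Synset name shown in the output. The following is a minimal sketch: `pick_synset` and `rice_as_food` are hypothetical helpers (not part of NLTK), and the sketch assumes only that each item has a `name()` method, as NLTK's Synset objects do, and that the query word is 米.

```python
def pick_synset(synsets, name):
    """Return the first synset whose name() equals `name`, or None if absent."""
    for syn in synsets:
        if syn.name() == name:
            return syn
    return None


def rice_as_food():
    """Look up 米 and keep only the 'rice.n.01' sense (requires the NLTK data)."""
    # Imported lazily so that pick_synset itself has no NLTK dependency.
    from nltk.corpus import wordnet
    return pick_synset(wordnet.synsets("米", lang="jpn"), "rice.n.01")
```

Matching by name is less fragile than relying on the position of a Synset in the returned list.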

Get synonyms

Since the words belonging to a concept are registered in its Synset, they can be retrieved as synonyms. Let's get the synonyms for "米" in the sense of food.

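# The first Synset is rice in the sense of food
rice_synset = synsets[0]
# Lemma names registered for this Synset in Japanese
synonyms = rice_synset.lemma_names("jpn")
print(synonyms)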
# ['Rice', 'rice', 'I'm sorry', 'U.S.A.', 'Raised rice', 'rice offered to a god', 'Yagi', 'rice', 'Pillow rice', 'Rice production', 'Rice field', 'White rice', 'God rice', 'Valley', 'Rice', 'Rice孫', 'Grain', 'Rice', 'RiceGrain', 'Ricefood', '粮Rice', '糧Rice', 'Sari', '褻Rice', 'Silver rice', 'rice', 'food', 'foodRice']

This returns a useful set of Japanese synonyms for rice in the sense of food.
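
If the returned list contains repeated entries or the query word itself, it can be tidied up before use. This is a small sketch, not an NLTK feature: `unique_synonyms` is a hypothetical helper that works on any list of strings.

```python
def unique_synonyms(lemma_names, query=None):
    """Drop duplicates (keeping first-seen order) and the query word itself."""
    seen = set()
    result = []
    for name in lemma_names:
        if name == query or name in seen:
            continue
        seen.add(name)
        result.append(name)
    return result
```

For example, `unique_synonyms(rice_synset.lemma_names("jpn"), query="米")` would return the synonyms without the original word.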

Summary

Using NLTK's Open Multilingual Wordnet, I was able to look up synonyms from Python easily. One caveat: some words have multiple concepts registered, so you need to choose the appropriate Synset to avoid getting synonyms for a sense you did not intend.
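
To make that choice easier, it can help to list every sense side by side before picking one. The sketch below uses a hypothetical helper, `synonyms_by_sense`, and assumes each item offers `name()`, `definition()`, and `lemma_names(lang)`, as NLTK's Synset objects do.

```python
def synonyms_by_sense(synsets, lang="jpn"):
    """Map each synset name to a (definition, lemma names in `lang`) pair."""
    return {
        syn.name(): (syn.definition(), syn.lemma_names(lang))
        for syn in synsets
    }
```

Calling it on the result of `wordnet.synsets("米", lang="jpn")` and printing each entry shows the definition next to its synonyms, so an unintended sense is easy to spot.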

That's all.
