Notes on using MeCab from Python

Personal notes on using MeCab from Python.

mecab.py


#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import MeCab

# Create a tagger with the default dictionary and settings.
m = MeCab.Tagger()

# parse() returns the whole analysis as one string, one morpheme per line.
print(m.parse("犬も歩けば棒に当たる。"))

$ ./mecab.py
犬	名詞,一般,*,*,*,*,犬,イヌ,イヌ
も	助詞,係助詞,*,*,*,*,も,モ,モ
歩け	動詞,自立,*,*,五段・カ行イ音便,仮定形,歩く,アルケ,アルケ
ば	助詞,接続助詞,*,*,*,*,ば,バ,バ
棒	名詞,一般,*,*,*,*,棒,ボウ,ボー
に	助詞,格助詞,一般,*,*,*,に,ニ,ニ
当たる	動詞,自立,*,*,五段・ラ行,基本形,当たる,アタル,アタル
。	記号,句点,*,*,*,*,。,。,。
EOS

Each line is the surface form, a tab, and then comma-separated features: part of speech, up to three subcategories, conjugation type, conjugation form, base form, reading, and pronunciation. The input is the proverb "If a dog walks, it hits a stick."
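
Since parse() returns a single string, it can help to split it into fields before doing anything else with it. The following is a minimal sketch, not part of the original script: split_parse_output is a hypothetical helper, the sample string is hand-written to stand in for real MeCab output in the default format (surface, tab, comma-separated features), and the plain split on commas assumes that no feature field itself contains a comma.

```python
def split_parse_output(text):
    """Split a Tagger.parse()-style string into (surface, features) pairs."""
    rows = []
    for line in text.splitlines():
        # Skip blank lines and the end-of-sentence marker.
        if not line or line == "EOS":
            continue
        surface, _, feature = line.partition("\t")
        rows.append((surface, feature.split(",")))
    return rows


# Hand-written sample standing in for the return value of m.parse(...).
sample = (
    "犬\t名詞,一般,*,*,*,*,犬,イヌ,イヌ\n"
    "も\t助詞,係助詞,*,*,*,*,も,モ,モ\n"
    "EOS\n"
)

for surface, features in split_parse_output(sample):
    # features[0] is the top-level part of speech.
    print(surface, features[0])
```

With a real tagger, the same helper would be called as split_parse_output(m.parse(text)).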

Read from a file

mecab_from_file.py


#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import sys

import MeCab

infile = sys.argv[1]
m = MeCab.Tagger()

# Analyze the file one line at a time (assumes the input file is UTF-8).
with open(infile, encoding="utf-8") as f:
    for line in f:
        # parseToNode() returns the first node of a linked list of morphemes.
        node = m.parseToNode(line)

        while node:
            print(node.feature)
            # e.g. 名詞,一般,*,*,*,*,犬,イヌ,イヌ

            node = node.next
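
The inner loop over nodes can be wrapped in a generator so that the traversal is written only once. This is a sketch under stated assumptions: iter_features is a hypothetical name, and the SimpleNamespace objects only imitate the two node attributes used here (feature and next) so that the example runs without MeCab installed. It also assumes that the feature string of the sentence start and end nodes begins with BOS/EOS.

```python
from types import SimpleNamespace


def iter_features(node, skip_bos_eos=True):
    """Yield the feature string of each node in a parseToNode()-style list."""
    while node:
        if not (skip_bos_eos and node.feature.startswith("BOS/EOS")):
            yield node.feature
        node = node.next


# Stand-in for the linked list returned by m.parseToNode(line):
# a start node, one word, and an end node.
eos = SimpleNamespace(feature="BOS/EOS,*,*,*,*,*,*,*,*", next=None)
word = SimpleNamespace(feature="名詞,一般,*,*,*,*,犬,イヌ,イヌ", next=eos)
bos = SimpleNamespace(feature="BOS/EOS,*,*,*,*,*,*,*,*", next=word)

for feature in iter_features(bos):
    print(feature)
```

With a real tagger, the loop would read: for feature in iter_features(m.parseToNode(line)).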

Count part-of-speech frequencies in a file

collections.defaultdict makes counting easier: a missing key starts at 0, so there is no need to check whether the key exists before incrementing it.

mecab_class_count.py


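#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import sys
from collections import defaultdict

import MeCab

infile = sys.argv[1]
m = MeCab.Tagger()

# Missing keys start at 0, so "+= 1" works without an existence check.
frequency = defaultdict(int)

# Assumes the input file is UTF-8.
with open(infile, encoding="utf-8") as f:
    for line in f:
        node = m.parseToNode(line)

        while node:
            # node.feature looks like: 名詞,一般,*,*,*,*,犬,イヌ,イヌ
            # The first field is the top-level part of speech.
            class_1 = node.feature.split(",")[0]
            frequency[class_1] += 1

            node = node.next

# print(frequency)
# defaultdict(<class 'int'>, {'...

# Print each part of speech with its count (line order may vary).
for k, v in frequency.items():
    print(k, v)

$ ./mecab_class_count.py input.txt
動詞 4
BOS/EOS 8
名詞 9
助詞 7
助動詞 1

Here 動詞 is verb, 名詞 is noun, 助詞 is particle, and 助動詞 is auxiliary verb. BOS/EOS counts the start and end nodes that MeCab adds around each parsed line.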
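
As an alternative to defaultdict, collections.Counter from the standard library does the same tallying and can also list the results by frequency. A sketch, assuming a hypothetical helper count_pos and hand-written feature strings in place of real node.feature values:

```python
from collections import Counter


def count_pos(features):
    """Count the first comma-separated field of each feature string."""
    return Counter(feature.split(",")[0] for feature in features)


# Hand-written stand-ins for the node.feature values of one parsed line.
features = [
    "BOS/EOS,*,*,*,*,*,*,*,*",
    "名詞,一般,*,*,*,*,犬,イヌ,イヌ",
    "助詞,係助詞,*,*,*,*,も,モ,モ",
    "名詞,一般,*,*,*,*,棒,ボウ,ボー",
    "BOS/EOS,*,*,*,*,*,*,*,*",
]

counts = count_pos(features)

# most_common() returns (key, count) pairs, highest count first.
for pos, n in counts.most_common():
    print(pos, n)
```

Counter is a dict subclass, so counts[key] still works, and a key that was never seen returns 0.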

Options

Specify a dictionary

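# Create a MeCab instance with -d pointing at the dictionary directory
# (adjust the path to your own installation).
m = MeCab.Tagger('-d /usr/local/Cellar/mecab/0.996/lib/mecab/dic/mecab-ipadic-neologd')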

Specify mecabrc

m = MeCab.Tagger('-r my_mecabrc')
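
When both options are needed, they can go into one argument string, since the Tagger accepts the same style of options as the mecab command. A minimal sketch: build_tagger_args is a hypothetical helper, not part of MeCab, /path/to/dic is a placeholder, and the helper assumes that the paths contain no spaces.

```python
def build_tagger_args(dicdir=None, rcfile=None):
    """Build an option string for MeCab.Tagger from optional paths."""
    args = []
    if rcfile:
        args += ["-r", rcfile]   # custom mecabrc file
    if dicdir:
        args += ["-d", dicdir]   # dictionary directory
    return " ".join(args)


# /path/to/dic is a placeholder, not a real dictionary location.
print(build_tagger_args(dicdir="/path/to/dic", rcfile="my_mecabrc"))

# With MeCab installed, the result would be passed on like this:
# m = MeCab.Tagger(build_tagger_args(rcfile="my_mecabrc"))
```

With no arguments the helper returns an empty string, which corresponds to the default settings.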
