Python: Simplified morphological analysis with regular expressions

SAMPLE

吾輩|は|猫|で|ある|。|名前|は|まだ|無|い|。

REFERENCE

Shortcut morphological analysis by regular expression
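The shortcut rests on the fact that Japanese text often changes script (kanji, hiragana, katakana) at word boundaries, so matching maximal runs of one script gives a rough first segmentation. A minimal sketch of that first step, assuming standard Unicode ranges for each script; the names `SCRIPT_RUNS` and `split_by_script` are hypothetical:

```python
import re

# Assumed ranges: Latin upper/lower case, hiragana, katakana (plus the
# long-vowel mark), common kanji, and two punctuation marks.
SCRIPT_RUNS = re.compile(r"[A-Z]+|[a-z]+|[ぁ-ん]+|[ァ-ヶー]+|[一-龍]+|[。、]")


def split_by_script(text):
    """Return maximal runs of a single script, in order of appearance."""
    return SCRIPT_RUNS.findall(text)


runs = split_by_script("吾輩は猫である。名前はまだ無い。")
print("|".join(runs))
# prints: 吾輩|は|猫|である|。|名前|はまだ|無|い|。
```

Hiragana runs such as である and はまだ are still glued together at this stage, which is why a second pass is needed to split particles off them.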

PYTHON (slightly modified from the reference)

import re

# The Japanese character ranges and particle lists are assumptions
# (standard Unicode ranges, common particles).
text_m = []
text = "吾輩は猫である。名前はまだ無い。"
# One alternative per script: Latin, hiragana, katakana, kanji, punctuation.
p = re.compile(r"/|[A-Z]+|[a-z]+|[ぁ-ん]+|[ァ-ヶー]+|[一-龍]+|[。、]")
hiragana = re.compile(r"[ぁ-ん]+")
m = p.findall(text)
for row in m:
    if hiragana.fullmatch(row):
        if row[0] in 'はがで':
            # Leading particle: split it off the front.
            prefix = row[0]
            token = row[1:]
            text_m.append(prefix)
            if token:
                text_m.append(token)
        elif row[-2:] in ('ので', 'から'):
            # Trailing two-character particle.
            token = row[0:-2]
            suffix = row[-2:]
            if token:
                text_m.append(token)
            text_m.append(suffix)
        elif row[-1:] in 'もはがで':
            # Trailing one-character particle.
            token = row[0:-1]
            suffix = row[-1:]
            if token:
                text_m.append(token)
            text_m.append(suffix)
        else:
            text_m.append(row)
    else:
        text_m.append(row)

## output
print('|'.join(text_m))
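
The same two-pass idea can be packaged as a function so it can be applied to more than one string. This is only a sketch under assumptions: the function name `simple_tokenize`, the constants, and the particle lists are hypothetical and should be adjusted to the text at hand.

```python
import re

# Hypothetical names and particle lists, not taken from any library.
RUNS = re.compile(r"[A-Z]+|[a-z]+|[ぁ-ん]+|[ァ-ヶー]+|[一-龍]+|[。、]")
HIRAGANA = re.compile(r"[ぁ-ん]+")
LEADING = ("は", "が", "で")
TRAILING = ("ので", "から", "も", "は", "が", "で")  # longer suffixes first


def simple_tokenize(text):
    """Split text into script runs, then peel particles off hiragana runs."""
    tokens = []
    for run in RUNS.findall(text):
        if not HIRAGANA.fullmatch(run):
            tokens.append(run)
            continue
        # A leading particle takes priority; a trailing one is tried only
        # when no leading particle was found.
        head = next((p for p in LEADING if run.startswith(p)), "")
        rest = run[len(head):]
        tail = "" if head else next((s for s in TRAILING if rest.endswith(s)), "")
        body = rest[:len(rest) - len(tail)]
        # Skip empty pieces so the joined output never contains "||".
        tokens.extend(part for part in (head, body, tail) if part)
    return tokens


print("|".join(simple_tokenize("吾輩は猫である。名前はまだ無い。")))
# prints: 吾輩|は|猫|で|ある|。|名前|は|まだ|無|い|。
```

Keeping the particle lists as tuples rather than plain strings avoids accidental substring matches, for example a two-character slice that happens to straddle two listed particles.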
