[Python] Getting rid of dating-spam tweets with regular expressions

How it started

I had been collecting data from Twitter and then left the collection running unattended. When I finally looked at the data after a long while, it was swarming with mysterious dating-spam tweets that have zero favorites and end in a municipality name. I confirmed this by actually searching Twitter.

(Screenshot of the Twitter search results: image.png)

The user names were so obscene that I hid them ... What are these mysterious phrases? I think they used to be slightly longer sentences. Previously I filtered by specifying particular words and dropping every tweet that matched, but with this few characters there is no common word I can match on.

So I delete them with a regular expression.

Source

Here it is in one go, together with sample data for checking that it works.

At first glance, the tweets follow two patterns: ① "one hiragana character" + "a hiragana character or punctuation mark" + "municipality name", and ② "three hiragana characters" + "symbol" + "municipality name". So I replace the matching tweets with empty strings and then delete the empty rows.
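The two patterns described above can be sketched with the standard `re` module. This is only an illustration under stated assumptions: the character classes (`HIRAGANA`, `SYMBOLS`, `SUFFIX`), the helper `looks_like_spam`, and the sample strings are hypothetical and not taken from the collected data, and "municipality name" is approximated as "anything ending in 町, 村 or 市".

```python
import re

# Assumed character classes (hypothetical, chosen for illustration)
HIRAGANA = "ぁ-ん"
SYMBOLS = r"!-/:-@\[-`{-~、。,…"  # ASCII symbols plus some Japanese punctuation
SUFFIX = "町村市"                  # town / village / city

# Pattern 1: one hiragana, then a hiragana or punctuation mark, then a municipality name
PATTERN_1 = re.compile(rf"^[{HIRAGANA}][{HIRAGANA}{SYMBOLS}].+[{SUFFIX}]$")
# Pattern 2: three hiragana, a symbol, then a municipality name
PATTERN_2 = re.compile(rf"^[{HIRAGANA}]{{3}}[{SYMBOLS}].+[{SUFFIX}]$")


def looks_like_spam(text: str) -> bool:
    """Return True when the text matches either of the two patterns."""
    return bool(PATTERN_1.search(text) or PATTERN_2.search(text))


# Made-up sample strings
samples = ["あ、大阪市", "おっと、大阪市", "あ、大阪市あああ", "大阪市に行きました"]
results = {s: looks_like_spam(s) for s in samples}
```

Because `.+` accepts any characters before the final 町/村/市, this approximation is deliberately loose and may also catch ordinary short tweets that happen to end in a municipality name.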

Since the data is already in a DataFrame, I handle it there. It's been a while since I last spent time with Python. This will be over quickly.

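import numpy as np
import pandas as pd

# Sample data for the operation check (Japanese strings, since the pattern targets hiragana)
DF_samp = pd.DataFrame({'col_0': {'row_0': "あ、大阪市", 'row_1': "あ、大阪市あああ", 'row_2': "おっと、大阪市"},
                        'col_1': {'row_0': 3, 'row_2': 4, 'row_3': 5}})
sym = r"!-/:-@\[-`{-~、。,…"  # ASCII symbols plus Japanese punctuation
# Two hiragana + symbol, or one hiragana + (hiragana or symbol), then anything ending in 町/村/市
pattern = rf"[ぁ-ん][ぁ-ん][{sym}].+[町村市]$|[ぁ-ん][ぁ-ん{sym}].+[町村市]$"
# Blank out the matching tweets, turn the blanks into NaN, then drop those rows
DF_samp['col_0'] = DF_samp['col_0'].str.replace(pattern, '', regex=True).replace('', np.nan)
DF_samp = DF_samp.dropna(subset=['col_0'])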

With this, only the offending mystery tweets were wiped out. Nice. I can almost hear someone asking why I do this with substitution, but I don't like long code ...
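For readers who would rather filter than substitute, here is a minimal sketch using a boolean mask with `Series.str.contains`. The column name `text`, the variable names, the simplified single pattern, and the sample strings are all assumptions made for illustration; this is not the approach used in the article.

```python
import pandas as pd

# Made-up sample data; the last row stands in for a missing tweet
df = pd.DataFrame({"text": ["あ、大阪市", "あ、大阪市あああ", "おっと、大阪市", None]})

symbols = r"!-/:-@\[-`{-~、。,…"  # assumed symbol class (hypothetical)
# One hiragana, then a hiragana or symbol, then anything ending in 町/村/市
pattern = rf"^[ぁ-ん][ぁ-ん{symbols}].+[町村市]$"

# na=False treats missing tweets as non-matching, so they are kept by the mask
is_spam = df["text"].str.contains(pattern, regex=True, na=False)
kept = df[~is_spam]
```

The mask leaves the original strings untouched, so partially matching rows are either kept whole or dropped whole, and no separate blank-row cleanup is needed.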

And now

I realized that if the people running the bots see this, a new pattern might show up ... I'll deal with that when it happens.

Anyway, I want to live in a world where blocking can be done efficiently! That said, I collect the tweets through the API, so blocking isn't relevant this time.
