Collapse runs of a repeated character into one with a Python regular expression.

What I wanted to do

I needed to clean up text when the characters embedded in a PDF came out strangely. The goal is shown below: when the same character is repeated in succession, the run is collapsed into a single character. Repeated sequences of several characters are left alone.

Ah ah → Ah
Aiuueo → Aiueo
ABCABCABC → ABCABCABC
Yui Yui consent → Yui Yui consent

What I did

python


    import re
    # Assumes `result` already holds some string
    result = re.sub(r"(.)\1{1,}", r"\g<1>", result)  # Collapse each run of the same character into one
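
As a quick check, the sketch below wraps the same substitution in a small helper and applies it to a few ASCII inputs. The helper name `collapse_repeats` and the sample strings are assumptions made for illustration; they are not from the original post. It shows that a run of one repeated character collapses, while a repeated multi-character sequence such as ABCABCABC is untouched, because `(.)` captures exactly one character.

```python
import re

def collapse_repeats(text: str) -> str:
    """Hypothetical helper: collapse each run of one repeated character."""
    # (.) captures a single character, \1{1,} matches one or more further copies of it.
    return re.sub(r"(.)\1{1,}", r"\g<1>", text)

# Illustrative inputs only (assumed, not taken from the post).
for sample in ["aaa", "aiuueo", "ABCABCABC", "!! !!"]:
    print(f"{sample!r} -> {collapse_repeats(sample)!r}")
```

Note that `.` does not match a newline by default, so runs of blank lines are not collapsed by this pattern.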

Other snippets

Text formatting


import re
from unicodedata import normalize

def clean_text(txt: str) -> str:
    result = re.sub(r"\s+", "", txt)                 # Remove whitespace first to lighten later steps (\s also matches full-width spaces)
    result = normalize('NFKC', result)               # Unicode normalization
    result = re.sub(r"(.)\1{1,}", r"\g<1>", result)  # Collapse each run of the same character into one
    if ')(cid:' in result:                           # Handle PDFs with embedded characters: discard text made of consecutive (cid:...) markers
        return ''
    return result
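
A minimal usage sketch of `clean_text` follows, with the function repeated so the block runs on its own. The sample strings are made up for illustration and are not from the original post.

```python
import re
from unicodedata import normalize

def clean_text(txt: str) -> str:
    result = re.sub(r"\s+", "", txt)                 # strip all whitespace
    result = normalize("NFKC", result)               # fold full-width forms to ASCII, etc.
    result = re.sub(r"(.)\1{1,}", r"\g<1>", result)  # collapse runs of one character
    if ")(cid:" in result:                           # consecutive (cid:...) markers
        return ""
    return result

# Full-width "ABC", an ideographic space (U+3000), a full-width "a", then ASCII "aa".
mixed = "\uff21\uff22\uff23\u3000\uff41aa"
print(clean_text(mixed))                # ABCa

# Text that consists of (cid:...) markers is discarded entirely.
print(repr(clean_text("(cid:12)(cid:34)")))   # ''

# Caveat: legitimate doubled letters are collapsed as well.
print(clean_text("Hello   world"))      # Heloworld
```

Normalizing before collapsing matters here: the full-width and ASCII "a" only become the same character after NFKC, so they are merged into one. The last call shows the trade-off of this approach on ordinary text, where doubled letters are part of the spelling.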

Let's try it on the Louise copypasta!

Louise


import re

text = "Louise! Louise! Louise! Ruizuuuuuuuuaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa !! !!\n\
Ah ah ah ... ah ... ah! Ah ah ah ah! !! !! Louise Louise Louise Wow Wow Ah Ah! !! !!\n\
Ah Kunka Kunka! Kunka Kunka! Suha Suha! Suha Suha! It smells good ... Kun\n\
Hmm! I want to squeeze the pink blonde hair of Louise Francoise-tan! Kunka Kunka! Aa! !!\n\
mistook! I want to be fluffy! Mofumofu! Mofumofu! Hair Mofumofu! Crispy Mofumofu ... Kyun Kyun Kyu! !!\n\
The 12th volume of the novel, Louise, was cute! !! Ah ah ... ah ... ah ah ah! !! Fahhhhh! !!\n\
I'm glad that the second season of the anime was broadcast, Ruiz-tan! Oh Oh Oh Oh! cute! Louise! cute! A-aa ~ aa!"

print(re.sub(r"(.)\1{1,}", r"\g<1>", text))

# Louise! Louise! Louise! Ruizua ! !
# Ah ah ah . ah . ah! Ah ah ah ah! ! ! Louise Louise Louise Wow Wow Ah Ah! ! !
# Ah Kunka Kunka! Kunka Kunka! Suha Suha! Suha Suha! It smels god . Kun
# Hm! I want to squeze the pink blonde hair of Louise Francoise-tan! Kunka Kunka! Aa! !
# mistok! I want to be flufy! Mofumofu! Mofumofu! Hair Mofumofu! Crispy Mofumofu . Kyun Kyun Kyu! !
# The 12th volume of the novel, Louise, was cute! ! Ah ah . ah . ah ah ah! ! Fah! !
# I'm glad that the second season of the anime was broadcast, Ruiz-tan! Oh Oh Oh Oh! cute! Louise! cute! A-a ~ a!

References

Reverse replacement. I looked at various pages, but this one seemed to cover everything.

Grouping when using regular expressions in Python. In Python, it took me a while to realize that I had to write \g<1> instead of $1.
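
To make the difference concrete, here is a small sketch comparing the replacement syntaxes in Python's `re.sub`. The input string is an arbitrary example chosen for illustration.

```python
import re

text = "aaabbb1"
pattern = r"(.)\1+"

# \1 and \g<1> both refer to capture group 1 in the replacement string.
with_backslash = re.sub(pattern, r"\1", text)
with_g_syntax = re.sub(pattern, r"\g<1>", text)
print(with_backslash, with_g_syntax)   # ab1 ab1

# "$1" has no special meaning in Python; it is inserted literally.
with_dollar = re.sub(pattern, "$1", text)
print(with_dollar)                     # $1$11

# \g<1> is unambiguous when a digit follows the group reference;
# r"\10" would be read as a reference to group 10 instead.
followed_by_digit = re.sub(pattern, r"\g<1>0", "aaab")
print(followed_by_digit)               # a0b
```

Writing the replacement as a raw string (`r"\g<1>"`) avoids the invalid-escape warning that recent Python versions emit for `"\g"` in an ordinary string literal.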
