[Python] Remove extra strings from URLs with regular expressions

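import re

# Group 1 captures everything before the second hyphen;
# the trailing "-[^-,]*" part is the extra text to drop.
pattern = re.compile(r"(^[^-]*-[^-]*)-[^-,]*")

with open('out.csv', encoding='utf-8') as f:
    for row in f:  # iterate over the file directly instead of readlines()
        m = pattern.match(row)
        if m:
            print(m.group(1))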

Contents of out.csv:

https://www.abcde.com/-0w69e7e1w00-AIUEO
https://www.abcde.com/-0w69e7e9w70-Kakikukeko
https://www.abcde.com/-0w08e1e0w00-SA Shi Su Se So
https://www.abcde.com/-0w69e7e1w70-TA Chi Tsu Te to
https://www.abcde.com/-0w69e6e2w54-What is it
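
To check what the pattern keeps without needing a file on disk, here is a minimal sketch that applies the same regular expression to two of the sample rows. The helper name `strip_suffix` and the `samples` list are assumptions made for illustration; they are not part of the original script.

```python
import re

# Same pattern as in the script: group 1 is everything before the second hyphen.
pattern = re.compile(r"(^[^-]*-[^-]*)-[^-,]*")

def strip_suffix(url: str) -> str:
    """Hypothetical helper: drop the text that follows the second hyphen."""
    url = url.strip()  # remove the trailing newline that file rows carry
    m = pattern.match(url)
    # Return the input unchanged when it does not contain two hyphens.
    return m.group(1) if m else url

# Two rows copied from the sample data above.
samples = [
    "https://www.abcde.com/-0w69e7e1w00-AIUEO",
    "https://www.abcde.com/-0w08e1e0w00-SA Shi Su Se So",
]

cleaned = [strip_suffix(s) for s in samples]
for c in cleaned:
    print(c)
# https://www.abcde.com/-0w69e7e1w00
# https://www.abcde.com/-0w08e1e0w00
```

Note that the pattern counts hyphens from the start of the line, so it relies on the host name containing none. A URL such as `https://my-site.com/...` would be cut at the wrong place and would need a different pattern.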
