[PYTHON] Remove extra strings in URLs with regular expressions

import re

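# Group 1 captures everything up to (not including) the second hyphen;
# the rest of the pattern consumes the trailing "-title" part.
pattern = re.compile(r"(^[^-]*-[^-]*)-[^-,]*")

# out.csv holds one URL per line
with open('out.csv', encoding='utf-8') as f:
    for row in f:  # iterate lazily instead of loading every line with readlines()
        m = pattern.match(row)
        if m:
            print(m.group(1))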

out.csv (the input file read by the script)

https://www.abcde.com/-0w69e7e1w00-AIUEO
https://www.abcde.com/-0w69e7e9w70-Kakikukeko
https://www.abcde.com/-0w08e1e0w00-SA Shi Su Se So
https://www.abcde.com/-0w69e7e1w70-TA Chi Tsu Te to
https://www.abcde.com/-0w69e6e2w54-What is it
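
To try the pattern without creating a file, it can be wrapped in a small helper and run on a few of the sample lines above. This is only a sketch: the function name `strip_suffix` is hypothetical and not part of the original script, and returning the input unchanged when nothing matches is an assumption (the original script simply skips such lines).

```python
import re

# Same pattern as in the script: group 1 is everything before the second hyphen.
pattern = re.compile(r"(^[^-]*-[^-]*)-[^-,]*")

def strip_suffix(url):
    """Hypothetical helper: drop the trailing "-title" part of a URL.

    Assumption: a line that does not match is returned unchanged
    (minus its trailing newline) rather than being skipped.
    """
    m = pattern.match(url)
    return m.group(1) if m else url.rstrip("\n")

sample = [
    "https://www.abcde.com/-0w69e7e1w00-AIUEO",
    "https://www.abcde.com/-0w08e1e0w00-SA Shi Su Se So",
]
for line in sample:
    print(strip_suffix(line))
# prints:
# https://www.abcde.com/-0w69e7e1w00
# https://www.abcde.com/-0w08e1e0w00
```

Because the pattern counts hyphens from the start of the line, it relies on the scheme and host containing no hyphen. A URL such as https://my-site.com/-abc-title would be cut at the wrong place.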
