[PYTHON] Regular expression matching method
A note on how to match regular expressions in Python
- 1 Import the regular expression module with `import re`.
- 2 Call the `re.compile()` function with a raw string to create a Regex object.
Example: phone_num_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')
- 3 Pass the string to be searched to the Regex object's `search()` method. It returns a Match object, or None if the pattern is not found.
Example: mo = phone_num_regex.search('My phone number is 415-555-4242.')
(`mo` stands for "Match object".)
- 4 Call the Match object's `group()` method to get the actual matched string.
Example: print('Phone number found: ' + mo.group())
→ Phone number found: 415-555-4242
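The four steps above can be put together in one small script. This is a minimal sketch: the helper name `find_phone_number` is a hypothetical name chosen for illustration, and the None check is added because `search()` returns None when nothing matches.

```python
import re

# Step 2: compile the pattern once; the raw string keeps the backslashes literal.
phone_num_regex = re.compile(r'\d\d\d-\d\d\d-\d\d\d\d')

def find_phone_number(text):
    """Hypothetical helper: return the first phone number in text, or None."""
    # Step 3: search() returns a Match object, or None if nothing matches.
    mo = phone_num_regex.search(text)
    # Step 4: group() returns the matched string; guard against None first.
    return mo.group() if mo else None

print(find_phone_number('My phone number is 415-555-4242.'))  # 415-555-4242
print(find_phone_number('No number here.'))                   # None
```

Calling `mo.group()` without the check raises an AttributeError when the pattern is not found, so the guard is worth keeping in real code.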
Shorthand character classes
| Shorthand | Meaning |
|---|---|
| \d | Any digit from 0 to 9 |
| \D | Any character that is not a digit from 0 to 9 |
| \w | Any letter, digit, or underscore (w for "word") |
| \W | Any character that is not a letter, digit, or underscore |
| \s | Any space, tab, or line break (s for "space") |
| \S | Any character that is not a space, tab, or line break |
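A short sketch of the shorthand classes in use, assuming a made-up sample string. Note that for `str` patterns in Python 3, `\d` and `\w` also match non-ASCII digits and letters unless the `re.ASCII` flag is passed.

```python
import re

# Sample text invented for this illustration.
sample = 'room_42: opens at 9 AM'

digits = re.findall(r'\d+', sample)  # runs of digits
words = re.findall(r'\w+', sample)   # runs of letters, digits, underscores
parts = re.split(r'\s+', sample)     # split on runs of whitespace

print(digits)  # ['42', '9']
print(words)   # ['room_42', 'opens', 'at', '9', 'AM']
print(parts)   # ['room_42:', 'opens', 'at', '9', 'AM']
```

The colon survives in `parts` but not in `words`, because `:` is neither whitespace nor a word character.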
Summary of symbols used for regular expressions
- ? matches 0 or 1 occurrences of the previous group.
- * matches 0 or more occurrences of the previous group.
- + matches one or more occurrences of the previous group.
- {n} matches n occurrences of the previous group.
- {n,} matches n or more occurrences of the previous group.
- {,m} matches 0 to m occurrences of the previous group.
- {n,m} matches n to m occurrences of the previous group.
- {n,m}?, *?, and +? make a non-greedy match of the previous group.
- ^spam matches strings starting with "spam".
- spam$ matches strings ending in "spam".
- . matches any single character other than the newline character.
- \d, \w, \s match a digit, a word character, and a whitespace character, respectively.
- \D, \W, \S match any character that is not a digit, not a word character, and not a whitespace character, respectively.
- [abc] matches any single character listed in the square brackets (here a, b, or c).
- [^abc] matches any single character not listed in the square brackets.
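The symbols in this summary can be tried out with a few one-line searches. This is an illustrative sketch: `first_match` is a hypothetical helper, and the sample strings are made up for the example.

```python
import re

def first_match(pattern, text):
    """Hypothetical helper: return the first matched string, or None."""
    mo = re.search(pattern, text)
    return mo.group() if mo else None

# ?, *, + control how many times the previous group may repeat.
optional = first_match(r'Bat(wo)?man', 'Batman')          # 'Batman'
zero_or_more = first_match(r'Bat(wo)*man', 'Batwowoman')  # 'Batwowoman'
one_or_more = first_match(r'Bat(wo)+man', 'Batman')       # None

# {n,m} is greedy by default; a trailing ? makes it non-greedy.
greedy = first_match(r'(Ha){3,5}', 'HaHaHaHaHa')   # 'HaHaHaHaHa'
lazy = first_match(r'(Ha){3,5}?', 'HaHaHaHaHa')    # 'HaHaHa'

# ^ and $ anchor the match to the start and end of the string.
starts = first_match(r'^spam', 'spam and eggs')        # 'spam'
ends = first_match(r'spam$', 'eggs and spam')          # 'spam'
not_at_start = first_match(r'^spam', 'eggs and spam')  # None

# . matches any one character except a newline.
dot = re.findall(r'.at', 'The cat sat on the flat mat.')

# [abc] matches a listed character; [^abc] matches any other character.
in_class = re.findall(r'[abc]', 'a1b2c3')
not_in_class = re.findall(r'[^abc]', 'a1b2c3')

print(dot)           # ['cat', 'sat', 'lat', 'mat']
print(in_class)      # ['a', 'b', 'c']
print(not_in_class)  # ['1', '2', '3']
```

Writing the quantifier without spaces matters in Python: `{3,5}` is a repetition count, while `{3, 5}` with a space is treated as literal text.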