I tried writing a regular expression for "dates" in Python

Conclusion

Here are the regular expressions for dates in Python.

The Western calendar version is as follows.

pattern = r'[12]\d{3}[/\-年](0?[1-9]|1[0-2])[/\-月](0?[1-9]|[12][0-9]|3[01])日?$'
# 年, 月, 日 are the kanji for year, month, day
# OK
# 2020年2月22日
# 2020-2-22
# 2020/2/22
# 1985/01/12
# 2010/12/11
# 2022/02/22

# NG
# 9999/99/99

The Japanese calendar version is as follows.

pattern = r'(明治|大正|昭和|平成|令和)\d{1,2}年(0?[1-9]|1[0-2])月(0?[1-9]|[12][0-9]|3[01])日'
# Era names: 明治 (Meiji), 大正 (Taisho), 昭和 (Showa), 平成 (Heisei), 令和 (Reiwa)
# OK
# 令和02年2月22日
# 令和2年2月22日
# 平成2年2月22日
# 昭和20年2月22日
# 大正7年2月22日
# 明治30年2月22日

# NG
# 令和99年99月99日

Preparation

The environment is Google Colaboratory. The Python version is shown below.

import platform
print("python " + platform.python_version())
# python 3.6.9

I used https://regex101.com/ to check the regular expressions. Each pattern is built and checked there first, then put into code.


For Python regular expressions in general, this article is easy to follow: https://qiita.com/luohao0404/items/7135b2b96f9b0b196bf3

Let's create a date regular expression

Western calendar version

Let's get straight to the code. First, import the library for working with regular expressions.

import re

First, let's create a regular expression that matches the string "2022/02/22".

pattern = r'2022/02/22'

Since the pattern is identical to the string, it naturally matches. Let's check with code.

pattern = r'2022/02/22'
string = r'2022/02/22'
prog = re.compile(pattern)
result = prog.match(string)
if result:
    print(result.group())
# 2022/02/22

The matched string is printed. From here on, for brevity, only the regular expression pattern is shown.
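To try the later patterns without repeating the compile-and-match steps, a small helper can be used. This is only a sketch; the function name `check` and the sample strings are made up for illustration and are not part of the original code.

```python
import re

def check(pattern, candidates):
    """Return {candidate: matched text or None}, using re.match (anchored at the start)."""
    prog = re.compile(pattern)
    results = {}
    for text in candidates:
        m = prog.match(text)
        results[text] = m.group() if m else None
    return results

# Hypothetical sample strings for illustration
results = check(r'2022/02/22', ['2022/02/22', '2022/02/23', 'on 2022/02/22'])
for text, matched in results.items():
    print(f'{text!r} -> {matched!r}')
# '2022/02/22' -> '2022/02/22'
# '2022/02/23' -> None
# 'on 2022/02/22' -> None
```

Note that `match` only looks at the beginning of the string, which is why the last sample does not match.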

In addition to "2022/02/22", there are other dates such as "1985/01/12" and "2010/12/11". The regular expressions that match these are as follows.

pattern = r'\d\d\d\d/\d\d/\d\d'

The regular expression used is:

Pattern    Description
\d         Any digit

Example    Matching string
\d\d\d\d   2022
\d\d       02, 22

The regular expression above can be written more concisely.

pattern = r'\d{4}/\d{2}/\d{2}'

The newly used regular expression is:

Pattern    Description
{m}        Repeat the preceding element exactly m times

Example    Matching string
\d{4}      2022
\d{2}      02, 22

However, this also matches impossible date strings such as "9999/99/99". This time, only the following ranges are allowed for the YYYY/MM/DD format: YYYY from 1000 to 2999, MM from 01 to 12, and DD from 01 to 31.

The modified regular expression is as follows.

pattern = r'[12]\d{3}/(0[1-9]|1[0-2])/(0[1-9]|[12][0-9]|3[01])'

The newly used regular expression is:

Pattern      Description
[abc]        Any one of the characters a, b, or c

Example      Matching string
[12]\d{3}    1000~2999
0[1-9]       01~09
1[0-2]       10~12
[12][0-9]    10~29
3[01]        30, 31

We also used the following regular expression.

Pattern      Description
(abc|efg)    Either the string abc or the string efg

Example                     Matching string
(0[1-9]|1[0-2])             01~09 or 10~12, that is, 01~12
(0[1-9]|[12][0-9]|3[01])    01~09 or 10~29 or 30, 31, that is, 01~31

You now have a regular expression that matches only the above conditions.
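The difference between the loose pattern and the range-limited pattern can be checked with a quick comparison. This is a minimal sketch; the variable names and the extra samples "2022/13/01" and "2022/02/31" are added here for illustration.

```python
import re

loose = re.compile(r'\d{4}/\d{2}/\d{2}')
strict = re.compile(r'[12]\d{3}/(0[1-9]|1[0-2])/(0[1-9]|[12][0-9]|3[01])')

samples = ['2022/02/22', '1985/01/12', '2010/12/11',
           '9999/99/99', '2022/13/01', '2022/02/31']

# Record whether each pattern matches each sample
loose_ok = {s: bool(loose.match(s)) for s in samples}
strict_ok = {s: bool(strict.match(s)) for s in samples}

for s in samples:
    print(s, 'loose:', loose_ok[s], 'strict:', strict_ok[s])
```

The strict pattern rejects "9999/99/99" and "2022/13/01", but it still accepts "2022/02/31": the expression only checks the range of each field, not whether the day exists in that month.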

However, this cannot match dates without zero padding, such as "2020/2/22". The modified regular expression is as follows.

pattern = r'[12]\d{3}\/(0?[1-9]|1[0-2])\/(0?[1-9]|[12][0-9]|3[01])$'

The newly used regular expression is:

Pattern    Description
?          Zero or one occurrence of the preceding element

Example    Matching string
0?[1-9]    1~9 or 01~09

We also used the following regular expression.

Pattern    Description
$          End of the string

Without this, "2022/02/22" would match only as far as "2022/02/2", because the day alternative 0?[1-9] succeeds on the first "2" and the match ends there.

With this, dates without zero padding can also be handled.
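The effect of the trailing `$` is easy to confirm in code. The sketch below compares the pattern with and without `$`; the variable names are chosen here for illustration. As an alternative not used in the original, `fullmatch` requires the whole string to match and gives the same result without `$`.

```python
import re

without_end = re.compile(r'[12]\d{3}/(0?[1-9]|1[0-2])/(0?[1-9]|[12][0-9]|3[01])')
with_end = re.compile(r'[12]\d{3}/(0?[1-9]|1[0-2])/(0?[1-9]|[12][0-9]|3[01])$')

text = '2022/02/22'

short = without_end.match(text).group()
full = with_end.match(text).group()
# fullmatch forces the entire string to be consumed, so no "$" is needed
via_fullmatch = without_end.fullmatch(text).group()

print(short)          # 2022/02/2
print(full)           # 2022/02/22
print(via_fullmatch)  # 2022/02/22

# The unpadded form is accepted as well
unpadded = with_end.match('2020/2/22').group()
print(unpadded)       # 2020/2/22
```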

Furthermore, let's modify it so that it matches not only "/" (slash) but also "-" (hyphen) and the kanji separators 年, 月, 日 (year, month, day).

pattern = r'[12]\d{3}[/\-年](0?[1-9]|1[0-2])[/\-月](0?[1-9]|[12][0-9]|3[01])日?$'

Here, "\-" is an escape: it means that "-" (hyphen) is treated as a literal character rather than with its special meaning inside the brackets (a range, as in 0-9).

Now you have a regular expression that matches not only "/" (slash) but also "-" (hyphen) and 年, 月, 日.
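The finished Western calendar pattern can be run against the OK and NG examples as a check. This sketch assumes the kanji separators 年, 月, 日 (year, month, day) in the character classes; the mixed-separator sample is an addition for illustration.

```python
import re

# Assumes the kanji 年 / 月 / 日 as the year / month / day separators
pattern = r'[12]\d{3}[/\-年](0?[1-9]|1[0-2])[/\-月](0?[1-9]|[12][0-9]|3[01])日?$'
prog = re.compile(pattern)

ok_samples = ['2020年2月22日', '2020-2-22', '2020/2/22',
              '1985/01/12', '2010/12/11', '2022/02/22']
ng_samples = ['9999/99/99']

ok_results = [bool(prog.match(s)) for s in ok_samples]
ng_results = [bool(prog.match(s)) for s in ng_samples]
print(ok_results)  # [True, True, True, True, True, True]
print(ng_results)  # [False]

# Each character class is checked independently, so mixed separators also match
mixed = bool(prog.match('2020年2/22'))
print(mixed)       # True
```

Because each separator is its own character class, the pattern does not require the same separator style throughout the string. Whether that is acceptable depends on the input being handled.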

Japanese calendar version

Dates are written not only in the Western calendar but also in the Japanese calendar, such as "令和2年2月22日" (February 22, Reiwa 2), so let's create a regular expression for these as well.

Consider the following as conditions for dates in the Japanese calendar.
-- The string starts with one of 明治 (Meiji), 大正 (Taisho), 昭和 (Showa), 平成 (Heisei), or 令和 (Reiwa).
-- The year is a number of up to two digits. A year such as "令和99年" (Reiwa 99) does not exist, but it is accepted this time.
-- The numbers are separated only by 年, 月, 日 (year, month, day). "/" (slash) and "-" (hyphen) are excluded.

The regular expression is:

pattern = r'(明治|大正|昭和|平成|令和)\d{1,2}年(0?[1-9]|1[0-2])月(0?[1-9]|[12][0-9]|3[01])日'
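The Japanese calendar pattern can be checked in the same way. This sketch assumes the era names and separators are written in kanji (明治, 大正, 昭和, 平成, 令和 and 年, 月, 日); the sample strings are illustrative.

```python
import re

# Assumes kanji era names and the kanji 年 / 月 / 日 as separators
pattern = r'(明治|大正|昭和|平成|令和)\d{1,2}年(0?[1-9]|1[0-2])月(0?[1-9]|[12][0-9]|3[01])日'
prog = re.compile(pattern)

ok_samples = ['令和02年2月22日', '令和2年2月22日', '平成2年2月22日',
              '昭和20年2月22日', '大正7年2月22日', '明治30年2月22日']
ng_samples = ['令和99年99月99日']

ok_results = [bool(prog.match(s)) for s in ok_samples]
ng_results = [bool(prog.match(s)) for s in ng_samples]
print(ok_results)  # [True, True, True, True, True, True]
print(ng_results)  # [False]

# Only the parenthesized parts are captured: era, month, day (the year is not in a group)
groups = prog.match('令和2年2月22日').groups()
print(groups)      # ('令和', '2', '22')
```

No `$` is needed here because the mandatory trailing 日 forces the day part to consume both digits of "22".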

Summary

In this article, I created regular expressions for dates in Python.

Strings that follow a fixed pattern, such as dates, times, and monetary amounts, are a good fit for regular expressions. Try extracting various kinds of strings with them.
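As a starting point for extraction, the pattern can be used with `finditer` to pull dates out of longer text. The sketch below is a variation and not the pattern from this article: it drops `$` (which only suits whole-string checks) and instead adds the lookarounds `(?<!\d)` and `(?!\d)` so that a match cannot start or stop in the middle of a run of digits. The sample text is made up.

```python
import re

# Variation for searching inside text: no "$", digit lookarounds at both ends.
# Assumes the kanji 年 / 月 / 日 as the year / month / day separators.
extract = re.compile(
    r'(?<!\d)[12]\d{3}[/\-年](0?[1-9]|1[0-2])[/\-月](0?[1-9]|[12][0-9]|3[01])日?(?!\d)'
)

# Hypothetical sample text
text = 'Kickoff: 2020年2月22日, review: 2020/3/5, release: 2022-02-22.'

found = [m.group() for m in extract.finditer(text)]
print(found)  # ['2020年2月22日', '2020/3/5', '2022-02-22']
```

Without the trailing `(?!\d)`, the last date would be cut off at "2022-02-2" for the same reason that `$` was needed earlier.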
