Hatena Blog entry URLs have the form "entry/year/month/day/time" (the URL in the example below is not a real one).
https://hogefugapiyo.hatenablog.com/entry/2019/09/02/000000
As an example, let's extract the year, month, and day from this URL.
Use java.util.regex.Matcher.
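import java.util.regex.Matcher;
import java.util.regex.Pattern;

Matcher matcher = Pattern
    .compile(".+/entry/(\\d+)/(\\d+)/(\\d+)/.+")
    .matcher(/* URL string to match */);
// replaceFirst resets the matcher on every call, so it can be called repeatedly
int year = Integer.parseInt(matcher.replaceFirst("$1"));
int month = Integer.parseInt(matcher.replaceFirst("$2"));
int day = Integer.parseInt(matcher.replaceFirst("$3"));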
The title calls this "cutting out", but replaceFirst is really meant to "replace the part of the input that matches the regular expression with the specified replacement string". In this case the regular expression matches the entire URL string, so the entire string is replaced with the replacement, and the result looks as if the value had been cut out.
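To make the "replace, not cut out" behavior concrete, here is a minimal sketch. The class name ReplaceFirstDemo, the helper yearWith, and the PARTIAL pattern are made up for illustration; only FULL is the expression from this article. It shows what comes back when the expression covers the whole URL, only part of it, or nothing at all.

```java
import java.util.regex.Pattern;

class ReplaceFirstDemo {
    // The expression from the article: it spans the whole URL.
    static final Pattern FULL = Pattern.compile(".+/entry/(\\d+)/(\\d+)/(\\d+)/.+");
    // Hypothetical variant without the leading and trailing ".+".
    static final Pattern PARTIAL = Pattern.compile("/entry/(\\d+)/(\\d+)/(\\d+)/");

    // Replaces the first match of the pattern with the text of group 1.
    static String yearWith(Pattern pattern, String input) {
        return pattern.matcher(input).replaceFirst("$1");
    }

    public static void main(String[] args) {
        String url = "https://hogefugapiyo.hatenablog.com/entry/2019/09/02/000000";
        // Whole string matched, so only the year remains.
        System.out.println(yearWith(FULL, url));
        // Only the middle matched, so the text before and after it survives.
        System.out.println(yearWith(PARTIAL, url));
        // No match at all, so the input comes back unchanged.
        System.out.println(yearWith(FULL, "https://example.com/about"));
    }
}
```

The last case is worth remembering: passing an unmatched result straight to Integer.parseInt would throw a NumberFormatException.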
What is $1? It refers to the text matched by a part of the regular expression enclosed in (). In the replacement string it can be treated like a variable: $1 is the first group, $2 the second, and $3 the third.
Setting strict accuracy aside, I think it is easiest to picture .+/entry/(\\d+)/(\\d+)/(\\d+)/.+ as .+/entry/$1/$2/$3/.+
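Since group references behave like variables, several of them can appear in one replacement string. The sketch below is only an illustration (the class GroupReferenceDemo and the method toIsoDate are hypothetical names, not part of the original code); it reuses the same expression to rebuild the date in a single call.

```java
class GroupReferenceDemo {
    // Rewrites a matching entry URL as "year-month-day" using $1, $2 and $3.
    // If the URL does not match, replaceFirst returns it unchanged.
    static String toIsoDate(String url) {
        return url.replaceFirst(".+/entry/(\\d+)/(\\d+)/(\\d+)/.+", "$1-$2-$3");
    }

    public static void main(String[] args) {
        String url = "https://hogefugapiyo.hatenablog.com/entry/2019/09/02/000000";
        System.out.println(toIsoDate(url)); // 2019-09-02
    }
}
```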
Because this is a string operation, the extracted result is naturally returned as a String.
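If the goal is numbers rather than strings, one possible alternative is to check the match first and read the groups directly. Note that this is a different technique from the replaceFirst approach in this article: it uses Matcher.matches() and Matcher.group(int). The class EntryDateParser and the method parse are hypothetical names, and returning Optional<LocalDate> is just one design choice.

```java
import java.time.LocalDate;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EntryDateParser {
    private static final Pattern ENTRY =
        Pattern.compile(".+/entry/(\\d+)/(\\d+)/(\\d+)/.+");

    // Returns the entry date, or an empty Optional when the URL does not match.
    // Assumption: digit runs are short enough for int and form a real date;
    // otherwise parseInt or LocalDate.of will throw.
    static Optional<LocalDate> parse(String url) {
        Matcher m = ENTRY.matcher(url);
        if (!m.matches()) {
            return Optional.empty();
        }
        int year = Integer.parseInt(m.group(1));
        int month = Integer.parseInt(m.group(2));
        int day = Integer.parseInt(m.group(3));
        return Optional.of(LocalDate.of(year, month, day));
    }

    public static void main(String[] args) {
        String url = "https://hogefugapiyo.hatenablog.com/entry/2019/09/02/000000";
        System.out.println(parse(url));                         // Optional[2019-09-02]
        System.out.println(parse("https://example.com/about")); // Optional.empty
    }
}
```

The explicit matches() check avoids handing an unmatched URL to Integer.parseInt.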
The same thing can be done with String.replaceFirst, as follows, but if you do it many times, it is more efficient to compile the Pattern first.
String year = /*URL string to compare*/.replaceFirst(".+/entry/(\\d+)/(\\d+)/(\\d+)/.+", "$1");
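String.replaceFirst(regex, replacement) is documented as equivalent to Pattern.compile(regex).matcher(str).replaceFirst(replacement), so it compiles the expression on every call. A minimal sketch of the precompiled variant might look like this (the class PrecompiledPatternDemo and the method years are hypothetical names chosen for the example):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class PrecompiledPatternDemo {
    // Compiled once and shared; Pattern instances are immutable and thread-safe.
    private static final Pattern ENTRY =
        Pattern.compile(".+/entry/(\\d+)/(\\d+)/(\\d+)/.+");

    // Extracts the year part from each URL with the shared Pattern.
    // URLs that do not match are returned unchanged by replaceFirst.
    static List<String> years(List<String> urls) {
        List<String> result = new ArrayList<>();
        for (String url : urls) {
            result.add(ENTRY.matcher(url).replaceFirst("$1"));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
            "https://hogefugapiyo.hatenablog.com/entry/2019/09/02/000000",
            "https://hogefugapiyo.hatenablog.com/entry/2020/01/15/123456");
        System.out.println(years(urls)); // [2019, 2020]
    }
}
```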
Regular expressions can be refined by debugging them on online regex testing sites (however, regex behavior differs between languages, so you should also write unit tests).