[Final version] Check the operation of the Japanese calendar using ICU to support the new era.

Update contents

- 2019/5/7: Added a section on Node.js.
- 2019/5/7: Changed the sample code to specify the Japanese calendar locale with a language tag, and added alternative usage of some APIs (no change in behavior).
- 2019/4/26: Updated the description of ICU 64.2 on Mac Homebrew.
- 2019/4/18: Added a description of ICU 64.2. It is no longer necessary to enable the tentative era.
- 2019/4/1: The new era was announced as Reiwa, so the related parts were updated.
- 2019/4/1: Added a description of ICU 64.1.

Introduction

This article explains what applications that embed ICU (International Components for Unicode), the cross-platform internationalization library, need to do to support the new era "Reiwa" (令和).

The content of this article is based on what is written here.

ICU status

ICU has included resources for the new era "Reiwa" since version 64.2, released on April 18, 2019 (Japan time). Before that, from version 63 (63.1 *), released on October 15, 2018, resources for the new era were already included in the form of a "tentative era", but with 64.2 **it is no longer necessary to enable the tentative era**. In addition, from version 64 (64.1), released on March 27, 2019, "first year" (元年) can be handled as a notation for both input and output, in addition to "1 year" (1年). At the moment, ICU's support for the new era specifically consists of the following two points.

* Note: The first release of a version is x.1, such as 63.1; there is no zero-based release such as 63.0.

  1. "Reiwa" was added as the era name starting May 1, 2019 (from 64.2)
  2. Both the "1 year" (1年) and "first year" (元年) notations can be handled (from 64.1)
  3. ~~A placeholder era named "QQ" (two full-width letters "Q") was added as the era name starting May 1, 2019~~ (63.1 through 64.1)
  4. ~~To use the placeholder, specify `ICU_ENABLE_TENTATIVE_ERA=true` as a JVM argument or an environment variable~~ (63.1 through 64.1)
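
As a rough illustration of points 1 and 2, here is a small Node.js sketch (Node.js is linked with ICU, as described in the bonus section later). It formats the last day of Heisei and the first day of Reiwa in the Japanese calendar. It assumes a Node.js build with full ICU data at 64.2 or later; the variable names are my own and not part of any ICU API.

```javascript
// Formatter for the Japanese calendar. The time zone is fixed to Japan so that
// the era boundary is evaluated in Japan time regardless of the machine's settings.
const eraFormatter = new Intl.DateTimeFormat('ja-JP-u-ca-japanese', {
  era: 'long',
  year: 'numeric',
  month: 'numeric',
  day: 'numeric',
  timeZone: 'Asia/Tokyo',
});

// Noon UTC is 21:00 on the same calendar day in Japan (UTC+9).
const lastHeiseiDay = new Date(Date.UTC(2019, 3, 30, 12)); // April 30, 2019
const firstReiwaDay = new Date(Date.UTC(2019, 4, 1, 12)); // May 1, 2019

// With Reiwa-aware ICU data the first line uses the Heisei era and the second
// uses the Reiwa era; from 64.1 onward year 1 is written as "元年".
console.log(eraFormatter.format(lastHeiseiDay));
console.log(eraFormatter.format(firstReiwaDay));
```

If the second line still shows Heisei, the runtime is most likely linked with an ICU that predates Reiwa support.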

Applications linked with an ICU version older than v63.1 can be tested simply by replacing the library with v64.2. However, "Reiwa" is displayed or accepted as input only if the application already supports Japanese calendar formats such as "Heisei" or "Showa" for input and output. Therefore, **if you currently handle only Western (Gregorian) calendar dates, replacing the library will not by itself give you support for the new era.**

The operation is summarized as follows.

Input in the Japanese calendar 63.1 or less or ICU_ENABLE_TENTATIVE_ERA=false 63.1 or more and ICU_ENABLE_TENTATIVE_ERA=true 64.2 or more (no environment variables required)
May 1, 2019 May 1, 2019 QQ May 1, 1st(63.1) /QQ May 1, 1st year(64.1) May 1, 1st year of Reiwa
QQ May 1, 1st Perspective error QQ May 1, 1st(63.1) /QQ May 1, 1st year(64.1) Perspective error
QQ May 1, 1st year Perspective error Perspective error(63.1) /QQ May 1, 1st year(64.1) Perspective error
May 1, 1st year of Reiwa Perspective error Perspective error(63.1) /Perspective error(64.1) May 1, 1st year of Reiwa

These results assume java.util.Calendar.setLenient(true) and com.ibm.icu.text.DateFormat.setCalendarLenient(true) (both are the default values and do not need to be specified). "平成31年5月1日" (May 1, Heisei 31) is a date that does not actually exist, but whether to treat it as a parse error depends on the application's specification. The behavior of setLenient() and setCalendarLenient() covers too much ground to be dealt with in this article.

Another topic related to the new era "Reiwa" is the handling of ligatures that represent an era with a single character, such as "㍼" and "㍻", but this is also not covered in this article.

"First year" notation is supported from ICU v64.1. However, please note that the v64.1 release was before the announcement of the new era "Reiwa", so it does not yet support "Reiwa". Please link with v64.2 or above.

The timeline can be summarized as follows.

| Date | ICU version | New era support |
|---|---|---|
| October 2018 | 63.1 | "QQ", enabled with ICU_ENABLE_TENTATIVE_ERA=true |
| March 2019 | 64.1 | "QQ", enabled with ICU_ENABLE_TENTATIVE_ERA=true; support for "1 year" = "first year" |
| April 2019 | 64.2 | "Reiwa" added (ICU_ENABLE_TENTATIVE_ERA not required) |

Reiwa support for past ICU versions

According to this pull request (ICU-20536), the Reiwa era appears to have been added to maintenance releases of past versions as well as to v64.2. I was able to confirm the following version numbers.

4.8.2 50.2 51.3 52.2 53.2 54.2 55.2 56.2 57.2 58.3 59.2 60.3 61.2 62.2 63.2
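
To make the list easier to use, the following Node.js sketch checks a version string against it. `icuHasReiwa` and `REIWA_MIN_MINOR` are hypothetical names of my own, and the table simply encodes the version numbers above plus v64.2; treating every release after 64 as Reiwa-capable is my assumption. `process.versions.icu` is the ICU version that the running Node.js binary is linked with.

```javascript
// Hypothetical table: for each ICU major version, the first maintenance
// release that includes Reiwa (taken from the list above, plus 64.2).
const REIWA_MIN_MINOR = {
  50: 2, 51: 3, 52: 2, 53: 2, 54: 2, 55: 2, 56: 2, 57: 2,
  58: 3, 59: 2, 60: 3, 61: 2, 62: 2, 63: 2, 64: 2,
};

// Returns true when the given ICU version string should contain Reiwa.
function icuHasReiwa(version) {
  const [major, minor = 0, patch = 0] = version.split('.').map(Number);
  if (major === 4) {
    // 4.8.2 is the only 4.x release in the list.
    return minor === 8 && patch >= 2;
  }
  if (major > 64) {
    return true; // assumption: releases after 64 keep the era
  }
  const minMinor = REIWA_MIN_MINOR[major];
  return minMinor !== undefined && minor >= minMinor;
}

// process.versions.icu is undefined when Node.js was built without Intl support.
const linkedIcu = process.versions.icu;
console.log('linked ICU:', linkedIcu, 'Reiwa:', linkedIcu ? icuHasReiwa(linkedIcu) : false);
```

A version check like this is only a hint; formatting an actual date after May 1, 2019 is the more reliable test.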

Compared to v64.2, these versions lack the following parts of the new-era support:

- There is no "first year" (元年) support.
- There is no support for the Reiwa ligature added in Unicode v12.1 (because older ICU versions are based on older Unicode versions).
- There is no collation support for the ligatures (the order ㍾ < ㍽ < ㍼ < ㍻ < (Reiwa ligature)).
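
As a small aside on the second point, the following Node.js sketch shows one way to see whether a runtime knows the Reiwa ligature (U+32FF). It relies on compatibility normalization (NFKC), which expands a square era ligature into its two kanji when the runtime's Unicode data includes the character. This is only an illustration under the assumption of a Unicode 12.1 or later runtime; the variable names are mine.

```javascript
// U+337B is the Heisei ligature (㍻); U+32FF is the Reiwa ligature added in Unicode 12.1.
const heiseiLigature = '\u337B';
const reiwaLigature = '\u32FF';

// NFKC normalization replaces each ligature with its compatibility decomposition.
// On a runtime whose Unicode data predates 12.1, U+32FF is returned unchanged.
console.log(heiseiLigature.normalize('NFKC'));
console.log(reiwaLigature.normalize('NFKC'));

// Unicode version of the ICU data linked into this Node.js binary.
console.log('Unicode version:', process.versions.unicode);
```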

What you can do right now

Now, I will explain how to link ICU version 64.2 (and past Reiwa-compatible ICU versions) to the application and check the operation.

C++ source code example using ICU4C

ICU (ICU4C) provides APIs for both C and C++; a C++ example is used here.

sample.cpp


#include <stdio.h>
#include <iostream>
#include "unicode/datefmt.h"
#include "unicode/dtfmtsym.h"
#include "unicode/gregocal.h"
#include "unicode/timezone.h"
#include "unicode/unistr.h"
#include "unicode/ustring.h"
#include "unicode/dtptngen.h"
#include "unicode/dtitvfmt.h"

using namespace icu;

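// A simple utility to prevent garbled characters on Windows
void myprintf(std::string format, UnicodeString ustr) {
	char abuf[0x100] = {0};
	// Pass the buffer capacity so that a long string cannot overflow abuf;
	// one byte is reserved so that the result is always NUL-terminated
	ustr.extract(0, ustr.length(), abuf, (uint32_t)(sizeof(abuf) - 1));
	printf(format.c_str(), abuf);
}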

int main( int argc, char **argv )
{
	UErrorCode status = U_ZERO_ERROR;
	// Locale for the Western (Gregorian) calendar
	Locale loc_jp1 = Locale::getJapanese();
	// Locale for the Japanese calendar, specified with a language tag
	Locale loc_jp2 = Locale("ja-u-ca-japanese");

	// (A) Create the pattern generators
	DateTimePatternGenerator *g_jp1 = DateTimePatternGenerator::createInstance(loc_jp1, status);
	if (U_FAILURE(status)) {
		return 1;
	}
	DateTimePatternGenerator *g_jp2 = DateTimePatternGenerator::createInstance(loc_jp2, status);
	if (U_FAILURE(status)) {
		return 1;
	}

	// (B) Get the best date format pattern for each locale
	status = U_ZERO_ERROR;
	UnicodeString up_jp1 = g_jp1->getBestPattern(UnicodeString("yyyyMMMd"), status);
	myprintf("pattern jp1: %s\n", up_jp1);
	status = U_ZERO_ERROR;
	UnicodeString up_jp2 = g_jp2->getBestPattern(UnicodeString("yyyyMMMd"), status);
	myprintf("pattern jp2: %s\n", up_jp2);

	// (C) Create the formatters from the format patterns
	status = U_ZERO_ERROR;
	SimpleDateFormat *df_jp1 = new SimpleDateFormat(up_jp1, loc_jp1, status);
	status = U_ZERO_ERROR;
	SimpleDateFormat *df_jp2 = new SimpleDateFormat(up_jp2, loc_jp2, status);

	// (A), (B), and (C) can also be written with DateFormat as follows
	//DateFormat *df_jp1 = DateFormat::createInstanceForSkeleton("yMMMd", loc_jp1, status);
	//DateFormat *df_jp2 = DateFormat::createInstanceForSkeleton("yMMMd", loc_jp2, status);

	if (argc < 2) {
		std::cout << "usage: " << argv[0] << " <date>" << std::endl;
		return 1;
	}
	UnicodeString uin = UnicodeString(argv[1]);
	myprintf("input:%s\n", uin);
	status = U_ZERO_ERROR;
	// First, parse as a Western calendar date
	UDate inDate = df_jp1->parse(uin, status);
	if (U_FAILURE(status)) {
		std::cout << "Parse error (" << u_errorName(status) << ") try another." << std::endl;
		status = U_ZERO_ERROR;
		// Next, parse as a Japanese calendar date
		inDate = df_jp2->parse(uin, status);
		if (U_FAILURE(status)) {
			std::cout << "Parse error (" << u_errorName(status) << ") again." << std::endl;
			return 1;
		}
	}
	//Output the entered date in each of the Western and Japanese calendars
	UnicodeString ud_jp1;
	df_jp1->format(inDate, ud_jp1);
	myprintf("output jp1:%s\n", ud_jp1);
	UnicodeString ud_jp2;
	df_jp2->format(inDate, ud_jp2);
	myprintf("output jp2:%s\n", ud_jp2);

	return 0;
}

Execution on macOS

macOS ships with its own ICU, but to test without modifying the OS's copy, use Homebrew (http://brew.sh/) to install ICU 64.2 (released on April 19, 2019).

$ brew install icu4c

The sample code above can be compiled as follows. The paths set in the environment variables can instead be passed as compiler arguments. If Homebrew installed ICU to a different location, adjust the paths accordingly.

$ export C_INCLUDE_PATH=/usr/local/opt/icu4c/include
$ export CPLUS_INCLUDE_PATH=/usr/local/opt/icu4c/include
$ export LIBRARY_PATH=/usr/local/opt/icu4c/lib
$ export LD_LIBRARY_PATH=/usr/local/opt/icu4c/lib
$ clang++ -licuio -licui18n -licutu -licuuc -licudata -std=c++1z sample.cpp

Run it as follows. Specify the input date, in the Western or Japanese calendar, as the first argument.

The execution examples are based on v64.2. For results under v63.2, read "first year" (元年) as "1 year" (1年).

(A) Input as Heisei and output in the Christian era and Reiwa

Depending on the setting of setLenient (), it can be processed even if you enter the date of the period of Reiwa in Heisei.

Since df_jp1 expects input in the Christian era, Parse error appears once.

$ ./a.out May 1, 2019
pattern jp1:yyyy year M month d day
pattern jp2:Gy year M month d day
input:May 1, 2019
Parse error (U_ILLEGAL_ARGUMENT_ERROR) try another.
output jp1:May 1, 2019
output jp2:May 1, 1st year of Reiwa

(B) Input in Heisei, output in the Western calendar and Heisei

With Reiwa support in place, the sample code above can no longer be forced to output dates on or after May 1, Reiwa 1 (令和元年5月1日) as Heisei dates.

(C) Input in the Christian era and output in the Christian era and Reiwa

I don't get a Parse error.

$ ./a.out May 1, 2019
pattern jp1:yyyy year M month d day
pattern jp2:Gy year M month d day
input:May 1, 2019
output jp1:May 1, 2019
output jp2:May 1, 1st year of Reiwa

(D) Input in the Western calendar, output in the Western calendar and Heisei

With Reiwa support in place, the sample code above can no longer be forced to output dates on or after May 1, Reiwa 1 (令和元年5月1日) as Heisei dates.

(E) Input in Reiwa and output in the Christian era and Reiwa

Since df_jp1 expects input in the Christian era, Parse error appears once.

$ ./a.out May 1, 1st year of Reiwa
pattern jp1:yyyy year M month d day
pattern jp2:Gy year M month d day
input:May 1, 1st year of Reiwa
Parse error (U_ILLEGAL_ARGUMENT_ERROR) try another.
output jp1:May 1, 2019
output jp2:May 1, 1st year of Reiwa

Even if you enter in "1 year", it will be output in "first year".

$ ./a.out May 1, 1st year of Reiwa
pattern jp1:yyyy year M month d day
pattern jp2:Gy year M month d day
input:May 1, 1st year of Reiwa
Parse error (U_ILLEGAL_ARGUMENT_ERROR) try another.
output jp1:May 1, 2019
output jp2:May 1, 1st year of Reiwa
(F) Input in Reiwa and output in the Christian era and Heisei

Depending on the setting of setLenient (), it can also be processed by entering the date of the Heisei period in Reiwa. In this case, it will be output in Heisei.

$ ./a.out April 1, 1st year of Reiwa
pattern jp1:yyyy year M month d day
pattern jp2:Gy year M month d day
input:April 1, 1st year of Reiwa
Parse error (U_ILLEGAL_ARGUMENT_ERROR) try another.
output jp1:May 1, 2019
output jp2:April 1, 2019
$ ./a.out Reiwa April 1, 1st
pattern jp1:yyyy year M month d day
pattern jp2:Gy year M month d day
input:April 1, 1st year of Reiwa
Parse error (U_ILLEGAL_ARGUMENT_ERROR) try another.
output jp1:May 1, 2019
output jp2:April 1, 2019

Running on Linux

Confirmed on Ubuntu 18.04 (64bit).

Download `icu4c-64_2-Ubuntu-18.04-x64.tgz` from the ICU site and extract it. The extracted directory is referred to as `$ICUPATH` here.

The rest can be done in much the same way as on macOS.

$ export C_INCLUDE_PATH=$ICUPATH/include
$ export CPLUS_INCLUDE_PATH=$ICUPATH/include
$ export LIBRARY_PATH=$ICUPATH/lib
$ export LD_LIBRARY_PATH=$ICUPATH/lib
$ clang++ -std=c++1z sample.cpp -licuio -licui18n -licutu -licuuc -licudata

Or

$ g++ sample.cpp -licuio -licui18n -licutu -licuuc -licudata

The execution method is the same as for macOS.

Execution on Windows (VC++)

Create a project in Visual Studio with Visual C++ > Windows Desktop > Windows Console Application. Here, the solution name is JapaneseNewEraICU.

Download `icu4c-64_2-Win64-MSVC2017.zip` from the ICU site and unzip it. The extracted folder is referred to as `%ICUPATH%` here.

Set the following values in the solution properties.

| Item | Value to set |
|---|---|
| Configuration Properties > VC++ Directories | |
| Include Directories | %ICUPATH%\include;$(IncludePath) |
| Library Directories | %ICUPATH%\lib64;$(LibraryPath) |
| Configuration Properties > Linker > Input | |
| Additional Dependencies | icuio.lib;icuin.lib;icutu.lib;icuuc.lib;%(AdditionalDependencies) |

When you build with these settings and run the program, you get the error "The program cannot start because icuin64.dll is missing from your computer. To resolve this issue, try reinstalling the program." This happens because the downloaded ICU DLLs cannot be found through the %PATH% environment variable.

To run the program, open a separate command prompt and first set the path:

C:\> set PATH=%ICUPATH%\bin64;%PATH%

Then run it, for example as follows:

C:\> .\JapaneseNewEraICU.exe February 16, 2020

Java source code example using ICU4J

Add `ICU4J` to the Maven dependencies.

pom.xml


<!-- https://mvnrepository.com/artifact/com.ibm.icu/icu4j -->
<dependency>
    <groupId>com.ibm.icu</groupId>
    <artifactId>icu4j</artifactId>
    <version>64.2</version>
</dependency>

Here is a code example equivalent to the C++ one.

test/Sample.java


package test;

import java.text.ParseException;
import java.util.Date;
import java.util.Locale;

import com.ibm.icu.text.DateTimePatternGenerator;
import com.ibm.icu.text.SimpleDateFormat;

public class Sample {

    public static void main(String[] args) {
        // Locale for the Western (Gregorian) calendar
        Locale loc_jp1 = Locale.JAPANESE;
        // Locale for the Japanese calendar, specified with a language tag
        Locale loc_jp2 = Locale.forLanguageTag("ja-u-ca-japanese");

        // (A) Create the pattern generators
        DateTimePatternGenerator g_jp1 = DateTimePatternGenerator.getInstance(loc_jp1);
        DateTimePatternGenerator g_jp2 = DateTimePatternGenerator.getInstance(loc_jp2);
        // (B) Get the best date format pattern for each locale
        String p_jp1 = g_jp1.getBestPattern("yyyyMMMd");
        System.out.format("pattern jp1: %s\n",p_jp1);
        String p_jp2 = g_jp2.getBestPattern("yyyyMMMd");
        System.out.format("pattern jp2: %s\n",p_jp2);
        // (C) Create the formatters from the format patterns
        SimpleDateFormat df_jp1 = new SimpleDateFormat(p_jp1, loc_jp1);
        SimpleDateFormat df_jp2 = new SimpleDateFormat(p_jp2, loc_jp2);

        // (A), (B), and (C) can also be written with DateFormat as follows
        // (together with import com.ibm.icu.text.DateFormat;)
        //DateFormat df_jp1 = DateFormat.getInstanceForSkeleton("yMMMd", loc_jp1);
        //DateFormat df_jp2 = DateFormat.getInstanceForSkeleton("yMMMd", loc_jp2);

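        if (args.length < 1) {
            System.err.println("usage: test.Sample <date>");
            return;
        }
        String input = args[0];
        System.out.format("input: %s\n", input);
        Date idate = null;
        try {
            // First, parse as a Western calendar date
            idate = df_jp1.parse(input);
        } catch (ParseException e) {
            e.printStackTrace();
        }
        if (idate==null) {
            try {
                // Next, parse as a Japanese calendar date
                idate = df_jp2.parse(input);
            } catch (ParseException e) {
                e.printStackTrace();
                // Neither format matched, so there is nothing to output
                return;
            }
        }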
        
        String out_jp1 = df_jp1.format(idate);
        System.out.format("output jp1: %s\n", out_jp1);
        String out_jp2 = df_jp2.format(idate);
        System.out.format("output jp2: %s\n", out_jp2);
    }
}

Besides running it in an IDE, you can run it directly from the command line as follows.

$ java -cp ~/.m2/repository/com/ibm/icu/icu4j/64.2/icu4j-64.2.jar:target/classes test.Sample 令和1年5月10日

(Bonus) Support with Node.js

Node.js v12.1, released on April 29, 2019, and later versions are linked with ICU v64.2. Therefore, Reiwa can now be used as the era name without any special settings.

Node.js has no library that exposes ICU directly, so the following example displays a date in the Reiwa period with an ordinary Date object.

test/sample.js


var date = new Date(Date.UTC(2019, 11, 20, 3, 0, 0));
var options = {
  era: 'short',
  year: 'numeric',
  month: 'narrow',
  day: 'numeric',
  weekday: 'narrow'
};
console.log(date.toLocaleString('ja-u-ca-japanese', options));

$ node test/sample.js
December 20, 1st year of Reiwa(Day)
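
The standard Intl API only formats dates; it has no parsing counterpart to the parse() calls used in the C++ and Java samples. The following is a hand-written sketch of the input side, under my own assumptions: `parseJapaneseDate` and `ERA_START_YEAR` are hypothetical names, only three eras are handled, and a date is accepted even when it falls outside the era (roughly what lenient parsing does in the examples above). It is not how ICU parses dates.

```javascript
// Hypothetical table (not an ICU API): Western year in which each era began.
const ERA_START_YEAR = { '昭和': 1926, '平成': 1989, '令和': 2019 };

// Parses strings such as "令和元年5月1日" or "令和1年5月1日" into a Date (UTC midnight).
// Returns null when the text does not match the expected shape.
function parseJapaneseDate(text) {
  const m = /^(昭和|平成|令和)(元|\d+)年(\d+)月(\d+)日$/.exec(text);
  if (!m) return null;
  const eraYear = m[2] === '元' ? 1 : Number(m[2]); // "元年" means year 1
  const year = ERA_START_YEAR[m[1]] + eraYear - 1;
  return new Date(Date.UTC(year, Number(m[3]) - 1, Number(m[4])));
}

const japaneseCalendar = new Intl.DateTimeFormat('ja-JP-u-ca-japanese', {
  era: 'long', year: 'numeric', month: 'numeric', day: 'numeric', timeZone: 'UTC',
});
const westernCalendar = new Intl.DateTimeFormat('ja-JP', {
  year: 'numeric', month: 'numeric', day: 'numeric', timeZone: 'UTC',
});

// The last two inputs name an era that does not match the date; the output
// falls back to the era that actually contains the date.
for (const input of ['令和元年5月1日', '令和1年5月1日', '令和1年4月1日', '平成31年5月1日']) {
  const date = parseJapaneseDate(input);
  console.log(input, '->', westernCalendar.format(date), '/', japaneseCalendar.format(date));
}
```

A real application would also need range checks and the older eras, so it is usually better to leave parsing to ICU4C or ICU4J where that is an option.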

What you can do once the new era is decided

As explained in "ICU Status" above, ICU release v64.2 with embedded "Reiwa" has been released.

If you have already thoroughly tested display and input of the tentative era "QQ" with ICU 63.1 / 64.1, replacing the library with ICU v64.2 completes your support for the new era.

One point to note is the locale setting that corresponds to the Japanese calendar. The locale ja-u-ca-japanese used as jp2 in the sample source code is the locale corresponding to the Japanese calendar. It also works in formats such as ja_JP_TRADITIONAL or ja_JP_JP. Since these values are rarely set by the user or system with the environment variable LANG etc., it is expected that the application will prepare its own logic to determine whether it operates in the Japanese calendar.
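
As a sketch of such application-side logic, the following Node.js example builds the locale from an application setting instead of relying on LANG. `dateLocaleFor` and its boolean flag are hypothetical names of my own; `Intl.Locale` and `resolvedOptions()` are standard ECMAScript Intl APIs.

```javascript
// Hypothetical application setting: true when dates should use the Japanese calendar.
function dateLocaleFor(useJapaneseCalendar) {
  // Intl.Locale turns the calendar option into the "-u-ca-japanese" extension.
  return useJapaneseCalendar
    ? new Intl.Locale('ja', { calendar: 'japanese' }).toString()
    : 'ja';
}

for (const flag of [false, true]) {
  const tag = dateLocaleFor(flag);
  // resolvedOptions() reports which calendar the formatter actually uses.
  const resolved = new Intl.DateTimeFormat(tag).resolvedOptions();
  console.log(tag, resolved.calendar);
}
```

Checking the resolved calendar is a quick way to confirm that the locale tag was understood by the runtime.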

Conclusion

The era has already changed, but I hope this article will help you with your support work going forward.
