This article describes performance issues that Alibaba engineers encountered during serialization when using the LocalDateTime and Instant time types.
By Lv Renqi
While stress-testing the new version of Apache Dubbo, I found a problem related to the attribute types of the Transfer Object (TO) class. Changing an attribute from Date to LocalDateTime reduced throughput from 50,000 to 20,000 and increased response time from 9 ms to 90 ms.
Of these changes, the one that concerned us most was the response time. Response time is, in many ways, the cornerstone of good performance numbers, because other performance indicators are only meaningful once a certain response time level is guaranteed. In stress tests, Gigabit Per Second (GPS) and Transaction Per Second (TPS) numbers are only acceptable if the target response time is met. Purely theoretical numbers are meaningless. In cloud computing, every bit of response time matters. A 0.1 ms increase in the response time of an underlying service means a 10% increase in overall cost.
Latency is the Achilles' heel of a system with remote users. Data packet delay increases by 1 ms for every 100 km. The latency between Hangzhou and Shanghai is about 5 ms, and the latency between Shanghai and Shenzhen is naturally higher because of the considerably larger distance. The direct result of latency is an increase in response time, which worsens the overall user experience and increases costs.
If a request modifies records in the same row in different units, the cost is very high, even if consistency and integrity can be ensured. If one service calls another on a request that needs to access a remote High Speed Service Framework (HSF) service or another remote database more than 10 times, the latency accumulates quickly, resulting in a snowball effect. (HSF is a distributed RPC service framework widely used within Alibaba.)
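The accumulation described above is simple arithmetic. The following is a minimal sketch, assuming the calls are made strictly one after another; the class and method names are hypothetical, and the 1,200 km distance is an illustrative value, not a figure from the measurements.

```java
public class LatencySnowball {
    // Rule of thumb quoted in the text: 1 ms of packet delay per 100 km.
    public static double propagationMillis(double distanceKm) {
        return distanceKm / 100.0;
    }

    // Sequential remote calls simply add their latencies together.
    public static double accumulatedMillis(double perCallMillis, int sequentialCalls) {
        return perCallMillis * sequentialCalls;
    }

    public static void main(String[] args) {
        // 5 ms is the Hangzhou-to-Shanghai figure quoted in the text.
        System.out.println(accumulatedMillis(5.0, 10));   // 50.0
        // Hypothetical 1,200 km link under the 1 ms per 100 km rule.
        System.out.println(propagationMillis(1200));      // 12.0
    }
}
```

Ten sequential calls at 5 ms each already cost 50 ms before any actual work is done, which is why per-call latency matters so much in a service chain.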
Time handling is everywhere in computer science. Without a rigorous notion of time, 99.99% of applications would be meaningless and impractical. This is especially true of the time-oriented custom processing found in most cloud monitoring systems today.
Before Java Development Kit 8 (JDK 8), java.util.Date was used to describe dates and times, and java.util.Calendar was used for time-related calculations. JDK 8 introduced more convenient time classes such as Instant, LocalDateTime, OffsetDateTime, and ZonedDateTime. In general, these classes make time processing more convenient.
Instant stores time stamps in Coordinated Universal Time (UTC) format and provides a machine-facing or internal time display. It is suitable for database storage, business logic, data exchange, and serialization scenarios.
LocalDateTime, OffsetDateTime, and ZonedDateTime represent human-facing date and time values and are used when users input or read data. OffsetDateTime and ZonedDateTime carry offset or time zone information (including daylight saving time rules), while LocalDateTime carries none and is interpreted in a local context. If the same point in time is output to different users, the displayed values differ. For example, the shipping time of an order is displayed to buyers and sellers in their respective local times. You can think of these three classes as outward-facing tools rather than as the internal working parts of your application.
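The shipping-time example can be sketched as follows. This is an illustration only: the class name, the method name, the sample instant, and the two zones are assumptions chosen for the demo, not values from the article.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class ShippingTimeDisplay {
    // Render one machine-facing Instant as the wall-clock time of a given zone.
    public static LocalDateTime localView(Instant instant, String zoneId) {
        return LocalDateTime.ofInstant(instant, ZoneId.of(zoneId));
    }

    public static void main(String[] args) {
        // Hypothetical shipping time, stored once as a UTC time stamp.
        Instant shippedAt = Instant.parse("2020-01-15T08:30:00Z");

        // The seller and the buyer see different local values for the same instant.
        System.out.println(localView(shippedAt, "Asia/Shanghai"));    // 2020-01-15T16:30
        System.out.println(localView(shippedAt, "America/New_York")); // 2020-01-15T03:30
    }
}
```

The stored value never changes; only the rendering does, which is the division of labor between Instant and the LocalDateTime family described above.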
In short, Instant is good for back-end services and databases, while LocalDateTime and its cohort are good for front-end services and displays. The two are interchangeable in theory, but in practice they serve different functions. The international business team has a wealth of experience and ideas in this regard.
Date and Instant are often used when integrating Dubbo with Alibaba's internal High Speed Service Framework (HSF).
To get an accurate picture of what lies behind the performance problem described earlier, we tried to reproduce it. Before that, let's look at the performance advantage of Instant through a brief demo. Consider the common scenario of formatting the current time, first with Date and then with Instant.
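// Legacy API: format the current time with Date and SimpleDateFormat.
// Note: "hh" is the 12-hour clock hour; "HH" would give the 24-hour value.
@Benchmark
@BenchmarkMode(Mode.Throughput)
public String date_format() {
    Date date = new Date();
    return new SimpleDateFormat("yyyyMMddhhmmss").format(date);
}

// JDK 8 API: format the current time with Instant and DateTimeFormatter.
@Benchmark
@BenchmarkMode(Mode.Throughput)
public String instant_format() {
    return Instant.now()
            .atZone(ZoneId.systemDefault())
            .format(DateTimeFormatter.ofPattern("yyyyMMddhhmmss"));
}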
We then ran the stress test locally for 30 seconds with four concurrent threads. The results are as follows.
Benchmark                      Mode  Cnt        Score  Error  Units
DateBenchmark.date_format     thrpt       4101298.589         ops/s
DateBenchmark.instant_format  thrpt       6816922.578         ops/s
From these results, we can conclude that Instant has the advantage in formatting performance. In fact, Instant has a performance advantage in other operations as well. For example, it has been found to perform well in date and time addition and subtraction.
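For reference, the addition operation mentioned above looks like this in both APIs. This is a minimal sketch for comparing the code paths, not a benchmark; the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Calendar;
import java.util.Date;

public class TimeArithmetic {
    // Legacy style: Date arithmetic goes through a mutable Calendar instance.
    public static Date plusHoursLegacy(Date date, int hours) {
        Calendar calendar = Calendar.getInstance();
        calendar.setTime(date);
        calendar.add(Calendar.HOUR_OF_DAY, hours);
        return calendar.getTime();
    }

    // JDK 8 style: Instant is immutable, and arithmetic returns a new value.
    public static Instant plusHours(Instant instant, long hours) {
        return instant.plus(Duration.ofHours(hours));
    }

    public static void main(String[] args) {
        Date legacy = plusHoursLegacy(new Date(0L), 2);
        Instant modern = plusHours(Instant.EPOCH, 2);
        // Both approaches land on the same point on the time line.
        System.out.println(legacy.toInstant().equals(modern));
    }
}
```

The Instant version avoids allocating and populating a Calendar, which is one plausible reason for the difference in speed, although this sketch does not measure it.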
Next, to reproduce the problem described above, we ran stress tests on both Java's built-in serialization and Hessian (the version optimized by Taobao) to see how performance changes during serialization and deserialization.
Hessian is the default serialization scheme for HSF 2.2 and Dubbo:
// dateSerializer, instantSerializer, and localDateTimeSerializer are the
// Hessian serializer helpers under test; their definitions are not shown here.

// Round trip of a Date through Hessian.
@Benchmark
@BenchmarkMode(Mode.Throughput)
public Date date_Hessian() throws Exception {
    Date date = new Date();
    byte[] bytes = dateSerializer.serialize(date);
    return dateSerializer.deserialize(bytes);
}

// Round trip of an Instant through Hessian.
@Benchmark
@BenchmarkMode(Mode.Throughput)
public Instant instant_Hessian() throws Exception {
    Instant instant = Instant.now();
    byte[] bytes = instantSerializer.serialize(instant);
    return instantSerializer.deserialize(bytes);
}

// Round trip of a LocalDateTime through Hessian.
@Benchmark
@BenchmarkMode(Mode.Throughput)
public LocalDateTime localDate_Hessian() throws Exception {
    LocalDateTime date = LocalDateTime.now();
    byte[] bytes = localDateTimeSerializer.serialize(date);
    return localDateTimeSerializer.deserialize(bytes);
}
The results are shown below. With the Hessian protocol, throughput drops sharply when Instant or LocalDateTime is used: it is roughly 100 times lower than with Date. Upon further investigation, we found that the serialized byte stream for Date is 6 bytes, while the stream for LocalDateTime is 256 bytes, which also increases the network bandwidth cost of transmission. With Java's built-in serialization, throughput drops slightly, but the difference is not substantial.
Benchmark                         Mode  Cnt        Score  Error  Units
DateBenchmark.date_Hessian       thrpt       2084363.861         ops/s
DateBenchmark.localDate_Hessian  thrpt         17827.662         ops/s
DateBenchmark.instant_Hessian    thrpt         22492.539         ops/s
DateBenchmark.instant_Java       thrpt       1484884.452         ops/s
DateBenchmark.date_Java          thrpt       1500580.192         ops/s
DateBenchmark.localDate_Java     thrpt       1389041.578         ops/s
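Serialized size is easy to inspect directly. The sketch below measures the byte count produced by Java's built-in serialization only; it does not use Hessian, so its numbers will not match the 6-byte and 256-byte Hessian figures quoted above. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.time.Instant;
import java.time.LocalDateTime;
import java.util.Date;

public class SerializedSize {
    // Number of bytes the value occupies under Java's built-in serialization.
    public static int javaSerializedSize(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        System.out.println("Date:          " + javaSerializedSize(new Date()));
        System.out.println("Instant:       " + javaSerializedSize(Instant.now()));
        System.out.println("LocalDateTime: " + javaSerializedSize(LocalDateTime.now()));
    }
}
```

Running the same kind of measurement against the serializer actually used on the wire is a quick way to spot types that carry a large class description with every value.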
Our analysis is as follows. First, Date is one of the eight primitive types that Hessian serializes natively. Second, Instant has to go through Class.forName for both serialization and deserialization, which causes a sharp drop in throughput and a sharp increase in response time. Therefore, Date has the advantage.
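To see why a natively supported type is so much cheaper, note that a point in time needs very few bytes once no class description has to travel with it. The following sketch uses plain java.io streams, not the Hessian API, and the 12-byte layout (epoch seconds plus nanoseconds) is an assumption made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.time.Instant;

public class CompactInstantCodec {
    // Write an Instant as 8 bytes of epoch seconds plus 4 bytes of nanoseconds.
    public static byte[] encode(Instant instant) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream(12);
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeLong(instant.getEpochSecond());
            out.writeInt(instant.getNano());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Read the two fields back in the same order; no class lookup is involved.
    public static Instant decode(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            long seconds = in.readLong();
            int nanos = in.readInt();
            return Instant.ofEpochSecond(seconds, nanos);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        byte[] bytes = encode(now);
        System.out.println(bytes.length + " bytes, round trip ok: " + decode(bytes).equals(now));
    }
}
```

A fixed layout like this is fast and small, but both sides must agree on it in advance, which is exactly the compatibility constraint that makes changing a shared serialization format hard.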
I found that Hessian can be upgraded and optimized by implementing com.alibaba.com.caucho.hessian.io.Serializer for classes such as Instant through an extension and registering it with SerializerFactory, which would solve the problem discussed in this article. However, this introduces compatibility issues with earlier and future versions, which is a serious problem. Alibaba's fairly complex dependencies make this approach impractical. Given this constraint, the only recommendation we can make is to use Date as the preferred time attribute type in TO classes.
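One way to follow this recommendation without giving up the JDK 8 API in business code is to keep Date as the field type and convert at the edges. The following is a minimal sketch under that assumption; OrderTO and its members are hypothetical names, not classes from Dubbo or HSF.

```java
import java.io.Serializable;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.util.Date;

public class OrderTO implements Serializable {
    private static final long serialVersionUID = 1L;

    // Date is the only time-typed field, so it is the only form sent on the wire.
    private Date shippedAt;

    public Date getShippedAt() {
        return shippedAt;
    }

    public void setShippedAt(Date shippedAt) {
        this.shippedAt = shippedAt;
    }

    // Convenience view for back-end logic; derived on demand, never stored.
    public Instant shippedAtInstant() {
        return shippedAt == null ? null : shippedAt.toInstant();
    }

    // Convenience view for display in a caller-supplied time zone.
    public LocalDateTime shippedAtIn(ZoneId zone) {
        return shippedAt == null ? null : LocalDateTime.ofInstant(shippedAt.toInstant(), zone);
    }

    public static void main(String[] args) {
        OrderTO order = new OrderTO();
        order.setShippedAt(new Date(0L));
        System.out.println(order.shippedAtIn(ZoneId.of("UTC"))); // 1970-01-01T00:00
    }
}
```

Because the conversions are plain methods rather than fields, the serialized shape of the class stays the same for old and new consumers.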
Technically, the HSF RPC protocol is a session layer protocol, and version recognition is also done here. However, the presentation layer of service data is implemented by a self-describing serialization framework like Hessian and lacks version recognition. Therefore, it is very difficult to upgrade.