[PYTHON] How to load test when using an external service API

Last time I wrote an overview of load testing. This time I will write about the load test of an external API that I ran as part of that work. This article expands on the item "You are hitting the API of an external service" from the list of things that make a load test difficult in the previous article, [Minimum load test knowledge obtained in the first API server load test].

Server configuration for the load test

The API server built this time calls the API of an external service internally.

[Image: スクリーンショット 2016-03-21 23.19.08.png]

The external service had no sandbox environment, so a load test that included the external service could not be run directly. Moreover, the system was designed so that a user is first authenticated on the external server and can call the other APIs only after authentication succeeds. If a problem occurs here, new users cannot do anything at all, so this part plays a critical role.

Purpose of load test for external services

--Investigate how much the performance of the external service's API affects our own server
  --When an external service is involved, the performance of your own server depends heavily on how the external service is performing. To deal with problems when they occur, you need to know in advance how large that influence is.

How to test

As mentioned at the beginning, the external service's API had no sandbox environment, so it was not possible to connect directly to the external service during the load test.

Therefore, I prepared a dummy server and ran the load test against it instead. As for how to test with a dummy server: I added just one convenient feature to the dummy server, and by using that feature I was able to measure the influence of the external server.

Dummy server conditions

--Accepts the same requests as the real external API and returns the same type of response.
  --This goes without saying, but the minimum requirement is that it behaves the same way as the external API you actually use.

--You can specify the response time.
  --This is the convenient feature used this time. By adjusting the response time, you can run load tests that assume several patterns of external server performance.

For example, the response time can be adjusted through the value of the X-Sleep header, as shown below.

curl -d '{"user":"hoge"}' https://dummy-server.jp/test/api -H "Content-type: application/json" -H "X-Sleep: 2000"
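The article does not show how the dummy server was implemented, so the following is only a minimal sketch of the idea using Python's standard library. It assumes that X-Sleep is given in milliseconds and that any JSON body is an acceptable response. The handler name, the `make_server` helper, the default port, and the response fields are all hypothetical.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class DummyExternalApiHandler(BaseHTTPRequestHandler):
    """Stand-in for the external API (hypothetical; the response body is a placeholder)."""

    def do_POST(self):
        # Read and discard the request body so the connection is left clean.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)

        # Assumption: X-Sleep is the requested delay in milliseconds.
        # A missing or invalid header means no artificial delay.
        try:
            sleep_ms = max(0, int(self.headers.get("X-Sleep", "0")))
        except ValueError:
            sleep_ms = 0
        time.sleep(sleep_ms / 1000.0)

        body = json.dumps({"result": "ok", "slept_ms": sleep_ms}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress per-request logging so it does not slow the server under load.
        pass


def make_server(host="127.0.0.1", port=8080):
    """Create (but do not start) a threaded dummy server."""
    return ThreadingHTTPServer((host, port), DummyExternalApiHandler)

# To run it: make_server().serve_forever()
```

A threaded server is used here so that one sleeping request does not block the others. A real dummy server would also need to mirror the external API's actual paths and response format.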

Load test method

--The normal value is a CPU usage of 50% or less.
--Create a Locust scenario that contains only the APIs linked to the external API, and apply load.
--Measure with a fixed set of dummy server API response times between 100ms and 3000ms, and observe how the dummy server's response time affects the results.
  --It is better to choose the response times to measure according to the configured timeout.
--Treat the point where the median becomes unstable or 502 errors start as the limit, and measure the RPS and CPU usage up to that point.
  --When the median jumps up all at once, performance has probably degraded because of the load.
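The rule for picking the limit can be made concrete with a small helper. This is a sketch, not the procedure actually used in the test. It assumes the figures for each load step (RPS, median response time, number of 502 errors, CPU usage) have been collected from the load tool, and it treats a median more than twice the previous step's median as "unstable". That factor and all names are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StepResult:
    rps: float          # requests per second reached in this load step
    median_ms: float    # median response time reported for this step
    errors_502: int     # number of 502 responses seen in this step
    cpu_percent: float  # CPU usage of the API server during this step


def find_limit(steps: List[StepResult], jump_factor: float = 2.0) -> Optional[StepResult]:
    """Return the last step before the median jumps or 502 errors appear.

    jump_factor is an assumed threshold: a step counts as unstable when its
    median exceeds jump_factor times the median of the last stable step.
    Returns None if even the first step already shows 502 errors.
    """
    last_stable = None
    for step in steps:
        if step.errors_502 > 0:
            break
        if last_stable is not None and step.median_ms > jump_factor * last_stable.median_ms:
            break
        last_stable = step
    return last_stable
```

The RPS and CPU usage of the returned step are the values to record for that dummy response time.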

[Image: スクリーンショット_2016-03-22_0_37_02.png]

Test results

Below is an example of the load test results.

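| Dummy API response time | RPS | CPU usage |
|---|---|---|
| 100ms | 230 | 90% |
| 200ms | 220 | 80% |
| 300ms | 200 | 75% |
| 400ms | 200 | 72% |
| 500ms | 190 | 65% |
| 600ms | 185 | 60% |
| 700ms | 175 | 60% |
| 800ms | 170 | 55% |
| 900ms | 155 | 50% |
| 1000ms | 140 | 45% |
| 1500ms | 110 | 30% |
| 2000ms | 85 | 20% |
| 3000ms | 60 | 15% |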

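From this result, it can be seen that once the response time reaches about 1000ms, the server hits its limit while CPU usage is still below the 50% normal value. In other words, performance deteriorates even though the CPU still has headroom.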

What follows from this result is that if the average response time of the external server is about 1000ms or more, it may affect our own API server.
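The 1000ms figure can be read off the table mechanically. The sketch below copies the example results above and looks for the shortest response time at which the limit is reached with CPU usage under the 50% normal value. The function and constant names are hypothetical.

```python
# (dummy API response time in ms, RPS at the limit, CPU usage % at the limit)
RESULTS = [
    (100, 230, 90), (200, 220, 80), (300, 200, 75), (400, 200, 72),
    (500, 190, 65), (600, 185, 60), (700, 175, 60), (800, 170, 55),
    (900, 155, 50), (1000, 140, 45), (1500, 110, 30), (2000, 85, 20),
    (3000, 60, 15),
]

NORMAL_CPU_PERCENT = 50  # the "normal value" from the test conditions


def first_response_time_below_normal_cpu(results, normal_cpu=NORMAL_CPU_PERCENT):
    """Return the shortest response time whose limit is hit below normal_cpu.

    Returns None if every row reaches the limit at or above normal_cpu.
    """
    for response_ms, _rps, cpu in sorted(results):
        if cpu < normal_cpu:
            return response_ms
    return None


threshold_ms = first_response_time_below_normal_cpu(RESULTS)
```

With the example table, `threshold_ms` is 1000.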

Possible measures according to the load test results

--Once the service is in production, measure the response time of the external server's API. If the average response time is 1000ms or more, you can quickly decide on measures to maintain performance, such as increasing the number of servers.
--In addition, the timeout setting for connections to the external service's API, and the decision on whether to retry, can be chosen based on the results above.
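As one possible way to act on the first measure, the sketch below keeps a moving average of recent external API response times and flags when it reaches the 1000ms threshold. The class, its method names, and the window size are assumptions for illustration. How the samples are collected and how servers are actually added is left to your own environment.

```python
from collections import deque


class ResponseTimeMonitor:
    """Tracks a moving average of external API response times (hypothetical helper)."""

    def __init__(self, threshold_ms=1000.0, window=100):
        self.threshold_ms = threshold_ms     # threshold taken from the load test result
        self.samples = deque(maxlen=window)  # keeps only the most recent `window` samples

    def record(self, elapsed_ms):
        """Store the response time of one external API call, in milliseconds."""
        self.samples.append(elapsed_ms)

    def average_ms(self):
        """Average over the current window, or 0.0 when there are no samples yet."""
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    def should_scale_out(self):
        """True when the moving average has reached the threshold."""
        return bool(self.samples) and self.average_ms() >= self.threshold_ms
```

A moving window is used rather than an all-time average so that a recent slowdown of the external service is not hidden by a long history of fast responses.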

In closing

This time I tested with this approach, but I still need to find out whether there is a better way. Even so, having some information in the form of numbers is more reassuring than doing nothing.
