When I wrote the previous article, the responses from the serverless environment I had built myself felt slow. Since containers are started via Docker internally, I expected it to be slower than an ordinary server, but how much does response speed actually degrade? There aren't many comparison articles out there, so I decided to measure it myself.
According to the article "A brief survey of the technical field of serverless architecture", AWS Lambda's processing time ranges from roughly 250 ms to 8000 ms. From this, we can predict that a similar serverless environment will have response times in the same ballpark, or a little slower.
I use a serverless environment called Iron Functions that runs on premises. I wrote an introductory article about it in the past, so please have a look there. Roughly speaking, it is a convenient product that lets you easily set up a serverless environment like AWS Lambda on your own infrastructure.
This time I benchmark three languages: Go, Node.js, and Python. The code does almost the same thing in each language. Let's see how big the difference is between running it on serverless and running it on an HTTP server built with each language's standard library (Native).
Go Serverless
package main

import (
    "encoding/json"
    "fmt"
    "os"
)

type Person struct {
    Name string
}

func main() {
    // Iron Functions passes the request body to the function on stdin.
    p := &Person{Name: "World"}
    json.NewDecoder(os.Stdin).Decode(p)
    // Whatever is written to stdout becomes the response body.
    fmt.Printf("Hello %v!", p.Name)
}
Go Native
package main

import (
    "encoding/json"
    "fmt"
    "net/http"
)

type Person struct {
    Name string
}

func handler(w http.ResponseWriter, r *http.Request) {
    // Decode the JSON request body and greet the given name.
    p := &Person{Name: "World"}
    json.NewDecoder(r.Body).Decode(p)
    fmt.Fprintf(w, "Hello %v!", p.Name)
}

func main() {
    http.HandleFunc("/", handler)
    http.ListenAndServe(":2000", nil)
}
Node.js Serverless
name = "World";
fs = require('fs');
try {
obj = JSON.parse(fs.readFileSync('/dev/stdin').toString())
if (obj.name != "") {
name = obj.name
}
} catch(e) {}
console.log("Hello", name, "from Node!");
Node.js Native
const http = require('http');

http.createServer((req, res) => {
    let body = "";
    req.on("data", (chunk) => { body += chunk; });
    req.on("end", () => {
        // Parse the accumulated request body and greet the given name.
        const obj = JSON.parse(body);
        res.writeHead(200, {'Content-Type': 'text/plain'});
        res.end('Hello ' + obj.name + " from Node Native!");
    });
}).listen(6000);
Python Serverless
import sys
sys.path.append("packages")  # allow dependencies bundled alongside the function
import os
import json

# Iron Functions passes the request body on stdin; stdout becomes the response.
name = "World"
if not os.isatty(sys.stdin.fileno()):
    obj = json.loads(sys.stdin.read())
    if obj["name"] != "":
        name = obj["name"]
print("Hello", name, "!!!")
Python Native
from http.server import BaseHTTPRequestHandler, HTTPServer
from json import loads
from io import TextIOWrapper

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly content-length bytes of the JSON body.
        content_length = int(self.headers.get('content-length'))
        text = TextIOWrapper(self.rfile).read(content_length)
        self.send_response(200)
        self.send_header('Content-type', 'text/plain')
        self.end_headers()
        obj = loads(text)
        self.wfile.write("Hello {name} !! Welcome to Native Python World!!".format(name=obj["name"]).encode("utf-8"))

PORT = 1000  # note: ports below 1024 normally require root privileges on Linux
server = HTTPServer(("127.0.0.1", PORT), Handler)
print("serving at port", PORT)
server.serve_forever()
Each benchmark server runs on the same machine: Ubuntu 16.04 on a virtual machine with 1 core and 2 GB of memory. The machine generating the load and the machine receiving it are the same, and Apache Bench is used. First, prepare the following JSON file:
johnny.json
{
    "name": "Johnny"
}
Load is applied by sending POST requests with Apache Bench: 100 requests with a concurrency of 5 (the reason the request count is so small is explained later). The time until the server returns a response (Response Time) is measured.
# fill in the port/path (XXXXX/XXXXX) as appropriate for each target
ab -n 100 -c 5 -p johnny.json -T "application/json" http://localhost:XXXXX/XXXXX
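Before relying on the ab numbers, it is handy to sanity-check each endpoint with a single timed request. The sketch below is not part of the original benchmark; it simply POSTs the johnny.json payload and prints the round-trip time. The URL targets the Go Native server from this article (port 2000); the serverless URL depends on how the route was created in Iron Functions, so treat it as a placeholder.

```go
// sanity_check.go: send one POST with the johnny.json payload and time the response.
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"time"
)

func main() {
	payload := []byte(`{"name":"Johnny"}`)
	url := "http://localhost:2000/" // placeholder: swap in the Iron Functions route for the serverless side

	start := time.Now()
	resp, err := http.Post(url, "application/json", bytes.NewReader(payload))
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Printf("response: %s (took %v)\n", body, time.Since(start))
}
```

The results of the ab runs themselves are summarized in the table below.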
Response Time | min[ms] | mean[ms] | std[ms] | median[ms] | max[ms] | Native ratio(mean) |
---|---|---|---|---|---|---|
Go Serverless | 3951 | 6579 | 1010 | 6512 | 8692 | 1644.75 |
Go Native | 0 | 4 | 5 | 2 | 37 | - |
Node Serverless | 5335 | 14917 | 3147 | 15594 | 20542 | 621.54 |
Node Native | 5 | 24 | 45 | 12 | 235 | - |
Python Serverless | 5036 | 13455 | 4875 | 14214 | 29971 | 840.94 |
Python Native | 6 | 16 | 4 | 16 | 26 | - |
**Note: the vertical axis of the figure below is on a logarithmic scale (the magnitude relationships are shown logarithmically to make them easier to see).**
As the table shows, **with Go, the serverless environment is more than 1600 times slower than the Native environment; Node.js is about 600 times slower and Python about 800 times slower.** Comparing the Python and Node.js results, it may seem strange that Python comes out faster. In follow-up tests of the Native environments, Python was sometimes faster when the number of requests and the concurrency were small, but at 10,000 requests or more Node.js handled the load more stably and finished faster than Python. In addition, Python's Native implementation sometimes returned errors and failed to process requests properly. Go, which shows a clear lead even at this small request count, is presumably just extraordinarily well tuned.

Now let's compare this with the figure quoted above, that AWS Lambda's processing time is 250 ms to 8000 ms. Personally, this benchmark's results did not feel surprising. When I sent requests to Iron Functions myself with curl, it already felt slow, and if a Docker container is spun up for every request, that is probably unavoidable. On the other hand, my impression was that AWS Lambda's 250 ms is very fast.
Looking into it, AWS Lambda appears to have two startup modes: cold start and warm start. The former matches the intuitive picture of serverless, where a container is created for each request; in the latter, a container that has already been created is reused. Because Lambda may not have to create a container at all, it can respond faster, and I think that is why it can answer in as little as 250 ms. Iron Functions, on the other hand, probably implements only cold starts, so it is not that fast. Still, for Go running on a self-built serverless environment, a maximum of about 8600 ms strikes me as a reasonable processing speed. Of course, the number of clients being served differs, but the cost of creating and disposing of a container itself probably does not change all that much.
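To make the cold start / warm start distinction concrete, here is a small conceptual sketch in Go. It is only an illustration of the idea, not how Lambda or Iron Functions is actually implemented: a dispatcher reuses an idle ("warm") container when one is available and only pays the startup cost when it has to create a new one.

```go
// Conceptual sketch of cold start vs. warm start dispatch (illustration only).
package main

import (
	"fmt"
	"time"
)

type container struct{ id int }

var warmPool = make(chan *container, 10) // idle containers kept around for reuse
var nextID int

func coldStart() *container {
	nextID++
	time.Sleep(500 * time.Millisecond) // stand-in for image pull / container create / runtime boot
	return &container{id: nextID}
}

func handle(req string) {
	var c *container
	select {
	case c = <-warmPool: // warm start: reuse an existing container
	default:
		c = coldStart() // cold start: pay the full startup cost
	}
	fmt.Printf("%s handled by container %d\n", req, c.id)
	warmPool <- c // keep the container warm for the next request
}

func main() {
	handle("request 1") // cold: no warm container exists yet
	handle("request 2") // warm: reuses container 1
}
```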
Below is a link to the AWS Lambda price list.
AWS Lambda pricing https://aws.amazon.com/jp/lambda/pricing/
The pricing seems to be based on execution time × allocated memory, and you cannot choose the CPU. So how is CPU allocated? The help explains that CPU power is allocated in proportion to the amount of memory you configure.
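As a rough, illustrative calculation of this duration × memory billing model (the memory size, duration, and rates below are example figures chosen for illustration, not values taken from the pricing page):

```go
// Rough illustration of "execution time × memory" billing for a Lambda-style service.
// All figures are example values only; check the pricing page for current rates.
package main

import "fmt"

func main() {
	const (
		memoryGB          = 0.125      // a 128 MB function
		durationSec       = 0.2        // 200 ms billed per invocation (ignoring billing granularity)
		requests          = 1000000.0  // one million invocations
		ratePerGBSecond   = 0.00001667 // example rate only
		ratePerMillionReq = 0.20       // example rate only
	)
	gbSeconds := memoryGB * durationSec * requests
	cost := gbSeconds*ratePerGBSecond + (requests/1000000.0)*ratePerMillionReq
	fmt.Printf("%.0f GB-seconds, approx cost: $%.2f\n", gbSeconds, cost)
}
```

The point is simply that the bill scales with memory × execution time (GB-seconds), plus a small per-request charge.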
Another article, "The story of the actual introduction of Serverless Framework", made a related point. In my own serverless experiments, the CPU usage of the host machine rises considerably, so I had been wondering how the hosting side (Amazon) makes this pay; it seems the hourly unit price is simply set fairly high. Also, since serverless by design can only return a response after starting up and then discarding a container, I suspect fairly powerful CPUs are provisioned so that responses stay fast.
Benchmarks are usually run with 10,000 requests, or across several patterns. In this experiment, however, the number of requests was limited to about 100, for two reasons. The first is that it is simply slow: on serverless, even the fastest responses take around 4000 ms, so large-scale request runs were not realistic. The second is instability: Iron Functions behaves somewhat erratically, and even with about 100 requests, roughly 10 of them may fail; increasing the concurrency or the request count makes it very likely that processing breaks down entirely. This also seems to depend on the life cycle of Iron Functions' Docker containers: the same request would sometimes time out and sometimes not, which made it hard to obtain accurate values. For that reason, rather than taking the processing-time figures in this article at face value, it is more accurate to read them as showing the order of magnitude relative to Native. Also, the fact that the machine generating the load and the machine receiving it are the same may make the benchmark slightly inaccurate. That is simply because I did not prepare two environments, but since we are only looking at the speed difference between the Native and serverless implementations under the same server conditions, it is probably not a big problem.
This benchmark took a very long time: posting was delayed considerably because roughly 600 requests in total took 3 to 4 hours. And the slowness of serverless turned out to be a really striking result. I had been thinking "let's use this", but I will hold off for a while... AWS Lambda, on the other hand, really is excellent. The speed of Go's HTTP server is also amazing; I did not expect even such a small benchmark to be that fast. It is also painful that there is almost no Japanese know-how about Iron Functions. There is actually a reverse proxy called fnlb, and an official way to cluster with it, which would make scaling easier; but single-node operation is already so slow that more tuning, or fixing the bottleneck, is probably essential first. Iron Functions itself is written in Go, so it should not be this slow... maybe the cost is somewhere around the Docker containers... Hmm. The road to serverless is still a long one.