The online book I am writing on Zenn, "Introduction to self-made Python web applications for a sluggish third year web engineer", has been updated.
Chapter "What is HTTP?" has been updated.
If you want to read more, please like the book or follow the author ;-)
The following is an excerpt of the contents of the book.
Up to the previous chapter, I built a "pseudo web server" by imitating Apache and Chrome. I would like to evolve it into a "minimally decent web server", but what exactly is a "minimally decent web server"?
To go any further I have to explain **HTTP**, so please bear with me a little longer.
Once you get through this part, you'll feel that "the rest is just writing code!"
Up to the previous chapter, you learned about TCP.
(To recap, TCP was a rule for sending "without omissions and in order.")
Being able to send "in order and without omissions" guarantees that a message reaches the other party exactly as it was sent. But then, **"what" (= what message) should you convey over TCP?**
Now consider a client sending a request to the server of a web service such as Google. Suppose what you want to convey in the request is `I want to search for the word hoge. Use fuga for the cookie.` (Cookies are explained in detail later in this book.)
The server is in trouble if each client conveys this information in the format they want.
For example:
Client 1) Mixes in Japanese ...
Cookieはfugaね、今回はwww.google.com/searchで検索ワードはhoge
Client 2) Uses commas, numbers, and colons as delimiters ...
1.www 2.google 3.com 4.search, word:: hoge, cookie=fuga
If the request is sent in a different format each time, the server will not know which part of the request indicates what information.
Therefore, a worldwide convention (= protocol) was established: **"When using a web service, use TCP and send messages in this format."**
In the case of the client request above, the message is supposed to be sent as follows:
GET /search?q=hoge HTTP/1.1
Cookie: fuga
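As a minimal sketch of how a client might assemble that message in Python: `build_request` is a hypothetical helper name, not part of any library, and the example deliberately mirrors the simplified request above (a real HTTP/1.1 request would also carry a `Host` header).

```python
from urllib.parse import urlencode


def build_request(path: str, query: dict, headers: dict) -> bytes:
    """Assemble a GET request in the HTTP message format (hypothetical helper)."""
    # Request line: method, request target, and protocol version, separated by spaces.
    target = f"{path}?{urlencode(query)}" if query else path
    request_line = f"GET {target} HTTP/1.1"
    # One "Name: value" line per header.
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    # Every line ends with CRLF, and an empty line marks the end of the headers.
    # Note: real HTTP/1.1 requests also need a Host header; it is omitted here
    # only to mirror the simplified example in the text.
    return "\r\n".join([request_line, *header_lines, "", ""]).encode("ascii")


message = build_request("/search", {"q": "hoge"}, {"Cookie": "fuga"})
print(message.decode("ascii"))
```

The resulting bytes are what would be written to the TCP connection; the line endings and the blank line are part of the format, not decoration.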
If every client in the world is known to send requests in this format, the web server can correctly parse (decompose) the message and extract the information, even from clients it has never seen before.
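To make "parsing" concrete, here is a rough sketch of what the server side might do with such a message. `parse_request` is a hypothetical name, and the sketch ignores many details that a real server has to handle (request bodies, repeated headers, malformed input, and so on).

```python
from urllib.parse import parse_qs, urlsplit


def parse_request(raw: bytes) -> dict:
    """Split a raw HTTP request into its parts (hypothetical, simplified helper)."""
    # The headers end at the first empty line; anything after it is the body.
    head, _, _body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    # The first line must have exactly three space-separated parts;
    # anything else raises ValueError here.
    method, target, version = lines[0].split(" ")
    url = urlsplit(target)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {
        "method": method,
        "path": url.path,
        "query": parse_qs(url.query),
        "version": version,
        "headers": headers,
    }


request = parse_request(b"GET /search?q=hoge HTTP/1.1\r\nCookie: fuga\r\n\r\n")
print(request["path"], request["query"], request["headers"])
```

Because the format is fixed, the server knows that the search word is under `query` (here `{'q': ['hoge']}`) and the cookie is under `headers`. Feeding the same function one of the free-form messages from the clients above fails at the first line, which is exactly the problem the convention solves.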
This convention is called **HTTP (HyperText Transfer Protocol)**.
::: details Column: Transport layer protocol and application layer protocol
The protocol for "what to tell" depends on the service you want to provide, because what you have to tell the other party differs from service to service.
For example, with an **email-sending service** you need to tell the other party the following:

- Your own email address
- The destination email address
- The subject
Similarly, with a **web service** you need to tell the other party the following:

- The URL of the web page you are requesting
- The type of request (whether you want to view a web page, submit the data of a completed form, and so on)
- Whether to use HTTP or HTTPS
- The value of the cookie (cookies are explained in detail later in this book)
Of course, email-sending services and web services will deliver messages to the other party in different formats, which means that the protocol will change.
(By the way, the SMTP protocol is used when sending mail, and POP is used when receiving mail.)
However, HTTP, SMTP, and POP are all conventions about the format of messages, and they all take it for granted that messages arrive "in order and without omissions" between sender and receiver. In other words, HTTP, SMTP, and POP are all **protocols premised on TCP communication**.
The structure is such that there is a "how to send" protocol first, and then there is a "what to send" protocol.
A well-known model for this kind of layered protocol structure is the OSI Reference Model (https://ja.wikipedia.org/wiki/OSI%E5%8F%82%E7%85%A7%E3%83%A2%E3%83%87%E3%83%AB).
In the OSI reference model, protocols concerned with "how to send" belong to the **transport layer**, and protocols concerned with "what to send" belong to the **application layer**.
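The split between "how to send" and "what to send" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `send_over_stream` is a hypothetical helper, `socket.socketpair()` is used as a local stand-in for a TCP connection so that the sketch runs without a network, and the email addresses are made up.

```python
import socket


def send_over_stream(payload: bytes) -> bytes:
    """Push bytes through a reliable, ordered byte stream and return what arrives."""
    # socketpair() gives two connected stream sockets; it stands in for a TCP
    # connection here (an assumption made to keep the example self-contained).
    sender, receiver = socket.socketpair()
    with sender, receiver:
        # Fine for small payloads; a large one would need a separate reader.
        sender.sendall(payload)
        sender.shutdown(socket.SHUT_WR)  # signal that nothing more will be sent
        chunks = []
        while True:
            data = receiver.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


# "What to send" differs per service (application layer) ...
http_message = b"GET /search?q=hoge HTTP/1.1\r\nCookie: fuga\r\n\r\n"
smtp_message = b"MAIL FROM:<alice@example.com>\r\nRCPT TO:<bob@example.com>\r\n"

# ... but "how to send" is the same for both (transport layer).
for message in (http_message, smtp_message):
    print(send_over_stream(message) == message)
```

The sending function never looks inside the payload. It only promises that the bytes arrive complete and in order, and interpreting them is left entirely to the protocol above it.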
So when a senior engineer asks, "Isn't that a transport-layer problem?", they mean: "I'm not talking about the contents or order of the response; I'm talking about the mechanism that delivers messages in order and without omissions."
Being able to grasp the meaning of such words immediately and accurately is one of the things that separates intermediate engineers from advanced ones.
:::
This HTTP rule, which is used worldwide, is defined by an organization called the IETF.
In addition to HTTP, the IETF has established many protocols and specifications related to Internet technology, and their detailed specifications and explanations are published as documents called RFCs.
RFCs are easy to search online and can be read by anyone.
For example, RFC2616, the RFC that describes the basics of HTTP, can be read online.
The whole picture of HTTP is written in RFC2616, so let's read it and study from it.
However, I suspect that those who opened the RFC link lost heart right away.
An RFC functions almost like a law underlying Internet technology all over the world. It systematically lays out the background, purpose, and detailed specifications of a protocol, so it is hard to digest.
To begin with, it's all in English, which is a bit tough.
However, these references can be surprisingly understandable if you read them after getting an overview.
**Understanding primary sources is a very important skill for growing as an engineer.** Being able to read official references properly, not just RFCs but also the documentation for frameworks and libraries, is essential for stepping up to an advanced level.
So, in the following, I will explain the outline of HTTP in my own words, and then read the relevant part of RFC from time to time.
Also, when referring to the RFC, I will use the following Japanese translation instead of the original text, to keep things simple. https://triple-underscore.github.io/rfc-others/RFC2616-ja.html
Thank you, Hidehiko Hashimoto.
In the protocol called HTTP, different rules (formats) are defined for each request and response. Let's take a look at each of these two formats in turn.
By the way, "request" on its own generally means "a message from a client to a server (regardless of format or communication method)". Among these, a message sent according to the HTTP rules is called an **HTTP request**. Similarly, a response that follows the HTTP rules is called an **HTTP response**.
For the rest, please read on in the book.