TCP in a Nutshell
Overview
TCP is a transport-layer protocol. It is more reliable than UDP because it establishes a connection and keeps track of what was delivered, whereas UDP sends datagrams without any check that they reach their destination. A TCP connection goes through three phases: a handshake, the actual communication, and finally termination. It is full-duplex, meaning that both hosts can send and receive over the same connection, and thus communication can go both ways at once. TCP does have a couple of disadvantages: if a chunk of the data is lost, the data that follows it cannot be delivered to the application until the missing chunk has been retransmitted; and on a slow link, the overhead of acknowledgments and retransmissions can turn the protocol into a bottleneck.
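To make the three phases concrete, here is a minimal Python sketch that uses the standard `socket` module over the loopback interface. The names `run_echo_once` and `demo` and the `ECHO: ` prefix are made up for illustration; the operating system performs the handshake and the termination on our behalf, so the code only shows where they happen.

```python
import socket
import threading


def run_echo_once(listener: socket.socket) -> None:
    """Accept one connection, send back what arrives with a prefix, then close."""
    conn, _addr = listener.accept()  # returns once the OS has completed a handshake
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"ECHO: " + data)  # server -> client on the same connection
    # Leaving the with block closes the socket, which starts the termination.


def demo() -> bytes:
    # Port 0 asks the OS for any free port on the loopback interface.
    with socket.create_server(("127.0.0.1", 0)) as listener:
        listener.settimeout(5)  # do not wait forever if nobody connects
        port = listener.getsockname()[1]
        worker = threading.Thread(target=run_echo_once, args=(listener,))
        worker.start()

        # create_connection() makes the OS perform the three-way handshake.
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(b"hello")  # client -> server
            chunks = []
            while True:  # read until the server closes its side
                chunk = client.recv(1024)
                if not chunk:
                    break
                chunks.append(chunk)
        worker.join()
        return b"".join(chunks)
```

Calling `demo()` sends `hello` in one direction and receives the prefixed reply in the other, which is the full-duplex behaviour described above.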
TCP three-way handshake
A TCP handshake is the sequence of steps by which one network host establishes a connection with another. Its main purpose is to confirm that the receiving host is ready to communicate, and it is one of the reasons that TCP is more reliable than UDP. The host that initiates the connection sends a packet with the SYN flag set. The receiver replies with a packet that has both the SYN and ACK flags set (commonly written SYN/ACK). Finally, the initiating host sends an ACK to acknowledge that it received the response, and this last step establishes the connection. Before looking at further details, it is worth mentioning that the TCP header (which is added to the data at the transport layer) has the following fields that the protocol needs in order to function properly:
- Sequence number: this field keeps track of the position of the data sent during the communication, so that the two hosts stay synchronized and the receiver can put the data back in order
- Acknowledgment number: the next sequence number that the sender of the packet expects to receive; during the handshake this is the other host’s sequence number + 1, which indicates that the packet in question was successfully received
- Flags: this field is used during the three-way handshake to determine how the packet should be handled
The header contains other fields, but these three are the most important ones for the three-way handshake.
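As an illustration of where these fields sit, the following Python sketch unpacks the start of a raw TCP header with the standard `struct` module. `TcpHeader` and `parse_tcp_header` are hypothetical names chosen for this example, and the sample header is built by hand rather than captured from a real connection.

```python
import struct
from typing import NamedTuple


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int    # sequence number
    ack: int    # acknowledgment number
    flags: int  # the six classic flag bits as one integer


def parse_tcp_header(raw: bytes) -> TcpHeader:
    """Extract ports, sequence/acknowledgment numbers and flags from a TCP header."""
    if len(raw) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    # Network byte order: two 16-bit ports, two 32-bit numbers, then a
    # 16-bit word holding the data offset, reserved bits and the flags.
    src, dst, seq, ack, offset_and_flags = struct.unpack("!HHIIH", raw[:14])
    return TcpHeader(src, dst, seq, ack, offset_and_flags & 0x3F)


# A hand-built 20-byte header for a SYN packet (all values are examples).
sample_syn = struct.pack(
    "!HHIIHHHH",
    54321,             # source port
    80,                # destination port
    3731298078,        # sequence number
    0,                 # acknowledgment number (unused in a SYN)
    (5 << 12) | 0x02,  # data offset of 5 words, SYN bit set
    64240,             # window size
    0,                 # checksum (left at zero in this sketch)
    0,                 # urgent pointer
)
header = parse_tcp_header(sample_syn)
```

Here `header.seq` holds the sequence number and `header.flags` holds the flag bits, which is all the handshake discussion needs.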
Now that we have an idea of the components of a packet, we can look at the aforementioned steps in more detail.
As mentioned previously, the flags field indicates the type of the packet in question, and hence its role. In the original specification the field is 6 bits long, and each bit position corresponds to one flag: URG, ACK, PSH, RST, SYN and FIN (later extensions added more, such as ECE and CWR). SYN/ACK is not a flag of its own; it is the name given to a packet that has both the SYN and ACK bits set. The URG and PSH flags are not relevant to our discussion. For information, PSH tells the receiving host to pass the data to the application right away instead of buffering it, while URG marks the packet as carrying urgent data that should be prioritized.
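The bit positions can be sketched in Python as follows. The masks are the standard values of the six classic flags; `FLAG_BITS` and `decode_flags` are names invented for this example.

```python
# Bit masks of the six classic TCP flags, from least to most significant bit.
FLAG_BITS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
}


def decode_flags(value: int) -> list[str]:
    """Return the names of the flags whose bit is set in value."""
    return [name for name, bit in FLAG_BITS.items() if value & bit]


syn_only = decode_flags(0x02)     # first step of the handshake
syn_and_ack = decode_flags(0x12)  # second step: two bits set in one packet
ack_only = decode_flags(0x10)     # third step
```

The second call shows why SYN/ACK is a combination rather than a separate flag: the value `0x12` is simply `0x02` (SYN) and `0x10` (ACK) set together.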
The host that initiates the sequence sends a SYN packet to indicate that it wants to synchronize with the other host (for the sake of clarity, we’ll refer to the target host as the server and the initiating host as the client). The packet contains a random sequence number (the ISN, or Initial Sequence Number), and the SYN bit is set in the flags field. Upon receiving the SYN packet, the server sends back a SYN/ACK packet, in which it uses its own ISN as the sequence number and sets the acknowledgment number to the client’s ISN + 1 to acknowledge that the SYN from the client was received. The client then sends a final ACK, whose acknowledgment number is the server’s ISN incremented by 1, to indicate that it received the SYN from the server.
Once the final ACK of the handshake has been sent and received, the connection is established, and the two hosts are ready to start exchanging data.
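The sequence and acknowledgment arithmetic of the three steps can be sketched in Python like this. `Segment` and `three_way_handshake` are hypothetical names, the ISNs in the usage lines are arbitrary example values, and the sketch only models the numbers, not real network traffic. It also assumes two facts that the text does not spell out: sequence numbers are 32-bit values that wrap around, and the final ACK carries the client’s ISN + 1 as its own sequence number.

```python
from dataclasses import dataclass
from typing import List, Optional

SEQ_MODULO = 2 ** 32  # sequence numbers are 32-bit and wrap around


@dataclass(frozen=True)
class Segment:
    kind: str           # "SYN", "SYN/ACK" or "ACK"
    seq: int            # sequence number
    ack: Optional[int]  # acknowledgment number, None when the ACK bit is not set


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments exchanged for the given initial sequence numbers."""
    # Step 1: the client announces its ISN.
    syn = Segment("SYN", client_isn, None)
    # Step 2: the server announces its own ISN and acknowledges client ISN + 1.
    syn_ack = Segment("SYN/ACK", server_isn, (syn.seq + 1) % SEQ_MODULO)
    # Step 3: the client acknowledges server ISN + 1.
    ack = Segment("ACK", (client_isn + 1) % SEQ_MODULO, (syn_ack.seq + 1) % SEQ_MODULO)
    return [syn, syn_ack, ack]


# Example ISNs; real stacks pick them unpredictably.
segments = three_way_handshake(client_isn=3731298078, server_isn=1000)
```

With these inputs, the SYN/ACK acknowledges 3731298079 and the final ACK acknowledges 1001.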
We can see in the first screenshot above the contents of a SYN packet captured using Wireshark. The second screenshot is the SYN/ACK packet from the same TCP “stream” (Wireshark refers to multiple packets belonging to the same connection as a stream, in our case stream 7 as denoted by its index), which is the second step of our three-way handshake as explained before. We can see that the sequence number (3731298078), which was picked randomly as a value for the SYN packet, was incremented by one and sent back as the ACK value in the SYN/ACK packet. The sequence number in the SYN/ACK packet, however, is a new value, also chosen randomly. We can also see that the bits are set accordingly in both screenshots, where the corresponding lines in the Flags field are 1 and thus serve to identify the step in the handshake.
Connection termination
Once the communication is complete, each host closes its side of the connection by sending a packet with the FIN flag set and then waiting to receive an acknowledgment. One of the devices sends the first FIN packet to indicate that it no longer needs the connection. After that device has sent its own ACK in response to the other device’s FIN, it waits for some time (double the MSL, or Maximum Segment Lifetime) to make sure that its ACK was received. The connection is then considered closed with no issues. If any problem is encountered during the connection, an RST packet is sent to terminate it abruptly.
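The termination steps, seen from the device that sends the first FIN, can be sketched as a small state table in Python. The state names come from the TCP specification rather than from this article, and the event labels, `ACTIVE_CLOSE_TRANSITIONS`, `walk_active_close` and `time_wait_seconds` are invented for this example. The sketch is simplified: it ignores the case where both sides send FIN at the same time, and the RST path.

```python
from typing import Iterable, List

# Simplified transitions for the side that closes first.
ACTIVE_CLOSE_TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2*MSL elapsed"): "CLOSED",
}


def walk_active_close(events: Iterable[str], start: str = "ESTABLISHED") -> List[str]:
    """Apply events in order and return every state visited, including start."""
    states = [start]
    for event in events:
        key = (states[-1], event)
        if key not in ACTIVE_CLOSE_TRANSITIONS:
            raise ValueError(f"event {event!r} is not valid in state {states[-1]}")
        states.append(ACTIVE_CLOSE_TRANSITIONS[key])
    return states


def time_wait_seconds(msl_seconds: float) -> float:
    """Length of the final wait: double the Maximum Segment Lifetime."""
    return 2 * msl_seconds


visited = walk_active_close(
    ["send FIN", "recv ACK", "recv FIN, send ACK", "2*MSL elapsed"]
)
```

The MSL passed to `time_wait_seconds` is an input here because its actual value depends on the implementation.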