Transmission Control Protocol (TCP)

This chapter is a précis of Chapter 25 of Doug Comer's book [Comer 2004] with additional material from Section 6.5 of Andy Tanenbaum's book [Tanenbaum 2003].

Transport protocols provide reliability, which is fundamental for many applications. The Transmission Control Protocol (TCP) is the transport-level protocol that provides reliability in the TCP/IP protocol suite. TCP was defined in [RFC 793], with clarifications and bug fixes in [RFC 1122]. From an application program's perspective, TCP offers seven major features.

TCP uses IP to carry its messages, known as segments. Each TCP segment is encapsulated in an IP datagram and sent across the Internet. Figure 1 illustrates that TCP treats IP as an end-to-end delivery service.

Figure 1: Example internet

As illustrated, TCP software is required at both ends of the virtual connection, but not on intermediate routers. From TCP's point of view, the entire Internet is a communication system capable of accepting and delivering messages without changing or altering their contents.

Packet Loss and Retransmission

TCP copes with the loss of packets using a technique called retransmission. When TCP data arrives, an acknowledgement is sent back to the sender. When TCP data is sent, a timer is started. If the timer expires before an acknowledgement arrives, TCP retransmits the data. This is illustrated in Figure 2.

Figure 2: Example retransmission

The host on the left is sending data, and the host on the right is receiving data. TCP must be ready to retransmit any packet that is lost. How long should TCP wait before retransmitting? The TCP software does not know in advance whether it is using a local area network (acknowledgements within a few milliseconds) or a long-distance satellite connection (acknowledgements within a few seconds).
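The timer-and-retransmit rule can be sketched as a toy simulation. This is only an illustration, not real TCP: the function name, the losses table and the max_tries limit are assumptions made for the example, and the sender waits for each acknowledgement before sending the next segment.

```python
def send_with_retransmission(segment_count, losses, max_tries=5):
    """Simulate a sender that retransmits when its timer expires.

    losses maps a segment index to the number of transmissions of that
    segment which the (hypothetical) network drops before one succeeds.
    Returns a log of (event, segment index, attempt number) tuples.
    """
    log = []
    for index in range(segment_count):
        losses_left = losses.get(index, 0)
        for attempt in range(1, max_tries + 1):
            log.append(("send", index, attempt))
            if losses_left > 0:
                # No acknowledgement arrives, so the timer expires
                # and the loop retransmits the same segment.
                losses_left -= 1
                log.append(("timeout", index, attempt))
                continue
            log.append(("ack", index, attempt))
            break
        else:
            raise RuntimeError(f"segment {index} was never acknowledged")
    return log


# Three segments; the first transmission of segment 1 is lost.
for entry in send_with_retransmission(3, {1: 1}):
    print(entry)
```

The log shows segment 1 being sent twice, with a timeout in between, while segments 0 and 2 get through on the first attempt.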

Adaptive Retransmission

TCP estimates the round-trip delay for each active connection. For each connection, TCP generates a sequence of round-trip estimates and uses a statistical function to produce a weighted average. It also maintains an estimate of the variance and uses a linear combination of the estimated mean and variance as the value of the timeout. This is illustrated in Figure 3.

Figure 3: Timeout and retransmission

In (a), we have a connection with a relatively long round-trip delay. In (b), we have a connection with a shorter round-trip delay. The goal is to wait long enough to decide that a packet was lost, without waiting longer than necessary. When delays start to vary, TCP adjusts the timeout to a value greater than the mean to accommodate peaks.
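A minimal sketch of such an estimator follows. The text above does not give the exact function, so the constants here (alpha = 1/8, beta = 1/4, k = 4) and the use of the mean deviation in place of a true variance are assumptions borrowed from the commonly used Jacobson-style calculation; the class and method names are made up for the example. Real implementations also apply a minimum timeout and clock granularity, which are omitted.

```python
class RttEstimator:
    """Weighted-average round-trip estimator (illustrative only)."""

    def __init__(self, alpha=0.125, beta=0.25, k=4.0):
        self.alpha = alpha    # weight of a new sample in the mean
        self.beta = beta      # weight of a new sample in the deviation
        self.k = k            # how many deviations to add to the mean
        self.srtt = None      # smoothed round-trip time, in seconds
        self.rttvar = None    # smoothed mean deviation, in seconds

    def update(self, sample):
        """Fold one round-trip measurement in and return the new timeout."""
        if self.srtt is None:
            # First measurement: assumed initialisation.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            deviation = abs(self.srtt - sample)
            self.rttvar = (1 - self.beta) * self.rttvar + self.beta * deviation
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * sample
        return self.timeout()

    def timeout(self):
        """Linear combination of the estimated mean and deviation."""
        return self.srtt + self.k * self.rttvar


estimator = RttEstimator()
for sample in [0.1] * 10 + [0.5]:
    print(f"sample={sample:.3f}s timeout={estimator.update(sample):.3f}s")
```

With steady samples the timeout settles towards the mean; the single slow sample at the end pushes the timeout well above the mean, which is the behaviour described for varying delays.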

Flow Control

TCP uses a window mechanism to control the flow of data. When a connection is established, each end of the connection allocates a buffer to hold incoming data, and sends the size of the buffer to the other end. As data arrives, the receiver sends acknowledgements together with the amount of buffer space available, which is called a window advertisement.

If the receiving application can read data as quickly as it arrives, the receiver will send a positive window advertisement with each acknowledgement. However, if the sender is faster than the receiver, e.g. because it has a faster CPU, incoming data will eventually fill the receiver's buffer, causing the receiver to advertise a zero window. A sender that receives a zero window advertisement must stop sending until it receives a positive window advertisement. This is illustrated in Figure 4.

Figure 4: TCP flow control

The sender is using a maximum segment size of 1,000 octets. The receiver initially advertises a window size of 2,500 octets. The sender transmits three segments (two containing 1,000 octets and one containing 500 octets) and then waits for an acknowledgement. The first three segments fill the receiver's buffer faster than the receiving application can consume the data, so the advertised window reaches zero. After the application reads 2,000 octets, the receiving TCP sends an additional acknowledgement advertising a window of 2,000 octets.

The sender responds by sending two 1,000-octet segments, resulting in another zero window. The application reads 1,000 octets, so the receiving TCP sends an acknowledgement with a positive window size.
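The exchange in Figure 4 can be replayed with a small model. This is a simplified sketch: the Receiver class and the send_until_window_closes helper are invented for the example, and each acknowledgement is assumed to reach the sender before it sends the next segment.

```python
class Receiver:
    """Receiver with a fixed buffer that advertises its free space."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.buffered = 0          # octets waiting for the application

    @property
    def window(self):
        return self.buffer_size - self.buffered

    def receive(self, octets):
        """Accept a segment and return the window carried by the ACK."""
        if octets > self.window:
            raise ValueError("sender overran the advertised window")
        self.buffered += octets
        return self.window

    def application_read(self, octets):
        """The application consumes data; return the new advertisement."""
        self.buffered -= min(octets, self.buffered)
        return self.window


def send_until_window_closes(receiver, mss):
    """Send segments of at most mss octets until the window is zero."""
    sent = []
    window = receiver.window
    while window > 0:
        size = min(mss, window)
        window = receiver.receive(size)
        sent.append(size)
    return sent


receiver = Receiver(buffer_size=2500)
print(send_until_window_closes(receiver, mss=1000))   # three segments
print(receiver.application_read(2000))                # window reopens
print(send_until_window_closes(receiver, mss=1000))   # two more segments
print(receiver.application_read(1000))                # positive window again
```

The segment sizes and window values follow the figures quoted above: 1,000 + 1,000 + 500 octets close the window, reading 2,000 octets reopens it, and two more full segments close it again.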

TCP Connection Management

TCP uses a single format for all segments. This is illustrated in Figure 5.

Figure 5: TCP segment format

As with UDP, TCP uses its own port numbers to identify processes. A socket identifies one end of a connection and consists of an IP address and a port number, e.g. 193.61.29.127:80. A connection is identified by the pair of sockets at its two ends, which together form a virtual circuit identifier. Port numbers below 1024 are called well-known ports and are reserved for standard services. A few of these are listed in Figure 6.

Port  Protocol  Description
21    FTP       File transfer
23    Telnet    Remote login
25    SMTP      E-mail
69    TFTP      Trivial file transfer protocol
79    Finger    Lookup information about a user
80    HTTP      World wide web
110   POP-3     Remote e-mail access
119   NNTP      Usenet news

Figure 6: TCP server port numbers

If a well-known service is provided by both TCP and UDP, the port number is usually chosen to be the same for both transport layers. This is purely for convenience and is not required by the protocols.
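To make the idea of a socket pair concrete, the sketch below keys a table of connections on the pair of end points. The Socket and SocketPair types and the client addresses are hypothetical; only the server socket 193.61.29.127:80 comes from the text above.

```python
from typing import NamedTuple


class Socket(NamedTuple):
    """One end of a connection: an IP address and a port number."""
    address: str
    port: int

    @classmethod
    def parse(cls, text):
        """Parse the 'address:port' notation used in the text."""
        address, _, port = text.rpartition(":")
        return cls(address, int(port))


class SocketPair(NamedTuple):
    """Both ends of a connection; together they identify it."""
    local: Socket
    remote: Socket


server = Socket.parse("193.61.29.127:80")

# Several clients (made-up addresses) share the same server socket,
# yet every connection has a distinct socket pair.
connections = {}
for client_text in ("10.0.0.5:40001", "10.0.0.5:40002", "10.0.0.9:40001"):
    pair = SocketPair(local=server, remote=Socket.parse(client_text))
    connections[pair] = "ESTABLISHED"

print(len(connections), "connections share", server)
```

Because the key is the whole pair, a single well-known port can serve many connections at once without ambiguity.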

TCP Code Bits

TCP Connection Establishment

Connections are established by means of a three-way handshake. One side executes a CONNECT primitive, specifying the destination IP address, destination port, window size, and optionally some user data. This is delivered in a TCP segment with the SYN flag on, the ACK flag off, and an Initial Sequence Number (ISN) which is randomly chosen. This is illustrated in Figure 7.

Figure 7: TCP open connection

In (a), host 1 opens the connection with an ISN of x. Host 2 has previously performed a LISTEN primitive on the appropriate port. It accepts the connect request by sending a TCP segment which acknowledges host 1's request (ACK flag on and the ACKNOWLEDGEMENT NUMBER set to x+1) and carries its own connection request (SYN flag on with an ISN of y). Host 1 acknowledges this request (ACK flag on and the ACKNOWLEDGEMENT NUMBER set to y+1). Note that the SYN flag consumes one byte of sequence space so that it can be acknowledged unambiguously.

In (b), both hosts attempt a connection request at the same time. Only one connection is established, not two, because a connection is identified by its end points, which are the same in this case.
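The numbering in case (a) can be written out as a short sketch. The Segment type and the three_way_handshake function are illustrative names, not part of any real API; the arithmetic is done modulo 2^32 because TCP sequence numbers are 32-bit values.

```python
from dataclasses import dataclass
from typing import Optional

SEQUENCE_SPACE = 2 ** 32   # sequence numbers wrap around at 32 bits


@dataclass(frozen=True)
class Segment:
    """Only the header fields that matter for the handshake."""
    syn: bool = False
    ack: bool = False
    seq: int = 0
    ack_no: Optional[int] = None


def three_way_handshake(x, y):
    """Return the three segments exchanged for ISNs x (host 1) and y (host 2)."""
    # Host 1 -> host 2: connection request, SYN on, ACK off.
    syn = Segment(syn=True, seq=x)
    # Host 2 -> host 1: acknowledges x (the SYN used one sequence number)
    # and makes its own request with ISN y.
    syn_ack = Segment(syn=True, ack=True, seq=y,
                      ack_no=(x + 1) % SEQUENCE_SPACE)
    # Host 1 -> host 2: acknowledges host 2's SYN.
    final_ack = Segment(ack=True, seq=(x + 1) % SEQUENCE_SPACE,
                        ack_no=(y + 1) % SEQUENCE_SPACE)
    return [syn, syn_ack, final_ack]


for segment in three_way_handshake(x=100, y=300):
    print(segment)
```

The values 100 and 300 are arbitrary; a real implementation chooses its ISN as described above.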

TCP Connection Release

A similar three-segment exchange can be used to terminate a connection, as illustrated in Figure 8.

Figure 8: TCP close connection

In this example, host 1 terminates the connection by transmitting a segment with the FIN flag set, optionally containing data. Host 2 acknowledges this (the FIN flag also consumes one byte of sequence space) and sets its own FIN flag. The third and last segment contains host 1's acknowledgement of host 2's FIN flag.

TCP Connection Management

Although complex, the steps to manage a TCP connection can be represented in a finite state machine with the eleven states listed in Figure 9.

Figure 9: TCP states

Each connection starts in the CLOSED state and leaves that state with either a passive open (LISTEN) or an active open (CONNECT). After the three-way handshake the connection is ESTABLISHED. Either side can initiate connection release. When this is completed, the state returns to CLOSED.

The finite state machine is shown in Figure 10. The common case of a client actively connecting to a passive server is shown with heavy lines: solid for the client, dotted for the server.

Figure 10: TCP connection management finite state machine

The lightface lines are unusual event sequences. Each line is marked by an event/action pair. The event can be either a user-initiated system call (CONNECT, LISTEN, SEND, or CLOSE), a segment arrival (SYN, FIN, ACK, or RST), or a timeout (in one case).
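The heavy-line paths can be captured in a small transition table. This is a partial sketch under stated assumptions: the state names follow RFC 793 (written with underscores), only the common client and server paths are included, the unusual lightface transitions are left out, and the close is shown with each FIN acknowledged separately. The run helper is made up for the example.

```python
# (current state, event) -> (action sent in response, next state)
TRANSITIONS = {
    # Client: active open, then active close.
    ("CLOSED", "CONNECT"): ("SYN", "SYN_SENT"),
    ("SYN_SENT", "SYN+ACK"): ("ACK", "ESTABLISHED"),
    ("ESTABLISHED", "CLOSE"): ("FIN", "FIN_WAIT_1"),
    ("FIN_WAIT_1", "ACK"): (None, "FIN_WAIT_2"),
    ("FIN_WAIT_2", "FIN"): ("ACK", "TIME_WAIT"),
    ("TIME_WAIT", "TIMEOUT"): (None, "CLOSED"),
    # Server: passive open, then passive close.
    ("CLOSED", "LISTEN"): (None, "LISTEN"),
    ("LISTEN", "SYN"): ("SYN+ACK", "SYN_RECEIVED"),
    ("SYN_RECEIVED", "ACK"): (None, "ESTABLISHED"),
    ("ESTABLISHED", "FIN"): ("ACK", "CLOSE_WAIT"),
    ("CLOSE_WAIT", "CLOSE"): ("FIN", "LAST_ACK"),
    ("LAST_ACK", "ACK"): (None, "CLOSED"),
}


def run(events, state="CLOSED"):
    """Apply events in order; return (event, action, new state) tuples."""
    trace = []
    for event in events:
        action, state = TRANSITIONS[(state, event)]
        trace.append((event, action, state))
    return trace


client = run(["CONNECT", "SYN+ACK", "CLOSE", "ACK", "FIN", "TIMEOUT"])
server = run(["LISTEN", "SYN", "ACK", "FIN", "CLOSE", "ACK"])
for step in client:
    print("client", step)
for step in server:
    print("server", step)
```

Each entry is one event/action pair from the diagram; an event with no entry for the current state raises a KeyError, which stands in for the transitions this sketch does not model.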

TCP Transmission Policy

When the window size is zero, the sender cannot send segments, with two exceptions. First, urgent data may be sent, e.g. to allow the user to kill the process running on a remote host. Second, the sender may send a 1-byte segment to make the receiver reannounce the next byte expected and the window size. This prevents deadlock if a window advertisement gets lost.

Senders are not required to transmit data as soon as it arrives from the application. Neither are receivers required to send acknowledgements as soon as possible. This freedom can be exploited to improve performance.

Consider a telnet connection to an interactive editor that reacts to every keystroke. In the worst case, when a character arrives at the TCP software, a 21-byte TCP segment is given to IP to send as a 41-byte datagram. At the receiving side, TCP returns a 40-byte acknowledgement (20-byte TCP header + 20-byte IP header). When the editor reacts to the character, it echoes it as a 41-byte packet which is acknowledged with a 40-byte packet. In all, 162 bytes and four segments are sent for each character typed. It makes you wonder how the Internet survives!
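The arithmetic behind these figures is easy to check. The sketch assumes 20-byte IP and TCP headers with no options, as in the text; the names are chosen for the example.

```python
IP_HEADER = 20    # bytes, no options
TCP_HEADER = 20   # bytes, no options


def datagram_size(payload):
    """Size of the IP datagram carrying a TCP segment with this payload."""
    return IP_HEADER + TCP_HEADER + payload


keystroke = datagram_size(1)        # one typed character
acknowledgement = datagram_size(0)  # a pure acknowledgement

# Character and its ACK, then the echo and its ACK.
total = 2 * (keystroke + acknowledgement)
useful = 2                          # the character and its echo
print(keystroke, acknowledgement, total)
print(f"useful fraction: {useful / total:.1%}")
```

Only two of the 162 bytes are user data, i.e. a little over one per cent of the traffic.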

There are a number of problems that can degrade TCP performance. One of these is the silly window syndrome. This problem occurs when data are passed to the sending TCP in large blocks, but an application on the receiving side reads data one byte at a time. This is illustrated in Figure 11.

Figure 11: Silly window syndrome

The TCP buffer on the receiving side is full and the sender knows this (window size is zero). The application reads one byte. TCP sends a window update of one. The sender obliges and sends one byte. The buffer is now full so the receiver acknowledges the byte and sets the window size to zero.

One solution is to prevent the receiver from sending window updates for small amounts of free space. The sender can also help by not sending small segments.
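A receiver-side rule of this kind might look like the sketch below. The text does not say how small is too small, so the threshold used here (the smaller of one maximum segment and half the buffer) is an assumption based on the usual description of Clark's solution, and the function name is made up.

```python
def advertised_window(free_space, buffer_size, mss):
    """Window to advertise, suppressing silly (tiny) window updates.

    Assumed rule: keep advertising zero until the free space reaches
    the smaller of one maximum segment or half of the buffer.
    """
    threshold = min(mss, buffer_size // 2)
    if free_space >= threshold:
        return free_space
    return 0


# The application has read one byte from a full 4,096-byte buffer.
print(advertised_window(free_space=1, buffer_size=4096, mss=1000))
# Later, a full segment's worth of space has been freed.
print(advertised_window(free_space=1000, buffer_size=4096, mss=1000))
```

With this rule the one-byte window update from Figure 11 is never sent; the sender stays blocked until it can send a worthwhile segment.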

References

  1. Douglas Comer, Computer Networks and Internets with Internet Applications (fourth edition), Prentice Hall, Upper Saddle River, NJ, 2004, ISBN 0-13-143351-2. http://netbook.cs.purdue.edu
  2. RFC 1122, Requirements for Internet Hosts - Communication Layers, October 1989.
  3. RFC 793, Transmission Control Protocol, September 1981.
  4. Andrew Tanenbaum, Computer Networks (fourth edition), Prentice Hall, Upper Saddle River, NJ, 2003, ISBN 0-13-038488-7. http://www.phptr.com/tanenbaumcn4/


Last modified: Thu Oct 27 16:01:12 2005