How should I mark the end of a TCP packet?

Keith Maurino · Mar 5, 2010

In a client/server application where text data of varying length will be sent back and forth between the client and server, how should I mark the end of a packet that is being sent? For example, when the server is receiving packet data from a client, how does the server know that the client's packet has been fully received?

Is it more common to tell the server the full length of the packet that it is going to receive before the data or to have something marking the end of the packet?

Some of the data sent will only be a few characters long and some could be thousands of characters.

Answer

Thomas Pornin · Mar 5, 2010

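TCP provides a continuous stream of data. TCP is implemented using packets, but the whole point of TCP is to hide them: what one side writes in a single call may reach the other side split across several reads, or merged with whatever was written next.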

Think of it as if it were a wall on which you want to draw. The wall is made of bricks. Bricks are glued together with mortar, and plaster is applied so that the wall surface becomes smooth. Bricks are the IP packets, TCP is the plaster.

So now you have your smooth plastered TCP tunnel, and you want to add some structure to it. You want to draw boxes, so that your drawings are kept separate from each other. That is what you are after: a bit of "administrative" structure (boxes around the drawings) added to your data.

Many protocols use the concept of a packet, which is a bunch of data beginning with a fixed-format administrative header. The header contains enough information to decide where the packet ends; e.g., it includes the packet length. HTTP does that with a Content-Length header, or (in HTTP/1.1) with the "chunked transfer encoding", where data is split into one or several mini-packets, each preceded by a simple header that gives the length of that mini-packet.
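To make the length-header idea concrete, here is a minimal sketch in Python. The 4-byte big-endian length field and the names encode_frame and LengthPrefixDecoder are my own choices for illustration, not part of any standard protocol. The decoder just accumulates whatever bytes arrive and hands back a message only once the whole thing is in the buffer.

```python
import struct

# Assumed header format for this sketch: a 4-byte unsigned big-endian length.
HEADER = struct.Struct("!I")


def encode_frame(payload: bytes) -> bytes:
    """Prefix the payload with its length so the receiver knows where it ends."""
    return HEADER.pack(len(payload)) + payload


class LengthPrefixDecoder:
    """Reassembles length-prefixed messages from arbitrarily sized chunks."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Add received bytes; return every message completed so far."""
        self._buf += chunk
        frames = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # header is here, but the body is still incomplete
            frames.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]
        return frames
```

On the receiving side you would call feed() with whatever each recv() returns. It does not matter how the stream was chopped up on the way; a real server would probably also want to reject absurdly large length values before buffering them.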

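Another way is to have a special terminator sequence which cannot appear in "normal data". If your data is text, then you could use a byte of value zero as the terminator (assuming an encoding such as ASCII or UTF-8, in which a zero byte never occurs in ordinary text).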
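A terminator-based variant could look like the following sketch. It assumes UTF-8 text and a single zero byte as the terminator; encode_message and TerminatorDecoder are hypothetical names picked for this example. Note that the sender has to refuse (or escape) any message that contains the terminator itself.

```python
# Assumed terminator for this sketch: one zero byte, with UTF-8 text payloads.
TERMINATOR = b"\x00"


def encode_message(text: str) -> bytes:
    """Encode text and append the terminator; reject text containing it."""
    data = text.encode("utf-8")
    if TERMINATOR in data:
        raise ValueError("message must not contain the terminator byte")
    return data + TERMINATOR


class TerminatorDecoder:
    """Splits an incoming byte stream into terminator-delimited messages."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Add received bytes; return every message completed so far."""
        self._buf += chunk
        messages = []
        while True:
            idx = self._buf.find(TERMINATOR)
            if idx < 0:
                break  # no complete message yet, wait for more bytes
            messages.append(self._buf[:idx].decode("utf-8"))
            del self._buf[:idx + 1]
        return messages
```

Decoding happens only once a full message is available, so a chunk boundary that falls in the middle of a multi-byte character does no harm.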

Yet another way is to use self-terminated data. This is data structured in such a way that you can know at any point whether the end of the element has been reached. For instance, XML data is organized as nested pairs of markers such as <foo>...</foo>. When the end marker (</foo>) is reached, you know that the element is finished.
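For the self-terminated case, one possible sketch uses the incremental XMLPullParser from Python's standard library and counts nesting depth: when the depth drops back to zero, the outermost element has been closed. XmlElementFramer is a made-up name for this example, and the sketch assumes that nothing follows the closing tag within the same chunk; a real protocol would have to deal with leftover bytes and with untrusted input.

```python
import xml.etree.ElementTree as ET


class XmlElementFramer:
    """Detects the end of one top-level XML element in a chunked stream."""

    def __init__(self):
        self._parser = ET.XMLPullParser(events=("start", "end"))
        self._depth = 0
        self.root = None  # set to the parsed element once it is complete

    def feed(self, chunk: bytes) -> bool:
        """Add received bytes; return True once the root element has closed."""
        self._parser.feed(chunk)
        # Newer Python versions offer flush() to force pending input to be
        # parsed right away; call it only when it is available.
        if hasattr(self._parser, "flush"):
            self._parser.flush()
        for event, elem in self._parser.read_events():
            if event == "start":
                self._depth += 1
            else:
                self._depth -= 1
                if self._depth == 0:
                    self.root = elem
                    return True
        return False
```

For example, feeding b"<foo><bar>hi" and then b"</bar>" returns False both times, and feeding b"</foo>" afterwards returns True, at which point root holds the parsed <foo> element.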