Socket Protocol Fundamentals

Justin Ethier picture Justin Ethier · Mar 3, 2010 · Viewed 7.5k times · Source

Recently, while reading a Socket Programming HOWTO the following section jumped out at me:

But if you plan to reuse your socket for further transfers, you need to realize that there is no "EOT" (End of Transfer) on a socket. I repeat: if a socket send or recv returns after handling 0 bytes, the connection has been broken. If the connection has not been broken, you may wait on a recv forever, because the socket will not tell you that there's nothing more to read (for now). Now if you think about that a bit, you'll come to realize a fundamental truth of sockets: messages must either be fixed length (yuck), or be delimited (shrug), or indicate how long they are (much better), or end by shutting down the connection. The choice is entirely yours, (but some ways are righter than others).

This section highlights 4 possibilities for how a socket "protocol" may be written to pass messages. My question is, what is the preferred method to use for real applications?

Is it generally best to include message size with each message (presumably in a header), as the article more or less asserts? Are there any situations where another method would be preferable?

Answer

Eli Bendersky picture Eli Bendersky · Mar 3, 2010

The common protocols either specify length in the header, or are delimited (like HTTP, for instance).

Keep in mind that this also depends on whether you use TCP or UDP sockets. Since TCP sockets are reliable you can be sure that you get everything you shoved into them. With UDP the story is different and more complex.