Blocking sockets: when, exactly, does "send()" return?

David Citron · Mar 23, 2011

When, exactly, does the BSD socket send() function return to the caller?

In non-blocking mode, it should return immediately, correct?

As for blocking mode, the man page says:

When the message does not fit into the send buffer of the socket, send() normally blocks, unless the socket has been placed in non-blocking I/O mode.

Questions:

  1. Does this mean that the send() call will always return immediately if there is room in the kernel send buffer?
  2. Is the behavior and performance of the send() call identical for TCP and UDP? If not, why not?

Answer

Arvid · Apr 1, 2011

Does this mean that the send() call will always return immediately if there is room in the kernel send buffer?

Yes, as long as "immediately" means after the memory you provided has been copied into the kernel's buffer. In some edge cases that may not be so immediate. For instance, if the pointer you pass in triggers a page fault that has to pull the buffer in from a memory-mapped file or from swap, the call will take significantly longer to return.

Is the behavior and performance of the send() call identical for TCP and UDP? If not, why not?

Not quite. Any performance difference depends on the OS's implementation of the TCP/IP stack. In theory a send on a UDP socket could be slightly cheaper, since the OS has fewer things to do with it.

EDIT: On the other hand, since you can pass much more data per system call with TCP, the cost per byte is typically a lot lower with TCP. For UDP this can be mitigated with sendmmsg(), available in Linux 3.0 and later.

As for the behavior, it's nearly identical.

For blocking sockets, both TCP and UDP block until there is space in the kernel buffer. The difference is that a UDP socket waits until your entire buffer can be stored in the kernel buffer, whereas a TCP socket may decide to copy only part of it, in principle as little as a single byte (typically it is more than that), and return the number of bytes it accepted.
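Because a stream socket may accept only part of the buffer, callers usually loop on the return value. The following is a minimal sketch of that loop, not a definitive implementation; the helper name send_all is hypothetical and it assumes fd is a connected, blocking stream socket.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: call send() repeatedly until all len bytes have been
 * handed to the kernel. Returns 0 on success, -1 on error (errno is left as
 * set by send()). */
int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL || len == 0);

    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any byte was copied */
            return -1;          /* real error: give up */
        }
        p += n;                 /* skip the bytes the kernel accepted */
        len -= (size_t)n;
    }
    return 0;
}
```

A datagram socket does not need this loop, since each send() there either queues the whole message or fails.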

If you try to send a datagram larger than 64 KiB, a UDP socket will most likely fail consistently with EMSGSIZE. This is because UDP, being a datagram protocol, guarantees to send your entire buffer as a single IP packet (or a train of IP packet fragments), or not to send it at all.
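The size limit is easy to observe. The sketch below tries to send one datagram of a given size and reports the errno it gets back. The helper name try_udp_send, the loopback destination, and port 9 are all assumptions made for illustration; nothing needs to be listening there for the size check to trigger.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: send a single zero-filled UDP datagram of len bytes to
 * 127.0.0.1:9. Returns 0 if the kernel accepted it, otherwise the errno
 * reported by socket(), calloc() or sendto(). */
int try_udp_send(size_t len)
{
    int result = 0;

    assert(len > 0);

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return errno;

    char *payload = calloc(len, 1);
    if (payload == NULL) {
        close(fd);
        return ENOMEM;
    }

    struct sockaddr_in dst;
    memset(&dst, 0, sizeof dst);
    dst.sin_family = AF_INET;
    dst.sin_port = htons(9);                       /* assumed port */
    dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* assumed destination */

    if (sendto(fd, payload, len, 0, (struct sockaddr *)&dst, sizeof dst) < 0)
        result = errno;  /* capture before free()/close() can change it */

    free(payload);
    close(fd);
    return result;
}
```

Calling try_udp_send(70000) should come back with EMSGSIZE, since 70000 bytes cannot fit in a single UDP datagram.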

Non-blocking sockets behave identically to their blocking counterparts, with one exception: instead of blocking when there is not enough space in the kernel buffer, the call fails with EAGAIN (or EWOULDBLOCK). When this happens, it is time to put the socket back into epoll/kqueue/select (or whatever you are using) and wait for it to become writable again.
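A rough sketch of that pattern follows, using poll() as the readiness mechanism purely because it is portable; epoll or kqueue would play the same role. The helper names set_nonblocking and send_or_wait are hypothetical, and a real event loop would normally return to its dispatcher rather than wait inline like this.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: switch fd to non-blocking mode. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: try to send; if the kernel buffer is full, wait up to
 * timeout_ms for the socket to become writable and try again. Returns the
 * number of bytes accepted (possibly fewer than len), or -1 with errno set.
 * A timeout is reported as -1 with errno == EAGAIN. */
ssize_t send_or_wait(int fd, const void *buf, size_t len, int timeout_ms)
{
    assert(buf != NULL || len == 0);

    for (;;) {
        ssize_t n = send(fd, buf, len, 0);
        if (n >= 0)
            return n;
        if (errno == EINTR)
            continue;                   /* interrupted: just retry */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                  /* real error */

        /* No room in the kernel buffer: wait for writability. */
        struct pollfd pfd = { .fd = fd, .events = POLLOUT };
        int ready = poll(&pfd, 1, timeout_ms);
        if (ready == 0) {
            errno = EAGAIN;             /* still full after the timeout */
            return -1;
        }
        if (ready < 0 && errno != EINTR)
            return -1;                  /* poll() itself failed */
    }
}
```

Note that the timeout applies to each wait separately, so this sketch does not bound the total time spent in the call.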

As usual on POSIX, keep in mind that your call may fail with EINTR if it was interrupted by a signal. In that case you most likely want to call send() again.
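Retrying on EINTR can be wrapped in a few lines. This is only a sketch; the name send_retry is hypothetical, and whether retrying is the right reaction depends on why your program uses signals in the first place.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical wrapper: same contract as send(), except that a call
 * interrupted by a signal (EINTR) is transparently repeated. */
ssize_t send_retry(int fd, const void *buf, size_t len, int flags)
{
    ssize_t n;

    assert(buf != NULL || len == 0);

    do {
        n = send(fd, buf, len, flags);
    } while (n < 0 && errno == EINTR);

    return n;
}
```

The wrapper works the same way for TCP and UDP sockets, since EINTR means nothing was sent by the interrupted call.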