Missing messages when reading with non-blocking UDP

lgwest · Oct 18, 2010

I have a problem with missing messages when using non-blocking reads over UDP between two hosts. The sender is on Linux and the reader is on Windows XP.
The following three Python scripts demonstrate the problem.
send.py:

import socket, sys
s = socket.socket(socket.AF_INET,socket.SOCK_DGRAM)
host = sys.argv[1]
s.sendto('A'*10,   (host,8888))
s.sendto('B'*9000, (host,8888))
s.sendto('C'*9000, (host,8888))
s.sendto('D'*10,   (host,8888))
s.sendto('E'*9000, (host,8888))
s.sendto('F'*9000, (host,8888))
s.sendto('G'*10,   (host,8888))

read.py:

import socket
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(('',8888))
while True:
    data,address = s.recvfrom(10000)
    print "recv:", data[0],"times",len(data) 

read_nb.py:

import socket
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(('',8888))
s.setblocking(0)
data =''
address = ''
while True:
    try:
        data,address = s.recvfrom(10000)
    except socket.error:
        pass
    else: 
        print "recv:", data[0],"times",len(data) 

Example 1 (works ok):

ubuntu> python send.py <winxp-host>
winxp > read.py

This gives the correct output from read.py:

recv: A times 10
recv: B times 9000
recv: C times 9000
recv: D times 10
recv: E times 9000
recv: F times 9000
recv: G times 10

Example 2 (missing messages):
In this case the short messages are often not received by read_nb.py. Here are two examples of what the output can look like.

ubuntu> python send.py <winxp-host>
winxp > read_nb.py

This gives the following output from read_nb.py:

recv: A times 10
recv: B times 9000
recv: C times 9000
recv: D times 10
recv: E times 9000
recv: F times 9000

Above, the last 10-byte message is missing.

Below, a 10-byte message in the middle is missing:

recv: A times 10
recv: B times 9000
recv: C times 9000
recv: E times 9000
recv: F times 9000
recv: G times 10

I have checked with Wireshark on Windows, and every time all messages are captured, so they reach the host interface but are not received by read_nb.py. What is the explanation?

I have also tried running read_nb.py on Linux and send.py on Windows, and then it works. So I figure that this problem has something to do with Winsock2.

Or maybe I am using non-blocking UDP the wrong way?

Answer

Len Holgate · Oct 18, 2010

If the datagrams are getting to the host (as your Wireshark log shows), then the first place I'd look is the size of your socket recv buffer. Make it as big as you can, and run as fast as you can.
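
As a minimal sketch of the buffer suggestion, the receive buffer can be requested with the standard SO_RCVBUF socket option before the read loop starts. The helper names make_udp_reader and effective_rcvbuf and the 1 MiB default are illustrative assumptions, not part of the original scripts, and the code is written for Python 3 rather than the Python 2 used in the question. The operating system may clamp or adjust the requested size, so it is worth reading the value back.

```python
import socket

def make_udp_reader(port, rcvbuf_bytes=1 << 20):
    """Create a bound UDP socket and request a larger receive buffer.

    rcvbuf_bytes is only a request; the OS decides the actual size.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Ask the kernel for a bigger receive buffer before traffic arrives.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    s.bind(('', port))
    return s

def effective_rcvbuf(s):
    """Return the receive buffer size the OS actually reports."""
    return s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

if __name__ == '__main__':
    reader = make_udp_reader(8888)
    print("receive buffer:", effective_rcvbuf(reader), "bytes")
    reader.close()
```

A larger buffer only gives the reader more slack during bursts such as the back-to-back 9000-byte datagrams above; it does not make UDP reliable.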

Of course this is completely expected with UDP. You should assume that datagrams can be thrown away at any point and for any reason. Also you may get datagrams more than once...

If you need reliability then you need to build your own, or use TCP.
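
One common first step toward building your own reliability is to put a sequence number in front of every payload so the reader can at least notice lost, late, and duplicated datagrams. The sketch below assumes a 4-byte big-endian header; the names HEADER, pack_message, and LossDetector are hypothetical, and it only detects problems. Acknowledgements and retransmission would still have to be added on top. It is written for Python 3.

```python
import struct

# Assumed wire format: 4-byte big-endian sequence number, then the payload.
HEADER = struct.Struct('!I')

def pack_message(seq, payload):
    """Prefix a bytes payload with its sequence number (sender side)."""
    return HEADER.pack(seq) + payload

class LossDetector:
    """Classify received datagrams as new, late, or duplicate."""

    def __init__(self):
        self.next_expected = 0   # lowest sequence number not yet seen
        self.missing = set()     # skipped sequence numbers still outstanding

    def feed(self, datagram):
        """Return (status, payload) for one received datagram."""
        (seq,) = HEADER.unpack_from(datagram)
        payload = datagram[HEADER.size:]
        if seq >= self.next_expected:
            # Any numbers jumped over are recorded as missing.
            self.missing.update(range(self.next_expected, seq))
            self.next_expected = seq + 1
            return 'new', payload
        if seq in self.missing:
            # A gap is filled by a datagram that arrived out of order.
            self.missing.discard(seq)
            return 'late', payload
        return 'duplicate', payload
```

On the reader side, each result of s.recvfrom() would be passed to feed(), and a non-empty missing set after the sender finishes indicates datagrams that were dropped somewhere along the way.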