I wrote a simple UDP Server program to understand more about possible network bottlenecks.
UDP Server: Creates a UDP socket, binds it to a specified port and address, and adds the socket file descriptor to the epoll interest list. It then waits in epoll for incoming packets. On receiving a packet (EPOLLIN), it reads the packet and just prints the received packet length. Pretty simple, right :)
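For readers who want something concrete, the server described above could look roughly like the sketch below. This is not the original program: the function names udp_bind and udp_receive_loop, the 64 KB read buffer, and the packet-count and timeout parameters are assumptions made for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a UDP socket bound to ip:port.
 * Returns the socket fd, or -1 on error. */
static int udp_bind(const char *ip, unsigned short port)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1 ||
        bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: register fd with epoll, then read one datagram per
 * EPOLLIN event until max_packets have been read or epoll_wait times out.
 * Returns the number of packets read, or -1 if epoll setup fails. */
static long udp_receive_loop(int fd, long max_packets, int timeout_ms)
{
    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    struct epoll_event ev;
    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(epfd);
        return -1;
    }

    char buf[65536];
    long received = 0;
    while (received < max_packets) {
        struct epoll_event out;
        int n = epoll_wait(epfd, &out, 1, timeout_ms);
        if (n <= 0)
            break; /* timeout or error: stop waiting */
        if (out.events & EPOLLIN) {
            ssize_t len = recv(fd, buf, sizeof(buf), 0);
            if (len < 0)
                break;
            printf("received packet length: %zd\n", len);
            received++;
        }
    }
    close(epfd);
    return received;
}
```

A caller would bind with something like udp_bind("192.168.1.2", 9996) and then call udp_receive_loop on the returned descriptor. Note that printing every packet is itself slow under flood conditions, which may contribute to the receive queue filling up.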
UDP Client: I used hping as shown below:
hping3 192.168.1.2 --udp -p 9996 --flood -d 100
When I send UDP packets at 100 packets per second, I don't see any UDP packet loss. But when I flood UDP packets (as shown in the above command), I see significant packet loss.
Test1: When 26356 packets are flooded from the UDP client, my sample program receives ONLY 12127 packets, and the remaining 14230 packets are dropped by the kernel, as shown in the /proc/net/snmp output.
cat /proc/net/snmp | grep Udp:
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 12372 0 14230 218 14230 0
For Test1 packet loss percentage is ~53%.
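The loss figure can be reproduced from the two Udp: lines directly. The sketch below is only an illustration: snmp_udp_field and udp_loss_percent are hypothetical helper names, and it assumes that InDatagrams plus InErrors approximates the total number of datagrams that reached the UDP layer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: look up one named counter from a "Udp:" header line
 * and its matching value line. Returns 0 and stores the value on success,
 * -1 if the name is not present or its value is not a number. */
static int snmp_udp_field(const char *header, const char *values,
                          const char *name, unsigned long long *out)
{
    char hname[64], vtext[64];
    int hlen, vlen;

    /* Walk both lines in lockstep, one whitespace-separated token at a time. */
    while (sscanf(header, "%63s%n", hname, &hlen) == 1 &&
           sscanf(values, "%63s%n", vtext, &vlen) == 1) {
        if (strcmp(hname, name) == 0)
            return sscanf(vtext, "%llu", out) == 1 ? 0 : -1;
        header += hlen;
        values += vlen;
    }
    return -1;
}

/* Percentage of datagrams dropped, assuming delivered + dropped is the total
 * that arrived at the UDP layer. */
static double udp_loss_percent(unsigned long long delivered,
                               unsigned long long dropped)
{
    unsigned long long total = delivered + dropped;
    return total ? 100.0 * (double)dropped / (double)total : 0.0;
}
```

With the Test1 counters (InDatagrams 12372, InErrors 14230) this gives roughly 53.5%. In Test1 InErrors equals RcvbufErrors, which points at the socket receive buffer overflowing as the cause of the drops.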
I verified that there is NOT much loss at the hardware level using the "ethtool -S ethX" command on both the client side and the server side, while at the application level I see a loss of 53% as mentioned above.
Hence to reduce packet loss I tried these:
- Increased the priority of my sample program using renice command.
- Increased Receive Buffer size (both at system level and process level)
Bump up the priority to -20:
renice -20 2022
2022 (process ID) old priority 0, new priority -20
Bump up the receive buf size to 16MB:
At Process Level:
int sockbufsize = 16777216; /* 16 MB */
if (setsockopt(sockfd, SOL_SOCKET, SO_RCVBUF, &sockbufsize, sizeof(sockbufsize)) < 0)
    perror("setsockopt(SO_RCVBUF)"); /* report the failure instead of ignoring it */
At Kernel Level:
cat /proc/sys/net/core/rmem_default
16777216
cat /proc/sys/net/core/rmem_max
16777216
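Since the kernel silently caps SO_RCVBUF at rmem_max, it can be worth reading the value back after setting it. The following is a minimal sketch of that check; the function name set_and_check_rcvbuf is hypothetical and not part of the original program.

```c
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: request a receive buffer size, then read back what
 * the kernel actually granted. Returns the effective size in bytes, or -1
 * on error. */
static int set_and_check_rcvbuf(int fd, int requested)
{
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &requested, sizeof(requested)) < 0)
        return -1;

    int effective = 0;
    socklen_t len = sizeof(effective);
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &effective, &len) < 0)
        return -1;

    /* On Linux the value read back is double the accepted request, because
     * the kernel reserves extra space for bookkeeping overhead. A result
     * below the request therefore suggests the request was capped. */
    if (effective < requested)
        fprintf(stderr, "SO_RCVBUF: asked for %d, got %d (check rmem_max)\n",
                requested, effective);
    return effective;
}
```

Calling set_and_check_rcvbuf(sockfd, 16777216) right after creating the socket would show whether the 16MB request actually took effect.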
After these changes, I ran Test2.
Test2: When 1985076 packets are flooded from the UDP client, my sample program receives 1848791 packets, and the remaining 136286 packets are dropped by the kernel, as shown in the /proc/net/snmp output.
cat /proc/net/snmp | grep Udp:
Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors
Udp: 1849064 0 136286 236 0 0
For Test2 packet loss percentage is 6%.
Packet loss is reduced significantly. But I have the following questions:
Thanks for your help and time!!!
Tuning the Linux kernel's networking stack to reduce packet drops is a bit involved, as there are a lot of tuning options from the driver all the way up through the networking stack.
I wrote a long blog post explaining all the tuning parameters from top to bottom and explaining what each of the fields in /proc/net/snmp means, so you can figure out why those errors are happening. Take a look; I think it should help you get your network drops down to 0.