Troubleshooting a slow network or network packet drops can be tricky. In addition to slower network communication, you may observe other symptoms, such as:
- Attempts to connect to a server with ssh and/or sftp time out or respond slowly.
- When the network load is high, a high number of network retransmits is observed.
- A large number of packet drops appears in the output of the “ifconfig eth[x]” command.
Other symptoms include:
– The output of “netstat -s” shows increasing values for the following counters (run the command several times and compare the results):
13336 packets pruned from receive queue because of socket buffer overrun
516 times the listen queue of a socket overflowed
516 SYNs to LISTEN sockets ignored
2040077 packets collapsed in receive queue due to low socket buffer
TCPBacklogDrop: 744165
– The output of “ethtool -S eth[x]” shows increasing values for the counter “rx_fw_discards”:
rx_fw_discards: 4493
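Comparing successive “netstat -s” runs by eye is error-prone, so the check above can be sketched as a small shell helper. This is a minimal sketch, not a definitive tool: the function names drop_counters and watch_drop_counters and the 10-second default interval are assumptions made for illustration, and the match patterns are copied from the counter text shown above.

```shell
# Keep only the drop-related lines from `netstat -s` output read on stdin.
# The patterns are the counter descriptions listed in this article.
drop_counters() {
    grep -E 'pruned from receive queue|listen queue of a socket overflowed|SYNs to LISTEN sockets ignored|collapsed in receive queue|TCPBacklogDrop'
}

# Take two snapshots of the counters and print the lines that changed.
# Argument: seconds to wait between snapshots (hypothetical default: 10).
watch_drop_counters() {
    interval="${1:-10}"
    before="$(mktemp)"
    after="$(mktemp)"
    netstat -s | drop_counters > "$before"
    sleep "$interval"
    netstat -s | drop_counters > "$after"
    # diff prints nothing when no counter moved during the interval.
    diff "$before" "$after"
    rm -f "$before" "$after"
}
```

Running “watch_drop_counters 30” during a period of load prints only the counters that increased in those 30 seconds. The same approach should work for “ethtool -S eth[x]” by filtering for rx_fw_discards instead.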
Causes of slower network performance
Slow network performance can have several causes. Some of the common ones are:
- The network is already loaded up to its maximum capacity and is congested.
- The configured receive buffers are too small for the network load.
- There are packet drops due to errors at the physical layer.
Troubleshooting slow network performance
1. Check the network throughput using the iperf tool and determine whether the network bandwidth utilization is nearing the maximum throughput observed.
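As a rough sketch of this step, run iperf in server mode on one host and in client mode on the other, then compare the reported bandwidth with the link speed. The helper name utilization_pct, the placeholder SERVER_HOST, and the 30-second test length are illustrative assumptions, not part of the original procedure.

```shell
# On the receiving host (server mode):
#   iperf -s
# On the sending host (client mode, 30-second test, report every 5 seconds);
# SERVER_HOST is a placeholder for the receiving host's name or address:
#   iperf -c SERVER_HOST -t 30 -i 5

# Print measured throughput as a percentage of the link capacity.
# Arguments: measured throughput in Mbit/s, link speed in Mbit/s.
utilization_pct() {
    awk -v m="$1" -v l="$2" 'BEGIN { printf "%.1f\n", m * 100 / l }'
}
```

For instance, with an example input of 940 Mbit/s on a 1000 Mbit/s link, “utilization_pct 940 1000” prints 94.0, which would suggest the link itself is close to saturation.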
2. Set the network parameters to support the maximum network throughput. Calculate the bandwidth-delay product (BDP) and size the network buffers accordingly. The BDP is the product of the link bandwidth and the round-trip time.
For example:
– For a 1 Gb/s network and a round-trip time of 0.1 s, the BDP = (0.1 * 10^9)/8 = 12500000 bytes. On such a network, set the following parameter values in the file /etc/sysctl.conf:
# vi /etc/sysctl.conf
net.core.rmem_max = 12500000
net.core.wmem_max = 12500000
net.ipv4.tcp_rmem = 4096 87380 12500000
net.ipv4.tcp_wmem = 4096 65536 12500000
Also increase the following parameters:
# vi /etc/sysctl.conf
net.core.netdev_max_backlog = 30000
net.ipv4.tcp_max_syn_backlog = 4096
And then execute the command:
# sysctl -p
a) Neither change requires a system reboot.
b) After the change, monitor the output of “netstat -s” and check whether the following counters are still increasing:
packets pruned from receive queue because of socket buffer overrun
times the listen queue of a socket overflowed
SYNs to LISTEN sockets ignored
packets collapsed in receive queue due to low socket buffer
TCPBacklogDrop
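The BDP arithmetic in step 2 can be sketched in shell so that it is easy to repeat for other link speeds and round-trip times. The function names bdp_bytes and bdp_sysctl_lines are hypothetical helpers introduced here for illustration. The minimum and default buffer values are simply the ones from the example in step 2, not tuned recommendations.

```shell
# Print the bandwidth-delay product in bytes.
# Arguments: link bandwidth in bits per second, round-trip time in seconds.
bdp_bytes() {
    awk -v bw="$1" -v rtt="$2" 'BEGIN { printf "%.0f\n", bw * rtt / 8 }'
}

# Print sysctl.conf lines that use the given BDP as the maximum buffer size.
# Argument: BDP in bytes. Minimum/default values match the example in step 2.
bdp_sysctl_lines() {
    bdp="$1"
    echo "net.core.rmem_max = $bdp"
    echo "net.core.wmem_max = $bdp"
    echo "net.ipv4.tcp_rmem = 4096 87380 $bdp"
    echo "net.ipv4.tcp_wmem = 4096 65536 $bdp"
}
```

Running “bdp_sysctl_lines "$(bdp_bytes 1000000000 0.1)"” reproduces the four buffer settings from the 1 Gb/s example. The round-trip time would typically be taken from a ping to the remote host, converted to seconds.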
3. Increase the NIC’s RX ring buffer size. There is a trade-off when setting this value: a larger ring can delay the processing of packets, while a smaller ring can cause packet drops when the driver is slow to process incoming packets.
a) As a starting point, double the current RX ring size (“ethtool -g eth[x]” shows the current and maximum sizes) and monitor the output of “ethtool -S eth[x]”.
# ethtool -G eth[x] rx 512
b) To make this change permanent, append the following to the file /etc/sysconfig/network-scripts/ifcfg-eth[x] (the option string must include the interface name):
# vi /etc/sysconfig/network-scripts/ifcfg-eth[x]
ETHTOOL_OPTS="-G eth[x] rx 512"
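The “double the ring size” rule in step 3 can be sketched as a helper that reads the output of “ethtool -g” and suggests the next value to try, capped at the hardware maximum. The function name next_rx_ring is hypothetical. The sketch assumes the usual “ethtool -g” layout, in which the first “RX:” line is the pre-set maximum and the second is the current setting; check the output of your ethtool version before relying on it.

```shell
# Read `ethtool -g` output on stdin and print the next RX ring size to try:
# double the current value, but never more than the pre-set maximum.
next_rx_ring() {
    awk '
        /^RX:/ {
            n++
            if (n == 1) max = $2    # first RX: line, under "Pre-set maximums"
            if (n == 2) cur = $2    # second RX: line, under "Current hardware settings"
        }
        END {
            if (n < 2) exit 1       # unexpected format: print nothing
            next_size = cur * 2
            if (next_size > max) next_size = max
            print next_size
        }
    '
}

# Example usage (eth0 is a placeholder interface name):
#   ethtool -G eth0 rx "$(ethtool -g eth0 | next_rx_ring)"
```

Capping at the maximum avoids requesting a ring size the hardware does not support, which ethtool would otherwise reject.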