Using Ping Sweep to Find MTU Ceiling

Earlier this week I had breakfast with a very interesting group. One of those present had an extensive history with Cisco Systems. We talked about his tenure and several of the projects that he had been involved in. For some reason, the one that caught my attention was the sweep option found in the extended ping utility. Although it is hard to believe, there was a point in time when this gem didn’t exist.

I’ve written a few articles about the challenges of path MTU discovery and the issues that arise when it misbehaves. Today’s article looks specifically at the ping sweep and how it can be used to quickly identify the path MTU ceiling. The topology used for testing is simple and shown below. Notice that the two top routers are connected by a link with a lowered MTU (1492).

Ping Sweep MTU Discovery

Let’s step through the process that an administrator might go through when a networked application isn’t working correctly. He or she would likely determine the endpoints and confirm reachability. For this example, I am testing a connection between 192.168.1.1 and 192.168.4.4. The ping command is the tool of choice for confirming reachability.

Basic Connectivity Test

R1#ping 192.168.4.4

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.4.4, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 44/60/80 ms

That seems to be working. A logical next step would be to test further with larger packets.

R1#ping 192.168.4.4 size 1500

Type escape sequence to abort.
Sending 5, 1500-byte ICMP Echos to 192.168.4.4, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 72/84/92 ms

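That also seems to be working. Without the DF bit, however, a router along the path is free to fragment the packet, so this success says little about the path MTU. Another test might be repeating our previous example with the DF bit set.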

R1#ping 192.168.4.4 size 1500 df-bit

Type escape sequence to abort.
Sending 5, 1500-byte ICMP Echos to 192.168.4.4, timeout is 2 seconds:
Packet sent with the DF bit set
.....
Success rate is 0 percent (0/5)

This failure is a symptom that could indicate a problem with the path MTU. I would typically re-test reachability with something much lower. This would confirm that we still have connectivity and the issue isn’t something crazy (like filtering based on the df-bit).

R1#ping 192.168.4.4 size 100 df-bit

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.4.4, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 36/60/80 ms

Since that worked, I would conclude that there is some issue with sending larger packets when the df-bit is set. My next step would likely be to identify the packet size at which the failures begin. There are several ways to determine where the ceiling is. One option would be to divide and conquer. Using that method, I might try sending 1000 bytes. If that succeeds, I could increase it to 1250 bytes. At some point, the issue would present again and I would have to get more granular with the packet size.
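The divide and conquer approach is essentially a binary search. Here is a minimal Python sketch of the idea. The `probe` callback is hypothetical: in practice it would wrap whatever DF-bit ping you have available, and here it is replaced with a simulated path whose smallest MTU is 1492. The sketch also assumes that every size up to the ceiling succeeds and every size above it fails.

```python
from typing import Callable


def find_mtu_ceiling(probe: Callable[[int], bool], low: int, high: int) -> int:
    """Return the largest size in [low, high] for which probe(size) succeeds.

    Assumes probe(low) succeeds and that results are monotonic: all sizes
    up to the ceiling pass and all sizes above it fail.
    """
    if not probe(low):
        raise ValueError(f"probe failed at the lower bound ({low} bytes)")
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid        # mid fits, so the ceiling is mid or larger
        else:
            high = mid - 1   # mid is too big, so the ceiling is below mid
    return low


def simulated_probe(size: int) -> bool:
    """Hypothetical stand-in for a DF-bit ping across a path with MTU 1492."""
    return size <= 1492


print(find_mtu_ceiling(simulated_probe, 100, 1500))  # 1492
```

Searching between 100 and 1500 bytes this way takes about eleven probes, compared with up to 1401 if every size were tried one at a time.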

Another option, as mentioned in the introduction, is to use the sweep option. This involves entering the ping command with no arguments and working through the extended menu. This would look something like the following:

R1#ping
Protocol [ip]:
Target IP address: 192.168.4.4
Repeat count [5]: 1
Datagram size [100]:
Timeout in seconds [2]:
Extended commands [n]: yes
Source address or interface:
Type of service [0]:
Set DF bit in IP header? [no]: yes
Validate reply data? [no]:
Data pattern [0xABCD]:
Loose, Strict, Record, Timestamp, Verbose[none]:
Sweep range of sizes [n]: y
Sweep min size [36]: 1400
Sweep max size [18024]: 1550
Sweep interval [1]:
Type escape sequence to abort.
Sending 151, [1400..1550]-byte ICMP Echos to 192.168.4.4, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!M.M.M.M.......
Success rate is 86 percent (93/107), round-trip min/avg/max = 52/63/76 ms

It’s worth noting that the output may vary depending on how the intermediary routers are configured. In our example, the M’s (!!!!M.M.M) indicate that an intermediary router is returning ICMP unreachables because the packet could not be fragmented. In some cases, all that is seen are periods (!!!!!….). This would indicate that the unreachables aren’t being received (or perhaps aren’t being produced). Before explaining how the results of this test can be used to determine the MTU ceiling, I want to draw attention to the more relevant parameters we selected above.

Repeat count [5]: 1 — Set to one because this is actually the number of times that the entire range will be iterated through. For example, if it was left at 5 and we swept through a range of 100 possible sizes, 500 echo requests would be generated.

Extended commands [n]: yes — Enables the additional menu items for ping, including the options for sweep.

Set DF bit in IP header? [no]: yes — Should typically be set when testing MTU issues. If not, the echo request may be fragmented and sent to the destination.

Sweep range of sizes [n]: y — Enables the sweep option.

Sweep min size [36]: 1400 — Minimum Packet Size for the test.

Sweep max size [18024]: 1550 — Maximum Packet Size for the test.

Sweep interval [1]: — Size delta between packets (i.e. 1400, 1401, 1402, 1403…)
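The note on the DF bit also explains why the earlier 1500-byte ping without it succeeded: the packet was most likely fragmented at the 1492-byte link. The sketch below works through that arithmetic in Python. It assumes a 20-byte IPv4 header with no options and treats the ping size as the total IP length, as the rest of this article does. The function name is my own and purely illustrative.

```python
from typing import List

IP_HEADER = 20  # bytes; assumes an IPv4 header with no options


def fragment_sizes(packet_len: int, mtu: int) -> List[int]:
    """Return the total IP length of each fragment of an IPv4 packet."""
    if packet_len <= mtu:
        return [packet_len]  # fits as-is, no fragmentation needed
    # Every fragment except the last carries a multiple of 8 payload bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("MTU is too small to carry any payload")
    payload = packet_len - IP_HEADER
    sizes = []
    while payload > per_fragment:
        sizes.append(IP_HEADER + per_fragment)
        payload -= per_fragment
    sizes.append(IP_HEADER + payload)  # last fragment carries the remainder
    return sizes


print(fragment_sizes(1500, 1492))  # [1492, 28]
print(fragment_sizes(1492, 1492))  # [1492]
```

With the DF bit set, the router is not allowed to perform that split, so the 1500-byte packet is dropped instead.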

So the extended ping options used would send 151 ICMP echo requests to 192.168.4.4. The first packet would have an IP length of 1400 bytes. The next packet would be 1401. Packet 151 would be 1550 bytes in length.
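The packet count is easy to sanity-check. This short Python sketch models the sweep as a range of sizes; the helper name is hypothetical and only illustrates the arithmetic, including the effect of the repeat count described above.

```python
def sweep_sizes(min_size: int, max_size: int, interval: int = 1) -> range:
    """Sizes generated by one pass of a sweep, inclusive of both ends."""
    return range(min_size, max_size + 1, interval)


sizes = sweep_sizes(1400, 1550)
print(len(sizes))                      # 151
print(sizes[0], sizes[1], sizes[-1])   # 1400 1401 1550

# The repeat count multiplies the whole sweep: 100 sizes repeated 5 times.
print(len(sweep_sizes(1, 100)) * 5)    # 500
```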

As can be seen by the output above, I used the break sequence to exit the ping command when I started seeing failures. 93 packets were successfully sent and responses were received. Based on this output, we have a pretty good idea now where our MTU ceiling is. If packet 1 was 1400 bytes, then the 93rd packet was 1400 + 92 = 1492 bytes. Even if we miscalculated, we are definitely going to be close to the ceiling. To confirm, we can test with specific packet sizes.
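The same reasoning can be expressed as a few lines of Python. This is only a sketch: it assumes the result characters have been copied out of the ping output by hand, that "!" marks a reply, and that the successes form one unbroken run from the start of the sweep. The function name is my own.

```python
from typing import Optional


def ceiling_from_results(results: str, min_size: int, interval: int = 1) -> Optional[int]:
    """Infer the largest size that succeeded from the sweep result characters."""
    results = "".join(results.split())  # drop line breaks and spaces
    successes = len(results) - len(results.lstrip("!"))  # leading run of "!"
    if successes == 0:
        return None  # even the smallest size failed
    return min_size + (successes - 1) * interval


# 93 replies followed by the failures seen in the output above.
output = "!" * 93 + "M.M.M.M......."
print(ceiling_from_results(output, 1400))  # 1492
```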

R1#ping 192.168.4.4 size 1492 df-bit

Type escape sequence to abort.
Sending 5, 1492-byte ICMP Echos to 192.168.4.4, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 56/60/64 ms

R1#ping 192.168.4.4 size 1493 df-bit

Type escape sequence to abort.
Sending 5, 1493-byte ICMP Echos to 192.168.4.4, timeout is 2 seconds:
Packet sent with the DF bit set
.....
Success rate is 0 percent (0/5)

Conclusion

As seen above, we have definitively determined the MTU ceiling. The impact of a lowered MTU will depend on the architecture and design. If PMTUD (Path MTU Discovery) is working properly, a lowered MTU may not produce any problems. However, there are situations in which MTU issues can present themselves in strange and amazing ways.

By the way, the gentleman from breakfast left Cisco a few years ago and is involved in some important work around the world. His organization is working to make the Internet accessible and sustainable for those in developing areas of the world. I encourage everyone to take a look at the work he’s doing and check out Wireless Networking in the Developing World.

 


Disclaimer: This article includes the independent thoughts, opinions, commentary or technical detail of Paul Stewart. This may or may not reflect the position of past, present or future employers.


About Paul Stewart, CCIE 26009 (Security)

Paul is a Network and Security Engineer, Trainer and Blogger who enjoys understanding how things really work. With over 15 years of experience in the technology industry, Paul has helped many organizations build, maintain and secure their networks and systems.

One Response to Using Ping Sweep to Find MTU Ceiling

  1. Great if you have access to the routers. Sometimes the Server or Client (Windows) in the network is the culprit. For that I have written a small tool you can download here: https://dl.dropboxusercontent.com/u/53426117/PermLinks/Pinger.exe
    Its pretty self-explaining. Nice to find black holes in a MPLS structure,
