Traceroute


Traceroute is the most useful tool we have to troubleshoot network problems. It shows the path that packets take from the machine you are using to the destination you give it on the command line.

Each gateway (router) is displayed, along with the round trip time (in milliseconds) for each of three trace packets to reach the specified gateway and return. These intervals may vary widely as a function of network load.

Contents

  1. Quick 'n' Dirty: traceroute -n
  2. Tracerouting From Multiple Sides of a Problem
  3. Troubleshooting Traceroute Results
  4. Traceroute Described Line-by-Line
  5. Troubleshooting Congestion: Traceroute Speeds

Quick 'n' Dirty: traceroute -n

Normally when you do a traceroute to a domain name or IP address, traceroute will try to look up the DNS hostname for the IP address at each hop. If there is any problem with the DNS server, your traceroute will appear to "hang" while it waits for those lookups. This can throw you off track.

Instead, use traceroute -n. This means "IP numbers only", and any hangs or problems that show up in the traceroute will be entirely due to routing -- which is what you're troubleshooting.

Here's an example:

_fairy.tlg.net[~]> traceroute -n gw1-sj-tlg
traceroute to gw1-sj-tlg.tlg.net (140.174.74.1), 30 hops max, 40 byte packets
 1  140.174.77.5  2.2 ms  2.119 ms  2.257 ms
 2  140.174.178.1  7.593 ms  4.177 ms  26.672 ms
 3  140.174.125.5  6.958 ms  49.766 ms  17.813 ms
 4  140.174.161.2  12.194 ms *  71.78 ms

Here's what the same traceroute looks like with DNS lookups:

_fairy.tlg.net[~]> traceroute gw1-sj-tlg
traceroute to gw1-sj-tlg.tlg.net (140.174.161.2), 30 hops max, 40 byte packets
 1  gw1-ms-tlg (140.174.77.5)  2.354 ms  2.191 ms  4.617 ms
 2  ln1_gw2-sf-tlg_ms (140.174.178.1)  29.144 ms  4.482 ms  4.305 ms
 3  border-sf-tlg (140.174.125.5)  5.013 ms  4.663 ms  5.071 ms
 4  border-sj-tlg (140.174.161.2)  13.748 ms *  12.346 ms

Notice that "border-sj-tlg" and "gw1-sj-tlg" are two names for the same IP address. Which name gets displayed depends on how the DNS is set up for that IP address.
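
Because traceroute -n output is just hop numbers, IP numbers and times, it is also easy to pick apart with a script. The following is only a rough Python sketch, not one of our tools. The name parse_hop is made up for this example, and it assumes each hop line shows a single address followed by its times, as in the traceroute -n example earlier in this section.

```python
import re

# One probe result: a time such as "12.194 ms", or "*" for no reply.
PROBE = re.compile(r"(\d+(?:\.\d+)?) ms|\*")

def parse_hop(line):
    """Parse one hop line of `traceroute -n` output.

    Returns (hop_number, address, times). times holds one float
    (milliseconds) per probe, or None where the probe shows "*".
    address is None when the line starts with a "*".
    """
    hop_text, rest = line.split(None, 1)
    address = None
    if not rest.startswith("*"):
        address, rest = rest.split(None, 1)
    times = [
        float(m.group(1)) if m.group(1) else None
        for m in PROBE.finditer(rest)
    ]
    return int(hop_text), address, times

print(parse_hop(" 4  140.174.161.2  12.194 ms *  71.78 ms"))
```

The lost packet at hop 4 comes back as None rather than a time, so it is not mistaken for a fast reply.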


Tracerouting From Multiple Sides of a Problem

You can traceroute to any location from multiple starting points. If you are trying to isolate a network problem, this can be very useful!
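
One way to use traces from two starting points is to look for the routers that both paths pass through. This is a small Python sketch of that idea, under some assumptions: the function name common_hops is made up, the first path is copied from the failed trace in the next section, and the second path is hypothetical (its 192.0.2.x addresses are placeholders, not real routers).

```python
def common_hops(path_a, path_b):
    """Return the addresses that appear in both paths, in path_a order."""
    seen_in_b = set(path_b)
    return [ip for ip in path_a if ip in seen_in_b]

# Path from fairy.tlg.net, taken from the failed trace in the next section.
from_mission = ["140.174.77.5", "140.174.178.1", "140.174.125.5"]

# Hypothetical path from a second starting point (made-up addresses).
from_elsewhere = ["192.0.2.1", "192.0.2.9", "140.174.125.5"]

print(common_hops(from_mission, from_elsewhere))
```

If both traces fail and they share a router, that router is a reasonable place to start looking, though it is not proof that the fault is there.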


Troubleshooting Traceroute Results

This is an example of what traceroute will show you if it can't get to the destination:

_fairy.tlg.net[~]> traceroute -n 140.174.92.14 
traceroute to 140.174.92.14 (140.174.92.14), 30 hops max, 40 byte packets
 1  140.174.77.5  2.194 ms  3.202 ms  3.023 ms
 2  140.174.178.1  6.875 ms  6.881 ms  26.222 ms
 3  140.174.125.5  7.021 ms  5.411 ms  7.047 ms
 4  140.174.125.5  6.484 ms !H *  21.091 ms !H

Possible problems

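  1. * (star)
    A * means that no reply to that trace packet arrived before traceroute gave up waiting. A single star among otherwise good times is usually just a dropped packet. If every hop past a certain point shows nothing but stars, replies have stopped coming back from there on.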
  2. !H
    A !H indicates that the router at that hop doesn't know how to reach the target address; instead of forwarding the packet, it sends an ICMP "host unreachable" message back to the source.
  3. A Routing Loop
    Sometimes you'll see the packets bounce back and forth between two routers, or even make a big loop through the Internet and come back to where they started. In this case, the Internet routing tables are incomplete or in conflict for your target IP address. If the loop happens within a customer's internal network or between the routers on either end of a customer link, the customer has not set up their routing correctly.
  4. The Traceroute "Hangs"
    When the traceroute has reached its destination, you'll get your command line back again. If you don't get it back and no further output appears, the traceroute is "hung". This is probably a DNS problem. Run it again with traceroute -n.
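
The first three checks in the list above can be roughed out in a few lines of Python. This is a sketch, not a finished tool: the function name diagnose is made up, it expects the hop lines of traceroute -n output, and its loop test (the same address turning up again at a non-adjacent hop) is a simplification chosen for this example.

```python
import re

ADDRESS = re.compile(r"\d+\.\d+\.\d+\.\d+")

def diagnose(lines):
    """Return a list of plain-English findings for traceroute hop lines."""
    findings = []
    last_seen = {}  # address -> hop number where it last appeared
    for line in lines:
        hop_text, rest = line.split(None, 1)
        hop = int(hop_text)
        if "!H" in rest:
            findings.append(f"hop {hop}: !H (host unreachable)")
        if "*" in rest:
            findings.append(f"hop {hop}: at least one probe got no reply")
        match = ADDRESS.search(rest)
        if match:
            address = match.group()
            previous = last_seen.get(address)
            # Same address again after a different hop suggests a loop.
            if previous is not None and hop - previous > 1:
                findings.append(
                    f"hop {hop}: {address} already seen at hop {previous}"
                    " (possible loop)"
                )
            last_seen[address] = hop
    return findings

# Hop lines from the failed trace at the top of this section.
failed_trace = [
    " 1  140.174.77.5  2.194 ms  3.202 ms  3.023 ms",
    " 2  140.174.178.1  6.875 ms  6.881 ms  26.222 ms",
    " 3  140.174.125.5  7.021 ms  5.411 ms  7.047 ms",
    " 4  140.174.125.5  6.484 ms !H *  21.091 ms !H",
]
for finding in diagnose(failed_trace):
    print(finding)
```

Hops 3 and 4 show the same address in that trace, but they are adjacent, so the sketch reports the !H and the star at hop 4 without calling it a loop.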

Traceroute Described Line-by-Line

Here's an example of a regular traceroute from the Mission NOC to a router on a customer's site ("theiripaddr" in the database). This particular customer is digicol. You could also use "traceroute -n" for this.

_fairy.tlg.net[/usr/local/etc/httpd/htdocs/NOC.new]> t 140.174.166.18
traceroute to 140.174.166.18 (140.174.166.18), 30 hops max, 40 byte packets
 1  gw1-ms-tlg (140.174.77.5)  2.289 ms  2.737 ms  2.373 ms
 2  ln1_gw2-sf-tlg_ms (140.174.178.1)  3.992 ms  7.06 ms  4.416 ms
 3  204.94.172.10 (204.94.172.10)  99.63 ms  24.727 ms  22.045 ms

Traceroute shows the path packets take from the machine you are using to the destination given on the command line. Each hop is a port on a router somewhere in the Internet. Let's take this one line at a time:

Starting Point

traceroute to 140.174.166.18 (140.174.166.18), 30 hops max, 40 byte packets

"30 hops max" means that the traceroute will only hop 30 times and then it will stop. You should be able to get to where you're trying to go within 30 hops!

"40 byte packets" is the size of the packets that are being sent out by traceroute.

First Hop

 1  gw1-ms-tlg (140.174.77.5)  2.289 ms  2.737 ms  2.373 ms

There are three things to notice here:

  1. There are three different times on the line; this is because traceroute sends out three of those 40 byte packets. The packets can take different routes, for instance between SJ and MV, since we have three identical T1s connecting those two POPs.
  2. The numbers give the time that each packet took to get to that IP address and return to the starting point. (Sort of like little messengers coming back yelling "it's there! It's there!")
  3. It gives a router name and an IP address. The IP address indicates a particular port on the router (serial port or ethernet port). The IP number will always be the number of the port that the packets enter the router by. In this case all three of them travel from fairy.tlg.net through the Mission ethernet cord to the ethernet port on gw1-ms-tlg, the gateway router in the Mission POP. The IP address of the ethernet port is 140.174.77.5.
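
Since each hop gives three separate times, it can help to boil them down to a fastest, average and slowest figure before comparing hops. Here is a minimal Python sketch of that; the function name summarize is made up for this example, and it assumes a lost packet (a star) is passed in as None.

```python
def summarize(times):
    """Return (fastest, average, slowest) in ms, ignoring lost probes.

    Lost probes are given as None. Returns None if no probe was answered.
    """
    answered = [t for t in times if t is not None]
    if not answered:
        return None
    return min(answered), sum(answered) / len(answered), max(answered)

# The three times from the first hop above.
fastest, average, slowest = summarize([2.289, 2.737, 2.373])
print(f"fastest {fastest:.3f} ms, average {average:.3f} ms, slowest {slowest:.3f} ms")
```

A big gap between the fastest and slowest time at a single hop is the kind of variation that network load can cause.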

Second Hop

 2  ln1_gw2-sf-tlg_ms (140.174.178.1)  3.992 ms  7.06 ms  4.416 ms

The packets have left gw1-ms-tlg via a serial port attached to a T1 line which runs downtown to the SF POP and then connects to a serial port on gw2-sf-tlg. This particular serial port has its own name (ln1_gw2-sf-tlg_ms) and its IP address is 140.174.178.1.

Third Hop

 3  204.94.172.10 (204.94.172.10)  99.63 ms  24.727 ms  22.045 ms

This is the third and final hop. The packets have travelled across another link and entered another router (the customer's router) through a serial port with the IP address 204.94.172.10. Notice: