Simple Traceroute and Performance Testing in Google Cloud

Traceroute is a tool to trace the path between two hosts. A traceroute can be a helpful first step to uncovering many different types of network problems. Support or network engineers often ask for a traceroute when diagnosing network issues.


Traceroute shows all Layer 3 (routing layer) hops between the hosts. It does this by sending packets to the remote destination with an increasing TTL (Time To Live) value, starting at 1. The TTL is a field in the IP packet that is decreased by one at every router. Once the TTL hits zero, the packet is discarded and a "TTL exceeded" ICMP message is returned to the sender. IP uses this mechanism to stop routing loops: packets cannot loop forever because the TTL eventually decrements to 0. By default the OS sets the TTL to a high value (64, 128, 255 or similar), so it should reach zero only in abnormal situations.

So traceroute sends packets first with a TTL of 1, then a TTL of 2, and so on, causing these packets to expire at the first, second, and later routers in the path. It then takes the source IP/host of each ICMP TTL exceeded message returned to show the name/IP of the intermediate hop. Once the TTL is high enough, the packet reaches the destination, and the destination responds.

The type of packet sent varies by implementation. Under Linux, UDP packets are sent to a high, unused port, so the final destination responds with an ICMP Port Unreachable. Windows and the mtr tool use ICMP echo requests by default (like ping), so the final destination answers with an ICMP echo reply.

Let's try it out by running a traceroute on one of your virtual machines. For this step, SSH into the VMs and install these performance tools in the SSH window:

sudo apt-get update
sudo apt-get -y install traceroute mtr tcpdump iperf whois host dnsutils siege

Now try a few other destinations, and also other sources: VMs in the same region or another region (eu1-vm, asia1-vm, w2-vm), or anything else you can think of. Some destinations work best if you increase the max TTL, as in traceroute -m 255.

To stop traceroute, press Ctrl-c in the SSH window to return to the command line.
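To see why the hops come back in path order, the TTL logic described earlier can be simulated offline. This is only a sketch: it sends no packets, the function name trace_path and the hop names are made up for illustration, and the last argument stands in for the destination.

```shell
#!/bin/sh
# Simulated traceroute: no packets are sent. The hop names are hypothetical;
# the last argument plays the role of the destination host.
trace_path() {
    max_ttl=$#                 # one probe per hop is enough in this simulation
    ttl=1
    while [ "$ttl" -le "$max_ttl" ]; do
        remaining=$ttl         # TTL carried by this probe
        position=0
        for hop in "$@"; do
            position=$((position + 1))
            if [ "$position" -eq "$#" ]; then
                # The probe arrived at the destination with TTL to spare.
                echo "$ttl $hop (destination replied)"
                return 0
            fi
            # Every router on the way decrements the TTL by one.
            remaining=$((remaining - 1))
            if [ "$remaining" -eq 0 ]; then
                # This router drops the probe and reports back to the sender.
                echo "$ttl $hop (TTL exceeded)"
                break
            fi
        done
        ttl=$((ttl + 1))
    done
}

trace_path router-a router-b router-c destination-host
```

Each probe dies one router further along than the previous one, so the output lists router-a, router-b, router-c, and finally the destination, one line per TTL value.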

Use iperf to test performance

Between two hosts

When you use iperf to test the performance between two hosts, one side needs to be set up as the iperf server to accept connections.

Important: The following commands transfer gigabytes of traffic between regions, which is charged at Internet egress rates. Be mindful of this when using them. If you are not on a whitelisted project or in the free trial, you might want to skip or only skim this section. (Costs should be less than $1 USD.)

Try a very simple test. SSH into the VM and install the performance tools:

sudo apt-get update
sudo apt-get -y install traceroute mtr tcpdump iperf whois host dnsutils siege

SSH into the other VM (it will be used as the server) and run:
iperf -s #run in server mode

On the first VM (it will be used as the client), run iperf in client mode:
iperf -c europe-test-01 #run in client mode, connecting to europe-test-01

You will see some output like this:
Client connecting to eu-vm, TCP port 5001
TCP window size: 45.0 KByte (default)
[ 3] local port 35923 connected with port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 298 MBytes 249 Mbits/sec

On europe-test-01, use Ctrl-c to exit the server side when you're done.
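When you repeat the test across several VM pairs, it can help to keep only the bandwidth figure from the client output. The helper below is a small sketch, not part of the lab: iperf_bandwidth is a made-up name, and it assumes the summary line ends in a value followed by a unit such as Mbits/sec, as in the sample output above.

```shell
#!/bin/sh
# Print the bandwidth column (value and unit) of iperf summary lines read
# from stdin. The function name is hypothetical. The header line and the
# connection lines are skipped because they do not end in "bits/sec".
iperf_bandwidth() {
    awk '/bits\/sec$/ { print $(NF-1), $NF }'
}

# Feed it the sample summary line from the output above.
printf '%s\n' '[ 3] 0.0-10.0 sec 298 MBytes 249 Mbits/sec' | iperf_bandwidth
```

On a live run you would pipe the client into it, for example iperf -c europe-test-01 | iperf_bandwidth.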

Between VMs within a region

Now you'll deploy one instance (for example, us-test-01) in one zone and another instance in a different zone of the same region. You will see that within a region, the bandwidth is limited by the 2 Gbit/s per core egress cap.
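As a rough guide to what to expect, the per-core rule can be turned into a quick calculation. This sketch assumes only the 2 Gbit/s per core figure quoted above; the function name expected_cap_gbits is made up, and any overall per-VM ceiling is deliberately not modelled, since the text does not state one.

```shell
#!/bin/sh
# Expected egress cap in Gbit/s for a VM with $1 cores, using only the
# 2 Gbit/s per core figure from the text. Hypothetical helper; it does not
# model any upper limit that may apply to larger machine types.
expected_cap_gbits() {
    echo $(( $1 * 2 ))
}

for cores in 1 2 4 8 16; do
    echo "$cores cores: $(expected_cap_gbits "$cores") Gbit/s"
done
```

Comparing these numbers with what iperf actually reports shows where the per-core rule stops holding.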

In Cloud Shell, create us-test-02:
gcloud compute instances create us-test-02 \
--subnet subnet-us-central \
--zone us-central1-b \
--tags ssh,http

SSH to us-test-02 and install performance tools:
sudo apt-get update
sudo apt-get -y install traceroute mtr tcpdump iperf whois host dnsutils siege

Between regions you reach much lower limits, mostly due to limits on TCP window size and single stream performance. You can increase bandwidth between hosts by using other parameters, like UDP. On europe-test-01 run:
iperf -s -u #iperf server side

On us-test-01 run:
iperf -c europe-test-01 -u -b 2G #iperf client side - send 2 Gbits/s

This should be able to achieve a higher speed between EU and US. Even higher speeds can be achieved by running a bunch of TCP iperfs in parallel. Let's test this. In the SSH window for us-test-01 run:
iperf -s

In the SSH window for us-test-02 run:
iperf -c us-test-01 -P 20

The combined bandwidth should be really close to the maximum achievable bandwidth. Test a few more combinations. If you use Linux on your laptop, you can test against your laptop as well. (You can also try iperf3, which is available for many OSes, but this is not part of the lab.)

As you can see, running a single TCP stream (for example, a file copy) is not sufficient to reach the maximum bandwidth; you need several TCP sessions in parallel. The reasons are TCP parameters such as window size and functions such as slow start. See TCP/IP Illustrated for excellent information on this and all other TCP/IP topics. Tools like bbcp can help copy files as fast as possible by parallelizing transfers and using a configurable window size.

Optional: If you have a large enough quota, spin up some 2/4/8/16 core VMs, install iperf, and see what performance you can reach. Where is the limit?
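The effect of window size on a single stream can be estimated with a little arithmetic: a sender can have at most one window of data in flight per round trip, so throughput is bounded by window size divided by round-trip time. The sketch below applies that bound. The function name is made up, the 100 ms round-trip time is an assumed figure for a cross-region path rather than a measurement, and the 45 KByte window is the default shown in the iperf output earlier.

```shell
#!/bin/sh
# Upper bound on single-stream TCP throughput in Mbit/s (10^6 bits/s).
# $1 = window size in KBytes, $2 = round-trip time in milliseconds.
# Hypothetical helper: bound = window / RTT.
max_throughput_mbits() {
    awk -v win_kb="$1" -v rtt_ms="$2" 'BEGIN {
        bits = win_kb * 1024 * 8            # window size in bits
        printf "%.1f\n", bits / (rtt_ms / 1000) / 1000000
    }'
}

# 45 KByte window with an assumed 100 ms round trip
max_throughput_mbits 45 100
```

With these inputs the bound is 3.7 Mbit/s. Real transfers usually do better because Linux grows the window during a connection, but the same relationship explains why long round trips hold back one stream and why running many streams in parallel adds up to more bandwidth.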
