Thursday, December 7, 2017

Quick test for GCP inter-zone networking

Prologue: It took a year to move Down Under and another six months to settle in, or at least to start feeling settled, but it looks like I'm back to writing.

I'm in the process of designing how to move our systems to a multi-zone deployment in GCP and wanted a rough idea of the impact on network latency and throughput. My Google-fu didn't yield any recent benchmarks on the subject, so I decided to run a couple of quick checks myself and share the results.

Setup

We run in the us-central1 region and use n1-highmem-8 (8 vCPUs / 52 GB RAM) instances as our main workhorse. I set up one instance in each of the zones a, b, and c, plus an additional instance in zone a to measure intra-zone latency.

# Create n1-highmem-8 Ubuntu 16.04 instances; extra arguments
# (zone and instance names) are passed through to gcloud.
create_vm() {
  gcloud compute instances create \
      --machine-type=n1-highmem-8 \
      --image-project=ubuntu-os-cloud \
      --image=ubuntu-1604-xenial-v20171121a \
      "$@"
}

create_vm --zone=us-central1-a us-central1-a-1 us-central1-a-2
create_vm --zone=us-central1-b us-central1-b
create_vm --zone=us-central1-c us-central1-c

Latency

I used ping in flood mode (-f) to measure latency:

root@us-central1-a-1 $ ping -f -c 100000 us-central1-b

Here are the results:
A to A (intra-zone):
rtt min/avg/max/mdev = 0.041/0.072/2.882/0.036 ms, ipg/ewma 0.094/0.066 ms
A to B:
rtt min/avg/max/mdev = 0.132/0.193/7.032/0.073 ms, ipg/ewma 0.209/0.213 ms
A to C:
rtt min/avg/max/mdev = 0.123/0.189/4.110/0.060 ms, ipg/ewma 0.205/0.190 ms
B to C:
rtt min/avg/max/mdev = 0.123/0.176/4.399/0.047 ms, ipg/ewma 0.189/0.161 ms

While average inter-zone latency (0.18-0.19 ms) is roughly 2.5 times the intra-zone figure (0.07 ms), it's still within typical LAN figures. Mean deviation is quite low as well. Too bad that ping can't report percentiles.
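Percentiles can be approximated outside of ping. The following is a minimal sketch, not something I ran for this post: it assumes a regular (non-flood) ping run, because flood mode prints no per-packet lines, and the function name rtt_percentiles is made up for illustration. It extracts the time=... values and reports nearest-rank percentiles.

```shell
# Print p50/p95/p99 of the round-trip times found in regular (non-flood)
# ping output read from stdin. Assumes lines containing "time=<number> ms".
rtt_percentiles() {
  grep -o 'time=[0-9.]*' |   # keep only the "time=0.193" fragments
    cut -d= -f2 |            # strip the "time=" prefix
    LC_ALL=C sort -n |       # ascending order is required for rank lookup
    awk '
      # Nearest-rank percentile: k = ceil(p * N / 100), clamped to at least 1.
      function pct(p,   k) {
        k = int((p * NR + 99) / 100)
        if (k < 1) k = 1
        return rtt[k]
      }
      { rtt[NR] = $1 }
      END {
        if (NR == 0) { print "no samples" > "/dev/stderr"; exit 1 }
        printf "p50=%s p95=%s p99=%s ms\n", pct(50), pct(95), pct(99)
      }'
}
```

Usage would look something like: ping -c 1000 us-central1-b | rtt_percentiles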

Throughput

I used the iperf tool to measure throughput. Both unidirectional (each way) and bidirectional throughput were measured.
  • Server side: iperf -s
  • Client side: iperf -c <server-host> -t 60 -r and iperf -c <server-host> -t 60 -d

Note: iperf has a bug where, in client mode, it ignores any parameters specified before the client host, so it's crucial to put -c and the host first on the command line.
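One way to avoid tripping over the ordering issue is to wrap the client invocation so that -c and the host always come first. This is only a sketch: the helper name iperf_client, its mode names, and the DRY_RUN switch are hypothetical, while -c, -t, -r (each direction in turn) and -d (both directions at once) are the iperf options used above.

```shell
# Hypothetical helper: always emits "-c <host>" first, then the other options.
# mode: "tradeoff" -> -r (each direction in turn), "dual" -> -d (both at once).
# Set DRY_RUN=1 to print the command instead of running it.
iperf_client() {
  host=$1
  mode=$2
  seconds=${3:-60}
  case $mode in
    tradeoff) flag=-r ;;
    dual)     flag=-d ;;
    *) echo "usage: iperf_client <host> tradeoff|dual [seconds]" >&2
       return 2 ;;
  esac
  # Build the argument list with the host in first position.
  set -- iperf -c "$host" -t "$seconds" "$flag"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

For example, DRY_RUN=1 iperf_client us-central1-b tradeoff prints the command it would run, which is handy for checking the argument order before starting a 60-second test.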

Here are the results. All throughput numbers are in gigabits per second (Gbit/s).

Zones    Send    Receive    Send + Receive
A & A    12.0    13.9       8.12 + 10.1
A & B    7.96    8.22       4.57 + 6.30
A & C    6.87    8.51       3.97 + 5.98
B & C    5.75    7.51       3.05 + 3.96
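To compare the simultaneous runs with the one-way numbers, it helps to add up the two directions. A small sketch of that arithmetic, using only the figures from the table above (the function name bidir_totals is made up for illustration):

```shell
# Sum the two directions of each simultaneous (-d) run.
# Input: tab-separated zone pair, send Gbit/s, receive Gbit/s.
bidir_totals() {
  awk -F'\t' '{ printf "%s\t%.2f\n", $1, $2 + $3 }'
}

# The "Send + Receive" column from the table, one field per argument.
printf '%s\t%s\t%s\n' \
  'A & A' 8.12 10.1 \
  'A & B' 4.57 6.30 \
  'A & C' 3.97 5.98 \
  'B & C' 3.05 3.96 | bidir_totals
```

The combined totals come out at 18.22, 10.87, 9.95 and 7.01 Gbit/s respectively.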

Conclusion

I remember reading in the GCP docs that their zones are kilometers away from each other. Yet, according to the quick tests above, they can still be treated as one huge 10 Gbit LAN, which is pretty impressive.