Sunday, April 17, 2016

Kubernetes cluster access by fixed IP

If you:

  • Have a Kubernetes cluster running in GKE
  • Have connected GKE to your company network through a VPN
  • Are puzzled about how to assign a fixed IP to a particular k8s service

Then read on.

Prologue

The ideal solution would be to configure the k8s service to use a GCP LoadBalancer and have the latter provide a private IP only. However, as of April 2016, LoadBalancers on GCP do not offer a private-IP-only option, though GCP solution engineers said this feature "is coming".

Therefore the only option we have is to run a dedicated VM with a fixed IP and proxy traffic through it.

The approach

A Kubernetes service itself provides two relevant ways to access the pods behind it:
ClusterIP
By default, every service has a virtual ClusterIP (which can be manually set to a predefined address) that can be used to access the pods behind the service. However, for this to work, a client has to have kube-proxy running on its host, as explained here.
NodePort
A k8s service can be configured to expose a certain port on every k8s node, which will be redirected to the service's pods (this comes on top of ClusterIP).

The ClusterIP approach is obviously not feasible outside the k8s cluster, so we are left with the NodePort approach only. The problem is that k8s node IPs are not static and may change. That's why we need a dedicated VM that has a fixed IP.

After we have a VM, we can either

  • Join it to the k8s cluster, so the service's NodePort will be exposed on the VM's fixed IP as well.
  • Run a reverse HTTP proxy on the VM to forward traffic to the k8s nodes, together with a script that monitors the k8s nodes and updates the proxy configuration when necessary.

I chose the second option because it allows a single VM to proxy requests for multiple k8s clusters and is easier to set up.
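
The "updates proxy configuration when necessary" part can be illustrated with a small Python sketch: rewrite the config file only when its content actually changed, so the proxy is reloaded only when needed. This is an illustration under my own assumptions, not the code of the watcher script used later; the function name write_if_changed and the 0644 file mode are hypothetical choices.

```python
import os
import tempfile


def write_if_changed(path, new_content):
    """Write new_content to path only if it differs from what is there.

    Returns True when the file was (re)written, i.e. when the proxy
    would need a reload, and False when nothing changed.
    """
    try:
        with open(path) as f:
            if f.read() == new_content:
                return False
    except FileNotFoundError:
        pass  # first run: there is no config file yet

    # Write to a temporary file in the same directory and rename it over
    # the target, so the proxy never sees a half-written config.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(new_content)
    os.chmod(tmp_path, 0o644)  # assumed mode; mkstemp creates the file as 0600
    os.replace(tmp_path, path)
    return True
```

A caller would then run something like nginx -s reload only when the function returns True.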

The setup

Create an instance

Let's create a VM and assign it a static IP. What follows is my interpretation of the official guide.

Create an instance first:

gcloud compute instances create fixed-ip-proxy --can-ip-forward
The last switch is crucial here.

I chose 10.10.1.1 as the IP for my testing cluster. Let's add it to the instance:

sudo tee -a /etc/network/interfaces.d/eth0-0 <<EOF
auto eth0:0
iface eth0:0 inet static
address 10.10.1.1
netmask 255.255.255.255
EOF

Now edit /etc/network/interfaces and make sure that the source-directory /etc/network/interfaces.d line comes last. Apply your new configuration by running:

sudo service networking restart

The final step is to instruct GCE to forward traffic destined for 10.10.1.1 to the new instance:

gcloud compute routes create fixed-ip-production \
                                --next-hop-instance fixed-ip-proxy \
                                --next-hop-instance-zone us-central1-b \
                                --destination-range 10.10.1.1/32

To add more IPs (a dedicated IP per cluster is good practice), add another file under /etc/network/interfaces.d/ and another GCE route.
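
Since every extra IP needs the same two pieces (an alias stanza and a route), they can be generated rather than typed. The Python sketch below only builds the text and the command line shown earlier in this section; the helper names, the second IP 10.10.1.2 and the route name fixed-ip-staging are made up for illustration.

```python
def alias_stanza(index, address):
    """Return an interfaces stanza for the alias eth0:<index>."""
    return (
        "auto eth0:{i}\n"
        "iface eth0:{i} inet static\n"
        "address {addr}\n"
        "netmask 255.255.255.255\n"
    ).format(i=index, addr=address)


def route_command(route_name, instance, zone, address):
    """Return the gcloud argument list that routes address/32 to instance."""
    return [
        "gcloud", "compute", "routes", "create", route_name,
        "--next-hop-instance", instance,
        "--next-hop-instance-zone", zone,
        "--destination-range", address + "/32",
    ]


if __name__ == "__main__":
    # Hypothetical second IP for a staging cluster.
    print(alias_stanza(1, "10.10.1.2"), end="")
    print(" ".join(route_command("fixed-ip-staging", "fixed-ip-proxy",
                                 "us-central1-b", "10.10.1.2")))
```

The stanza would go into its own file under /etc/network/interfaces.d/, and the printed command can be reviewed before running it.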

NGINX configuration

Install NGINX:
sudo apt-get install nginx

Install Google Cloud Python SDK:

sudo easy_install pip
sudo pip install --upgrade google-api-python-client

Now download the IP watcher script:

sudo wget -O /root/nginx-ip-watch https://gist.githubusercontent.com/haizaar/f19bdf9e5a6e278c57b96cce945b4fd9/raw/79f11225825607ba78ba84221d27439c1669a492/nginx-ip-watch
sudo chmod 755 /root/nginx-ip-watch

NOTE: You are downloading my script that will run as root on your machine - read its contents first!

Test the script:

$ sudo /root/nginx-ip-watch -h 
usage: Watch GKE node IPs for changes [-h] -p PROJECT -z ZONES
                                      name gke-prefix listen-ip listen-port
                                      target-port

positional arguments:
  name                  Meaningful name of your forwarding rule
  gke-prefix            GKE node prefix to monitor and forward to
  listen-ip             IP listen on
  listen-port           Port to listen on
  target-port           IP listen on

optional arguments:
  -h, --help            show this help message and exit
  -p PROJECT, --project PROJECT
                        Project to list instances for
  -z ZONES, --zones ZONES
                        Zones to list instances for

Now let's set up NGINX to listen for HTTP traffic on 10.10.1.1:5601 and forward it to the GKE testing cluster nodes on port 30601 by adding the following to /etc/cron.d/nginx-ip-watch:

PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

* * * * * root /root/nginx-ip-watch kibana-testing -p my-project -z us-central1-a gke-testing 10.10.1.1 5601 30601

After that, within one minute, your forwarding should be up and running. For more services, just keep adding lines to the cron file. This will work well for a dozen or so services. Beyond that, I would refactor the solution to issue only one gcloud compute instances list command per minute.
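
The refactoring mentioned above could look roughly like the following Python sketch: list the instances once, then derive the node IPs for every forwarding rule from that single result. It assumes google-api-python-client is installed and application default credentials are available on the VM. The function names, the rules mapping, and the choice to keep only RUNNING instances and their internal IPs are my assumptions for illustration, not a description of the watcher script.

```python
def list_instances(project, zones):
    """Fetch all instances in the given zones in a single pass."""
    from googleapiclient import discovery  # imported lazily; only needed here

    compute = discovery.build("compute", "v1")
    instances = []
    for zone in zones:
        request = compute.instances().list(project=project, zone=zone)
        while request is not None:  # follow result pages, if any
            response = request.execute()
            instances.extend(response.get("items", []))
            request = compute.instances().list_next(
                previous_request=request, previous_response=response)
    return instances


def node_ips(instances, prefix):
    """Return sorted internal IPs of running instances named prefix*."""
    ips = []
    for inst in instances:
        if not inst["name"].startswith(prefix):
            continue
        if inst.get("status") != "RUNNING":
            continue  # assumption: skip nodes that are not up
        ips.append(inst["networkInterfaces"][0]["networkIP"])
    return sorted(ips)


def upstreams_for_rules(instances, rules):
    """Map each rule name to its node IPs using one instance listing.

    rules is a dict of {rule name: GKE node prefix}.
    """
    return {name: node_ips(instances, prefix)
            for name, prefix in rules.items()}
```

With this shape, a single cron entry could call list_instances once and feed the result to every rule, instead of one listing per cron line.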

Since we are using NGINX in load-balancer mode, checking GKE hosts only once a minute is good enough even during cluster upgrades: NGINX will itself detect and blacklist a GKE node that is shutting down.
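
To make the load-balancer mode concrete, here is a Python sketch that renders the kind of NGINX configuration this relies on: an upstream block with one server line per GKE node, plus a server block listening on the fixed IP. The max_fails and fail_timeout parameters are NGINX's passive failure detection, and the values shown are NGINX's defaults. The function name and the exact layout are my assumptions; the watcher script may generate something different.

```python
def render_nginx_conf(name, listen_ip, listen_port, node_ips, target_port,
                      max_fails=1, fail_timeout="10s"):
    """Render an upstream + server pair that balances across node_ips."""
    if not node_ips:
        # NGINX rejects an upstream block without servers.
        raise ValueError("no GKE nodes to forward to")

    servers = "".join(
        "    server {ip}:{port} max_fails={mf} fail_timeout={ft};\n".format(
            ip=ip, port=target_port, mf=max_fails, ft=fail_timeout)
        for ip in node_ips
    )
    return (
        "upstream {name} {{\n"
        "{servers}"
        "}}\n"
        "\n"
        "server {{\n"
        "    listen {listen_ip}:{listen_port};\n"
        "    location / {{\n"
        "        proxy_pass http://{name};\n"
        "    }}\n"
        "}}\n"
    ).format(name=name, servers=servers,
             listen_ip=listen_ip, listen_port=listen_port)


if __name__ == "__main__":
    # Node IPs here are made up; the other values mirror the cron example.
    print(render_nginx_conf("kibana-testing", "10.10.1.1", 5601,
                            ["10.128.0.2", "10.128.0.3"], 30601), end="")
```

When a node stops answering, NGINX stops sending it requests for the fail_timeout period and retries it afterwards, which covers the gap until the next cron run removes the node from the list.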

Epilogue

Create a snapshot of your instance every time you change it to keep a backup of your work. Don't forget to run the sync command on the system before taking a snapshot of the disk.

Update

  • The first version of my script used the gcloud command-line utility to fetch the instance list. It turned out that gcloud logs to ~/.config/gcloud/logs and writes 500KB on every invocation. To mitigate this, I've updated my script to use the Google Cloud Python SDK and bypass gcloud completely.
  • As Vadim points out below, you can now specify a fixed internal IP at instance creation time. You'll still need the setup above, though, if you want more than one IP per instance.