Provision, Install, and Configure the Load Balancer


We now interrupt our regularly scheduled provisioning to focus on the Load Balancer.

As mentioned at the start of this tutorial, you will need (at least) two servers to complete this process. Alternatively, you could use one server and an Amazon Network Load Balancer, which won't be covered in this tutorial series, but is a viable option nonetheless.

Provisioning the Rancher 2 load balancer involves a very similar procedure to that which we have already covered. We will need:

  • The common role
  • A custom nginx role
  • A new playbook, group, and group_vars

Yes, I am opting for group_vars instead of host_vars.
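To put that in context, here is a rough sketch of the files we will be creating or touching in this part, assuming the project layout from earlier in the series:

.
├── Makefile
├── production
├── requirements.yml
├── site.yml
├── rancher-2-kubernetes-node.yml
├── rancher-2-load-balancer.yml        # new playbook
└── group_vars/
    └── rancher-2-load-balancers.yml   # new group_vars file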

Whilst we won't be covering this here, you could - and perhaps should - have multiple load balancers, mitigating the single point of failure. It's all well and good having a highly available Kubernetes cluster, but if the $2.50 VPS in front of it all suddenly dies... well, you're done-diddly-done-for.

One point of note is that we will be running a full VPS here, with NGINX installed on that VPS. An alternative approach is to run a VPS with Docker, and run NGINX inside a Docker container. This adds complexity (one extra layer that can go wrong, and needs managing / monitoring), and in my opinion we don't gain anything from doing that here.

The nginx Load Balancer Config

Rancher docs provide a sample config, and some versioning information.

They list the tested versions of NGINX as 1.14 and 1.15.

At the time of writing / recording, NGINX 1.14.2 and 1.15.8 are the latest stable tags. We will opt for 1.15.8. This will add a minor complication, which we will need to address along the way.

The example config given is as follows:

worker_processes 4;
worker_rlimit_nofile 40000;

events {
    worker_connections 8192;
}

http {
    server {
        listen         80;
        return 301 https://$host$request_uri;
    }
}

stream {
    upstream rancher_servers {
        least_conn;
        server <IP_NODE_1>:443 max_fails=3 fail_timeout=5s;
        server <IP_NODE_2>:443 max_fails=3 fail_timeout=5s;
        server <IP_NODE_3>:443 max_fails=3 fail_timeout=5s;
    }
    server {
        listen     443;
        proxy_pass rancher_servers;
    }
}

The 'trick' for us is to replace the rancher_servers entries with our own k8s cluster node public IP addresses. No hardcoding though :)

The Load Balancer Playbook

We need a new Ansible playbook for our NGINX load balancer.

touch rancher-2-load-balancer.yml

Into which I will add:

---
- name: Rancher 2 Load Balancers
  hosts: rancher-2-load-balancers
  roles:
     - codereviewvideos.common
     - geerlingguy.firewall

We'll also need to update the master playbook, site.yml:

---
- import_playbook: rancher-2-kubernetes-node.yml
- import_playbook: rancher-2-load-balancer.yml

And we'll add in a new entry to the production file, listing out our new VPS:

rk8s-lb-1 ansible_host=5.9.117.211 ansible_python_interpreter=/usr/bin/python3
rk8s-node-1 ansible_host=6.10.118.222 ansible_python_interpreter=/usr/bin/python3

[rancher-2-kubernetes-nodes]
rk8s-node-1

[rancher-2-load-balancers]
rk8s-lb-1

This assumes that your load balancer VPS is pre-configured with your SSH key, allowing you to log in as root, much like for our rk8s-node-1 host.
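If you'd like to sanity check SSH connectivity before running anything heavier, an ad hoc Ansible ping - using the same Docker image we use for playbook runs, and assuming the example IP addresses above and the connection settings from earlier in the series - looks something like this:

docker run --rm \
        -v $(pwd):/crv-ansible \
        -v ~/.ssh/id_rsa:/root/.ssh/id_rsa \
        -v ~/.ssh/id_rsa.pub:/root/.ssh/id_rsa.pub \
        -w /crv-ansible \
        williamyeh/ansible:alpine3 \
        ansible -i production rancher-2-load-balancers -m ping

A "pong" back means Ansible can reach the new VPS, and we're good to continue.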

Basic Firewall Entries

The playbook for our Load Balancer so far includes the common role, and the firewall.

Much like for our Kubernetes nodes, we will want a general set of firewall rules that apply to every load balancer we provision.

Here's my basic setup:

# group_vars/rancher-2-load-balancers.yml
---
firewall_allowed_tcp_ports:
  - "22"
  - "80"
  - "443"

At this stage we should be able to run a basic provision against our load balancer server.

We could do this by running make run_playbook, but that will run the playbooks for all our infrastructure. Sometimes that's what we want. But when starting out it's nice to limit the run to the specific host, or group of hosts, that we wish to target.

In order to do this using our Docker and Ansible stack, we need a long-winded command:

docker run --rm \
        -v $(CURDIR):/crv-ansible \
        -v ~/.ssh/id_rsa:/root/.ssh/id_rsa \
        -v ~/.ssh/id_rsa.pub:/root/.ssh/id_rsa.pub \
        -w /crv-ansible \
        williamyeh/ansible:alpine3 \
        ansible-playbook -i production site.yml \
        --limit rancher-2-load-balancers

In other words, run the full ansible-playbook command as before, but limit the run to just the hosts in the rancher-2-load-balancers group. You can find out more about limiting the hosts, or groups of hosts, that your Ansible playbook runs will target in the docs.
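For what it's worth, --limit accepts host patterns as well as plain group names. A couple of illustrative variations on the ansible-playbook part of that command:

# just the one host
ansible-playbook -i production site.yml --limit rk8s-lb-1

# or two groups in a single run
ansible-playbook -i production site.yml --limit 'rancher-2-load-balancers:rancher-2-kubernetes-nodes'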

That's cool, and it works, but we don't want to have hardcoded entries inside our Makefile, nor run long-winded commands like that every time we want to provision. At least, I certainly don't.

Therefore I will adapt the Makefile entry further:

run_playbook:
    @docker run --rm \
        -v $(CURDIR):/crv-ansible \
        -v ~/.ssh/id_rsa:/root/.ssh/id_rsa \
        -v ~/.ssh/id_rsa.pub:/root/.ssh/id_rsa.pub \
        -w /crv-ansible \
        williamyeh/ansible:alpine3 \
        ansible-playbook -i production site.yml $(cmd)

Now we can optionally pass any additional flags through the cmd=... argument. Or in summary:

# run every playbook with:
make run_playbook

# or run more granular commands with:
make run_playbook cmd="--limit rancher-2-load-balancers"
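And cmd isn't limited to --limit; any extra ansible-playbook flags can be passed through in the same way. For example:

# a more verbose run, limited to just the load balancers
make run_playbook cmd="--limit rancher-2-load-balancers -vvv"

# or a dry run - though be aware some tasks behave differently in check mode
make run_playbook cmd="--limit rancher-2-load-balancers --check"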

Nice.
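Once the playbook has run against the load balancer, it's also worth a quick check that the firewall rules actually landed. A minimal check, assuming the example IP address from the production file and the role's default iptables-based setup:

ssh root@5.9.117.211 'iptables -S INPUT | grep dport'

You should see ACCEPT rules covering ports 22, 80, and 443.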

Adding NGINX

Adding NGINX will be a two-step process.

Much like in previous tutorials, we will be leveraging existing Ansible community / galaxy roles to do the vast majority of the hard work.

Here, however, we hit a slight snag.

Rancher's documentation suggests we install NGINX v1.14.x or v1.15.x. At the start we said we'd opt for v1.15.8, as that's the latest and therefore greatest at the time of recording.

NGINX do have an official Ansible role, but I haven't been able to figure out how to replicate the Rancher example NGINX config with that role alone. If you know, then please shout up and I'll happily alter my approach.

There is a community role that will allow us to get the config we need, and that's the jdauphant.nginx role, which I have been using for a few years now.

In short we will install NGINX using the official role, and configure NGINX using the community role. Sounds a bit weird, but it works.

Rather than me showing you two commands to install each role individually, here's the requirements.yml file at this point in time:

---
- src: geerlingguy.docker
  version: 2.5.2

- src: geerlingguy.firewall
  version: 2.4.1

- src: jdauphant.nginx
  version: v2.21.2

- src: nginxinc.nginx
  version: 0.11.0

- src: singleplatform-eng.users
  version: v1.2.5

The two new entries both mention nginx, and are hopefully immediately obvious. Once you have those in there, run make install_roles, and you should be up to speed.
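If you'd rather not use the Makefile target, the roles can also be installed directly with ansible-galaxy. A minimal sketch using the same Docker image, installing into a roles/ directory inside the project (adjust the path to match however your roles path is configured):

docker run --rm \
        -v $(pwd):/crv-ansible \
        -w /crv-ansible \
        williamyeh/ansible:alpine3 \
        ansible-galaxy install -r requirements.yml -p roles/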

Installing NGINX with Ansible

We will use the official NGINX Ansible repository to install NGINX v1.15.8 on our load balancer(s).

In order to do this we just need to add the new role to our rancher-2-load-balancer.yml file:

---
- name: Rancher 2 Load Balancers
  hosts: rancher-2-load-balancers
  roles:
     - codereviewvideos.common
     - geerlingguy.firewall
     - nginxinc.nginx

Really there's not a lot more to it than this. The default configuration will mean that running this playbook does little more than install the latest version of NGINX.

All the configuration will be done by jdauphant.nginx, which we haven't yet configured.

After running the playbook, nginx should be installed:

root@your-remote-server:~# nginx -v
nginx version: nginx/1.15.8

Cool.

Now, let's configure NGINX.

Configuring NGINX with Ansible

To configure NGINX we will use the jdauphant.nginx role with the following configuration:

# group_vars/rancher-2-load-balancers.yml

---
firewall_allowed_tcp_ports:
  - "22"
  - "80"
  - "443"

nginx_installation_type: configuration-only

nginx_worker_rlimit_nofile: 40000

nginx_events_params:
  - worker_connections 8192

nginx_sites:
  server:
    - listen 80
    - return 301 https://$host$request_uri

suppress_default_site: true

nginx_stream_configs:
  upstream:
    - upstream rancher_servers {
        least_conn;
        {% for host in groups['rancher-2-kubernetes-nodes'] %}
          server {{ hostvars[host]['ansible_host'] }}:443 max_fails=3 fail_timeout=5s;
        {% endfor %}
      }
  server:
    - server {
        listen 443;
        proxy_pass rancher_servers;
      }

All of this config is important, but if we don't set nginx_installation_type: configuration-only then the jdauphant.nginx role will remove our existing install, and install 1.10.3.
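A quick way to confirm nothing has been quietly downgraded after a provisioning run - again assuming the example load balancer IP - is to check the version and test the config in one go:

ssh root@5.9.117.211 'nginx -v && nginx -t'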

The nginx_worker_rlimit_nofile and nginx_events_params come straight from the Rancher example config.

A basic site, named server - again because of the Rancher example config - follows the jdauphant.nginx documentation pretty much one-to-one.

I've added in suppress_default_site, simply because I do not want the default site to be enabled after the playbook has run. It shouldn't be, but this makes absolutely certain that it will not be.

More interesting is the nginx_stream_configs config.

I'll address these back to front.

Following the Rancher 2 Example config we need a server entry that listens for https (443) connections and passes them on to an upstream group called rancher_servers. An upstream group is an NGINX directive that allows us to configure a named group of servers to which we can distribute incoming requests.

The confusing thing here is the naming - server inside server. The outer server is just the YAML key, which ends up as the name of the generated config file, whilst the inner server { ... } is the actual NGINX directive. So if you wanted to avoid the duplication, you could go with something like:

  somethingelse:
    - server {
        listen 443;
        proxy_pass rancher_servers;
      }

But I want to mirror the example config as closely as possible.

Certainly the most interesting piece is in the upstream config.

The name of the upstream exactly matches that which we proxy_pass to in the server block.

Using a for loop inside group_vars is not something that I've found in the official documentation. It might be there; it's possible I have missed it. But it's not a well-publicised feature, as best I can tell.

The aim is to loop over all the hosts in the rancher-2-kubernetes-nodes group, and use the IP address given as the ansible_host.
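If you want to double check exactly which hosts that loop will pick up, Ansible will happily list the members of a group for you - again using the same Docker image:

docker run --rm \
        -v $(pwd):/crv-ansible \
        -w /crv-ansible \
        williamyeh/ansible:alpine3 \
        ansible -i production rancher-2-kubernetes-nodes --list-hosts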

This isn't the most dynamic, or most customisable, solution. If you have a more complex setup, whereby you're connecting with Ansible via a different IP address to that which will be the server's public IP, then this process will need adjusting.
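As a purely illustrative example - the public_ip variable name here is made up, not something Ansible provides - you could add an extra variable per host in the production file, and loop over that instead of ansible_host:

# production - hypothetical extra variable per node
rk8s-node-1 ansible_host=10.0.0.5 public_ip=6.10.118.222 ansible_python_interpreter=/usr/bin/python3

# group_vars - hypothetical, referencing that variable instead
server {{ hostvars[host]['public_ip'] }}:443 max_fails=3 fail_timeout=5s;

In our case though, ansible_host already is the public IP, so the simpler approach works just fine.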

Essentially Ansible will look to our production file:

rk8s-lb-1 ansible_host=5.9.117.211 ansible_python_interpreter=/usr/bin/python3
rk8s-node-1 ansible_host=6.10.118.222 ansible_python_interpreter=/usr/bin/python3
rk8s-node-2 ansible_host=7.11.119.233 ansible_python_interpreter=/usr/bin/python3

[rancher-2-kubernetes-nodes]
rk8s-node-1
rk8s-node-2

[rancher-2-load-balancers]
rk8s-lb-1

It's going to see all the servers in the rancher-2-kubernetes-nodes group, look at each ansible_host entry for that given hostname, and then use that as the IP address to add to our resulting /etc/nginx/conf.d/stream/upstream.conf.

For every server in that group, a new line will be added. Ultimately we should end up with something like this:

#Ansible managed

upstream rancher_servers {
     least_conn;
     server 6.10.118.222:443 max_fails=3 fail_timeout=5s;
     server 7.11.119.233:443 max_fails=3 fail_timeout=5s;
     # ... and so on
}
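If you want to confirm what actually landed on the load balancer itself - assuming the example IP again - you can cat the generated file over SSH:

ssh root@5.9.117.211 'cat /etc/nginx/conf.d/stream/upstream.conf'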

Pretty cool.

That's the basic load balancer config up and running.
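Before we wrap up, a couple of quick smoke tests from your local machine - assuming the example load balancer IP, and that at least one of the Rancher nodes is already up and serving on port 443 - might look like this:

# port 80 should answer with a redirect to https
curl -I http://5.9.117.211

# port 443 is passed straight through to the Rancher nodes, so the
# certificate belongs to them, hence -k to skip verification for now
curl -kI https://5.9.117.211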

In the next Rancher 2 / Kubernetes tutorial video we will take a short breather whilst we tidy up our config.
