What is Container Orchestration?

Over the last two or three years I’ve given a similar presentation on containers to operations groups at clients, potential clients, conferences and meetups. Generally, they’re just getting started with containers and are wondering what orchestration is and how it impacts them. In this post, I will talk about what container orchestration is and provide several videos with simple examples of what it means.

Application Deployment the Old Way

A decade or two ago, when we had a new application to deploy into production, we would:

  • Order hardware;
  • install and configure the operating system from a runbook; and
  • install the application from a runbook.

Then, when virtualization came along, this process got a little easier.

  • Order a virtual machine with operating system installed;
  • configure the operating system from a runbook; and
  • install the application with a shell script.

Finally, we started using system configuration management tools like Chef, Puppet or Ansible.

  • Request a virtual machine with operating system installed;
  • configure the operating system and install the application using a configuration management tool.

We’d repeat this error-prone process for hardware, operating system and application updates. If we wanted the application to be highly available, or to scale it, we’d repeat the process for 1, 2, 3, … more servers. Plus, we’d have to worry about storage and networking.

And, if we wanted geographical redundancy and/or disaster recovery, we’d repeat it all in another location, where we’d also have to worry about networking and firewalls between sites.

Finally, we’d repeat this over and over again as we maintained this application over the years. And, we’d be doing similar things for all of our other applications.

What is Container Orchestration?

Just as virtual machines abstracted the server hardware, and configuration management tools abstracted the operating system, container orchestration abstracts the datacenter, giving us a virtual, software-defined datacenter.

One of my colleagues said “Container orchestration makes high availability, elastic scalability, non-disruptive upgrades, and disaster recovery available for free.” Does this sound too good to be true? It isn’t.

Container orchestration provides the following (non-exhaustive) list of features.

  • Declarative configuration;
  • Placement and scheduling;
  • Scaling;
  • Updates and rollbacks;
  • Health monitoring;
  • Networking and service discovery;
  • Volumes, secrets and configurations;
  • Load balancing and external access;
  • Role-based access control;
  • Collections, namespaces, labels and metadata; and
  • Node and cluster management.

I’m not going to dig into each of these here, but further down we’ll talk about and demonstrate some of them.
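To make “declarative configuration” concrete, here’s a minimal sketch of a Swarm stack file that declares the same kind of nginx service we’ll build with individual CLI commands below (the file and stack names are placeholders I’ve chosen for illustration):

$ cat > nginx-stack.yml <<'EOF'
version: "3.7"
services:
  nginx:
    image: nginx:1.14
    ports:
      - "8081:80"          # publish container port 80 as port 8081
    deploy:
      replicas: 2          # desired state: two copies of this container
      placement:
        constraints:
          - node.labels.cloud == private
EOF
$ docker stack deploy -c nginx-stack.yml demo

You declare the state you want; the orchestrator does whatever is necessary to converge the cluster to that state.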

Container Orchestrators

The two most popular container orchestrators are Swarm and Kubernetes. They both provide all of the features above in one form or another. Swarm is much easier to set up, maintain and use. Kubernetes is much more configurable and, while it’s much harder to set up, maintain and use, it definitely has the majority of the mindshare and press coverage. The good news is, you don’t have to make a choice and, potentially, paint yourself into a corner. Docker Enterprise provides both out of the box and makes it easy to use either. In particular, there’s no need to install Kubernetes the Hard Way.

Demonstrations

In all of the demonstrations that follow, we’re using the same cluster, and each demonstration starts where the previous one left off. The cluster was built with Docker Enterprise and has one UCP (Universal Control Plane) manager node, one DTR (Docker Trusted Registry) node and four worker nodes. Two of the worker nodes have a label of cloud=private and the other two have cloud=public to simulate two locations in a hybrid cloud.
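For reference, labels like these are applied to each node by an administrator ahead of time. A sketch of how one of the private cloud nodes might have been labeled:

$ docker node update --label-add cloud=private ip-172-30-11-114.us-east-2.compute.internal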

I’m going to use Swarm for these demonstrations because it’s easier and clearer but all of them can be done with Kubernetes as well. In fact, in my next blog post I’ll do the same demonstrations using Kubernetes.

Demonstrate Deploying, Scaling and Upgrading

We’ll start by creating a service using the official NGINX 1.14 image. This service will have two replicas and be constrained to nodes in our private cloud. In addition, we’ll publish the nginx container’s internal port 80 as port 8081.

$ docker service create --name=nginx --publish=8081:80/tcp --constraint node.labels.cloud==private --replicas 2 nginx:1.14

Note that the two nginx containers are running on the two nodes with the private cloud label.
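If you’re following along without the video, you can confirm the placement yourself; docker service ps lists each replica along with the node it was scheduled on.

$ docker service ps nginx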

Next, we’ll scale this service from two to four replicas.

$ docker service update --replicas 4 nginx

We now have four nginx containers running on the private cloud nodes.
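As an aside, Swarm also offers a dedicated scale command that is equivalent to the replica update above:

$ docker service scale nginx=4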

Now, let’s remove the private cloud constraint from the service.

$ docker service update --constraint-rm node.labels.cloud==private nginx

Notice that nothing changes. The four nginx containers remain where they are. The orchestrator won’t make any changes since the current state matches the declared state. It only makes a change if those two differ.
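You can inspect the declared state the orchestrator is reconciling against at any time:

$ docker service inspect --pretty nginx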

Let’s make a change that causes the orchestrator to take action. We’ll update the service to have eight replicas.

$ docker service update --replicas 8 nginx

The four new containers are started on the public cloud nodes. The orchestrator made this choice primarily based on resource utilization: those nodes weren’t running anything yet, so the scheduler placed the new replicas there.

Finally, let’s upgrade NGINX from 1.14 to 1.15.

$ docker service update --image nginx:1.15 nginx

There are a lot of options available to control the upgrade process, e.g. whether to start a new container before stopping the old one, the number of containers to upgrade at a time, the delay between container upgrades, and what to do in the event of a failure. In this case we’re using the defaults, which are to stop an old container before starting its replacement and to upgrade one container at a time.
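As a sketch, a more cautious upgrade might look like the following; the flag values here are illustrative, not recommendations:

$ docker service update \
    --update-order start-first \
    --update-parallelism 2 \
    --update-delay 10s \
    --update-failure-action rollback \
    --image nginx:1.15 nginx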

Demonstrate Failures

We’ll start our demonstrations of failure scenarios by rolling back NGINX to the previous version.

$ docker service update --rollback nginx

Again, this rollback proceeds just like the previous upgrade did: one container at a time.

We’ll demonstrate what happens when a container fails by killing a container; in this case, the one on the upper right.

$ docker container kill bb83c3dcc863

The orchestrator sees that the current state, 7 replicas, doesn’t match the declared state, 8 replicas, so it starts a new one.
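Running docker service ps at this point shows both the failed task and its replacement, since Swarm keeps a short task history for each slot. To list only the live replicas, you can filter on the desired state:

$ docker service ps --filter "desired-state=running" nginx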

To simulate a node failure, we’ll drain the worker node on the left, as we might do for planned maintenance. We could instead have simply powered off the node, and the orchestrator would have behaved the same way.

$ docker node update --availability drain ip-172-30-11-114.us-east-2.compute.internal

Again, the orchestrator starts the necessary containers on the other nodes.
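docker node ls confirms the picture: the drained node now shows an Availability of Drain, while the containers it was running have been rescheduled elsewhere.

$ docker node ls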

We could extend this demonstration to simulate a disaster recovery scenario by draining (or powering off) the other private cloud node with similar results.

Let’s bring that node back online.

$ docker node update --availability active ip-172-30-11-114.us-east-2.compute.internal

As before, the orchestrator doesn’t move any work back to the node, since the running state already matches the declared state.
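Swarm won’t rebalance on its own, but if we did want the replicas spread back across the reactivated node, forcing an update redeploys every task and lets the scheduler reconsider placement:

$ docker service update --force nginx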

Finally, we’ll remove the nginx service entirely.

$ docker service rm nginx

Summary

We’ve seen some of the ways container orchestration makes it easier for an operations or DevOps team (or, in many cases today, a CI/CD pipeline) to deploy applications into production. There are a lot more options and features available to you. If you want or need help, Capstone IT is a Docker Premier Consulting Partner as well as being an Azure Gold and AWS Select partner. If you are interested in finding out more and getting help with your Container, Cloud and DevOps transformation, please Contact Us.

Ken Rider
Solutions Architect
Capstone IT