In the previous article we discussed reasons why you might want the ability to scale your apps up and down and the different resources needed by different types of web app. We also discussed the use cases for different app architecture styles. How do we actually put this theoretical knowledge into practice?
Enter Kubernetes. Kubernetes is the leading solution for deploying web apps at scale, and can be used for a wide variety of system administration tasks, from a simple deploy with one container to vastly more complex setups with automatically scaling components. Unfortunately, with this power comes a lot of complexity. We'll try to cut through much of that complexity today and give you an idea of how to set up your apps with Kubernetes.
- Get the transcoderapp code from the web-105 git repository.
- Install kubectl from the Kubernetes website.
- Install minikube from the minikube website.
- Make sure docker is installed, if you didn't follow previous tutorials.
This tutorial will likely be easier to follow on Linux or macOS than on Windows-based systems.
It's recommended that you read the previous articles in the series to understand containers and machines, as otherwise Kubernetes will be very difficult to understand.
For this tutorial we're going to use Minikube. Minikube is a simple local Kubernetes environment which can be used to learn before moving on to a more production-friendly environment such as Azure, AWS, or a bespoke setup.
Once you've installed everything you can get started very easily by running minikube start followed by minikube dashboard.
This will install or start minikube, and then bring up an admin dashboard where you can view your current Kubernetes clusters. But wait! What the heck is a pod?
Much of the Kubernetes terminology can be confusing if it's the first time you've seen it, so we'll introduce these concepts in the next section.
Key Kubernetes Koncepts
Kubernetes comes from the Greek for "helmsman" or "captain". It has the same root as cybernetics, which is not a coincidence. In cybernetics, a control loop looks at existing inputs and attempts to reach a desired state. For example, an oven will check its current temperature before deciding whether or not to continue heating. If it's at temperature it will stop heating until it cools down to slightly below the target temperature, then switch back on. In the same way, Kubernetes uses control loops to move the cluster towards a desired target state.
A container in Kubernetes is the same as it is in Docker: images built with Docker run unchanged on Kubernetes, whichever container runtime the cluster uses. Minikube can use Docker itself as its runtime.
A node is a machine. This can be a real or virtual machine (think back to our first class) and makes a certain number of resources available to Kubernetes. As minikube runs on our local machine using either its own docker container or a virtual machine, it has only a single node.
A cluster is a collection of nodes which are managed by Kubernetes. As minikube has only one node, our cluster consists of just that node.
A pod is a collection of one or more containers running on a node. It's called a pod to recall a group of whales (i.e. Docker containers). Normally a pod holds a single container, but it can also hold several closely-coupled containers. A set of identical pods is called a replica set. In Kubernetes, we rarely deploy pods directly (unlike containers in Docker) because we want to be able to easily create and destroy them.
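To make the idea of closely-coupled containers concrete, here is a minimal sketch of a pod manifest with two containers sharing a volume. It is not part of the transcoder app: the names web-with-sidecar, log-tailer and shared-logs are hypothetical, and the public nginx and busybox images are used purely for illustration.

```yaml
# Illustrative only: a single pod holding two closely-coupled containers.
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar        # hypothetical name
spec:
  containers:
    - name: web
      image: nginx              # writes its logs to /var/log/nginx
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log/nginx
    - name: log-tailer          # hypothetical sidecar that follows the web log
      image: busybox
      command: ["sh", "-c", "tail -F /logs/access.log"]
      volumeMounts:
        - name: shared-logs
          mountPath: /logs
  volumes:
    - name: shared-logs
      emptyDir: {}              # scratch space that lives as long as the pod
```

Both containers are scheduled onto the same node and started and stopped together, which is what distinguishes a pod from two independent containers.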
We manage pods by using deployments. In Kubernetes, a deployment is declarative. This means that we don't tell Kubernetes how to move from one state to another (e.g. to reduce the number of pods for an app) but instead we tell it our desired final state and allow it to manage our resources to move to that state.
A service controls how Kubernetes resources, especially pods, are connected to the network. It gives a set of pods a stable address, and it can provide an entry point into our app from outside the Kubernetes cluster.
Finally, a load balancer lets you distribute requests from services across multiple different containers or pods. It's essential to make sure that no individual container is overwhelmed by requests.
There are many other types of resource available in Kubernetes. However, these are the most important to remember and the ones we will use in this tutorial.
The Transcoder App
Our transcoder app looks something like this. It's similar to the approach we used in our docker-compose tutorial with the reader and writer apps. The frontend sends requests to the API. This could be a request for information or a file upload. The API responds by writing to the volume or by returning information to the frontend.
The volume is just some file storage service. In Kubernetes we can use many types of volumes: in this tutorial we'll use the simplest hostPath type. Note that because we're using minikube, the hostPath points inside minikube, not to our real host machine.
The watcher app checks for changes to the volume every 0.5 seconds. In particular, it checks whether any video file (for example an .mp4 file) exists in its target directory without a counterpart in the other format. If it finds any files without a matching file of the other format, it uses ffmpeg to transcode them to the other format. If not, it just waits.
Deploying Our Apps
Deploying the Frontend
Note: at present the frontend is a static web page. This means that we can't pass environment variables (e.g. the URL of the API) to it, but you can still use it on your local machine to perform uploads easily. We're including the information on this deployment anyway, as it's the simplest and you may well have a need to deploy static sites with Kubernetes.
First, we need to point our shell at minikube's Docker environment:
eval $(minikube docker-env)
This lets us build docker images directly in minikube's built-in docker environment, without having to specify remote repositories.
Next, build the frontend image. This is a simple nginx image.
docker build . -t transcode-frontend
Next, we'll create a Kubernetes deployment from a deployment.yml file. We'll go through this file and break it down line-by-line in the next part.
kubectl create -f deployment.yml
Next, we'll expose the frontend so we can access it from our browser by automatically creating a load balancer. This publishes the container ports declared in deployment.yml.
kubectl expose deployment frontend --name=balancer --type=LoadBalancer
minikube service balancer --url gives you the URL of your frontend, and you can connect to it on the address given.
Explaining the deployment.yml file
Let's take a look at our deployment.yml file:
- name: frontend
- containerPort: 80
The first two lines just describe which Kubernetes API version and which type of resource we are creating. In this case it's a Deployment.
The metadata section contains a name and some labels. The name has to be unique across all Kubernetes resources of this type within a namespace. On the other hand, we can assign labels freely and use them to filter and apply rules to different resources. More on this below.
The spec section contains information on how to create this Deployment. It uses the selector field to choose which resources it wants to apply this spec to. In this case we're using the matchLabels selector to choose everything carrying the matching label. For more information on labels and selectors, see the Kubernetes documentation.
The replicas field says how many replicas of our pod (and so how many copies of our container) we want to run in this deployment. Behind the scenes, it creates a ReplicaSet resource, which is a slightly lower level type than a Deployment and can be created independently.
The template section specifies what we want each pod in our ReplicaSet to look like. Each has the same label as the deployment, and contains a single container.
This container has the name frontend. It uses the image transcode-frontend which we built earlier. The imagePullPolicy line prevents minikube from attempting to pull the image from a remote repository: because this is a local image, pulls will always fail. Finally, the ports section defines which ports should be made available for services to publish. You can add more information to this section, such as the hostPort to publish, but we are just providing information about the container itself.
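Putting the pieces described above together, a deployment.yml along the following lines would match the walkthrough. This is a reconstruction under stated assumptions rather than the repository's exact file: the app: frontend label and the Never pull policy are guesses consistent with the text, and the replica count of 3 is taken from the kubectl describe output.

```yaml
# Sketch of a frontend Deployment consistent with the explanation above.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
  labels:
    app: frontend               # assumed label; the real file may use another
spec:
  replicas: 3                   # number of identical pods to keep running
  selector:
    matchLabels:
      app: frontend             # must match the pod template's labels below
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: frontend
          image: transcode-frontend   # built locally inside minikube's Docker
          imagePullPolicy: Never      # assumed value; never try a remote pull
          ports:
            - containerPort: 80       # port nginx listens on in the container
```

Note that the selector and the template labels have to agree, otherwise the Deployment would not recognise the pods it creates as its own.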
You can use kubectl describe deploy frontend to get information about this deployment, which should reflect the information from our deployment.yml file:
CreationTimestamp: Sun, 14 May 2023 12:25:13 +0800
Annotations: deployment.kubernetes.io/revision: 1
Replicas: 3 desired | 3 updated | 3 total | 3 available | 0 unavailable
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Host Port: 0/TCP
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
NewReplicaSet: frontend-79d8dd6f87 (3/3 replicas created)
Likewise, you can use kubectl describe service balancer to check information about the load balancer we created:
IP Family Policy: SingleStack
IP Families: IPv4
Port: <unset> 80/TCP
NodePort: <unset> 32550/TCP
Session Affinity: None
External Traffic Policy: Cluster
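The kubectl expose command used earlier creates this service imperatively. The same thing can be written declaratively as a manifest, which may be easier to keep in version control. The sketch below is an assumed equivalent, not a file from the repository: in particular the app: frontend selector assumes that is the label on the frontend pods.

```yaml
# Sketch of a Service roughly equivalent to:
#   kubectl expose deployment frontend --name=balancer --type=LoadBalancer
apiVersion: v1
kind: Service
metadata:
  name: balancer
spec:
  type: LoadBalancer
  selector:
    app: frontend               # assumed label on the frontend pods
  ports:
    - protocol: TCP
      port: 80                  # port the service listens on
      targetPort: 80            # containerPort on the selected pods
```

The NodePort shown in the describe output is allocated by Kubernetes when the service is created, so it does not need to appear in the manifest.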
Deploying the API
Phew - that was a lot of information. Don't worry: we only need a little more to understand how the API works. First, let's build it and get it running:
eval $(minikube docker-env)
docker build . -t transcode-api
kubectl create -f deployment.yml
kubectl expose deployment api --name=api-balancer --type=LoadBalancer
minikube service api-balancer --url
If you go to the URL given by minikube you should get a simple JSON response.
Then, let's take a look at the deployment.yml file:
- name: api
- mountPath: /files
- name: FILE_PATH
- containerPort: 5000
- name: file-mount
In many ways this is identical to our previous file. However, it includes new sections for environment variables and volumes. If you recall our docker-compose class these should look pretty familiar.
Under spec.template.spec, in the container definition, we have the new volumeMounts and env sections:
- mountPath: /files
- name: FILE_PATH
These do basically the same as their equivalents in a docker-compose file:
- volumeMounts.mountPath says where to mount the volume inside the container.
- volumeMounts.name says which volume to use.
- env.name is the key for our environment variable in the container.
- env.value is the value.
Now let's look at the volumes section:
- name: file-mount
We name the volume we're creating, then specify its path on the host. Note that the host in this case is actually minikube, not your host machine. The /data/* directory is a special directory in minikube which always persists between restarts. The type is an optional field, but in this case we set it to DirectoryOrCreate to make sure that the /data/transcode/ directory always exists.
Note that hostPath is only one kind of volume available on Kubernetes, and is not recommended for production use. In production you are more likely to use something like AWS EBS or another type of cloud storage. For more information, see the Kubernetes documentation on volumes.
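For reference, the API deployment described in this section could look roughly like the sketch below. The names api, transcode-api, file-mount, FILE_PATH, /files, /data/transcode and port 5000 come from the text above. The app: api label, the single replica, the pull policy and the value of FILE_PATH are assumptions made for illustration.

```yaml
# Sketch of an API Deployment with an environment variable and a hostPath volume.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  labels:
    app: api                    # assumed label
spec:
  replicas: 1                   # assumed; the original count is not shown
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: transcode-api
          imagePullPolicy: Never      # assumed, as for the frontend
          volumeMounts:
            - mountPath: /files       # where the volume appears in the container
              name: file-mount        # which volume (declared below) to mount
          env:
            - name: FILE_PATH
              value: /files           # assumed value, matching the mount path
          ports:
            - containerPort: 5000
      volumes:
        - name: file-mount
          hostPath:
            path: /data/transcode     # lives inside minikube, not on your machine
            type: DirectoryOrCreate   # create the directory if it is missing
```

The volumeMounts entry and the volumes entry are tied together only by the shared name file-mount, so a typo in either place leaves the pod unable to start.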
Deploying the Watcher
Our final service in this app is the watcher. This waits for new files to be uploaded then transcodes them using ffmpeg. As ffmpeg uses all available system resources by default, we need to limit the resources available to prevent our computer crashing. In spec.template.spec.containers.resources of our deployment file, we set a limit on available resources.
This limits each container in our pod to using 1 CPU core and 1GB (1000MB) of memory. Note that we can also use other units: cpu: 500m specifies 500 millicores of compute (i.e. half a core), while memory: 1Gi would limit our memory usage to 1 gibibyte, which is 1024 mebibytes and a more traditional way to measure RAM usage.
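As a sketch of where these limits sit, the fragment below shows the container entry of a watcher deployment with the limits described above. The container name watcher and the image name transcode-watcher are assumptions, since the original file is not reproduced here.

```yaml
# Fragment of a Deployment manifest, found under spec.template.spec.
containers:
  - name: watcher                 # assumed container name
    image: transcode-watcher      # hypothetical image name
    imagePullPolicy: Never        # assumed, as for the other local images
    resources:
      limits:
        cpu: "1"                  # one full core; "500m" would be half a core
        memory: 1G                # 1000MB; 1Gi would be 1024 mebibytes
```

A container that exceeds its memory limit is killed and restarted, whereas one that hits its CPU limit is only throttled, so a CPU limit that is too low shows up as slow transcodes rather than crashes.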
Another thing to note about the watcher is that it has no service. As the watcher is purely a background process, it has no need to expose any ports to the outside.
You can play with these resource limits for the watcher to see which works best for your system by using kubectl delete deployment watcher and redeploying until you find a balance between low resource usage and transcode speed which works for you.
- Try converting an app you previously wrote in docker or with docker-compose to use Kubernetes-style deployment and run it on minikube. A good candidate for this might be the reader-writer demo from the docker-compose class - however, this requires more networking than we covered in this class.