An Intro to Kubernetes
In the previous article we discussed reasons why you might want the ability to scale your apps up and down and the different resources needed by different types of web app. We also discussed the use cases for different app architecture styles. How do we actually put this theoretical knowledge into practice?
Enter Kubernetes. Kubernetes is the leading solution for deploying web apps at scale, and can be used for a wide variety of system administration tasks, from a simple deployment of a single container to vastly more complex setups with automatically scaling components. Unfortunately, with this power comes a lot of complexity. We'll try to cut through much of that complexity today and give you an idea of how to set up your apps with Kubernetes.
Requirements
- The `transcoder` app from the web-105 git repository.
- Install kubectl from the Kubernetes website.
- Install minikube from the minikube website.
- Make sure Docker is installed, if you didn't follow the previous tutorials.
This tutorial will likely be easier to follow on Linux or macOS than on Windows-based systems.
It's recommended you read the previous articles in the series to understand containers and machines, as otherwise Kubernetes will be very difficult to understand.
Getting Started
For this tutorial we're going to use Minikube. Minikube is a simple local Kubernetes environment which can be used to learn before moving on to a more production-friendly environment such as Azure, AWS, or a bespoke setup.
Once you've installed everything you can get started very easily:
```bash
minikube start
minikube dashboard
```
This will install or start minikube, and then bring up an admin dashboard where you can view the resources running in your Kubernetes cluster.
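If you prefer the command line, you can also confirm the cluster is up with two standard commands (nothing project-specific here):

```bash
minikube status    # host, kubelet and apiserver should all report Running
kubectl get nodes  # should list a single node named minikube
```

Poke around the dashboard and you'll see pods, deployments, and services everywhere. But wait! What the heck is a pod?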
Much of the Kubernetes terminology can be confusing if it's the first time you've seen it, so we'll introduce these concepts in the next section.
Key Kubernetes Koncepts
Kubernetes comes from the Greek for "helmsman". It has the same root as cybernetics, which is not a mistake. In cybernetics, a control loop looks at existing inputs and attempts to reach a desired state. For example, an oven will check its current temperature before deciding whether or not to continue heating. If it's at temperature it will stop heating until it cools down to slightly below the target temperature, then switch back on. In the same way, Kubernetes uses control loops to reach a desired target state.
A container in Kubernetes is the same as it is in Docker, and Docker is the most common container runtime used with Kubernetes.
A node is a machine. This can be a real or virtual machine (think back to our first class) and makes a certain number of resources available to Kubernetes. As minikube runs on our local machine using either its own docker container or a virtual machine, it has only a single node.
A cluster is a collection of nodes which are managed by Kubernetes. Since minikube runs everything on your local machine, it gives you a cluster containing just that one node.
A pod is a collection of one or more containers running on a node. It's called a pod to recall a group of whales (i.e. Docker containers). Normally a pod holds a single container, though it can also hold a small set of closely-coupled containers that need to run together. Sets of identical pods are managed by a higher-level resource called a ReplicaSet, which we'll meet shortly. In Kubernetes, we rarely deploy pods directly (unlike containers in Docker) because we want to be able to easily create and destroy them.
We manage pods by using deployments. In Kubernetes, a deployment is declarative. This means that we don't tell Kubernetes how to move from one state to another (e.g. to reduce the number of pods for an app) but instead we tell it our desired final state and allow it to manage our resources to move to that state.
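For example, to scale an app down we don't tell Kubernetes which pods to kill; we just declare a new replica count and let it converge. A minimal sketch, assuming the `frontend` deployment we create later in this tutorial:

```bash
# Declare the desired state: one replica instead of three.
kubectl scale deployment frontend --replicas=1
# Kubernetes decides for itself which pods to terminate.
kubectl get pods
```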
A service controls how Kubernetes resources, especially pods, are connected to the network. It lets us create ingresses, which are entry points into our app from outside the Kubernetes cluster.
Finally, a load balancer lets you distribute requests from services across multiple different containers or pods. It's essential to make sure that no individual container is overwhelmed by requests.
There are many other types of resource available in Kubernetes: however, these are the most important to remember and the ones we will use in this tutorial.
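Nothing is deployed yet, but you can already list some of these resources in your fresh minikube cluster:

```bash
kubectl get nodes    # the cluster's single node
kubectl get pods -A  # for now, only Kubernetes' own system pods
```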
The Transcoder App
Our transcoder app is structured as follows. It's similar to the approach we used in our docker-compose tutorial with the reader and writer apps. The frontend sends requests to the API. This could be a request for information or a file upload. The API responds by writing to the volume or by returning information to the frontend.
The volume is just some file storage service. In Kubernetes we can use many types of volumes; in this tutorial we'll use the simplest, the `hostPath` type. Note that because we're using minikube, the `hostPath` points inside minikube, not to our real host machine.
The watcher app checks for changes to the volume every 0.5 seconds. In particular, it checks whether any `.mov` or `.mp4` files exist in its target directory which don't have a counterpart in the other format. If it finds any files without a matching file of the other format, it uses `ffmpeg` to transcode them to the other format. If not, it just waits.
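The actual watcher implementation lives in the web-105 repository; conceptually, though, it's just a polling loop. Here's a rough shell sketch of the idea (an illustration only, not the repository's code; it assumes the target directory is /files, matching the mount we'll set up later):

```bash
#!/bin/sh
# Poll the target directory; transcode any file missing its counterpart.
while true; do
  for f in /files/*.mov; do
    [ -e "$f" ] || continue                          # no .mov files at all
    [ -e "${f%.mov}.mp4" ] || ffmpeg -i "$f" "${f%.mov}.mp4"
  done
  for f in /files/*.mp4; do
    [ -e "$f" ] || continue
    [ -e "${f%.mp4}.mov" ] || ffmpeg -i "$f" "${f%.mp4}.mov"
  done
  sleep 0.5
done
```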
Deploying Our Apps
Deploying the Frontend
Note: at present the frontend is a static web page. This means that we can't pass environment variables (e.g. the URL of the API) to it, but you can still use it on your local machine to perform uploads easily. We're including this deployment anyway, as it's the simplest, and you may well need to deploy static sites with Kubernetes.
First, we need to load our shell into the minikube docker environment:
```bash
eval $(minikube docker-env)
```
This lets us build docker images directly in minikube's built-in docker environment, without having to specify remote repositories.
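Under the hood, `minikube docker-env` just prints a handful of environment variables that point your local docker CLI at the Docker daemon inside minikube. The exact values depend on your machine, but the output looks roughly like this:

```bash
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://192.168.49.2:2376"
export DOCKER_CERT_PATH="/home/user/.minikube/certs"
export MINIKUBE_ACTIVE_DOCKERD="minikube"
```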
Next, build the frontend image. This is a simple nginx image.
```bash
docker build . -t transcode-frontend
```
Next, we'll create a Kubernetes deployment from a deployment.yml file. We'll go through this file and break it down line by line in the next section.
```bash
kubectl create -f deployment.yml
```
Next, we'll expose the frontend so we can access it from our browser by automatically creating a load balancer. This publishes using the ports we specified in our deployment.yml (explained below).
```bash
kubectl expose deployment frontend --name=balancer --type=LoadBalancer
```
```bash
minikube service balancer --url
```

This gives you the URL of your frontend, and you can connect to it at the address given.
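Before moving on, it's worth confirming that the deployment really did create three pods behind the service:

```bash
kubectl get deployment frontend   # should show 3/3 pods ready
kubectl get pods -l app=frontend  # the three replicas, each with a generated name
```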
Explaining the deployment.yml file
Let's take a look at our `deployment.yml` file:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
  labels:
    app: frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: frontend
        image: transcode-frontend
        imagePullPolicy: Never
        ports:
        - containerPort: 80
```
The first two lines just describe which Kubernetes API version and which type of resource we are creating. In this case it's a `Deployment`.
The `metadata` section contains a `name` and some `labels`. The `name` has to be unique across all Kubernetes resources of this type in a cluster. On the other hand, we can assign labels freely and use them to filter and apply rules to different resources. More on this below.
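Labels are just free-form key-value pairs, and selectors filter on them. For instance (using a hypothetical `tier` label that isn't part of this tutorial's files):

```bash
# Attach an extra label to the deployment...
kubectl label deployment frontend tier=web
# ...then select resources by that label.
kubectl get deployments -l tier=web
```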
The `spec` section contains information on how to create this `Deployment`. It uses the `selector` field to choose which resources it wants to apply this spec to. In this case we're using the `matchLabels` selector to choose everything labelled with `app: frontend`.
For more information on labels and selectors, see the Kubernetes documentation.
The `replicas` field says how many replicas of our image (i.e. how many containers) we want to run in this deployment. Behind the scenes, it creates a `ReplicaSet` resource, which is a slightly lower-level type than a `Deployment` and can be created independently.
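You can see the `ReplicaSet` our deployment created for us (the hash suffix will differ on your machine):

```bash
kubectl get replicasets -l app=frontend
# NAME                  DESIRED   CURRENT   READY   AGE
# frontend-79d8dd6f87   3         3         3       2m
```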
The `template` section specifies what we want each pod in our `ReplicaSet` to look like. Each pod has the same label as the deployment, and contains a single container.
This container has the name `frontend`. It uses the image `transcode-frontend` which we built earlier. The `imagePullPolicy` line prevents minikube from attempting to pull the image from a remote repository: because this is a local image, pulls will always fail. Finally, the `ports` section defines which ports should be made available for services to publish. You can add more information to this section, such as the `hostPort` to publish, but here we are just providing information about the container itself.
You can use `kubectl describe deploy frontend` to get information about this deployment, which should reflect the information from our `deployment.yml` file:
```
Name:                   frontend
Namespace:              default
CreationTimestamp:      Sun, 14 May 2023 12:25:13 +0800
Labels:                 app=frontend
Annotations:            deployment.kubernetes.io/revision: 1
Selector:               app=frontend
Replicas:               3 desired | 3 updated | 3 total | 3 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:  app=frontend
  Containers:
   frontend:
    Image:        transcode-frontend
    Port:         80/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   frontend-79d8dd6f87 (3/3 replicas created)
Events:          <none>
```
Likewise, you can use `kubectl describe service balancer` to check information about the load balancer we created:
```
Name:                     balancer
Namespace:                default
Labels:                   app=frontend
Annotations:              <none>
Selector:                 app=frontend
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.111.106.157
IPs:                      10.111.106.157
Port:                     <unset>  80/TCP
TargetPort:               80/TCP
NodePort:                 <unset>  32550/TCP
Endpoints:                10.244.0.49:80,10.244.0.51:80,10.244.0.54:80
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>
```
Deploying the API
Phew - that was a lot of information. Don't worry: we only need a little more to understand how the API works. First, let's build it and get it running:
```bash
eval $(minikube docker-env)
docker build . -t transcode-api
kubectl create -f deployment.yml
kubectl expose deployment api --name=api-balancer --type=LoadBalancer
minikube service api-balancer --url
```
If you go to the URL given by minikube you should get a simple JSON response.
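You can also test it from the terminal:

```bash
# Fetch the API's root endpoint; the exact JSON depends on the app.
curl "$(minikube service api-balancer --url)"
```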
Then, let's take a look at the `deployment.yml` file:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  labels:
    app: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: transcode-api
        imagePullPolicy: Never
        volumeMounts:
        - mountPath: /files
          name: file-mount
        env:
        - name: FILE_PATH
          value: /files
        ports:
        - containerPort: 5000
      volumes:
      - name: file-mount
        hostPath:
          path: /data/transcode
          type: DirectoryOrCreate
```
In many ways this is identical to our previous file. However, it includes two new sections: `env` and `volumes`. If you recall our docker-compose class, these should look pretty familiar.
In `spec.template.spec` we have the new `volumeMounts` and `env` fields:
```yaml
volumeMounts:
- mountPath: /files
  name: file-mount
env:
- name: FILE_PATH
  value: /files
```
These do basically the same as their equivalents in a `compose.yml` file. `volumeMounts.mountPath` says where to mount the volume inside the container, and `name` says which volume to use. `env.name` is the key for our environment variable in the container, and `value` is its value.
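You can verify both from inside a running container:

```bash
# Print the environment variable we set...
kubectl exec deploy/api -- printenv FILE_PATH
# ...and confirm the volume is mounted at /files.
kubectl exec deploy/api -- ls /files
```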
Now let's look at `spec.template.spec.volumes`:
```yaml
volumes:
- name: file-mount
  hostPath:
    path: /data/transcode
    type: DirectoryOrCreate
```
We name the volume we're creating, then specify its path on the host. Note that the host in this case is actually minikube, not your host machine. The `/data/*` directory is a special directory in minikube which always persists between restarts. `type` is an optional field, but in this case we set it to `DirectoryOrCreate` to make sure that the `/data/transcode/` directory always exists.
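If you want to see these files from the minikube side, you can open a shell on the node itself:

```bash
# Inspect the hostPath directory inside the minikube node.
minikube ssh -- ls -la /data/transcode
```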
Note that `hostPath` is only one kind of volume available in Kubernetes, and is not recommended for production use. In production you're more likely to use something like AWS EBS storage or another type of cloud storage. For more information on Kubernetes volumes, see the Kubernetes documentation.
Deploying the Watcher
Our final service in this app is the watcher. This waits for new files to be uploaded, then transcodes them using `ffmpeg`. As `ffmpeg` uses all available system resources by default, we need to limit the resources available to it to prevent our computer crashing. In `spec.template.spec.containers.resources` of our deployment file, we set a limit on available resources:
```yaml
resources:
  limits:
    cpu: 1.0
    memory: 1G
```
This limits each container in our pod to using 1 CPU core and 1GB (1000MB) of memory. Note that we can also use other units: `cpu: 500m` specifies 500 millicores of compute (i.e. half a core), while `memory: 1Gi` would limit our memory usage to 1 gibibyte, which is 1024 mebibytes and a more traditional way to measure RAM usage.
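If you want to watch actual usage against these limits, minikube ships a metrics-server addon (it takes a minute or two before it starts reporting):

```bash
minikube addons enable metrics-server
kubectl top pods   # current CPU and memory usage per pod
```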
Another thing to note about the watcher is that it has no service. As the watcher is a purely background process which never receives requests, it has no need to expose any ports to the outside.
You can play with these resource limits for the watcher to see which work best for your system, using `kubectl delete deployment watcher` and redeploying until you find a balance between low resource usage and transcode speed which works for you.
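A typical tuning loop looks something like this:

```bash
kubectl delete deployment watcher   # remove the current deployment
# ...edit the resources section of deployment.yml...
kubectl create -f deployment.yml    # recreate it with the new limits
```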
Assignment
- Try converting an app you previously wrote in Docker or with docker-compose to use a Kubernetes-style deployment, and run it on minikube. A good candidate might be the reader-writer demo from the docker-compose class - note, however, that this requires more networking setup than we covered in this class.