Machines, Virtual Machines, and Containers

In this class we're going to talk about machines, virtual machines, and containers, as well as how you can use these different technologies as a web developer.

What is a Machine?

In simple terms, a computer is a machine. When we say 'machine', we don't always mean a desktop or laptop. It could be anything from a tiny chip to a huge server rack.

Physical vs Virtual Machines

Not all machines are physical devices. We can also have virtual machines: computers defined in software, which share one physical machine's hardware but act like separate computers. Some common uses for this are:

  • Running different operating systems on the same computer
  • Experimenting with different setups for a computer - this is very common on Linux desktops
  • Working with potentially dangerous software (e.g. viruses)
  • Breaking one powerful computer (e.g. a professional workstation, a server) into many smaller virtual machines

Virtual machines run on hypervisors. A hypervisor is a piece of software which runs and manages virtual machines. Some popular hypervisors which you can run on your own computer are Oracle VirtualBox, QEMU, and Hyper-V. If you've used the Windows Subsystem for Linux (WSL 2), then you've used a version of Hyper-V.

Host and Guest Machines

When using a virtual machine we have a distinction between the host machine and the guest machine. The host machine is your normal computer running its normal operating system. The guest machine is the virtual machine running whichever operating system you installed on it.

There are also a lot of popular cloud-based virtual machine platforms. These are often known as Virtual Private Servers. Some examples of these are Amazon EC2, Azure Virtual Machines, Google Compute Engine, and Vultr.

A Note on Hardware

Hypervisors can't solve all compatibility issues. Some hypervisors require certain types of hardware, and most hypervisors can't handle software designed for a different type of CPU. For example, most Android devices and newer Macs (those with Apple Silicon) use ARM CPUs, while most desktops and laptops use x86 CPUs. Bridging this gap means emulating the other type of CPU in software. Very few tools are able to do this, and those which can (e.g. BlueStacks, or QEMU in emulation mode) are specialised and often slower than hypervisors running guests built for the same type of CPU as the host.

It's also worth being aware that because hypervisors hook into your host OS for hardware compatibility, virtual machines might not give a good indication of how a guest OS would run on your physical hardware. This is good to know if you're going to use a virtual machine to demo a possible new operating system.

Pros and Cons of Virtual Machines

As with any technology, there are advantages and disadvantages of VMs. Some advantages of VMs are:

  • In a cloud or workstation environment, they're more efficient than running one physical computer per OS.
  • They let you run lots of different software on one computer.
  • They separate your host and your guest: this can provide better security.
  • They're pretty fast to set up and destroy (useful on the cloud).

Some disadvantages are:

  • Trying to run more than one app on a VM can cause conflicts, for example over library versions or network ports. This is especially true of web servers.
  • It can be challenging to automate VMs: since you have a full install of an operating system you need to automate many different things.
  • It can be time-consuming and expensive to deal with problems in a VM's configuration.

Containers were developed to address some of these disadvantages.

Containers

Containers are closely related to virtual machines and let you do many of the same things. Some common uses of containers are:

  • Running multiple apps on a single host or virtual machine
  • Running multiple instances of the same app across more than one machine (this becomes important in scaling and deployment)
  • Ensuring a consistent environment between a developer's local machine and a remote web server

Containers bundle everything that a service needs to run with the service. They also separate or contain incompatible versions of software and libraries. For example, an older Python app might need Python version 2.7 to run, while a new one might require version 3.10. If you try to run both on the same virtual machine then you'll find that you can only run one at a time. This situation is called dependency hell. As the name suggests, we want to avoid it.
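To make the version clash concrete, here's a small sketch that models the two apps from the example above. The app names and the compatibility rule (same major version, and at least the required minor version) are simplifying assumptions I've made for illustration; real compatibility is messier.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, int]

# Hypothetical apps and the Python version each one needs.
APPS: Dict[str, Version] = {"legacy_app": (2, 7), "new_app": (3, 10)}


def can_run(required: Version, installed: Version) -> bool:
    """Assumed rule: same major version, installed minor >= required minor."""
    return installed[0] == required[0] and installed[1] >= required[1]


def interpreters_for_all(apps: Dict[str, Version], candidates: List[Version]) -> List[Version]:
    """Return the candidate interpreters that could run every app."""
    return [v for v in candidates if all(can_run(req, v) for req in apps.values())]


if __name__ == "__main__":
    candidates = [(2, 7), (3, 10), (3, 12)]
    # No single interpreter satisfies both apps, so this list is empty.
    print(interpreters_for_all(APPS, candidates))
```

Under this model no single interpreter can serve both apps, which is exactly the gap containers fill: each app gets its own interpreter and libraries.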

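There are several different options for managing containers. By far the most popular of these is Docker. Other tools include BuildKit (which builds container images) and nerdctl (a Docker-compatible command line for containerd). Both are commonly used alongside the containerd project, the container runtime which Docker itself is built on.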

There are also several different options for using containers on the cloud, such as AWS Fargate, Google Kubernetes Engine, or Azure Container Apps.

Differences Between Containers and Virtual Machines

Containers are similar to virtual machines, but have some important differences. Virtual machines:

  • Each need their own install of a guest operating system
  • Need separate copies of all the software used by an app
  • Are mostly separated from each other and can't share system resources easily
  • Can be slower to set up and deploy

On the other hand, containers:

  • Share a single host operating system (specifically, its kernel) instead of each installing their own
  • Only need the extra libraries and runtimes for a given app
  • Can share system resources, possibly including dependencies
  • Are faster to set up, deploy, and destroy
  • Can be automated using deployment files like Dockerfiles
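To give a feel for what a Dockerfile looks like, here's a rough sketch that builds the text of a very small one for a Python app. The file names (requirements.txt, app.py), the /app folder, and the python:3.10-slim base image are all assumptions for illustration; your own project will likely differ.

```python
def render_dockerfile(python_version: str = "3.10", entrypoint: str = "app.py") -> str:
    """Build the text of a minimal Dockerfile for a hypothetical Python app."""
    lines = [
        # Start from an image that already has Python installed (assumed tag).
        f"FROM python:{python_version}-slim",
        # Work inside /app in the container (hypothetical folder).
        "WORKDIR /app",
        # Install the app's dependencies inside the container.
        "COPY requirements.txt .",
        "RUN pip install --no-cache-dir -r requirements.txt",
        # Copy the rest of the project and say how to start it.
        "COPY . .",
        f'CMD ["python", "{entrypoint}"]',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dockerfile())
```

Notice that the whole environment is described in a few lines of text. That's what makes containers easier to automate than a full operating system install.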

Which of the two solutions you choose depends on your specific needs for a project. If you want total separation of your apps, including possible hardware separation, then using virtual machines might be the best choice. If you want to share system resources and don't mind having a less strict separation then containers are often the better choice.

About Virtual Environments

Remember the Python example I gave earlier? Well, it wasn't quite true. Python ships with support for virtual environments (venvs). A virtual environment is similar to a container in that it separates different environments. However, a Python venv is much simpler and quicker to use. It's also limited to only being useful for separating Python dependencies. Therefore, it's mostly helpful for separating different projects and doesn't provide much real separation on the operating system level.
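Here's a minimal sketch of creating a venv from Python itself, using the standard library's venv module. The folder name demo-env and the helper function names are made up for illustration. In day-to-day use you'd more likely run python -m venv from a terminal.

```python
import os
import sys
import tempfile
import venv
from pathlib import Path


def create_env(parent: Path, name: str) -> Path:
    """Create a virtual environment called `name` inside `parent`."""
    env_dir = parent / name
    # with_pip=False keeps this quick; pass with_pip=True to get pip as well.
    builder = venv.EnvBuilder(with_pip=False, symlinks=(os.name != "nt"))
    builder.create(env_dir)
    return env_dir


def env_python(env_dir: Path) -> Path:
    """Path to the environment's own Python interpreter."""
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def in_virtual_env() -> bool:
    """True if the current interpreter is running inside a venv."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = create_env(Path(tmp), "demo-env")
        print("Created:", env_dir)
        print("Interpreter:", env_python(env_dir))
    print("Inside a venv right now?", in_virtual_env())
```

Each environment gets its own folder for installed packages, but it still relies on a Python interpreter already installed on your machine. That's why a venv isn't a substitute for a container.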

Other languages have features or tools similar to virtual environments. These fall short of being true containers, but they offer some of the advantages and can be really helpful in organising your projects.


Review Questions

  • What are some uses for virtual machines?
  • In a virtual machine context, what's the host machine? How about a guest?
  • Can I use virtual machines to run Windows on my iPad?
  • What's the main difference between a virtual machine and a container?
  • Why are containers helpful when we have several versions of the same library as a dependency?
  • Why would I use a container instead of a virtual machine?
  • Why would I use a virtual machine instead of a container?
  • I have an awesome gaming computer with many cores and a great graphics card. I want to play multiplayer games with my friends online without having them bring their own computers to my house. What should I do?

Open-Ended Questions

  • Do you think virtual machines or containers are more suitable for developing applications across small, medium, large, and global organisations? Why?
  • What kind of solution would you propose to a fintech company that needs to hold very sensitive financial data but also wants their day-to-day development to move fast? Justify your answer.
  • What kind of solution would you propose to a startup making clothes for dogs, who want a web app where people can upload their dog photos and see what different dogs would look like in different clothes? Justify your answer.
  • What kind of solution would you propose for storing confidential military and espionage data in such a way that only people with appropriate clearance can access it? The number of people allowed to access any given piece of data might be as few as 1 (only the creator of the data) or as many as 50,000 (civilian contractors who've signed a secrets act). Justify your answer.

Assignment I: Let's Try Linux!

It's very common for enthusiasts to get their first exposure to virtual machines by using them to play with some other operating system. Normally, this is Linux. For our first assignment we're going to set up a Linux distribution called Debian. Debian is lightweight, reliable, and fully open-source. In fact, many web servers run on Debian.

First, download Oracle VirtualBox for your computer. Set it up and open the application.

Next, head over to the Debian Netinstall page and get the appropriate download for your CPU. Normally this will be amd64, but it might be arm64 if you're using Apple Silicon.
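If you're not sure which CPU type you have, this short sketch asks Python and maps the answer to the name Debian uses for its downloads. The mapping table only covers the common cases I know of, and the function name is made up, so treat it as a starting point.

```python
import platform
from typing import Optional

# Common values of platform.machine() mapped to Debian's architecture names.
_DEBIAN_ARCH = {
    "x86_64": "amd64",   # most Linux and macOS machines with Intel/AMD CPUs
    "amd64": "amd64",    # how Windows reports the same CPUs
    "arm64": "arm64",    # Apple Silicon Macs
    "aarch64": "arm64",  # ARM machines running Linux
}


def debian_arch(machine: Optional[str] = None) -> Optional[str]:
    """Return 'amd64' or 'arm64' for a known CPU type, or None if unknown."""
    name = machine if machine is not None else platform.machine()
    return _DEBIAN_ARCH.get(name.lower())


if __name__ == "__main__":
    print("This machine reports:", platform.machine())
    print("Debian download to pick:", debian_arch())
```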

Create a new virtual machine in VirtualBox and select the Debian ISO you downloaded. Make sure you check 'skip unattended installation'.

Now boot up your virtual machine. You should be able to follow the instructions through the graphical installer prompt to get a working operating system. You can choose default options for most things, but make sure you allow writing to disk and select an appropriate location for your install.

On Debian you have a few choices of Desktop Environment (the user interface and apps for interacting with your computer). The Debian Wiki has a good article discussing different DE options. I've chosen GNOME as I'm familiar with it and it's easy to use.

You can experiment with Debian in your virtual machine. How is it similar to your host OS? How is it different? Try making another copy of Debian with a different desktop environment and check out the differences.

For Next Time

Next class we'll move on to the basic features of Docker, a common container solution.

You'll need to install Docker somewhere. You can do this on your local machine or on a remote server. I'd recommend setting up both.
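Once you've installed it, you can check that Docker is reachable from your terminal. Here's a small sketch that looks for the docker command and asks it for its version. It assumes the command is called docker and is on your PATH, and the function name is my own.

```python
import shutil
import subprocess
from typing import Optional


def docker_version() -> Optional[str]:
    """Return the output of `docker --version`, or None if Docker isn't usable."""
    exe = shutil.which("docker")
    if exe is None:
        # No docker command found on the PATH.
        return None
    try:
        result = subprocess.run(
            [exe, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    version = docker_version()
    print(version if version else "Docker doesn't seem to be installed yet.")
```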

You'll also need a basic web app to test it with. The easiest option is to reuse one from a previous project.