
OpenVZ Virtuozzo Containers Wiki β

WP/What are containers

Revision as of 17:38, 15 March 2011 by Kir (talk | contribs) (moar)


OpenVZ Linux Containers technology whitepaper

OpenVZ is a virtualization technology for Linux which lets you partition a single physical Linux machine into multiple smaller units called containers. Technically, it consists of three major components:

  • Namespaces
  • Resource management
  • Checkpointing

Namespaces

A namespace limits the scope of some kernel resource: a process inside a namespace sees only its own instance of that resource. Here, namespaces are used as the building blocks of containers. A simple example of a namespace is chroot.

Chroot

The traditional UNIX chroot() system call changes the root of the file system of the calling process to a particular directory. That way it limits the scope of the file system for the process, so it can only see and access a limited subtree of files and directories.

Chroot is still used for application isolation: for example, running ftpd in a chroot to limit the damage from a potential security breach.

Chroot is also used in containers: each container's root file system is simply a directory subtree on the host. This has the following consequences:

  • there is no need for a separate block device, hard drive partition, or filesystem-in-a-file setup
  • the host system administrator can see all the containers' files
  • container backup/restore is trivial
  • mass deployment is easy

Other namespaces

OpenVZ builds on the chroot idea and expands it to everything else that applications use. In other words, every API that the kernel provides to applications is "namespaced", making sure every container has its own isolated instance of each resource. Examples include:

  • File system namespace -- this one is chroot() itself, making sure containers can't see each other's files.
  • PID namespace, so every container has its own set of unique process IDs, and the first process inside a container has a PID of 1 (it is usually the /sbin/init process, which actually relies on its PID being 1). Containers can only see their own processes, and they can't see (or access in any way, say by sending a signal) processes in other containers.
  • IPC namespace, so every container has its own IPC (Inter-Process Communication) shared memory segments, semaphores, and messages.
  • Networking namespace, so every container has its own network devices, IP addresses, routing rules, firewall (iptables) rules, network caches, and so on.
  • /proc and /sys namespaces, so every container has its own representation of /proc and /sys -- the special filesystems used to export kernel information to applications. In a nutshell, each container's view is a subset of what the host system has.

Note that memory and CPU need not be namespaced: the existing virtual memory and multitasking mechanisms already take care of isolating them.

Single kernel approach

So, namespaces let a single kernel run multiple isolated containers. To say it again: all the containers running on a single piece of hardware share one single Linux kernel. There is only one OS kernel running, and on top of it there are multiple isolated instances of user-space programs.

The single-kernel approach is much more lightweight than traditional VM-style virtualization. The consequences are:

  1. Eliminating the need to run multiple OS kernels leads to higher density of containers (compared to VMs)
  2. The software stack between an application and the hardware is much thinner, which means higher performance of containers (compared to VMs)

Resource management

Because of the single-kernel model, all containers share the same set of resources: CPU, memory, disk, and network.

Every container can use all of the available hardware resources if so configured. On the other hand, containers should not step on each other's toes, so all the resources are accounted for and controlled by the kernel.

FIXME link to resource management whitepaper goes here

Live migration

Various

Containers overhead

OpenVZ works almost as fast as a usual Linux system. The only overhead is in networking and the additional resource management described above, and in most cases it is negligible.

OpenVZ host system scope

From the host system, all container processes are visible.

Resource control

Networking (routed/bridged)

Does it differ much from VMs?

Other features

  • Live migration

Limitations

From the point of view of a container owner, it looks and feels like a real system. Nevertheless, it is important to understand the limitations of containers:

  • Container is constrained by limits set by the host system administrator. That includes usage of CPU, memory, disk space, disk I/O bandwidth, network bandwidth, etc.
  • Container only runs Linux (Windows or FreeBSD is not an option), although different Linux distributions are not an issue.
  • Container can't boot/use its own kernel (it uses the host system kernel).
  • Container can't load its own kernel modules (it uses the host system kernel modules).
  • Container can't set the system time, unless explicitly configured to do so (say, to run ntpd in a CT).
  • Container does not have direct access to hardware such as a hard drive, network card, or PCI device. Such access can be granted by the host system administrator if needed.