Friday, August 14, 2009

Containing Linux Instances with OpenVZ

Understanding the OpenVZ way of virtualisation and getting started with it.

Virtualisation is going mainstream, with many predicting that it will expand rapidly in the next few years. Virtualisation is a term that can refer to many different techniques. Most often, it is just software that presents a virtual hardware on which other software can run. Virtualisation is also done at a hardware level, like in the IBM mainframes or in the latest CPUs that feature the VT or SVM technologies from Intel and AMD, respectively. Although a fully featured virtual machine can run unmodified operating systems, there are other techniques in use that can provide special virtual machines, which are nevertheless very useful.

Performance and virtualisation
The x86 architecture is notorious for its virtualisation unfriendly nature. Explaining why this is the case requires a separate article on the subject. The only way to virtualise x86 hardware was to emulate it at the instruction level or to use methods like ‘Binary Translation’ and ‘Binary Patching’ at runtime. Well known software in this arena are QEMU, Vmware and the previously well-known Bochs. These programs emulate a full PC and can run unmodified operating systems.

The recent VT and SVM technologies provided by Intel and AMD, respectively, do away with the need to interpret/patch guest OS instruction streams. Since these recent CPUs provide hardware-level virtualisation, the virtualisation solution can trap into the host OS for any privileged operation that the guest is trying to execute. Although running unmodified operating systems definitely has its advantages, there are times when you just need to run multiple instances of Linux, for example. Then why emulate the whole PC? VT and SVM technologies virtualise the CPU very well, but the various buses and the devices sitting on them need to be emulated. This hits the performance of the virtual machines.

As an example, let us take the cases of QEMU, Xen, KVM and UML. This comparison is kind of funny, since the guys who wrote these software, never wanted to end up in a table like Table 1. This is like comparing apples to oranges, but all we want to understand from this table is whether the VMM can run an unmodified operating system, at what level it runs, and how the performance is compared to natively running it.

Introducing OpenVZ
Let us suppose you want to run only Linux, but want to make full use of a physical server. You can run multiple instances of Linux for hosting purposes, education and in testing environments, for example. But do you have to emulate a full PC to run these virtual, multiple instances? Not really. A solution like User Mode Linux (UML) lets you run Linux on the Linux kernel, where each Linux is a separate, isolated instance. To get a simplified view of a Linux system, let us take three crucial components that make up a system. They are: the kernel, the root filesystem, and the processes that are created as the system boots up and runs. The kernel is, of course, the core of the operating system; the root filesystem is what holds the programs and the various configuration files; and the processes are running instances of the programs created from binaries on the root file system. They are created as the system boots up and runs.

In UML, there is a host system and then there are guests. The host system has a kernel, and the root file system and its set of processes. Each guest has a kernel, a root file system and its own set of processes.

Under OpenVZ, things are a bit different. There is a single kernel and there are multiple root file systems. The guest’s root file systems are directory trees under the host file system. A guest under OpenVZ is called a Virtual Environment (VE) or Virtual Private Server (VPS). Each VPS is identified using a name or a number, where VPS 0 is the host itself. Processes created by these VEs remain isolated from others. That is, if VPS 101 creates five processes and VPS 102 creates seven, they can’t ‘see’ each other. This may sound a lot like chroot jails, but you must note the differences as well. A chroot jail provides only filesystem isolation. The processes in a chroot jail still share processes, networks and other namespaces with the host. For example, if you run ps -e from a chroot jail, you still see a list of systemwide processes. If you run a socket program from the chroot environment and listened on localhost, you can connect to it from outside the chroot jail. This simply means there is no isolation at the process or the network level. You can also verify this by running netstat –a from the chroot jail. You will be able to see the status of system wide networking connections.

OpenVZ is rightly called a container technology. In case of OpenVZ, there is no real virtual machine. The OpenVZ kernel is a modification of the Linux kernel that isolates namespaces and contains or separates processes created by one VPS from another. By doing so, the overhead of running multiple kernels is avoided and maximum performance is obtained. In fact, the worse case overhead compared to native performance in OpenVZ is said to be rarely more than 3 per cent. So, on a server with a few gigs of RAM, it is possible to run tens of VPSs and still have decent performance. Since there is only one kernel to deal with, memory consumption is also under check.

User bean counters
OpenVZ is not just about the isolation of processes. There are various resources on a computer system that processes compete for. These are resources like CPU, memory, disk space and at a finer level, file descriptors, sockets, locked memory pages and disk blocks, among others. At a VPS level, it is possible in OpenVZ to let the administrator set limits for each of these items so that resources can be guaranteed to VPSs and also to ensure that no VPS can misuse available resources. OpenVZ developers have chosen about 20 parameters that can be tuned for each of the VPSs.

The OpenVZ fair scheduler
Just as various resources are guaranteed to VPSs, CPU time for a VPS can also be guaranteed. It is possible to specify the minimum CPU units a VPS will receive. To make sure this happens, OpenVZ employs a two-level scheduler. The first level fair scheduler makes sure that no VPS is starved of its minimum CPU guarantee time. It basically selects which VPS has to run on the CPU next. At the scheduler level, a VPS is just a set of processes. Then, this set is passed on to the regular Linux kernel scheduler and one from the set is scheduled to run. In a VPS Web hosting environment, the hosting provider can thus guarantee the customer some minimum CPU power.

Installing OpenVZ
To install OpenVZ and have it work, you need to download or build an OpenVZ kernel, and also build or download pre-built OpenVZ tools. When you install the OpenVZ tools, it also installs the init scripts that take care of setting up OpenVZ. During system start-up and shut down, VEs are automatically started and shut down along with the Hardware Node (HN). Once the tools are installed, you can see that a directory named ‘vz’ is created in the root directory and it also contains other directories. On a production server, you may want ‘/vz’ on a separate partition.

Source of Information : Linux For You May 2009

No comments: