Checkpointing and live migration

 
<translate>
 
<translate>
<!--T:1-->
 
 
CPT is an extension to the OpenVZ kernel that can save the full state of a running VE and restore it later, on the same or a different host, in a way that is transparent to running applications and network connections. This technique has several applications, the most important being live (zero-downtime) migration of VEs and taking an instant snapshot of a running VE to resume it later, i.e. checkpointing.
  
<!--T:2-->
 
 
Before CPT, a VE could only be migrated by shutting it down and booting it again afterwards. This procedure not only causes a fairly long downtime for network services; it is also not transparent to clients using the VE, which makes migration impossible when clients run tasks that cannot tolerate a shutdown.
  
<!--T:3-->
 
 
Compared with this old scheme, CPT allows a VE to be migrated in a way that is essentially invisible both to users of the VE and to external clients using network services located inside it. It still introduces a short delay in service, required for the actual checkpoint/restore of the processes, but this delay is indistinguishable from a short interruption of network connectivity.
  
<!--T:4-->
{{Note|In OpenVZ 7, CPT is replaced by our sub-project [http://criu.org CRIU]}}
 
  
== Online migration == <!--T:5-->
 
 
<!--T:6-->
 
 
The OpenVZ distribution includes a special utility, <code>vzmigrate</code>, intended to support VE migration. With it one can perform live (a.k.a. online) migration: the VE “freezes” for some time during migration, and afterwards it continues to work as though nothing had happened. Online migration can be performed with:
 
<pre>vzmigrate --online <host> VEID</pre>
  
<!--T:7-->
 
 
During online migration, all VE private data is saved to an image file, which is transferred to the target host.
  
<!--T:8-->
 
 
For <code>vzmigrate</code> to work without asking for a password, ssh public keys from the source host should be placed in the destination host's <code>/root/.ssh/authorized_keys</code> file. In other words, the command <code>ssh root@host</code> should not ask you for a password. See [[ssh keys]] for more info.
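
One way to confirm this before starting a migration is to make ssh fail instead of prompting. The sketch below assumes OpenSSH, whose <code>BatchMode=yes</code> option disables password prompts. The function name <code>can_ssh_without_password</code> is illustrative and not part of OpenVZ.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with OpenVZ): succeeds only if
# root@HOST accepts key-based login. BatchMode=yes makes ssh fail
# instead of asking for a password.
can_ssh_without_password() {
    host=$1
    ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$host" true
}
```

For example, <code>can_ssh_without_password node2 && vzmigrate --online node2 101</code> would start the migration only if key-based login works; <code>node2</code> and <code>101</code> are placeholder values.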
  
== Manual Checkpoint and Restore Functions == <!--T:9-->
  
<!--T:10-->
 
 
<code>vzmigrate</code> is not strictly required to perform online migration. The <code>vzctl</code> utility, together with some file system backup tools, is sufficient to do all the tasks.
  
<!--T:11-->
 
 
A VE can be checkpointed with:
 
<pre>vzctl chkpnt VEID --dumpfile <path></pre>
 
This command saves the entire state of a running VE to the dump file and stops the VE. If the <code>--dumpfile</code> option is not given, <code>vzctl</code> uses the default path <code>/vz/dump/Dump.VEID</code>.
  
<!--T:12-->
 
 
After this, the VE can be restored to the same state by executing:
 
<pre>vzctl restore VEID --dumpfile <path></pre>
 
If the dump file and the file system are transferred to another HW node, the same command can restore the VE there just as successfully.
  
<!--T:13-->
 
 
It is a critical requirement that the file system at the moment of restore be identical to the file system at the moment of checkpointing. If this requirement is not met, then depending on the severity of the changes, the restore can be aborted or the processes inside the VE can see the changes as external corruption of open files. When a VE is restored on the same node where it was checkpointed, it is enough not to touch the file system accessible by the VE. When a VE is transferred to another node, it is necessary to synchronize the VE file system before the restore. <code>vzctl</code> does not provide this functionality, so external tools (e.g. <code>rsync</code>) are required.
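
The manual procedure described in this section can be sketched as a small shell function. This is only an illustration, not the actual <code>vzmigrate</code> implementation. It assumes the private area is at <code>/vz/private/VEID</code> and the VE config at <code>/etc/vz/conf/VEID.conf</code> (neither path is stated above, so adjust them to your layout). The helper names <code>run</code> and <code>manual_migrate</code> and the <code>DRY_RUN</code> variable are invented for this example.

```shell
#!/bin/sh
# Sketch of a manual migration with vzctl and rsync.
# ASSUMPTIONS: /vz/private/VEID and /etc/vz/conf/VEID.conf are the
# private area and config paths; run/manual_migrate/DRY_RUN are
# illustrative names. With DRY_RUN=1 commands are printed, not run.

run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

manual_migrate() {
    veid=$1
    dest=$2
    dump=/vz/dump/Dump.$veid

    # Checkpoint: save the VE state to the dump file and stop the VE.
    run vzctl chkpnt "$veid" --dumpfile "$dump" || return 1

    # The VE is stopped now, so its file system does not change while
    # it is copied. --numeric-ids keeps uid/gid numbers as they are.
    run rsync -a --numeric-ids "/vz/private/$veid/" "root@$dest:/vz/private/$veid/" || return 1
    run rsync -a "/etc/vz/conf/$veid.conf" "root@$dest:/etc/vz/conf/" || return 1
    run rsync -a "$dump" "root@$dest:$dump" || return 1

    # Restore on the destination node from the transferred dump file.
    run ssh "root@$dest" vzctl restore "$veid" --dumpfile "$dump"
}
```

Setting <code>DRY_RUN=1</code> and then calling <code>manual_migrate 101 node2</code> prints the five commands without touching any VE, which is a convenient way to review them first.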
  
== Step-by-step Checkpoint and Restore == <!--T:14-->
  
<!--T:15-->
 
 
The process of checkpointing can be performed in stages. It consists of three steps.
  
<!--T:16-->
 
 
First step – suspending the VE. At this stage CPT moves all the processes to a special, previously known state and stops the VE network interfaces. This stage can be done with:
 
<pre>vzctl chkpnt VEID --suspend</pre>
  
  
<!--T:17-->
 
 
Second step – dumping the VE. At this stage CPT saves the state of the processes and the global state of the VE to an image file. All process-private data needs to be saved: address space, register set, open files/pipes/sockets, System V IPC structures, current working directory, signal handlers, timers, terminal settings, user identities (uid, gid, etc.), process identities (pid, pgrp, sid, etc.), rlimits and other data. This stage can be done with:
 
<pre>vzctl chkpnt VEID --dump --dumpfile <path></pre>
  
  
<!--T:18-->
 
 
Third step – killing or resuming processes. If the migration succeeds, the VE can be stopped with the command:
 
<pre>vzctl chkpnt VEID --kill</pre>

Otherwise, the VE can be resumed with:
 
<pre>vzctl chkpnt VEID --resume</pre>
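
The three checkpoint stages can be tied together so that the VE is killed only when everything after the suspend has succeeded, and resumed otherwise. The following is a minimal sketch under stated assumptions: <code>staged_checkpoint</code> and the <code>VZCTL</code> variable are hypothetical names, and the command passed after the dump path (for example a script that transfers the dump and restores it on the target) is supplied by the caller.

```shell
#!/bin/sh
# Sketch: staged checkpoint with rollback. staged_checkpoint and VZCTL
# are illustrative names; VZCTL can point at a stub for testing.
VZCTL=${VZCTL:-vzctl}

# Usage: staged_checkpoint VEID DUMPFILE COMMAND [ARGS...]
# COMMAND is a caller-supplied step (e.g. transfer and remote restore);
# its exit status decides between --kill and --resume.
staged_checkpoint() {
    veid=$1
    dump=$2
    shift 2

    # Stage 1: freeze the processes and stop the VE network interfaces.
    "$VZCTL" chkpnt "$veid" --suspend || return 1

    # Stage 2: write the state to the image file, then run the
    # caller-supplied command.
    if "$VZCTL" chkpnt "$veid" --dump --dumpfile "$dump" && "$@"; then
        # Stage 3a: everything succeeded, stop the suspended VE.
        "$VZCTL" chkpnt "$veid" --kill
    else
        # Stage 3b: something failed, let the VE continue running.
        "$VZCTL" chkpnt "$veid" --resume
        return 1
    fi
}
```

To take a snapshot without any follow-up step, <code>true</code> can be passed as the command, e.g. <code>staged_checkpoint 101 /vz/dump/Dump.101 true</code>.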
  
<!--T:19-->
 
 
The process of restoring consists of two steps.
  
<!--T:20-->
 
 
The first step is to restore the processes and leave them in a special frozen state. After this step the processes are ready to continue execution; however, in some cases CPT has to perform some operations after a process is woken up, so CPT sets the process return point to a function in our module. This stage can be done with:
 
<pre>vzctl restore VEID --undump --dumpfile <path></pre>
  
<!--T:21-->
 
 
Second step – waking up the processes, or killing them if the restore failed. After CPT wakes up a process, the process performs the necessary operations in our function and continues execution. This stage can be done with:
 
<pre>vzctl restore VEID --resume</pre>

If the restore failed, the processes can be killed with:
 
<pre>vzctl restore VEID --kill</pre>
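
The two restore stages can be combined in the same way. The sketch below resumes the processes only if the undump stage succeeded and kills them otherwise. As before, <code>staged_restore</code> and <code>VZCTL</code> are hypothetical names used for illustration only.

```shell
#!/bin/sh
# Sketch: staged restore. staged_restore and VZCTL are illustrative
# names; VZCTL can point at a stub for testing.
VZCTL=${VZCTL:-vzctl}

# Usage: staged_restore VEID DUMPFILE
staged_restore() {
    veid=$1
    dump=$2

    # Stage 1: recreate the processes and leave them frozen.
    if "$VZCTL" restore "$veid" --undump --dumpfile "$dump"; then
        # Stage 2a: wake the processes up.
        "$VZCTL" restore "$veid" --resume
    else
        # Stage 2b: the restore failed, kill what was created.
        "$VZCTL" restore "$veid" --kill
        return 1
    fi
}
```

The function returns a non-zero status whenever the VE did not end up running, so a calling script can decide whether to resume the VE on the source node.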
  
== See also == <!--T:22-->
  
<!--T:23-->
 
 
* http://criu.org/
 
</translate>
 
</translate>
