Difference between revisions of "CR tools"

From OpenVZ Virtuozzo Containers Wiki

Revision as of 09:05, 15 October 2011

What CRtools is

CRtools is a utility for checkpointing and restoring a process tree. Unlike checkpoint/restore implemented entirely in kernel space, it tries to achieve the same goal mostly in user space.

Agenda

  1. Basic design (checkpoint == proc + SEIZE, restore == syscalls + execve)
  2. What's required from the kernel

Basic design

Checkpoint

The checkpoint procedure relies heavily on the /proc file system, which is the general place where crtools gets all the information it needs. This includes:

  • File descriptor information (via /proc/$pid/fd and /proc/$pid/fdinfo).
  • Pipe parameters.
  • Memory maps (via /proc/$pid/maps).
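
As an illustration of the kind of parsing involved, here is a minimal sketch of reading /proc/$pid/maps. It is not crtools code: the Vma record and the helper names (parse_maps_line, parse_maps, executable_vmas, read_maps) are hypothetical and chosen only for this example. The field layout (address range, permissions, offset, device, inode, optional path) is the one documented for /proc/$pid/maps.

```python
from typing import List, NamedTuple


class Vma(NamedTuple):
    """Hypothetical record for one line of /proc/$pid/maps."""
    start: int
    end: int
    perms: str
    offset: int
    path: str


def parse_maps_line(line: str) -> Vma:
    # Line layout: start-end perms offset dev inode [path]
    fields = line.split(None, 5)
    start_s, end_s = fields[0].split("-")
    # Anonymous mappings have no path field at all.
    path = fields[5].strip() if len(fields) > 5 else ""
    return Vma(int(start_s, 16), int(end_s, 16), fields[1],
               int(fields[2], 16), path)


def parse_maps(text: str) -> List[Vma]:
    return [parse_maps_line(l) for l in text.splitlines() if l.strip()]


def executable_vmas(vmas: List[Vma]) -> List[Vma]:
    # Keep only mappings with the execute permission bit set.
    return [v for v in vmas if "x" in v.perms]


def read_maps(pid: int) -> List[Vma]:
    # Reads the live file; requires Linux and permission to inspect the pid.
    with open(f"/proc/{pid}/maps") as f:
        return parse_maps(f.read())
```

On a Linux system, read_maps(os.getpid()) (after importing os) would list the mappings of the calling process. executable_vmas() is included because the dumper later scans executable areas.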

The process dumper (let's simply call it the dumper from here on) performs the following steps during the checkpoint stage:

  1. The $pid of a process group leader is obtained from the command line.
  2. Using this $pid, the dumper walks through /proc/$pid/status and gathers the children's $pids recursively. At the end we have a process tree.
  3. It then takes every $pid from the process tree, sends SIGSTOP to every process found, and performs the following steps on each $pid.
    • Collects VMA areas by parsing /proc/$pid/maps.
    • Seizes the task via the relatively new ptrace interface. Seizing a task means putting it into a special state in which the task has no idea that it is being operated on by ptrace.
    • Core parameters of the task (such as registers and friends) are dumped via the ptrace interface and by parsing the /proc/$pid/stat entry.
    • The dumper injects parasite code into the task via the ptrace interface. This allows us to dump the task's pages right from within the task's address space.
      • The injection procedure is pretty simple: the dumper scans the executable VMA areas of the task (which were collected previously) and tests whether there is a place for a syscall instruction; then (also via ptrace) it substitutes the original code with syscall instructions and creates a new VMA area inside the process address space.
      • Finally, the parasite code is copied into the new VMA, and the original code that was modified during the parasite bootstrap procedure is restored.
    • Then, using the parasite code, the dumper flushes the contents of the task's pages to a file and removes the parasite code block completely, since it is no longer needed.
    • Once the parasite is removed, the task is unseized via a ptrace call, but it still remains stopped.
    • The dumper writes out the parameters and data of files and pipes.
  4. The procedure continues for every $pid.
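
The tree-gathering and stopping steps above can be sketched roughly as follows. This is an assumption-laden illustration rather than the dumper's actual code: it assumes the tree is discovered through the PPid field of each /proc/$pid/status file, and the helper names (parse_ppid, read_ppids, collect_tree, stop_tree) are hypothetical. The ptrace-based steps (seizing, parasite injection) are not covered here.

```python
import os
import signal
from typing import Dict, List


def parse_ppid(status_text: str) -> int:
    # /proc/$pid/status contains a line of the form "PPid:\t<parent pid>".
    for line in status_text.splitlines():
        if line.startswith("PPid:"):
            return int(line.split()[1])
    raise ValueError("no PPid field found")


def read_ppids(proc_root: str = "/proc") -> Dict[int, int]:
    # Map every visible pid to its parent pid.
    ppids = {}
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "status")) as f:
                ppids[int(name)] = parse_ppid(f.read())
        except (OSError, ValueError):
            continue  # the process may have exited in the meantime
    return ppids


def collect_tree(root: int, ppids: Dict[int, int]) -> List[int]:
    # Invert the pid -> ppid map, then walk it depth-first from the root.
    children: Dict[int, List[int]] = {}
    for pid, ppid in ppids.items():
        children.setdefault(ppid, []).append(pid)
    order, stack = [], [root]
    while stack:
        pid = stack.pop()
        order.append(pid)
        stack.extend(sorted(children.get(pid, []), reverse=True))
    return order


def stop_tree(pids: List[int]) -> None:
    # Send SIGSTOP to every process in the tree, as in step 3.
    for pid in pids:
        os.kill(pid, signal.SIGSTOP)
```

A caller would combine these as stop_tree(collect_tree(leader_pid, read_ppids())), where leader_pid stands for the $pid taken from the command line. Note that this sketch does not handle processes forked between the scan and the stop.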

Restore

The restore procedure (aka the restorer) proceeds in the following steps:

  1. The process tree is read from a file.
  2. Every process is started with its saved (i.e. original) $pid via a clone() call with the new CLONE_CHILD_USEPID flag.
  3. Files and pipes are restored (that is, they are opened and positioned).
  4. A new file is generated. The file has ELF format but with modified executable and program header types, which tell the kernel that this is not a regular ELF file and needs to be handled in a slightly different way.
  5. Finally, execve() is called with the new ELF file as an argument, which initiates the kernel's stage of the restore procedure.
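
To make "opened and positioned" in step 3 concrete, here is a minimal sketch for a regular file. The SavedFile record and restore_file() are hypothetical names used only for illustration; the actual image format written by the dumper is not described here, and pipes would need different handling.

```python
import os
from typing import NamedTuple


class SavedFile(NamedTuple):
    """Hypothetical checkpoint record for one regular file."""
    path: str   # path the file was opened with
    flags: int  # open flags, e.g. os.O_RDONLY
    pos: int    # file offset at checkpoint time


def restore_file(saved: SavedFile) -> int:
    # Reopen the file with the saved flags...
    fd = os.open(saved.path, saved.flags)
    # ...and move the offset back to where it was at checkpoint time.
    os.lseek(fd, saved.pos, os.SEEK_SET)
    return fd
```

A real restorer would presumably also have to place the descriptor at its original number (for example with os.dup2()), which this sketch leaves out.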