Difference between revisions of "Containers/Mini-summit 2008"

From OpenVZ Virtuozzo Containers Wiki
(is -> was, other minor fixes)
 
(37 intermediate revisions by 25 users not shown)
Line 1: Line 1:
There will be a containers mini-summit at the [http://www.linuxsymposium.org/2008/ OLS'08]. This page is for organizing this mini-summit. Feel free to edit.
+
There was a containers mini-summit at the [http://www.linuxsymposium.org/2008/ OLS'08]. This page served for organizing this event.
  
'''When''': 22nd of July 2008<br/>
+
'''When''': 22nd of July 2008, 8:30-16:30<br/>
'''Where''': Ottawa, ON, Canada.
+
'''Where''': Ottawa, ON, Canada, Novotel Hotel (Albion A).
 +
 
 +
== Notes ==
 +
 
 +
For the notes from the mini-summit, '''see [[Containers/Mini-summit 2008 notes]].'''
  
 
== Proposal ==
 
== Proposal ==
  
The mini-summit proposal sent to OLS organizers. '''See [[/Proposal|proposal]]'''.
+
The mini-summit proposal sent to OLS organizers; '''see [[/Proposal|proposal]]'''.
  
 
== Topics to discuss ==
 
== Topics to discuss ==
Line 24: Line 28:
 
<!-- Put this in three columns if browser is smart enough -->
 
<!-- Put this in three columns if browser is smart enough -->
 
<div style="-moz-column-count:3; -webkit-column-count:3; column-count:3; text-align: left; background: #fefef0; border: 1px solid #ddddc0;">
 
<div style="-moz-column-count:3; -webkit-column-count:3; column-count:3; text-align: left; background: #fefef0; border: 1px solid #ddddc0;">
# Kir Kolyshkin
 
 
# Pavel Emelyanov
 
# Pavel Emelyanov
 
# Denis Lunev
 
# Denis Lunev
Line 47: Line 50:
 
# Constant Chan
 
# Constant Chan
 
# Linda Knippers
 
# Linda Knippers
# Uchida Satoshi
+
# Satoshi Uchida
# Takahashi Masahiko
+
# Masahiko Takahashi
 +
# Martine Silbermann
 +
# Benoit des Ligneris
 +
# Patrick Naubert
 +
# Daisuke Nishimura
 +
# Sudhir Kumar
 +
# Munehiro Ikeda
 +
# Kamalesh Babulal
 +
# John Schulz
 +
# Poornima Nayak
 +
# Gyuil Cha
 +
# YoungHo Kim
 +
# Rob Woolley
 +
# Daniel Robbins
 +
# Jason Baron
 +
# Subrata Modak
 +
# Veerendra C
 +
# Joe MacDonald
 +
# Andrew Theurer
 +
# Myron Stowe
 +
# Peter Teoh
 +
# Ricky Liang
 
</div>
 
</div>
  
 
== Agenda ==
 
== Agenda ==
  
* Namespaces/Containers
+
* Namespaces/Containers (8:30am-11am)
 +
** sysfs issues (and any /proc issues)
 +
*** uevents/hotplug
 +
** Network namespaces issues
 +
*** multiple namespaces in one process
 +
** Device namespace design?
 +
** User namespace
 
** Additional needed namespaces
 
** Additional needed namespaces
 
*** Small namespaces ''What to do with small subsystem that might need virtualization. E.g. in openvz we have FUSE, binfmt_misc and some other small stuff virtualized. But how to merge it in mainline? Create a separate namespace for each? Merge them into one? How to call this then?''
 
*** Small namespaces ''What to do with small subsystem that might need virtualization. E.g. in openvz we have FUSE, binfmt_misc and some other small stuff virtualized. But how to merge it in mainline? Create a separate namespace for each? Merge them into one? How to call this then?''
** Nature of a 'container' — kernel object or userspace fiction
 
** Handling of /proc and /sysfs within containers
 
 
** Handling filesystem/namespace synchronization  (not sure what the issue is)
 
** Handling filesystem/namespace synchronization  (not sure what the issue is)
** How to enter a container
+
** Container design
** User namespaces?
+
*** How to enter a container
* Cgroups+Resource management
+
*** Nature of a 'container' — kernel object or userspace fiction
 +
 
 +
* Cgroups+Resource management (11:30-2pm)
 
** Cgroup implementation
 
** Cgroup implementation
 +
*** Locking (don't let cgroup_lock() become the BKL)
 +
*** Transactional attachment
 +
*** "procs" file
 +
*** User-space notification API
 +
**** Resource counter hit soft/hard limit
 +
**** Task entered/left cgroup
 +
**** OOM occurred
 +
*** Binary statistics API
 +
** Existing cgroups
 +
*** Memory (Balbir's NOTE: I would prefer to take some of this discussion to my memory controller BoF on Wednesday. Let's discuss this at the end)
 +
**** Supporting over-commit and guarantees
 +
**** Soft-limits
 +
**** Hierarchical borrowing - in kernel or userspace?
 +
**** Per-cgroup refault information?
 +
*** Kernel memory
 +
*** Device
 +
*** Memrlimit
 +
**** Some push-back over this - can we give real use cases?
 +
*** CPU scheduler
 
** Additional cgroups and their design
 
** Additional cgroups and their design
** libcg - userspace explotation of control                           groups/resource management
+
*** Swap (separate subsystem or merge with memory?)
** Resource management
+
*** Disk I/O (several proposed designs)
* Checkpoint/Restart
+
*** Network traffic classification
** Summary of existing c/r patchsets/designs
+
*** Freezer
** How to initiate and synchronize checkpoint/restart
+
*** Signaller
** (state of freezer subsystem?)
+
*** OOM Handler
** Memory state dump
+
** libcg - userspace exploitation of control groups/resource management
** How to dump/fetch data for resource (file, ipc) checkpoint
+
*** Overview so far
** How to do restart
+
*** Is kernel-based reclassification needed?
** Hopefully we can make decisions here, and get a bit of a hackfest going during OLS
+
*** Real use-cases
 +
*** Future directions
 +
 
 +
 
 +
* Checkpoint/Restart (2:30pm-5pm)
 +
** Documentation : Look at "See Also" section below
 +
** Goals and expectations of this summit
 +
*** identify, discuss and (if possible) agree on the general design
 +
*** identify, discuss and (if possible) agree on the technical points
 +
*** decide on priorities for different components (eg. high, medium, low)  such that the final outcome is a practical road-map that would keep us busy for (at least) until the next OLS (though the "O" may change ;)
 +
** What are the problems that the linux community can solve with the checkpoint/restart ?
 +
** Preparing the kernel internals
 +
*** How we implement it without affecting long term maintainability ?
 +
*** What are the kernel subsystems, process resources and framework for CR ?
 +
*** Which pieces to target first ?
 +
 
 +
The following technical points can be discussed during the mini-summit if we have time or later at the OLS.
 +
 
 +
** Checkpointing / Restarting
 +
*** Reaching a quiescent point - network, processes, aio, avoiding side effects of quiesce/revive
 +
*** Checkpoint - signal handler ? syscall ? crfs ? process hierarchy, resource dependencies, system and process resources
 +
*** Restarting - New binary format handler ? converting between formats (from older kernel to newer)
 +
*** Notification to processes which explicitly wish to be notified about quiesce, checkpoint and restart - container state ? new signals ?
 +
** Determining the userspace API - Posix 1003.1m ?
 +
** Passing the kernel internal state to/from userspace - coredump like file ? swap per container ? netlinks, CR filesystem ? army of different call for the CR (proc, existing syscalls, ...)
 +
** Hopefully we can continue to discuss in the next days and get a bit of a hackfest going during OLS :)
  
 
== Moderators ==
 
== Moderators ==
Line 84: Line 158:
 
* http://www.linuxsymposium.org/2008/cfp.php — OLS call for papers
 
* http://www.linuxsymposium.org/2008/cfp.php — OLS call for papers
 
* https://lists.linux-foundation.org/pipermail/containers/2008-January/009688.html
 
* https://lists.linux-foundation.org/pipermail/containers/2008-January/009688.html
 +
* http://openvz.org/pipermail/devel/2008-July/012891.html
 +
* Checkpoint/Restart
 +
** Zap : http://www.ncl.cs.columbia.edu/publications/usenix2007_fordist.pdf
 +
** Metacluster : http://lxc.sourceforge.net/doc/ols2006/lxc-ols2006.pdf
 +
** OpenVZ : [[Checkpointing and live migration]]
 +
** Checkpoint/Restart technology : http://en.wikipedia.org/wiki/Application_checkpointing
 +
** Virtual Servers and Checkpoint/Restart in Mainstream Linux : Sigops document
 +
** Remote fork: http://www.cse.nd.edu/~dthain/courses/classconf/wowsys2004/talks/rfork.pdf
 +
** Vmadump : http://bproc.sourceforge.net/c268.html
 +
** Posix CR : http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi/0650/bks/SGI_Admin/CPR_OG/sgi_html/ch03.html
 +
** An OS services overview : http://sw-eng.falls-church.va.us/itsg/P08V31.htm
  
 
[[Category: Containers]]
 
[[Category: Containers]]
 
[[Category: Events]]
 
[[Category: Events]]

Latest revision as of 17:18, 18 September 2008

A containers mini-summit was held at OLS'08. This page was used to organize the event.

When: 22nd of July 2008, 8:30-16:30
Where: Ottawa, ON, Canada, Novotel Hotel (Albion A).

Notes

For the notes from the mini-summit, see Containers/Mini-summit 2008 notes.

Proposal

The mini-summit proposal sent to OLS organizers; see proposal.

Topics to discuss

  • Device accessibility cgroup (maybe with remap ability)
  • TTYs
  • Syslog
  • Checkpoint/restart
  • Memory controllers
  • more?..

List of attendees

Please fill in your name here if you are going to attend, or email kir at openvz dot org if you are too lazy. The list is by no means final, so add your name even if you are not sure you can make it.

This list is in no particular order.

  1. Pavel Emelyanov
  2. Denis Lunev
  3. Andrey Mirkin
  4. Serge Hallyn
  5. Dave Hansen
  6. Daniel Lezcano
  7. Srivatsa Vaddagiri
  8. Balbir Singh
  9. Sukadev Bhattiprolu
  10. Paul Menage
  11. Eric W. Biederman
  12. Oren Laadan
  13. Yamamoto Takashi
  14. Kamezawa Hiroyuki
  15. Benjamin Thery
  16. Herbert Pötzl
  17. Oleg Nesterov
  18. Dhaval Giani
  19. Bart Trojanowski
  20. Joseph Ruscio
  21. Constant Chan
  22. Linda Knippers
  23. Satoshi Uchida
  24. Masahiko Takahashi
  25. Martine Silbermann
  26. Benoit des Ligneris
  27. Patrick Naubert
  28. Daisuke Nishimura
  29. Sudhir Kumar
  30. Munehiro Ikeda
  31. Kamalesh Babulal
  32. John Schulz
  33. Poornima Nayak
  34. Gyuil Cha
  35. YoungHo Kim
  36. Rob Woolley
  37. Daniel Robbins
  38. Jason Baron
  39. Subrata Modak
  40. Veerendra C
  41. Joe MacDonald
  42. Andrew Theurer
  43. Myron Stowe
  44. Peter Teoh
  45. Ricky Liang

Agenda

  • Namespaces/Containers (8:30am-11am)
    • sysfs issues (and any /proc issues)
      • uevents/hotplug
    • Network namespaces issues
      • multiple namespaces in one process
    • Device namespace design?
    • User namespace
    • Additional needed namespaces
      • Small namespaces: what to do with small subsystems that might need virtualization? E.g. in OpenVZ we have FUSE, binfmt_misc and some other small stuff virtualized. But how do we merge it into mainline? Create a separate namespace for each? Merge them into one? What should it be called then?
    • Handling filesystem/namespace synchronization (not sure what the issue is)
    • Container design
      • How to enter a container
      • Nature of a 'container' — kernel object or userspace fiction
  • Cgroups+Resource management (11:30am-2pm)
    • Cgroup implementation
      • Locking (don't let cgroup_lock() become the BKL)
      • Transactional attachment
      • "procs" file
      • User-space notification API
        • Resource counter hit soft/hard limit
        • Task entered/left cgroup
        • OOM occurred
      • Binary statistics API
    • Existing cgroups
      • Memory (Balbir's NOTE: I would prefer to take some of this discussion to my memory controller BoF on Wednesday. Let's discuss this at the end)
        • Supporting over-commit and guarantees
        • Soft-limits
        • Hierarchical borrowing - in kernel or userspace?
        • Per-cgroup refault information?
      • Kernel memory
      • Device
      • Memrlimit
        • Some push-back over this - can we give real use cases?
      • CPU scheduler
    • Additional cgroups and their design
      • Swap (separate subsystem or merge with memory?)
      • Disk I/O (several proposed designs)
      • Network traffic classification
      • Freezer
      • Signaller
      • OOM Handler
    • libcg - userspace exploitation of control groups/resource management
      • Overview so far
      • Is kernel-based reclassification needed?
      • Real use-cases
      • Future directions


  • Checkpoint/Restart (2:30pm-5pm)
    • Documentation : Look at "See Also" section below
    • Goals and expectations of this summit
      • identify, discuss and (if possible) agree on the general design
      • identify, discuss and (if possible) agree on the technical points
      • decide on priorities for different components (e.g. high, medium, low) such that the final outcome is a practical road-map that would keep us busy (at least) until the next OLS (though the "O" may change ;)
    • What problems can the Linux community solve with checkpoint/restart?
    • Preparing the kernel internals
      • How do we implement it without affecting long-term maintainability?
      • What are the kernel subsystems, process resources and framework for CR?
      • Which pieces to target first?

The following technical points can be discussed during the mini-summit if we have time or later at the OLS.

    • Checkpointing / Restarting
      • Reaching a quiescent point - network, processes, aio, avoiding side effects of quiesce/revive
      • Checkpoint - signal handler? syscall? crfs? process hierarchy, resource dependencies, system and process resources
      • Restarting - new binary format handler? converting between formats (from older kernel to newer)
      • Notification to processes which explicitly wish to be notified about quiesce, checkpoint and restart - container state? new signals?
    • Determining the userspace API - POSIX 1003.1m?
    • Passing the kernel internal state to/from userspace - coredump-like file? swap per container? netlinks, CR filesystem? army of different calls for the CR (proc, existing syscalls, ...)
    • Hopefully we can continue the discussion over the following days and get a bit of a hackfest going during OLS :)

Moderators

  • Namespaces/containers: Serge Hallyn, Dave Hansen
  • Cgroups and resource management: Paul Menage, Balbir Singh, Dhaval Giani
  • Checkpoint/restart: Daniel Lezcano, Oren Laadan

See also