Containers/Mini-summit 2008 notes

From OpenVZ Virtuozzo Containers Wiki
Jump to: navigation, search

Intros (8:36am)

       Dave Hansen
       Eric Biederman
       Jason Byron redhat
       Joe Rusio - evergreen
       Joe McDonald
       HP China
       Sonny Rao
       HP
       HP
       Matine Silberman HP
       Sandy Harris
       NEC Japan
       John Schultz aol
       Pavel Emelyanov, Parallels/OpenVZ
       Denis Lunev, Parallels/OpenVZ
       Constant Chan
       Benjamin
       Daniel
       Serge

On Phone:

       Amy Griffith HP

(Later walk-ins)

Topics:

Why do various companies want containers?

       ibm: workload management
       EB: using containers as improved chroot
       HP: wants similar to ibm, plus security
       parallels: hosted providers

sysfs issues

       EB gives status: should go into next merge window

mini-namespaces

       NFS
               clients should behave differently on diff. containers
               currently uses single sunrpc transport for all containers
       Dave: is there a list of all openvz mini-ns?
       EB:
               proposal:
                       create little filesystems
                       still store everything in nsproxy
               currently:
                       some people want same process in different netns's
                       almost possible now, but can't open new sockets
               namespace enter:
                       3 purposes
                               login
                               monitoring
                               configuring
               may be worth prototyping the proposal
                       address mqns, or sunrpc, or fuse?
       DH:
               openvz addresses this using one big clone(), right?
               (yes)

userid namespaces

       EB summarizes his proposal
               userid ns is unsharable without privilege
               userids, capabilities, security labels become ns-local
               hierarchical like pidns
       openvz: just does chroot
       DH:
               observers that system vs. app containers have different requirements
       EB:
               so with userid namespaces, user has god-like powers over created namespaces
       EB+SH will talk about hacking something this week during ols
       Uses:
               user unttrusted mounts
               build systems

device namespaces

       tty namespaces rejected
       should be solved with generic device namespaces
               virtualize the major:minor->device mapping
       reserved device numbers (unnamed)
               created with /proc?
               get_unnamed_device()
       tty ideas:
               use selinux ptys
               use user namespaces
               use legacy ptys
               leverage ptyfs
       Suka is not on, so he gets volunteered to do pure /dev/pts fs approach

per-container LSMs:

       SH: thinks LSMs should handle it
       EB:
               original purpose of chroot
               set up policies from inside container
               creating smack container inside selinux would be ideal

entering a container

       netns: identified using pid of a ns
       sh: can we solve this using EB's namespace filesystems proposal?
       (EB goes to the board to demonstrate his proposal)
       PM: Can we use control groups?
       PE: Can we re-use /proc/pid/ ?
       EB: could have a ns with no processes in it
       Example of command using this:
               ip set eth0 netns <pid>
               becomes
               ip set eth0 netns /proc/<pid>/
       DL:
               a real netns problem is knowing when a childns has died
               the netnsfs mount could solve that
       PE: EB, can you send POC patches for the namespace?
               EB and EM will both send their own POC.

DL: people have complained about needing CAP_SYS_ADMIN to unshare ns

       EB: example, setuid root sysvipc-using program could be fooled

PE: Entering a container:

       reasons:
               monitoring
               enter an administrative command
       DH: how do you do it now?
       PE: numerical ID for each VE, use it to enter
       EB:
               one need for entering: /sbin/hotplug
       (someone): does hijack suffice?
       EB: two cases:
               partial entering
               full entering
               sys_hijack does not address partial entering
       DH:
               why need partial entering?
               fs stuff can be done without entering
       PM: privileged process
       PE:
               will look at hijack patches
               someone will re-send hijack to containers@
               EB:
                       if we can do sys_hijack cleanly,
                       we can use it to solve kthread problem