Containers/Mini-summit 2008 notes

From OpenVZ Virtuozzo Containers Wiki
Revision as of 15:08, 22 July 2008 by Dhaval (talk | contribs)


Intros (8:36am)

       Dave Hansen
       Eric Biederman
       Jason Byron, Red Hat
       Joe Rusio, Evergreen
       Joe McDonald
       HP China
       Sonny Rao
       HP
       HP
       Matine Silberman, HP
       Sandy Harris
       NEC Japan
       John Schultz, AOL
       Pavel Emelyanov, Parallels/OpenVZ
       Denis Lunev, Parallels/OpenVZ
       Constant Chan
       Benjamin Thery, Bull
       Daniel Lezcano, IBM
       Serge Hallyn, IBM

On Phone:

       Amy Griffith, HP
       Dhaval Giani, IBM

(Later walk-ins)

Topics:

Why do various companies want containers?

       IBM: workload management
       EB: using containers as an improved chroot
       HP: wants something similar to IBM, plus security
       Parallels: hosting providers

sysfs issues

       EB gives status: should go into next merge window

mini-namespaces

       NFS
               clients should behave differently in different containers
               currently uses a single sunrpc transport for all containers
       DH: is there a list of all OpenVZ mini-ns?
       EB:
               proposal:
                       create little filesystems
                       still store everything in nsproxy
               currently:
                       some people want same process in different netns's
                       almost possible now, but can't open new sockets
               namespace enter:
                       3 purposes
                               login
                               monitoring
                               configuring
               may be worth prototyping the proposal
                       address mqns, or sunrpc, or fuse?
       DH:
               OpenVZ addresses this using one big clone(), right?
               (yes)
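
The "one big clone()" approach can be pictured as OR-ing every namespace flag into a single clone() call, in contrast to many small per-subsystem namespaces. A minimal sketch follows. The flag values are the ones defined in <linux/sched.h>; the helper name all_in_one_flags is hypothetical, nothing here actually calls clone(), and the notes do not say which flags OpenVZ itself uses.

```python
# Namespace flags as defined in <linux/sched.h>.
CLONE_NEWNS = 0x00020000    # mount namespace
CLONE_NEWUTS = 0x04000000   # hostname / domainname
CLONE_NEWIPC = 0x08000000   # System V IPC
CLONE_NEWUSER = 0x10000000  # user IDs
CLONE_NEWPID = 0x20000000   # process IDs
CLONE_NEWNET = 0x40000000   # network stack


def all_in_one_flags(*flags):
    """Hypothetical helper: combine namespace flags for one clone() call."""
    combined = 0
    for flag in flags:
        combined |= flag
    return combined


# A "system container" style request: everything except the user namespace.
container_flags = all_in_one_flags(
    CLONE_NEWNS, CLONE_NEWUTS, CLONE_NEWIPC, CLONE_NEWPID, CLONE_NEWNET
)
print(hex(container_flags))
```

Each mini-namespace added this way would need its own flag bit, which is one motivation for looking at alternatives such as the small-filesystem proposal.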

userid namespaces

       EB summarizes his proposal
               a userid ns can be unshared without privilege
               userids, capabilities, security labels become ns-local
               hierarchical like pidns
       OpenVZ: just does chroot
       DH:
               observes that system vs. app containers have different requirements
       EB:
               so with userid namespaces, a user has god-like powers over the namespaces they create
       EB+SH will talk about hacking something this week during ols
       Uses:
               untrusted user mounts
               build systems
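
The hierarchical model summarized above (namespaces nest like pid namespaces, and the creating user has full power over what it creates) can be illustrated with a toy model. This is only a sketch under assumptions: the class UserNamespace and the function has_full_power are hypothetical names, and the rule encoded here is one plausible reading of the notes, not the kernel's actual permission check.

```python
class UserNamespace:
    """Toy model of a hierarchical userid namespace (not kernel code)."""

    def __init__(self, creator_uid, parent=None):
        self.creator_uid = creator_uid  # uid, in the parent ns, that created this ns
        self.parent = parent            # None for the initial namespace


def has_full_power(uid, uid_ns, target_ns):
    """True if `uid` in `uid_ns` created `target_ns` or one of its ancestors.

    Walks up from target_ns toward uid_ns, looking for a direct child of
    uid_ns that was created by uid.
    """
    ns = target_ns
    while ns is not None and ns is not uid_ns:
        if ns.parent is uid_ns and ns.creator_uid == uid:
            return True
        ns = ns.parent
    return False


init_ns = UserNamespace(creator_uid=0)
child = UserNamespace(creator_uid=500, parent=init_ns)  # made by uid 500
grandchild = UserNamespace(creator_uid=0, parent=child)  # made by "root" in child

print(has_full_power(500, init_ns, grandchild))
print(has_full_power(501, init_ns, child))
```

In this model uid 500 controls both the child and the grandchild, while uid 501 controls neither, and nothing inside a child namespace gains power over its parent.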

device namespaces

       tty namespaces rejected
       should be solved with generic device namespaces
               virtualize the major:minor->device mapping
       reserved device numbers (unnamed)
               created with /proc?
               get_unnamed_device()
       tty ideas:
               use selinux ptys
               use user namespaces
               use legacy ptys
               leverage ptyfs
       Suka is not on the call, so he gets volunteered to do the pure /dev/pts fs approach
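
The idea of virtualizing the major:minor->device mapping can be sketched as one lookup table per namespace, so that the same device number resolves to a different device in each container. The class DeviceNamespace and the device labels are hypothetical and purely illustrative; 136 is the major number of Unix98 pty slaves.

```python
class DeviceNamespace:
    """Toy model: each namespace owns its own (major, minor) -> device table."""

    def __init__(self):
        self._table = {}

    def register(self, major, minor, device):
        self._table[(major, minor)] = device

    def lookup(self, major, minor):
        # The same numbers may resolve to different devices in different
        # namespaces, or to nothing at all.
        return self._table.get((major, minor))


PTS_MAJOR = 136  # Unix98 pty slaves

container_a = DeviceNamespace()
container_b = DeviceNamespace()
container_a.register(PTS_MAJOR, 0, "pty of container A")
container_b.register(PTS_MAJOR, 0, "pty of container B")

print(container_a.lookup(PTS_MAJOR, 0))
print(container_b.lookup(PTS_MAJOR, 0))
```

With such a table, both containers can own a device numbered 136:0 without seeing each other's terminal.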

per-container LSMs:

       SH: thinks LSMs should handle it
       EB:
               original purpose of chroot
               set up policies from inside container
               creating smack container inside selinux would be ideal

entering a container

       netns: identified using pid of a ns
       SH: can we solve this using EB's namespace filesystems proposal?
       (EB goes to the board to demonstrate his proposal)
       PM: Can we use control groups?
       PE: Can we re-use /proc/pid/ ?
       EB: could have a ns with no processes in it
       Example of command using this:
               ip link set eth0 netns <pid>
               becomes
               ip link set eth0 netns /proc/<pid>/
       DL:
               a real netns problem is knowing when a childns has died
               the netnsfs mount could solve that
       PE: EB, can you send POC patches for the namespace?
               EB and EM will both send their own POC.
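
To show how a tool could accept both ways of naming a namespace during a transition, here is a small sketch that parses either a bare pid or the proposed /proc/<pid>/ form. The function name netns_target_pid is hypothetical, and the path form is only the proposal recorded above, not an established interface. Note that a pid-only scheme cannot name a namespace that has no processes in it, which is the limitation EB raises.

```python
import re


def netns_target_pid(arg):
    """Hypothetical parser: accept '<pid>' or '/proc/<pid>/' and return the pid."""
    match = re.fullmatch(r"(\d+)|/proc/(\d+)/?", arg)
    if match is None:
        raise ValueError(f"not a pid or /proc/<pid>/ path: {arg!r}")
    # Exactly one of the two groups matched.
    return int(match.group(1) or match.group(2))


print(netns_target_pid("1234"))
print(netns_target_pid("/proc/1234/"))
```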

DL: people have complained about needing CAP_SYS_ADMIN to unshare ns

       EB: example, setuid root sysvipc-using program could be fooled

PE: Entering a container:

       reasons:
               monitoring
               enter an administrative command
       DH: how do you do it now?
       PE: numerical ID for each VE, use it to enter
       EB:
               one need for entering: /sbin/hotplug
       (someone): does hijack suffice?
       EB: two cases:
               partial entering
               full entering
               sys_hijack does not address partial entering
       DH:
               why need partial entering?
               fs stuff can be done without entering
       PM: privileged process
       PE:
               will look at hijack patches
               someone will re-send hijack to containers@
               EB:
                       if we can do sys_hijack cleanly,
                       we can use it to solve kthread problem
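
The distinction between partial and full entering can be modeled by treating a task's namespace set (the kernel's nsproxy) as a record with one reference per namespace type: full entering replaces every reference, partial entering replaces only a chosen subset. This is a toy sketch; the field names, the string labels, and the enter function are hypothetical simplifications, not the kernel's data structure or any proposed syscall.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class NsProxy:
    """Toy stand-in for nsproxy: one reference per namespace type."""
    mnt: str
    uts: str
    ipc: str
    pid: str
    net: str


def enter(current, target, which=None):
    """Return the namespace set after entering `target`.

    which=None models full entering; a list of field names models
    partial entering, where only those namespaces are switched.
    """
    names = [f.name for f in fields(NsProxy)] if which is None else list(which)
    return replace(current, **{name: getattr(target, name) for name in names})


host = NsProxy("mnt0", "uts0", "ipc0", "pid0", "net0")
ve = NsProxy("mnt1", "uts1", "ipc1", "pid1", "net1")

monitor = enter(host, ve, which=["net"])  # partial: watch the VE's network only
shell = enter(host, ve)                   # full: become a task inside the VE

print(monitor)
print(shell)
```

A monitoring tool in this model keeps the host's filesystem and pid view while seeing the container's network, which is the kind of case a full-entering mechanism alone would not cover.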