OpenVZ Virtuozzo Containers Wiki β

Download/kernel/rhel5/028stab059.3/changes

Revision as of 15:44, 24 October 2008 by Kir (talk | contribs) (gfs not working)

Changes

Since 028stab057.2:

Compatibility

  • The gfs.ko module cannot be loaded. If you depend on it, please use the previous stable kernel.

Configs

Same as in 028stab057.2.


Patches

diff-cpt-synwait-restore-lock-20080821

Patch from Pavel Emelianov <xemul@openvz.org>

cpt: lock sock before restoring its synwait queue

The new socket already has all the necessary TCP timers armed, so tcp_keepalive_timer can fire during rst_restore_synwait_queue and, because the latter is lockless, corrupt the queue.

Locking in the restore procedure is required.
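
A minimal userspace sketch of the idea, assuming a pthread mutex stands in for the kernel socket lock. The names toy_sock, keepalive_timer and restore_synwait_queue are hypothetical and are not the cpt code:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket with a pending-connection queue. */
struct toy_sock {
    pthread_mutex_t lock;   /* plays the role of the socket lock */
    int synwait_len;        /* number of queued connection requests */
};

/* Timer path: takes the lock before touching the queue. */
static void keepalive_timer(struct toy_sock *sk)
{
    pthread_mutex_lock(&sk->lock);
    if (sk->synwait_len > 0)
        sk->synwait_len--;  /* e.g. prune one expired request */
    pthread_mutex_unlock(&sk->lock);
}

/* Restore path: holds the same lock while refilling the queue. */
static void restore_synwait_queue(struct toy_sock *sk, int nr_requests)
{
    pthread_mutex_lock(&sk->lock);
    for (int i = 0; i < nr_requests; i++)
        sk->synwait_len++;
    pthread_mutex_unlock(&sk->lock);
}
```

With both paths serialized on one lock, a timer that fires in the middle of a restore waits until the queue is consistent again.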

Bug #118912.

diff-fs-quotcompat-xencomp-fix-20080806

Patch from Marat Stanichenko <mstanichenko@openvz.org>

quota: Compilation fix for XEN kernels

CONFIG_QUOTA_COMPAT is not enabled in the Xen config, so we run into undefined structures.

Bug #118177.

diff-ms-add-limits_h-to-sumversions_c-20080820

Patch from Pavel Emelianov <xemul@openvz.org>

Fix sumversion.c compilation with some modern compilers

OpenVZ Bug #951.

diff-ms-cifs-lanman-off-compilation-20080820

Patch from Pavel Emelianov <xemul@openvz.org>

cifs: fix compilation for no-lanman case

backported mainstream commit 516897a208bc1423d561ce2ccce0624c3b652275

OpenVZ Bug #951.

diff-ms-fix-xfrm-compilation-20080818

Patch from Vitaliy Gusev <vgusev@openvz.org>

[PATCH] Fix compilation error when CONFIG_XFRM is not set.

OpenVZ Bug #963.

diff-rh-export-flush_tbl_page-for-xen-20080806

Patch from Marat Stanichenko <mstanichenko@openvz.org>

xen: Fix build for x86_64 arch

We have to export the flush_tlb_page() symbol in the Xen x86_64 kernel because the cpt modules use it.

Bugs #118432, #118177.

diff-rh-xen-include-cacheflush-20080806

Patch from Marat Stanichenko <mstanichenko@openvz.org>

xen: Badly placed «#include» directives cause a Xen-i386 compilation error.

 arch/i386/mm/ioremap-xen.c
  |#include <asm/cacheflush.h> (#define _I386_CACHEFLUSH_H)
    |#include <linux/mm.h>
      |#include <linux/pagemap.h>
        |#include <linux/highmem.h>
           |#include <asm/cacheflush.h>

(but nothing is actually included at that point, see #define _I386_CACHEFLUSH_H).

highmem.h uses functions from <asm/cacheflush.h>, so those functions are used before they are defined.

Bug #118177.

diff-rh-xfrm-more-macros-compilation-20080820

Patch from Pavel Emelianov <xemul@openvz.org>

xfrm: more compilation fixes for weird OpenVZ user configs

OpenVZ Bug #951.

diff-ubc-subbcino-gen-fix-20080820

Patch from Pavel Emelianov <xemul@openvz.org>

bc: fix subbeancounter inode number calculations in /proc/bc

0 and 0.0 still have the same number…

Bug #116868.

diff-ve-nf-ct-checksum-ro-inve-20080722

Patch from Vasily Averin <vvs@openvz.org>

netfilter: Fix broken isolation for ip_conntrack_checksum sysctl

net.ipv4.netfilter.ip_conntrack_checksum should be read-only inside VE.
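
The intended behaviour can be sketched in userspace C as follows. This is an illustration only: toy_ctx, checksum_sysctl and the choice of -EPERM as the error code are assumptions, not the actual kernel handler:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical caller context: true when running on the host (VE0). */
struct toy_ctx {
    bool is_host;
};

static int conntrack_checksum = 1;  /* stand-in for the sysctl value */

/* Reads always succeed; writes are refused inside a container. */
static int checksum_sysctl(const struct toy_ctx *ctx, bool write, int *val)
{
    if (!write) {
        *val = conntrack_checksum;
        return 0;
    }
    if (!ctx->is_host)
        return -EPERM;              /* assumed error code */
    conntrack_checksum = *val;
    return 0;
}
```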

Bug #117138.

diff-vfs-lock-inversion-in-drop_pagecache_sb-20080820

Patch from Dmitry Monakhov <dmonakhov@openvz.org>

vfs: fix lock inversion in drop_pagecache_sb()

backport mainstream commit: eccb95cee4f0d56faa46ef22fb94dd4a3578d3eb

  Fix longstanding lock inversion in drop_pagecache_sb by dropping inode_lock
  before calling __invalidate_mapping_pages().  We just have to make sure inode
  won't go away from under us by keeping reference to it and putting the
  reference only after we have safely resumed the scan of the inode list.  A bit
  tricky but not too bad…
 
  Signed-off-by: Jan Kara <jack@suse.cz>
  Cc: Fengguang Wu <wfg@mail.ustc.edu.cn>
  Cc: David Chinner <dgc@sgi.com>
  Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
 

Bug #116673.
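
The pattern described in the commit message (pin the current element, drop the list lock for the slow work, release the previous element only after the scan has resumed) can be sketched in userspace C. This is a simplified model, not the kernel function: toy_inode, toy_iput, toy_invalidate and toy_drop_pagecache are hypothetical, and a pthread mutex stands in for inode_lock:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical miniature of an inode on a per-superblock list. */
struct toy_inode {
    struct toy_inode *next;
    int refcount;
    int invalidated;
};

static pthread_mutex_t inode_lock = PTHREAD_MUTEX_INITIALIZER;

/* Drop one reference; must be called without inode_lock held. */
static void toy_iput(struct toy_inode *inode)
{
    if (!inode)
        return;
    pthread_mutex_lock(&inode_lock);
    inode->refcount--;
    pthread_mutex_unlock(&inode_lock);
}

/* Slow work that must not run under inode_lock. */
static void toy_invalidate(struct toy_inode *inode)
{
    inode->invalidated = 1;
}

static void toy_drop_pagecache(struct toy_inode *head)
{
    struct toy_inode *inode, *toput = NULL;

    pthread_mutex_lock(&inode_lock);
    for (inode = head; inode; inode = inode->next) {
        inode->refcount++;                  /* pin the current inode */
        pthread_mutex_unlock(&inode_lock);  /* drop the lock for the slow part */
        toy_invalidate(inode);
        toy_iput(toput);                    /* release the previous inode */
        toput = inode;
        pthread_mutex_lock(&inode_lock);    /* resume the scan under the lock */
    }
    pthread_mutex_unlock(&inode_lock);
    toy_iput(toput);                        /* release the last pinned inode */
}
```

Keeping the reference on the current inode until the next iteration has re-taken the lock is what guarantees that inode->next is still valid when the scan continues.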

diff-cciss-reformat-error-handling,
diff-cciss-add-sg-io-ioctl,
diff-cciss-printk-creq-flags,
diff-scsi-add-modalias-mainstream

Patches from Marat Stanichenko <mstanichenko@parallels.com>

Various kludges to make cciss work properly and make udev receive scsi uevents.

Bugs #114972, #114130.

diff-cpt-add-snmp-stats-20080930

Patch from Pavel Emelianov <xemul@openvz.org>

cpt: dump and restore global snmp statistics

Per-device statistics exist for IPv6 only and are probably not used now, but anyway, I'll do it later.

This patch adds a new section, CPT_SECT_SNMP_STATS, that is populated with a set of CPT_OBJ_BITS objects, one for each type of statistics. Objects have variable length. Stats are stored as a plain array of __u32 numbers, so the order in which the stats types are stored is implicitly hard-coded.

If IPv6 is not turned on, all IPv6 stats are dumped as CPT_OBJ_BITS/CPT_CONTENT_VOID and are skipped on restore.

When we restore from an image that has more stats of any type than the running kernel supports, the unsupported ones are dropped with a warning.

Stats add 28K to the image file.
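
A minimal sketch of the "drop what we do not know" rule, assuming counters are matched purely by position in the array. restore_stats is a hypothetical helper, not a function from the patch:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy counters from an image into the running kernel's array.
 * Returns the number of image counters that were dropped because the
 * destination has no slot for them.
 */
static size_t restore_stats(uint32_t *dst, size_t dst_n,
                            const uint32_t *img, size_t img_n)
{
    size_t n = img_n < dst_n ? img_n : dst_n;

    for (size_t i = 0; i < n; i++)
        dst[i] = img[i];
    return img_n - n;   /* a caller would warn if this is non-zero */
}
```

Because the match is positional, new counters can only ever be appended at the end of a type without breaking older images.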

Bug #113930.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>

diff-cpt-fix-cpt_family-restore-20080915

Patch from Vitaliy Gusev <vgusev@openvz.org>

[PATCH] rst: Fix memory corruption if cpt_family is wrong.

During restore, if the parent socket is AF_INET but cpt_family is wrong (uninitialized, see bug #95113), treating the request as AF_INET6 is incorrect and leads to memory corruption.

Because there are a lot of buggy images, we cannot simply check for the values AF_INET and AF_INET6.

Decision:

  • Check the request for AF_INET6 first, and treat it as AF_INET by default.
  • Perform an additional check for AF_INET6 requests (to protect against a random value where cpt_family == AF_INET6).
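
The decision can be sketched as a small function. The entry does not say what the additional AF_INET6 check is; this sketch assumes it is a comparison with the parent socket's family, and request_family is a hypothetical name:

```c
#include <sys/socket.h>

/*
 * Decide which family a saved connection request belongs to.
 * parent_family is the family of the listening socket; cpt_family is the
 * value read from the image and may be garbage in old images.
 */
static int request_family(int parent_family, int cpt_family)
{
    /* Trust AF_INET6 only when the parent socket agrees (assumed check). */
    if (cpt_family == AF_INET6 && parent_family == AF_INET6)
        return AF_INET6;
    /* Anything else, including uninitialized values, is treated as IPv4. */
    return AF_INET;
}
```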

Bug #118912.

Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
Acked-by: Denis V. Lunev <den@openvz.org>

diff-cpt-ipip-20080923

Patch from Pavel Emelianov <xemul@openvz.org>

cpt: add support for ipip tunnel

Actually, sit also uses the ip_tunnel structure I'm saving and restoring in the image, but this only adds support for ipip device (sit will be checked later).

I add a new object type and store most of the ip_tunnel_parm contents. Restoration is a little trickier, as the fb device is created on container start.

Bug #115412.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>

diff-cpt-open-init-stds-early-20080923

Patch from Pavel Emelianov <xemul@openvz.org>

cpt: fix restoring of /dev/null opened early by init

The problem is the following:

  • init from fc9 starts and opens /dev/null for its stdin, stdout and stderr
  • udev starts and overmounts /dev with tmpfs

After this, cpt cannot dump this VE, since one process holds a file that is inaccessible from the VE root.

The proposed solution is the following:

  1. allow /dev/null to be over-mounted
  2. restore init's files in two stages:
     • stage 1: *before* we restore mounts, restore init's file descriptors 0, 1 and 2, since most likely (in the fc9 case, definitely) init opened them before any other manipulations with the fs;
     • stage 2: restore the rest of the files later, at the usual time, to make sure that e.g. sockets are restored properly.
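
The staging rule above can be sketched as follows. The enum and step_for_fd are hypothetical and only model the ordering, not the actual restore code:

```c
#include <stdbool.h>

/* Order of the restore steps described above. */
enum restore_step {
    STEP_INIT_STDIO,    /* stage 1: init's fds 0, 1 and 2 */
    STEP_MOUNTS,        /* then the container's mounts */
    STEP_OTHER_FILES    /* stage 2: everything else, at the usual time */
};

/* Pick the step in which a given descriptor of a given task is restored. */
static enum restore_step step_for_fd(bool task_is_init, int fd)
{
    if (task_is_init && fd >= 0 && fd <= 2)
        return STEP_INIT_STDIO;
    return STEP_OTHER_FILES;
}
```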

Comment from Alexey:

 ACK.
 
 Though this is really ugly, it really produces 100% correct result
 for this particular situation.
 

Bug #116261.

diff-cpt-sit-20080930

Patch from Pavel Emelianov <xemul@openvz.org>

cpt: add sit devices migration

The code mostly re-uses the ipip migration one, by adding the CPT_DEV_SIT flag to the image, thus making the name CPT_OBJ_NET_IPIP_TUNNEL a bit confusing  :(

Bug #115412.

diff-ms-copyfiles-20080910

Patch from Denis Lunev <den@openvz.org>

ms: properly assign value to the tsk->files

The race is the following:

 slm_task_inst_usage
    task_lock(t);
    files = t->files;
                             flush_old_exec
                               unshare_files
                                 copy_files
                               put_files_struct
    files->fdt
 

So we are definitely accessing already freed memory in this case. The only correct fix is to guard the assignment with task_lock.
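
A userspace sketch of the fix, assuming a pthread mutex plays the role of task_lock(). The names toy_task, toy_files, task_open_count and task_set_files are hypothetical:

```c
#include <pthread.h>
#include <stddef.h>

struct toy_files {
    int nr_open;
};

struct toy_task {
    pthread_mutex_t lock;       /* stand-in for task_lock() */
    struct toy_files *files;
};

/* Reader: samples the pointer and uses it while holding the lock. */
static int task_open_count(struct toy_task *t)
{
    int n;

    pthread_mutex_lock(&t->lock);
    n = t->files ? t->files->nr_open : 0;
    pthread_mutex_unlock(&t->lock);
    return n;
}

/*
 * Writer: swap in the new table under the same lock and hand the old one
 * back, so the caller drops it only after no reader can still see it.
 */
static struct toy_files *task_set_files(struct toy_task *t,
                                        struct toy_files *new_files)
{
    struct toy_files *old;

    pthread_mutex_lock(&t->lock);
    old = t->files;
    t->files = new_files;
    pthread_mutex_unlock(&t->lock);
    return old;
}
```

The reader can no longer observe the old pointer once task_set_files has returned, so freeing the returned table afterwards is safe.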

Bug #120812.

diff-ms-no-hotplug-compilation

Patch from Pavel Emelianov <xemul@openvz.org>

Fix kobjects compilation for hotplug-less config

OpenVZ Bug #980.

diff-ms-procinodegen-20080919

Patch from Denis Lunev <den@openvz.org>

proc: generate inode number for proc pid inodes correctly

With max_pid=200000, inode numbers are generated incorrectly and can be duplicated.
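
The entry does not spell out the formula. As an illustration only, assume a scheme that packs the pid into the upper 16 bits of a 32-bit inode number (the helper toy_fake_ino is hypothetical). Any pid above 65535 then wraps around and collides with a smaller pid:

```c
#include <stdint.h>

/* Assumed packing: pid in the high half, per-entry number in the low half. */
static uint32_t toy_fake_ino(uint32_t pid, uint32_t ino)
{
    return (pid << 16) | (ino & 0xffffu);
}
```

Under this assumption, pids 1 and 65537 produce the same inode number for the same entry, because the upper bits of the larger pid are shifted out of the 32-bit result.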

Bug #121659.

Signed-off-by: Denis V. Lunev <den@openvz.org>

diff-nfs-fake-lookuproot-ops-20080910

Patch from Denis Lunev <den@openvz.org>

nfs: prohibit lookup on mountpoint inode (nfs submount) for VEFS

This is a very rough kludge to prevent an OOPS which could happen if the NFS server has submounts and does not hide them.

Signed-off-by: Denis V. Lunev <den@parallels.com>

Bug #119698.

diff-ub-sndbuf-synack-leak-20080910

Patch from Denis Lunev <den@openvz.org>

ub: incorrect skb is charged in tcp_send_synack

The new one should be charged rather than the old one.

OpenVZ Bug #987.

diff-ve-binfmt_misc_ve_stop_oops_fix-20080930

Patch from Konstantin Ozerkov <kozerkov@openvz.org>

Fix OOPS while stopping VE after binfmt_misc.ko loaded

ve_binfmt_fini() should check whether the current VE has registered the binfmt_misc fs. (This properly handles stopping a VE which was started before binfmt_misc.ko was loaded.)

OpenVZ Bug #1028.

xemul: this doesn't affect the rhel5 kernel, since this option is 'y' in our config, but I have patches to add its migration, which require it to be a module. Thus this patch might become required.

diff-ve-drop-oom-immunity-at-enter-20080901

Patch from Konstantin Khlebnikov <khlebnikov@openvz.org>

drop OOM protection on entering a CT

On CT enter, switch to the default OOM adjustment level if the task is OOM-immune.

It is a very bad idea to have OOM-unkillable tasks inside a container, because all forked tasks inherit this setting.

The proc interface for changing the OOM adjustment (/proc/<pid>/oom_adj) is already restricted in a CT by diff-ve-oom-adjust-20070604.

On some systems sshd gets OOM protection at start and does not drop it after fork.

(example: ssh root@HN -> vzctl enter -> restart apache: apache is now OOM-immune)

(example from xemul@: ssh root@HN vzctl start: the VE is now OOM-immune)
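
The rule applied on enter can be sketched as follows. The constant names are hypothetical; the value -17 is the "never kill" setting of oom_adj in kernels of this era, and 0 is assumed to be the default level:

```c
#define TOY_OOM_DISABLE (-17)   /* oom_adj value that makes a task OOM-immune */
#define TOY_OOM_DEFAULT 0       /* assumed default adjustment level */

/* Return the oom_adj a task should carry once it has entered a CT. */
static int oom_adj_on_enter(int oom_adj)
{
    return oom_adj == TOY_OOM_DISABLE ? TOY_OOM_DEFAULT : oom_adj;
}
```

Only the immune value is reset; any other adjustment the task already had is left alone.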

Debian bug #480020.

diff-ve-fix-idle-time-accout-20080815

Patch from Konstantin Khlebnikov <khlebnikov@openvz.org>

Fix idle time accounting when iowait tasks are present

One uninterruptible task blocks the idle time counter on all idle vcpus in a VE.

Originally, in diff-ve-fairsched-statiow-20050823, idle time after strt_idle_time was accounted as idle or iowait depending on the total count of uninterruptible tasks, but after diff-ve-sched-stat-iowait-20060417 and diff-ve-iowait-20060525 the iowait branch is triggered by a nonzero vcpu_rq(vcpu)->nr_iowait.

This patch does the same for the idle branch.

The interface is split into two functions:

ve_sched_get_idle_time(cpu) — cpu idle time in current ve.

ve_sched_get_idle_time_total(ve) — ve total idle time.

v2 changes:

Change the second argument of __ve_sched_get_idle_time from vsched to vcpu and make it optional; without a vcpu, time after strt_idle_time is not accounted as idle.

Remove the vsched lookup code for the case where the VE init task is not in the VE vsched (init is dead and the VE is in the middle of shutdown); in this case there is no reason to care about idle-time accounting accuracy.

Bug #114633.

diff-ve-net-ipip-20080912

Patch from Pavel Emelianov <xemul@openvz.org>

ipip: add ipip tunnel support in VEs

This is the same patch I did for mainstream, but for 2.6.18 kernel and thus resembles the sit virtualization patch.

Some functions are exported for the patch #2 — checkpointing support (yes, I still remember the bug #101061  ;) )

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>

diff-ve-nf-cthelpers-fix-leaks-20080929

Patch from Vitaliy Gusev <vgusev@openvz.org>

conntrack: Fix leak of the use counter of the ip_conntrack_ftp module

struct ip_conntrack_helper holds its name as a pointer to a NUL-terminated string, but list_named_find() must only be used for structures with an inlined string. Thus list_named_find() always returns NULL in virt_ip_conntrack_helper_unregister().
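
The mismatch can be illustrated with a small userspace model. All names here are hypothetical; inline_name_matches only mimics a finder that compares bytes stored inside the structure itself:

```c
#include <stddef.h>
#include <string.h>

/* Finder that assumes the name is stored inline at a fixed offset. */
static int inline_name_matches(const void *item, size_t name_off,
                               size_t name_size, const char *name)
{
    size_t len = strlen(name) + 1;

    if (len > name_size)
        return 0;
    return memcmp((const char *)item + name_off, name, len) == 0;
}

struct inline_helper  { int id; char name[16]; };     /* name bytes are in the struct */
struct pointer_helper { int id; const char *name; };  /* only an address is in the struct */

/* Correct comparison for the pointer flavour: follow the pointer first. */
static int pointer_name_matches(const struct pointer_helper *h,
                                const char *name)
{
    return strcmp(h->name, name) == 0;
}
```

Applying inline_name_matches to a pointer_helper would compare the bytes of the address against the string, which is why such a lookup practically never succeeds.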

OpenVZ Bug #1033.

Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>

diff-vzdq-with-nfs-disable-20080930

Patch from Denis Lunev <den@openvz.org>

nfs: NFSD/vzquota mutual exclusion

NFSD and vzquota can't run simultaneously on the same filesystem, as NFSD works with unattached dentries whereas vzquota requires attached dentries in order to function properly.

This patch prohibits turning vzquota on over a remotely mounted filesystem, and prohibits remotely mounting a filesystem that has vzquota on.
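
The mutual exclusion can be modelled as two flags that refuse to be set together. This is a sketch under assumptions: toy_sb, toy_vzquota_on, toy_nfs_export and the -EBUSY error code are hypothetical:

```c
#include <errno.h>
#include <stdbool.h>

struct toy_sb {
    bool nfs_exported;  /* filesystem is served by NFSD */
    bool vzquota_on;    /* vzquota is active on it */
};

/* Turning quota on fails while the filesystem is served over NFS. */
static int toy_vzquota_on(struct toy_sb *sb)
{
    if (sb->nfs_exported)
        return -EBUSY;  /* assumed error code */
    sb->vzquota_on = true;
    return 0;
}

/* Serving the filesystem over NFS fails while quota is on. */
static int toy_nfs_export(struct toy_sb *sb)
{
    if (sb->vzquota_on)
        return -EBUSY;  /* assumed error code */
    sb->nfs_exported = true;
    return 0;
}
```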

Bug #115332.

Signed-off-by: Denis V. Lunev <den@parallels.com>

diff-ve-net-bridge-via-phys-dev2-20070514

Patch from Dmitry Mishin <dim@openvz.org>

[BRIDGE] bridge deliver to original eth0 device

  • now packets are input to the local system as they are coming from phys device only;
  • fixed a bunch of bugs with VE <-> HN communication.

diff-ve-net-sit-virtualize-20080627

Patch from Pavel Emelianov <xemul@openvz.org>

Virtualize sit device.

This mostly looks like the sit netnsization patches I did for mainstream, but has some peculiarities:

  1. sit is built into the ipv6 module in this kernel
  2. VE_FEATURE_SIT controls the sit availability in a VE

Bug #115411.