From OpenVZ Virtuozzo Containers Wiki


  • Updated to RHEL5.1 kernel (2.6.18-53.el5) -- new drivers, lots of updates
  • Mainstream security fixes
  • DRBD update to 8.0.7
  • Forcedeth driver 7-hour boot hang fixed
  • TUN/TAP CPT fixed
  • GFS lockfs disabled since broken
  • Fixed long hangs before OOM when swap runs out
  • minor compilation and other fixes

Config changes


New RHEL5.1 options:

  • +CONFIG_CFG80211=m
  • +CONFIG_MAC80211=m
  • +CONFIG_MAC80211_LEDS=y
  • +CONFIG_MAC80211_DEBUG=y
  • +CONFIG_E1000E=m
  • +CONFIG_IWL4965=m
  • +CONFIG_DE600=m
  • +CONFIG_DE620=m
  • +CONFIG_R3964=m



Patch from Andrey Mirkin <>:
[PATCH] CPT: improve dst capabilities checks

  1. Return different error codes for unsupported features and for insufficient CPU capabilities.
  2. Print error messages with prefix "Error: ". This should improve checks of dst node capabilities.

Bug #81355.


Patch from Kirill Korotaev <>:
[PATCH] CPT: fix misprint in Andrey changes

Compilation fix: misprint in Andrey's patch.


Patch from Kirill Korotaev <>:
[PATCH] CPT: declare recalc_sigpending_tsk() back

It was hidden in RHEL5.1, while CPT uses it.


Patch from Evgeny Kravtsunov <>:
[PATCH] CPT: fix tun/tap dev flags restore

The dev flags were corrupted in rst_restore_tuntap. As a result, after restore dev->qdisc->enqueue was not set to &pfifo_fast_enqueue but remained &noop_enqueue, and noop_enqueue drops all skbs.

Bug #94879.


Patch from Kirill Korotaev <>:
[PATCH] CPT: utrace core changes

utrace core changes for CPT in RHEL5.1


Patch from Kirill Korotaev <>:
[PATCH] CPT: update utrace support for RHEL5.1

update utrace code according to changes in RHEL5.1


Patch from Alexandr Andreev <>:
[PATCH] fairsched: increase max VCPU timeslice

Increase the default maximum VCPU timeslice. This improves performance under high load (vConsolidate test).

FYI: VMware uses a much larger VCPU timeslice, 50 ms.


Patch from Jeff Layton <>:
[CIFS] fix bad handling of EAGAIN error on kernel_recvmsg in cifs_demultiplex_thread

It's a part of the following commit from mainstream:

When kernel_recvmsg returns -EAGAIN or -ERESTARTSYS, then
cifs_demultiplex_thread sleeps for a bit and then tries the read again.
When it does this, it's not zeroing out the length and that throws off
the value of total_read. Fix it to zero out the length.

Can cause memory corruption:
If kernel_recvmsg returns an error and total_read is a large enough
value, then we'll end up going through the loop again. total_read will
be a bogus value, as will (pdu_length-total_read). When this happens we
end up calling kernel_recvmsg with a bogus value (possibly larger than
the current iov_len).

At that point, memcpy_toiovec can overrun iov. It will start walking
up the stack, casting other things that are there to struct iovecs
(since it assumes that it's been passed an array of them). Any pointer
on the stack at an address above the kvec is a candidate for corruption

Many thanks to Ulrich Obergfell for pointing this out.

Signed-off-by: Jeff Layton <>
Signed-off-by: Steve French <>

X-Git-Tag: v2.6.24-rc1~1382~5
X-Git-Url: c18c732ec6bf372aa959ca6534cbfc32e464defd
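
A minimal userspace sketch of the accounting problem described in this commit message, not the CIFS code itself. It assumes the read loop adds length to total_read in its for-increment. fake_sock, fake_recv and read_pdu are hypothetical names, and only -EAGAIN is modelled.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket: replays a script of recv results. */
struct fake_sock {
    const int *results;  /* each entry: bytes read (> 0) or a negative errno */
    size_t count;        /* number of scripted results */
    size_t next;         /* index of the next result to replay */
    int max_want;        /* largest read size the caller ever asked for */
};

/* Hypothetical stand-in for kernel_recvmsg(). */
static int fake_recv(struct fake_sock *s, int want)
{
    int r;

    if (want > s->max_want)
        s->max_want = want;
    if (s->next >= s->count)
        return 0;                    /* script exhausted: peer closed */
    r = s->results[s->next++];
    return (r > want) ? want : r;    /* never return more than requested */
}

/*
 * Read pdu_length bytes and return the final total_read.
 * zero_on_retry == 0 models the bug: the negative error code from a
 * failed attempt is still added to total_read by the loop increment.
 */
static int read_pdu(struct fake_sock *s, int pdu_length, int zero_on_retry)
{
    int total_read;
    int length = 0;

    for (total_read = 0; total_read < pdu_length; total_read += length) {
        length = fake_recv(s, pdu_length - total_read);
        if (length == -EAGAIN) {
            if (zero_on_retry)
                length = 0;          /* the fix: a retry read no bytes */
            continue;                /* the for-increment still runs */
        }
        if (length <= 0)
            break;                   /* hard error or connection closed */
    }
    return total_read;
}
```

With the script {4, -EAGAIN, 6} and a 10-byte PDU, the fixed variant never asks for more than 10 bytes. The buggy variant lets total_read go negative after the retry, so the next request is larger than the whole PDU, which is the oversized read that the commit message warns about.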


Patch from Roland McGrath <>:
wait_task_stopped: Check p->exit_state instead of TASK_TRACED (CVE-2007-5500)

patch a3474224e6a01924be40a8255636ea5522c1023a in mainline

The original meaning of the old test (p->state > TASK_STOPPED) was
"not dead", since it was before TASK_TRACED existed and before the
state/exit_state split.  It was a wrong correction in commit
14bf01bb0599c89fc7f426d20353b76e12555308 to make this test for
TASK_TRACED instead.  It should have been changed when TASK_TRACED
was introduced and again when exit_state was introduced.

Signed-off-by: Roland McGrath <>
Cc: Oleg Nesterov <>
Cc: Alexey Dobriyan <>
Cc: Kees Cook <>
Acked-by: Scott James Remnant <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Greg Kroah-Hartman <>

X-Git-Tag: v2.6.23.8~1
X-Git-Url: 36ef66c5d137b9a31fd8c35d236fb9e26ef74f97


Patch from Vitaliy Gusev <>:
[PATCH] Alt-sysrq-p: do synchronous NMI IPI

Wait for completion of the NMI IPI callbacks, then call sysrq_handle_showregs(). Otherwise nested bust_spinlocks() calls may occur.

Bug #94875.


Patch from Kirill Korotaev <>:
[PATCH] ioacct: don't mangle VE0 statistics

Instead, show the whole-node I/O stats as the VE0 ones.

OpenVZ Bug #731.


Patch from Kirill Korotaev <>:
[PATCH] VE: fix OOM loop in RHEL5.1


Patch from Pavel Emelianov <>:
[PATCH] taskstats: consider the pid, coming from the user-space to be a virtual one

When a user sends a netlink message to get the taskstats, the pid in it can validly be a virtual one, but find_task_by_pid_all() (which is used to convert this pid to a task) assumes (with the appropriate BUG_ON) that this pid is global.

Fix it by using the find_task_by_pid_ve() search routine. The rest of the taskstats.c code seems to handle pids properly.

OpenVZ Bug #730.
Bug #94329.

Signed-off-by: Pavel Emelyanov <>


Patch from Evgeny Kravtsunov <>:
[PATCH] set PER_LINUX32 personality when restoring 32bit app on 64bit OS

Current implementation of task personality migration is incorrect:


static int dump_one_process(cpt_object_t *obj, struct cpt_context *ctx)
 v->cpt_personality = tsk->personality;


static int hook(void *arg)
 if (ti->cpt_personality != 0)

On both i686 and x86_64, task->personality == 0 == PER_LINUX. But for a 32-bit VE running on x86_64, the personality must be set to PER_LINUX32.

The solution is to set the personality of 32-bit tasks to PER_LINUX32 during the restore process on an x86_64 node. The ti->cpt_64bit attribute makes it possible to distinguish 32-bit tasks that came from an i686 node.
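
The restore-side decision can be sketched as follows. This is a minimal illustration, not the patch itself: the struct, the function name and the rule that a non-zero saved personality is kept as-is are assumptions. Only the two personality values are taken from the Linux personality header, and they are redefined locally under sketch names so the example does not depend on kernel headers.

```c
#include <stdbool.h>

/* Same numeric values as PER_LINUX and PER_LINUX32 in <linux/personality.h>. */
enum {
    SKETCH_PER_LINUX   = 0x0000,
    SKETCH_PER_LINUX32 = 0x0008
};

/* Hypothetical subset of the per-task data stored in a dump image. */
struct task_image {
    unsigned int cpt_personality;  /* personality saved at dump time */
    bool cpt_64bit;                /* true if the task was a 64-bit one */
};

/* Pick the personality to apply when the task is restored. */
static unsigned int restore_personality(const struct task_image *ti,
                                        bool node_is_x86_64)
{
    /* Assumption: an explicitly non-default personality is preserved. */
    if (ti->cpt_personality != SKETCH_PER_LINUX)
        return ti->cpt_personality;

    /* A 32-bit task landing on a 64-bit node needs PER_LINUX32. */
    if (node_is_x86_64 && !ti->cpt_64bit)
        return SKETCH_PER_LINUX32;

    return SKETCH_PER_LINUX;
}
```

In this sketch a task dumped on i686 with personality 0 is restored with PER_LINUX32 on an x86_64 node, while a 64-bit task, or any task restored on a 32-bit node, keeps PER_LINUX.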

Bug #94205.


Patch from Evgeny Kravtsunov <>:
[PATCH] CPT: another fix for TUN/TAP restore

1) Restore of the tun->bind_file attribute added to rst_restore_tuntap.

tun->bind_file contains a pointer to the open file that the tun_struct is bound to. The tun->bind_file data is used for cpt/rst only. This attribute must be initialized when a tun/tap device is created (tun_set_iff) and when it is restored (rst_restore_tuntap). If it is not initialized on restore, further dumps will not contain any information about the bound open file, so a further restore will fail.

Bug #94995.

2) Restoring the bind file (the rst_file call) is moved up so that it runs before the tunX netdevice is allocated and registered. This avoids netdevice-related cleanup when rst_file returns an error.

Bug #94992.


Patch from Alexey Kuznetsov <>:
[CPT] strace blocked checkpointing

PTRACE_SYSCALL was not detected, so one of the sanity checks refused to checkpoint.


Patch from Vitaliy Gusev <>:
[PATCH] fairsched: fixup per-VE nrrunning/nrunint stats on VCPU add/del

When any online ( >=2 ) VCPU is removed and attached again, its statistics are reinitialized. This leads to bad loadavg results. The right way is to merge the statistics of the deleted VCPU into any online VCPU.
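
A minimal sketch of the merge idea, assuming per-VCPU counters that are summed over online VCPUs to get the per-VE figure. The struct layout and function names are hypothetical and do not come from the fairsched code.

```c
#include <stddef.h>

/* Hypothetical per-VCPU counters feeding the per-VE loadavg. */
struct vcpu_stat {
    int online;
    unsigned long nr_running;
    unsigned long nr_unint;
};

/*
 * Take a VCPU offline and hand its counters to another online VCPU
 * instead of dropping them. Returns the index of the VCPU that
 * inherited the counters, or -1 if no other VCPU is online.
 */
static int vcpu_del_merge(struct vcpu_stat *v, size_t n, size_t victim)
{
    size_t i;

    v[victim].online = 0;
    for (i = 0; i < n; i++) {
        if (i == victim || !v[i].online)
            continue;
        v[i].nr_running += v[victim].nr_running;
        v[i].nr_unint += v[victim].nr_unint;
        v[victim].nr_running = 0;
        v[victim].nr_unint = 0;
        return (int)i;
    }
    return -1;
}

/* Per-VE number of running tasks: the sum over online VCPUs. */
static unsigned long ve_nr_running(const struct vcpu_stat *v, size_t n)
{
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (v[i].online)
            sum += v[i].nr_running;
    return sum;
}
```

Because the counters move rather than reset, the per-VE sum is the same before and after the VCPU is removed, so the loadavg input does not jump.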

OpenVZ Bug #732.


Patch from Denis Lunev <>:
[PATCH] OOM if swap is full even for GFP_NOFS allocation.

The problem is that when swap is exhausted, the kernel can hang for tens of minutes looking for memory. So when swap is exhausted we have to be more aggressive.

Bug #93284.


Patch from Vitaliy Gusev <>:
[PATCH] simfs: fix statfs() in case of HUGE limits

If the quota is too big, the unsigned value becomes negative when cast to a signed type. Get rid of the explicit type cast and do honest math.
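
The effect can be illustrated with a small sketch. The types and function names here are assumptions chosen for the illustration; the actual simfs code and its variable widths are not shown in this changelog.

```c
#include <stdint.h>

/*
 * Problematic shape: an explicit cast squeezes the limit into a signed
 * 32-bit value. Converting an out-of-range value to a signed type is
 * implementation-defined; with GCC and Clang it wraps, so a limit above
 * INT32_MAX turns negative.
 */
static int64_t free_blocks_cast(uint64_t limit, uint64_t used)
{
    return (int64_t)(int32_t)limit - (int64_t)(int32_t)used;
}

/* "Honest math": stay in the wide unsigned type and clamp at zero. */
static uint64_t free_blocks_honest(uint64_t limit, uint64_t used)
{
    return (limit > used) ? limit - used : 0;
}
```

For a limit of 3000000000 blocks, the cast version reports a negative amount of free space, while the unsigned version returns the expected difference.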

OpenVZ Bug #722.


Patch from Vitaliy Gusev <>:
[PATCH] NFS: lockd has unclosed sockets when stopping VE.

Try to force-destroy hosts (nlm_host) when a VE is stopped. This is needed because some hosts may still exist and have open sockets when we call fini_venet(). But by the time of fini_venet(), all sockets related to the given VE must be closed.

Thanks to Denis Lunev <> for help.

Bug #94468.


Patch from Vitaliy Gusev <>:
[PATCH] proc: don't update /proc file permissions when not needed.

Update only the needed fields of proc_dir_entry in proc_notify_change(). A VE can mess up the VE0 /proc mode, uid and gid on entries which have only a global PDE. Not much harm can be done, i.e. it is not exploitable, but it is still very unpleasant.

Bug #95301.


From Kirill Korotaev (dev@):

linux-2.6-net-forcedeth-update-to-driver-version-0-60.patch patch from RHEL5.1 added the following piece of code to nv_probe():

       if (id->driver_data & DEV_HAS_MGMT_UNIT) {
               /* management unit running on the mac? */
               if (readl(base + NvRegTransmitterControl) & NVREG_XMITCTL_SYNC_PHY_INIT) {
                       np->mac_in_use = readl(base + NvRegTransmitterControl) & NVREG_XMITCTL_MGMT_ST;
                       dprintk(KERN_INFO "%s: mgmt unit is running. mac in use %x.\n", pci_name(pci_dev), np->mac_in_use);
                       for (i = 0; i < 5000; i++) {
                               if (nv_mgmt_acquire_sema(dev)) {
                                       /* management unit setup the phy already? */
                                       if ((readl(base + NvRegTransmitterControl) & NVREG_XMITCTL_SYNC_MASK) ==
                                           NVREG_XMITCTL_SYNC_PHY_INIT) {
                                               /* phy is inited by mgmt unit */
                                               phyinitialized = 1;
                                               dprintk(KERN_INFO "%s: Phy already initialized by mgmt unit.\n", pci_name(pci_dev));
                                       } else {
                                               /* we need to init the phy */

Obviously, this loops 5000 times and calls nv_mgmt_acquire_sema() inside, which in the worst case does msleep(500) 10 times. So this loop can last 5000*10*0.5sec = 25000sec = 6.94 hours. This is exactly what we face in the bug: boot hung at 14:04:19 and continued at 21:02:02, i.e. it took ~25063 seconds.
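
The arithmetic above can be checked with a short sketch. The constants are the ones quoted in the text (5000 outer iterations, 10 sleeps of 500 ms each); the function names are made up for the illustration.

```c
/* Figures quoted in the analysis above. */
#define OUTER_RETRIES   5000   /* iterations of the loop in nv_probe() */
#define INNER_RETRIES   10     /* worst-case sleeps per nv_mgmt_acquire_sema() */
#define SLEEP_MS        500    /* msleep(500) */

/* Worst-case duration of the whole loop, in seconds. */
static long worst_case_seconds(void)
{
    return (long)OUTER_RETRIES * INNER_RETRIES * SLEEP_MS / 1000;
}

/* Seconds elapsed between two wall-clock times on the same day. */
static long seconds_between(int h1, int m1, int s1, int h2, int m2, int s2)
{
    long start = h1 * 3600L + m1 * 60L + s1;
    long end = h2 * 3600L + m2 * 60L + s2;

    return end - start;
}
```

worst_case_seconds() gives 25000 seconds (about 6.94 hours), and seconds_between(14, 4, 19, 21, 2, 2) gives 25063 seconds, which matches the observed boot delay to within about a minute.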

Bug #95327.


Patch from Evgeniy Kravtsunov:
Patch updates DRBD from 8.0.6 to 8.0.7.

Sources taken from

Here is the announcement:


Patch from Dmitry Monakhov (dmonakhov@):
[PATCH] GFS: disable lockfs support since it's broken

Currently the GFS lockfs feature is broken. Some applications, such as LVM snapshot, Acronis True Image, etc., cause a system livelock when they try to use it. So, to be on the safe side, it is better to disable this "feature".

Red Hat bug #403171.