Download/kernel/rhel4/023stab030.1/changes

37,620 bytes added, 18:21, 20 March 2008
created
== Changes ==
* Rebased on 2.6.9-42.0.3EL
* Driver updates, configs synchronization with RHEL
* CPT fixes
* Mainstream updates
* A lot of virtualization enhancements (sysctls, sysfs, stats)
* OOM-killer fixes
* VE cleanup speedup.

=== Config changes ===

Many changes, due to:
* an attempt to bring the configs closer to the Red Hat ones
* driver updates to match the HCL of the 2.6.8 branch
<includeonly>[[{{PAGENAME}}/changes#Driver updates|{{Long changelog message}}]]</includeonly><noinclude>
=== Driver updates ===
==== diff-drv-adp94xx-freeze-20060906 ====
<div class="change">
Patch from Kostja:<br/>
This patch fixes kernel compilation with CONFIG_SCSI_ADP94XX=y
by removing uses of PF_FREEZE flag in adp94xx driver.
</div>

==== diff-wrn-implicit-funcs-20060906 ====
<div class="change">
Patch from Kostja:<br/>
This patch removes "implicit declaration of function" warnings
during compilation on x86 and x86_64 arches.
</div>

==== linux-2.6.9-e1000-7.2.7.patch ====
<div class="change">
Patch ported by Kostja (khorenko@):<br/>
e1000 driver updated up to 7.2.7 version
sources were taken from
[http://sourceforge.net/projects/e1000/ sourceforge.net/projects/e1000]

Bug #19952.
</div>

==== linux-2.6.9-r8169-2.2.patch-1 ====
<div class="change">
Patch ported by Kostja (khorenko@):<br/>
r8169 driver updated up to 2.2 version
sources were taken from 2.6.8-022stab078.20 vz kernel.

Bug #19950.
</div>

==== linux-2.6.9-sk98lin-8.31.2.3.patch ====
<div class="change">
Patch ported by Kostja (khorenko@):<br/>

sk98lin driver updated up to version 8.31.2.3
sources were taken from [http://www.skd.de/ skd.de]

Bug #28918.
</div>

==== linux-2.6.9-sky2-1.4.patch ====
<div class="change">
Patch ported by Kostja (khorenko@):<br/>
sky2 driver updated up to version 1.4

sources were taken from 2.6.8-022stab078.20 vz kernel

Bug #19950.
</div>
==== linux-2.6.9-qla4xxx-5.00.02.patch ====
<div class="change">
Patch ported by Kostja (khorenko@):<br/>
qla4xxx driver updated up to version 5.00.02

sources from Qlogic's site.

Bug #27641.
</div>

==== linux-2.6.9-arcmsr-1.20.0X.12.patch ====
<div class="change">
Patch ported by Kostja (khorenko@):<br/>
Areca driver v1.20.0X.12 added.

Sources are from
[ftp://ftp.areca.com.tw/RaidCards/AP_Drivers/Linux/DRIVER/SourceCode/ Areca site]

Bug #59933.
</div>

==== linux-2.6.9-dell_rbu-0.9.patch ====
<div class="change">
Patch ported by Kostja (khorenko@):<br/>
dell_rbu driver updated to 0.9 version.

sources from
[ftp://ftp.us.dell.com/sysman/OMI-SrvAdmin-Dell-Web-LX-50_A00.tar.gz Dell site].

Bug #55618.
</div>

==== linux-2.6.9-aoe-14.patch ====
<div class="change">
Patch ported by Kostja (khorenko@):<br/>
AoE driver version 14 added; sources from site

Bug #51009.
</div>

==== linux-2.6.9-dpt_i2o-2.5.0-2426.patch ====
<div class="change">
Patch ported by Kostja (khorenko@):

added alternative driver for I2O hardware, version 2.5.0, build 2426;
sources taken from Mark Salyzyn.

!!! Obsoletes diff-drv-dpt-entropy-20040525 !!!

Bug #68066.
</div>

==== linux-2.6.9-i2o-1.325.patch ====
<div class="change">
Patch ported by Kostja (khorenko@):<br/>
updates i2o layer,

backported from the 2.6.17 mainstream Linux kernel by Vasily (vvs@)

Bug #68066.
</div>

==== diff-scsi-megaraid-dma64-20060621 ====
<div class="change">
Patch from Vasily:

This patch prevents enabling 64-bit DMA on the Megaraid SATA 150-4 controller
because it does not support 64-bit DMA.

Bug #52530.
</div>

==== linux-2.6.9-qla2xxx-8.01.05.patch ====
<div class="change">
Patch prepared by Kostja:<br/>
Qlogic qla2xxx driver updated up to 8.01.05 version.

Sources from
[http://download.qlogic.com/drivers/48436/qla2xxx-v8.01.05-dist.tgz QLogic site]

Bug #27641.
</div>

=== Patches ===

==== diff-cpt-annoying-printk ====
<div class="change">
Patch from Alexey:<br/>
[CPT] remove annoying printk

In 2.6.9 printk("=") in refrigerator() is commented out.
We should remove printk("&gt;\n") in cpt. The code with comment
is not removed, but commented out to remember that we have to
return this, if the printk in refrigerator() is uncommented.
</div>

==== diff-cpt-asmlinkage ====
<div class="change">
Patch from Alexey:<br/>
[CPT] asmlinkage attribute was forgotten
This fixes CPT when compiled with CONFIG_REGPARM
</div>

==== diff-cpt-check-syscall-cap-20060814 ====
<div class="change">
Patch from Andrey:

This patch adds a check for the syscall CPU capability,
which the vsyscall page needs on x86_64, just like the sysenter
capability, which is already checked.
</div>

==== diff-cpt-check-vsyscall-20060817 ====
<div class="change">
Patch from Andrey:

This patch checks, during suspend, whether a 64-bit task is currently inside vsyscall on x86_64.
If a task is found in vsyscall during suspend, we can try to suspend again.
The check for the vsyscall page on x86_64 in dump_one_mm() is removed.
</div>

==== diff-cpt-ifindex-renumber-20060815 ====
<div class="change">
Patch from Andrey:

This patch adds renumbering of netdev-&gt;ifindex values on
restore. We can do this because the network is suspended.
All manipulations are protected with rtnl_lock().
If the required index is already busy, the patch swaps the
ifindex of the device in question with that of the device which holds the required index.
</div>
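The swap described above can be sketched in plain C. This is only an illustration: <code>struct netdev_sketch</code> and <code>renumber_ifindex()</code> are hypothetical names, the device table is a flat array rather than the kernel's device list, and the rtnl_lock() protection mentioned in the patch is left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for a network device; only the index matters here. */
struct netdev_sketch {
    int ifindex;
};

/*
 * Give dev the wanted index. If another device already holds it,
 * that device takes over dev's old index, so the two are swapped.
 */
void renumber_ifindex(struct netdev_sketch *devs, size_t n,
                      struct netdev_sketch *dev, int wanted)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (&devs[i] != dev && devs[i].ifindex == wanted) {
            devs[i].ifindex = dev->ifindex; /* holder gets dev's old index */
            break;
        }
    }
    dev->ifindex = wanted;
}
```

With three devices numbered 1, 2 and 3, asking for index 3 on the first device leaves them numbered 3, 2 and 1.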

==== diff-cpt-lic-forkret-20060810-2 ====
<div class="change">
Patch from Dmitry:<br/>
fixed oops caused by diff-cpt-vzent-ovz-20060804, added modules license.
Remove use of syscall_exit.

Bug #66511.
</div>
==== diff-cpt-mm-eagain-20060817 ====

<div class="change">
Patch from Andrey:<br/>
In tests we can see message: "mm_struct is referenced outside"
After that message checkpoint fails.

It seems that this situation is legal, so checkpoint could be restarted.
So we return -EAGAIN to be able to restart checkpoint.
</div>

==== diff-cpt-skb-pcount ====
<div class="change">
Patch from Alexey:<br/>
[CPT] save/restore tcp_skb_pcount()

Backport from 2.6.16. 2.6.9 has this thing too.
</div>

==== diff-cpt-suid-dumpable ====
<div class="change">
Patch from Alexey:<br/>
[CPT] restore mm-&gt;dumpable correctly

mm-&gt;dumpable is not boolean in &gt;=2.6.9, but tri-state.
Just save and restore raw value.
</div>
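A small sketch of why a boolean save loses information. The names below are hypothetical and the real CPT image layout differs; the values 0, 1 and 2 for the three states are an assumption made for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for mm_struct; dumpable is assumed to be 0, 1 or 2. */
struct mm_sketch {
    int dumpable;
};

/* Old behaviour (sketch): collapse the value to a boolean when saving. */
uint32_t save_dumpable_as_bool(const struct mm_sketch *mm)
{
    return mm->dumpable != 0;
}

/* Fixed behaviour (sketch): save the raw value unchanged. */
uint32_t save_dumpable_raw(const struct mm_sketch *mm)
{
    return (uint32_t)mm->dumpable;
}

/* Restore simply writes back whatever was saved. */
void restore_dumpable(struct mm_sketch *mm, uint32_t saved)
{
    mm->dumpable = (int)saved;
}
```

With the boolean save, a task that had the third state comes back after restore in the second one; the raw save round-trips all three.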

==== diff-cpt-test-caps-fix-20060815 ====
<div class="change">
Patch from Andrey:

This patch fixes old test capabilities code.
We can't use context in this code, because
it is not yet initialized.
Was broken due to diff-cpt-checks-20060808
</div>

==== diff-cpt-ve-features-20060815 ====
<div class="change">
Patch from Andrey:

The feature set was not saved in CPT, so VPSes based on a Suse
template could fail after restore (VE_FEATURE_SYSFS was lost).
Save the feature set in a place which was not used before
(cpt_os_version and cpt_os_features fields in the image header).
</div>

==== diff-cpt-vsyscall-page-20060814 ====
<div class="change">
Patch from Andrey:<br/>
Changes:

* checks for errors are added
* externs are moved to .h file
* current_thread_info()-&gt;sysenter_return is set to the right value on both arches
</div>

==== diff-cpt-x86_64-debuginfo ====
<div class="change">
Patch from Alexey:<br/>
[CPT] fix compilation with CONFIG_DEBUG_INFO
Just #undef it.

</div>

==== diff-ms-dcache-shrink-sb ====
<div class="change">
Patch from Kirill:

Introduce a per-sb list of dcache entries to improve
shrink_dcache_sb() and shrink_dcache_parent().
This should eliminate the customer problem in which,
on VE stop, umount takes an hour to complete
while holding the s_umount semaphore.
</div>

==== diff-ms-nf-ipt-compat-20060814 ====
<div class="change">
Patch from Dmitry:

Remove extra checks from compat_copy_* functions.
Previously they led to an extra module put on the error path.

Bug #66569.
</div>

==== diff-ms-retranscollapse ====
<div class="change">
Patch from mainstream, prepared by Denis:<br/>
[TCP]: Do not try to collapse multi-packet SKB

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</div>

==== diff-ms-smp-send-stop-irqs-fix-20060726 ====
<div class="change">
Patch from Pavel:

Do not rely on smp_call_function() to notify other cpus they
must stop. Just call IPI after setting call_data accordingly.

smp_call_function() operates on the global static call_data_struct
under a lock to be sure it is valid during the call.

smp_send_stop() sends an IPI w/o synchronisation with the ones from
smp_call_function(), but this is OK if the handler ACKs
both of them.

Bug #65573.
</div>

==== diff-ms-vsyscall-page-20060814 ====
<div class="change">
Patch from Andrey:<br/>

Changes:
* new entry sysenter_return is added to thread_info structure on x86_64 arch, ia32entry.S code changed accordingly
* constants are changed to defined values
* now we have a hole between IA32_STACK_TOP and vsyscall page
* VSYSCALL32_SYSEXIT value must match SYSENTER_RETURN_OFFSET value to be able to migrate vsyscall-sysenter page from x86_64 to i386

Now we are able to migrate int80 and sysenter vsyscall pages from i386 to
x86_64 and back.
</div>

==== diff-ms-vsyscall-sysenter-align-20060814 ====
<div class="change">
Patch from A. Mirkin:<br/>
<pre>
&gt;There is one unexpected place:
&gt;&gt;
&gt;&gt; #define VSYSCALL32_SYSEXIT (VSYSCALL32_BASE + 0x41A)
&gt;&gt;
&gt;&gt; If we cannot avoid this, I am afraid it would be better just
&gt;&gt; to add alignment to 0x420 in vsyscall-sysenter both in i386
&gt;&gt; and in ia32/x86_64 and to undo that code mimicing i386 mmap.
&gt;&gt;
&gt;&gt; If we need to know 0x41A explicitly, that trick loses sense completely.
&gt;&gt;
&gt;&gt; But this can be done later.
&gt;&gt;
&gt;&gt; Alexey
</pre>

I have added the necessary alignment in both archs and removed redundant
code from the x86_64 sysenter page. Now we have the return offset at 0x420.
</div>
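The arithmetic behind moving the return offset from 0x41A to 0x420 is an ordinary round-up to an alignment boundary. The sketch below is not taken from the patch: <code>align_up()</code> is a hypothetical helper, and the alignment of 0x20 is an assumption (0x10 gives the same result for this offset).

```c
#include <stdint.h>

/*
 * Round offset up to the next multiple of alignment.
 * alignment must be a power of two.
 */
uint32_t align_up(uint32_t offset, uint32_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

Here <code>align_up(0x41A, 0x20)</code> yields 0x420, so both architectures can agree on one fixed return offset instead of hard-coding 0x41A.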

==== diff-ms-netlinkcb-1 ====
<div class="change">
Patch from mainstream for the netlink memory corruption:<br/>
Bug #66596.

[NETLINK]: Fix sk_rmem_alloc assertion failure.

In netlink_dump we're operating on sk after dropping the cb lock.
This is racy because the owner of the socket could close it after
we drop the cb lock.

This is possible because netlink_dump isn't always called from the
context of the process that owns the socket. For instance, if there
is contention on rtnl then rtnetlink requests will be processed by
the process that owns the rtnl.

The solution is to hold a ref count on the socket before we drop
the cb lock.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;<br/>
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;<br/>
</div>
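The idea of the fix, taking a reference before dropping the lock, can be modelled in a few lines. This is a single-threaded sketch with hypothetical names (<code>sock_sketch</code>, <code>sock_hold_sketch()</code>, <code>sock_put_sketch()</code>, <code>dump_with_ref()</code>); the concurrent close by the socket owner is simulated by an explicit put, and no real locking is shown.

```c
/* Hypothetical refcounted object standing in for a socket. */
struct sock_sketch {
    int refcnt;
    int freed;   /* set when the last reference is dropped */
};

void sock_hold_sketch(struct sock_sketch *sk)
{
    sk->refcnt++;
}

void sock_put_sketch(struct sock_sketch *sk)
{
    if (--sk->refcnt == 0)
        sk->freed = 1;
}

/* Returns 1 if the socket was still alive while the dump touched it. */
int dump_with_ref(struct sock_sketch *sk)
{
    int alive;

    sock_hold_sketch(sk);  /* pin the socket before dropping the cb lock */
    /* ... cb lock would be dropped here ... */
    sock_put_sketch(sk);   /* simulates the owner closing the socket now */
    alive = !sk->freed;    /* our reference keeps the socket valid */
    sock_put_sketch(sk);   /* drop our reference; the socket is freed here */
    return alive;
}
```

Without the initial hold, the simulated close would drop the last reference and the dump would touch a freed socket.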

==== diff-cpt-x8664-setpriority ====
<div class="change">
Patch from Alexey:<br/>
[CPT] process priority was restored incorrectly on x86_64

Ugly type casting bug. u32 was implicitly cast to long,
and on 64-bit archs negative nice values were rejected as huge positive ones.
</div>
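The effect can be reproduced in user space. In this sketch <code>int64_t</code> stands in for a 64-bit <code>long</code>, the range check [-20, 19] is an assumption modelled on the usual nice range, and all function names are hypothetical. The conversion from <code>uint32_t</code> to <code>int32_t</code> is implementation-defined in C, but wraps as expected on the common compilers.

```c
#include <stdint.h>

/* Assumed valid range for nice values. */
int nice_in_range(int64_t nice)
{
    return nice >= -20 && nice <= 19;
}

/* Buggy: the 32-bit unsigned field is widened without sign extension. */
int check_nice_buggy(uint32_t saved)
{
    int64_t nice = saved;           /* 0xFFFFFFF6 becomes 4294967286 */
    return nice_in_range(nice);
}

/* Fixed: reinterpret the field as signed 32-bit before widening. */
int check_nice_fixed(uint32_t saved)
{
    int64_t nice = (int32_t)saved;  /* 0xFFFFFFF6 becomes -10 */
    return nice_in_range(nice);
}
```

A saved nice value of -10 is rejected by the buggy check and accepted by the fixed one; non-negative values pass both.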

==== diff-ms-dcache-shrink-sb-fix ====
<div class="change">
Patch from Pavel:<br/>
Newly added s_dentry_unused list must be initialized...

Bug #66944.
Bug #66923.
</div>

==== diff-ve-net-dev-sysctl-20060821 ====
<div class="change">
Patch from Dmitry:<br/>
This patch allows VE owner to use net.ipv4.conf.&lt;net_device&gt;.*
sysctls.

Bug #66842.
</div>

==== diff-ve-sysfs-root-20060818 ====
<div class="change">
Patch from Vasily Tarasov:<br/>
Fix of sysfs tree visibility in VPS.

sysfs_root variable must be virtualized, so that VPS see
only class subsystem and class net.

Bug #66581.
</div>

==== diff-arch-4gb-mce-20060824 ====
<div class="change">
Patch by Vasily (vvs@):

This patch fixes 4Gb-split-related issue: access to kernel-space memory
(machine_check_vector) before context switching

Bug #67271.
</div>

==== diff-cpt-restore-mnt-flags-20060831 ====
<div class="change">
Patch from Andrey:

Mount point's mnt_flags (noexec, nosuid, nodev) were omitted and
not restored correctly.
This patch should be applied together with the previous patch
(diff-ms-bind-mount-flags-20060816); otherwise we would have to do the following:

# Remove the check for bind-mounts in the do_remount() function
# Change the procedure for restoring bind-mounts as follows:
<pre>
do_mount(bind);
do_remount(mnt_flags);
</pre>
</div>

==== diff-cpt-rst-dir ====
<div class="change">
Patch from Alexey:<br/>
[CPT] do not keep open cwd while restore

From the viewpoint of CPT, cwd/root are very similar to an open
file, it is just pair dentry/mnt. Normally, when opening some file
we store it and its inode in special object cache to resolve opening
of the same inode, when some of its aliases (dentries) are deleted.

But it is useless for directories, which cannot be hardlinked ever.
And this consumes numfile UBC, so that restore can fail easily.
So, do not store cwd/root file, unless it is deleted. This does not
solve the problem of restoring a VE that hits the numfile limit, but relieves it a lot.

Now we can temporarily increase numfile limit while cpt/rst by 2 and
everything should be OK.

Bug #62876.
</div>

==== diff-cpt-rst-sigdfl-20060830 ====
<div class="change">
Patch from Alexey:<br/>
[CPT] save/restore even SIG_DFL handlers

Linux has a funny feature: when SA_ONESHOT signal resets
handler, flags are not set to default. And LTP tests verify
this pathology.
</div>

==== diff-cpt-tcp-bind-bug-20060831 ====
<div class="change">
Patch from Alexey:<br/>
[CPT] tcp sockets were bind()ed incorrectly during restore

This case was totally missed. Fortunately, this happens rarely.

If checkpoint happens after some listening socket was closed,
but it left behind some children (including timewait buckets),
restore fails to bind them, unless the service used SO_REUSEADDR.

Stress checkpointing of LTP tests did not catch this earlier
only because... I repaired the tests not to fail upon exhaustion
of port space some time ago. Before that they failed with obvious
and harmless diagnosis long before the first binding conflict happened.
</div>

==== diff-cpt-vsyscall-checks-20060817 ====
<div class="change">
Patch from Andrey:

This patch adds check for vsyscall cpu capabilities in compat mode on x86_64.
We need to check it to be sure that migration of processes with vsyscall will
be successful.
</div>

==== diff-ms-bind-mount-flags-20060816 ====
<div class="change">
Patch from Andrey:

This patch adds support of 3 mount flags (nodev, noexec, nosuid) to
--bind mount.
Now we can do bind mounts with noexec, nosuid and nodev options w/o
need to do remount.
This patch is also required for diff-cpt-restore-mnt-flags-20060831
</div>

==== diff-ms-emt64-entry-bad-iret ====
<div class="change">
Patch from Andrey backported from 2 mainstream patches:

[PATCH] x86_64: Don't call do_exit with interrupts disabled after IRET
exception

This caused a sigreturn with bad argument on a preemptible kernel
to complain with

<pre>
Debug: sleeping function called from invalid context
at /home/lsrc/quilt/linux/include/linux/rwsem.h:43
in_atomic():0, irqs_disabled():1

Call Trace: {__might_sleep+190} {profile_task_exit+21}
{__do_exit+34} {do_wait+0}
</pre>

Signed-off-by: Andi Kleen &lt;ak@suse.de&gt;<br/>
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;

[http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2391c4b594eb28abd58102de8f4e5d7a4fa39f4c Git: 2391c4b594eb28abd58102de8f4e5d7a4fa39f4c]

[PATCH] x86_64: Report SIGSEGV for IRET faults

tcsh is not happy with the -9999 error code.

Suggested by Ernie Petrides

Signed-off-by: Andi Kleen &lt;ak@suse.de&gt;<br/>
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;

[http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3076a492a5e8dd624f237886646b35d12193502d Git: 3076a492a5e8dd624f237886646b35d12193502d]

Bug #67257.
</div>

==== diff-ms-ext3-commit ====
<div class="change">
Patch from mainstream:<br/>
[PATCH] jbd: fix BUG in journal_commit_transaction()

Fix possible assertion failure in journal_commit_transaction() on
jh-&gt;b_next_transaction == NULL (when we are processing BJ_Forget list and
buffer is not jbddirty).

!jbddirty buffers can be placed on BJ_Forget list for example by
journal_forget() or by __dispose_buffer() - generally such buffer means
that it has been freed by this transaction.

Freed buffers should not be reallocated until the transaction has committed
(that's why we have the assertion there) but they *can* be reallocated when
the transaction has already been committed to disk and we are just
processing the BJ_Forget list (as soon as we remove b_committed_data from
the bitmap bh, ext3 will be able to reallocate buffers freed by the
committing transaction). So we have to also count with the case that the
buffer has been reallocated and b_next_transaction has been already set.

And one more subtle point: it can happen that we manage to reallocate the
buffer and also mark it jbddirty. Then we also add the freed buffer to the
checkpoint list of the committing transaction. But that should do no harm.

Non-jbddirty buffers should be filed to BJ_Reserved and not BJ_Metadata
list. It can actually happen that we refile such buffers during the commit
phase when we reallocate in the running transaction blocks deleted in
committing transaction (and that can happen if the committing transaction
already wrote all the data and is just cleaning up BJ_Forget list).

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;<br/>
Acked-by: "Stephen C. Tweedie" &lt;sct@redhat.com&gt;<br/>
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;<br/>
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;<br/>

GIT: 9ada7340987aa24395809570840c7c6847044f52

Bug #67362.
</div>

==== diff-ms-fib-info-leak-20060829 ====
<div class="change">
[PATCH] one more memory leak in fib_semantics
This is the last patch of sequence of three patches curing
memory leakages. This closes bug #67568.

This is mainstream bug specific for 2.6.9, the bug has been fixed:

<pre>
commit b7656e7f2944984befa3ab99a5b99f99a23b302b
Author: David S. Miller &lt;davem@davemloft.net&gt;
Date: Fri Aug 5 04:12:48 2005 -0700

[IPV4]: Fix memory leak during fib_info hash expansion.

When we grow the tables, we forget to free the olds ones
up.

Noticed by Yan Zheng.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>

==== diff-ms-neigh-table-memleak ====
<div class="change">
Patch from Alexey:<br/>
[PATCH] memory leak in neigh table destructor

It leaks one/two size-64 (size-128 on x86_64) per VE destruction.
I do not see any more leaks in 2.6.16 on i386.

2.6.9 (or x86_64) still leaks in size-64 and size-128, and probably in
size-32.
</div>

==== diff-ms-pit-cpukhz ====
<div class="change">
Patch from mainstream:<br/>
pit timer doesn't initialize cpu_khz

Bug #66955.
</div>

==== diff-ubc-net-wait-mem-fix-20060823 ====
<div class="change">
Patch from Pavel Emelianov &lt;xemul@openvz.org&gt;:

Return sk_stream_wait_memory() prototype to original state
to make the InfiniBand driver (and any other caller) compile.
Places that use new version call __sk_stream_wait_memory().
</div>

==== diff-ubc-oomwake-20060823 ====
<div class="change">
Patch from Denis:

This patch wakes up an OOM-killed process if it is stuck in the 5-second
uninterruptible sleep in oom_kill
</div>

==== diff-ve-memleak-fib-hash-20060828 ====
<div class="change">
Patch from Alexey:<br/>
[PATCH] memory leakage in fib_hash

FIB hash tables and zone structs were never freed.
Each time, when VE is stopped, they leak. All the kernels are affected.

It is surprising it was not detected earlier; it says something about the quality
of testing. Obviously, vzctl chkpnt/restore tests were never made: they bring
down a system with 4G of RAM quite soon. Of course, vzctl start/stop is not fast
enough to bring down a system with a decent amount of RAM, but hundreds of thousands of
slab entries are still well visible.

The patch solves leakage in size-128 and most of leakage in size-64.

We still leak two objects in size-64 and 6 entries in size-32.
</div>

==== diff-ve-multi-cleanup-20060824 ====
<div class="change">
Patch from Pavel Emelianov:

Try to clean up each VE in a separate thread.
This allows many VEs to be stopped simultaneously.

Bug #60673.
</div>

==== diff-ve-net-fib-leak-fix-20060830 ====
<div class="change">
Patch from Pavel Emelianov:<br/>
Fix memory leak in case of CONFIG_VE_NETDEV=n
Do not create fib rules if we're not going to use them.
</div>

==== diff-ve-net-loop-stat-20060821 ====
<div class="change">
Patch from Dmitry:<br/>
Virtualized loopback_stats
<br/>Bug #66571.
</div>

==== diff-ve-net-mtu-20060828 ====
<div class="change">
Patch from Dmitry:

* removed mtu restore logic for moved devices
* added possibility to set mtu &gt; 1500 for veth devices

Bug #66836.
</div>

==== diff-ve-portrange-b-20060829 ====
<div class="change">
Patch from Denis:<br/>
This patch fixes virtualization of ip_local_port_range sysctl
</div>

==== diff-ms-nf-compat-err-fix2-20060908-2 ====
<div class="change">
Patch from Dmitry (dim@), found by Vasiliy:<br/>
Add a flush of offsets on the error path; without it, the next
compat_do_replace may corrupt the table.
<br/>Bug #65826.
</div>

==== diff-ms-nf-security-checks2-20060904 ====
<div class="change">
Patch from Dmitry:<br/>
A lot of changes in order to unify compat checks with regular ones.
Fixed bugs with unavailability of some iptables targets and matches in
32bit VEs over 64bit kernels.
<br/>Bug #68017.
<br/>Bug #68042.
<br/>Bug #68043.
</div>

==== diff-cpt-clone-zombie-3 ====
<div class="change">
Patch from Alexey:<br/>
[CPT] restoring threads with tsk-&gt;fs==NULL, bug#65219

If a nptl thread is ptraced, it does not die immediately
and we can arrive to the state:
<pre>
parent
|
main_thread -----&gt; thread1 [ptraced]
in TASK_ZOMBIE in TASK_ZOMBIE
</pre>
To restore such configuration we do kernel_thread(CLONE_SIGNAL)
in context of main_thread. But if it is exited, it has tsk-&gt;fs == NULL
and the kernel oopses.

Suggested fix is very simple: we just attach temporary fs_struct
from init task of VE. Also, we have to delay initialization of tsk-&gt;group_exit,
otherwise kernel will not allow us to clone.

This fix is pragmatic.

Better fix would be restructuring of restore to delay zombification
until the last stage of restore. I.e. we could restore all the tree
of alive processes with all the attributes of alive task (fs, mm etc).
And after it is complete, we could make one more pass and collect garbage
killing zombie tasks and clearing fs, mm etc. It would be cleaner
and safer, but requires too many changes.

Bug #65219.
</div>

==== diff-ve-sysfs-ptmx-20060907 ====
<div class="change">
Patch from Umka:<br/>
This patch adds /sys/class/tty/ptmx device. It's necessary
'cause otherwise udev doesn't create /dev/ptmx.

{{Bug|243}}.
</div>

==== diff-ve-nf-ipt-owner-20060907 ====
<div class="change">
Patch from Dmitry:<br/>
ipt_owner match is virtualized.

Bug #68090.
</div>

==== diff-ms-tifmemdie-20060907 ====
<div class="change">
Patch from Denis:

* replaces PF_MEMDIE with TIF_MEMDIE
* fixes OOM kill counter, which is required for correct OOM generation calculations

Bug #68248.
</div>

==== diff-ubc-oomdebug-20060907 ====
<div class="change">
Patch from Denis:<br/>
OOM generation/kill counter printing on OOM reports
</div>

==== diff-cpt-ns-to-jiffies ====
<div class="change">
Patch from Alexey:<br/>
[CPT] arithmetic bug in _ns_to_jiffies().

Trivial.

But it took lots of time to find this. The only visible effect
of this bug was so funny, that it is worth to describe.
Sometimes, sshd (main daemon, which must never die) died after
checkpointing.

sshd resets SIGALRM handler to SIG_DFL in signal handler.
The bug resulted in incorrect calculation of it_real_incr and alarm was
occasionally restarted. And that killed sshd. :-)
</div>
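The changelog does not say what the arithmetic error was, so the following is only a sketch of the conversion itself, not of the patch. The tick rate of 1000 Hz, the round-up policy and the name <code>ns_to_jiffies_sketch()</code> are all assumptions.

```c
#include <stdint.h>

#define HZ_SKETCH            1000ULL        /* assumed tick rate */
#define NSEC_PER_SEC_SKETCH  1000000000ULL

/* Convert nanoseconds to timer ticks, rounding up to a whole tick. */
uint64_t ns_to_jiffies_sketch(uint64_t ns)
{
    uint64_t ns_per_jiffy = NSEC_PER_SEC_SKETCH / HZ_SKETCH;

    return (ns + ns_per_jiffy - 1) / ns_per_jiffy;
}
```

An interval that should be zero (timer not periodic) must stay zero after conversion; a wrong result there is the kind of error that would make a one-shot alarm fire again.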

==== diff-cpt-mlockall ====
<div class="change">
Patch from Alexey:<br/>
[CPT] mlockall() prevents restore

If a program in new redhats ever did mlockall(), we have
configurations with unreadable mlocked VMAs, which are not
really in core. This is sort of a linux feature.

The reason is that mlock*() set VM_LOCKED even if they
cannot bring in pages.

The fix is to ignore -EFAULT, returned by mlock().
</div>

==== diff-vzwdog-irq-b-20060905 ====
<div class="change">
Patch from Vasiliy:<br/>
The /proc/interrupts file should be closed if kernel_thread() fails
</div>

==== diff-ms-modpost-unresolved-20060904 ====
<div class="change">
Patch from Andrey:<br/>
Unresolved symbols should abort build.

Bug #67875.
</div>

==== diff-vsched-boot-rollback-20060904 ====
<div class="change">
Patch from Andrey:<br/>
We need to roll back idle vcpu initialization if cpu initialization failed.
Otherwise the idle vcpu will be initialized a second time and we will get a panic
in init_idle().

Bug #67506.
</div>

==== diff-cpt-mod-refcnt-leak ====
<div class="change">
Patch from Alexey:<br/>
[CPT] massive module refcnt leakage while restore
Bug: detach of passed FDs is made
directly in af_unix.c, bypassing skb destructor sometimes, so we
leak module refcnt grabbed, when we attached our private destructor.
</div>

==== diff-cpt-resume-oops ====
<div class="change">
Patch from Alexey:<br/>
[CPT] crash in cpt_resume().

Actually, it is a known bug, which was fixed in a hurry,
and the fix did not cover all the places. A task can have tsk-&gt;sighand==NULL
if it is already released.
</div>

==== diff-emt64-better-calltraces ====
<div class="change">
Patch from mainstream:<br/>
Make emt64 print more friendly call traces.

[http://linux.bkbits.net:8080/linux-2.6/gnupatch@44a999d2XW45NdW2kIwsOgamozNOXA gnupatch@44a999d2XW45NdW2kIwsOgamozNOXA]
</div>

==== diff-cpt-af-unix-deleted ====
<div class="change">
Patch from Alexey:<br/>
[CPT] another bug in restoring deleted af_unix sockets

One case was missed. We assumed that if path_lookup() fails
with -ENOENT, it means that we can bind to this name.
But directory can be deleted!

So, instead, switch to attempt to bind() to name. And if it fails,
bind() to temporary name instead.
</div>

==== diff-cpt-vsyscall-dump-20060911 ====
<div class="change">
Patch from Andrey:

The vsyscall object was not written correctly to the image file, so the output of
the imagedump utility was garbled. Fixed.
</div>

==== diff-cpt-kill-freeze-clear ====
<div class="change">
Patch from Alexey:<br/>
[CPT] clearing TIF_FREEZE was not removed

The code was a little messed up while splitting it into two independent
patches (diff-cpt-suspend-cleanup and diff-cpt-ve-suspend).
As a result TIF_FREEZE is still cleared in wake_ve(), although removing that clearing was the main goal
of diff-cpt-ve-suspend.
</div>

==== diff-ve-sysfs-ptmx-b-20060907 ====
<div class="change">
Patch from Dmitry:<br/>
virtualized simple_dev_list. Required for recently added tty_class
virtualization.

Bug #68652.<br/>
Bug #68654.
</div>

==== diff-ve-nf-ipt-slab-20060927 ====
<div class="change">
Patch from Dmitry:<br/>
Fixed slab corruption on debug kernels

Bug #68880.
</div>

==== diff-ms-stopmachine-yield ====
<div class="change">
Patch from mainstream:<br/>
[PATCH] Fix occasional stop_machine() lockup with &gt; 2 CPUs

Stephen Rothwell noted a case where one CPU was sitting in userspace, one
in stop_machine() waiting for everyone to enter stopmachine(). This can
happen if migration occurs at exactly the wrong time with more than 2 CPUS.

Say we have 4 CPUS:

<ol>
<li>stop_machine() on CPU 0 creates stopmachine() threads for CPUS 1, 2
and 3, and yields waiting for them to migrate to their CPUs and
ack.</li>

<li>stopmachine(2) gets rebalanced (probably on exec) to CPU 1.</li>

<li>stopmachine(2) calls set_cpus_allowed on CPU 1, sleeps awaiting
migration thread.</li>

<li>stopmachine(1) calls set_cpus_allowed on CPU 0, moves onto CPU1 and
starts spinning.</li>
</ol>

Now the migration thread never runs, and we deadlock. The simplest
solution is for stopmachine() to yield until they are all in place.

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;<br/>
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;<br/>
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;<br/>
</div>

==== diff-vzdq-isize-b-20060927 ====
<div class="change">
Patch from Denis ported from lunc@ code:<br/>
always fail dentry revalidate check for /proc quotafile
</div>

==== diff-vzdq-isize-20060913 ====
<div class="change">
Patch from Vasily (vtaras@) fixed by Denis:<br/>
This patch sets correct size on /proc/vz/aquota/*/aquota.*

Bug #59920.
</div>

==== diff-ms-remount-flags-20060912 ====
<div class="change">
Patch from Andrey:

In our kernel, remounting of bind-mounts was prohibited.
This patch changes that logic: remounting of bind-mounts is now prohibited
only if superblock flags are changed.
</div>

==== diff-vzdq-symlink-20060913 ====
<div class="change">
Patch from Denis:<br/>
This patch restores quota symlink to /proc on vz_quota_on corrupted by
quotacheck.

Bug #66949.
</div>

==== diff-dbg-stop-machine-20060922 ====
<div class="change">
Patch from Pavel:

This patch tracks info about all tasks involved
in "stop machine" procedure.

Should help to fix bug #68813 and probably bug #67369.
</div>

==== diff-ve-veth-perf-20060926 ====
<div class="change">
Patch from Denis:<br/>
TX for veth device does not require device queue locking
</div>

==== diff-debug-busy-inodes-b-20060914 ====
<div class="change">
Patch from Dmitry:<br/>
fixed printed debug info in case of "busy inodes"

Bug #68575.
</div>

==== diff-venet-perf-20060925 ====
<div class="change">
Patch from Denis:<br/>
TX for venet device does not require device queue locking
</div>

==== diff-ve-vpid-init-20060322 ====
<div class="change">
Patch from Vasily:<br/>
This patch removes VE init pid + 1024 (virtual init pid).
Its presence is detected by the chkrootkit and Maik blames us for this.

Bug #68754.
</div>

==== diff-ve-neightable-warn-20060920 ====
<div class="change">
Patch from Denis:

The problem:

* VE should have exactly one ARP entry
* it is allocated from a UBC slab and the allocation fails due to UBC limits
* ENOBUFS is returned to the caller

The cure: return -ENOMEM in such a case.

Bug #65836.
</div>

==== diff-ve-veth-proc-20060919 ====
<div class="change">
Patch from Andrey:

It is a bug that veth proc entry (/proc/vz/veth) exists in VE0 and VPS.
This patch fixes this.

{{bug|271}}.
</div>

==== diff-ve-nf-ct-proc-20060919 ====
<div class="change">
Patch from Dmitry:<br/>
Fix /proc entries for conntracks.

{{Bug|267}}.
</div>

==== diff-ubc-oomkill-fixes-20060918 ====
<div class="change">
Patch from Pavel:<br/>
This patch fixes locking and ub refcounting in oom killer.

* oom_kill() drops oom_generation_lock, so after returning from it no need to do it again;
* if no bad processes were found in selected ub then ub must be put-ed;
* comment about locking before oom_select_and_kill_sc.

Bug #68721.
</div>

==== diff-ms-nf-compat-err-fix-20060907 ====
<div class="change">
Patch from Dmitry:<br/>
This patch fixes the translate_compat_table() error path.

Bug #68286.
</div>

==== diff-ubc-magic-checks-20061011 ====
<div class="change">
Patch from Pavel:

When ub's BUG on a bad page's ub/pb happens, it is hard
to find out what has happened w/o some memory dumps.

This patch makes such dumps and doesn't BUG the
machine. Instead page_ub(page) is set to NULL in case
of an error in kmem accounting. For page beancounters,
all pbcs that refer to the bad page are removed.

This patch can help to solve Bug #70105
and some others...

* don't free page beancounters
* print some additional page info (taken from bad_page())
* do graceful recovery
</div>

==== diff-ms-xmit-bh-20061009 ====
<div class="change">
Patch from Konstantin Khorenko/mainstream:<br/>
[NET]: Fix unbalanced local_bh_enable() in dev_queue_xmit()

Many thanks to Maik Broemme who helped debugging this.

[http://linux.bkbits.net:8080/linux-2.6/gnupatch@4186e5bfgUOMBbA6xFaY0_z84kaURw gnupatch@4186e5bfgUOMBbA6xFaY0_z84kaURw]
<br/>
[http://linux.bkbits.net:8080/linux-2.6/cset@1.1938.295.30?nav=index.html|src/|src/net|src/net/core|related/net/core/dev.c cset@1.1938.295.30]

Bug #70107.
</div>

==== diff-ms-fs-preparewrite-eh-20061005 ====
<div class="change">
Patch from Vasiliy/mainstream:

{{CVE|2006-4813}}: Information leak in __block_prepare_write()<br/>
Dmitriy Monakhov from the SWsoft Virtuozzo/OpenVZ Linux Team has noticed an
information leak in __block_prepare_write() which affects RHEL4 kernels:
__block_prepare_write() does not properly clear the data buffers during error
recovery, and therefore the content of previously unlinked files is accessible.

This is a known issue, fixed in mainstream by the following patch:
<br/>[http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=152becd26e0563aefdbc4fd1fe491928efe92d1f http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=152becd26e0563aefdbc4fd1fe491928efe92d1f]
<br/>[https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=207463 RedHat Bug #207463]
<br/>Bug #69778.
</div>
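The class of bug can be illustrated with a small user-space model: buffers that were newly set up during a failed write must be zeroed on the error path, otherwise their stale contents stay readable. <code>struct buf</code>, <code>BUF_SIZE</code> and <code>prepare_write()</code> are hypothetical names; this is not the mainstream fix itself.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 8

struct buf {
    char data[BUF_SIZE];
    int  is_new;        /* set up by this call, still holds stale bytes */
};

/* Prepare n buffers; fail_at >= 0 simulates an I/O error at that index. */
static int prepare_write(struct buf *bufs, size_t n, int fail_at)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (fail_at >= 0 && (size_t)fail_at == i)
            goto out_err;
        bufs[i].is_new = 1;
    }
    return 0;

out_err:
    /* Error recovery: wipe every buffer this call marked as new. */
    for (i = 0; i < n; i++) {
        if (bufs[i].is_new) {
            memset(bufs[i].data, 0, BUF_SIZE);
            bufs[i].is_new = 0;
        }
    }
    return -EIO;
}
```

Buffers that were not touched by the failed call keep their contents; only the new ones are cleared.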

==== diff-cpt-check-vsyscall-20061004 ====
<div class="change">
Patch from Andrey:<br/>
This patch adds a check for vsyscall presence on the i386 arch.

Also, we require the syscall capability only for 64-bit processes.
</div>

==== diff-ve-veth-in-ve0-20061009 ====
<div class="change">
Patch from Andrey Mirkin:<br/>
This patch enables creation of a veth pair in VE0.

* ve0.is_running is initialized to 1
* ve0.op_sem is initialized in vecalls_init()
</div>

==== diff-vzdq-noquota-20061004 ====
<div class="change">
Patch from Alexey:<br/>
[VZQUOTA] add virtinfo notifier call to disable vzquota on an inode

Compared to the previous version:

<ol>
<li>vzquota_dentry_off() is replaced with vzquota_inode_off()</li>
<li>all operations on S_NOQUOTA are moved under inode_qmblk_lock().
A few exceptions (in the standard dquot.h header) rely on the fact
that S_NOQUOTA is never cleared.</li>
<li>Two patches are merged together, because S_NOQUOTA handling
has become essentially trivial.</li>
</ol>
</div>

==== diff-cpt-execve-20061006 ====
<div class="change">
Patch from Andrey:

* replaced the execve call with a function like the one in the 2.6.16 kernel.
* the return code of execve should be checked against -ENOENT, not ENOENT.
</div>
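The sign convention behind the second bullet: in-kernel helpers report failure as a negative errno value, so a comparison against the positive constant never matches. A minimal sketch, where <code>try_exec()</code> and <code>exec_target_missing()</code> are hypothetical helpers:

```c
#include <errno.h>

/* Hypothetical stand-in for an in-kernel exec helper:
 * returns 0 on success and a negative errno on failure. */
static int try_exec(int file_exists)
{
    return file_exists ? 0 : -ENOENT;
}

/* Correct check: compare against -ENOENT, not ENOENT. */
static int exec_target_missing(int ret)
{
    return ret == -ENOENT;
}
```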

==== diff-ms-dcahe-aliasing-20061009 ====
<div class="change">
Patch from Dmitry Monakhov:<br/>
A couple of flush_dcache_page()s are missing on the I/O-error paths.

Acked-By: David Miller<br/>
committed in -mm: d-cache-aliasing-issue-in-__block_prepare_write.patch
</div>

==== diff-cpt-vsyscall-checks-b-20061004 ====
<div class="change">
Patch from Andrey:<br/>
We should check if task-&gt;mm is not NULL before checking
task-&gt;mm-&gt;context.vdso value.

Bug #69680.
</div>
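A sketch of the check order described above, using simplified stand-in structures (the real task and mm structures are far larger). Kernel threads have no mm, which is why the pointer must be tested first.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures. */
struct mm_context { void *vdso; };
struct mm         { struct mm_context context; };
struct task       { struct mm *mm; };   /* NULL for kernel threads */

/* Test mm before dereferencing it; && short-circuits on a NULL mm. */
static int task_has_vdso(const struct task *t)
{
    return t->mm != NULL && t->mm->context.vdso != NULL;
}
```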

==== diff-ms-qdisc-lookup-sync-20061004 ====
<div class="change">
Patch from Denis(den@) and Vasily:<br/>
[PATCH] add synchronization while lookup qdiscs.

This patch is part of
[http://linux.bkbits.net:8080/linux-2.6/patch@1.1938.331.16 patch@1.1938.331.16]

{{Bug|278}}.

</div>

==== diff-ve-netlink-perm-20061004 ====
<div class="change">
Patch from Dmitry (dim@) and Vasily:<br/>
cap_netlink_recv should check for both CAP_NET_ADMIN and CAP_VE_NET_ADMIN.
With this change, zebra running in VE0 under a standard user should work.

[http://forum.openvz.org/index.php?t=tree&amp;goto=6283&amp;#msg_6283 http://forum.openvz.org/index.php?t=tree&amp;goto=6283&amp;#msg_6283]
</div>
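One possible reading of the check, sketched in user space: the message is accepted if the sender holds either capability. Treating "both" as "either one is sufficient" is an assumption, and the bit number chosen for CAP_VE_NET_ADMIN below is made up for the illustration (12 is the mainline value of CAP_NET_ADMIN).

```c
#include <stdint.h>

#define CAP_NET_ADMIN     12    /* value used by mainline Linux */
#define CAP_VE_NET_ADMIN  30    /* hypothetical bit, illustration only */
#define CAP_TO_MASK(c)    (UINT32_C(1) << (c))

/* Returns non-zero if a sender with this effective capability set
 * would pass the check. */
static int netlink_recv_allowed(uint32_t eff_cap)
{
    uint32_t wanted = CAP_TO_MASK(CAP_NET_ADMIN) |
                      CAP_TO_MASK(CAP_VE_NET_ADMIN);

    return (eff_cap & wanted) != 0;
}
```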

==== diff-ms-ext2-errorbehaviour-20061006 ====
<div class="change">
Patch from Vasiliy:<br/>
EXT2_ERRORS_CONTINUE should be read from the superblock as default value for
error behaviour.

parse_option() should clear the alternative options and should not change
the default value taken from the superblock.

Signed-off-by: Vasily Averin &lt;vvs@sw.ru&gt;<br/>
Acked-by: Kirill Korotaev &lt;dev@openvz.org&gt;
</div>
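The two rules can be modelled in user space like this. The flag values, <code>default_from_sb()</code> and <code>apply_option()</code> are hypothetical and only mirror the idea: the default comes from the superblock, an explicit errors= option replaces all alternatives, and unrelated options leave the default alone. The fallback for unknown superblock values is an arbitrary choice made for the sketch.

```c
#include <string.h>

/* Error behaviour as stored in the superblock (illustrative values). */
#define SB_ERRORS_CONTINUE 1
#define SB_ERRORS_RO       2
#define SB_ERRORS_PANIC    3

/* Mount option bits (hypothetical values). */
#define MNT_ERRORS_CONT  0x10u
#define MNT_ERRORS_RO    0x20u
#define MNT_ERRORS_PANIC 0x40u
#define MNT_ERRORS_MASK  (MNT_ERRORS_CONT | MNT_ERRORS_RO | MNT_ERRORS_PANIC)

/* Default error behaviour is read from the superblock. */
static unsigned int default_from_sb(unsigned int s_errors)
{
    switch (s_errors) {
    case SB_ERRORS_CONTINUE: return MNT_ERRORS_CONT;
    case SB_ERRORS_PANIC:    return MNT_ERRORS_PANIC;
    default:                 return MNT_ERRORS_RO;  /* arbitrary fallback */
    }
}

/* Apply one mount option string to the current option bits. */
static unsigned int apply_option(unsigned int opts, const char *opt)
{
    unsigned int flag;

    if (strcmp(opt, "errors=continue") == 0)
        flag = MNT_ERRORS_CONT;
    else if (strcmp(opt, "errors=remount-ro") == 0)
        flag = MNT_ERRORS_RO;
    else if (strcmp(opt, "errors=panic") == 0)
        flag = MNT_ERRORS_PANIC;
    else
        return opts;    /* unrelated option: default stays untouched */

    /* Clear the alternatives so exactly one behaviour is selected. */
    return (opts & ~MNT_ERRORS_MASK) | flag;
}
```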

==== diff-ms-ext3-errorbehaviour-b-20061006 ====
<div class="change">
Patch from Dmitry Mishin:<br/>

EXT3_ERRORS_CONTINUE should be taken from the superblock as default value for
error behaviour.

Signed-off-by: Dmitry Mishin &lt;dim@openvz.org&gt;<br/>
Acked-by: Vasily Averin &lt;vvs@sw.ru&gt;<br/>
Acked-by: Kirill Korotaev &lt;dev@openvz.org&gt;
</div>

==== diff-ms-ext3-errorbehaviour-20060902 ====
<div class="change">
Patch from Vasiliy:<br/>
SWsoft Virtuozzo/OpenVZ Linux kernel team has discovered that ext3 error
behavior was broken in linux kernels since 2.5.x versions by the following
patch:

2002/10/31 02:15:26-05:00 tytso@snap.thunk.org<br/>
Default mount options from superblock for ext2/3 filesystems

[http://linux.bkbits.net:8080/linux-2.6/gnupatch@3dc0d88eKbV9ivV4ptRNM8fBuA3JBQ gnupatch@3dc0d88eKbV9ivV4ptRNM8fBuA3JBQ]

If an ext3 file system is mounted with errors=continue (EXT3_ERRORS_CONTINUE),
errors should be ignored when possible. However, at present the kernel aborts
the journal and remounts the filesystem read-only on any error. Such behavior has
been hit a number of times and was noted to differ from that of 2.4.x kernels.

This patch fixes this:

* do nothing in case of EXT3_ERRORS_CONTINUE,
* set EXT3_MOUNT_ABORT and call journal_abort() in all other cases
* panic() should be called after ext3_commit_super() to save sb marked as EXT3_ERROR_FS

Signed-off-by: Vasily Averin &lt;vvs@sw.ru&gt;<br/>
Acked-by: Kirill Korotaev &lt;dev@sw.ru&gt;

Bug #57259.<br/>
Bug #67988.
</div>
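The three bullets above can be modelled in user space as a small state update. All names here are hypothetical; the flags merely record which step would have run (abort, remount, superblock commit, panic) and in what order.

```c
enum err_policy { POLICY_CONTINUE, POLICY_REMOUNT_RO, POLICY_PANIC };

struct fs_state {
    int journal_aborted;    /* models EXT3_MOUNT_ABORT + journal_abort() */
    int read_only;          /* models the read-only remount */
    int sb_committed;       /* models ext3_commit_super() */
    int panicked;           /* models panic() */
    int panic_saw_commit;   /* was the superblock saved before panic? */
};

static void handle_error(struct fs_state *fs, enum err_policy policy)
{
    /* errors=continue: no abort, no remount. */
    if (policy != POLICY_CONTINUE)
        fs->journal_aborted = 1;
    if (policy == POLICY_REMOUNT_RO)
        fs->read_only = 1;

    /* Save the superblock (marked as having errors) first ... */
    fs->sb_committed = 1;

    /* ... and only then panic, so the mark reaches the disk. */
    if (policy == POLICY_PANIC) {
        fs->panic_saw_commit = fs->sb_committed;
        fs->panicked = 1;
    }
}
```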

==== diff-ms-net-bridge-20061004 ====
<div class="change">
Patch from Andrey, backported from mainstream:<br/>
[BRIDGE]: Fix deadlock in br_stp_disable_bridge

Looks like somebody forgot to use the _bh spin_lock variant. We ran into a
deadlock where br-&gt;hello_timer expired while br_stp_disable_br() walked
br-&gt;port_list.

Signed-off-by: Adrian Drzewiecki &lt;z@drze.net&gt;<br/>
Signed-off-by: Stephen Hemminger &lt;shemminger@osdl.org&gt;<br/>
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;

[http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=78872ccb68335b14f0d1ac7338ecfcbf1cba1df4 Git: 78872ccb68335b14f0d1ac7338ecfcbf1cba1df4]

Bug #69666.
</div>

</noinclude>