Changes
- Rebased on 2.6.9-42.0.3EL
- Driver updates, configs synchronization with RHEL
- CPT fixes
- Mainstream updates
- A lot of virtualization enhancements (sysctls, sysfs, stats)
- OOM-killer fixes
- VE cleanup speedup.
Config changes
A lot of changes due to:
- an attempt to bring configs closer to the RedHat ones
- driver updates to match HCL of 2.6.8 branch
Driver updates
diff-drv-adp94xx-freeze-20060906
Patch from Kostja:
This patch fixes kernel compilation with CONFIG_SCSI_ADP94XX=y
by removing uses of PF_FREEZE flag in adp94xx driver.
diff-wrn-implicit-funcs-20060906
Patch from Kostja:
This patch removes warnings "implicit declaration of function"
during compilation on x86 and x86_64 arches.
linux-2.6.9-e1000-7.2.7.patch
Patch ported by Kostja (khorenko@):
e1000 driver updated up to 7.2.7 version
sources were taken from
sourceforge.net/projects/e1000
Bug #19952.
linux-2.6.9-r8169-2.2.patch-1
Patch ported by Kostja (khorenko@):
r8169 driver updated up to 2.2 version
sources were taken from 2.6.8-022stab078.20 vz kernel.
Bug #19950.
linux-2.6.9-sk98lin-8.31.2.3.patch
Patch ported by Kostja (khorenko@):
sk98lin driver updated up to version 8.31.2.3; sources were taken from skd.de
Bug #28918.
linux-2.6.9-sky2-1.4.patch
Patch ported by Kostja (khorenko@):
sky2 driver updated up to version 1.4
sources were taken from 2.6.8-022stab078.20 vz kernel
Bug #19950.
linux-2.6.9-qla4xxx-5.00.02.patch
Patch ported by Kostja (khorenko@):
qla4xxx driver updated up to version 5.00.02
sources from Qlogic's site.
Bug #27641.
linux-2.6.9-arcmsr-1.20.0X.12.patch
Patch ported by Kostja (khorenko@):
Areca driver v1.20.0X.12 added.
Sources are from Areca site
Bug #59933.
linux-2.6.9-dell_rbu-0.9.patch
Patch ported by Kostja (khorenko@):
dell_rbu driver updated to 0.9 version.
sources from Dell site.
Bug #55618.
linux-2.6.9-aoe-14.patch
Patch ported by Kostja (khorenko@):
AoE driver version 14 added; sources from site
Bug #51009.
linux-2.6.9-dpt_i2o-2.5.0-2426.patch
Patch ported by Kostja (khorenko@):
Added an alternative driver for I2O hardware, version 2.5.0, build 2426; sources taken from Mark Salyzyn.
!!! Obsoletes diff-drv-dpt-entropy-20040525 !!!
Bug #68066.
linux-2.6.9-i2o-1.325.patch
Patch ported by Kostja (khorenko@):
updates i2o layer,
backported from the 2.6.17 Linux mainstream kernel by Vasily (vvs@)
Bug #68066.
diff-scsi-megaraid-dma64-20060621
Patch from Vasily:
This patch prevents enabling 64-bit DMA on the Megaraid SATA 150-4 controller, because it does not support 64-bit DMA.
Bug #52530.
linux-2.6.9-qla2xxx-8.01.05.patch
Patch prepared by Kostja:
Qlogic qla2xxx driver updated up to 8.01.05 version.
Sources from site
Bug #27641.
Patches
diff-cpt-annoying-printk
Patch from Alexey:
[CPT] remove annoying printk
In 2.6.9 printk("=") in refrigerator() is commented out. We should remove printk(">\n") in cpt. The code with the comment is not removed but commented out, as a reminder that we have to restore it if the printk in refrigerator() is uncommented.
diff-cpt-asmlinkage
Patch from Alexey:
[CPT] asmlinkage attribute was forgotten
This fixes CPT when the kernel is compiled with CONFIG_REGPARM
diff-cpt-check-syscall-cap-20060814
Patch from Andrey:
This patch adds a check for the syscall CPU capability, which is needed by the vsyscall page on x86_64, in the same way the sysenter capability is already checked.
diff-cpt-check-vsyscall-20060817
Patch from Andrey:
This patch checks during suspend whether a 64-bit task is currently in vsyscall on x86_64. If a task is found in vsyscall during suspend, we can try to suspend again. The check for the vsyscall page on x86_64 in dump_one_mm() is removed.
diff-cpt-ifindex-renumber-20060815
Patch from Andrey:
This patch adds renumbering of netdev->ifindex values on restore. We can do this because the network is suspended. All manipulations are protected with rtnl_lock(). If the required index is already busy, the patch swaps the ifindex of the device in question with that of the device which holds the required index.
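A minimal userspace sketch of the swap described above, not the patch itself. struct fake_dev and fake_set_ifindex() are hypothetical names; the real code works on struct net_device under rtnl_lock().

```c
#include <stddef.h>

/* Hypothetical stand-in for struct net_device; only the index matters here. */
struct fake_dev {
    int ifindex;
};

/*
 * Give `dev` the index `wanted`. If another device already holds it,
 * that device takes over dev's old index, so all indices stay unique.
 */
static void fake_set_ifindex(struct fake_dev *devs, size_t n,
                             struct fake_dev *dev, int wanted)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (&devs[i] != dev && devs[i].ifindex == wanted) {
            devs[i].ifindex = dev->ifindex; /* holder gets the old index */
            break;
        }
    }
    dev->ifindex = wanted;
}
```

With devices indexed 1, 2, 3, asking for index 1 on the third device leaves the first device with index 3.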
diff-cpt-lic-forkret-20060810-2
Patch from Dmitry:
fixed oops caused by diff-cpt-vzent-ovz-20060804, added modules license.
Remove use of syscall_exit.
Bug #66511.
diff-cpt-mm-eagain-20060817
Patch from Andrey:
In tests we can see message: "mm_struct is referenced outside"
After that message checkpoint fails.
It seems that this situation is legal, so checkpoint could be restarted. So we return -EAGAIN to be able to restart checkpoint.
diff-cpt-skb-pcount
Patch from Alexey:
[CPT] save/restore tcp_skb_pcount()
Backport from 2.6.16. 2.6.9 has this thing too.
diff-cpt-suid-dumpable
Patch from Alexey:
[CPT] restore mm->dumpable correctly
mm->dumpable is not boolean in >=2.6.9, but tri-state. Just save and restore raw value.
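A small sketch of why a boolean save loses information, assuming only that the field can hold a value other than 0 and 1 (for example 2). Both helper names are hypothetical and are not taken from the CPT code.

```c
/* Lossy: collapses any non-zero state to 1, so a third state is lost. */
static int fake_save_dumpable_bool(int dumpable)
{
    return !!dumpable;
}

/* Lossless: the raw value is written to the image and restored unchanged. */
static int fake_save_dumpable_raw(int dumpable)
{
    return dumpable;
}
```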
diff-cpt-test-caps-fix-20060815
Patch from Andrey:
This patch fixes old test capabilities code. We can't use context in this code, because it is not yet initialized. Was broken due to diff-cpt-checks-20060808
diff-cpt-ve-features-20060815
Patch from Andrey:
The feature set was not saved in CPT, so VPSes based on the Suse template could fail after restore (VE_FEATURE_SYSFS was lost). Save the feature set in a place that was not used before (cpt_os_version and cpt_os_features fields in the image header).
diff-cpt-vsyscall-page-20060814
Patch from Andrey:
Changes:
- checks for errors are added
- externs are moved to .h file
- current_thread_info()->sysenter_return is set to the right value on both arches
diff-cpt-x86_64-debuginfo
Patch from Alexey:
[CPT] fix compilation with CONFIG_DEBUG_INFO
Just #undef it.
diff-ms-dcache-shrink-sb
Patch from Kirill:
Introduce a per-sb list of dcache entries to improve shrink_dcache_sb() and shrink_dcache_parent(). This should eliminate a customer problem where, on VE stop, umount takes an hour to complete while holding the s_umount semaphore.
diff-ms-nf-ipt-compat-20060814
Patch from Dmitry:
Remove extra checks from compat_copy_* functions. Previously they led to an extra module put on the error path.
Bug #66569.
diff-ms-retranscollapse
Patch from mainstream, prepared by Denis:
[TCP]: Do not try to collapse multi-packet SKB
Signed-off-by: David S. Miller <davem@davemloft.net>
diff-ms-smp-send-stop-irqs-fix-20060726
Patch from Pavel:
Do not rely on smp_call_function() to notify other cpus they must stop. Just call IPI after setting call_data accordingly.
smp_call_function() operates on the global static call_data_struct under a lock to be sure it is valid during the call.
smp_send_stop() sends its IPI without synchronization with the ones from smp_call_function(), but this is OK if the handler ACKs both of them.
Bug #65573.
diff-ms-vsyscall-page-20060814
Patch from Andrey:
Changes:
- new entry sysenter_return is added to thread_info structure on x86_64 arch, ia32entry.S code changed accordingly
- constants are changed to defined values
- now we have a hole between IA32_STACK_TOP and vsyscall page
- VSYSCALL32_SYSEXIT value must match SYSENTER_RETURN_OFFSET value to be able to migrate vsyscall-sysenter page from x86_64 to i386
Now we are able to migrate int80 and sysenter vsyscall pages from i386 to x86_64 and back.
diff-ms-vsyscall-sysenter-align-20060814
Patch from A. Mirkin:
> There is one unexpected place:
>
> #define VSYSCALL32_SYSEXIT (VSYSCALL32_BASE + 0x41A)
>
> If we cannot avoid this, I am afraid it would be better just
> to add alignment to 0x420 in vsyscall-sysenter both in i386
> and in ia32/x86_64 and to undo that code mimicing i386 mmap.
>
> If we need to know 0x41A explicitly, that trick loses sense completely.
>
> But this can be done later.
>
> Alexey
I have added necessary alignment in both archs and removed redundant code from x86_64 sysenter page. Now we have return offset at 0x420.
diff-ms-netlinkcb-1
Patch from mainstream for the netlink memory corruption:
Bug #66596.
[NETLINK]: Fix sk_rmem_alloc assertion failure.
In netlink_dump we're operating on sk after dropping the cb lock. This is racy because the owner of the socket could close it after we drop the cb lock.
This is possible because netlink_dump isn't always called from the context of the process that owns the socket. For instance, if there is contention on rtnl then rtnetlink requests will be processed by the process that owns the rtnl.
The solution is to hold a ref count on the socket before we drop the cb lock.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
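The idea of holding a reference before dropping the lock can be modelled in plain C as below. This is only an illustrative sketch: struct fake_sock and the fake_* helpers are hypothetical and stand in for the kernel's socket reference counting.

```c
/* Hypothetical stand-in for struct sock: a reference count and a "freed" marker. */
struct fake_sock {
    int refcnt;
    int freed;
};

static void fake_sock_hold(struct fake_sock *sk)
{
    sk->refcnt++;
}

static void fake_sock_put(struct fake_sock *sk)
{
    if (--sk->refcnt == 0)
        sk->freed = 1;      /* the real code would free the socket here */
}

/*
 * Model of the fixed dump path. `owner_closes` simulates the socket owner
 * dropping its reference while the dumper works without the cb lock.
 * Returns 1 if the socket was still alive while the dumper used it.
 */
static int fake_dump(struct fake_sock *sk, int owner_closes)
{
    int alive;

    fake_sock_hold(sk);     /* take our own reference first */
    /* ... the cb lock is dropped here in the real code ... */
    if (owner_closes)
        fake_sock_put(sk);  /* owner closes the socket meanwhile */
    alive = !sk->freed;     /* dumper touches sk: still valid */
    fake_sock_put(sk);      /* drop our reference when done */
    return alive;
}
```

Without the extra hold, the owner's put would bring the count to zero while the dumper is still using the socket.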
diff-cpt-x8664-setpriority
Patch from Alexey:
[CPT] process priority was restored incorrectly on x86_64
Ugly type casting bug. A u32 was implicitly converted to long, so on 64-bit archs negative nice values were rejected as huge positive ones.
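The effect can be reproduced in userspace. In this sketch int64_t stands in for the kernel's long on a 64-bit arch, and both helper names are hypothetical; the actual CPT code is not shown in this changelog.

```c
#include <stdint.h>

/* Buggy widening: the unsigned 32-bit value is zero-extended. */
static int64_t fake_nice_widen_buggy(uint32_t saved)
{
    return (int64_t)saved;
}

/*
 * Fixed widening: reinterpret as signed 32-bit first, then sign-extend.
 * (The conversion to int32_t wraps on the usual two's-complement compilers.)
 */
static int64_t fake_nice_widen_fixed(uint32_t saved)
{
    return (int64_t)(int32_t)saved;
}
```

A saved nice value of -5 becomes 4294967291 through the buggy path and stays -5 through the fixed one.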
diff-ms-dcache-shrink-sb-fix
Patch from Pavel:
Newly added s_dentry_unused list must be initialized...
Bug #66944. Bug #66923.
diff-ve-net-dev-sysctl-20060821
Patch from Dmitry:
This patch allows VE owner to use net.ipv4.conf.<net_device>.*
sysctls.
Bug #66842.
diff-ve-sysfs-root-20060818
Patch from Vasily Tarasov:
Fix of sysfs tree visibility in VPS.
sysfs_root variable must be virtualized, so that VPS see only class subsystem and class net.
Bug #66581.
diff-arch-4gb-mce-20060824
Patch by Vasily (vvs@):
This patch fixes 4Gb-split-related issue: access to kernel-space memory (machine_check_vector) before context switching
Bug #67271.
diff-cpt-restore-mnt-flags-20060831
Patch from Andrey:
Mount point's mnt_flags (noexec,nosuid,nodev) were omitted and not restored correctly. This patch should be applied together with diff-ms-bind-mount-flags-20060816; otherwise we should do the following:
- Remove check for bind-mounts in do_remount() function
- Change the procedure for restoring bind-mounts as follows:
do_mount(bind); do_remount(mnt_flags).
diff-cpt-rst-dir
Patch from Alexey:
[CPT] do not keep open cwd while restore
From the viewpoint of CPT, cwd/root are very similar to an open file, it is just pair dentry/mnt. Normally, when opening some file we store it and its inode in special object cache to resolve opening of the same inode, when some of its aliases (dentries) are deleted.
But it is useless for directories, which cannot be hardlinked ever. And this consumes numfile UBC, so that restore can fail easily. So, do not store cwd/root file, unless it is deleted. This does not solve problem with restoring VE hitting numfiles, but relieves it a lot.
Now we can temporarily increase numfile limit while cpt/rst by 2 and everything should be OK.
Bug #62876.
diff-cpt-rst-sigdfl-20060830
Patch from Alexey:
[CPT] save/restore even SIG_DFL handlers
Linux has a funny feature: when SA_ONESHOT signal resets handler, flags are not set to default. And LTP tests verify this pathology.
diff-cpt-tcp-bind-bug-20060831
Patch from Alexey:
[CPT] tcp sockets were bind()ed incorrectly during restore
This case was totally missed. Fortunately, this happens rarely.
If checkpoint happens after some listening socket was closed, but it left behind some children (including timewait buckets), restore fails to bind them, unless the service used SO_REUSEADDR.
Stress checkpointing of LTP tests did not catch this earlier only because... I repaired the tests not to fail upon exhaustion of port space some time ago. Before that they failed with obvious and harmless diagnosis long before the first binding conflict happened.
diff-cpt-vsyscall-checks-20060817
Patch from Andrey:
This patch adds check for vsyscall cpu capabilities in compat mode on x86_64. We need to check it to be sure that migration of processes with vsyscall will be successful.
diff-ms-bind-mount-flags-20060816
Patch from Andrey:
This patch adds support of 3 mount flags (nodev, noexec, nosuid) to --bind mount. Now we can do bind mounts with noexec, nosuid and nodev options w/o need to do remount. This patch is also required for diff-cpt-restore-mnt-flags-20060831
diff-ms-emt64-entry-bad-iret
Patch from Andrey backported from 2 mainstream patches:
[PATCH] x86_64: Don't call do_exit with interrupts disabled after IRET exception
This caused a sigreturn with bad argument on a preemptible kernel to complain with
Debug: sleeping function called from invalid context at /home/lsrc/quilt/linux/include/linux/rwsem.h:43
in_atomic():0, irqs_disabled():1
Call Trace: {__might_sleep+190} {profile_task_exit+21} {__do_exit+34} {do_wait+0}
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Git: 2391c4b594eb28abd58102de8f4e5d7a4fa39f4c
[PATCH] x86_64: Report SIGSEGV for IRET faults
tcsh is not happy with the -9999 error code.
Suggested by Ernie Petrides
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Git: 3076a492a5e8dd624f237886646b35d12193502d
Bug #67257.
diff-ms-ext3-commit
Patch from mainstream:
[PATCH] jbd: fix BUG in journal_commit_transaction()
Fix possible assertion failure in journal_commit_transaction() on jh->b_next_transaction == NULL (when we are processing BJ_Forget list and buffer is not jbddirty).
!jbddirty buffers can be placed on BJ_Forget list for example by journal_forget() or by __dispose_buffer() - generally such buffer means that it has been freed by this transaction.
Freed buffers should not be reallocated until the transaction has committed (that's why we have the assertion there) but they *can* be reallocated when the transaction has already been committed to disk and we are just processing the BJ_Forget list (as soon as we remove b_committed_data from the bitmap bh, ext3 will be able to reallocate buffers freed by the committing transaction). So we have to also count with the case that the buffer has been reallocated and b_next_transaction has been already set.
And one more subtle point: it can happen that we manage to reallocate the buffer and also mark it jbddirty. Then we also add the freed buffer to the checkpoint list of the committing transaction. But that should do no harm.
Non-jbddirty buffers should be filed to BJ_Reserved and not BJ_Metadata list. It can actually happen that we refile such buffers during the commit phase when we reallocate in the running transaction blocks deleted in committing transaction (and that can happen if the committing transaction already wrote all the data and is just cleaning up BJ_Forget list).
Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: "Stephen C. Tweedie" <sct@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
GIT: 9ada7340987aa24395809570840c7c6847044f52
Bug #67362.
diff-ms-fib-info-leak-20060829
[PATCH] one more memory leak in fib_semantics
This is the last patch of a sequence of three patches curing memory leakages. This closes bug #67568.
This is a mainstream bug specific to 2.6.9; it has been fixed by:
commit b7656e7f2944984befa3ab99a5b99f99a23b302b
Author: David S. Miller <davem@davemloft.net>
Date: Fri Aug 5 04:12:48 2005 -0700
[IPV4]: Fix memory leak during fib_info hash expansion.
When we grow the tables, we forget to free the olds ones up. Noticed by Yan Zheng.
Signed-off-by: David S. Miller <davem@davemloft.net>
diff-ms-neigh-table-memleak
Patch from Alexey:
[PATCH] memory leak in neigh table destructor
It leaks one/two size-64 (size-128 on x86_64) per VE destruction. I do not see any more leaks in 2.6.16 on i386.
2.6.9 (or x86_64) still leaks in size-64 and size-128, and probably in size-32.
diff-ms-pit-cpukhz
Patch from mainstream:
pit timer doesn't initialize cpu_khz
Bug #66955.
diff-ubc-net-wait-mem-fix-20060823
Patch from Pavel Emelianov <xemul@openvz.org>:
Return the sk_stream_wait_memory() prototype to its original state to make the infiniband driver (and any other caller) compile. Places that use the new version call __sk_stream_wait_memory().
diff-ubc-oomwake-20060823
Patch from Denis:
This patch wakes up an OOM-killed process if it gets stuck in the 5-second uninterruptible sleep in oom_kill
diff-ve-memleak-fib-hash-20060828
Patch from Alexey:
[PATCH] memory leakage in fib_hash
FIB hash tables and zone structs were never freed. Each time, when VE is stopped, they leak. All the kernels are affected.
It is surprising it was not detected earlier; it says something about the quality of testing. Obviously, vzctl chkpnt/restore tests were never made, as they bring down a system with 4G of RAM quite soon. Of course, vzctl start/stop is not so fast to bring down a system with a decent amount of RAM, but hundreds of thousands of slab entries are still well visible.
The patch solves leakage in size-128 and most of leakage in size-64.
We still leak two objects in size-64 and 6 entries in size-32.
diff-ve-multi-cleanup-20060824
Patch from Pavel Emelianov:
Try to clean up each VE in a separate thread. This allows many VEs to be stopped simultaneously
Bug #60673.
diff-ve-net-fib-leak-fix-20060830
Patch from Pavel Emelianov:
Fix memory leak in case of CONFIG_VE_NETDEV=n
Do not create fib rules if we're not going to use them.
diff-ve-net-loop-stat-20060821
Patch from Dmitry:
Virtualized loopback_stats
Bug #66571.
diff-ve-net-mtu-20060828
Patch from Dmitry:
- removed mtu restore logic for moved devices
- added possibility to set mtu > 1500 for veth devices
Bug #66836.
diff-ve-portrange-b-20060829
Patch from Denis:
This patch fixes virtualization of ip_local_port_range sysctl
diff-ms-nf-compat-err-fix2-20060908-2
Patch from Dmitry (dim@), found by Vasiliy:
Add flush of offsets on the error path; leftover offsets may lead to table corruption on
the next compat_do_replace.
Bug #65826.
diff-ms-nf-security-checks2-20060904
Patch from Dmitry:
A lot of changes in order to unify compat checks with regular ones.
Fixed bugs with unavailability of some iptables targets and matches in
32bit VEs over 64bit kernels.
Bug #68017.
Bug #68042.
Bug #68043.
diff-cpt-clone-zombie-3
Patch from Alexey:
[CPT] restoring threads with tsk->fs==NULL, bug#65219
If a nptl thread is ptraced, it does not die immediately and we can arrive to the state:
parent
  |
main_thread -----> thread1 [ptraced]
in TASK_ZOMBIE     in TASK_ZOMBIE
To restore such a configuration we do kernel_thread(CLONE_SIGNAL) in the context of main_thread. But if it has exited, it has tsk->fs == NULL and the kernel oopses.
Suggested fix is very simple: we just attach temporary fs_struct from init task of VE. Also, we have to delay initialization of tsk->group_exit, otherwise kernel will not allow us to clone.
This fix is pragmatic.
A better fix would be restructuring of restore to delay zombification until the last stage of restore. I.e. we could restore the whole tree of alive processes with all the attributes of an alive task (fs, mm etc). And after it is complete, we could make one more pass and collect garbage, killing zombie tasks and clearing fs, mm etc. It would be cleaner and safer, but requires too many changes.
Bug #65219.
diff-ve-sysfs-ptmx-20060907
Patch from Umka:
This patch adds /sys/class/tty/ptmx device. It's necessary
'cause otherwise udev doesn't create /dev/ptmx.
diff-ve-nf-ipt-owner-20060907
Patch from Dmitry:
ipt_owner match is virtualized.
Bug #68090.
diff-ms-tifmemdie-20060907
Patch from Denis:
- replaces PF_MEMDIE with TIF_MEMDIE
- fixes OOM kill counter, which is required for correct OOM generation calculations
Bug #68248.
diff-ubc-oomdebug-20060907
Patch from Denis:
OOM generation/kill counter printing on OOM reports
diff-cpt-ns-to-jiffies
Patch from Alexey:
[CPT] arithmetic bug in _ns_to_jiffies().
Trivial.
But it took lots of time to find this. The only visible effect of this bug was so funny that it is worth describing. Sometimes, sshd (main daemon, which must never die) died after checkpointing.
sshd resets SIGALRM handler to SIG_DFL in signal handler. The bug resulted in incorrect calculation of it_real_incr and alarm was occasionally restarted. And that killed sshd. :-)
diff-cpt-mlockall
Patch from Alexey:
[CPT] mlockall() prevents restore
If a program in new redhats ever did mlockall(), we have configurations with unreadable mlocked VMAs, which are not really in core. This is sort of a linux feature.
The reason is that mlock*() set VM_LOCKED even if they cannot bring in pages.
The fix is to ignore -EFAULT, returned by mlock().
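The fix can be pictured as a small error filter on the restore path. This is a sketch only; fake_rst_filter_mlock_error() is a hypothetical name, and the real restore code is not shown in this changelog.

```c
#include <errno.h>

/*
 * Treat -EFAULT from mlock() as success during restore: the VMA keeps
 * VM_LOCKED even though its pages could not be brought in.
 * Any other error is passed through unchanged.
 */
static int fake_rst_filter_mlock_error(int err)
{
    if (err == -EFAULT)
        return 0;
    return err;
}
```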
diff-vzwdog-irq-b-20060905
Patch from Vasiliy:
/proc/interrupts file should be closed if kernel_thread() fails
diff-ms-modpost-unresolved-20060904
Patch from Andrey:
Unresolved symbols should abort build.
Bug #67875.
diff-vsched-boot-rollback-20060904
Patch from Andrey:
We need to roll back idle vcpu initialization if cpu initialization failed.
Otherwise the idle vcpu will be initialized a second time and we will get a panic
in init_idle().
Bug #67506.
diff-cpt-mod-refcnt-leak
Patch from Alexey:
[CPT] massive module refcnt leakage while restore
Bug: detach of passed FDs is made
directly in af_unix.c, bypassing skb destructor sometimes, so we
leak module refcnt grabbed, when we attached our private destructor.
diff-cpt-resume-oops
Patch from Alexey:
[CPT] crash in cpt_resume().
Actually, it is a known bug, which was fixed in a hurry, and the fix did not cover all the places. A task can have tsk->sighand==NULL if it is already released.
diff-emt64-better-calltraces
Patch from mainstream:
Make emt64 print more friendly call traces.
diff-cpt-af-unix-deleted
Patch from Alexey:
[CPT] another bug in restoring deleted af_unix sockets
One case was missed. We assumed that if path_lookup() fails with -ENOENT, it means that we can bind to this name. But directory can be deleted!
So, instead, switch to attempt to bind() to name. And if it fails, bind() to temporary name instead.
diff-cpt-vsyscall-dump-20060911
Patch from Andrey:
The vsyscall object was not written correctly to the image file, so the output of the imagedump utility was garbled. Fixed.
diff-cpt-kill-freeze-clear
Patch from Alexey:
[CPT] clearing TIF_FREEZE was not removed
The code got a little messed up while being split into two independent patches (diff-cpt-suspend-cleanup and diff-cpt-ve-suspend). As a result TIF_FREEZE is still cleared in wake_ve(), although removing that clearing was the main goal of diff-cpt-ve-suspend.
diff-ve-sysfs-ptmx-b-20060907
Patch from Dmitry:
virtualized simple_dev_list. Required for recently added tty_class
virtualization.
Bug #68652.
Bug #68654.
diff-ve-nf-ipt-slab-20060927
Patch from Dmitry:
Fixed slab corruption on debug kernels
Bug #68880.
diff-ms-stopmachine-yield
Patch from mainstream:
[PATCH] Fix occasional stop_machine() lockup with > 2 CPUs
Stephen Rothwell noted a case where one CPU was sitting in userspace, one in stop_machine() waiting for everyone to enter stopmachine(). This can happen if migration occurs at exactly the wrong time with more than 2 CPUS.
Say we have 4 CPUS:
- stop_machine() on CPU 0 creates stopmachine() threads for CPUS 1, 2 and 3, and yields waiting for them to migrate to their CPUs and ack.
- stopmachine(2) gets rebalanced (probably on exec) to CPU 1.
- stopmachine(2) calls set_cpus_allowed on CPU 1, sleeps awaiting migration thread.
- stopmachine(1) calls set_cpus_allowed on CPU 0, moves onto CPU1 and starts spinning.
Now the migration thread never runs, and we deadlock. The simplest solution is for stopmachine() to yield until they are all in place.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
diff-vzdq-isize-b-20060927
Patch from Denis ported from lunc@ code:
always fail dentry revalidate check for /proc quotafile
diff-vzdq-isize-20060913
Patch from Vasily (vtaras@) fixed by Denis:
This patch sets correct size on /proc/vz/aquota/*/aquota.*
Bug #59920.
diff-ms-remount-flags-20060912
Patch from Andrey:
In our kernel remounting of bind-mounts was prohibited. This patch changes that logic: remounting of bind-mounts is now prohibited only if superblock flags are changed.
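The new rule can be sketched as a simple predicate. fake_bind_remount_allowed() and its parameters are hypothetical; the real check lives in the remount path and is not reproduced here.

```c
/*
 * Returns 1 if the remount may proceed, 0 if it is refused.
 * A bind-mount is refused only when the superblock flags would change;
 * changing per-mount flags alone is allowed.
 */
static int fake_bind_remount_allowed(int is_bind_mount,
                                     unsigned long old_sb_flags,
                                     unsigned long new_sb_flags)
{
    if (is_bind_mount && old_sb_flags != new_sb_flags)
        return 0;
    return 1;
}
```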
diff-vzdq-symlink-20060913
Patch from Denis:
This patch restores quota symlink to /proc on vz_quota_on corrupted by
quotacheck.
Bug #66949.
diff-dbg-stop-machine-20060922
Patch from Pavel:
This patch tracks info about all tasks involved in "stop machine" procedure.
Should help to fix bug #68813 and probably bug #67369.
diff-ve-veth-perf-20060926
Patch from Denis:
TX for veth device does not require device queue locking
diff-debug-busy-inodes-b-20060914
Patch from Dmitry:
fixed printed debug info in case of "busy inodes"
Bug #68575.
diff-venet-perf-20060925
Patch from Denis:
TX for venet device does not require device queue locking
diff-ve-vpid-init-20060322
Patch from Vasily:
This patch removes VE init pid + 1024 (virtual init pid).
Its presence is detected by the chkrootkit and Maik blames us for this.
Bug #68754.
diff-ve-neightable-warn-20060920
Patch from Denis:
The problem:
- VE should have exactly one ARP entry
- it is allocated from UBC slab and the allocation is failed due to UBC
- ENOBUFS is returned to the caller
The cure: return -ENOMEM in such a case.
Bug #65836.
diff-ve-veth-proc-20060919
Patch from Andrey:
It is a bug that veth proc entry (/proc/vz/veth) exists in VE0 and VPS. This patch fixes this.
diff-ve-nf-ct-proc-20060919
Patch from Dmitry:
Fix /proc entries for conntracks.
diff-ubc-oomkill-fixes-20060918
Patch from Pavel:
This patch fixes locking and ub refcounting in oom killer.
- oom_kill() drops oom_generation_lock, so after returning from it no need to do it again;
- if no bad processes were found in selected ub then ub must be put-ed;
- comment about locking before oom_select_and_kill_sc.
Bug #68721.
diff-ms-nf-compat-err-fix-20060907
Patch from Dmitry:
This patch fixed translate_compat_table() error way.
Bug #68286.
diff-ubc-magic-checks-20061011
Patch from Pavel:
When a UBC BUG on a bad page's ub/pb happens, it is hard to find out what has happened without some memory dumps.
This patch makes such dumps and does not BUG the machine. Instead page_ub(page) is set to NULL in case of an error in kmem accounting. For page beancounters, all pbcs that refer to the bad page are removed.
This patch can help to solve Bug #70105 and some others...
- don't free page beancounters
- print some additional page info (taken from bad_page())
- do graceful recovery
diff-ms-xmit-bh-20061009
Patch from Konstantin Khorenko/mainstream:
[NET]: Fix unbalanced local_bh_enable() in dev_queue_xmit()
Many thanks to Maik Broemme who helped debugging this.
gnupatch@4186e5bfgUOMBbA6xFaY0_z84kaURw
cset@1.1938.295.30
Bug #70107.
diff-ms-fs-preparewrite-eh-20061005
Patch from Vasiliy/mainstream:
CVE-2006-4813: Information leak in __block_prepare_write()
Dmitriy Monakhov from SWsoft Virtuozzo/OpenVZ Linux Team has noticed an information leak in __block_prepare_write() which affects RHEL4 kernels: __block_prepare_write() does not properly clear the data buffers during error recovery, and therefore the content of previously unlinked files is accessible.
It is known issue and it is fixed in mainstream by following patch:
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=152becd26e0563aefdbc4fd1fe491928efe92d1f
RedHat Bug #207463
Bug #69778.
diff-cpt-check-vsyscall-20061004
Patch from Andrey:
This patch adds checking for vsyscall presence on i386 arch.
Also we demand syscall capability only for 64-bit processes.
diff-ve-veth-in-ve0-20061009
Patch from Andrey Mirkin:
This patch enables creation of veth pair in VE0.
- ve0.is_running is initialized to 1
- ve0.op_sem is initialized in vecalls_init()
diff-vzdq-noquota-20061004
Patch from Alexey:
[VZQUOTA] add virtinfo notifier call to disable vzquota on an inode
Comparing to previous version:
- vzquota_dentry_off() is replaced with vzquota_inode_off()
- all the operations with S_NOQUOTA are moved under inode_qmblk_lock(). A few of exceptions (in standard dquot.h header) rely on the fact, that S_NOQUOTA is never cleared.
- Two patches are merged together, because S_NOQUOTA handling essentially trivialized.
diff-cpt-execve-20061006
Patch from Andrey:
- replaced execve call with function like in 2.6.16 kernel.
- we should check return code of execve to be -ENOENT, not ENOENT.
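Kernel-internal calls report failure as a negative errno, which is why the sign matters in the second item. A tiny sketch, with fake_exec_target_missing() as a hypothetical helper name:

```c
#include <errno.h>

/* True only for the negative errno value that an in-kernel call returns. */
static int fake_exec_target_missing(int ret)
{
    return ret == -ENOENT;
}
```

Comparing against the positive ENOENT would never match a real failure.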
diff-ms-dcahe-aliasing-20061009
Patch from Dmitry Monakhov:
A couple of flush_dcache_page()s are missing on the I/O-error paths.
Acked-By: David Miller
committed in -mm: d-cache-aliasing-issue-in-__block_prepare_write.patch
diff-cpt-vsyscall-checks-b-20061004
Patch from Andrey:
We should check if task->mm is not NULL before checking
task->mm->context.vdso value.
Bug #69680.
diff-ms-qdisc-lookup-sync-20061004
Patch from Denis(den@) and Vasily:
[PATCH] add synchronization while lookup qdiscs.
this patch is a part of patch@1.1938.331.16
diff-ve-netlink-perm-20061004
Patch from Dmitry (dim@) and Vasily:
cap_netlink_recv should check for both CAP_NET_ADMIN and CAP_VE_NET_ADMIN.
Now zebra in VE0 under std user should work.
http://forum.openvz.org/index.php?t=tree&goto=6283&#msg_6283
diff-ms-ext2-errorbehaviour-20061006
Patch from Vasiliy:
EXT2_ERRORS_CONTINUE should be read from the superblock as default value for
error behaviour.
parse_option() should clean the alternative options and should not change default value taken from the superblock.
Signed-off-by: Vasily Averin <vvs@sw.ru>
Acked-by: Kirill Korotaev <dev@openvz.org>
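The option handling described above can be sketched as follows. The FAKE_* bit values and fake_set_errors_opt() are hypothetical and do not match the real EXT2_MOUNT_ERRORS_* constants; only the clear-then-set pattern is the point.

```c
/* Hypothetical bit values for the three mutually exclusive errors= modes. */
enum {
    FAKE_ERRORS_CONT  = 1 << 0,
    FAKE_ERRORS_RO    = 1 << 1,
    FAKE_ERRORS_PANIC = 1 << 2,
    FAKE_ERRORS_MASK  = FAKE_ERRORS_CONT | FAKE_ERRORS_RO | FAKE_ERRORS_PANIC
};

/*
 * Apply an explicit errors= option: clear the alternatives first, then set
 * the chosen mode. When no errors= option is given this function is not
 * called at all, so the default taken from the superblock stays untouched.
 */
static unsigned long fake_set_errors_opt(unsigned long mount_opt,
                                         unsigned long chosen)
{
    mount_opt &= ~(unsigned long)FAKE_ERRORS_MASK;
    return mount_opt | chosen;
}
```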
diff-ms-ext3-errorbehaviour-b-20061006
Patch from Dmitry Mishin:
EXT3_ERRORS_CONTINUE should be taken from the superblock as default value for error behaviour.
Signed-off-by: Dmitry Mishin <dim@openvz.org>
Acked-by: Vasily Averin <vvs@sw.ru>
Acked-by: Kirill Korotaev <dev@openvz.org>
diff-ms-ext3-errorbehaviour-20060902
Patch from Vasiliy:
SWsoft Virtuozzo/OpenVZ Linux kernel team has discovered that ext3 error
behavior was broken in linux kernels since 2.5.x versions by the following
patch:
2002/10/31 02:15:26-05:00 tytso@snap.thunk.org
Default mount options from superblock for ext2/3 filesystems
gnupatch@3dc0d88eKbV9ivV4ptRNM8fBuA3JBQ
If an ext3 file system is mounted with errors=continue (EXT3_ERRORS_CONTINUE), errors should be ignored when possible. However, at present the kernel aborts the journal and remounts the filesystem read-only on any error. Such behavior was hit a number of times and noted to differ from that of 2.4.x kernels.
This patch fixes this:
- do nothing in case of EXT3_ERRORS_CONTINUE,
- set EXT3_MOUNT_ABORT and call journal_abort() in all other cases
- panic() should be called after ext3_commit_super() to save sb marked as EXT3_ERROR_FS
Signed-off-by: Vasily Averin <vvs@sw.ru>
Acked-by: Kirill Korotaev <dev@sw.ru>
Bug #57259.
Bug #67988.
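The three rules listed in this entry can be summarised as a decision function. This is a sketch of the stated policy only: the enum names and fake_error_actions() are hypothetical, and the read-only remount and other details of the real error handler are left out.

```c
/* Hypothetical names for the three errors= policies. */
enum fake_errors_policy {
    FAKE_POLICY_CONTINUE,
    FAKE_POLICY_RO,
    FAKE_POLICY_PANIC
};

/* Hypothetical action bits returned by the sketch. */
enum {
    FAKE_ACT_ABORT_JOURNAL      = 1 << 0, /* set EXT3_MOUNT_ABORT, journal_abort() */
    FAKE_ACT_PANIC_AFTER_COMMIT = 1 << 1  /* panic() only after ext3_commit_super() */
};

static unsigned int fake_error_actions(enum fake_errors_policy policy)
{
    unsigned int actions = 0;

    if (policy == FAKE_POLICY_CONTINUE)
        return actions;                 /* errors=continue: do nothing */

    actions |= FAKE_ACT_ABORT_JOURNAL;  /* all other cases abort the journal */
    if (policy == FAKE_POLICY_PANIC)
        actions |= FAKE_ACT_PANIC_AFTER_COMMIT;
    return actions;
}
```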
diff-ms-net-bridge-20061004
Patch from Andrey, backported from mainstream:
[BRIDGE]: Fix deadlock in br_stp_disable_bridge
Looks like somebody forgot to use the _bh spin_lock variant. We ran into a deadlock where br->hello_timer expired while br_stp_disable_br() walked br->port_list.
Signed-off-by: Adrian Drzewiecki <z@drze.net>
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Git: 78872ccb68335b14f0d1ac7338ecfcbf1cba1df4
Bug #69666.