Download/kernel/2.6.8/022stab078.10/changes

14,681 bytes added, 12:30, 21 March 2008
created (not yet fixed)
== Changes ==
* UBC, VZDQ and other fixes
* Performance optimizations
* A fix for iowait stats
* Mainstream security fixes
* Added nosrc.rpm

=== Configs ===
Same as {{Kernel link|2.6.8|022stab077.1}}, plus<br/>
Added:
* +<code>CONFIG_SERIAL_8250_EXTENDED=y</code>
* +<code>CONFIG_SERIAL_8250_SHARE_IRQ=y</code>
Changed:
* <code>CONFIG_SERIAL_8250_NR_UARTS=16</code> (was 4)
Removed:
* -<code>CONFIG_4KSTACKS</code>
* -<code>CONFIG_UBC_DEBUG</code>
* -<code>CONFIG_UBC_DEBUG_KMEM</code>

<includeonly>[[{{PAGENAME}}/changes#Patches|{{Long changelog message}}]]</includeonly><noinclude>
=== Patches ===

==== diff-vzdq-allocnofs-20060511 ====
<div class="change">
Patch from Kirill:<br/>
This patch is an add-on to diff-vzdq-getstat-20060510.
It fixes all the other places where an allocation with GFP_FS
under qmblk-&gt;dq_sem is possible.
</div>

==== diff-ms-readv-errh ====
<div class="change">
Patch from mainstream:<br/>
This patch fixes error handling in readv().
Trivial error check corrected a bit.<br/>
Bug #61525.
</div>

==== diff-vzdq-getstat-20060510 ====
<div class="change">
Patch from Kirill:<br/>
This patch fixes a self-deadlock in vzquota in
quota_ugid_getstat(). copy_to_user() can trigger a page fault
and get stuck on qmblk-&gt;dq_sem.<br/>
Bug #62179.
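
A minimal userspace sketch of one common way to avoid this kind of self-deadlock: take a snapshot while holding the semaphore and copy it out only after releasing it. The changelog does not show the actual fix, so this pattern and every <code>toy_*</code> name are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy semaphore: taking it twice in one thread would self-deadlock. */
struct toy_sem { int held; };

struct toy_stat { unsigned long usage, limit; };

struct toy_quota {
    struct toy_sem sem;     /* stands in for qmblk->dq_sem (assumption) */
    struct toy_stat stat;
};

static void toy_down(struct toy_sem *s) { assert(!s->held); s->held = 1; }
static void toy_up(struct toy_sem *s)   { s->held = 0; }

/* Stand-in for copy_to_user(): a page fault here may need the same
 * semaphore, so report failure if the caller still holds it. */
static int toy_copy_to_user(const struct toy_sem *s, void *dst,
                            const void *src, size_t n)
{
    if (s->held)
        return -1;          /* the real kernel would hang here */
    memcpy(dst, src, n);
    return 0;
}

/* Snapshot under the semaphore, copy out only after releasing it. */
static int toy_getstat(struct toy_quota *q, struct toy_stat *user_buf)
{
    struct toy_stat snap;

    toy_down(&q->sem);
    snap = q->stat;
    toy_up(&q->sem);
    return toy_copy_to_user(&q->sem, user_buf, &snap, sizeof(snap));
}
```

The price of this pattern is one extra copy through a temporary buffer; in exchange, nothing that can fault runs while the semaphore is held.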
</div>

==== diff-ms-smbfs-chroot ====
<div class="change">
Patch from mainstream:<br/>
[PATCH] smbfs chroot issue (CVE-2006-1864)

Mark Moseley reported that a chroot environment on a SMB share can be
left via "cd ..\\". Similar to CVE-2006-1863 issue with cifs, this fix
is for smbfs.

Steven French &lt;sfrench@us.ibm.com&gt; wrote:

Looks fine to me. This should catch the slash on lookup or equivalent,
which will be all obvious paths of interest.

Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;

[http://kernel.org/git/?p=linux/kernel/git/stable/linux-2.6.16.y.git;a=commitdiff;h=4acbb3fbaccda1f1d38e7154228e052ce80a2dfa X-Git-Url]
</div>

==== diff-ms-locks-after-close ====
<div class="change">
Patch from mainstream:<br/>
[PATCH] stale POSIX lock handling

I believe that there is a problem with the handling of POSIX locks, which
the attached patch should address.

The problem appears to be a race between fcntl(2) and close(2). A
multithreaded application could close a file descriptor at the same time as
it is trying to acquire a lock using the same file descriptor. I would
suggest that that multithreaded application is not providing the proper
synchronization for itself, but the OS should still behave correctly.

SUS3 (Single UNIX Specification Version 3, read: POSIX) indicates that when
a file descriptor is closed, that all POSIX locks on the file, owned by the
process which closed the file descriptor, should be released.

The trick here is when those locks are released. The current code releases
all locks which exist when close is processing, but any locks in progress
are handled when the last reference to the open file is released.

There are three cases to consider.

One is the simple case, a multithreaded (mt) process has a file open and
races to close it and acquire a lock on it. In this case, the close will
release one reference to the open file and when the fcntl is done, it will
release the other reference. For this situation, no locks should exist on
the file when both the close and fcntl operations are done. The current
system will handle this case because the last reference to the open file is
being released.

The second case is when the mt process has dup(2)'d the file descriptor.
The close will release one reference to the file and the fcntl, when done,
will release another, but there will still be at least one more reference
to the open file. One could argue that the existence of a lock on the file
after the close has completed is okay, because it was acquired after the
close operation and there is still a way for the application to release the
lock on the file, using an existing file descriptor.

The third case is when the mt process has forked, after opening the file
and either before or after becoming an mt process. In this case, each
process would hold a reference to the open file. For each process, this
degenerates to first case above. However, the lock continues to exist
until both processes have released their references to the open file. This
lock could block other lock requests.

The changes to release the lock when the last reference to the open file
aren't quite right because they would allow the lock to exist as long as
there was a reference to the open file. This is too long.

The new proposed solution is to add support in the fcntl code path to
detect a race with close and then to release the lock which was just
acquired when such a race is detected. This causes locks to be released
in a timely fashion and for the system to conform to the POSIX semantic
specification.

This was tested by instrumenting a kernel to detect the handling locks and
then running a program which generates case #3 above. A dangling lock
could be reliably generated. When the changes to detect the close/fcntl
race were added, a dangling lock could no longer be generated.
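
The proposed solution can be sketched in userspace as follows. This is only a toy model, not the kernel code: the descriptor table, the <code>toy_*</code> names, the <code>racing_close</code> flag that simulates the other thread, and the -1 return value on a detected race are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FDS 8

struct toy_file {
    int lock_owner;                     /* 0 = unlocked, else owner id */
};

struct toy_files {
    struct toy_file *fd[TOY_MAX_FDS];   /* toy descriptor table */
};

/* close(): drop the descriptor and release the owner's lock. */
static void toy_close(struct toy_files *files, int fd, int owner)
{
    struct toy_file *filp = files->fd[fd];

    files->fd[fd] = NULL;
    if (filp && filp->lock_owner == owner)
        filp->lock_owner = 0;
}

/* fcntl()-like lock path. 'racing_close' simulates a close() from
 * another thread that runs between the lookup and the lock.
 * Returns 0 on success and -1 if the descriptor is invalid, the file
 * is locked by someone else, or a race with close() was detected. */
static int toy_setlk(struct toy_files *files, int fd, int owner,
                     int racing_close)
{
    struct toy_file *filp;

    if (fd < 0 || fd >= TOY_MAX_FDS)
        return -1;
    filp = files->fd[fd];
    if (!filp || (filp->lock_owner && filp->lock_owner != owner))
        return -1;
    if (racing_close)
        toy_close(files, fd, owner);
    filp->lock_owner = owner;           /* lock acquired */
    if (files->fd[fd] != filp) {        /* descriptor changed: raced */
        filp->lock_owner = 0;           /* release the new lock */
        return -1;
    }
    return 0;
}
```

Without the re-check after acquiring the lock, the racing case would leave <code>lock_owner</code> set on a file that the process can no longer reach through that descriptor, which is the dangling lock described above.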

Cc: Matthew Wilcox &lt;willy@debian.org&gt;<br/>
Cc: Trond Myklebust &lt;trond.myklebust@fys.uio.no&gt;<br/>
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;<br/>
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;<br/>

GIT: c293621bbf678a3d85e3ed721c3921c8a670610d<br/>
RHEL4u3: linux-2.6.9-locks-after-close.patch

[http://linux.bkbits.net:8080/linux-2.6/cset@1.3332.104.96 http://linux.bkbits.net:8080/linux-2.6/cset@1.3332.104.96]
</div>

==== diff-ms-lock-race ====
<div class="change">
Patch from mainstream:<br/>
[PATCH] fs/locks.c: Fix sys_flock() race

sys_flock() currently has a race which can result in a double free in the
multi-thread case.

<source lang="c">
Thread 1                        Thread 2

sys_flock(file, LOCK_EX)
                                sys_flock(file, LOCK_UN)
</source>

If Thread 2 removes the lock from inode-&gt;i_lock before Thread 1 tests for
list_empty(&amp;lock-&gt;fl_link) at the end of sys_flock, then both threads will end up calling locks_free_lock for the same lock.

Fix is to make flock_lock_file() do the same as posix_lock_file(), namely
to make a copy of the request, so that the caller can always free the lock.

This also has the side-effect of fixing up a reference problem in the
lockd handling of flock.
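
The copy-the-request idea can be illustrated with a small userspace model. It is a sketch only: the <code>toy_*</code> types and functions are hypothetical and do not mirror the real fs/locks.c structures.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_lock {
    int type;                /* e.g. 1 = exclusive */
    struct toy_lock *next;   /* link in the per-file lock list */
};

struct toy_file {
    struct toy_lock *locks;  /* locks currently applied to the file */
};

/* Insert a private copy of the request, so the list never owns memory
 * that the caller is going to free. Returns 0 on success. */
static int toy_flock_lock_file(struct toy_file *f,
                               const struct toy_lock *request)
{
    struct toy_lock *copy = malloc(sizeof(*copy));

    if (!copy)
        return -1;
    copy->type = request->type;
    copy->next = f->locks;
    f->locks = copy;
    return 0;
}

/* Unlock path: frees only the list-owned copies. */
static void toy_flock_unlock(struct toy_file *f)
{
    while (f->locks) {
        struct toy_lock *l = f->locks;

        f->locks = l->next;
        free(l);
    }
}
```

Because the list and the caller own different objects, an unlock from another thread can free the inserted copy at any time while the caller still frees its own request exactly once.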

Signed-off-by: Trond Myklebust &lt;Trond.Myklebust@netapp.com&gt;<br/>
Signed-off-by: Andrew Morton &lt;akpm@osdl.org&gt;<br/>
Signed-off-by: Linus Torvalds &lt;torvalds@osdl.org&gt;<br/>

X-Git-Tag: v2.6.17-rc1<br/>
X-Git-Url: http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=993dfa8776308dcfd311cf77a3bbed4aa11e9868
</div>

==== diff-ve-vprintk-20061015 ====
<div class="change">
Patch from Vasiliy Tarasov:<br/>
This patch fixes vprintk(). It should print messages in VE0,
not in the current context.
</div>

==== diff-ve-netfilter-ipt-redir-20060517 ====
<div class="change">
Patch from Dmitry:<br/>
Fixed ipt_REDIRECT operation inside VEs.<br/>
<br/>
[http://bugzilla.openvz.org/show_bug.cgi?id=171 OpenVZ Bug #171].
</div>

==== diff-ms-sched-wakeup-forked ====
<div class="change">
Patch from Kirill:<br/>
This patch fixes wake_up_forked_process():
* it should not change task-&gt;cpu
* it should lock p's runqueue, not current's
* current-&gt;runqueue can be != p-&gt;runqueue
</div>

==== diff-ve-proc-tgid-20060518 ====
<div class="change">
Patch from Dmitry Mishin &lt;dim@openvz.org&gt;:<br/>
Fixed an oops in get_tgid_list.<br/>
If an external (VE0) process looks up the proc tree of a VE that is on
ve_cleanup_list, an oops in get_tgid_list is possible. Fixed.
</div>

==== diff-ms-megaraid-64bit-dma-check ====
<div class="change">
Patch from mainstream and Vasily Averin:<br/>
This patch contains a fix for the 64-bit DMA capability check in the
megaraid_{mm,mbox} driver. With the patch, the driver accesses PCI
configuration space at a dedicated offset to read a signature. If the
signature is read, the controller is capable of handling
64-bit DMA. Before this patch, the driver blindly claimed the capability without checking with the controller.
The issue was reported by Vasily Averin &lt;vvs@sw.ru&gt;. Thank you, Vasily, for reporting it.<br/>
<br/>
Fixed a bug in megaraid_init_mbox().<br/>
Customer reported "garbage in file on x86_64 platform".<br/>
Root cause: the driver registered controllers as 64-bit DMA capable
even when they do not support it.<br/>
Fix: changed the function to add an identification mechanism
that identifies 64-bit DMA capable controllers.<br/>
<br/>
Signed-Off By: Seokmann Ju &lt;seokmann.ju@lsil.com&gt;
</div>

==== diff-ubc-pb-hash-func ====
<div class="change">
Patch from Pavel:<br/>
This patch fixes the poor pb_hash function. The fix greatly reduces
hash list length and makes fork()/exit() faster.
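
The changelog does not show the old or the new pb_hash, so the following is only a generic illustration of how a weak hash lengthens hash lists. It assumes keys that are aligned addresses; the function names, the bucket count, and the shift-based improvement are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BUCKETS 64u

/* Weak: aligned keys have zero low bits, so few buckets get used. */
static unsigned weak_hash(uintptr_t key)
{
    return (unsigned)(key % TOY_BUCKETS);
}

/* Better for aligned keys: drop the always-zero low bits first. */
static unsigned shifted_hash(uintptr_t key, unsigned align_shift)
{
    return (unsigned)((key >> align_shift) % TOY_BUCKETS);
}

/* Longest chain after hashing n keys spaced 'stride' bytes apart. */
static size_t longest_chain(size_t n, uintptr_t stride,
                            int use_shift, unsigned align_shift)
{
    size_t count[TOY_BUCKETS] = { 0 };
    size_t max = 0;

    for (size_t i = 0; i < n; i++) {
        uintptr_t key = (uintptr_t)i * stride;
        unsigned b = use_shift ? shifted_hash(key, align_shift)
                               : weak_hash(key);

        if (++count[b] > max)
            max = count[b];
    }
    return max;
}
```

With 256 keys spaced 32 bytes apart, <code>longest_chain(256, 32, 0, 5)</code> gives a chain of 128 entries, while <code>longest_chain(256, 32, 1, 5)</code> gives 4. Every lookup walks a chain, so shorter chains directly reduce the cost of operations that touch the hash.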
</div>

==== diff-ms-dst-lock-20060522 ====
<div class="change">
Patch from Dmitry:<br/>
Replace add_timer() with mod_timer() in dst_run_gc()
to avoid a BUG message.
<source lang="c">
CPU1                            CPU2
dst_run_gc() entered            dst_run_gc() entered
spin_lock(&amp;dst_lock)            .....
del_timer(&amp;dst_gc_timer)        fail to get lock
....                            mod_timer() &lt;--- puts timer back
....                                             in list
add_timer(&amp;dst_gc_timer) &lt;--- BUG because timer is in list already.
</source>
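
The interleaving above can be replayed with a toy timer that only tracks whether it is queued. This is a userspace sketch, not the kernel timer API: the <code>toy_*</code> functions are hypothetical stand-ins, and returning -1 replaces the BUG that the kernel would raise.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a kernel timer: only tracks whether it is queued. */
struct toy_timer {
    bool pending;
    unsigned long expires;
};

/* Like add_timer(): queuing an already pending timer is a bug.
 * Returns -1 where the kernel would hit BUG. */
static int toy_add_timer(struct toy_timer *t, unsigned long expires)
{
    if (t->pending)
        return -1;
    t->pending = true;
    t->expires = expires;
    return 0;
}

/* Like mod_timer(): safe whether or not the timer is pending.
 * Returns 1 if the timer was pending, 0 otherwise. */
static int toy_mod_timer(struct toy_timer *t, unsigned long expires)
{
    int was_pending = t->pending;

    t->pending = true;
    t->expires = expires;
    return was_pending;
}

/* Like del_timer(): take the timer off the list. */
static void toy_del_timer(struct toy_timer *t)
{
    t->pending = false;
}
```

Replaying the diagram: after CPU1's <code>toy_del_timer()</code> and CPU2's <code>toy_mod_timer()</code>, a <code>toy_add_timer()</code> on CPU1 fails because the timer is pending again, whereas <code>toy_mod_timer()</code> simply re-arms it.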

Bug #62581.
</div>

==== diff-ms-showmem-atomicity ====
<div class="change">
Patch from Kirill:

This patch fixes printk() under zone-&gt;lock.

It can be unsafe to call printk() under this lock, since
the callee can try to allocate or free memory and self-deadlock
on this lock. I found memory allocation and freeing in both netconsole and serial console.
</div>

==== diff-dbg-nmi-printk2-20060524 ====
<div class="change">
Patch from Andrey:

This patch implements safer printk from places like NMI or from
under critical locks (runqueue lock and so on).
It is a replacement of two dbg-nmi-printk-200508* patches.

Printk from NMI watchdog was left unmodified in this version, since
other patches are also fiddling with it. NMI watchdog requires great
care and consideration, and is left for future inspection.
</div>

==== diff-ubc-putwarn-20060525 ====
<div class="change">
Patch from Andrey:

This patch prints a more sensible warning on a bad refcounter
in __put_beancounter.
</div>

==== diff-ubc-tcppage-20060525 ====
<div class="change">
Patch from Andrey:

This patch fixes an apparent bug in accounting in
ub_sock_tcp_chargepage.
Should help with the problems at DefenderHosting.
</div>

==== diff-dbg-nmi-printk2b-20060526 ====
<div class="change">
Patch from Andrey:

Console code passes additional information via global variables.
This patch fixes the release_console_sem() call in this respect.
</div>

==== diff-ubc-net-tcp-openreq-20060529 ====
<div class="change">
Patch from Dmitry:

Fixed oops in inet_sock_destruct due to wrong sk_clone error path.
Found by phycho.
</div>

==== diff-ve-iowait-20060525 ====
<div class="change">
Patch from Dmitry:

This patch fixes iowait_time statistics for both VE0 and VEs.
Removes redundant nr_iowait field in VE_CPU_STATS.

Bug noticed by Matt Loschert.
</div>

==== diff-ve-iowait2-20060525 ====
<div class="change">
Patch from Dmitry:

Fixed iowait stats for VE0: after schedule(), a task
may be activated on another processor.
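
A toy model of why this matters for per-CPU counters. The changelog does not show the patched code, so this is a sketch under assumptions: the counter layout, the <code>toy_*</code> names, and the <code>remember_cpu</code> switch between buggy and fixed behaviour are all hypothetical.

```c
#include <assert.h>

#define TOY_NR_CPUS 2

struct toy_stats {
    int nr_iowait[TOY_NR_CPUS];  /* tasks waiting for I/O, per CPU */
};

/* Simulates one I/O wait: the task goes to sleep on sleep_cpu and is
 * woken up on wake_cpu. With remember_cpu set, the decrement is
 * applied to the CPU that was incremented; otherwise the current CPU
 * is looked up again after the wakeup, which is the buggy variant. */
static void toy_io_wait(struct toy_stats *st, int sleep_cpu,
                        int wake_cpu, int remember_cpu)
{
    int cpu = sleep_cpu;

    st->nr_iowait[cpu]++;
    /* schedule() would run here; the task may resume on wake_cpu */
    if (!remember_cpu)
        cpu = wake_cpu;
    st->nr_iowait[cpu]--;
}
```

In the buggy variant, a task that migrates from CPU 0 to CPU 1 leaves the counters at 1 and -1, so the iowait figure for CPU 0 never returns to zero.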
</div>

==== diff-security-ct-infleak ====
<div class="change">
Patch from mainstream:<br/>
[PATCH] NETFILTER: Fix small information leak in SO_ORIGINAL_DST (CVE-2006-1343)<br/>

It appears that sockaddr_in.sin_zero is not zeroed during
getsockopt(...SO_ORIGINAL_DST...) operation. This can lead
to an information leak (CVE-2006-1343).
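
A minimal userspace sketch of the defensive pattern, assuming a Linux-style <code>struct sockaddr_in</code> with a <code>sin_zero</code> member. The helper names are hypothetical, and zeroing the whole structure is a choice made for this sketch rather than a description of the upstream change.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: fill a reply address the way a getsockopt()
 * handler might. Zeroing the whole struct first clears sin_zero and
 * any padding, so no stale bytes are handed back to the caller. */
static void fill_original_dst(struct sockaddr_in *sin,
                              in_addr_t addr_be, in_port_t port_be)
{
    memset(sin, 0, sizeof(*sin));
    sin->sin_family = AF_INET;
    sin->sin_port = port_be;          /* already in network byte order */
    sin->sin_addr.s_addr = addr_be;   /* already in network byte order */
}

/* Returns 1 when every byte of sin_zero is zero. */
static int sin_zero_is_clear(const struct sockaddr_in *sin)
{
    static const unsigned char zeros[sizeof(sin->sin_zero)];

    return memcmp(sin->sin_zero, zeros, sizeof(zeros)) == 0;
}
```

If the structure is filled field by field without the <code>memset()</code>, whatever was previously in that memory remains in <code>sin_zero</code> and is copied out with the rest of the address.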

Signed-off-by: Marcel Holtmann &lt;marcel@holtmann.org&gt;<br/>
Signed-off-by: Chris Wright &lt;chrisw@sous-sol.org&gt;

X-Git-Url: http://www.kernel.org/git/?p=linux/kernel/git/stable/linux-2.6.16.y.git;a=commitdiff;h=11091f6a4a11feb5794aef9307c428838129ea02
</div>

==== diff-gcc4-qla2xxx-20060530 ====
<div class="change">
Patch from Kir &lt;kir@openvz.org&gt;:<br/>
Fixes a compilation issue with gcc4.

The following error occurs when trying to compile 022stab077:

<pre>drivers/scsi/qla2xxx/qla_gs.c: In function qla2x00_ga_nxt:
drivers/scsi/qla2xxx/qla_gs.c:97: sorry, unimplemented: inlining failed in call
to qla24xx_prep_ms_iocb: function not considered for inlining
drivers/scsi/qla2xxx/qla_gs.c:61: sorry, unimplemented: called from here
</pre>

{{Bug|182}}.
</div>

==== diff-ms-as-params-20060605 ====
<div class="change">
Patch from Vasily:

Change AS I/O scheduler defaults due to the problem with syncs.
* read_batch_expire = 10 by default.
* read_expire = 10 by default.

Kernel.org [http://bugzilla.kernel.org/show_bug.cgi?id=5900 bug #5900]

Found by Matt Loschert, ticket #154336.
</div>

==== diff-ve-root-user-20060605 ====
<div class="change">
Patch from Vasily:

In some places we should compare not with the &amp;root_user pointer (HN root),
but with the VPS root. This bug made su unable to change user when the ulimit
was too tight for root.

Found by Barmaley, ticket #158322.
</div>

==== diff-fairsched-assert-20060602 ====
<div class="change">
Patch by Andrey (saw@):

This patch fixes assertions in fairsched to avoid printk deadlocks,
and to print more information.
</div>

==== diff-ve-vpid-leak ====
<div class="change">
Patch from Alexey:<br/>
[PATCH] leakage of vpid_mapping

Probably this fixes bug #62834.

The problem was that when switching to sparse VPID mappings, we could
have processes with non-virtual pids entering a VE. For example, it could be
some stuck process from VE setup scripts. In this case we created a
useless mapping struct, which was never freed, because it referred
to a non-virtual pid.

I left a printk() in the code, because we definitely need confirmation
that this event really happens. It does not in my tests: so far
I have run 400000 checkpoint/restores and 20000 migrations on a VE and found
no problems, unfortunately.
</div>

==== linux-2.6.8.1-3w9xxx-2.26.4.009.patch ====
<div class="change">
Patch prepared by Vasily:<br/>
Sources were taken from www.3ware.com.

The 3w-9xxx driver was updated to version 2.26.4.009.
Fixed a kmap_atomic() problem that might result in data loss under Linux.
The new driver version disables local IRQs while the driver is holding KM_IRQ0. This is to prevent an IRQ handler from using those kmap slots while the driver is
using them, which can result in memory corruption.

[http://3ware.com/download/Escalade9550SX-Series/9.3.0.4/9.3.0.4_Release_Notes_Web.pdf 9.3.0.4_Release_Notes_Web.pdf]

Bug #38702.
</div>

</noinclude>