  • CVE-2008-5029 (Unix sockets kernel panic) fixed
  • Fixed an oops on x86_64 that occurred when a Container constantly tried to exceed its privvmpages limit
  • Return EOPNOTSUPP when an RTNL message is not supported by the kernel




Patch from mainstream, ported by Kostja (khorenko@).

net: Fix recursive descent in __scm_destroy().

__scm_destroy() walks the list of file descriptors in the scm_fp_list pointed to by the scm_cookie argument.

Those, in turn, can close sockets and invoke __scm_destroy() again.

There is nothing which limits how deeply this can occur.

The idea for how to fix this is from Linus. Basically, we do all of the fput()s at the top level by collecting all of the scm_fp_list objects hit by an fput(). Inside of the initial __scm_destroy() we keep running the list until it is empty.
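
The work-list idea can be illustrated with a small user-space model. This is only a sketch, not the kernel code: `file_model`, `fp_list`, `put_file()`, `destroy_fp_list()`, `build_chain()` and the global `work_tail` are hypothetical stand-ins for `struct file`, `struct scm_fp_list`, `fput()`, `__scm_destroy()` and the per-task work list, and each list is assumed to carry a single file.

```c
#include <assert.h>
#include <stdlib.h>

struct fp_list;

/* Hypothetical stand-in for a struct file that backs a unix socket. */
struct file_model {
    int refcount;
    struct fp_list *queued;      /* fd-passing message still in its receive queue */
};

/* Hypothetical stand-in for struct scm_fp_list, holding a single file. */
struct fp_list {
    struct fp_list *next;        /* link on the pending work list */
    struct file_model *fp;
};

static struct fp_list **work_tail;   /* non-NULL while a top-level destroy runs */
static int depth, max_depth;         /* instrumentation for the demo only */
static long lists_freed;

static void destroy_fp_list(struct fp_list *fpl);

/* Model of fput(): the last reference closes the socket and drops its queue. */
static void put_file(struct file_model *f)
{
    assert(f->refcount > 0);
    if (--f->refcount > 0)
        return;
    if (f->queued)
        destroy_fp_list(f->queued);
    free(f);
}

static void destroy_fp_list(struct fp_list *fpl)
{
    if (++depth > max_depth)
        max_depth = depth;
    fpl->next = NULL;
    if (work_tail) {
        /* Nested call: only queue the list, the top level will process it. */
        *work_tail = fpl;
        work_tail = &fpl->next;
    } else {
        /* Top-level call: keep running the work list until it is empty. */
        work_tail = &fpl->next;
        while (fpl) {
            struct fp_list *next;
            put_file(fpl->fp);       /* may append more lists behind us */
            next = fpl->next;
            free(fpl);
            lists_freed++;
            fpl = next;
        }
        work_tail = NULL;
    }
    depth--;
}

/* Build n nested messages: list -> file -> queued list -> file -> ... */
static struct fp_list *build_chain(int n)
{
    struct fp_list *inner = NULL;
    for (int i = 0; i < n; i++) {
        struct file_model *f = calloc(1, sizeof(*f));
        struct fp_list *l = calloc(1, sizeof(*l));
        assert(f && l);
        f->refcount = 1;
        f->queued = inner;
        l->fp = f;
        inner = l;
    }
    return inner;
}
```

In this model, destroying a chain of any length never nests deeper than two calls (the top-level call plus one call that only queues work), whereas a naive recursive version would nest once per link in the chain.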

Bug #128590.


Patch from mainstream, ported by Kostja (khorenko@).

This is a prerequisite patch for the next one.

[AF_UNIX]: Rewrite garbage collector, fixes race.

Throw out the old mark & sweep garbage collector and put in a refcounting, cycle-detecting one.

The old one had a race with recvmsg, which resulted in false positives and hence data loss. The old algorithm operated on all unix sockets in the system, so any additional locking would have meant performance problems for all users of these.

The new algorithm instead only operates on «in flight» sockets, which are very rare, and the additional locking for these doesn't negatively impact the vast majority of users.

In fact, it is probable that there weren't *any* heavy senders of sockets over sockets, otherwise the above race would have been discovered long ago.

The patch works OK with the app that exposed the race with the old code. The garbage collection has also been verified to work in a few simple cases.
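
A rough user-space sketch of cycle detection by reference counting follows. It is a simplified model under stated assumptions, not the kernel's unix_gc(): `sock_model`, `gc_model()` and the fixed-size arrays are hypothetical, locking is ignored, and each socket's receive queue is reduced to a list of indices of the sockets whose descriptors it holds.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SOCKS 8

/* Hypothetical, simplified model of a unix socket as seen by the collector. */
struct sock_model {
    int total_refs;          /* all references to the socket's file */
    int inflight_refs;       /* references held by fds queued in some socket */
    int nqueued;             /* number of fds sitting in our receive queue */
    int queued[MAX_SOCKS];   /* indices of the sockets those fds refer to */
    bool garbage;            /* result: member of an unreachable cycle */
};

/* Returns the number of sockets marked as garbage. */
static int gc_model(struct sock_model *s, int n)
{
    bool candidate[MAX_SOCKS], reached[MAX_SOCKS];
    int left[MAX_SOCKS];     /* in-flight refs not explained by candidates */
    bool changed;
    int found = 0;

    assert(n <= MAX_SOCKS);

    /* 1. Candidates: in flight, and every reference is an in-flight one. */
    for (int i = 0; i < n; i++) {
        assert(s[i].total_refs >= s[i].inflight_refs);
        candidate[i] = s[i].inflight_refs > 0 &&
                       s[i].total_refs == s[i].inflight_refs;
        reached[i] = false;
        left[i] = s[i].inflight_refs;
    }

    /* 2. Subtract the references that come from candidates' queues. */
    for (int i = 0; i < n; i++)
        for (int k = 0; candidate[i] && k < s[i].nqueued; k++)
            if (candidate[s[i].queued[k]])
                left[s[i].queued[k]]--;

    /* 3. A candidate with references left is reachable from outside;
     *    give back what it holds, which may make others reachable. */
    do {
        changed = false;
        for (int i = 0; i < n; i++) {
            if (!candidate[i] || reached[i] || left[i] <= 0)
                continue;
            reached[i] = changed = true;
            for (int k = 0; k < s[i].nqueued; k++)
                if (candidate[s[i].queued[k]])
                    left[s[i].queued[k]]++;
        }
    } while (changed);

    /* 4. Whatever is still unreached belongs to an unreachable cycle. */
    for (int i = 0; i < n; i++) {
        s[i].garbage = candidate[i] && !reached[i];
        found += s[i].garbage;
    }
    return found;
}
```

Only sockets with in-flight references ever become candidates, which is why the work is confined to the rare «in flight» sockets rather than to every unix socket in the system.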

Bug #128590.


Patch from mainstream, ported by Kostja (khorenko@).

net: unix: fix inflight counting bug in garbage collector

Previously I assumed that the receive queues of candidates don't change during the GC. This is only half true: nothing can be received from the queues (see comment in unix_gc()), but buffers could be added through the other half of the socket pair, which may still have file descriptors referring to it.

This can result in inc_inflight_move_tail() erroneously increasing the «inflight» counter for a unix socket for which dec_inflight() wasn't previously called. This in turn can trigger the «BUG_ON(total_refs < inflight_refs)» in a later garbage collection run.

Fix this by only manipulating the «inflight» counter for sockets which are candidates themselves. Duplicating the file references in unix_attach_fds() is also needed to prevent a socket becoming a candidate for GC while the skb that contains it is not yet queued.

Bug #128590.


Patch from Konstantin (khlebnikov@):

fix endless loop in x86_64 arch_get_unmapped_area_topdown

If the search hits the hole between the NULL address and the first vma in the mm, and the requested len is equal to the size of this hole, addr becomes NULL and we get an endless loop.

This patch changes the loop exit condition so that the loop terminates in this case.

The same condition is used in newer kernels and in mainstream.

Bug #119137.


Patch from Denis (den@):

rtnl: return EOPNOTSUPP when RTNL message is not supported by the kernel

More precisely, return EOPNOTSUPP for RTM_NEWLINK only and EINVAL for the rest. This is for SUSE11 compatibility.
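
The resulting error selection can be sketched as follows. This is an illustrative user-space model, not the patched kernel function: `rtnl_dispatch_model()`, `doit_fn`, `ok_handler()` and the handler table are hypothetical, and `MODEL_RTM_NEWLINK` is assumed to mirror the value of RTM_NEWLINK from <linux/rtnetlink.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumption: mirrors RTM_NEWLINK from <linux/rtnetlink.h>. */
#define MODEL_RTM_NEWLINK 16

/* Hypothetical handler type: returns 0 or a negative errno value. */
typedef int (*doit_fn)(void);

/* Trivial handler used only to demonstrate the supported case. */
static int ok_handler(void)
{
    return 0;
}

/*
 * Hypothetical dispatch over a table of n handlers indexed by message type.
 * A missing handler means the message is not supported by this kernel.
 */
static int rtnl_dispatch_model(const doit_fn *handlers, size_t n, int msg_type)
{
    assert(handlers != NULL);

    if (msg_type >= 0 && (size_t)msg_type < n && handlers[msg_type])
        return handlers[msg_type]();

    /* Unsupported: EOPNOTSUPP for RTM_NEWLINK only, EINVAL for the rest. */
    return msg_type == MODEL_RTM_NEWLINK ? -EOPNOTSUPP : -EINVAL;
}
```

With an empty handler table, message type 16 yields -EOPNOTSUPP and any other type yields -EINVAL; once a handler such as `ok_handler` is registered for a type, its return value is passed through unchanged.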

Bug #115250.

Signed-off-by: Denis V. Lunev <>


Patch from Denis (den@), ported by Kostja (khorenko@):

ub: get parent UB instead of sub-group one to calculate usage

When MEMINFO="privvmpages:1" is set on an SLM-enabled system, the actual VE usage should be taken from the whole group rather than from a sub-group. Follow the UBC hierarchy to the root for this.
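
Following the hierarchy to the root might look roughly like the sketch below. The `ub_model` structure, its `parent` and `privvmpages_held` fields, and both helper functions are hypothetical simplifications; the sketch also assumes that the top-level beancounter's counter already covers all of its sub-groups.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a user beancounter. */
struct ub_model {
    struct ub_model *parent;         /* NULL for the top-level (whole-VE) group */
    unsigned long privvmpages_held;  /* assumed to include sub-group usage */
};

/* Walk the parent links up to the root of the hierarchy. */
static const struct ub_model *ub_top(const struct ub_model *ub)
{
    assert(ub != NULL);
    while (ub->parent)
        ub = ub->parent;
    return ub;
}

/* Report usage for the whole VE, whichever group the caller belongs to. */
static unsigned long ve_privvmpages_usage(const struct ub_model *ub)
{
    return ub_top(ub)->privvmpages_held;
}
```

A task charged to a sub-group then sees the usage of the whole group in its meminfo output, not only the part charged to its own sub-group.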

Bug #118541.


Patch from Kostja (khorenko@):

fix typo: parameters order in printk

The typo was in the patch for OpenVZ Bug #760.