Difference between revisions of "Containers/Network virtualization"

From OpenVZ Virtuozzo Containers Wiki

Latest revision as of 16:47, 14 January 2010

There are several approaches to network virtualization, driven by the differing requirements of different usages. This page summarizes them with the aim of arriving at a solution suitable for all.

Usages

Currently known usages are:

  • Virtual Environments - a complete OS environment, with its own users, groups, filesystems and devices;
  • Application Containers - a partly isolated environment with an application inside.

Approaches

Virtualization on the 2nd level (OpenVZ)

Requirements

The main requirement is that containers should have networking capabilities close to those of standalone servers. In detail:

  1. containers should have their own loopback;
  2. containers should be able to set up their own level 3 addresses;
  3. containers should be able to sniff their own traffic;
  4. containers should be able to set up their own routes;
  5. containers should be able to receive multicast/broadcast packets;
  6. containers should have their own netfilters;
  7. containers should have at least one level 2 device.


Current implementation

For input packets, the context switch is performed in netif_receive_skb(), and the context is inherited from the receiving device. For output packets, the context is inherited from the socket.
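
The rule above can be pictured with a small toy model. This is a sketch only, not OpenVZ code: the `Device`, `Socket` and `Packet` classes, the `receive` and `send` functions, and the device and container names are all hypothetical and introduced here for illustration. The only thing taken from the text is that an input packet takes its context from the receiving device and an output packet takes it from the sending socket.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    context: str  # container that owns this (virtual) device


@dataclass(frozen=True)
class Socket:
    port: int
    context: str  # container whose process created the socket


@dataclass(frozen=True)
class Packet:
    payload: bytes
    context: str  # container context the packet is processed in


def receive(dev: Device, payload: bytes) -> Packet:
    """Input path: the packet inherits the context of the receiving device."""
    return Packet(payload, dev.context)


def send(sock: Socket, payload: bytes) -> Packet:
    """Output path: the packet inherits the context of the sending socket."""
    return Packet(payload, sock.context)


# Hypothetical device and socket belonging to two different containers.
dev_ct101 = Device("veth101.0", context="ct101")
sock_ct102 = Socket(port=80, context="ct102")

incoming = receive(dev_ct101, b"ping")
outgoing = send(sock_ct102, b"pong")
print(incoming.context, outgoing.context)
```

In this model the context is attached to the device, so a container needs at least one device of its own for input to reach it.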

Virtualization on the 3rd level (IBM)

Requirements

  1. servers can run in several containers listening on *:port without conflict and __without__ forcing the bind to use the IP address assigned to the container;
  2. the source address is filled in with the container's IP address;
  3. sockets are kept isolated by namespace;
  4. the loopback is isolated;
  5. performance is as close to native as possible;
  6. broadcast and multicast work.
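
Requirements 1 and 3 above can be illustrated with a toy bind table that is keyed by namespace as well as by port. This is a minimal sketch under assumptions, not the IBM patch set: `SocketTable`, `PortInUse`, `bind_any` and `lookup` are hypothetical names, and a real implementation would also track addresses and protocols.

```python
class PortInUse(Exception):
    """Raised when a wildcard bind collides inside one namespace."""


class SocketTable:
    """Toy bind table keyed by (namespace, port) instead of port alone."""

    def __init__(self):
        self._bound = {}  # (namespace, port) -> owner label

    def bind_any(self, namespace, port, owner):
        """Model a bind to *:port made from inside `namespace`."""
        key = (namespace, port)
        if key in self._bound:
            raise PortInUse(f"{namespace}: port {port} already bound")
        self._bound[key] = owner

    def lookup(self, namespace, port):
        """Return the owner bound to the port in this namespace, or None."""
        return self._bound.get((namespace, port))


table = SocketTable()
table.bind_any("ct-a", 80, "httpd in ct-a")
table.bind_any("ct-b", 80, "httpd in ct-b")  # same port, other namespace: no conflict

try:
    table.bind_any("ct-a", 80, "second httpd in ct-a")
    conflict = False
except PortInUse:
    conflict = True
print(conflict)
```

Because the namespace is part of the key, neither server has to bind to its container's IP address to avoid a clash with the other.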

Current implementation

For input packets, the context is inherited from the routing entry; for output packets, it is inherited from the socket.
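
One way to picture input-side selection by routing entry is a route list in which every entry is tagged with a container context, and the most specific matching entry decides. This is a hedged sketch only: the `ROUTES` table, the addresses, the context names and the `input_context` function are assumptions made for illustration and are not taken from the actual implementation.

```python
import ipaddress

# Hypothetical routing entries, each tagged with the context that owns it.
ROUTES = [
    (ipaddress.ip_network("10.0.1.0/24"), "ct-a"),
    (ipaddress.ip_network("10.0.1.5/32"), "ct-b"),
    (ipaddress.ip_network("0.0.0.0/0"), "host"),
]


def input_context(dst):
    """Return the context of the longest-prefix route matching `dst`."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, ctx) for net, ctx in ROUTES if addr in net]
    if not matches:
        return None  # no routing entry, so no context can be inherited
    _, ctx = max(matches, key=lambda m: m[0].prefixlen)
    return ctx


print(input_context("10.0.1.5"), input_context("10.0.1.7"), input_context("192.0.2.1"))
```

In this model no per-container device is needed: the destination address alone selects the context.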

Sockets isolation (Linux-VServer)

Requirements

  1. all interfaces and IPs are visible on the host;
  2. routing and iptables are configured on the host;
  3. each guest has a subset of IPs assigned for 'binding';
  4. the source IP of guest packets is within the assigned set;
  5. 'local' guest traffic is isolated from other guests;
  6. there is no measurable overhead on packet routing;
  7. normal routing is not impaired (same behaviour as without isolation);
  8. guest-guest and guest-host traffic goes via loopback.

Current implementation

A network context carries an 'assigned' set of IPs, which is used for 'collision' checks at bind time, 'source' checks at send time and 'destination' checks at receive time. The first assigned IP is handled specially, as it is used for routing decisions outside the IP set. Loopback traffic isolation is done via IP 'remapping'.
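
The three checks can be sketched as follows. This is an illustrative model, not Linux-VServer code: `NetworkContext`, `Host`, `source_for_send` and `may_receive` are hypothetical names. Two behaviours are assumptions made here rather than facts from the text: a wildcard bind is expanded to every IP assigned to the guest, and the first assigned IP is used as the source when none is requested. Loopback remapping is not modelled.

```python
class NetworkContext:
    """A guest's network context: a name plus an ordered list of assigned IPs."""

    def __init__(self, name, ips):
        self.name = name
        self.ips = list(ips)  # ips[0] is the "first assigned IP"

    def allows(self, ip):
        return ip in self.ips


class Host:
    """Single socket table shared by all guests on the host."""

    def __init__(self):
        self.bound = {}  # (ip, port) -> context name

    def bind(self, ctx, ip, port):
        """'Collision' check at bind time."""
        # Assumption: a wildcard bind covers only the guest's own IPs.
        targets = ctx.ips if ip == "0.0.0.0" else [ip]
        for t in targets:
            if not ctx.allows(t):
                raise PermissionError(f"{t} is not assigned to {ctx.name}")
            if (t, port) in self.bound:
                raise OSError(f"{t}:{port} already in use")
        for t in targets:
            self.bound[(t, port)] = ctx.name


def source_for_send(ctx, requested=None):
    """'Source' check at send time; falls back to the first assigned IP."""
    if requested is None:
        return ctx.ips[0]
    if not ctx.allows(requested):
        raise PermissionError(f"{requested} is not assigned to {ctx.name}")
    return requested


def may_receive(ctx, dst):
    """'Destination' check at receive time."""
    return ctx.allows(dst)


host = Host()
g1 = NetworkContext("guest1", ["10.0.0.1"])
g2 = NetworkContext("guest2", ["10.0.0.2", "10.0.0.3"])
host.bind(g1, "0.0.0.0", 80)
host.bind(g2, "0.0.0.0", 80)  # disjoint IP sets, so no collision
print(sorted(host.bound))
```

Since the checks compare addresses against the guest's assigned set, the packet path itself is left unchanged, in line with the requirement that routing carries no measurable overhead.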


Virtualization table

This summary table shows which core networking objects are virtualized or isolated in each of the above approaches, and which are not.

Virtualization approach  | network devices | routing tables | network sockets | loopback | netfilters
2nd level virtualization | v               | v/i            | v               | v        | v
3rd level virtualization | -               | i              | i               | i        | -
sockets isolation        | -               | -              | i               | -        | -

Legend:

  • 'v' - virtualized
  • 'i' - isolated
  • '-' - neither virtualized nor isolated