Containers/Network virtualization

From OpenVZ Virtuozzo Containers Wiki

Latest revision as of 16:47, 14 January 2010

There are a number of approaches to network virtualization, driven by the differing requirements of different usages. This page summarizes them, with the aim of arriving at a solution suitable for all.

Usages

Currently known usages are:

  • Virtual Environments - a complete OS environment, with its own users, groups, filesystems and devices;
  • Application Containers - a partly isolated environment with an application inside.

Approaches

Virtualization on the 2nd level (OpenVZ)

Requirements

The main requirement is that containers should have networking capabilities close to those of a standalone server. In detail:

  1. containers should have their own loopback;
  2. containers should be able to set up their own level 3 addresses;
  3. containers should be able to sniff their own traffic;
  4. containers should be able to set up their own routes;
  5. containers should be able to receive multicast/broadcast packets;
  6. containers should have their own netfilters;
  7. containers should have at least one level 2 device.


Current implementation

For input packets, the context switch is performed in netif_receive_skb(), and the context is inherited from the receiving device. For output packets, the context is inherited from the sending socket.
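
The rule above can be illustrated with a small user-space model. This is only a sketch, not kernel code: the `Device`, `Socket` and `Packet` classes, the `receive` and `send` helpers, and the container and device names are all hypothetical and exist only to show where a packet gets its context from in each direction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Device:
    name: str
    container: str  # hypothetical: the container that owns this device


@dataclass(frozen=True)
class Socket:
    port: int
    container: str  # hypothetical: the container that created this socket


@dataclass
class Packet:
    payload: bytes
    container: Optional[str] = None  # context is unknown until assigned


def receive(packet: Packet, dev: Device) -> Packet:
    # Input path: the packet adopts the context of the receiving device.
    packet.container = dev.container
    return packet


def send(packet: Packet, sock: Socket) -> Packet:
    # Output path: the packet adopts the context of the sending socket.
    packet.container = sock.container
    return packet


veth = Device(name="veth101.0", container="ct101")
web = Socket(port=80, container="ct102")

inbound = receive(Packet(b"request"), veth)
outbound = send(Packet(b"reply"), web)
print(inbound.container, outbound.container)
```

In this model a packet never needs to be inspected to find its container; ownership of the device or socket decides it, which is why this approach requires at least one level 2 device per container.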

Virtualization on the 3rd level (IBM)

Requirements

  1. servers in several containers can listen on *:port without conflict and without forcing the bind to use the IP address assigned to the container;
  2. the source address is filled in with the container IP address;
  3. sockets are kept isolated by namespace;
  4. the loopback is isolated;
  5. performance is as close to native as possible;
  6. broadcast and multicast work.
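
Requirements 1 to 3 can be pictured as a bind table keyed by namespace as well as port. The sketch below is a toy model under that assumption, not the actual implementation: `BindTable`, `PortInUse`, `bind_any`, `source_address` and the addresses used are hypothetical names chosen for illustration.

```python
class PortInUse(Exception):
    """Raised when a port is already bound inside the same namespace."""


class BindTable:
    def __init__(self, container_ips):
        # Hypothetical mapping: namespace name -> IP assigned to that container.
        self._container_ips = dict(container_ips)
        self._bound = set()  # set of (namespace, port) pairs

    def bind_any(self, namespace, port):
        # A wildcard (*:port) bind conflicts only within the same namespace.
        key = (namespace, port)
        if key in self._bound:
            raise PortInUse(f"{namespace} already listens on *:{port}")
        self._bound.add(key)

    def source_address(self, namespace):
        # Outgoing packets are stamped with the container's own IP.
        return self._container_ips[namespace]


table = BindTable({"ct1": "10.0.0.1", "ct2": "10.0.0.2"})
table.bind_any("ct1", 80)
table.bind_any("ct2", 80)  # no conflict: different namespace

try:
    table.bind_any("ct1", 80)
    conflict = False
except PortInUse:
    conflict = True

print(conflict, table.source_address("ct2"))
```

Because the namespace is part of the key, the application never has to bind to its container's address explicitly.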

Current implementation

For input packets, the context is inherited from the routing entry; for output packets, it is inherited from the socket.
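
One way to picture the input side is a routing table in which every entry carries a context tag, so that the ordinary destination lookup also selects the container. The sketch below assumes exactly that and nothing more: the `ROUTES` table, the `context_for_input` helper, the prefixes and the context names are hypothetical, and the longest-prefix match is a simplification of real route lookup.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, context owning the route).
ROUTES = [
    (ipaddress.ip_network("10.0.0.0/16"), "host"),
    (ipaddress.ip_network("10.0.1.0/24"), "ct1"),
    (ipaddress.ip_network("10.0.2.5/32"), "ct2"),
]


def context_for_input(destination):
    """Return the context of the most specific route matching destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, ctx) for net, ctx in ROUTES if addr in net]
    if not matches:
        return None  # no route, so no context can be inherited
    # Longest-prefix match: the most specific entry wins.
    _, ctx = max(matches, key=lambda entry: entry[0].prefixlen)
    return ctx


for dst in ("10.0.1.7", "10.0.2.5", "10.0.9.9", "192.168.1.1"):
    print(dst, context_for_input(dst))
```

No per-container device is involved here, which matches the '-' under network devices for this approach in the table below.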

Sockets isolation (Linux-VServer)

Requirements

  1. all interfaces and IPs are visible on the host;
  2. routing and iptables are configured on the host;
  3. each guest has a subset of IPs assigned for 'binding';
  4. the source IP of guest packets is within the assigned set;
  5. 'local' guest traffic is isolated from other guests;
  6. no measurable overhead on packet routing;
  7. normal routing is not impaired (same behaviour as without the isolation);
  8. guest-guest and guest-host traffic goes via loopback.

Current implementation

Each guest has a network context with an 'assigned' set of IPs, which is used for 'collision' checks at bind time, 'source' checks at send time, and 'destination' checks at receive time. The first assigned IP is handled specially, as it is used for routing decisions outside the IP set. Loopback traffic isolation is done via IP 'remapping'.
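
The three checks can be sketched as follows. This is a rough model, not Linux-VServer code: the `NetworkContext` class, the `bind` helper and the addresses are hypothetical. Two behaviours are assumptions made for the example: a wildcard bind is expanded to the guest's assigned set, and an out-of-set source address falls back to the first assigned IP. Loopback remapping is not modelled.

```python
WILDCARD = "0.0.0.0"


class NetworkContext:
    def __init__(self, name, ips):
        self.name = name
        self.ips = list(ips)  # the 'assigned' set; the first entry is special

    def bind_addresses(self, requested):
        # Assumption: a wildcard bind covers only the assigned set.
        if requested == WILDCARD:
            return list(self.ips)
        return [requested] if requested in self.ips else []

    def source_for(self, requested):
        # 'Source' check at send time. Assumption: addresses outside the
        # set are replaced by the first assigned IP.
        return requested if requested in self.ips else self.ips[0]

    def accepts(self, destination):
        # 'Destination' check at receive time.
        return destination in self.ips


def bind(bound, ctx, requested, port):
    """'Collision' check at bind time; bound is a shared set of (ip, port)."""
    wanted = {(ip, port) for ip in ctx.bind_addresses(requested)}
    if not wanted or wanted & bound:
        return False  # address not assigned to this guest, or already taken
    bound.update(wanted)
    return True


guest_a = NetworkContext("a", ["10.0.0.1", "10.0.0.2"])
guest_b = NetworkContext("b", ["10.0.0.3"])
bound = set()

print(bind(bound, guest_a, WILDCARD, 80))    # guest a takes its own IPs
print(bind(bound, guest_b, WILDCARD, 80))    # guest b is unaffected
print(bind(bound, guest_b, "10.0.0.1", 80))  # not in guest b's set
print(guest_a.source_for("192.0.2.9"))       # falls back to the first IP
```

All checks are plain set lookups between the process context and the socket, so packets themselves never switch context.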


Virtualization table

This summary table shows which core networking objects are virtualized or isolated in each of the above approaches, and which are not.

Virtualization approach  | network devices | routing tables | network sockets | loopback | netfilters
2nd level virtualization | v               | v/i            | v               | v        | v
3rd level virtualization | -               | i              | i               | i        | -
sockets isolation        | -               | -              | i               | -        | -

Legend:

  • 'v' - virtualized
  • 'i' - isolated
  • '-' - neither virtualized nor isolated