NetSlices: Scalable Packet Processing in User-Space at 10GbE Line Rate


Modern commodity operating systems do not provide developers with user-space abstractions for building high-speed packet processing applications. The conventional raw socket is inefficient and unable to take advantage of emerging hardware, such as multi-core processors and multi-queue network adapters. Unlike the conventional raw socket, the new NetSlice operating system abstraction tightly couples the hardware and software packet processing resources, and gives the application control over these resources. To reduce shared resource contention, NetSlice performs domain-specific, coarse-grained, spatial partitioning of CPU cores, memory, and NICs. Moreover, it provides a streamlined communication channel between NICs and user-space. While remaining backward compatible with the conventional socket API, the NetSlice API also provides batched (multi-) send / receive operations to amortize the cost of protection domain crossings. We show that complex user-space packet processors (such as a protocol accelerator and an IPsec gateway) built from commodity components can scale linearly with the number of cores and operate at nominal 10Gbps network line speeds.




NetSlices kernel extension (module) and user-space apps


  1. Download and build the snapshot ('make'). Make sure you have the Linux headers installed, e.g. in Ubuntu the package is called linux-headers-2.6.xx-xx, where xx-xx should match your current kernel (type 'uname -r' to find it out). The kernel module was tested on kernel versions 2.6.20, 2.6.24, 2.6.27, and 2.6.28; the first three are found, for example, in stock Ubuntu 7.04, 8.04, and 8.10 respectively. The module was also ported to kernel versions up to 2.6.36, but it has not yet been thoroughly tested in these configurations.

  2. You will need at least one 10GbE NIC with multiqueue capabilities. NetSlice was tested with the Myricom Myri-10G (10G-PCIE-8B) and the Intel 82598EB 10-Gigabit AT NICs. NetSlice requires that the NIC drivers be loaded and configured with multiqueue support (e.g. the myri10ge driver is loaded with myri10ge_max_slices="${SLICES}", where "${SLICES}" is the number of NetSlice contexts; for a 16-core machine, "${SLICES}" defaults to 8).

  3. The MSI-X interrupts from each NIC queue should be statically assigned exclusively to NetSlice contexts. You can simply use the script that is distributed in the NetSlice bundle (e.g. assuming your box forwards packets between two interfaces named eth4 and eth5: 'for eth in eth4 eth5; do bash "${eth}" 1 0; done').

  4. Load the kernel module ('sudo insmod ./netslice_mod.ko').

  5. Create the NetSlice character devices if they are not already present (typically, once created, they survive reboots, since the kernel will assign the NetSlice kernel module the same character device major number). To find out the major number assigned by the kernel, type 'grep netslice_dev /proc/devices'. Assuming the major number is, say, 250, create the device special files: 'for ((minor=0;minor<"${SLICES}";minor++)); do mknod "/dev/netslice/node-${minor}" c 250 "${minor}"; done'.

  6. Start the NetSlice user-space app(s) (e.g. 'sudo ./netslice_test -v 128 /dev/netslice/node-0'). Typically you will have to start one user-space app for each NetSlice context, e.g.: let LAST_SLICE="${SLICES}-1"; pexec -c -r `seq -s " " 0 ${LAST_SLICE}` -e slice -o - -u - 'sudo ./netslice_test -v 128 "/dev/netslice/node-${slice}"'.
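
Steps 5 and 6 can be collected into a small helper script. The following is only a sketch under stated assumptions: the function names (netslice_major, run, create_nodes, start_apps) and the DRY_RUN and DEV_DIR variables are illustrative and not part of the NetSlice bundle, and plain shell background jobs ('&' followed by 'wait') stand in for pexec. By default the sketch only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of steps 5 and 6. All names below (functions, DRY_RUN, DEV_DIR)
# are hypothetical helpers, not part of the NetSlice distribution.

SLICES="${SLICES:-8}"                 # number of NetSlice contexts
DEV_DIR="${DEV_DIR:-/dev/netslice}"   # directory holding the device files
DRY_RUN="${DRY_RUN:-1}"               # 1: print commands instead of running them

# Print the major number registered for netslice_dev.
# $1: file in /proc/devices format (defaults to /proc/devices).
netslice_major() {
    awk '$2 == "netslice_dev" { print $1; exit }' "${1:-/proc/devices}"
}

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 5: create one character device per context, skipping existing ones.
# $1: major number, e.g. the output of netslice_major.
create_nodes() {
    run mkdir -p "${DEV_DIR}"
    minor=0
    while [ "${minor}" -lt "${SLICES}" ]; do
        if [ ! -e "${DEV_DIR}/node-${minor}" ]; then
            run mknod "${DEV_DIR}/node-${minor}" c "$1" "${minor}"
        fi
        minor=$((minor + 1))
    done
}

# Step 6: start one netslice_test per context in the background,
# then wait for all of them to exit.
start_apps() {
    slice=0
    while [ "${slice}" -lt "${SLICES}" ]; do
        run sudo ./netslice_test -v 128 "${DEV_DIR}/node-${slice}" &
        slice=$((slice + 1))
    done
    wait
}
```

A possible use, after the module is loaded: source the file, then run 'create_nodes "$(netslice_major)"' followed by 'start_apps'. Set DRY_RUN=0 (as root, since mknod requires it) once the printed commands look right. If netslice_major prints nothing, the module has not registered its character device.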