Processor Controlled Off-Processor I/O
The performance of modern RISC processors on operating system code is well below their performance on application code. The kernel code implementing communication services across the network is no exception. Modern networking technologies are characterized by a small packet size, which further increases the communication overhead. We took the approach of removing the kernel layer from the cross-machine communication path while still providing protection. The presence of a programmable communication processor on the network adapter made this experiment possible. The firmware running on the communication processor implements a Virtual Communication Machine (VCM); applications communicate with the VCM through shared memory without having to switch to kernel mode. Data is transferred directly between application buffers and the network without any intermediate buffering in the user or kernel spaces. The VCM architecture makes this possible; in particular, the VCM can be programmed to access any location in the address space of an application. The main processor controls the communication but is not directly involved in it; as a consequence, the overhead on the main processor is very low. The design not only provides very low latencies, but also minimizes the effect of communication on the main processor's data caches. We implemented the datagram subset of the Berkeley sockets interface on top of the VCM interface and integrated it with a user-level thread package. Multicast capabilities were added to the interface. Performance measured at both the VCM and socket layers is presented.
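The abstract does not specify the VCM command format, so the following is only a minimal sketch of the general idea of controlling an adapter through shared memory: the application writes send descriptors into a ring, and the adapter side polls the ring and reads the payload in place from the application buffer. The descriptor layout, the ring size, and the names `CommandRing`, `post_send`, and `poll` are assumptions made for illustration, not the report's interface.

```python
import struct

# Hypothetical layout (not taken from the report): two 32-bit counters
# followed by a fixed-size ring of send descriptors, all in one region
# that stands in for memory shared by application and adapter.
DESC = struct.Struct("<IIH")   # buffer offset, length, destination id
SLOTS = 8                      # power of two, so counter wraparound is safe
HEAD_OFF, TAIL_OFF, RING_OFF = 0, 4, 8
MASK = 0xFFFFFFFF


class CommandRing:
    def __init__(self):
        self.mem = bytearray(RING_OFF + SLOTS * DESC.size)

    def _get(self, off):
        return struct.unpack_from("<I", self.mem, off)[0]

    def _set(self, off, value):
        struct.pack_into("<I", self.mem, off, value & MASK)

    def post_send(self, offset, length, dest):
        """Application side: queue a descriptor without any kernel call."""
        head, tail = self._get(HEAD_OFF), self._get(TAIL_OFF)
        if (head - tail) & MASK == SLOTS:
            return False  # ring full, caller retries later
        slot = RING_OFF + (head % SLOTS) * DESC.size
        DESC.pack_into(self.mem, slot, offset, length, dest)
        self._set(HEAD_OFF, head + 1)  # publish only after the slot is written
        return True

    def poll(self, app_buffer):
        """Adapter side: take one descriptor and view the payload in place."""
        head, tail = self._get(HEAD_OFF), self._get(TAIL_OFF)
        if head == tail:
            return None  # nothing queued
        slot = RING_OFF + (tail % SLOTS) * DESC.size
        offset, length, dest = DESC.unpack_from(self.mem, slot)
        self._set(TAIL_OFF, tail + 1)
        # memoryview slices do not copy, mirroring the absence of
        # intermediate buffering described above.
        return dest, memoryview(app_buffer)[offset:offset + length]
```

In this model the application calls `post_send` with an offset into its own buffer, and a later `poll` returns the destination together with a view of the same bytes. A real adapter would also need memory barriers and an address-translation and protection check, which the sketch leaves out.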
computer science; technical report