Hypervisors
One of the main features of Cloud Computing is dynamic resource allocation. At the single-machine level, one first creates virtual (abbreviated “v.”) machines on a physical machine. The v. machines on all of the physical machines are then networked together and controlled by a master controller that handles the resource allocations within the cloud.
In order to understand this, it is first necessary to understand what happens on a single machine. VMware’s implementation of this is a standard, and the following notes are taken primarily from wandering through their excellent website www.vmware.com to understand their hypervisors, ESX and the newer and far better incarnation ESXi. These notes are a little more like general requirements for a general hypervisor than the VMware hypervisors’ functionality; hence, these notes don’t correspond precisely to their specific capabilities.
A hypervisor installs directly on top of the physical server and creates multiple virtual machines that can run simultaneously, sharing the physical resources of the underlying server. For efficiency, security, and reliability, a hypervisor needs to be quite small and simple. VMware’s ESXi is less than 100MB. On the other hand, it needs to run scripts for automated installations and maintenance, and it needs a remote management interface, all of which adds a little to the complexity. These notes are broken down into:
- Physical machine support
- Virtual machines and other virtual support
- Resource management
- Operating system support
- Security
- Hardware realities
[Note: VMware ESXi is available as a free download for deployment as a single-server virtualization solution. One can use the freely available VMware vSphere™ Client to command VMware ESXi to create and manage its virtual machines.]
1. Physical machine support (ESXi limits are in parentheses):
- 64-bit support on rack, tower and blade servers from Dell, Fujitsu Siemens, HP, IBM, NEC, Sun Microsystems and Unisys.
- Physical cores (64)
- Physical memory (1TB)
- Transactions/sec (8,900)
- I/O ops/sec (200,000)
- iSCSI, 10Gb Ethernet, InfiniBand, Fibre Channel, and converged network adapters
- SAN multipathing
- Support storage systems from all major vendors.
- Internal SATA drives, Direct Attached Storage (DAS), Network Attached Storage (NAS) and both fibre channel SAN and iSCSI SAN.
- Remote (security) management with granular visibility into v. machine hardware resources, e.g. memory, v. cores, keyboards, disk, and I/O. This visibility helps protect against viruses, Trojans, and key-loggers.
- Energy efficiency with dynamic voltage and frequency scaling and support for Intel SpeedStep and AMD PowerNow!
- Support for next-generation virtualization hardware assist technologies such as AMD’s Rapid Virtualization Indexing® or Intel’s Extended Page Tables.
- Support large memory pages to improve efficiency of memory access for guest operating systems.
- Support performance offload technologies, e.g., TCP Segmentation Offloading (TSO), VLAN, checksum offloading, and jumbo frames to reduce the CPU overhead associated with processing network I/O.
- Support virtualization optimized I/O performance features such as NetQueue, which significantly improves performance in 10 Gigabit Ethernet virtualized environments.
2. Virtual Support
- v. machines (256) per physical machine
- v. RAM (255 GB) per v. machine
- v. SMP (each v. machine can use up to 8 cores simultaneously)
- Direct access from selected v. machines to physical I/O devices and SAN LUNs.
- v. disks (limit ???)
- v. file systems allow multiple v. machines to access a single v. disk file simultaneously.
- Remote boot a v. machine from a v. disk on a SAN.
- Multiple v. NICs per v. machine each with its own IP and MAC address.
- v. InfiniBand channels between applications running on v. machines. Of course, these v. channels do not have to be restricted to the same physical machine.
- v. switches to support v. networks and v. LANs among v. machines.
- Support Linux and Microsoft v. clusters of v. machines.
3. Resource Management
- Dynamically manage resource allocations while v. machines are running, subject to minimum, maximum, and proportional resource shares for physical CPU, memory, disk, and network bandwidth.
- Intelligent process scheduling and load balancing across all available physical CPUs.
- Allow physical page sharing by v. machines so that a physical page is not duplicated across the system.
- Shift RAM dynamically from idle v. machines to active ones, forcing the idle ones to use their own paging areas and to release memory for the active ones.
- Allocate physical network bandwidth to network traffic between v. machines to meet peak and average bandwidth and burst size constraints.
- Provide “priority” network access to “critical” v. machines.
- Support failover for v. NICs, for v. machines, and for v. network connections to enhance availability.
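To make the "minimum, maximum, and proportional shares" idea above concrete, here is a minimal Python sketch of one way such an allocator could work. It is not VMware's algorithm; the names VMPolicy and allocate are made up for illustration, and the units (MHz, MB, Mb/s) are whatever the resource happens to be. Every v. machine first gets its minimum, then the leftover capacity is split in proportion to shares, and anything a v. machine cannot use because of its maximum is handed back to the others.

```python
from dataclasses import dataclass


@dataclass
class VMPolicy:
    """Hypothetical per-v.-machine policy for one physical resource."""
    name: str
    minimum: float  # guaranteed amount
    maximum: float  # hard cap
    shares: int     # relative weight for capacity beyond the minimums


def allocate(capacity, vms):
    """Return {name: amount}: minimums first, the rest by shares, capped at maximums."""
    alloc = {vm.name: vm.minimum for vm in vms}
    remaining = capacity - sum(alloc.values())
    if remaining < 0:
        raise ValueError("minimums exceed physical capacity")

    # Only v. machines that still have room under their maximum compete.
    active = [vm for vm in vms if alloc[vm.name] < vm.maximum]
    while remaining > 1e-9 and active:
        total_shares = sum(vm.shares for vm in active)
        if total_shares == 0:
            break
        handed_out = 0.0
        still_active = []
        for vm in active:
            offer = remaining * vm.shares / total_shares
            room = vm.maximum - alloc[vm.name]
            grant = min(offer, room)
            alloc[vm.name] += grant
            handed_out += grant
            if grant < room:
                still_active.append(vm)
        # Whatever capped v. machines could not take is redistributed next round.
        remaining -= handed_out
        active = still_active
    return alloc


if __name__ == "__main__":
    policies = [
        VMPolicy("web", minimum=100, maximum=300, shares=2),
        VMPolicy("db", minimum=100, maximum=1000, shares=1),
        VMPolicy("batch", minimum=0, maximum=1000, shares=1),
    ]
    print(allocate(1000, policies))
```

In the example, "web" has the most shares but hits its maximum, so the capacity it cannot use flows to "db" and "batch". A real hypervisor would rerun something like this continuously as demand changes.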
4. Operating System Support
A hypervisor should support a wide variety of guest operating systems on its v. machines. It is reasonable to require an OS modification (to get a v. OS) to run on a v. machine; however, applications that don’t directly access the hardware should run on a v. OS without any modification. The v. operating systems should be allowed to call the hypervisor with a special call to improve performance and efficiency or to avoid difficult-to-virtualize aspects of a physical machine. Such calls are usually hypervisor specific and are currently called “paravirtualization.”
VMware has a “standard” paravirtualization interface called its “Virtual Machine Interface” that can be supported by an OS to allow a single binary version of the OS to run either on native hardware or on a VMware hypervisor.
Most networking applications are at the application layer and hence will run without change. For Cloud Computing, the InfiniBand read and write commands that send and receive data between two applications’ buffer pairs are examples of such application-layer operations. More on this in a subsequent post.
5. Security
There should be v. hardware support to check digitally signed v. kernel modules upon load.
There should also be v. memory integrity techniques at load-time coupled with v. hardware capabilities to protect the v. OS from common buffer-overflow attacks used to exploit running code.
The hypervisor should secure iSCSI devices from unwanted intrusion by requiring that either the host or the iSCSI initiator be authenticated by the iSCSI device or the target whenever the host attempts to access data on the target LUN.
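The paragraph above does not name an authentication mechanism. The one commonly used with iSCSI is CHAP, so the sketch below assumes CHAP with MD5 as described in RFC 1994: the response is the hash of the identifier octet, the shared secret, and the challenge. The function names and the sample secret are hypothetical, and this is only the hash exchange, not a full iSCSI login.

```python
import hashlib
import hmac
import os


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """CHAP/MD5 response: MD5(identifier octet + shared secret + challenge)."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def target_verify(identifier: int, secret: bytes, challenge: bytes,
                  response: bytes) -> bool:
    """Target side: recompute the response and compare in constant time."""
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)


if __name__ == "__main__":
    secret = b"example-shared-secret"  # hypothetical; configured on both sides
    challenge = os.urandom(16)         # sent by the target to the initiator
    identifier = 1

    response = chap_response(identifier, secret, challenge)  # initiator side
    print(target_verify(identifier, secret, challenge, response))
```

The secret itself never crosses the wire, only the challenge and the hash. Running the same exchange in the other direction lets the initiator authenticate the target as well.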
The hypervisor should disallow promiscuous mode sniffing of network traffic, MAC address changes, and forged source MAC transmits.
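As a rough illustration of those three rules, here is a small Python sketch of the checks a v. switch might apply to each v. NIC port. The classes VPort and Frame and the two functions are invented for this note and are not any hypervisor's API; multicast is ignored to keep it short.

```python
from dataclasses import dataclass

BROADCAST = "ff:ff:ff:ff:ff:ff"


@dataclass
class VPort:
    """Hypothetical v. switch port attached to one v. NIC."""
    assigned_mac: str                # MAC the hypervisor gave this v. NIC
    effective_mac: str               # MAC the guest OS currently claims
    wants_promiscuous: bool = False  # guest asked to see all traffic


@dataclass
class Frame:
    src_mac: str
    dst_mac: str


def allow_transmit(port: VPort, frame: Frame) -> bool:
    """Forged transmits: drop frames whose source MAC is not the assigned one."""
    return frame.src_mac.lower() == port.assigned_mac.lower()


def allow_receive(port: VPort, frame: Frame) -> bool:
    """Deliver only broadcasts and frames addressed to the assigned MAC.

    A promiscuous-mode request is ignored, and a port whose guest has
    changed its MAC address receives nothing.
    """
    assigned = port.assigned_mac.lower()
    if port.effective_mac.lower() != assigned:
        return False
    dst = frame.dst_mac.lower()
    return dst == BROADCAST or dst == assigned


if __name__ == "__main__":
    port = VPort("00:50:56:00:00:01", "00:50:56:00:00:01", wants_promiscuous=True)
    print(allow_transmit(port, Frame("00:50:56:00:00:99", BROADCAST)))            # forged source
    print(allow_receive(port, Frame("00:50:56:00:00:02", "00:50:56:00:00:03")))   # someone else's traffic
```

Because the hypervisor sits between every v. NIC and the wire, it can enforce these checks no matter what the guest OS does.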
Once the small hypervisor is well secured, running all the user applications on v. machines should provide GREATER security than if they were running on real servers!
6. Hardware Realities
Of course, virtual machines go back at least as far as the VM operating system on the IBM 360, and Wikipedia has a nice history of them and also a nice discussion of paravirtualization and its predecessors. The early x86 architecture was at best challenging and at worst impossible to virtualize, and there were multiple attempts and academic papers on this topic. Intel made things worse with the ugly 80286 processor, but tried somewhat to fix things with the 80386. When I was at Digital Equipment Corporation, we had a meeting at Intel with the chief 80386 designer, who was cognizant of VM efforts but only mildly sympathetic. In any case, around 2005-2006 both Intel and AMD made some serious efforts to support virtual machines. Neither Intel nor AMD implements this support in all of its processor products, presumably preferring to charge more for those processors that have the support. Caveat emptor.
-gayn