The topics of Big Data and Cloud Computing have become far too big to cover in a one-hour talk. Choosing a slice through them was a challenge, but putting a Low Latency spin on it seems to open up a formerly taboo slice.
After an introduction to Big Data, Clouds, and Hadoop, the talk discusses low latency and concludes that if you need low latency, you probably need a private cloud.
That said, it appears that there are now people working on the Hadoop infrastructure who are interested in low latency, and it will certainly be interesting to see how that shakes out. The optimization of keeping MapReduce processes in memory for subsequent execution on new data was added by Google in the early days of MapReduce. This helps, but it doesn’t do everything, and it certainly doesn’t address the need for special hardware (switches, firewalls, fiber-optic boosters and repeaters, motherboard paths to I/O, caching, and of course nanosecond clocks).
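The keep-processes-in-memory idea can be sketched roughly as follows. All names here are illustrative, not Google's or Hadoop's actual API: a long-lived "warm" worker does its expensive setup once and then applies the same map function to each batch of newly arrived data, instead of paying process-startup cost for every job.

```python
# Illustrative sketch (hypothetical names): a resident worker that is
# reused across batches, avoiding the per-job startup cost a cold
# MapReduce launch would pay.

def make_warm_worker(map_fn):
    # In a real system, the expensive one-time setup would happen here
    # (loading code, opening connections, warming caches); the returned
    # closure then reuses that state for every subsequent batch.
    def run_batch(records):
        return [map_fn(rec) for rec in records]
    return run_batch

# A word-count-style mapper reused across two arriving batches of data.
worker = make_warm_worker(lambda line: (line.split()[0], 1))
first = worker(["alpha one", "beta two"])
second = worker(["gamma three"])
```

This only models the reuse of a warm process; it says nothing about the hardware-level latency issues the talk goes on to discuss.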
The bleeding-edge folks for low latency in the cloud are the electronic high-frequency securities traders. When a millisecond advantage can be worth hundreds of millions of dollars, no expense is spared on equipment or software, and these traders have achieved mind-blowing results.
The talk also examines the new Open Compute server designs sponsored initially by Facebook. Unfortunately, Facebook has few needs for low latency. As with Google, Yahoo, et al., the only low-latency need is at the user interface, where users expect “instant” results from their queries. Even the version 2 Open Compute designs seem a little lacking, although the AMD design is probably better than Intel’s in this arena.
The slides for the talk are here.