Like the projects at Google that have come out of 20% time, what people at Amazon were supposed to be working on was sometimes less important than what they played with on the side.
I was working on a few projects, but I wanted to step out and learn more parts of the code. Just reading code gets to be dull. I needed a specific task in mind, a purpose that forced me to shine my light into the back corners of the source.
So, in idle moments, I wandered off looking for performance optimizations. Focusing on the high-traffic pages -- home page, book detail pages, search results -- I asked, where was big bad obidos spending its time?
I turned up some interesting tidbits. The first thing I found had to do with shopping carts.
When you walk into a grocery store, the first thing you probably do is grab a shopping cart. Similarly, the first thing Amazon did when someone appeared at our store was hand them a shopping cart, reserving a little bit of space in our database for them to store all their virtual loot.
However, grocery stores don't have to contend with robot hordes or other window shoppers. If they did, they would have to have a lot more shopping carts around, and almost all those shopping carts would be empty.
Given all the looky-loos, it makes more sense for Amazon to wait a bit and quickly slip a shopping cart into your hands when you grab the first thing you want to buy.
This little change helped more than you might think. All those shopping carts add up.
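To make the idea concrete, here is a small sketch in Python. It is not the obidos code, just an illustration of handing out the cart lazily; the names CartStore and Session are made up, and a plain dictionary stands in for the database.

```python
class CartStore:
    """Hypothetical stand-in for the database table that holds carts."""

    def __init__(self):
        self.carts = {}

    def allocate(self, session_id):
        # Reserve space for this visitor's cart and return it.
        self.carts[session_id] = []
        return self.carts[session_id]


class Session:
    """Hypothetical visitor session that only gets a cart when it needs one."""

    def __init__(self, session_id, store):
        self.session_id = session_id
        self.store = store
        self._cart = None  # nothing reserved while the visitor is only browsing

    def add_item(self, item):
        # The cart is slipped into the shopper's hands on the first item.
        if self._cart is None:
            self._cart = self.store.allocate(self.session_id)
        self._cart.append(item)

    def items(self):
        # Browsers and robots see an empty cart without ever allocating one.
        return list(self._cart) if self._cart is not None else []
```

With this shape, a thousand robots and window shoppers cost nothing in the store, and only the one visitor who actually adds a book takes up space.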
But a bigger issue was the real-time availability lookups. When you looked at a book at Amazon, the site went off to the warehouse, rummaged among the shelves, and checked if we had any copies. If it turned up bupkis, it checked how quickly we could order the book. All in real time.
This turned out to be the single most expensive operation on a book detail page. Ugly business, checking availability.
But, do you really need to know the availability right now? Maybe knowing what it was N minutes ago is okay. Huh, right, cache the data. It's okay if it is a little stale.
Because I was doing this on my own time, I started playing with some less obvious methods of doing availability caching. I thought that, given how much this would be hammered by the site, I might try to find a way to minimize locking. I also thought that I might be able to load the cache preemptively, so there would be no delays to shoppers on the site when parts of the cache were refreshed.
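One way to picture those two goals, as a sketch only and not a reconstruction of the original cache: readers look at a finished snapshot and never take a lock, while a refresher builds the next snapshot off to the side and publishes it by swapping a single reference. The class name AvailabilityCache, the lookup function, and the list of item ids are all hypothetical, and the lock-free read relies on the assumption that rebinding an attribute is atomic, which holds in CPython.

```python
import threading
import time


class AvailabilityCache:
    """Hypothetical stale-tolerant cache that is loaded ahead of requests."""

    def __init__(self, lookup, item_ids):
        self._lookup = lookup            # the slow real-time check (assumed)
        self._item_ids = list(item_ids)  # items worth keeping warm (assumed)
        self._snapshot = {}              # treated as read-only once published
        self.refresh()                   # load before any shopper asks

    def refresh(self):
        # Build a complete new snapshot without touching the live one,
        # then publish it by rebinding one reference.
        fresh = {item_id: self._lookup(item_id) for item_id in self._item_ids}
        self._snapshot = fresh

    def get(self, item_id):
        # Shoppers never wait on the slow lookup and never take a lock.
        # The answer may be as stale as the refresh interval.
        return self._snapshot.get(item_id)

    def start_refreshing(self, interval_seconds):
        # Background thread that reloads the cache every interval_seconds.
        def loop():
            while True:
                time.sleep(interval_seconds)
                self.refresh()

        thread = threading.Thread(target=loop, daemon=True)
        thread.start()
        return thread
```

The trade-off is the one described above: a shopper may see availability from a few minutes ago, but the expensive lookup happens only in the refresher, never on the page request.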
I hacked up something that seemed to work well. In tests, latency to a shopper on the website dropped from entirely too long to very near zero. I was starting to talk to a few other people about the prototype, asking what they thought, seeing how it could be improved.
Right about then, several other people were working on a major redesign of the Amazon site, some combination of an extreme makeover and new features. I was approached by someone who wanted to show book availability on search result pages, something that was completely impossible without caching, but would be possible if my quick prototype could be dressed up and pushed out the door. And out it went.
Of course, all of this is obsolete at this point. Back when I built the inventory cache, it was designed for one small Seattle warehouse and a single big honkin' iron webserver. The massive inventories across several huge distribution centers -- some of which can swallow thirteen football fields and come back for seconds -- combined with a switch to a cluster of commodity webservers eventually made the old cache inappropriate. It lasted well beyond its time, so long that the heroics of its youth lay forgotten under the problems of its senility.
Today, I look back at the inventory cache as just one of many examples of the benefits of time to wander. 20% time has value well beyond its proportions.