Brad Fitzpatrick from LiveJournal gave an April 2007 talk, "LiveJournal: Behind the Scenes", on scaling LiveJournal to their considerable traffic.
Slides (PDF) are available and have plenty of juicy details.
One thing I wondered after reviewing the slides -- and this is nothing but a crazy thought exercise on my part -- is whether they might be able to simplify the LiveJournal architecture (the arrows pointing everywhere shown on slide 4).
In particular, there seems to be a heavier emphasis on caching layers and read-only databases than I would expect. I tend to prefer aggressive partitioning, enough partitioning to get each database working set small enough to easily fit in memory.
I know little about LiveJournal's particular data characteristics, but I wonder if aggressive partitioning in the database layer might yield performance as high as a caching layer without the complexity of managing the cache consistency. Databases with data sets that fit in memory can be as fast as in-memory caching layers.
Likewise, I wonder if there would be benefit from dropping the read-only databases in favor of partitioned databases. With partitioned databases, the databases may be able to fit the data they do have entirely in memory; read-only replicas may still be hitting disk if the data is large.
Hitting disk is the ultimate performance killer. Developers often try to avoid database accesses because their database accesses hit disk and are dog slow. But, if you can make your database accesses not hit disk, then they can be blazingly fast, so fast that separate layers even might become unnecessary.
Again, wild speculation on my part. Everything I have seen indicates that Brad knows exactly what he is doing. Still, I cannot help myself from wondering, is there any way to get rid of all those layers?
[Slides found via Sergey Chernyshev]
Update: Another version of Brad's talk with some updates and changes.