The load on Amazon's website was remarkable.
As an ex-Amazonian said on one of my earlier posts, "There wasn't anything [Amazon] could buy ... that would handle the situations Amazon was encountering ... it was all ... written in-house out of necessity."
Not everything was written in-house, of course. Amazon needed a database. It used Oracle.
Unfortunately, our use of Oracle was also unusual. The strange usage patterns and high loads we inflicted on our Oracle databases exposed bugs and race conditions Oracle had never seen before.
At one point in early 1998, the database had a prolonged outage, taking the Amazon.com website with it. The database tables were corrupted. Attempts to restore the database failed, apparently tickling the same bug immediately. We were dead in the water.
My DBA skills are even weaker than my sysadmin skills. While some, like Bob, impressively leapt at the problem with pure debugging fury, there was little the rest of us could do to help. We sat on the sidelines, watched, and waited.
It was like having a loved one in the hospital. Amazon was down, and she stumbled every time she tried to rise.
Through the herculean efforts of Amazon's DBAs and the assistance of Oracle's engineers, the problem was debugged. Oracle sent us a patch. The helpless fear lifted. Amazon came back to us at last.