“The Wayback Machine is humongous, and getting humongouser”
Amazing piece in the New Yorker (they’re on fire at the moment) about the Internet Archive/Wayback Machine and the nature of ‘history’ on the web. The Archive is now based in a former Greek church in the Presidio near San Francisco’s Golden Gate Bridge, full of computers that crawl and snapshot as much of the web as they can, keeping a permanent record that some would rather have forgotten.
The problem of archiving the web dates back to 1991, when Tim Berners-Lee decided on the protocols for the web — he considered a time axis but decided against it:
“One reason it was never developed was the preference for the most up-to-date information: a bias against obsolescence. But the chief reason was the premium placed on ease of use. “We were so young then, and the Web was so young,” Berners-Lee told me. “I was trying to get it to go. Preservation was not a priority. But we’re getting older now.””
The Internet Archive is now trying to compensate for that weakness, but it’s certainly not straightforward. There are plenty of examples of events that have been manipulated after the fact — some deliberately, some by mistake. It’s also just interesting to search the internet with another variable. Here’s my personal blog from just over ten years ago, for example — as soon as you start clicking on links it takes you to a whole different web that doesn’t exist any more.
As Alexis Madrigal pointed out in a discussion on Twitter about the New Yorker piece: “what about apps?”. At the moment nobody is keeping a history of a lot of the things we use on the internet — and certainly not independently. If Facebook or any other huge company disappears — which isn’t impossible, let’s face it — there may well be no archive of moments that are very precious to people. We’re just at the start of working out how to deal with that.
Image by Todd, some rights reserved.