Due to the relentless obsolescence of digital formats
and platforms, along with the ten-year life spans of digital storage media
such as magnetic tape and CD-ROMs, there has never been a time of such
drastic and irretrievable information loss as right now. If that claim
seems extravagant, consider the number of literate people in the world
and how much work is "knowledge" work, which increasingly means
computer work. The world economy itself has become digital. This is a civilizational issue.
Information lives in two major dimensions--space and time. With digitization
and the Internet, all information is now potentially global. The space
dimension for data will keep exploding, but the time dimension is shrinking.
The half-life of data is currently about five years. There is no improvement
in sight because the attention span of the high-tech industry can only
reach as far as next year's upgrade, and its products reflect that. But
civilizational time is measured in centuries. A major disconnect is in
progress. Loss of cultural memory has become the price of staying perfectly current.
Nothing Like Acid-Free Paper
The loss is already considerable. You may have noticed that any files you
carefully recorded on 5 1/4" floppy disks a few years ago are now unreadable.
Not only have those disk drives disappeared, but so have the programs,
operating systems, and machines that wrote the files (WordStar in CP/M
on a Kaypro?). Your files may be intact, but they are as unrecoverable
as if they never existed. The same is true of Landsat satellite data from
the 1960s and early 1970s on countless reels of now-unreadable magnetic
tape. All of the early pioneer computer work at labs such as the MIT Artificial
Intelligence Lab is similarly lost, no matter how carefully it was recorded
at the time. The pioneer work of today is just as doomed, because the rate
of digital obsolescence keeps accelerating, and the serious search for
a long-term strategy for storage has yet to begin.
There is still nothing in the digital world like acid-free paper. Former
University of California, Berkeley, librarian Peter Lyman points out, "When
we know a book is important, we...tell a publisher: print it on acid-free
paper. And with decent library air-conditioning it will last 500 years.
If you want to preserve something else, like a newspaper, microfilm it.
We know there is a 500-year life to microfilm properly cared for. But what
do we do with digital documents? What we do today is we refresh them every
time there's a change in technology--or every 18 months, whichever comes
first. This is an expensive approach! We need a digital equivalent to microfilm,
a 500-year solution."
Losing Our Collective Memory
Supercomputer designer Danny Hillis also put the problem in perspective
at a conference on "Digital Continuity" held at the Getty Center
in Los Angeles in February 1998. "Back when information was hard to
copy," said Hillis, "people valued the copies and took care of
them. Now, copies are so common as to be considered worthless, and very
little attention is given to preserving them over the long term."
He noted that thousands of years ago we recorded important matters on clay
and stone that lasted thousands of years. Hundreds of years ago we used
parchment that lasted hundreds of years.
As a result, Hillis suggests, we are now in a period that may be a maddening
blank to future historians--a Dark Age--because nearly all of our art,
science, news, and other records are being created and stored on media
that we know can't outlast even our own lifetimes. We arrived at this situation
partly because digitization otherwise offers so many profound benefits.
We can now store, search, and cross-correlate literally everything. In
fact, according to estimates by Bellcore's Michael Lesk, who calculated
the total amount of data there is in the whole world, storage has now surpassed
data, probably permanently. There is more room to store stuff than there
is stuff to store. We need never again throw anything away. That particular
role of archivists and curators has become obsolete.
A New History
If raw data can be kept accessible as well as stored, history will become
a different discipline, closer to a science, because it can use marketers'
data-mining techniques to detect patterns hidden in the data. You could
fast-forward history, tease out correlated trends, zoom in on particular
moments. Watershed events might be studied in the original--the actual
force-feedback virtual-reality experiment that showed a new way to fold
a protein that transformed medicine, plus the lab surveillance camera images
of the event, as well as the phone calls, E-mail, and web searches that
surrounded the discovery. Note, there are both passive and active digital
records in that example.
The E-mail, phone calls, and photographs are passive; all you have to do
is keep them readable. But the virtual-reality experiment is active--it
was probably run on some experimental one-off piece of cobbled-together
lab equipment. Without that complex of then-current hardware, you can't
replay the experiment. Preservation of such hardware-dependent digital
experiences is nearly impossible. For instance, the elaborate virtual-reality
model of Berlin that has been used for planning that city for years will
almost certainly be lost, as will the U.S. Army's famous computer model
of the pivotal tank battle in the Gulf War.
Storage Vs. Preservation
Digital storage is easy; digital preservation is not.
Preservation means keeping the stored information cataloged, accessible,
and usable on current media, which requires constant effort and expense.
Furthermore, while contemporary information has economic value and pays
its way, there is no business case for archives, so the creators or original
collectors of digital information rarely have the incentive--or skills,
or continuity--to preserve their material. It's a task for long-lived nonprofit
organizations such as libraries, universities, and government agencies,
which may or may not have the mandate and funding to do the job.
University of California, Berkeley, archivist Howard Besser points out that digital
artifacts are increasingly complex to revive. For starters you've got the
viewing problem--a book displays itself, but the contents of a CD-ROM are
invisible until opened on something. Then there's the scrambling problem--the
innumerable ways that files are compressed and, increasingly, encrypted.
There are interrelationship problems--hypertext or web site links active
in the original but now dead ends. And translation problems occur in the
way different media behave--just as a photograph of a painting is not the
same experience as the painting, looking through a screen is not the same
as experiencing an immersion medium; watching a game is not the same as
playing the game. For all these reasons, archivists now encourage tagging
all digital artifacts with a rich supply of "metadata"--digital
information about the artifact telling what it is and how it works. A number
of professional organizations are working on setting consistent (and expandable)
standards for metadata. Gradually a set of "best practices" is
emerging for ensuring digital continuity: use the most common file formats,
avoid compression where possible, keep a log of changes to a file, employ
standard metadata, make multiple copies, and so forth.
And don't forget atomic backup--while the durability of bits is still moot,
the atoms in ink on paper have great stability.
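The practices listed above can be made concrete with a small sketch. What follows is only one possible illustration in Python, not an established archival tool or standard: the function names `sha256_of` and `preserve`, the `.meta.json` sidecar, the `CHANGELOG.txt` log, and the metadata field names are all assumptions made for the example. It copies a file uncompressed into several archive directories, verifies each copy against a checksum, writes a plain-text metadata record beside it, and appends a line to a change log.

```python
import hashlib
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def preserve(source: Path, archive_dirs: list[Path], description: str) -> dict:
    """Copy `source` into every archive directory with metadata and a log entry.

    The metadata fields below are hypothetical, chosen for illustration;
    a real archive would follow a published metadata standard.
    """
    metadata = {
        "filename": source.name,
        "description": description,
        "format": source.suffix.lstrip(".") or "unknown",
        "sha256": sha256_of(source),
        "size_bytes": source.stat().st_size,
        "archived_at": datetime.now(timezone.utc).isoformat(),
    }
    for directory in archive_dirs:
        directory.mkdir(parents=True, exist_ok=True)

        # Multiple copies: one uncompressed copy per archive location.
        copy = directory / source.name
        shutil.copy2(source, copy)

        # Verify the copy so that silent corruption is caught immediately.
        if sha256_of(copy) != metadata["sha256"]:
            raise IOError(f"copy at {copy} does not match the original")

        # Metadata: a plain-text JSON sidecar stored next to the artifact.
        sidecar = directory / (source.name + ".meta.json")
        sidecar.write_text(json.dumps(metadata, indent=2), encoding="utf-8")

        # Change log: append-only, one line per archived version.
        with (directory / "CHANGELOG.txt").open("a", encoding="utf-8") as log:
            log.write(
                f"{metadata['archived_at']} added {source.name} "
                f"sha256={metadata['sha256']}\n"
            )
    return metadata
```

A call such as `preserve(Path("letter.txt"), [Path("disk_a"), Path("disk_b")], "family letter")` would leave the file, its sidecar, and a log entry in both directories (the paths here are placeholders). Plain text is used for the metadata and the log on the same reasoning as the advice above: the most common formats stand the best chance of remaining readable.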
Net: Haven Or Horror?
What about the net? Everything can be dumped there, everything can be retrieved
there, and fairly universal standards such as TCP/IP emerge there. New
talents emerge there as well. The net is responsible for the legions of
"emulators" who keep finding new ways to revive old games such
as "Pac Man" and "Frogger" for play on new computers.
Vernacular archivists such as the emulators are one hopeful wave of the
future. Massively distributed research like that can convene enormous power.
Another example: thanks to the current interest in family genealogy, the
thousands of users of a program called "Family Tree Maker" are
linking their research into a "World Family Tree" on the web.
So far it has tied together 75,000 family trees, a total of 50 million
names. The goal, once unthinkable, is to eventually document and link every
named human who ever lived. With the net, preservation goes fractal--infinitely
branched instead of centralized. But that leaves the question: Is the net
itself profoundly robust and immortal, or is it the most ephemeral digital
artifact of all? At present the web has a "memory" of about two
months, says web archivist Brewster Kahle.
What is the solution? We cannot reverse the digitization of everything.
What we have to do is convert the design of software from brittle to resilient,
from heedlessly headlong to responsible, and from time-corrupted to time-embracing.
These are intractable problems. For certain, none of them can be solved
in a year, but all of them can yield to decades of focused work, if the
health of civilization is understood to be at stake.
Thinking In The Long Term
"The real problem," says computer designer Hillis, "is not
technological. We have the technical understanding to solve problems such
as digital degradation. What we don't have yet in our digital culture is
the habit of long-term thinking that supports preservation .... In the
early 2000s people will realize that we're not at the end of something--we're
at the beginning. There really will be a year 3000 and 4000 and so on.
Once that idea is more widely accepted, the engineers who are thinking
about the next digital medium will naturally think about how it lasts ...."
Hillis is more of an optimist than I am. I think it will take insistent,
knowledgeable, unremitting demand from librarians and archivists for long-lived
digital media, or the engineers will never take the problem seriously enough.
If that happens, then librarian Lyman's hope might be realized: "I'd
say that what's motivating us is not just a fear of losing what we have,
but of being able to build something new out of this digital rubble that
we've created--to build something that's really quite amazing, that may
be as much of a landmark on our civilization as the Library of Alexandria
was in the ancient world."
From Nick Reid
Stewart Brand's article underlines a significant but often
covered aspect of the largely vendor-marketing driven systems obsolescence
that renders much of current official history essentially unsaveable in
the long term.
What is far less often mentioned, although someone (not
me) should be mentioning it in fora such as yours, is the massive theft
of personal history that is incurred by the same corrupt process.
While the records our grandparents kept of their thoughts
and major life events were fewer and far more sensorially limited than
those common today - handwritten diaries, black and white photos, etc -
what they put down when they were courting was readily available when their
grandchildren were exploring their own adult relationships. The older people
become, the more important to their contentment such personal mementos
seem to be. But today's digital generation can look forward to a far sparser
trip down memory lane than their near ancestors had.