The Heroic Age
The Doomsday Machine, or, "If you build it, will they still come ten years from now?"
What Medievalists working in digital media can do to ensure the longevity of their research
by Daniel Paul O'Donnell
University of Lethbridge, Canada
Yes, but the... whole point of the doomsday machine... is lost... if you keep it a secret!
It is, perhaps, the first urban myth of humanities computing: the Case of the Unreadable Doomsday Machine. In 1986, in celebration of the 900th anniversary of William the Conqueror's original survey of his British territories, the British Broadcasting Corporation (BBC) commissioned a mammoth £2.5 million electronic successor to the Domesday Book. Stored on two 12-inch video laser discs and containing thousands of photographs, maps, texts, and moving images, the Domesday Project was intended to provide a high-tech picture of life in late 20th century Great Britain. The project's content was reproduced in an innovative early virtual reality environment and engineered using some of the most advanced technology of its day, including specially designed computers, software, and laser disc readers (Finney 1986).
Despite its technical sophistication, however, the Domesday Project was a flop by almost any practical measure. The discs and specialized readers required for accessing the project's content turned out to be too expensive for the state-funded schools and public libraries that comprised its intended market. The technology used in its production and presentation also never caught on outside the British government and school system: few other groups attempted to emulate the Domesday Project's approach to collecting and preserving digital material, and no significant market emerged for the specialized computers and hardware necessary for its display (Finney 1986, McKie and Thorpe 2003). In the end, few of the more than one million people who contributed to the project were ever able to see the results of their effort.
The final indignity, however, came in March 2003 when, in a widely circulated story, the British newspaper The Observer reported that the discs had finally become "unreadable":
16 years after it was created, the £2.5 million BBC Domesday Project has achieved an unexpected and unwelcome status: it is now unreadable.
The special computers developed to play the 12in video discs of text, photographs, maps and archive footage of British life are -- quite simply -- obsolete.
As a result, no one can access the reams of project information -- equivalent to several sets of encyclopedias -- that were assembled about the state of the nation in 1986. By contrast, the original Domesday Book -- an inventory of eleventh-century England compiled in 1086 by Norman monks -- is in fine condition in the Public Record Office, Kew, and can be accessed by anyone who can read and has the right credentials. 'It is ironic, but the 15-year-old version is unreadable, while the ancient one is still perfectly usable,' said computer expert Paul Wheatley. 'We're lucky Shakespeare didn't write on an old PC.' (McKie and Thorpe 2003)
In fact, the situation was not as dire as McKie and Thorpe suggest. For one thing, the project was never actually "unreadable," only difficult to access: relatively clean copies of the original laser discs still survive, as do a few working examples of the original computer system and disc reader (Garfinkel 2003). For another, the project appears not to depend, ultimately, on the survival of its obsolete hardware. Indeed, less than ten months after the publication of the original story in The Observer, engineers at Camileon, a joint project of the Universities of Leeds and Michigan, were able to reproduce most if not quite all of the material preserved on the original 12-inch discs using contemporary computer hardware and software (Camileon 2003a; Garfinkel 2003).
The Domesday Project's recent history has some valuable, if still contested, lessons for librarians, archivists, and computer scientists (see for example the discussion thread to Garfinkel 2003; also Camileon 2003b). On the one hand, the fact that engineers seem to be on the verge of designing software that will allow for the complete recovery of the project's original content and environment is encouraging. While it may not yet have proven itself to be as robust as King William's original survey, the electronic Domesday Project now at least does appear to have been saved for the foreseeable future -- even if "foreseeable" in this case may mean simply until the hardware and software supporting the current emulator itself becomes obsolete.
On the other hand, it cannot be comforting to realise that the Domesday Project required the adoption of such extensive and expensive restoration measures in the first place less than two decades after its original composition: the discs that the engineers at Camileon have devoted the last ten months to recovering have turned out to have less than 2% of the readable lifespan enjoyed by their eleventh-century predecessor. Even pulp novels and newspapers published on acidic paper at the beginning of the last century have proved more durable under similarly controlled conditions. While digital formats, viewed in the short term, do appear to offer a cheap method of preserving, cataloguing, and especially distributing copies of texts and other cultural material, their effectiveness and economic value as a means of long-term preservation have yet to be demonstrated completely.
These are, for the most part, issues for librarians, archivists, curators, computer scientists, and their associations: their solution will almost certainly demand resources, a level of technical knowledge, and perhaps most importantly, a degree of international cooperation far beyond that available to most individual humanities scholars (Keene 2003). Inasmuch as they are responsible for the production of an increasing number of electronic texts and resources, however, humanities scholars do have an interest in ensuring that the physical record of their intellectual labour will outlast their careers. Fortunately, there are also some specific lessons to be learned from the Domesday Project that are of immediate use to individual scholars in their day-to-day research and publication.
1. Do not write for specific hardware or software
Many of the preservation problems facing the Domesday Project stem from its heavy reliance on specific proprietary (and often customized) hardware and software. This reliance came about for largely historical reasons. The Domesday Project team was working on a multimedia project of unprecedented scope, before the Internet developed as a significant medium for the dissemination of data. In the absence of suitable commercial software and any real industry emphasis on inter-platform compatibility or international standards, they were forced to custom-build or commission most of their own hardware and software. The project was designed to be played from a specially designed Philips video-disc player and displayed using custom-built software that functioned best on a single operating platform: the BBC Master, a now obsolete computer system which, with the related BBC Model B, was at the time far more popular in schools and libraries in the United Kingdom than the competing Macintosh, IBM PC, or long-forgotten Acorn systems.
With the rise of the internet and the development of well-defined international standard languages such as Standard Generalized Markup Language (SGML), HyperText Markup Language (HTML), eXtensible Markup Language (XML), and Hypermedia/Time-based Structuring Language (HyTime), few contemporary or future digital projects are likely to be as completely committed to a single specific hardware or software system as the Domesday Project. This does not mean, however, that the temptation to write for specific hardware or software has vanished entirely. Different operating systems allow designers to use different, often incompatible, shortcuts for processes such as referring to colour, assigning fonts, or referencing foreign characters (even something as simple as the Old English character thorn can be referred to in incompatible ways on Windows and Macintosh computers). The major internet browsers also all have proprietary extensions and idiosyncratic ways of understanding supposedly standard features of the major internet languages. It is very easy to fall into the trap of adapting one's encoding to fit the possibilities offered by non-standard extensions, languages, and features of a specific piece of hardware or software.
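The thorn problem can be made concrete with a small sketch. It assumes that the "incompatible ways" in question are the legacy single-byte encodings historically associated with the two platforms (Python's `cp1252` and `mac_roman` codecs), which is an illustrative reading rather than something the discussion above specifies. The same stored byte is read as thorn under one encoding and as a different character under the other, whereas a standard HTML character reference resolves to the same character everywhere.

```python
import html

# One stored byte, as a platform-specific shortcut for thorn might leave it on disk.
RAW = b"\xfe"

# Assumption: these two codecs stand in for "Windows" and "Macintosh" conventions.
windows_reading = RAW.decode("cp1252")
mac_reading = RAW.decode("mac_roman")

# Platform-neutral alternatives: named and numeric HTML character references.
named = html.unescape("&thorn;")
numeric = html.unescape("&#254;")


def code_point(ch: str) -> str:
    """Format a single character as a Unicode code point label."""
    return f"U+{ord(ch):04X}"


for label, ch in [("cp1252", windows_reading), ("mac_roman", mac_reading),
                  ("&thorn;", named), ("&#254;", numeric)]:
    print(f"{label:10} {code_point(ch)}")
```

A file that stores the raw byte is legible only to readers who know (or guess) which platform produced it; a file that uses the character reference carries that information with it.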
The very real dangers of obsolescence this carries with it can be demonstrated by the history of the Netscape <layer> and <ilayer> tags. Introduced with the Netscape 4.0 browser in early 1997, the <layer> and <ilayer> tags were proprietary extensions of HTML that allowed internet designers to position different parts of their documents independently of one another on the screen: to superimpose one piece of a text over another, to place text over (or under) images, or to remove one section of a line from the main textual flow and place it elsewhere (Netscape Communications Corporation 1997). The possibilities this extension opened up were exciting. In addition to enlivening otherwise boring pages with fancy typographic effects, the <layer> and <ilayer> elements also allowed web designers to create implicit intellectual associations among otherwise disparate elements in a single document. For example, one could use these tags to create type facsimiles of manuscript abbreviations by superimposing their component parts or create annotated facsimile editions by placing textual notes or transcriptions over relevant manuscript images.
As with the Domesday Project, however, projects that relied on these proprietary extensions for anything other than the most incidental effects were doomed to early obsolescence: the <layer> and <ilayer> tags were never adopted by the other major browsers and, indeed, were dropped by Netscape itself in subsequent editions of its Navigator browser. Thus an annotated manuscript facsimile coded in mid 1997 to take advantage of the new Netscape 4.0 <layer> and <ilayer> tags would, with the release of Netscape 5.0 at the end of 1999, already be obsolete. Users who wished to maintain the presumably intellectually significant implicit association between the designer's notes and images in this hypothetical case would need either to maintain (or recreate) a working older version of the Netscape browser on their system (an increasingly difficult task as operating systems themselves go through subsequent alterations and improvements) or to convert the underlying files to a standard encoding.
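What "converting the underlying files to a standard encoding" might involve can be sketched roughly as follows. The sketch assumes that the positioning attributes of a `<layer>` element are named `left`, `top`, and `z-index`, and that absolutely positioned `<div>` elements styled with CSS are an acceptable standards-based target; the sample markup and the function names are hypothetical, and inline `<ilayer>` elements are deliberately left untouched.

```python
import re

# Hypothetical Netscape 4-era fragment: a note positioned over a manuscript image.
LEGACY = '<layer left="120" top="40" z-index="2">Note on line 3</layer>'

LAYER_OPEN = re.compile(r"<layer\b([^>]*)>", re.IGNORECASE)
LAYER_CLOSE = re.compile(r"</layer\s*>", re.IGNORECASE)
ATTRIBUTE = re.compile(r'([\w-]+)\s*=\s*"?([^"\s>]+)"?')


def _open_tag_to_div(match):
    """Rewrite one opening <layer> tag as a <div> with equivalent CSS positioning."""
    attrs = {name.lower(): value for name, value in ATTRIBUTE.findall(match.group(1))}
    rules = ["position: absolute"]
    if "left" in attrs:
        rules.append(f"left: {attrs['left']}px")
    if "top" in attrs:
        rules.append(f"top: {attrs['top']}px")
    if "z-index" in attrs:
        rules.append(f"z-index: {attrs['z-index']}")
    return '<div style="' + "; ".join(rules) + '">'


def layers_to_divs(markup):
    """Replace <layer> elements with positioned <div> elements; other markup is kept."""
    return LAYER_CLOSE.sub("</div>", LAYER_OPEN.sub(_open_tag_to_div, markup))


print(layers_to_divs(LEGACY))
```

Even in this toy case the conversion is a manual, project-by-project undertaking: someone must know what the obsolete markup meant before it can be re-expressed, which is precisely the cost that writing to the standard in the first place avoids.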
2. Maintain a distinction between content and presentation
A second factor promoting the early obsolescence of the Domesday Project was its emphasis on the close integration of content and presentation. The project was conceived of as a multimedia experience and its various components -- text, video, maps, statistical information -- often acquired meaning from their interaction, juxtaposition, sequencing, and superimposition (Finney 1986, "Using Domesday"; see also Camileon 2003b). In order to preserve the project as a coherent whole, indeed, engineers at Camileon have had to reproduce not only the project's content but also the look and feel of the specific software environment in which it was intended to be searched and navigated (Camileon 2003b).
Here too the Domesday Project designers were largely victims of history. Their project was a pioneering experiment in multimedia organisation and presentation, and it was put together in the virtual absence of now standard international languages for the design and dissemination of electronic documents and multimedia projects -- many of which, indeed, were in their initial stages of development at the time the BBC project went to press.
More importantly, however, these nascent international standards involved a break with the model of electronic document design and dissemination employed by the Domesday Project designers. Where the Domesday Project might be described as an information machine -- a work in which content and presentation are so closely intertwined as to become a single entity -- the new standards concentrated on establishing a theoretical separation between content and presentation (see Connolly 1994 for a useful discussion of the distinction between "programmable" and "static" document formats and their implications for document conversion and exchange). This allows both aspects of an electronic document to be described separately and, for the most part, in quite abstract terms which are then left open to interpretation by users in response to their specific needs and resources. It is this flexibility which helped in the initial popularization of the World Wide Web: document designers could present their material in a single standard format and, in contrast to the designers of the Domesday Project, be relatively certain that their work would remain accessible to users accessing it with various software and hardware systems -- whether this was the latest version of the new Mosaic browser or some other, slightly older and non-graphical interface like Lynx (see Berners-Lee 1989-1990 for an early summary of the advantages of multi-platform support and a comparison with early multi-media models such as that adopted by the Domesday Project). In recent years, this same flexibility has allowed project designers to accommodate the increasingly large demand for access to internet documents from users of (often very advanced) non-traditional devices: web-activated mobile phones, palm-sized digital assistants, and of course aural screen readers and Braille printers.
In theory, this flexibility also means that where engineers responsible for restoring the Domesday Project have been forced to emulate the original software in order to recreate the BBC designers' work, future archivists will be able to restore current, standards-based, electronic projects by interpreting the accompanying description of their presentation in a way appropriate to their own contemporary technology. In some cases, indeed, this restoration may not even require the development of any actual computer software: a simple HTML document, properly encoded according to the strictest international standards, should in most cases be understandable to the naked eye even when read from a paper printout or text-only display.
In practice, however, it is still easy to fall into the trap of integrating content and presentation. One common example involves the use of table elements for positioning unrelated or sequential text in parallel "columns" on browser screens (see Chisholm, Vanderheiden, et al. 2000, § 5). From a structural point of view, tables are a device for indicating relations among disparate pieces of information (mileage between various cities, postage prices for different sizes and classes of mail, etc.). Using tables to position columns, document designers imply in formal terms the existence of a logical association between bits of text found in the same row or column -- even if the actual rationale for this association is primarily aesthetic. While the layout technique, which depends on the fact that all current graphic-enabled browsers display tables by default in approximately the same fashion, works well on desktop computers, the same trick can produce nonsensical text when rendered on the small screen of a mobile phone, printed by a Braille output device, or read aloud by an aural browser or screen-reader. Just as importantly, this technique too can lead to early obsolescence or other significant problems for future users. Designers of a linguistic corpus based on specific types of pre-existing electronic documents, for example, might be required to devote considerable manual effort to recognising and repairing content arbitrarily and improperly arranged in tabular format for aesthetic reasons.
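The effect is easy to reproduce. The following sketch assumes a device that simply reads table cells in the order in which they occur in the source, which is one plausible model of a non-graphical renderer rather than a description of any particular product; the sample page and the names `CellCollector` and `linearise` are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page: two sequential passages set side by side for visual effect.
LAYOUT_TABLE = """
<table>
  <tr><td>First passage, part 1</td><td>Second passage, part 1</td></tr>
  <tr><td>First passage, part 2</td><td>Second passage, part 2</td></tr>
</table>
"""


class CellCollector(HTMLParser):
    """Collect the text of each <td> element in source order."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[-1] += data


def linearise(markup):
    """Return cell texts in the order a source-order reader would encounter them."""
    collector = CellCollector()
    collector.feed(markup)
    collector.close()
    return [cell.strip() for cell in collector.cells]


for cell in linearise(LAYOUT_TABLE):
    print(cell)
```

Read in this order, the two passages are interleaved: part 1 of the second passage interrupts the first before it has finished. A corpus builder harvesting such a page faces the same problem, since nothing in the markup distinguishes a layout table from a genuine table of related data.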
3. Avoid unnecessary technical innovation
A final lesson to be learned from the early obsolescence of the Domesday Project involves the hidden costs of technical innovation. As a pioneering electronic document, the Domesday Project was in many ways an experiment in multimedia production, publication, and preservation. In the absence of obvious predecessors, its designers were forced to develop their own technology, organisational outlines, navigation techniques, and distribution plans (see Finney 1986 and Camileon 2003a for detailed descriptions). The fact that relatively few other projects adopted their proposed solutions to these problems -- and that subsequent developments in the field led to a different focus in electronic document design and dissemination -- only increased the speed of the project's obsolescence and the cost and difficulty of its restoration and recovery.
Given the experimental status of this specific project, these were acceptable costs. The Domesday Project was never really intended as a true reference work in any usual sense of the word. Although it is full of information about mid-1980s Great Britain, for example, the project has never proved to be an indispensable resource for study of the period. While it was inspired by William the Conqueror's great inventory of post-conquest Britain, the Domesday Project was, in the end, more an experiment in new media design than an attempt at collecting useful information for the operation of Mrs. Thatcher's government.
We are now long past the day in which electronic projects can be considered interesting simply because they are electronic. Whether they are accessing a Z39.50-compliant library catalogue, consulting an electronic journal on JSTOR, or accessing an electronic text edition or manuscript facsimile published by an academic press, users of contemporary electronic projects are by and large now more likely to be interested in the quality and range of an electronic text's intellectual content than the novelty of its display, organisation or technological features (Nielsen 2000). The tools, techniques, and languages available to producers of electronic projects, likewise, are now far more standardised and helpful than those available to those responsible for electronic incunabula such as the Domesday Project.
Unfortunately this does not mean that contemporary designers are entirely free of the dangers posed by technological experimentation. The exponential growth of the internet, the increasing emphasis on compliance with international standards, and the simple pace of technological change over the last decade all pose significant challenges to the small budgets and staff of many humanities computing projects. While large projects and well-funded universities can sometimes afford to hire specialized personnel to follow developments in computing design and implementation, freeing other specialists to work on content development, scholars working on digital projects in smaller groups, at less well-funded universities, or on their own often find themselves responsible for both the technological and intellectual components of their work. Anecdotal evidence suggests that such researchers find keeping up with the pace of technological change relatively difficult -- particularly when it comes to discovering and implementing standard solutions to common technological problems (Baker, Foys, et al. 2003). If the designers of the Domesday Project courted early obsolescence because their pioneering status forced them to design unique technological solutions to previously unresolved problems, many contemporary humanities projects appear to run the same risk of obsolescence and incompatibility because their inability to easily discover and implement best practice encourages them to continuously invent new solutions to already solved problems (HATII and NINCH 2002, NINCH 2002-2003, Healey 2003, Baker, Foys, et al. 2003 and O'Donnell 2003).
This area of humanities computing has been perhaps the least well served by the developments of the last two decades. While technological changes and the development of well-designed international standards have increased opportunities for contemporary designers to avoid the problems which led to the Domesday Project's early obsolescence, the absence of a robust system for sharing technological know-how among members of the relevant community has remained a significant impediment to the production of durable, standards-based projects. Fortunately, however, improvements are being made in this area as well. While mailing lists such as humanist-l and tei-l have long facilitated exchange of information on aspects of electronic project design and implementation, several new initiatives have appeared over the last few years which are more directly aimed at encouraging humanities computing specialists to share their expertise and marshal their common interests. The Text Encoding Initiative (TEI) has recently established a number of Special Interest Groups (SIGs) aimed at establishing community practice in response to specific types of textual encoding problems. Since 1993, the National Initiative for a Networked Cultural Heritage (NINCH) has provided a forum for collaboration and development of best practice among directors and officers of major humanities computing projects. The recently established TAPoR project in Canada and the Arts and Humanities Data Service (AHDS) in the United Kingdom likewise seek to serve as national clearing houses for humanities computing education and tools.
Finally, and aimed more specifically at medievalists, the Digital Medievalist Project (of which I am currently director) is seeking funding to establish a "Community of Practice" for medievalists engaged in the production of digital resources, through which individual scholars and projects will be able to pool skills and practice acquired in the course of their research (see Baker, Foys, et al. 2003). Although we are still in the beginning stages, there is increasing evidence that humanities computing specialists are beginning to recognise the extent to which the discovery of standardised implementations and solutions to common technological problems is likely to provide as significant a boost to the durability of electronic resources as the development of standardised languages and client-side user agents in the late 1980s and early 1990s. We can only benefit from increased cooperation.
The Case of the Unreadable Doomsday Machine makes for good newspaper copy: it pits new technology against old in an information-age version of nineteenth-century races between the horse and the locomotive. Moreover, there is an undeniable irony to be found in the fact that King William's eleventh-century parchment survey has thus far proven itself to be more durable than the BBC's 1980s computer program.
But the difficulties faced by the Domesday Project and its conservators are neither necessarily intrinsic to the electronic medium nor necessarily evidence that scholars at work on digital humanities projects have backed the wrong horse in the information race. Many of the problems which led to the Domesday Project's early obsolescence and expensive restoration can be traced to its experimental nature and the innovative position it occupies in the history of humanities computing. By paying close attention to its example, by learning from its mistakes, and by recognising the often fundamental ways in which contemporary humanities computing projects differ from such digital incunabula, scholars can contribute greatly to the likelihood that their current projects will remain accessible long after their authors reach retirement age.
1. See the controversy between Baker 2002 and [Association of Research Libraries] 2001, both of whom agree that even very acidic newsprint can survive "several decades" in carefully controlled environments.
2. The first internet browser, "WorldWideWeb," was finished by Tim Berners-Lee at CERN (Conseil Européen pour la Recherche Nucléaire) on Christmas Day 1990. The first popular consumer browser able to operate on personal computer systems was the National Center for Supercomputing Applications (NCSA) Mosaic (a precursor to Netscape), which appeared in 1993. See [livinginternet.com] 2003 and Cailliau 1995 for brief histories of the early browser systems. The first internet application, e-mail, was developed in the early 1970s ([www.almark.net] 2003); until the 1990s, its use was restricted largely to university researchers and the U.S. military.
3. Camileon 2003; see McMordie 2003 for a history of the Acorn platform.
4. SGML, the language from which HTML is derived, was developed in the late 1970s and early 1980s but not widely used until the mid-to-late 1980s ([SGML Users' Group] 1990). HyTime, a multimedia standard, was approved in 1991 ([SGML SIGhyper] 1994).
5. This is the implication of Finney 1986, who stresses the project's technically innovative nature, rather than its practical usefulness, throughout.
Association of Research Libraries, Washington DC. "Q and A in response to Nicholson Baker's Double Fold." Web Page, September 4, 2001 [accessed December 29, 2003]. Available at: http://www.arl.org/preserv/bakerQA.html.
Baker, Nicholson. Double fold: libraries and the assault on paper. New York: Vintage Books; 2002.
Baker, Peter, Martin Foys, Murray McGillivray, Daniel Paul O'Donnell, Roberto Rosselli Del Turco and Elizabeth Solopova. "The Digital Medievalist Project: A Community of Practice for Medievalists working with digital media." Web Page, September 15, 2003 [accessed December 29, 2003]. Available at: http://www.digitalmedievalist.org.
Berners-Lee, Tim. "The original proposal of the WWW, HTMLized." Web Page, 1989 [accessed December 29, 2003]. Available at: http://www.w3.org/History/1989/proposal.html.
Cailliau, Robert. "A little history of the World Wide Web." Web Page, 1995 [accessed December 29, 2003]. Available at: http://www.w3.org/History.html.
Camileon (a). "BBC Domesday." Web Page, 2003a [accessed December 29, 2003]. Available at: http://www.si.umich.edu/CAMILEON/domesday/domesday.html.
Camileon (b). "Preserving BBC Domesday: Frequently Asked Questions." Web Page, 2003b [accessed December 29, 2003]. Available at: http://www.si.umich.edu/CAMILEON/domesday/faq.html.
Chisholm, Wendy, Gregg Vanderheiden, Ian Jacobs, [W3C], and [WAI]. "HTML Techniques for Web Content Accessibility: Guidelines 1.0." Web Page, November 6, 2000 [accessed December 29, 2003]. Available at: http://www.w3.org/TR/2000/NOTE-WCAG10-HTML-TECHS-20001106/.
Connolly, Daniel W. "Toward a formalism for communication on the Web." Web Page, 1994 [accessed December 29, 2003]. Available at: http://www.w3.org/MarkUp/html-spec/html-essay.html.
Finney, Andy. "The Domesday Project." Web Page, 1986. [accessed December 29, 2003]. Available at: http://www.atsf.co.uk/dottext/domesday.html.
Garfinkel, Simson. "The Myth of Doomed Data: The handwringing about obsolete formats is misguided. The digital files we create today will be around for a very, very long time." Technology Review [On-Line Edition]. December 3, 2003 [accessed December 29, 2003]. Available at: http://www.technologyreview.com/articles/wo_garfinkel120303.asp?p=1.
HATII and NINCH. "NINCH guide to good practice, version 1.0." Web Page, 2002 [accessed December 29, 2003]. Available at: http://www.nyu.edu/its/humanities//ninchguide/.
Healey, Antonette di Paolo. "The Dictionary of Old English: the next generation(s)." Unpublished lecture. ISAS Conference; Phoenix, AZ. 2003.
Keene, Suzanne. "Now you see it, now you won't." Web Page, 2003 [accessed December 29, 2003]. Available at: http://www.suzannekeene.info/conserve/digipres/index.htm.
livinginternet.com. "Web browser History." Web Page [accessed December 29, 2003]. Available at: http://livinginternet.com/w/wi_browse.htm.
McKie, Robin and Vanessa Thorpe. "Digital Domesday Book lasts 15 years not 1000." The Observer [On-Line Edition]. March 3, 2003 [accessed December 29, 2003]. Available at: http://observer.guardian.co.uk/uk_news/story/0,6903,661093,00.html.
McMordie, Robert. "Technical history of Acorn (version 0.6 beta)." Web Page [accessed December 29, 2003]. Available at: http://www.mcmordie.co.uk/acornhistory/index.shtml.
Netscape Communications Corporation. "Dynamic HTML in Communicator." Web Page, 1997 [accessed December 29, 2003]. Available at: http://developer.netscape.com/docs/manuals/communicator/dynhtml/layers3.htm.
Nielsen, Jakob. Designing Web usability. Indianapolis: New Riders; 2000.
NINCH. "Why does the cultural community need Best Practices?" Web Page, 2002 [accessed December 29, 2003]. Available at: http://www.ninch.org/programs/practice/why.html.
O'Donnell, Daniel Paul. "Texts and the Single Scholar: Is the morning after worth the night before?" Unpublished lecture. Thirty-eighth International Congress on Medieval Studies, Kalamazoo, MI. May 8, 2003.
Sahrmann, Josh. "An introduction to Netscape layers." Web Page, May 16, 1998 [accessed December 29, 2003]. Available at: http://tech.irt.org/articles/js087/.
SGML SIGhyper. "HyTime and SMDL - History." Web Page, June 28, 1994 [accessed December 29, 2003]. Available at: http://www.sgmlsource.com/history/hthist.htm.
SGML Users' Group. "A Brief History of the Development of SGML." Web Page, 1990 [accessed December 29, 2003]. Available at: http://www.sgmlsource.com/history/sgmlhist.htm.
www.almark.net. "History of e-mail." Web Page [accessed December 29, 2003]. Available at: http://www.almark.net/Internet/html/slide5.html.
Copyright © Daniel Paul O'Donnell, 2004. All rights reserved. This edition copyright © The Heroic Age, 2004. All rights reserved