Figure 4 - uploaded by Geoffrey M. Brown
Data view constructed from dBase files in a legacy Windows application. For researchers interested in collating or searching data intended to be viewed using these executables, assisted data extraction provides high-quality, low-risk views into the original format while requiring minimal interaction with the emulated environment. 

Source publication
Article
Full-text available
Emulation is frequently discussed as a failsafe preservation strategy for born-digital documents that depend on contemporaneous software for access (Rothenberg, 2000). Yet little has been written about the contextual knowledge required to successfully use such software. The approach we advocate is to preserve necessary contextual information throug...

Contexts in source publication

Context 1
... the data encoded in these files alone are not sufficient, as (for example) some field names are encoded numerically using the Standard Industrial Classification Code List, and only mapped to their natural language counterparts in the final GUI display. A sample of such a display is provided in Figure 4. For researchers interested in collating or searching data intended to be viewed using these executables, assisted data extraction provides high-quality, low-risk views into the original format while requiring minimal interaction with the emulated environment. ...
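The mapping step described in this excerpt can be sketched as follows. This is a minimal illustration, not the authors' implementation: the field names `SIC` and `EMP`, the sample rows, and the three-entry lookup table are assumptions made for the example, and a real table would be loaded from the published Standard Industrial Classification list. The records are assumed to have already been read from the .dbf files into dictionaries.

```python
from typing import Dict, Iterable, List, Optional

# Illustrative excerpt only; check every entry against the published
# Standard Industrial Classification list before relying on it.
SIC_LABELS: Dict[str, str] = {
    "10": "Metal mining",
    "12": "Coal mining",
    "13": "Oil and gas extraction",
}


def label_records(
    records: Iterable[dict],
    labels: Dict[str, str] = SIC_LABELS,
    code_field: str = "SIC",  # hypothetical field name
) -> List[dict]:
    """Return copies of the records with a natural-language label attached.

    Codes missing from the lookup table get a label of None, so gaps stay
    visible instead of being silently dropped.
    """
    labelled = []
    for record in records:
        # Fixed-width dBase fields are often space-padded, so strip first.
        code = str(record.get(code_field, "")).strip()
        row = dict(record)
        label: Optional[str] = labels.get(code)
        row["SIC_LABEL"] = label
        labelled.append(row)
    return labelled


if __name__ == "__main__":
    sample = [{"SIC": "12 ", "EMP": 40}, {"SIC": "99", "EMP": 3}]
    for row in label_records(sample):
        print(row)
```

Keeping unmapped codes with an empty label, rather than discarding them, makes it easier to spot where a migrated data set still depends on context that only the original GUI supplied.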
Context 2
... select a resource by clicking an appropriate link to a signed Java applet (running server-side), which in turn executes a helper application installed on the local workstation. This application first prompts them to select the desired emulation package ( Figure 3). For this implementation, we tested the software with VMware Server 1.0.2 and VMware Workstation 6.0.4, to ensure compatibility of the API calls with a variety of common products. Users are presented with a final confirmation of their selections: the location of the virtual machine - the application automatically searches for paths corresponding to valid virtual machines and presents a default choice in an intermediate dialog - and the path to an ISO image corresponding to their original selection on the website. The application (shown in this stage in Figure 4) may also be run in a “standalone” mode, with the user selecting a disk image ...
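The automatic search for valid virtual machines mentioned in this excerpt might look roughly like the sketch below. It is an assumption-laden illustration rather than the helper application's actual logic: it treats any VMware configuration file (`.vmx`) under a set of caller-supplied directories as a candidate, and the function names `find_virtual_machines` and `default_choice` are hypothetical.

```python
from pathlib import Path
from typing import Iterable, List, Optional


def find_virtual_machines(search_roots: Iterable[str]) -> List[Path]:
    """Collect VMware configuration files (.vmx) found under the given roots."""
    found: List[Path] = []
    for root in search_roots:
        root_path = Path(root)
        if not root_path.is_dir():
            # Skip locations that do not exist on this workstation.
            continue
        found.extend(sorted(p for p in root_path.rglob("*.vmx") if p.is_file()))
    return found


def default_choice(candidates: List[Path]) -> Optional[Path]:
    """Offer the first candidate as the default, or None if nothing was found."""
    return candidates[0] if candidates else None


if __name__ == "__main__":
    # Hypothetical search location; a real tool would use platform-specific paths.
    machines = find_virtual_machines([str(Path.home() / "Virtual Machines")])
    print("Default:", default_choice(machines))
```

A real helper would probably also confirm that each candidate can be opened by the installed VMware product before presenting it to the user.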
Context 3
... it is possible to migrate the dBase data directly to a more modern format, this may result in loss of context; without the data manipulation performed in the GUI application, the intended visual structure of the data (frequently non-trivial) can be lost. Macro scripts such as those described in the previous section can provide a simple method for exporting structured information. In the example illustrated below, economic data published by the U.S. Census Bureau as a CD-ROM “County Business Patterns 1995-1996” is encoded in a series of legacy .dbf files that could readily be migrated to a more modern format using numerous open source and commercial software solutions. However, the data encoded in these files alone are not sufficient, as (for example) some field names are encoded numerically using the Standard Industrial Classification Code List, and only mapped to their natural language counterparts in the final GUI display. A sample of such a display is provided in Figure 4. Because modern browsers operate in a “sandbox” designed to shield the user from malicious sites, we used a signed Java applet to link browsing activity with the local executable necessary to configure, boot, and perform required installations in a local virtual machine. Administrators can deploy this solution (or a modified one using the available source) simply by adding an applet-based link which points to the desired resource within existing pages. In our implementation, these links were generated automatically in the following ...

Citations

... For instance, emulation has been used to provide access to a large collection of legacy CD-ROMs [8], [9]. Furthermore, requirements and workflows have been developed for preparing ready-made environments to render certain digital artifacts [10]. ...
Conference Paper
Until now, emulation of legacy architectures has mostly been seen as a tool for hobbyists and as technical nostalgia. However, in a world in which research and development is producing almost entirely digital artifacts, new and efficient concepts for preservation and re-use are required. Furthermore, a significant amount of today's cultural work is purely digital. Hence, emulation technology appeals to a wider, non-technical, user-group since many of our digital objects cannot be re-used properly without a suitable runtime environment. This article presents a scalable and cost-effective Cloud-based Emulation-as-a-Service (EaaS) architecture, enabling a wide range of non-technical users to access emulation technology in order to re-enact their digital belongings. Together with a distributed storage and data management model we present an implementation from the domain of digital art to demonstrate the practicability of the proposed EaaS architecture.
... In previous work (Woods & Brown, 2010), we proposed an access model based upon networked "virtual collections" of CD-ROMs which can enable consortia of libraries to pool the technical expertise necessary to provide continued access to such materials for a geographically sparse base of patrons, who may have limited technical knowledge. ...
... We have previously proposed a general model for preserving "virtual CD-ROM" collections and explored the use of emulation of Windows-based platforms (Woods & Brown, 2010). In this paper, we extend this work to emulation of classic Macintoshes through a significant case study: the CD-ROMs published by the Voyager Company. ...
... Libraries and educational institutions could collaborate in creating images of CD-ROMs in their collections, as well as customizing supporting software images for these CD-ROMs. In our previous work, we assumed a client-side emulator preconfigured to execute a generic Windows XP environment, and utilized a helper application to customize this environment for a particular CD-ROM (Woods & Brown, 2010). Where emulator environment size is substantial compared to CD-ROM size (three to four times for Windows XP), this represents a substantial space saving. ...
Article
Full-text available
Over the past 20 years, many thousands of CD-ROM titles were published; many of these have lasting cultural significance, yet present a difficult challenge for libraries due to obsolescence of the supporting software and hardware, and the consequent decline in the technical knowledge required to support them. The current trend appears to be one of abandonment – for example, the Indiana University Libraries no longer maintain machines capable of accessing early CD-ROM titles. In previous work, we proposed an access model based upon networked ‘virtual collections’ of CD-ROMs which can enable consortia of libraries to pool the technical expertise necessary to provide continued access to such materials for a geographically sparse base of patrons, who may have limited technical knowledge. In this paper, we extend this idea to CD-ROMs designed to operate on ‘classic’ Macintosh systems with an extensive case study: the catalog of publications by the Voyager Company, the first major innovator in interactive CD-ROMs. The work described includes emulator extensions to support obsolete CD formats and to enable networked access to the virtual collection.
... One potential solution is to automate the different installation steps for each relevant package. Another possible approach is to minimize dependency on this knowledge by providing automated configuration and execution within virtualized environments [14]. This group demonstrated how to deploy automation scripts, i.e. ...
Article
Emulation is evolving into a mature digital preservation strategy, providing authentic functional access to a wide range of digital objects using their original creation environments. In contrast to format migration strategies, a functional, emulation-based approach requires a number of additional components, i.e. the full software stack required to render a digital object, as well as its configuration. The goal of the bwFLA project is the implementation and development of a distributed framework for emulation-based services and technologies to address the new challenges in long-term digital preservation faced by the libraries and archives of Baden-Württemberg's state and higher education institutions.
... If the application is started on file load (by <application> <filename>), further reduction of interaction would be possible. Woods and Brown (2010) suggest scripted on-demand installations of common software environments in their "Assisted Emulation" paper. The authors additionally advocate preserving necessary contextual information through scripts designed to control the legacy environment, and created during the preservation workflow. ...
Article
Full-text available
Many digital preservation scenarios are based on the migration strategy, which itself is heavily tool-dependent. For popular, well-defined and often open file formats – e.g., digital images, such as PNG, GIF, JPEG – a wide range of tools exist. Migration workflows become more difficult with proprietary formats, such as those used by the many text processing applications that became available in the last two decades. If a certain file format cannot be rendered with current software, emulation of the original environment remains a valid option. For instance, with the original Lotus AmiPro or Word Perfect, it is not a problem to save an object of this type in ASCII text or Rich Text Format. In specific environments, it is even possible to send the file to a virtual printer, thereby producing a PDF as a migration output. Such manual migration tasks typically involve human interaction, which may be feasible for a small number of objects, but not for larger batches of files. We propose a novel approach using a software-operated VNC abstraction layer in order to replace humans with machine interaction. Emulators or virtualization tools equipped with a VNC interface are very well suited for this approach. But screen, keyboard and mouse interaction is just part of the setup. Furthermore, digital objects need to be transferred into the original environment in order to be extracted after processing. Nevertheless, the complexity of the new generation of migration services is quickly rising; a preservation workflow now comprises not only the migration tool itself, but a complete software and virtual hardware stack with recorded workflows linked to every supported migration scenario. Thus the requirements of OAIS management must include proper software archiving, emulator selection, system image and recording handling.
The concept of view-paths could help either to automatically determine the proper pre-configured virtual environment or to set up system images for certain migration workflows. View-paths may rise in demand, as the generation of PDF output files from Word Perfect input could be cached as pre-fabricated emulator system images. The current groundwork provides several possible optimizations, such as using the automation features of the original environments.
Book
Full-text available
This selective bibliography presents over 500 English-language articles, books, and technical reports. It covers digital curation and preservation copyright issues, digital formats (e.g., data, media, and e-journals), metadata, models and policies, national and international efforts, projects and institutional implementations, research studies, services, strategies, and digital repository concerns. Most sources have been published from 2000 through February 2011. It is under a Creative Commons Attribution License. It is also available as a website with a Google Translate link (https://tinyurl.com/24avtyuu). "This tremendous resource is. . . an excellent place to survey much of the available research on a topic related to data curation". - Julia Flanders and Trevor Muñoz. "An Introduction to Humanities Data Curation." In DH Curation Guide: A Community Resource Guide to Data Curation in the Digital Humanities, 2012.
Book
Full-text available
This bibliography presents over 650 English-language articles, books, and technical reports. It covers digital curation and preservation copyright issues, digital formats (e.g., data, media, and e-journals), metadata, models and policies, national and international efforts, projects and institutional implementations, research studies, services, strategies, and digital repository concerns. Most sources were published from 2000 through 2011. It is available as an EPUB file, a low-cost paperback, a paperback PDF file, a website with a Google Translate link, and a website PDF with live links (http://digital-scholarship.org/dcbw/dcb.htm). It is under a Creative Commons Attribution License. "Librarians and scholars who are concerned with managing digital resources and preserving them for future use will find a crash course on the subject in this bibliography. . . . This book is recommended for librarians working with original digital resources, scholars interested in digital repositories, and students in the field." - Paul M. Blobaum, Journal of the Medical Library Association 101, no. 2 (2013): 158.
Article
Emulation of Multimedia objects in Libraries (EMiL) is a web-based emulation service environment for library reading rooms that addresses the challenges of accessing large born-digital collections. In particular, we present an automated process to gather technical metadata necessary to initiate an emulation session that renders a user-chosen object as well as a technical design that allows a seamless integration into existing web-based catalogues. The primary focus is on providing access to multimedia CD-ROMs of the 1990s and 2000s, but the EMiL system is designed to be used with other digital collections as well.
Conference Paper
Preservation of complex, non-linear digital objects such as digital art or ancient computer environments has been a domain reserved for experts until now. Digital culture, however, is a broader phenomenon. With the introduction of the so-called Web 2.0, digital culture became a mass culture. New methods of content creation, publishing and cooperation lead to new cultural achievements. Therefore, novel tools and strategies are required, both for preservation and, in particular, for curation and presentation. We propose a scalable architecture suitable for creating a community-driven platform for preservation and curation of complex digital objects. Further, we provide novel means for presenting preserved results, including technical metadata, thus allowing for public review and potentially further community-induced improvements.
Article
Most of today's business processes are based solely on digital data. Input, output, and intermediate results are pure digital objects. Keeping such data accessible and meaningful, and even more importantly keeping the business processes functional, is challenging. With regard to complex electronic business processes, traditional archiving does not provide satisfactory results. Due to the fast technical life-cycle, the time gap between archiving and re-enactment or reuse of a process poses risks and increases the uncertainty about achievable results. We propose a novel strategy for bridging this specific time gap. For this we present a scalable and cost-effective infrastructure with associated workflows focusing on a process' execution context. Most importantly, the process' developers are able to assess the preserved results in a timely manner, and thus reduce the uncertainty about future re-enactment results.
Article
Full-text available
Over the past several decades, millions of digital objects of significant scientific, economic, cultural, and historic value have been published and distributed to libraries and archives on removable media. Providing long-term access to these documents, media files, and software executables is an increasingly complex task because of dependencies on aging or legacy hardware and software. This is a persistent problem for both digital libraries and long-term digital archives, where mandates to maintain and improve access can be overshadowed by ongoing technical and administrative costs associated with digital collections. There are several widely accepted techniques used by the archival community to preserve materials originally held on legacy media: bitstream preservation, migration of documents from aging formats to modern ones, and emulation for legacy executables. I demonstrate how these techniques can be combined to provide high-quality access to digital collections without compromising long-term archival processes or increasing risk. I show that most technical risk to preserving and accessing legacy born-digital documents can be effectively managed through the careful application of existing open source tools paired with some custom software. I focus on the collection of Government Printing Office documents held on legacy optical and magnetic removable media at the Indiana University Libraries. This collection contains millions of born-digital objects (documents and software) in hundreds of formats. I present a systematic approach to transferring bit-identical file-systems from legacy media to modern storage, ensuring future operation within legacy environments and supporting integrity checks and deduplication tasks. I describe reliable, high-performance techniques for automated identification, feature extraction, migration, rendering, and distribution of the documents and software contained in this collection. 
I examine methods that exemplify best practices for providing Web access to digital collections, including high-performance indexing, generation of and access to machine- and human-readable metadata, on-demand migration and rendition of legacy documents, and the construction of a "virtual file-system" to simplify navigation of the digital archive. Finally, I examine the relationship between these techniques and the development of quantifiable measures of risk for legacy digital objects. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by Telephone (800) 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]