The California Digital Library supports the assembly and creative use of the world's scholarship and knowledge for the University of California libraries and the communities they serve.
In addition, the CDL provides tools that support the construction of online information services for research, teaching, and learning, including services that enable the UC libraries to effectively share their materials and provide greater access to digital content.
Integrating Metadata Standards to Support Long-Term Preservation of Digital Assets: Developing Best Practices for Expressing Preservation Metadata in a Container Format
This paper explores the purpose and development of best practice guidelines for the use of preservation metadata, as detailed in the PREMIS Data Dictionary for Preservation Metadata, within documents conforming to the Metadata Encoding and Transmission Standard (METS). METS is an XML schema that provides a container format integrating various forms of metadata with digital objects or links to digital objects. Because METS is flexible enough to serve many different functions within digital systems and to support many different metadata structures, integration guidelines will facilitate common practices among institutions. There is a constant tension between tighter control over the METS package to support object exchange and each implementation's unique preservation metadata requirements, given the different contexts and implementation models among PREMIS implementers. The PREMIS in METS Guidelines serve primarily as a standard for submission and dissemination information packages. This paper details the issues encountered in using the standards together, and how the METS document changes as events pertaining to the lifecycle of digital assets are recorded for future preservation purposes. The guidelines have enabled the implementation of an exchange format and of creation and validation tools based on the PREMIS in METS guidelines.
- 1 supplemental PDF
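As an illustration of the container idea described in the abstract above, the following is a minimal Python sketch of a METS document that wraps a PREMIS object description and then grows as lifecycle events are recorded. It is not taken from the paper or its guidelines. The function names (`new_package`, `record_event`, `wrap_premis`), the ID values, and the choice of the PREMIS 2.x namespace are assumptions made for illustration, and the output is deliberately incomplete: it omits elements a schema-valid document would need, such as the METS `structMap` and the PREMIS `eventIdentifier`.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
PREMIS_NS = "info:lc/xmlns/premis-v2"  # assumption: PREMIS 2.x namespace

ET.register_namespace("mets", METS_NS)
ET.register_namespace("premis", PREMIS_NS)


def q(ns, tag):
    """Return a namespace-qualified tag name in ElementTree notation."""
    return f"{{{ns}}}{tag}"


def wrap_premis(amd_sec, section_tag, section_id, mdtype):
    """Add a METS metadata section with an mdWrap/xmlData wrapper (hypothetical helper)."""
    section = ET.SubElement(amd_sec, q(METS_NS, section_tag), {"ID": section_id})
    md_wrap = ET.SubElement(section, q(METS_NS, "mdWrap"), {"MDTYPE": mdtype})
    return ET.SubElement(md_wrap, q(METS_NS, "xmlData"))


def new_package(object_id):
    """Create a METS root whose techMD section holds one PREMIS object."""
    mets = ET.Element(q(METS_NS, "mets"))
    amd_sec = ET.SubElement(mets, q(METS_NS, "amdSec"), {"ID": "AMD1"})
    xml_data = wrap_premis(amd_sec, "techMD", "TECH1", "PREMIS:OBJECT")
    obj = ET.SubElement(xml_data, q(PREMIS_NS, "object"))
    ident = ET.SubElement(obj, q(PREMIS_NS, "objectIdentifier"))
    ET.SubElement(ident, q(PREMIS_NS, "objectIdentifierType")).text = "local"
    ET.SubElement(ident, q(PREMIS_NS, "objectIdentifierValue")).text = object_id
    return mets


def record_event(mets, event_type, date_time):
    """Append a digiprovMD section holding one PREMIS event to the package."""
    amd_sec = mets.find(q(METS_NS, "amdSec"))
    number = len(amd_sec.findall(q(METS_NS, "digiprovMD"))) + 1
    xml_data = wrap_premis(amd_sec, "digiprovMD", f"EVT{number}", "PREMIS:EVENT")
    event = ET.SubElement(xml_data, q(PREMIS_NS, "event"))
    ET.SubElement(event, q(PREMIS_NS, "eventType")).text = event_type
    ET.SubElement(event, q(PREMIS_NS, "eventDateTime")).text = date_time
    return event


if __name__ == "__main__":
    package = new_package("obj-001")  # hypothetical identifier
    record_event(package, "ingestion", "2009-01-01T00:00:00Z")
    record_event(package, "fixity check", "2009-02-01T00:00:00Z")
    print(ET.tostring(package, encoding="unicode"))
```

Each call to `record_event` adds one more `digiprovMD` section, which is one simple way to picture how a METS document changes over an object's lifecycle. An actual implementation would need to follow the PREMIS in METS Guidelines and validate against both schemas.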
The Chronopolis Digital Preservation Initiative, one of the Library of Congress's latest efforts to collect and preserve at-risk digital information, has completed its first year of service as a multi-member partnership to meet the archival needs of a wide range of cultural and social domains. In this paper we will explore the major themes within Chronopolis.
- 1 supplemental PDF
The term "Significant Properties" has been given a variety of definitions and used in various ways over the past several years. The relationship between Significant Properties and the OAIS term Representation Information has been a puzzle. This paper proposes a definition of Significant Properties that clarifies this relationship and indicates how the concept can be used in a coherent way. We believe that this approach is consistent with the actual use of the concept and does not invalidate previous work, but rather provides a clear and consistent view of the concept. It also links together Authenticity and Provenance, which are also key concepts in digital preservation.
- 1 supplemental PDF
The recent transition of US presidential administrations has raised awareness and concern regarding the continuity of access to federal research data. These data are part of the vital public record of federally funded research, and their continued availability is critically important to scientific integrity and advancement, governmental accountability, and informed public policy. The data.gov portal was created in 2009 as a central repository of government research data, and currently hosts over 135,000 datasets. This information is, according to the 2013 federal open data policy, “a valuable national resource and a strategic asset to the Federal Government, its partners, and the public.” As such, it is imperative that these data be subject to effective long-term stewardship. Best practice within the preservation community calls for redundancy, at both a technical and an organizational level, as a primary strategy for higher preservation assurance. Consequently, California Digital Library (CDL) and Code for Science & Society (CSS) collaborated with the data.gov development team on datamirror.org, a full dynamic mirror of data.gov. datamirror.org holds descriptive metadata and links to the dataset copies of record on federal agency websites, as well as alternative links to local datamirror-managed replicas (41 TB) and, soon, to other known copies that may emerge through the efforts of the national data rescue movement, in which CDL and CSS are active participants. While instigated by recent political events, the stewardship provided by datamirror.org is merely an expression of prudent research data management that is clearly called for to ensure permanent access to the nation’s rich digital patrimony.
- 1 supplemental file
A presentation to staff of the California Digital Library, providing an update on Cobweb development progress as of January 2018.
Digital critical editions hold the promise of supporting new scholarly research activities not previously possible or practical with print critical editions. This promise resides in the specific ability to integrate corpora, their associated editorial material and other related content into system architectures and data structures that exploit the strengths of the digital publishing environment. The challenge is not simply to create an online copy of the print publication, but rather to provide the kind of resource that both eases and extends the research activities of scholars. Authoritative collections published online in this manner, and with the same rigor brought to the print publishing process, offer scholars: the ability to discover more elusive, granular pieces of information with greater facility; tighter, more obvious and more accessible connections between authoritative versions of texts, editorial matter and primary source material; and continually corrected and expanded "editions," no longer dependent upon the print lifecycle. This paper will explore these benefits and others as they are instantiated in the recently released Mark Twain Papers Online (MTPO) (http://www.marktwainproject.org), created and published as a joint project of the Mark Twain Papers & Project at The Bancroft Library of UC Berkeley (the Papers), the University of California Press (UC Press), and the California Digital Library of the University of California (CDL). The current release of MTPO comprises more than twenty-three hundred letters written between 1853 and 1880; over twenty-eight thousand records of other letters with text not held by the Papers; and nearly one hundred facsimiles. It also makes available the many decades of archival research by the editors at the Papers.
Of particular focus in this discussion will be several key features of the system which, despite the many challenges they presented in development, were felt to be essential pieces of a digital publication that could support scholarship in new and significant ways. Those features include facets, which create intellectual structure and support serendipity; advanced search, which provides a means for researchers to apply their own analytical frameworks; citation support functionality, which serves to secure and record the outcomes of research exploration; and complex displays of individual letters, which allow detailed inspection by collocating the pieces of the authoritative object. These features together maintain the integrity and stability of the collection, while concurrently allowing for fluidity in the continued expansion of the material. In this way, MTPO hopes to succeed as a digital critical edition that will support and extend the research activities of scholars.