
Core Principles and Activities

The term “digital curation” was first used in the e-science and biological science fields to distinguish the broader suite of activities that library and museum curators ordinarily employ to add value to their collections and enable their reuse[6][7][8] from the smaller subtask of simply preserving the data, a significantly narrower archival task.[6] Additionally, the historical understanding of the term “curator” demands more than simple care of the collection. A curator is expected to command academic mastery of the subject matter as a requisite part of appraising and selecting assets and of subsequently adding value to the collection through the application of metadata.[6]

Principles

There are five commonly accepted principles that govern the occupation of digital curation:

1. Manage the complete birth-to-retirement life cycle of the digital asset.[4]
2. Evaluate and cull assets for inclusion in the collection.[4]
3. Apply preservation methods to strengthen the asset’s integrity and reusability for future users.[4]
4. Act proactively throughout the asset life cycle to add value to both the digital asset and the collection.[4]
5. Facilitate the appropriate degree of access to users.[4]

Methodology

The Digital Curation Centre offers the following step-by-step life cycle procedures for putting the above principles into practice:[9]

Conceptualize: Consider what digital material you will be creating and develop storage options. Take into account websites, publications, and email, among other types of digital output.[9]
Create: Produce digital material and attach all relevant metadata. Typically, the more metadata, the more accessible the information.[9]
Access and use: Determine the level of accessibility for the range of digital material created. Some material may be accessible only by password, and other material may be freely accessible to the public.[9]
Appraise and select: Consult the mission statement of the institution or private collection and determine what digital data is relevant. There may also be legal guidelines in place that will guide the decision process for a particular collection.[9]
Dispose: Discard any digital material that is not deemed necessary to the institution.[9]
Ingest: Send digital material to the predetermined storage solution. This may be an archive, repository, or other facility.[9]
Preservation action: Employ measures to maintain the integrity of the digital material.[9]
Reappraise: Reevaluate material to ensure that it is still relevant and is true to its original form.[9]
Store: Secure data within the predetermined storage facility.[9]
Access and reuse: Routinely check that material is still accessible for the intended audience and that the material has not been compromised through multiple uses.[9]
Transform: If desirable or necessary, the material may be transferred into a different digital format.[9]
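The "Preservation action" and "Reappraise" steps above do not name a specific technique. One widely used measure, offered here only as an illustrative assumption, is a fixity check: a checksum is recorded for each file at ingest and recomputed later to detect silent changes. The following Python sketch shows the idea; the function names and the choice of SHA-256 are hypothetical and are not taken from the Digital Curation Centre's guidance.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_fixity(paths):
    """Build a manifest mapping each file path (as a string) to its digest at ingest."""
    return {str(p): sha256_of(Path(p)) for p in paths}


def verify_fixity(manifest):
    """Return the paths that are missing or whose digest no longer matches the manifest."""
    failed = []
    for name, expected in manifest.items():
        path = Path(name)
        if not path.is_file() or sha256_of(path) != expected:
            failed.append(name)
    return failed
```

In such a scheme, record_fixity would run once at ingest and verify_fixity on a schedule; a non-empty result would prompt restoring the affected files from another copy.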

Related terms

The term "digital curation" is sometimes used interchangeably with terms such as "digital preservation" and "digital archiving". While digital preservation does focus a significant degree of energy on optimizing reusability, preservation remains a subtask of digital archiving, which is in turn a subtask of digital curation.[6][10] For example, archiving is a part of curation, but so are subsequent tasks such as themed collection-building, which is not considered an archival task. Similarly, preservation is a part of archiving, but so are the important tasks of selection and appraisal, which are not necessarily part of preservation.[10]

Data curation is another term that is often used interchangeably with digital curation; however, common usage of the two terms differs. While “data” is a more all-encompassing term that can be used generally to indicate anything recorded in binary form, the term “data curation” is most common in scientific parlance and usually refers to accumulating and managing information relating to the process of research.[11] So, while documents and other discrete digital assets are technically a subset of the broader concept of data,[6] in the context of scientific vernacular digital curation represents a broader purview of responsibilities than data curation because of its interest in preserving and adding value to digital assets of any kind.[7]

Challenges

Rate of creation of new data and data sets

The ever-falling cost and increasing prevalence of entirely new categories of technology have led to a rapidly growing flow of new data sets.[12] These come from well-established sources such as business and government, but the trend is also driven by new styles of sensors becoming embedded in more areas of modern life.[7] This is particularly true of consumers, whose production of digital assets is no longer confined to work. Consumers now create wider ranges of digital assets, including videos, photos, location data, purchases, and fitness tracking data, and share them on a wider range of social platforms.[7]

Additionally, the advance of technology has introduced new ways of working with data. Examples include international partnerships that leverage astronomical data to create “virtual observatories”; similar partnerships have leveraged data resulting from research at the Large Hadron Collider at CERN and the database of protein structures at the Protein Data Bank.[8]

Storage format evolution and obsolescence

Compared with digital preservation, the archiving of analog assets is notably passive in nature, often limited to simply ensuring a suitable storage environment. Digital preservation requires a more proactive approach.[13] Today’s artifacts of cultural significance are notably transient in nature and prone to obsolescence when social trends or dependent technologies change.[7] This rapid progression of technology occasionally makes it necessary to migrate digital asset holdings from one file format to another in order to mitigate the dangers of hardware and software obsolescence, which would render the asset unusable.[9]

Underestimation of human labor costs

Modern tools for program planning often underestimate the human labor costs required for adequate digital curation of large collections. As a result, cost-benefit assessments often paint an inaccurate picture of both the amount of work involved and the true cost to the institution of both successful outcomes and failures.[7]

Standardization and coordination between institutions

An absence of coordination across different sectors of society and industry, both in areas such as the standardization of semantic and ontological definitions[14] and in forming partnerships for proper stewardship of assets, has resulted in a lack of interoperability between institutions and a partial breakdown in digital curation practice from the standpoint of the ordinary user.[7]

Digitization of analog materials

The curation of digital objects is not limited to strictly born-digital assets. Many institutions have engaged in monumental efforts to digitize analog holdings in order to increase access to their collections. Examples of these materials include books, photographs, maps, and audio recordings.[7] The process of converting printed resources into digital collections is exemplified to some degree by the work of librarians and related specialists. For example, the Digital Curation Centre, described as a "world leading centre of expertise in digital information curation",[15] assists higher education research institutions in such conversions.

New representational formats

For some topics, knowledge is embodied in forms that have not been conducive to print, such as the choreography of dance or the motions of skilled workers or artisans, which are difficult to encode. New digital approaches such as 3D holograms and other computer-programmed expressions are developing.[citation needed]

For mathematics, it seems possible for a new common language to be developed that would express mathematical ideas in ways that can be digitally stored, linked, and made accessible. The Global Digital Mathematics Library is a project to define and develop such a language.[citation needed]

Accessibility

The ability of the intended user community to access the repository’s holdings is as important as all the preceding curatorial tasks. Providing access must take into account not only the user community’s format and communication preferences but also communities that should not have access for legal or privacy reasons.[16]

Responses to challenges

Specialized research institutions[17][18]
Academic courses
Dedicated symposia[19][20]
Peer-reviewed technical and industry journals[21]

Approaches

Many approaches to digital curation exist, and they have evolved over time in response to the changing technological landscape. Two examples are sheer curation[6] and channelization[citation needed].

Sheer curation is an approach to digital curation in which curation activities are quietly integrated into the normal workflow of those creating and managing data and other digital assets. The word sheer is used to emphasize the lightweight and virtually transparent nature of these curation activities. The term sheer curation was coined by Alistair Miles in the ImageStore project[22] and the UK Digital Curation Centre's SCARP project.[23] The approach depends on curators having close contact with, or 'immersion' in, data creators' working practices. An example is the case study of a neuroimaging research group by Whyte et al., which explored ways of building its digital curation capacity around the apprenticeship style of learning of neuroimaging researchers, through which they share access to datasets and re-use experimental procedures.[24]

Sheer curation depends on the hypothesis that good data and digital asset management at the point of creation and primary use is also good practice in preparation for sharing, publication, and/or long-term preservation of these assets. Therefore, sheer curation attempts to identify and promote tools and good practices in local data and digital asset management in specific domains, where those tools and practices add immediate value to the creators and primary users of those assets. Curation can best be supported by identifying existing practices of sharing, stewardship, and re-use that add value, and augmenting them in ways that both have short-term benefits and, in the longer term, reduce risks to digital assets or provide new opportunities to sustain their long-term accessibility and re-use value.[citation needed]

The aim of sheer curation is to establish a solid foundation for other curation activities that may not directly benefit the creators and primary users of digital assets, especially those required to ensure long-term preservation. By providing this foundation, further curation activities may be carried out by specialists at appropriate institutional and organisational levels, whilst causing the minimum of interference to others.[citation needed]

A similar idea is curation at source, used in the context of Laboratory Information Management Systems (LIMS). This refers more specifically to automatic recording of metadata or information about data at the point of capture, and has been developed to apply semantic web techniques to integrate laboratory instrumentation and documentation systems.[25]

Sheer curation and curation at source can be contrasted with post hoc digital preservation, where a project is initiated to preserve a collection of digital assets that have already been created and are beyond the period of their primary use.[citation needed]

Channelization is curation of digital assets on the web, often by brands and media companies, into continuous flows of content, turning the user experience from a lean-forward interactive medium into a lean-back passive medium. The curation of content can be done by an independent third party that selects media from any number of on-demand outlets across the globe and adds them to a playlist to offer a digital "channel" dedicated to certain subjects, themes, or interests, so that the end user sees and/or hears a continuous stream of content.[citation needed]
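The curation-at-source idea described above, in which metadata is recorded automatically at the point of capture, can be illustrated with a minimal Python sketch. It is a deliberate simplification: it writes a plain JSON sidecar file rather than using the semantic web techniques mentioned in the cited work, and the function name and the "instrument" and "operator" fields are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def capture_with_metadata(data: bytes, target: Path, instrument: str, operator: str) -> Path:
    """Write captured data and a JSON sidecar describing it in the same step."""
    target.write_bytes(data)
    metadata = {
        "file": target.name,
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "instrument": instrument,  # hypothetical descriptive field
        "operator": operator,      # hypothetical descriptive field
    }
    # The sidecar sits next to the data file, e.g. scan01.dat.meta.json
    sidecar = target.with_name(target.name + ".meta.json")
    sidecar.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return sidecar
```

Because the metadata is produced by the same call that stores the data, the creator does no extra work, which is the property that sheer curation and curation at source both aim for.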

See also

Digital preservation
Data curation
Digital asset management
Data format management
Digital artifactual value
Digital obsolescence
Curator
Biocurator

External links

Animations introducing digital preservation and curation
Digital Curation Centre
Digital Curation and Trusted Repositories: Steps Toward Success
DigCurV: A project funded by the European Commission to establish a curriculum framework for vocational training in digital curation.

