May 2006
Designing and Supporting Science-Driven Infrastructure
Fran Berman and Reagan Moore, San Diego Supercomputer Center

1. Introduction

The 20th century brought about an “information revolution” that has forever altered the way we work, communicate, and live. In the 21st century, data is ubiquitous. Available in digital format via the web, desktop, personal device, and other venues, data collections both directly and indirectly enable a tremendous number of advances in modern science and engineering.

Today’s data collections span the spectrum in discipline, usage characteristics, size, and purpose. The life science community utilizes the continually expanding Protein Data Bank [1] as a worldwide resource for studying the structures of biological macromolecules and their relationships to sequence, function, and disease. The Panel Study of Income Dynamics (PSID) [2], a longitudinal study initiated in 1968, provides social scientists with detailed information about more than 65,000 individuals spanning as many as 36 years of their lives. The National Virtual Observatory [3] is providing an unprecedented resource for aggregating and integrating data from a wide variety of astronomical catalogs, observation logs, image archives, and other resources for astronomers and the general public. Such collections have broad impact, are used by tens of thousands of individuals on a regular basis, and constitute critical and valuable community resources.

However, the collection, management, distribution, and preservation of such digital resources do not come without cost. Curation of digital data requires real support in the form of hardware infrastructure, software infrastructure, expertise, human infrastructure, and funding. In this article, we look beyond digital data to its supporting infrastructure, and provide a holistic view of the software, hardware, human infrastructure, and costs required to support modern data-oriented applications in research, education, and practice.

Reference this article
Berman, F., Moore, R. "Designing and Supporting Data Management and Preservation Infrastructure," CTWatch Quarterly, Volume 2, Number 2, May 2006. http://www.ctwatch.org/quarterly/articles/2006/05/designing-and-supporting-data-management-and-preservation-infrastructure/
