Compound information objects are becoming the norm rather than the exception in the new scholarly communication environment. As a result, it is essential to augment the existing technical communication infrastructure with an interoperable approach that allows using, re-using, referencing, and discovering them across the borders of scholarly disciplines and applications. The international OAI-ORE effort works towards a solution that fully leverages the web architecture and that consists of publishing Resource Maps that describe compound objects, referencing resources in their compound object context, and mechanisms to facilitate discovery of Resource Maps.
Although OAI-ORE has made significant conceptual progress since it started in September 2006, important questions remain unanswered. How will the solution deal with versioning? How can the trustworthiness of Resource Maps be assessed? Which kinds of relationship types should OAI-ORE define to support bootstrapping adoption, and which should be left to individual communities? Which technologies should be used to represent Resource Maps, and how does a choice affect potential adoption? Some of these questions will receive at least a preliminary answer by the end of September 2007, which is the deadline that OAI-ORE has set itself for the release of a public alpha specification. Following that release, OAI-ORE will encourage experimentation by various scholarly communities and solicit feedback from potential stakeholders worldwide. The insights gained from those activities will be taken into account for a version 1 specification that is planned for September 2008.
In May 2007, the Digital Library Research & Prototyping Team of the Los Alamos National Laboratory launched an experiment to explore Resource Map publishing as a means to expose compound object boundary-type information to the web. More specifically, the experiment explored whether an existing web application could take advantage of published Resource Maps without any modification to the application itself. The experiment pertained to archiving compound information objects as they evolve over time. The applications used were the Internet Archive’s Heritrix toolkit, which contains a web crawler, and its Wayback Machine user interface.
The experiment’s optimistic scenario assumes that Resource Map publishing has become so commonplace that the Internet Archive starts to actively collect Resource Maps. The experiment zooms in on two publishers that make Resource Maps discoverable via dedicated Sitemaps. When a Resource Map listed in a Sitemap changes, its associated Sitemap date-time is updated. When a new Resource Map is published, it is added to the Sitemap. The Internet Archive uses these Sitemaps and the date-times they contain as a trigger to collect and archive Resource Maps as well as the resources they reference. As a result, the Wayback Machine allows a user to search for a specific Resource Map as of a specific date and to immediately see the versions of the resources referenced by that Resource Map as they existed on that same date. Because Resource Maps expose the boundaries of compound objects, the net result is in effect an archive of evolving compound objects, versioned by the date-time of the Resource Map that describes them.
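The Sitemap-driven trigger described above can be illustrated with a minimal Python sketch. It is not the code used in the experiment and does not use Heritrix. It only shows how a crawler might compare the date-times in a Sitemap against its own record of earlier collections to decide which Resource Maps to collect again. The function names, the example URLs, and the `last_collected` record are assumptions made for illustration; the only external convention relied on is the standard Sitemap XML namespace with its `loc` and `lastmod` elements.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace defined by the Sitemap protocol (sitemaps.org).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical Sitemap in which every <loc> is the URL of a Resource Map.
EXAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://publisher-a.example.org/rem/1</loc>
    <lastmod>2007-05-10T08:00:00+00:00</lastmod>
  </url>
  <url>
    <loc>http://publisher-a.example.org/rem/2</loc>
    <lastmod>2007-05-20T09:30:00+00:00</lastmod>
  </url>
  <url>
    <loc>http://publisher-a.example.org/rem/3</loc>
    <lastmod>2007-05-21T12:00:00+00:00</lastmod>
  </url>
</urlset>
""".strip()


def parse_sitemap(xml_text):
    """Return a dict mapping each Resource Map URL to its Sitemap date-time."""
    root = ET.fromstring(xml_text)
    entries = {}
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", namespaces=SITEMAP_NS).strip()
        entries[loc] = datetime.fromisoformat(lastmod)
    return entries


def maps_to_collect(sitemap_entries, last_collected):
    """Select Resource Maps that are new, or changed since the last collection.

    last_collected is a hypothetical crawler-side record mapping a
    Resource Map URL to the date-time at which it was last archived.
    """
    due = []
    for loc, lastmod in sitemap_entries.items():
        seen = last_collected.get(loc)
        if seen is None or lastmod > seen:
            due.append(loc)
    return sorted(due)


if __name__ == "__main__":
    # Assume an earlier crawl on 12 May 2007 archived Resource Maps 1 and 2.
    earlier_crawl = datetime(2007, 5, 12, tzinfo=timezone.utc)
    last_collected = {
        "http://publisher-a.example.org/rem/1": earlier_crawl,
        "http://publisher-a.example.org/rem/2": earlier_crawl,
    }
    for loc in maps_to_collect(parse_sitemap(EXAMPLE_SITEMAP), last_collected):
        print("collect:", loc)
```

In this sketch, Resource Map 2 is selected because its Sitemap date-time is later than the previous collection, and Resource Map 3 is selected because it has never been collected. A real crawler would then fetch each selected Resource Map together with the resources it references, so that the archive holds a consistent snapshot for that date-time.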
OAI-ORE is supported by the Andrew W. Mellon Foundation, the Coalition for Networked Information, Microsoft, and the National Science Foundation (IIS-0430906).