Digital Library Federation Forum: 2006-11-08

I'm in Boston at the Fall DLF Forum and it has been an interesting day. I had some time before the sessions started to head out and take a long (wet) walk around town. The event is being held up near Copley Square and I decided to walk down through the Boston Public Garden, into the Common and then through downtown to Long Wharf. On the way back up to the Forum site, I walked up through the Faneuil Hall area past City Hall and then back up through the Common and the Green. I stopped off at a cafe on Newbury St. before heading in to the meeting. Boston is a beautiful town and it's true... it is quite walkable.

So, the forum...

Here are the agenda items from day 1:

"Collex: NINES in the Semantic Web." Duane Gran and Erik Hatcher, University of Virginia.

This presentation describes the technologies behind Collex and the design for producing a "semantic collections and exhibits builder for the remixable web." We will demonstrate the Collex software and describe opportunities for collaboration and for integration of the system into digital libraries. Collex is designed to be a generalizable, web-based tool for Collecting and Exhibiting digital resources described with RDF, a standard format for the semantic web. It allows users to search and annotate objects and repurpose them in annotated bibliographies, course syllabi, and illustrated essays. Collex converts the semantics of RDF into a faceted browser with auto-suggest fields to reveal full text search matches in real time. Users produce a folksonomy by tagging objects and creating collections and exhibits. Patterns of use emerge as scholars annotate and repurpose objects. The software leverages these patterns to promote knowledge discovery in our prototype database of nearly 50,000 digital objects from ten scholarly digital editions of nineteenth-century British and American literature (the NINES federation). Collex uses Lucene, via Solr, for full-text searching, faceted browsing, and more-like-this queries. Ruby on Rails supports the web tier of Collex. We have expanded on basic Dublin Core with specialized metadata suited to our initial dataset and user research interests. Emerging "Web 2.0" or "Ajax" technologies make the application immediately responsive to user actions such as search, object collection and annotation. We also provide syndicated Atom feeds for user-created tags and metadata facets. Features coming soon include drag-and-drop exhibit building, saved searches, and exposing more of Collex data to other information environments.
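
To make the faceted-browsing idea concrete for myself, here is a minimal sketch in plain Python. It is not how Collex does it (the abstract says Collex relies on Solr and Lucene); the records, field names and the `facet_counts` and `narrow` helpers are all invented for illustration.

```python
from collections import Counter

# Hypothetical records; titles, fields and values are made up for this sketch.
RECORDS = [
    {"title": "Poem A", "genre": "Poetry", "year": "1850", "tags": ["romantic"]},
    {"title": "Essay B", "genre": "Criticism", "year": "1850", "tags": ["romantic", "review"]},
    {"title": "Poem C", "genre": "Poetry", "year": "1862", "tags": []},
]


def facet_counts(records, field):
    """Count how many records carry each value of `field` (scalar or list)."""
    counts = Counter()
    for record in records:
        value = record.get(field)
        if value is None:
            continue
        values = value if isinstance(value, list) else [value]
        # set() so a record is counted once per value even if it repeats.
        counts.update(set(values))
    return counts


def narrow(records, field, wanted):
    """Keep only the records whose `field` equals or contains `wanted`."""
    kept = []
    for record in records:
        value = record.get(field)
        values = value if isinstance(value, list) else [value]
        if wanted in values:
            kept.append(record)
    return kept
```

Clicking a facet value in a browser of this kind amounts to calling `narrow` and then recomputing `facet_counts` on what is left, so the remaining choices always reflect the current result set.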

"Swimming in the Resource Pool: The USC Libraries' Gandhara Project." Todd Grappone and Zahid Rafique, University of Southern California.

The original goal of USC's Gandhara Project was to provide a search interface to all metadata regarding resources and locally produced research. This goal has largely been realized. By combining a simple locally developed XML wrapper technology with open-source software and programming labor resources, we have developed an elegant open-standards-based library information system that will be the basis for our future knowledge systems at USC. We created a system to organically store, index and represent items regardless of format or origin. In addition, by using XML and open standards we are developing a user-centered tool for designing online libraries. Using ingest harvesting and crawling technology, Gandhara creates a single search interface to the USC ILS, the USC Digital Archive, Institutional Repository, online chat reference sessions, the usc.edu web site as well as the medical library catalog. Not only does Gandhara offer new bibliographic services but it also allows for backend flexibility. In the system currently being developed, indexing is not wedded to a single metadata standard; any resource that produces XML can be indexed and searched. By focusing on developing collection-to-collection and collection-to-user data feeds, the "locked box" of an ILS is opened. This presentation will discuss the development of the system, demonstrate the current system and outline next steps.
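
The claim that "any resource that produces XML can be indexed and searched" is easier to picture with a toy example. The sketch below is my own and makes no claim about Gandhara's internals: it ignores element names entirely and builds a small inverted index from whatever text the XML contains. The document ids, sample records and the `index_xml` and `search` helpers are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def index_xml(index, doc_id, xml_text):
    """Add every word found in any element's text to the inverted index."""
    root = ET.fromstring(xml_text)
    for text in root.itertext():
        for token in re.findall(r"\w+", text.lower()):
            index[token].add(doc_id)


def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    tokens = re.findall(r"\w+", query.lower())
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Two made-up sources with unrelated schemas share one index.
index = defaultdict(set)
index_xml(index, "catalog:1",
          "<record><title>Gandhara sculpture</title><author>Smith</author></record>")
index_xml(index, "chat:7",
          "<session><q>Where are books on sculpture?</q></session>")
```

Because nothing here depends on a particular schema, a catalog record and a chat transcript end up searchable side by side, which is the point the abstract makes about not being wedded to one metadata standard.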

"SIMILE-Semantic Web browsing in DSpace." MacKenzie Smith, MIT.

Project SIMILE has created a new faceted browsing interface for the DSpace institutional repository platform that demonstrates the value of RDF and Semantic Web technology for that category of digital library systems. The new UI will be demonstrated, and its implications for DSpace and other IR platforms discussed. Since 2003, the MIT Libraries have been conducting a research project called SIMILE to demonstrate practical applications of Semantic Web technology for digital libraries, particularly in the area of metadata interoperability. The project has recently developed DWell, a faceted browsing interface for the DSpace digital repository platform based on the Longwell (general purpose) faceted browser. DWell uses OAI-PMH to extract RDF-encoded metadata from DSpace repositories, and provides new and powerful ways to explore the contents of the repository while still integrating the new features into the standard DSpace user interface via easy-to-use JavaScript widgets. DWell will be released to the DSpace community in the near future, and clearly demonstrates how RDF and Semantic Web technology can add value to digital library systems with minimal effort. The DWell UI will be demonstrated, as well as other applications of the Longwell browser. http://simile.mit.edu .
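
For anyone who has not seen OAI-PMH, a harvest is just a series of HTTP GET requests. Here is a small sketch of building a `ListRecords` request URL. The base URL is a placeholder, and the `rdf` metadata prefix used in the test is my assumption about what a repository might expose, not something the abstract states; `list_records_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    In OAI-PMH a resumptionToken is an exclusive argument, so when one is
    given it is sent alone with the verb and the other arguments are dropped.
    """
    params = {"verb": "ListRecords"}
    if resumption_token is not None:
        params["resumptionToken"] = resumption_token
    else:
        params["metadataPrefix"] = metadata_prefix
        if set_spec is not None:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)
```

A harvester would fetch the first URL, read the records and any `resumptionToken` from the XML response, and keep requesting with that token until the repository stops returning one.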

 

Web Archiving Update.

Kristine Hanna, Internet Archive; Tracy Seneca, CDL; John Tuck, British Library; Taylor Surface, OCLC; and Jennifer Marill, Library of Congress.

Web archiving services under development and deployment at a number of different institutions will enable librarians and other document selectors to extend their historic collection-building roles into the domain of web-based materials. Such services allow curators to initiate and monitor web crawls relevant to specific topic areas, analyze and annotate harvested data, and search and browse local archives built from sites that may have been harvested multiple times. A great deal has been learned in the past year about the dos and don'ts of web archiving, and the panelists will be presenting on their latest experiences.

Architectures and Collaboration. (Grand Ballroom, Main Lobby Level)

"PennTags: Social Bookmarking in an Academic Environment." Michael Winkler, University of Pennsylvania.

PennTags is a social bookmarking system that has been developed at the University of Pennsylvania for use in an academic environment. PennTags allows owners to capture links to content on the open web, much like del.icio.us and other social bookmarking tools. But PennTags can also capture content links from the Library catalog, DOI & OpenURL sources and other proprietary research-support systems that can be elusive to non-academic bookmarking systems. After capture, PennTags allows owners to enhance their bookmarks with tags and annotations that support user-driven classification, contextualization and critical analysis of these resources. Owners can organize posts into projects - logical, synthetic groupings that enhance meaning and utility of the publication. Recent changes to PennTags allow multiple project participants to add, edit and shape a project's intellectual content. Over the past year as PennTags has matured, we've observed several ways in which a tool like PennTags can be powerful in an academic setting for organizing resources, creating content and supporting communities of learners and researchers. Librarians are using it to quickly produce on-the-fly research guides that can continue to grow and self-organize. Increasingly, the Library is using PennTags as a content management system for its web presence. But more exciting is that communities of users - librarians, researchers, students - have used PennTags to create shared knowledge-bases that range from class projects in film studies, to bibliographies on copyright, to resources collected by students in Veterinary Medicine, to reference tools discovered by and for medical interns. In each case, the community used PennTags to produce this corpus of resource links, metadata and synthetic organization in a harvestable, public venue.

As the use of PennTags continues to grow, PennTags faces opportunities and challenges to support diverse and specialized needs to organize and present content, to support classroom use of digital objects and resources, and to integrate with the tools that researchers, faculty, students and librarians routinely use in their work. In this presentation, we discuss how PennTags has developed and what plans we have for further development.

"Cooperative Architecture and Cooperative Development of a Course Reserves Tool." Randy Stern and David McElroy, Harvard.

Harvard University Library and the Harvard University iCommons project have jointly developed a web-based system to automate faculty creation, library processing, and student display of course reserves reading lists. In its first year of operation, the system processed over 14,000 reserves requests for more than 900 courses in the Faculty of Arts and Sciences and this year has been extended to support 5 Harvard graduate schools. Independent library, registrar, and courseware teams worked together over a two-year period to cooperatively design, develop, pilot, and productize a unique set of SOAP services for communication of course and library data, and service requests from web-based tools running in disparate environments. Citation lists can incorporate reading lists from previous years, on-line submission, as well as legacy submissions in paper or email format. The presentation will describe the technical architecture of the solution, the management challenges of development and support in a distributed organization, and present some screen shots of the system.

"OpenURL Unleashed: Six Questions (Q6) and the OpenURL Object Model (OOM)." Jeffery A. Young, OCLC.

Forget what you think you know about OpenURL. Think of it this way instead: OpenURL is the little bit of glue that allows programmers to drop their raw business logic on a server's classpath and have it appear as a web service. This glue amounts to six questions that a three-year-old could understand: who, what, where, why, when, and how. Since any imaginable web service request can be expressed in these terms, this simpler understanding breaks down the barriers that have prevented OpenURL from realizing its full potential. Furthermore, the OpenURL Object Model (OOM) is offered as a language- and platform-independent view of an OpenURL application that allows developers to focus on two simple interfaces for transforming their business logic into web services.
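
Here is how I picture the "six questions" idea, as a rough sketch only. This is not the OOM API and the abstract does not say how the questions map onto OpenURL's own entities; the `SixQuestions` class, the `service` decorator, the `dispatch` function and the choice to route on the "why" answer are all my own assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class SixQuestions:
    """One web service request expressed as six answers (hypothetical model)."""
    who: str
    what: str
    where: str
    why: str
    when: str
    how: str


# Registry of plain functions ("business logic") keyed by the service name.
HANDLERS: Dict[str, Callable[[SixQuestions], str]] = {}


def service(name):
    """Decorator that registers a plain function as the handler for `name`."""
    def register(func):
        HANDLERS[name] = func
        return func
    return register


@service("cite")
def cite(request):
    # Business logic sees only the six answers, nothing about HTTP.
    return f"{request.what} (requested by {request.who})"


def dispatch(request):
    """Route a request to a handler; assumes `why` names the wanted service."""
    handler = HANDLERS.get(request.why)
    if handler is None:
        raise LookupError(f"no service registered for {request.why!r}")
    return handler(request)
```

The appeal, as I understood the talk, is that the function doing the real work never has to know it is being exposed as a web service; the glue layer answers the six questions and hands them over.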

Session 4: Archives and Rights. (Ballroom Foyer, Main Lobby Level)

"Archiving Katrina Web Content for Enduring Access & Research: Lessons Learned from the Deployment of Open Source Tools & Resources for the Historic Preservation of Current Events." Gordon Mohr and Kris Carpenter, Internet Archive.

On September 4, 2005, the Internet Archive began to collect, archive and make publicly accessible Web content covering Hurricane Katrina's historic landfall in the Gulf of Mexico and its immediate aftermath. The Internet Archive, the Library of Congress, a select group of universities, and many individual contributors worked together to compile a comprehensive list of websites to create this historical record of the devastation and the massive relief effort that followed. The goal was to collect content as it was generated or updated and to preserve each page for immediate viewing and for future research. The project was executed using a full suite of open source web archiving tools and end user applications, including the Heritrix web crawler, the Hadoop scheduler, Nutch search, and the NutchWAX and Wayback Machine archive file viewers.

The Katrina collection spans content generated between September 4 and November 8, 2005 and has over 61 million unique pages, all text searchable, from over 1,700 sites. The collection is hosted at http://websearch.archive.org/katrina/. This talk describes the Katrina archives, the open source tools and applications used to create and support the collection, the lessons learned, and how these open source tools and applications have evolved as the result of this community-based collaboration.

"Faculty Rights and Other Scholarly Communication Practices." Denise Troll Covey, Carnegie Mellon.

In spring 2006, Carnegie Mellon University Libraries conducted interviews with a stratified random sample of campus faculty to better understand their scholarly practices and concerns, to identify factors that influence their behavior, and to enable the Libraries to target education, tools and services. The interview data, analyzed by college, faculty track, rank on the track, gender, and age, reveal how faculty disseminate their work, how they keep current in their field, and why - without pay - they serve on editorial boards and referee articles. More importantly, they reveal faculty levels of understanding and appreciation of copyright and the open access movement. The presentation focuses on the more provocative outcomes of the study, including:

  • The influence of copyright transfer terms on faculty selection of publishers
  • Faculty understanding of their copyright transfer agreements
  • What faculty are likely to do if their rights are not clear
  • Current self-archiving practices and barriers and incentives to faculty negotiating the right to self-archive
  • Faculty concerns about open access
  • Factors likely to influence faculty choices or to provoke their resistance.

The presentation concludes with a brief description of the University Libraries' collaboration with the provost and university legal counsel to address the more compelling findings.

 
