Friday, September 24, 2010

Digitising the archives: the Wellcome Library approach

Like most research libraries and archives repositories, the Wellcome Library is currently planning to digitise quantities of its unique holdings and provide remote access to the digitised content over the Web. Among the many challenges that such plans present, perhaps the most fundamental is deciding what to digitise, or where to start: with almost limitless potential in the holdings but limited resources, what do we prioritise?

Some institutions have chosen to select their most popular collections, others those for which they can obtain commercial funding (which are often the same of course). The Wellcome Library has opted for a thematic approach: we aim to digitise a substantial proportion of our holdings by looking at various broad subject areas and creating integrated online resources to support research and discovery in those fields. Since digitisation and the internet enable the creation of virtual online archives by providing a single point of access to widely dispersed content, we intend to explore the integration of relevant content from the holdings of other institutions into the online resources that we eventually create.

The first theme, ‘Modern Genetics and its Foundations’, will focus on the development of the science of biological inheritance from the later 19th century onwards, and the growing understanding of its role in human health and disease during the 20th century. Arguably, this will represent the fundamental meta-narrative of modern medicine: the gradual integration of genetics into the clinic. Content relevant to this theme ranges from relatively early documentation on the basic science of heredity and on the study of inherited diseases, to material on the elucidation of the molecular basis of inheritance in the mid-20th century and the subsequent development of genomics.

Preparations for developing the theme are underway: over 600 boxes of personal and institutional papers held by the Wellcome Library’s archives department will be imaged to provide the substrate or bedrock of the theme. These include:

  • the papers of Francis Crick (1916-2004), molecular biologist and Nobel Prize winner
  • the notebooks of Fred Sanger (b.1918), biochemist and double Nobel Prize winner
  • the papers of Arthur Mourant (1904-1994), haematologist and geneticist
  • the papers of Hans Grüneberg (1907-1982), geneticist
  • the records of the MRC Blood Group Unit, 1935-95.

This material will form a core of documentation on some of the most important research on the theoretical underpinnings of the biology of inheritance, and on genetics and gene sequencing in post-war Britain. To this we will add:

  • the papers of Sir Ernst Chain (1906-1979), biochemist and Nobel Prize winner
  • the papers of Norman Heatley (1911-2004), biochemist
  • the papers of Sir Peter Medawar (1915-1987), biologist and Nobel Prize winner
  • the papers of Dame Honor Fell (1900-1986), medical scientist.

Although more loosely connected with the theme, this material will help to document the contemporary scientific, intellectual and institutional context in which genetics and allied research took place.

More archival collections will be added as they become available for digitisation in future years. The selected collections will be digitised ‘cover to cover’ so their historical research potential will not be limited exclusively to questions around the given theme. We do, however, feel that the thematic approach both helps us address the issue of prioritisation in a more creative way than merely responding to perceived current user demand, and provides more potential for eventual integration of third-party content and thus the development of online virtual archives. It is in the elimination not only of geographical distance for the current researcher but also of the vagaries of historical dispersal of papers that the technologies of digitisation really come into their own.

Author: Richard Aspin, Head of Research and Scholarship, Wellcome Library

Monday, September 20, 2010

What will the Wellcome Digital Library offer?

The overall strategy of the Wellcome Digital Library is to support three activity layers aimed at different user behaviours:
  • Engage – by highlighting the range of material available in the Library, and using actively curated content to encourage visitors to investigate further;
  • Discover – allow users to investigate our holdings by searching or browsing on subject themes and names, and retrieving a mixture of actively curated and automatically generated content;
  • Research – enable users to conduct a single search which will identify all relevant material in the Library, including digitised and non-digitised holdings, and allow users to facet, select and manipulate this content as needed.
To support these activities the following IT systems will be procured and developed over the next 2 years:
  • Search and discovery – to encourage users to engage with our content;
  • Delivery – to provide access to the content;
  • Workflow – to manage all aspects of the digitisation processes;
  • Storage – to ensure that digitised content can be preserved securely;
  • Digital asset management – to manage the digital objects that are created.
Through these systems we will seek to provide our users with the ability to:
  • Find relevant materials through fast, accurate, and comprehensive search functions, including full-text search;
  • View, download and reuse content under a range of licences, including Creative Commons licences where appropriate;
  • Engage with the content through a variety of Web 2.0 and other tools that will include the ability to comment on and tag content and provide transcriptions.
Not only will the digital library be technically capable of supporting these activities, but there will be a wide range of resources on offer, with a critical mass of content from the Library's holdings. As much as possible, discrete collections will be digitised and made available in their entirety, with cover-to-cover imaging employed as standard (more soon!).

Monday, September 13, 2010

Wellcome Digital Library blog

In August 2010, the Wellcome Library announced an ambitious plan to develop a world-class digital resource for the History of Medicine. The core of this resource will be digitised content from the Library's own holdings, although funding will also be made available to others to digitise complementary collections for inclusion in the digital library.

As we move into the world of large-scale digitisation - with a short-term plan of 1 million images online in the next 2 years - a number of questions, issues and opportunities await us. We have already started tackling some of the big questions, such as:

  • What should we digitise?
  • What content is of most value to researchers?
  • What online toolset should we offer researchers?
  • How can we use the digital library to encourage learning and discovery?

And of course there are the nitty-gritty technical issues, including:

  • Logistics of digitisation and workflows.
  • In-house vs. outsourced options.
  • Metadata.
  • Long-term data management.
  • Delivery formats, speeds, and functions.

As we work through these issues, and progress with our digitisation programme, we will use our new Wellcome Digital Library blog as a real-time progress report, discussion outlet, and notification area.