Preservation, preservation, preservation: scholarly journal archiving initiatives
If you've attended a conference, followed a discussion list or read an information industry publication of late, you can't fail to have picked up on the increasing interest in all things archival. A number of scholarly journal archiving initiatives have been launched over the last couple of years, with many publisher and library participants. Here we look at some of the options, and ask why preservation is such a big thing all of a sudden.
Why archive?
The shift in scholarly publishing (and thus in journal collections) from print to electronic raises two particular concerns that can be addressed by archiving strategies: the fragility of online content, and the licensing terms under which libraries may access the content in perpetuity (given that they no longer possess their own paper copy). Even when a publisher licence permits permanent access to content, this cannot be guaranteed: the publisher may sell the journal, or cease to operate. The need to provide a long-term safety net, to ensure that the scholarly record is not lost and remains accessible long into the future, is driving the current enthusiasm for archiving initiatives. This is reflected in the Digital Library Federation's September 2005 statement, Urgent Action Needed to Preserve Scholarly Electronic Journals [1], which has also been publicly endorsed by the International Coalition of Library Consortia (ICOLC) and the Association of Research Libraries (amongst many others).
What archiving organisations are out there?
Several. The best known are:
Portico :: www.portico.org
A not-for-profit organisation, Portico started life as part of JSTOR - which also, confusingly, describes itself as a scholarly journal archive. JSTOR's archive, however, is about providing immediate (short-term) access, rather than long-term preservation. Portico aims to ensure the continuing availability of electronic scholarly literature. It is funded in part by charitable foundations and government agencies, and in part by the main beneficiaries - publishers and libraries. On loading, Portico normalises content into a standard archival format, with the aim of preserving the content rather than its original format. It takes ongoing responsibility for ensuring the content remains usable; its primary method of preservation is migration, which it describes as "transitioning content from one file format to another as technology changes and as file formats become obsolete." [2] Portico's library partners can only access content in the archive when one of a number of specific trigger events occurs, such as a publisher ceasing to operate. Annual publisher fees ("archive contributions") range from $250 to $75,000 (depending on annual journals turnover) for a completely outsourced archiving solution.
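The idea of migration can be illustrated with a minimal Python sketch. It is not Portico's implementation: the ArchivedItem type, the migrate_archive function and the format names are all hypothetical, and the "converter" is a stand-in for whatever real conversion tool an archive would use. The sketch simply moves any item held in a format flagged as obsolete into its designated successor format, and leaves everything else untouched.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class ArchivedItem:
    """Hypothetical record for one archived file."""
    identifier: str
    file_format: str
    data: bytes


# Maps an obsolete format name to (successor format, conversion function).
ConverterTable = Dict[str, Tuple[str, Callable[[bytes], bytes]]]


def migrate_archive(items: List[ArchivedItem],
                    converters: ConverterTable) -> List[ArchivedItem]:
    """Return a new list in which items in obsolete formats are converted."""
    migrated = []
    for item in items:
        if item.file_format in converters:
            target_format, convert = converters[item.file_format]
            migrated.append(
                ArchivedItem(item.identifier, target_format, convert(item.data))
            )
        else:
            # Format is still current: keep the item unchanged.
            migrated.append(item)
    return migrated


# Illustrative use with made-up format names and a placeholder converter.
converters = {"old-format": ("new-format", lambda data: data.upper())}
archive = [
    ArchivedItem("article-1", "old-format", b"abc"),
    ArchivedItem("article-2", "new-format", b"def"),
]
result = migrate_archive(archive, converters)
```

Returning a new list rather than overwriting in place reflects a cautious design choice in this sketch: the original files remain available until the migrated versions have been checked.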
LOCKSS :: www.lockss.org
Instead of providing a centralised archive, LOCKSS provides open-source software to enable libraries to build and maintain localised digital archives, in order to protect their own community's access to the content within. Libraries collect content that they subscribe to from participating publishers' sites, using a specially-developed web crawler, and store it in the format provided by the publisher. Where multiple libraries are licensed to preserve the same content, their LOCKSS servers cooperate to ensure any corrupt or lost content is repaired and the overall record is protected intact - hence the expansion of the LOCKSS acronym, "Lots Of Copies Keeps Stuff Safe". Once again, access to LOCKSS content is only enabled following specific trigger events. There are no upfront charges to publishers (although libraries are required to pay a membership fee); costs may include providing a LOCKSS-friendly interface and creating a plug-in (which libraries can use to tailor their LOCKSS system so that it interacts effectively with the publisher's system).
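The cooperative repair described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the LOCKSS polling protocol itself, which is considerably more sophisticated; the function names and the simple majority rule are assumptions made for the example. Each library's copy is hashed, the hash held by a strict majority of all participants is taken as correct, and any damaged or missing copy is replaced with a good one.

```python
import hashlib
from collections import Counter
from typing import Dict, Optional


def digest(content: bytes) -> str:
    """Fingerprint a copy so that copies can be compared cheaply."""
    return hashlib.sha256(content).hexdigest()


def repair_by_majority(copies: Dict[str, Optional[bytes]]) -> Dict[str, bytes]:
    """copies maps a library name to its stored bytes (None if the copy is lost).

    Returns a mapping in which every library holds the majority version.
    Raises ValueError if no version is held by more than half the libraries.
    """
    votes = Counter(digest(c) for c in copies.values() if c is not None)
    if not votes:
        raise ValueError("no surviving copies")
    winning_digest, count = votes.most_common(1)[0]
    if count <= len(copies) / 2:
        raise ValueError("no majority agreement; cannot repair safely")
    good_copy = next(
        c for c in copies.values()
        if c is not None and digest(c) == winning_digest
    )
    return {library: good_copy for library in copies}


# Illustrative use: five hypothetical libraries, one corrupt copy, one lost.
copies = {
    "library-a": b"article text",
    "library-b": b"article text",
    "library-c": b"article text",
    "library-d": b"artXcle text",   # corrupted
    "library-e": None,              # lost
}
repaired = repair_by_majority(copies)
```

The more libraries that hold a given title, the more damage the group can absorb before agreement is lost, which is the point behind "Lots Of Copies".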
CLOCKSS (Controlled LOCKSS) :: www.lockss.org/clockss
CLOCKSS is an instance of the LOCKSS model, whereby a subset of LOCKSS library and publisher participants are cooperating to create and maintain a comprehensive "dark archive" - including content not subscribed to by the libraries collecting it - that may be used by the global research community following a trigger event.
National Libraries
The KB (Koninklijke Bibliotheek, or National Library of the Netherlands; www.kb.nl) was one of the first off the mark; it began setting up preservation agreements with key Dutch publishers such as Elsevier and Kluwer Academic Publishers back in 1996, and has since extended this to include most of the major scholarly publishers. The National Library of Australia's (www.nla.gov.au) PANDORA project aims to preserve Australian-published e-resources, whilst the Swedish National Library's (www.kb.se) Kulturarw3 project expects to preserve electronic journals amongst (perhaps optimistically) all web content originating in Sweden. The British Library (www.bl.uk) is piloting Legal Deposit for e-journals; legislation is also in place, or in planning, for legal deposit of e-journals in Canada, Denmark, New Zealand, Norway and South Africa [4]. The Library of Congress (www.loc.gov) has supported the development of Portico (above) to meet its preservation needs for e-journals.
Who's doing it?
A large number of publishers and libraries are beginning to get involved in archiving activities. For example, organisations participating in CLOCKSS include the Universities of Edinburgh, Indiana and Stanford, and major publishers including Blackwell, Oxford University Press, Springer, Taylor and Francis, John Wiley & Sons and Elsevier; most of these are also involved in Portico.
How does Ingenta fit in?
Ingenta's model is designed to provide access to scholarly content, rather than preserve it. Whilst our distributed networks ensure that your content is safe in the short term, our service (like that of other hosting and technology partners) does not include a commitment to the format migration that is likely to be necessary for long-term preservation. Rather than set up yet another competing archiving service, we aim to facilitate participation in existing initiatives; for example, we are actively participating in the British Library's legal deposit pilot on behalf of a number of publishers, we supply content to the National Library of the Netherlands for others, and our interface is LOCKSS-friendly.
Conclusions?
Fortunately for publishers, much of the preservation burden lies with libraries. Nonetheless, it is wise for publishers to participate in, or at least make content and licensing compliant with, one of the many archival initiatives, to ensure that they are able to meet their library customers' long-term access needs. It should be remembered that archiving is primarily about preservation; whilst that may equate to access in the long term, archival licensing terms can be modelled in such a way that archival compliance is not a threat to current licensing revenues.
Where can I find out more?
- Fenton, Eileen. 'Preserving Electronic Scholarly Journals: Portico'. April 2006. Ariadne, issue 47. http://www.ariadne.ac.uk/issue47/fenton/
- Hodge, Gail and Frangakis, Evelyn. 'Digital Preservation and Permanent Access to Scientific Information: the State of the Practice'. February 2004 (revised April 2004). A Report Sponsored by The International Council for Scientific and Technical Information (ICSTI) and CENDI (US Federal Information Managers Group). http://www.icsti.org/digitalarchiving.php
- Machovec, George S. 'E-Journal Archives and Preservation: An Executive Overview'. April 2006. The Charleston Advisor, volume 7, issue 4. http://www.charlestonco.com/features.cfm?id=205&type=ed
- Waters, Donald J., et al. 'Urgent Action Needed to Preserve Scholarly Electronic Journals'. September 2005. http://www.diglib.org/pubs/waters051015.htm
Footnotes
1. Waters 2005.
2. http://www.portico.org/about/approach.html
3. Actually, that's not a footnote; the project's correct title is Kulturarw3.
4. Hodge and Frangakis 2004.