Under the Hood: exploring Ingenta technologies
When is a scholarly journal not a scholarly journal? When it's being delivered.
End users increasingly don't distinguish between data and publications; they just want to retrieve the content which is of value to them. So the technology supporting content delivery needs to be up to the job, which is where Ingenta comes in. We're helping our publishers deploy a range of cutting-edge technologies to provide the intelligent tools your researchers require to get the most from the content you pay for. The online research experience relies on the technology behind electronic content.
What do you want to find under the hood?
What?
The IngentaConnect platform, first launched in 2004, is our flagship website and currently hosts over 10,000 e-publications and provides access to over 20,000 fax and Ariel-delivered journals. The system has been designed around industry-standard technologies and makes use of open source components. This allows us to assemble best-of-breed components created by experts in the software engineering community into a scalable and extensible system.
How?
The architecture is a multi-tier Java J2EE application running on the open source JBoss application server and the Jetty servlet engine. The system is backed by a mixture of database technologies including Oracle, PostgreSQL and MySQL. Web services conforming to the REST architectural style bring data together in real time to create a dynamic experience for end users.
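As a rough illustration of how REST-style services can be combined into a single page, the sketch below fetches two resource representations and merges them into one page model. It is written in Python purely for brevity; the production system described above is Java. The resource paths, field names and the `compose_article_page` helper are all hypothetical, and the "service" is a stub dictionary rather than a real HTTP call.

```python
import json


def compose_article_page(article_id, fetch):
    """Merge two resource representations into one page model.

    `fetch` is any callable that takes a resource path and returns JSON text.
    In a real deployment it would issue an HTTP GET; here it is injected so
    the sketch runs without a network.
    """
    article = json.loads(fetch(f"/articles/{article_id}"))
    access = json.loads(fetch(f"/articles/{article_id}/access"))
    return {
        "title": article["title"],
        "journal": article["journal"],
        "full_text_available": access["entitled"],
    }


# Hypothetical canned responses standing in for two separate web services.
CANNED_RESPONSES = {
    "/articles/42": json.dumps(
        {"title": "A Sample Paper", "journal": "Journal of Examples"}
    ),
    "/articles/42/access": json.dumps({"entitled": True}),
}

page = compose_article_page("42", CANNED_RESPONSES.__getitem__)
print(page)
```

Passing the fetch function in as a parameter keeps the composition logic separate from the transport, so each service can be swapped or tested on its own.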
Why?
Our focus is on implementing the functionality our users need, not reinventing wheels or creating bespoke infrastructure. Agile software engineering practices and rigorous release processes allow us to rapidly install and deploy a continuous stream of application upgrades. This positions us well to deliver modular developments and ensure the site remains state-of-the-art.
___________________
What?
Ingenta's award-winning Metastore is a data repository designed to support flexible content types. It goes beyond the restrictive infrastructure of paper journals to support innovative online strategies such as uploading the raw research materials which are critical to evaluation of the associated literature, or repackaging of content into "virtual journals". Metastore’s extensibility means it can be relied upon for long-term efficiency and stability, and it uses standard vocabularies which integrate well with other technologies.
How?
Metastore is an RDF (Resource Description Framework) triplestore, built using Jena, an open source Java framework developed by Hewlett Packard’s HP Labs. RDF provides a framework for describing resources, their metadata, and their relationships. “Resources” could simply be academic papers, but the technology is flexible enough to represent any kind of research data we might need to deliver via IngentaConnect. With over 200 million triples, Metastore is the largest commercial deployment of this Semantic Web technology and W3C standard.
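To make the idea of a triple concrete, here is a minimal sketch of an in-memory store of subject, predicate, object statements with wildcard matching. It is not Jena and not Metastore's implementation; the `TripleStore` class and the `example.org` identifiers are hypothetical, and only the Dublin Core namespace is a real vocabulary. It shows how a paper and a raw dataset can sit side by side and be linked by one more statement.

```python
class TripleStore:
    """Toy in-memory store of (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return matching triples, sorted; None acts as a wildcard."""
        return sorted(
            t
            for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set
EX = "http://example.org/"               # hypothetical identifier space

article = EX + "article/1"
dataset = EX + "dataset/7"

store = TripleStore()
store.add(article, DC + "title", "A Sample Paper")
store.add(article, DC + "creator", "A. Author")
store.add(dataset, DC + "title", "Raw measurements")
store.add(dataset, DC + "relation", article)  # dataset is related to the paper

# Everything that has a title, whatever kind of resource it is.
titled = store.match(predicate=DC + "title")

# Everything that points at the article.
linked = store.match(obj=article)
```

Because every fact has the same three-part shape, adding a new kind of resource means adding new statements rather than redesigning a schema.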
Why?
As the boundaries between journals and databases continue to blur, scientific communities increasingly need the capability and flexibility to retrieve, interact with and navigate between many different kinds of content. We believe Semantic Web technologies are best placed to meet the future needs of scientific research.
___________________
What?
Our forthcoming search engine, Rummage, will expand and improve on IngentaConnect’s current search functionality with enhanced indexing of author names, spell checking ("did you mean...?"), additional sorting options, and relevance rankings. Federated searching will be supported through the Z39.50 International: Next Generation (ZING) protocols.
How?
The technology behind Rummage is Solr, an enterprise search server based on the Lucene search engine library; both Solr and Lucene are open source projects from the Apache Software Foundation. Lucene is highly respected and is used by many applications and websites, including Wikipedia. Its flexible architecture can index any file from which text can be extracted, enabling it to catalogue the many data types delivered by IngentaConnect. Solr packages Lucene as a web service, allowing continuous online indexing and easy integration of search into existing websites. As with Metastore, we’ve chosen scalable technologies which will ensure strong ongoing performance for our (and your) users.
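Because Solr exposes search over HTTP, a query is essentially a URL. The sketch below builds one using Solr's standard `q`, `sort`, `rows` and `wt` request parameters. The host, the core name `articles`, the field names and the `build_search_url` helper are assumptions made for illustration, not details of Rummage.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical Solr endpoint; a real deployment would differ.
SOLR_SELECT = "http://localhost:8983/solr/articles/select"


def build_search_url(terms, sort_field=None, descending=True, rows=10):
    """Build a Solr select URL for the given query string.

    When no sort field is given, the sort parameter is omitted and Solr
    falls back to its default ordering by relevance score.
    """
    params = {"q": terms, "rows": rows, "wt": "json"}
    if sort_field:
        direction = "desc" if descending else "asc"
        params["sort"] = f"{sort_field} {direction}"
    return SOLR_SELECT + "?" + urlencode(params)


# Fielded query on hypothetical "author" and "title" fields, newest first.
url = build_search_url("author:smith AND title:genome", sort_field="pub_date")

# Decode the query string again to show what the server would receive.
query = parse_qs(urlsplit(url).query)
print(url)
```

Any site that can issue an HTTP GET can embed search this way, which is what makes integration into existing pages straightforward.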
Why?
In this “golden age of Google”, the value of good discoverability is not in doubt. Web users are increasingly sophisticated, and accustomed to high levels of functionality; they demand more from native search engines. Rummage will allow us to deliver timely, flexible results to a new generation of savvy searchers – and thereby encourage further use of the content to which you're subscribed.
___________________
What?
The latest release of IngentaConnect incorporates a number of user-driven features, including support for social bookmarking tools (such as del.icio.us and Connotea) and improved interfacing with reference managers (including new RDF and BibTeX export options). Some changes to the site’s page layout have been necessary to make room for these features.
How?
Tweaking the display is relatively simple, as IngentaConnect is rendered using Cascading Style Sheets (CSS), a W3C-specified language which provides the style information to be applied to a web page. This avoids hard-coding look-and-feel elements into the page’s HTML. Our use of JavaServer Pages in our flexible templating environment further enhances our ability to customise the site easily. Careful deployment of JavaScript and Dynamic HTML to create interactive navigation options puts more functionality at the fingertips of our users. Naturally we continue to provide fallback options for users whose browsers are not able to support our more sophisticated features, and we retain our W3C AA compliance.
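The BibTeX export option mentioned above amounts to serialising an article's metadata in a format that reference managers can import. A minimal sketch, assuming a simple dictionary of citation fields; the `to_bibtex` helper, the citation key and the sample record are hypothetical and do not reflect IngentaConnect's actual export code.

```python
def to_bibtex(key, record):
    """Render a dict of citation fields as a BibTeX @article entry.

    Only fields present in the record are written, in a fixed order.
    """
    lines = [f"@article{{{key},"]
    for field in ("author", "title", "journal", "year", "volume", "pages"):
        if field in record:
            lines.append(f"  {field} = {{{record[field]}}},")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical article metadata.
entry = to_bibtex(
    "smith2007",
    {
        "author": "Smith, Jane",
        "title": "A Sample Paper",
        "journal": "Journal of Examples",
        "year": "2007",
    },
)
print(entry)
```

Values are wrapped in braces so that commas and capitalisation in names and titles survive the import.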
Why?
Whilst we are cautious of riding bandwagons, many Web 2.0 features and tools have proven their genuine value to the user. Increasing site usability is an obvious benefit, whilst integrating IngentaConnect with social software supports increased citations and use of content.
___________________
What?
Also in the pipeline is a new subscription management tool for our publisher customers which will provide state-of-the-art controls for loading, reviewing and editing subscriber access rights. Smarter processing options will allow for increased flexibility of business models, in line with the market’s slow but steady move away from traditional calendar year subscriptions. The new tool will also add to the functionality currently offered to institutional administrators and subscription agents.
How?
Building on our existing investment in and experience with dynamic user interface technologies, including CSS, JavaServer Pages (JSP) and JavaServer Faces (JSF), the new system will use Ajax (Asynchronous JavaScript and XML) techniques to deliver a dynamic, responsive interface that will allow users to quickly access the functions they need. Flexible reporting options, email alerting and RSS feeds for notifications will allow publishers to respond immediately to your requests, tweak business options, and monitor system activity.
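An RSS notification feed of the kind described above is a small XML document that any feed reader can poll. The sketch below builds an RSS 2.0 channel from a list of events using Python's standard library; the `build_feed` helper, the feed URL and the event contents are hypothetical, chosen only to show the shape of such a feed.

```python
import xml.etree.ElementTree as ET


def build_feed(title, link, description, events):
    """Return an RSS 2.0 document (as a string) with one item per event."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires these three channel elements.
    for tag, text in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = text
    for event in events:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = event["title"]
        ET.SubElement(item, "description").text = event["detail"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical subscription events.
feed_xml = build_feed(
    "Subscription notifications",
    "http://example.org/notifications",
    "Recent changes to subscriber access rights",
    [
        {"title": "Access request", "detail": "Example University requested access."},
        {"title": "Subscription loaded", "detail": "12 new subscriptions were loaded."},
    ],
)
print(feed_xml)
```

Building the document through an XML library rather than string concatenation means special characters in event text are escaped automatically.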
Why?
Licensing of scholarly content is evolving; the need for increased flexibility requires complex supporting technology, and ever more sophisticated tools. This software delivers on our essentials: extensibility (ready for as-yet-unknown requirements), currency (so our users are getting the cream of what’s presently available) and platform-independence (superior functionality with no plugins or downloads required).
___________________