Digitisation update

I gave a presentation about our various digitisation activities to the ITLT on June 12th, which scored a first for me: having spoken for more than my allotted time, I attempted to wrap up the session quickly and was urged to carry on. “It’s interesting,” they said. I have a follow-up engagement, speaking to IS Forum on July 21st.

Jan Whalen has been appointed as Digitisation Assistant and, as an introduction, came with me to Sheffield Hallam University to see their Luna installation, where they have implemented the new web-based browser. It is not suitable for all our collections but I’m working to get it running with some of them.

IMP Conference proceedings are being scanned by Hollingworth and Moss.  Chris Grave from EDL and Scott have been working to find the best way of preparing metadata for ingest into eScholar.

The first of our pamphlets (Foreign and Commonwealth Collection) are now available via JSTOR. To see some, go to www.jstor.org and choose advanced search. Check the pamphlets box and search for “Basutoland”.

I’m working with Martin Snelling to develop our core texts (CLA) scanning service. The first objective is to use the PackTracker software package to ease the administrative burden and help keep the existing pilot service running and legal while we plan a more permanent replacement.

Digitisation Update 14/05/2009

Luna Collections
Update on numbers:
Rylands Collection 6716 of which 5118 are available in the public collection
Genizah Collection 12409
Medieval Collection 2725
Rylands Papyri 299

Additional material funded and due to be added soon: Shahnama (Persian epic poetry with illustrations), Dante (early printed books), Darby Bible.

JISC e-Content bid
Carol Burrows is preparing a bid (with additional input from me and Caroline Checkley-Scott) for funding to explore possible models for a North West Centre of Competence for Heritage Digitisation, located within JRUL (mostly at Deansgate).

Pamphlets have been returned from digitisation by BOPCRIS. Content due to be on JSTOR by the end of June.  Individual pamphlets will eventually be linked from Copac and Talis.  Free access to UK educational institutions.

Other Digitisation
I’ve been in discussion with Phil Butler to explore use of the repository for JRUL digital collections. Candidates at present are the JRUL Bulletin, the IMP Conference Proceedings held by EDL and the Deansgate Hymn Indexes. We are exploring the methodology and tools required for this to work in practice.

Digitisation/E-resources R&D Update

Luna server and its collections

Support is ongoing for Deansgate projects and also work towards implementation of the new LUNA system. Manuscripts are being added steadily to the medieval collection. Having investigated the work involved, Genizah staff have decided to put their record enhancement sub-project on hold until they see how much time is available at the end of the main project.

Pamphlets

Christine Chappelle and I attended the project launch in Liverpool. This was the usual mini-conference with reception afterwards. There were some interesting views on the value of the digitised pamphlet collection and the ways in which it will contribute to research and teaching on 19th century social topics. Michael Spinella from JSTOR was very enthusiastic about their expansion into new types of content. The project has digitised over 1 million pages. Our pamphlets are due for return on May 13th. Although many of them have been away since last September, BOPCRIS have been providing PDF copies in response to store requests from prospective users.

IMP Conferences Scanning

Planning proceeding. MBS/IMP have agreed in principle to provide funding for the scanning.   Quotes received and being considered.

Scanning Equipment

CopiBook scanner has been moved to the old bindery office where it is less cramped and we have improved ancillary equipment. Training programme in progress. Public scanners about to be available on 3 Blue 1 cluster machines, coincidentally at the same time as scanning via photocopiers is made available. Scanning will be free of charge to users.

Storage Infrastructure

The ITS Storage Infrastructure Group which I attend has been considering storage requirements around the University, investigating the best balance between devolved and central storage provision and backup. It has now moved forwards to look at the available storage technologies and solutions with vendors invited to give presentations. It is interesting to see the parallels between our storage needs, for example for archiving of large image files, with data archiving requirements elsewhere in the University. General clamour for more p: drive space.

Digital Curation

I attended a meeting in the School of Social Sciences in the National Centre for eSocial Science where a representative from the Digital Curation Centre was talking about their Lifecycle model for digital data curation. Of particular relevance here was not just the subject matter but that the staff attending from NCeSS considered that the whole area of data curation, i.e. how research data is documented, archived, preserved and possibly re-purposed, was one in which the Library should be taking a lead in the University. Probably a topic for the Digital Preservation group, which Phil is chairing, to take forward.

Sustainability

One of the principal problems of digitisation projects is sustainability: particularly where institutions want to provide open access to the output from a digitisation project, a way needs to be found of coping with service maintenance costs once the project funding stops. I attended a workshop run jointly by Ithaka and the JISC Strategic Content Alliance which considered and discussed case studies of projects using different approaches to sustainability. The overriding theme was the importance of partnerships of various types in producing a model for sustainability, for example the JSTOR partnership in the Pamphlets project or the Oxford “Electronic Enlightenment” being marketed commercially via OUP. However, it became obvious that many issues remain unsolved and that JISC’s desired “roadmap for sustainability” is still some way off.

E-Resources

As I now no longer have E-resources R&D in my job title this is probably my last report on these topics. Full handover awaits the return of Olivia and the arrival of the new e-resources team members.

SFX

Peter Jervis is doing sterling work with SFX knowledgebase updates. Statistics files have been sent to Ex Libris for use in their BX project. We await trial access to the product.

Webfeat

Work is in progress to expand the portal search to tailor the set of resources searched to the user’s faculty. Mostly the SearchIt service runs without major problems but there have recently been some issues related to database vendors being changed. For example, where SilverPlatter resources were transferred to the OvidSP interface, the corresponding Webfeat translators were not available or did not work as they should. Problems now being addressed by Webfeat.

Thin Client Server

Server has been patched and hotfixes loaded but there are still problems with many of the applications. The underlying cause is being investigated by David Hughes from ITS but is proving difficult to track down. Our version of Citrix is becoming unsupported at the end of 2009 so we are working towards de-commissioning the service by then. Applications will be assessed individually by Research & Learning Support staff and then moved to alternative access methods (stand-alone or web) or retired. SciFinder client access via clusters, currently available only to EPS users, is being extended to MHS and FLS members. Similar access for library staff can be enabled on DT08 machines.

Digitisation/E-resources R&D Update

Luna server and its collections

Slow saving in the images database on the new server has been fixed.  Took a long time to identify the cause and 30 seconds to fix.  Save times reduced from 45 seconds to 5.

Rylands Medieval collection is now live and already contains 5 of the projected 40+ manuscripts.

Genizah Project group have identified a need to include rotated and enhanced images of their fragments in the Genizah Collection. A test area has been set up for them to experiment to find their preferred method. Several possible methods have been identified and discussed; we are now awaiting a final decision on which to implement.

Pamphlets

Final batch dispatched to Southampton for scanning.  Project launch in Liverpool on March 20th.  Christine Chappelle and JMC will attend.  Some content from Newcastle, Liverpool and UCL now live at http://www.jstor.org/page/info/about/news/announcements/2009.jsp#FebA

IMP Conferences Scanning

Project being set up to digitise 44 printed volumes of IMP conference papers held in EDL.   Later volumes are already available in online form on the IMP website.   Aim is to unite the two sets of content in the IR with a supporting website providing kudos for JRUL/MBS. 

Scanning Equipment

Move of CopiBook scanner to old bindery office and installation of public scanner(s) on Blue 1 both imminent.

SFX

Peter Jervis is being trained to give additional help with SFX checking and knowledgebase updates.

We are providing our SFX usage statistics to Ex Libris for inclusion in their bX project. bX is a new service, due for release later this year, which mines collected usage data to provide information about usage patterns and give feedback along the lines of “others who consulted this article also looked at ….” The idea is that this will give a more current view of usage than the necessarily retrospective citation analysis. We will have 3 months free trial access when the service launches.
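
To make the “others who consulted this article also looked at” idea concrete, here is a minimal sketch in Python of co-usage counting. It is purely illustrative and is not how Ex Libris implement bX, whose internals we have not seen. The session data, article identifiers and function names are all invented for the example; the only assumption is that usage logs can be grouped into sessions, each listing the articles one user consulted.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_co_usage(sessions):
    """Count how often each pair of articles appears in the same session."""
    co_usage = defaultdict(Counter)
    for session in sessions:
        # Use a set so repeat views within one session count only once.
        for a, b in combinations(sorted(set(session)), 2):
            co_usage[a][b] += 1
            co_usage[b][a] += 1
    return co_usage


def also_looked_at(co_usage, article, limit=3):
    """Return up to `limit` articles most often consulted alongside `article`."""
    neighbours = co_usage.get(article, Counter())
    # Highest count first; ties broken alphabetically for a stable order.
    ranked = sorted(neighbours.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _count in ranked[:limit]]


# Hypothetical sessions: each list is the articles one user consulted.
sessions = [
    ["art:A", "art:B", "art:C"],
    ["art:A", "art:B"],
    ["art:B", "art:D"],
    ["art:A", "art:C", "art:A"],
]

co_usage = build_co_usage(sessions)
print(also_looked_at(co_usage, "art:A"))
```

Because the counts come from recent sessions rather than published citations, a list built this way reflects current reading behaviour, which is the contrast with citation analysis drawn above.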

Thin Client Server

Server in mourning for Alan? Many of the applications are not working but before they are repaired the server needs patching and service packs/hotfixes applying. Initial attempt to do this has failed due to licensing problems. IT Services are investigating.

Digitisation/IT Research and Development

1. Digitisation
Luna Insight has been migrated to a new server and upgraded in the process to the new Version 6.0.  Service name remains the same (enriqueta.man.ac.uk) so that existing client viewer installations do not need upgrading immediately.  Old server is still running under a new name and will be used for testing purposes.  The new viewer, LUNA, is currently being implemented.  If all goes according to plan this will replace both the current JVA client and simpler browser viewers as the preferred access method.  JVA client may still be required for specialist purposes as it includes additional facilities not included in LUNA.  We may also find that LUNA is not suitable for all collections.

Second batch of pamphlets is being prepared for sending to BOPCRIS  at the end of February for digitisation.  Work needs to be done quickly so thanks to Simon and Victoria from cataloguing for their help in meeting the deadline.  Pamphlets project is holding a launch event in Liverpool on March 20th.  Some searchable content should be available in JSTOR around this time.

Old bindery office has now been cleared so it should be possible to make progress with moving the scanner.

2. SFX
Knowledgebase updates have all been run up to date at last. Processing of the updates is proceeding. Thanks again to cataloguing for their help.

3. RAE
Outputs have been returned to us and are currently being distributed back to their owners.

JMC.   23/01/2009

Digitisation/Systems Development

1. Digitisation

Plans are progressing for Luna Imaging to migrate our image databases to the new server in the week beginning January 12th. There will be some down-time but it will be kept to a minimum.

A test Luna collection has been set up to hold the English Manuscripts Project data and images. The test metadata profile is currently being evaluated by project staff.

Special Collections have been awarded a grant to digitise a Persian manuscript. Images (about 600) will be added to the main Rylands Collection but will also be accessible via a separate website.

We have been asked if we can provide additional catalogued pamphlets for digitisation at BOPCRIS as part of the JISC/RLUK 19th Century Pamphlets Online project. Negotiations with Special Collections are continuing but we hope it will be possible.

Data has now all been copied from the old DSpace server which can now be decommissioned. It is intended that the documents it held will be made available via the new repository as soon as possible after launch.

CopiBook scanner has still not been moved. Maps stored in its new home in the old bindery office must be moved first.

2. Webfeat (SearchIt)

Search box now live on the portal, searching Web of Science, Expanded Academic index, Business Source Premier, KnowUK, Oxford Reference Online, Custom Newspapers and Talis. Preliminary statistics for numbers of searches available at s:\eres_statistics\searchit\nov08\daily_searches_sept_nov08.xls
Overall trend is rising, about 600 searches per day at present. We have made a small change to the interface to make it more obvious how to change the group of resources included in a search.

3. SFX (FindIt)

Backlog gradually reducing, thanks to help from cataloguing staff.

4. Data Curation

I attended a Digital Curation Centre meeting, “Research Data Management Forum: Roles and Responsibilities for Effective Data Management”. This looked at the issues involved in management of data, identifying the roles of data creators (the researchers), data scientists (working with researchers to optimise the way in which data generation is planned and the data are manipulated), data managers (computing specialists dealing with storage and preservation) and data librarians (advising on preservation, description, use and re-use of data). There are several issues for us to consider as our re-structure progresses and the repository is launched.

5. RAE

No news yet of the return date for our outputs.

JMC. 3/12/2008

Digitisation/Systems Development

1.  Digitisation.  Plans are in hand to replace the four-year-old Luna server with a new Sun machine capable of supporting Luna version 6 and much larger collections than at present.  Image files will be stored on the SAN rather than internally as now.  The machine is being funded by the JISC English MSS project, “In the Begynning”.

Genizah collection continues to grow and now has over 10.6k images of fragments in the public collection.

CopiBook scanner is to be moved to the old bindery office.

2.  RAE 2008.  The final phase is about to start.  We have been notified that the printed outputs taken to Bristol last January will be returned to us between mid November and the end of January.  They will be unpacked and checked in JRUL before being returned to Faculties/Schools.

3. SFX (FindIt).  Knowledgebase still not up to date but help in clearing the backlog is being organised.

4.  Webfeat (SearchIt).  We have seen a test version of a simple search box on the portal, searching Web of Science, Expanded Academic index, Business Source Premier, Talis Prism, Custom Newspapers, Oxford Reference Online and KnowUK.  Members of the advisory group were enthusiastic so when further testing is complete it will appear on the live system.

5. E-resource traffic monitoring.  We are starting to work with ITS to monitor unusual traffic to our e-journal sites.  This should provide early warning of abuse such as has previously led to complaints and threats of disconnection from suppliers.
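
As a rough illustration of the kind of check this monitoring might involve, the sketch below flags clients whose request count is far above what is typical. It is not the method ITS will use, which has yet to be worked out. The log format (pairs of client address and e-journal site), the addresses, the function name and the threshold values are all assumptions made for the example.

```python
from collections import Counter
from statistics import median


def flag_unusual_clients(requests, factor=10, min_requests=100):
    """Return clients whose request count is far above the typical level.

    `requests` is an iterable of (client, site) pairs for one period,
    e.g. one day. A client is flagged when its count exceeds both
    `min_requests` and `factor` times the median count per client.
    """
    counts = Counter(client for client, _site in requests)
    if not counts:
        return []
    typical = median(counts.values())
    threshold = max(min_requests, factor * typical)
    return sorted(client for client, n in counts.items() if n > threshold)


# Hypothetical day of traffic: two ordinary users and one bulk downloader.
requests = (
    [("10.0.0.1", "journals.example.org")] * 5
    + [("10.0.0.2", "journals.example.org")] * 8
    + [("10.0.0.3", "journals.example.org")] * 500
)

print(flag_unusual_clients(requests))
```

Using the median rather than the mean keeps one very heavy client from raising the baseline and hiding itself, and the `min_requests` floor avoids flagging anyone on a quiet day.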