Большое и малое: значение стандартов

The large and the small: the importance of standards

Значне і невеличке: важливість стандартів

Хопкинсон А.

Миддлсекский Университет, Лондон, Великобритания

Hopkinson A.

Middlesex University, London, UK

Хопкісон А.

Мідлсекський Університет, Лондон, Великобританія

For a long time, librarians have been predicting the arrival of the electronic library. The main stimulus for introducing the electronic library is likely to be financial, but naturally the technology has to develop first. This process may be accelerated by research carried out in libraries and information organisations.

In the UK there are a large number of projects being funded for the benefit of higher education within the framework of eLib, the Electronic Libraries Programme. These range from training in the new technologies which are influencing libraries to document delivery and conservation. Management issues such as the problems of copyright are also included.

Many of these projects are intended to serve as models for future developments in higher education libraries.

As these projects get under way, it is clear that there are some severe gaps in standards which prevent projects from being initiated and concluded in the best ways.

As a result, alongside the large projects under way, there are some smaller projects within the framework of MODELS, hosted by the University of Bath's United Kingdom Office for Library and Information Networking.

Within the UK context and in certain other countries, MARC and other standards such as Z39.50, together with their related profiles, need enhancement; recommendations for this are being considered in the UK.

Serials holdings are used as an illustration of what is being done to enhance the standards in this area and of the kinds of project that will benefit from this.

Thus small but key activities like standards making are necessary to the success of large scale integrated systems.

На протязі тривалого часу бібліотекарі провіщають появу електронної бібліотеки. Основним стимулом щодо запровадження електронної бібліотеки повинен бути фінансовий, но, по-перше, звичайно, повинна розвиватися техніка. Цей процес може бути прискорено науковими дослідженнями, які проводяться в бібліотеках та інформаційних закладах.

1. Introduction

For a long time now, librarians have been predicting how the electronic library will come about. The main stimulus to the introduction of the electronic library is likely to be financial but naturally the technology has to be developed first. This may of course be hastened on by research and development activities conducted in the library and information field.

In 1993, the UK Higher Education Funding Council commissioned the Libraries Review, which was chaired by Sir Brian Follett. As a direct response to the Review, the Joint Information Systems Committee (JISC) established the Electronic Libraries Programme (eLib). The main aim of this programme is to help the Higher Education Community to develop the electronic library.

This in itself is being done within the context of enabling resources to be used more efficiently. Libraries are there of course to assist in the teaching/learning and research processes. Many projects benefit both the academic researcher and the student.

We know there are a large number of resources available on the Network and we know how difficult it can be to find them. Some eLib projects are concerned with this; eLib is funding several gateways in subject areas like Sociology, Medicine and Urban Design.

Libraries have had to cope with shrinking budgets, and a number of projects have been set up to make it easier to discover the locations of books and other bibliographic materials. CURL, the Consortium of University Research Libraries, is working closely with JISC to provide national access to the CURL database of bibliographic records, which is hosted at Manchester University in the north of England; this is to be extended to provide related document delivery services. Middlesex University is participating in LAMDA (London and Manchester Document Delivery), which for over a year has been providing services in these two regions. Also, the EDDIS project is developing a system, to be driven by the end-user, for discovery, ordering and transmission of material by both traditional and electronic means.

Training and awareness are important, and Netskills, based at Newcastle University, is providing courses on networked information to library staff and users, ranging from introductions to the World Wide Web to HTML authoring courses.

One of the big problems in the implementation of the electronic library is copyright which has an impact on most electronic library applications. Many of the projects are working directly with publishers to find a way through the maze of copyright regulation and to ensure that their project is in the clear and has full agreement of publishers involved. Research projects are also being undertaken in this area by De Montfort University and the Library and Information Technology Centre at South Bank University.

In this context, the funding councils have funded a pilot project to have a national site license for journals from four publishers, which includes low cost access to electronic versions of journals for subscribers.

Many other information resources in universities could be made more widely available; eLib is working with the archives community to develop national finding aids and the standards to support them.

eLib is also working with the British Library Research and Innovation Centre (formerly the BL Research and Development Department), in joint projects such as the UK Office for Library and Information Networking (UKOLN). eLib also has close ties with the European Commission's Telematics for Libraries Programme and the Coalition for Networked Information in the United States.

There are a number of World Wide Web pages giving information about eLib which can be accessed through http://ukoln.bath.ac.uk/elib. There is also an electronic journal at http://www.ariadne.ac.uk.

Many of these projects are intended to serve as models for future developments in higher education libraries.

2. The Electronic Library

One of the main areas where economies can be made, and where a higher rate of success is needed in searching for resources, is the delivery of journal articles to the end user. In the UK we have the British Library's Document Supply Centre, of world-wide renown, so you might wonder what else we need. What we really need is a facility whereby researchers, students or any member of the public can access databases of secondary services literature, services such as British Humanities Index, Chemical Abstracts, etc., to find the articles on their chosen topic. From there they need to be able to see whether the item is in their own library or whether it is available at another library and, if so, at which one. After all, a student may not have the authority to receive journal articles from the British Library Document Supply Centre without paying for the photocopies obtained there, and may prefer to go to a library where the journal is held, especially in an area like London where the holding library may not be very far away. Basically, what is needed is for the details of the journal issue containing a desired article to be matched against the holdings of serials in available libraries. This may even be a better solution than the alternative, which would be to go straight to a copy of the article in an electronic journal and copy it. The student may not have the facility to do this, may be charged more than if he or she went to the library to read it, or may want to browse the rest of the journal and its other issues.

Thus a number of projects are being considered to set up working models to undertake and implement these activities.
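The matching of a wanted journal issue against library holdings, as described above, can be sketched in a few lines. This is only a minimal illustration, assuming that each library's holdings have already been reduced to spans of (volume, number) pairs; the library names and holdings are invented for the example and do not come from any real catalogue or eLib project.

```python
from typing import Dict, List, Tuple

Issue = Tuple[int, int]        # (volume, number)
Span = Tuple[Issue, Issue]     # first and last issue held, inclusive

# Hypothetical holdings of one journal title in three libraries.
HOLDINGS: Dict[str, List[Span]] = {
    "Library A": [((2, 5), (8, 12))],
    "Library B": [((2, 5), (4, 1)), ((4, 3), (8, 12))],
    "Library C": [((9, 1), (12, 4))],
}


def libraries_holding(issue: Issue, holdings: Dict[str, List[Span]]) -> List[str]:
    """Return the libraries whose recorded spans include the wanted issue."""
    return [
        library
        for library, spans in holdings.items()
        # Tuples compare volume first, then number, which is the order we need.
        if any(first <= issue <= last for first, last in spans)
    ]
```

Calling libraries_holding((4, 2), HOLDINGS) returns only "Library A", because Library B records a gap at volume 4, no. 2. The sketch works only if every library records its spans in the same structured way, which is the problem of standards taken up in the rest of this paper.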

There are a large number of union catalogues in the UK. These have been designated as 'physical clumps'. A clump is an amorphous group of items, like a group of trees. In this sense it is used to indicate that the contents of the union catalogue may not make up a logical set of data in terms of subject or type of library. In effect they usually, though not always, relate to a region, and usually they relate only to one type of library within a region. The next extension to this idea is the 'virtual clump', a group of catalogues which are not in one place but which can be searched by one search from a single software package.

One of the most important standards which will enable this kind of scenario to be implemented is Z39.50, an American standard which has been adopted internationally (in the US version rather than in the ISO version which is lagging slightly behind). This standard is described in more detail below. But this standard, which is not specifically a library standard, relies on profiles being developed by each sector which wishes to use it and this is an area that has to be advanced to enable systems which wish to use the standards to work.

3. Standards

As these projects get under way, it is clear that there are some severe gaps in standards which prevent projects from being initiated and concluded in the best ways.

As a result of these gaps, alongside the large technological projects under way, there are some smaller projects within the framework of MODELS (Moving to Distributed Environments for Library Services), hosted by the University of Bath's United Kingdom Office for Library and Information Networking (UKOLN).

These are looking at the development of metadata standards, standards for the data that describes or indexes data.

It is interesting as a case study to take serials holdings and see what is being done to enhance the standards in this area.

3.1 MARC

In the early days of MARC it was seen as necessary for the standard only to provide what was needed by national libraries. Indeed, in the very beginning it was there to provide national bibliographies, not even catalogues for national libraries. If any other library had a requirement, it was free to add its own fields. From early on in the use of MARC in the UK this practice had a number of disadvantages, as the British Library was not concerned with expanding the MARC format beyond its own necessary requirements. One of the problems was that the recording of journal articles in a consistent way was not facilitated. The US MARC formats have for a long time now included facilities for recording holdings (US MARC Format for Holdings Data).

In order to ensure that serials holdings are treated with the level of consistency required to enable the eLib projects to function, more standards are required. There needs to be a consensus that the standards should be used in a particular way. And in today's environment, what the UK does has to be consistent with what the US is doing. Moreover, whatever is agreed will have implications for those systems that have already recorded serial holdings. The study on serial holdings has shown that, in general, systems which have recorded serial holdings have been fairly consistent and have not been inconsistent with what the Anglo-American Cataloguing Rules have suggested.

The kind of data usually found takes the form:

Vol.2, no.5(1990)-vol.8, no.12(1997)

Many systems put in the captions (vol., no.) whether or not the document is in English and whether or not it carries such captions itself.

Some systems have a more abbreviated form:

2:5(1990)-8:12(1997)

As well as the need to indicate the span of holdings, there is also the need to indicate gaps, and there needs to be a mechanism to do that. If volume 4, no. 2 is missing, the above could be represented by:

Vol.2, no.5(1990)-vol.8, no.12(1997). Lacks Vol.4, no.2

or

Vol.2, no.5(1990)-vol.4, no.1(1992) Vol.4, no.3(1992)-vol.8, no.12(1997)
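To make the point about consistency concrete, the following is a minimal sketch of how software might reduce the captioned and abbreviated forms shown above to one common structure. It assumes only the two patterns illustrated in this section; the names IssueRef, Span, parse_spans and holds are invented for the illustration, and a free-text note such as "Lacks Vol.4, no.2" is deliberately left unparsed.

```python
import re
from typing import List, NamedTuple


class IssueRef(NamedTuple):
    volume: int
    number: int
    year: int


class Span(NamedTuple):
    start: IssueRef
    end: IssueRef


# One issue: optional "vol." caption, volume, then ", no." or ":", number, (year).
_ISSUE = r"(?:vol\.)?\s*(\d+)\s*(?:,\s*no\.|:)\s*(\d+)\s*\((\d{4})\)"
_SPAN = re.compile(_ISSUE + r"\s*-\s*" + _ISSUE, re.IGNORECASE)


def parse_spans(statement: str) -> List[Span]:
    """Extract every 'first issue - last issue' span found in the statement."""
    spans = []
    for match in _SPAN.finditer(statement):
        v1, n1, y1, v2, n2, y2 = (int(group) for group in match.groups())
        spans.append(Span(IssueRef(v1, n1, y1), IssueRef(v2, n2, y2)))
    return spans


def holds(spans: List[Span], volume: int, number: int) -> bool:
    """True if the given issue falls inside any of the parsed spans."""
    wanted = (volume, number)
    return any(span.start[:2] <= wanted <= span.end[:2] for span in spans)
```

With this sketch, the captioned and the abbreviated statement parse to the same span, and a statement written as two spans either side of the missing issue reports that volume 4, no. 2 is not held. A gap recorded only in a "Lacks" note would be missed, which illustrates why an agreed mechanism for gaps is needed.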

3.2 ISO 10324 Holdings Statements: Summary Level

ISO 10324: Information and documentation: Holdings statements: Summary level is an international standard about to be published (voting finished on 19 February). It has been prepared by ISO TC 46 SC9. It has been based on the ANSI (US) standard which has been developed hand in hand with the MARC format for Holdings. Therefore it can serve as a useful set of rules to define further the form of the data in the MARC record.

The document follows the style of a cataloguing code, such as the International Standard Bibliographic Description (ISBD).

The aim of the standard is to specify 'display requirements for holdings statements at the summary level to promote consistency in the communication and exchange of holdings information.' Additionally the standard claims it is appropriate for union catalogue lists.

There are three levels of holdings statement. The first indicates only the existence of a serial title in an institution. The second adds to the first the extent of an institution's holdings. Level three includes a summary statement.

The standard deals with 6 areas which are further subdivided.

1. Item identification: title, ISSN, CODEN, etc.

2. Location data area includes the institution identifier, sublocation identifier, copy identifier (for multiple copies) and call number.

3. The third element is entitled 'Date of Report Area' and is an optional element to indicate the date of the latest modification of the record.

4. The General Holdings Area is a coded field which is essential for serial holdings. The whole is included in parentheses and the five subelements are separated by commas. If codes are not available, or the data needed to establish the codes are lacking, a textual field may be used; in that case there may not be five data elements.

5. Extent of Holdings area. The standard is very specific as to punctuation here since the punctuation allows the holdings to be expressed in a concise form.

Here are some complete examples of the extent of holdings.

v.1-19+"suppl."v.1-12

v.1(1950)-10(1959)

v.1-5(1901-1905)

v.2-6,8-14,17-20, 1945-1949,1951-1957,1960-1963

v.1(1950)-2(1951),4(1953)-8(1957)

v.1(1929)-[3](1930)-8(1936)

v.5-6(1950-1951),10(1955),12(1957)

1912-1950,1954

t.2(1940)-9(1947)

1969:Jan

v.1-6 <bound> v.7-10 <unbound>

6. Holdings note area. This may be used to give information on a gap or a non-gap break, or any other information, in a non-formatted form.
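Because the punctuation of the extent of holdings area is so tightly specified, the simpler statements can be processed mechanically. The sketch below is a simplified reading of one pattern only: an enumeration-only statement such as v.2-6,8-14,17-20. It is not the grammar of ISO 10324, and the function names are invented for the illustration. Statements with chronology, brackets or supplements would need a fuller parser.

```python
from typing import List


def expand_enumeration(extent: str) -> List[int]:
    """Expand an enumeration-only extent such as 'v.2-6,8-14' into volume numbers."""
    body = extent.strip()
    if body.startswith("v."):
        body = body[2:]            # drop the volume caption
    volumes: List[int] = []
    for part in body.split(","):   # commas separate the runs held
        first, _, last = part.strip().partition("-")
        volumes.extend(range(int(first), int(last or first) + 1))
    return volumes


def missing_volumes(volumes: List[int]) -> List[int]:
    """List the volumes absent between the first and last volume held."""
    held = set(volumes)
    return [v for v in range(min(held), max(held) + 1) if v not in held]
```

For the statement v.2-6,8-14,17-20 this reports volumes 7, 15 and 16 as missing, which is exactly the information the comma punctuation is meant to convey.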

3.3 SICI (Serial Item and Contribution Identifier)

There are other standards around which have recommended standardisation of data in this area and which have to be taken into account.

SICI is the Serial Item and Contribution Identifier (and is available on the Internet <http://sunsite.Berkeley.EDU/SICI>). It has been developed by ANSI/NISO (the American standards bodies in the bibliographic field).

It is to be used with serial publications in all formats. It is intended to be used with EDI (electronic data interchange between publishers, the booktrade and libraries), SISAC bar codes, Z39.50 queries, Uniform Resource Names (URNs), electronic mail and 'human transcription in print'.

The SICI is divided into three segments: the Item Segment, the Contribution Segment and the Control Segment.

The item segment identifies the physical issue of the journal; it includes the ISSN, chronology and enumeration.

The contribution segment includes pagination and a title code. This does not concern the study as its aim is to investigate the matching between the serial holdings and the SICI, for which the article identifier is not relevant.

The first 9 characters are the ISSN of the journal, followed by a chronological designation in parentheses and then the enumeration. The date, in standard ISO date format (YYYYMMDD), is as specific as is required, going down to the day of the month if necessary. Although the SICI deals with a single issue, it may be necessary to use a spanning date for issues which have a start and an end date; this is denoted by a slash. The second date in the span is always given in full, starting from the year. Thus we can have (199502/199503); this is 1995 February / March. Parentheses must be present even if no date is available (seldom the case) to separate the ISSN from the enumeration. Two-digit codes beginning with 2 replace the month for seasons, and codes beginning with 3 replace it for quarters (e.g. Summer 1980 is 198022).

The hierarchies of the enumeration are separated by colons.

The contribution segment is enclosed in < >. <> must be present if null. The control segment follows. Neither contribution nor control segment need concern us; they are described fully in the standard.

The SICI is highly dependent on an ISSN and on such having been assigned and used consistently.
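Since everything in the SICI hangs on the ISSN, a system would normally want to check that the ISSN it has been given is at least well formed. The sketch below applies the check-digit rule of the ISSN standard itself (weights 8 down to 2 on the first seven digits, modulus 11, with X standing for 10); this rule comes from the ISSN standard rather than from the SICI standard or the study described here, and the function name is invented for the illustration.

```python
def issn_is_valid(issn: str) -> bool:
    """Check the length and the modulus-11 check digit of an ISSN."""
    chars = issn.replace("-", "").upper()
    if len(chars) != 8 or not chars[:7].isdigit():
        return False
    # Weights 8, 7, ..., 2 are applied to the first seven digits.
    total = sum(int(digit) * weight for digit, weight in zip(chars[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return chars[7] == expected
```

A valid check digit shows only that the number has been transcribed correctly; it does not show that the ISSN has been assigned to the right title or used consistently across catalogues.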

Here is an example which the standard gives:

The Library (Sixth Series) XVI:4 (Dec. 1994) is represented by SICI: 0024-2160(199412)6:16:4
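The construction of the item segment described above can be sketched as follows. This is a minimal illustration covering only the item segment (ISSN, chronology and enumeration) and assuming purely numeric enumeration levels; the function names are invented, and the contribution and control segments are ignored.

```python
import re
from typing import List, Optional, Tuple


def chronology(year: int, month_or_code: Optional[int] = None,
               day: Optional[int] = None) -> str:
    """Build a date as specific as required: YYYY, YYYYMM or YYYYMMDD.

    The second argument is a month (01-12) or a season/quarter code
    such as 22 for summer.
    """
    text = f"{year:04d}"
    if month_or_code is not None:
        text += f"{month_or_code:02d}"
        if day is not None:
            text += f"{day:02d}"
    return text


def item_segment(issn: str, chron: str, enumeration: List[int]) -> str:
    """Join ISSN, parenthesised chronology and colon-separated enumeration."""
    levels = ":".join(str(level) for level in enumeration)
    return f"{issn}({chron}){levels}"


_ITEM = re.compile(r"^(\d{4}-\d{3}[\dX])\(([\d/]*)\)([\d:]*)")


def parse_item_segment(sici: str) -> Tuple[str, str, List[int]]:
    """Split the start of a SICI back into ISSN, chronology and enumeration."""
    match = _ITEM.match(sici)
    if match is None:
        raise ValueError(f"not a recognisable SICI item segment: {sici!r}")
    issn, chron, levels = match.groups()
    return issn, chron, [int(level) for level in levels.split(":") if level]
```

With these helpers, item_segment("0024-2160", chronology(1994, 12), [6, 16, 4]) reproduces the example from the standard, and a spanning date is formed by joining two chronologies with a slash, as in chronology(1995, 2) + "/" + chronology(1995, 3).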

3.4 Z39.50 (Search and Retrieve)

Z39.50, the Search and Retrieve standard of ANSI/NISO, is a means of standardising queries on databases. Using it will enable a user to do a search on his or her local server, which in turn can extend the search to other servers which have implemented the Z39.50 interface. Profiles are set up to allocate codes (called tags) to identify different data elements. The system which initiates the search on a Z39.50 server (the client) needs to know what a particular index on the host is equivalent to in its own system. It needs to know, for example, that its 'title keyword' index is called the 'title' index in the other system.
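The role of a profile in translating index names can be illustrated with a small lookup. The sketch below uses a handful of 'use' attribute numbers from the Bib-1 attribute set commonly used with Z39.50 (4 for title, 7 for ISBN, 8 for ISSN, 1003 for author) as the shared codes; the index names on the client and the server are invented for the example, and no real Z39.50 toolkit is involved.

```python
from typing import Dict

# A few Bib-1 'use' attribute numbers acting as the agreed common codes.
BIB1_USE: Dict[str, int] = {"title": 4, "isbn": 7, "issn": 8, "author": 1003}

# Hypothetical local index names on the client, mapped to access points.
CLIENT_INDEXES: Dict[str, str] = {
    "title keyword": "title",
    "author name": "author",
    "issn": "issn",
}

# Hypothetical index names on one remote server, keyed by the common code.
SERVER_INDEXES: Dict[int, str] = {4: "title", 1003: "author", 8: "issn"}


class UnsupportedIndex(LookupError):
    """Raised when the profile offers no code for the requested index."""


def server_index_for(client_index: str) -> str:
    """Translate a client index name into the server's name via the shared code."""
    try:
        access_point = CLIENT_INDEXES[client_index]
        use_attribute = BIB1_USE[access_point]
        return SERVER_INDEXES[use_attribute]
    except KeyError:
        raise UnsupportedIndex(client_index) from None
```

Here server_index_for("title keyword") yields "title", while a request for a "serial holdings" index raises UnsupportedIndex: if the profile has no code for a data element, the client has no way of asking for it.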

At present, the profiles available in the UK do not include serial holdings, so it is not possible to initiate a search on serial holdings. Even if the Z39.50 profiles made this possible, one would have to agree on detailed standards for the recording of these serial holdings, such as can be found in US MARC for Holdings, ISO 10324 or the SICI standard. At present many catalogues with serial holdings would not be able to provide such detail. There is another way round. The UK profiles require the return of a MARC record to satisfy a query. It would then be possible for the client to look at the MARC records and analyse them. This too depends on a consistent formulation of serial holdings in the source data, which was found to be lacking when records from different sources were studied. One reason for this is that the Anglo-American Cataloguing Rules are not very precise about how to describe serial holdings. This means that many records in machine-readable form will need to be upgraded in order to provide the data on serials holdings which will be needed for the electronic library. Many libraries are automating their serials acquisitions procedures; even these operate only on current holdings, and in order to include those from the past it will be necessary to recatalogue the earlier holdings manually.
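The client-side analysis mentioned above might begin by sorting returned records into those whose holdings follow an agreed pattern and those that will need upgrading. The sketch below assumes that the holdings text has already been extracted from each MARC record into a simple dictionary; the keys "source" and "holdings", the catalogue names and the pattern itself are assumptions made for the illustration.

```python
import re
from typing import Dict, List, Tuple

# Agreed pattern for this sketch: issue - issue, each as "vol.V, no.N(YYYY)" or "V:N(YYYY)".
_ISSUE = r"(?:vol\.\s*)?\d+\s*(?:,\s*no\.\s*|:)\d+\s*\(\d{4}\)"
_STATEMENT = re.compile("^" + _ISSUE + r"\s*-\s*" + _ISSUE + "$", re.IGNORECASE)


def split_by_consistency(records: List[Dict[str, str]]) -> Tuple[List[str], List[str]]:
    """Return (sources with parseable holdings, sources needing upgrade)."""
    consistent: List[str] = []
    to_upgrade: List[str] = []
    for record in records:
        statement = record["holdings"].strip()
        target = consistent if _STATEMENT.match(statement) else to_upgrade
        target.append(record["source"])
    return consistent, to_upgrade


# Hypothetical holdings statements returned by three different catalogues.
RECORDS = [
    {"source": "Catalogue A", "holdings": "Vol.2, no.5(1990)-vol.8, no.12(1997)"},
    {"source": "Catalogue B", "holdings": "2:5(1990)-8:12(1997)"},
    {"source": "Catalogue C", "holdings": "1990 onwards, some issues missing"},
]
```

Running split_by_consistency(RECORDS) places Catalogues A and B in the first list and Catalogue C in the second. Statements of the third kind are the ones that cannot be matched automatically until the records are upgraded.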

3.5 Requirements for additional standards

Thus there is a need for a number of different standards to be in place before projects of this kind can advance the electronic library. As well as serial holdings, there are other data elements, such as location and sub-location, which are required in the development of union catalogues.

4. Lessons for other countries

One of the lessons that can be deduced from this study of the requirements of serials holdings is that establishing standards is neither an easy nor a static activity. MARC in its early days was thought to have no requirement for holdings or locations. Now these are seen to be essential for the development of systems we have currently.

Countries which are developing standards now have the advantage of being able to establish them in advance of accruing large databases. It is enlightening to see also that establishing a national format requires the inclusion of fields for many different purposes which have only recently been established. Also, it becomes clear that, for the success of these large-scale projects, standards have to be developed at the level of the form of the data (the cataloguing rules) as well as at the level of the content designators and the machine-readable record structure.

Additionally, because of the choice of different standards that are available, it has to be stressed that decisions have to be made in each project and in each country. Recently, in Tunisia, I undertook a consultancy to advise the government on the implementation of automation in the university libraries. It was noteworthy that those in charge of implementing the project thought that library automation standards were ready and waiting to be implemented without any effort. As can be seen, there is a choice of MARC format. The Anglo-American Cataloguing Rules are not complete and need standards like ISO 10324 to complete them. Other standards like Z39.50 are still under development at the international level. So decisions have to be made all the while; the implementation of standards is not a once-and-for-all activity.

Thus, for the success of a large project the small detail of the way the data is structured is as important as the management of the large-scale project.