Problems of Implementing ISO 2709 Formats on CDS/ISIS
(Russian title, translated: Using the ISO 2709 Format with CDS/ISIS)

Alan Hopkinson
Middlesex University, London, Great Britain

Abstract (translated from Russian): The ISO 2709, MARC and CCF formats are described, together with their structure and the particulars of their use in the CDS/ISIS environment. Sample records in these formats are given in the annexes.

1 Introduction

CDS/ISIS is an information retrieval package in the ISIS family. It originated as a mainframe package, based on the ISIS software that was first developed to run on IBM computers at the International Labour Office in Geneva. From early in its development as a mainframe package, one of its important features was an ISO 2709 [1] interface. ISO 2709 is the standard which specifies the record structure on which the MARC formats and associated formats such as Unesco's CCF [2] are based.

MARC stands for Machine-Readable Cataloguing, but is in fact a format for recording bibliographic information divided into records, each record usually corresponding to a book. It originated in the US in 1968 and has had many national manifestations, including UK MARC, Canadian MARC, MEKOF and many others. There is also an international version, UNIMARC, developed in 1977. Its aim is to enable conversion between the different national formats, but it has been adopted as the national format by many countries, such as Japan, South Africa, Portugal and Croatia.

The Common Communication Format, or CCF, was developed to provide a detailed and structured method for recording a number of mandatory and optional elements in machine-readable form for exchange between different bibliographic systems.
Whereas MARC formats are used in general, though not exclusively, by the library community, the CCF is also used by organisations in the secondary services community: organisations setting up databases which are intended to be focussed as much on journal articles as on books and monographic materials. The CCF was intended originally to serve as an exchange mechanism for records going between secondary services and libraries. It ended up replacing these other formats: just as many national libraries adopted UNIMARC once it had become available, many secondary services looking for a format tended to adopt the CCF. It differs from MARC in that it makes use of the fourth element of the ISO 2709 directory, which became an option in the second (1981) edition of ISO 2709. Other differences include the use of upper case for subfields and an emphasis on record linking.

This paper assumes a certain level of knowledge of ISO 2709, MARC formats and the CCF, and of the relevant data processing terminology.

2 Implementing MARC on CDS/ISIS

2.1 ISO 2709 tags

CDS/ISIS has been developed to be compatible with ISO 2709. The mainframe version originally had two-digit tags, and traces of this can be seen in the micro version, which started out with two-digit tags but was then modified to allow for three. In the default print format in version 1 of CDS/ISIS only two digits were displayed, which made it of little use as a default print format, though this is corrected in later versions. MARC uses only 3-digit tags, so there is no problem either in the storage or in the export of data. The export and import of data is undertaken by means of the module Master File Services. (In version 1 of CDS/ISIS, ISISXCH is the name of the program and in later versions ISISXCH is its name on menus, so it will be referred to in this paper in this way.) Consequently, MARC tags can be used in the system.
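To make the role of the tag and of the optional fourth directory element concrete, the following is a minimal Python sketch of an ISO 2709 record with a 24-character label and a directory. It is not how ISISXCH works internally. The function names `build_record` and `read_directory` are hypothetical, the field data are assumed to be plain ASCII, the unused label positions are simply left blank, and the two-character fourth element in the usage line is only illustrative.

```python
FIELD_END = "\x1e"   # ISO 2709 field terminator (ASCII 30)
RECORD_END = "\x1d"  # ISO 2709 record terminator (ASCII 29)


def build_record(fields, impl_len=0):
    """Assemble a minimal ISO 2709 record from (tag, extra, data) triples.

    `extra` is the optional fourth directory element. It must be exactly
    `impl_len` characters long (empty when impl_len is 0, as in MARC).
    """
    directory, body = "", ""
    for tag, extra, data in fields:
        if len(tag) != 3:
            raise ValueError(f"ISO 2709 tags are three characters: {tag!r}")
        if len(extra) != impl_len:
            raise ValueError(f"fourth element must be {impl_len} characters")
        chunk = data + FIELD_END
        # Directory entry: tag (3), field length (4), start position (5), extra.
        directory += f"{tag}{len(chunk):04d}{len(body):05d}{extra}"
        body += chunk
    directory += FIELD_END
    base = 24 + len(directory)          # base address of data
    total = base + len(body) + 1        # +1 for the record terminator
    # Label: record length, 7 positions left blank here, base address,
    # 3 positions left blank, then the 4-character directory map.
    label = f"{total:05d}{'':7}{base:05d}{'':3}45{impl_len}0"
    return label + directory + body + RECORD_END


def read_directory(record):
    """Return the (tag, extra, data) triples described by the directory."""
    base = int(record[12:17])
    len_len, start_len, impl_len = (int(c) for c in record[20:23])
    entry_len = 3 + len_len + start_len + impl_len
    directory = record[24:base - 1]     # drop the directory's own terminator
    fields = []
    for i in range(0, len(directory), entry_len):
        entry = directory[i:i + entry_len]
        tag = entry[:3]
        length = int(entry[3:3 + len_len])
        start = int(entry[3 + len_len:3 + len_len + start_len])
        extra = entry[3 + len_len + start_len:]
        data = record[base + start:base + start + length - 1]
        fields.append((tag, extra, data))
    return fields


# MARC-style record: three-element directory entries.
marc_like = build_record([("001", "", "20055"), ("210", "", "^aGeneva^cILO^d1988")])
# Record using a two-character fourth element (illustrative values).
extended = build_record([("200", "10", "^aForeign Agriculture")], impl_len=2)
print(read_directory(marc_like))
print(read_directory(extended))
```

The sketch shows why a tag longer than three characters cannot simply be exported: every directory entry has a fixed width declared in the label, so any extra information such as a segment identifier has to travel in the fourth element rather than in the tag.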

2.2 ISO 2709 Subfields

CDS/ISIS supports alphabetic subfields; the subfields of MARC can therefore be used. The first character of the subfield identifier, which is ASCII 31 in MARC, has to be represented in Micro-ISIS by '^', which is ASCII 94 on the most used code pages, e.g. 450 and 851. On import, ASCII 31 therefore has to be changed to ASCII 94, and on export ASCII 94 has to be changed back to ASCII 31. This can be done by a conversion table at output, by a global edit, or by a special Pascal program.

However, repeating subfields within one field, which are used extensively in most MARC formats, are not treated correctly. Only the first occurrence is displayed or processed by a reformatting FST, so the result may not always be the one desired either in display or in export. Repeating subfields will be processed on import and export without any problem, at least so long as no conversion table (FST) is used. The solution here is to write format exits which can display repeating subfields; these could probably also be used to export or import repeating subfields.

There are other ways round the problem. Repeating fields can be used in a field which is not normally repeating, such as the imprint field. Thus UK MARC imprint field 260, which does not repeat at the field level but has repeating subfields, can be entered as follows:

    ^aLondon%^aMoscow^bLanguage Press^c1990

This works within the system, and when the data are exported they can also be exported into one field. But this is merely an ad hoc solution for a particular field and has to be used with care.

Incidentally, MARC formats go hand in hand with the use of ISBD, the International Standard Bibliographic Description, and it is interesting to note that in general this is supported by CDS/ISIS. MARC subfields correspond to ISBD punctuation (which is one reason why subfields are repeatable). Since repeated subfields do not display, one possibility is to change the subfield identifier to a different one on the second or any subsequent occurrence.
These can then be changed back in the reformatting FST used on export.

2.3 Indicators in CDS/ISIS

Indicators may be entered at the beginning of each field as they would appear in an ISO 2709 record. They will not be displayed in printout when subfields are processed (and when repeated subfields are not visible). When indicators are unused (e.g. '00') or always given the same value in a particular field, they may be added during the ISISXCH process (by means of a reformatting FST), to avoid having to enter them each time or to store them in the database.

2.4 Linking entry fields

UNIMARC uses a mechanism for linking between different records. This may be used for linking between parents and children and vice versa, or between earlier and later titles, etc. It is possible to make record-to-record links to other records on the database using a record number, which could be the MFN or another identifier. Embedded fields may be produced on output. They are best not stored but created at export by following the link to the target record, using a sophisticated FST entry that applies the REF function when the link is made via MFN, or the REF(L) function when the link is made via another identifier. The UNIMARC methodology is a complex one and requires a complex solution, but CDS/ISIS already provides the mechanism.

2.5 ISO 2709 record label

The label cannot be manipulated, so it is not possible to enter the record status or bibliographic level codes as prescribed in the MARC standard. Again, these could be added by a separate program to the record as output from CDS/ISIS. Within CDS/ISIS itself, the only solution seems to be to rewrite the ISISXCH software so that the label can be created record by record from bibliographic fields in the record reserved for this purpose.

2.6 MARC fixed fields

Fixed fields (e.g. 100) are difficult to enter, but this can be overcome. Extra non-UNIMARC fields may be added to the database, each containing one fixed-length element.
They can then be output as one UNIMARC field by using a reformatting FST when the ISO 2709 record is created.

3 Implementing the CCF on CDS/ISIS

3.1 Segments in CDS/ISIS

The module ISISXCH includes a program for converting data into and out of the database in the ISO 2709 format. CDS/ISIS uses a non-standard version of ISO 2709, tailored for the MS-DOS file system. CDS/ISIS does not allow the record label to be manipulated, and cannot read or write the extended directory introduced by ISO 2709-1981. These features could be provided by writing a program to massage an ISO 2709 record; indeed the International Development Research Centre in Ottawa has made available a program written in BASIC which converts ISO 2709 output from MINISIS, another software package in the ISIS family, into an ISO 2709 format acceptable to CDS/ISIS. This program could be adapted to make other changes to the records, for example converting ^ to ASCII 31.

It will of course be necessary to include the segments of the CCF in the records. There are two ways in which this can be done.

1. Flat record structure

It may be felt appropriate, for simplicity's sake, for segments to be represented by 'flattening' the record structure. One way of doing this is to use the basic CCF fields for the primary segment and add other fields for the secondary segment; then to devise another set which correspond to the fields in segment 1, another for segment 2, and so on. Obviously it is tempting to limit the number of segments that one will need to use, to avoid having to set up a large number of fields.

          Seg 0                Seg 1                Seg 2
    CCF     CDS/ISIS     CCF     CDS/ISIS     CCF     CDS/ISIS
    001     001          010     700          010     800
    100     100          100     701          100     801
    200     200          200     720          200     820
    300     300          300     730          300     830
    400     400          400     740          400     840
    440     440          440     744          440     844
    620     620          620     762          620     862

Fig. 1: Flat record structure equivalents in CDS/ISIS

These fields could easily be converted logically to the CCF structure.
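A minimal Python sketch of that logical conversion follows. It assumes the tag equivalents of Fig. 1 and nothing more: the names `FLAT_TO_CCF` and `to_segments` are hypothetical, and the sketch only regroups fields by segment. It does not generate linking fields or write anything to the ISO 2709 directory.

```python
# Flat CDS/ISIS tag -> (segment number, CCF tag), transcribed from Fig. 1.
FLAT_TO_CCF = {
    "001": (0, "001"), "100": (0, "100"), "200": (0, "200"), "300": (0, "300"),
    "400": (0, "400"), "440": (0, "440"), "620": (0, "620"),
    "700": (1, "010"), "701": (1, "100"), "720": (1, "200"), "730": (1, "300"),
    "740": (1, "400"), "744": (1, "440"), "762": (1, "620"),
    "800": (2, "010"), "801": (2, "100"), "820": (2, "200"), "830": (2, "300"),
    "840": (2, "400"), "844": (2, "440"), "862": (2, "620"),
}


def to_segments(flat_fields):
    """Group (flat_tag, value) pairs into {segment: [(ccf_tag, value), ...]}."""
    segments = {}
    for flat_tag, value in flat_fields:
        if flat_tag not in FLAT_TO_CCF:
            raise KeyError(f"no CCF equivalent listed for tag {flat_tag}")
        segment, ccf_tag = FLAT_TO_CCF[flat_tag]
        # Field order within each segment follows the order of the input.
        segments.setdefault(segment, []).append((ccf_tag, value))
    return segments


flat_record = [
    ("200", "^aThe strategy of food aid"),
    ("720", "^aForeign Agriculture"),
]
print(to_segments(flat_record))
```

A lookup table keeps the mapping easy to audit against the figure, at the cost of having to list every tag in every segment explicitly.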
All that would be required (and these are not there at the moment) would be a facility to produce the segment identifier in the fourth element of the directory and to generate the linking fields. CDS/ISIS allows tags of five digits, so a four-digit tag could be used to include a segment identifier. The repeat indicator would then have to be added by means of a program which could be written to interface with this. However, it has to be remembered that ISO 2709 does not allow tags of more than three digits, and CDS/ISIS has no mechanism to export any tag other than a 3-digit tag, though it may hold 5-digit tags internally.

2. Generating separate segments from separate records on the CDS/ISIS database

This possibility can be realized in version 2 of CDS/ISIS. Each segment can be made a separate record and linked. They can be output as one printed record in version 2 of CDS/ISIS. The linking fields (080) may be entered by the cataloguer, though this may be felt to be a little cumbersome and open to error. Alternatively, all that is needed is a methodology to be followed in data entry to enable the 080 linking fields to be created. A simple program could effect a suitable conversion from the field in the CDS/ISIS record which represents the link to the 080 field. It does not appear that ISISXCH (the processor converting data to and from ISO 2709) would be able to generate the segment and repeat identifiers.

3.2 CCF subfields in CDS/ISIS

As far as CDS/ISIS is concerned, there is no difference between the subfields of MARC and those of the CCF.

3.3 CCF indicators in CDS/ISIS

In the CCF, indicators are usually optional and need not be used. When they are not used, the character in the label which indicates the number of indicators should be set to 0.

3.4 CCF implementation

It may be worth noting that sample databases based on the CCF have been produced by UNESCO, and these include diskettes containing the databases. These use 4-digit tags to cope with the problem of segments.
One of these has been published by UNESCO [3]; the other, the Integrated Database using the CCF, has been made available via distributors as a database named CCF, with a manual included as a file. The field definition table is included in Annex 1.

4 Windows version

So far the Windows version of CDS/ISIS, currently in beta-test, has not been mentioned. It is interesting to observe that one of the main problem areas mentioned above, repeating subfields, appears to work, though I have not fully tested it there or elsewhere. Unfortunately, format exits have not been implemented in the beta-test version that I have, and I have had system crashes when attempting to get a display format to work which uses the REF and L functions on databases which have been set up to include record linking.

5 Conclusion

Although CDS/ISIS cannot yet produce output exactly according to either MARC or the CCF, both formats are almost supported. Moreover, it is possible to implement both formats in a way that avoids some of the problems outlined above. For example, in many systems links to associated records are made by means of notes rather than by actually linking between records: it may be more attractive to keep the system simple for its users than to have a complex system which requires complex input.

It is also always necessary to remember that the purpose of an exchange format is to serve as a medium for the exchange of data. It is not necessary to store the data internally in any particular form. However, the nearer the data as entered and stored are to the exchange format, the easier it is to manage the system. It would therefore be a positive benefit, to smaller institutions in particular who cannot undertake their own Pascal programming, if the necessary alterations were made to the CDS/ISIS program to enable the features outlined in this paper, or if they were incorporated into the Windows version.

6 Bibliography

1. INTERNATIONAL ORGANIZATION FOR STANDARDIZATION.
Documentation: format for bibliographic information interchange on magnetic tape. [2nd ed.] Geneva, ISO, 1981 (ISO 2709-1981). The first edition was published in 1973.

2. CCF: the Common Communication Format. 2nd ed. Paris, Unesco, 1988 (PGI-88/WS/2).

3. International Information System on Cultural Development: CDS/ISIS model database: manual and accompanying diskette. Paris, UNESCO, 1994 (PGI-93/WS/16).

ANNEX 1

CDS/ISIS EXAMPLES: CCF

RECORDS IN DEFAULT PRINT FORMAT

    Record number: 0053
    Bibliographic level            M
    Date record entered            ^a19860408
    Language of item               ^aeng
    Type of Material               100
    Title                          ^aThe African food problem : from famine to food self-reliance^bby Maurice J.Williams
    Name of person                 ^aWilliams^bMaurice J.
    Name of Meeting                ^aConference on Development Education^gLeinster House, Dublin^i23 January 1986
    Imprint                        ^as.l.^bWorld Food Council
    Date of Publication            ^a19860000
    Physical Description           ^a11p
    Note                           ^aPaper for the conference on Development Education, Leinster House, Dublin, 23 January 1986
    Descriptor
    Bibliography no.               c1068

    Record number: 002001
    Language of item               ^aeng
    Type of Material               100
    Title                          ^aThe strategy of food aid^bSherman E. Johnson
    Name of person                 ^aJohnson^bSherman E
    Date of Publication            ^a19620000
    Part Statement                 ^aVol. 26(1)^bpp. 3-5, 22
    Descriptor
    Title (2nd level)              ^aForeign Agriculture
    Bibliographic level (lev 2d)   s

RECORDS IN ISBD FORMAT

    00053    Bibliography No: c1068
    WILLIAMS, Maurice J.
    CONFERENCE ON DEVELOPMENT EDUCATION : LEINSTER HOUSE, DUBLIN)
    The African food problem : from famine to food self-reliance / by Maurice J. Williams. -- s.l. : World Food Council, 1986. -- 11p
    Paper for the conference on Development Education, Leinster House, Dublin, 23 January 1986.
    Terms: Afr; balance; agdev; trade; region; foodaid; emergency; admin.
    Language code: ENG. Bibliographic level: m.

    0002001
    JOHNSON, Sherman E.
    The strategy of food aid / Sherman E. Johnson
    In: Foreign Agriculture Vol. 26(1) (1962-00-00), pp. 3-5, 22
    Terms: foodaid; policydon; USA.
    Language code: ENG. Bibliographic level: a. Secondary bibliographic level: s.

TABLE OF CCF FIELDS IN CDS/ISIS

    FIELD NAME                        PERMITTED SUBFIELDS    TAG
                                      OR PATTERN
    Record identifier                 9999999                1
    Bibliographic level               A                      15
    Date record entered               A                      22
    Language of record                A                      31
    Language of item                  A                      40
    Language of summaries             A                      41
    Physical Medium                   A                      50
    Type of Material                  A                      60
    ISBN                              ABC                    100
    ISSN                              A                      101
    CODEN                             XXXXXX                 102
    Document Number                   A                      120
    Title                             ABL                    200
    Key title                         AB                     201
    Parallel Title                    ABL                    210
    Spine Title                       AL                     220
    Cover Title                       AL                     221
    Added Title-Page Title            AL                     222
    Running Title                     AL                     223
    Other Variant Title               AL                     230
    Uniform title                     ABCDEFGL               240
    Edition Statement                 ABL                    260
    Personal author                   ABCDEFZ                300
    Name of Corporate Body            ABCDEFGLZ              310
    Name of Meeting                   ABCDEFGHIJLZ           320
    Affiliation                       ABCDEL                 330
    Place of Publication/Publisher    ABCD                   400
    Manufacturer                      ABCD                   410
    Distributor                       ABCD                   420
    Date of Publication               AB                     440
    Date of Legal Deposit             A                      441
    Serial Numbering & Date           A                      450
    Physical Description              ABCD                   460
    Price and binding                 ABC                    465
    Series Statement                  ABCDL                  480
    Part Statement                    ABC                    490
    Note                              A                      500
    Note on Bibliographic Relation    A                      510
    Contents Note                     A                      530
    Abstract                          A                      600
    Classification                    AB                     610
    Descriptor                        A                      620
    Geographical area                 A                      630
    Title (secondary segment)         ABL                    700
    Bibliographic level (2nd seg)     A                      701
    Par'll title (2nd segment)        ABL                    705
    Edition statement (2nd seg)       ABL                    709
    Personal author (2nd segment)     ABCDEFZ                710
    Corporate body (2nd segment)      ABCDEFGLZ              711
    Meeting Name (2nd segment)        ABCDEFGHIJKLZ          712
    Place of publication (2nd seg)    ABCD                   715
    Date of publication (2nd seg)     AC                     716
    Location of document              A                      998
    Bibliography number               A                      999

ANNEX 2

CDS/ISIS Example: UNIMARC

Record of monograph in internal format containing a link to a monographic series.

    Label bibliographical level code: m
    001 20055
    010 ^a92-2-106396-8
    101 0 ^aeng
    200 1 ^aFrom a developing to a newly industrialised country ^ethe republic of Korea 1961-82^fTony Michell
    210 ^aGeneva^cILO^d1988
    215 ^axii, 180 p
    225 2 ^aEmployment, adjustment and industrialisation^x02573415^v6
    461 20054^v6
    700 1^aMichell^bTony
    960 19890208
    961 d1988____
    962 f
    963 0
    964 y

Record of series

    Label bibliographical level code: s
    001 20054
    011 ^a0257-3415
    101 0 ^aeng
    200 1 ^aEmployment, adjustment and industrialisation
    210 ^aGeneva^cILO^d1986
    712 02^aILO^31092
    960 19861218
    961 s19869999
    962 f
    963 0
    964 y

Records of above from UNIMARC record (displayed in diagnostic format)

    Label bibliographical level code: m
    001 20055
    010 $a92-2-106396-8
    100 $a19890208d1988 f0ENGy0103a
    101 0 $aeng
    200 1 $aFrom a developing to a newly industrialised country $ethe republic of Korea 1961-82$fTony Michell
    210 $aGeneva$cILO$d1988
    215 $axii, 180 p
    225 2 $aEmployment, adjustment and industrialisation$x0257 3415$v6
    461 1$100120054$12001 $v6
    700 1$aMichell$bTony

Record of series

    Label bibliographical level code: s
    001 20054
    011 $a0257-3415
    100 $a19861218s19869999 f0engy0103a
    101 0 $aeng
    200 1 $aEmployment, adjustment and industrialisation
    210 $aGeneva$cILO$d1986
    712 02$aILO$31092

Output in ISBD form

    Michell, Tony
    From a developing to a newly industrialised country : the republic of Korea 1961-82 / Tony Michell. -- Geneva : ILO, 1988. -- xii, 180 p. -- (Employment, adjustment and industrialisation, ISSN 0257-3415 ; 6). -- ISBN 92-2-106396-8

    ADDED ENTRIES
    CORPORATE AUTHOR(S): ILO
    SERIAL TITLE: Employment, adjustment and industrialisation

    Record no: 20055.
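
The records in this annex show the same fields with '^' in the internal format and with a visible '$' standing for the subfield delimiter in the diagnostic format. As a minimal Python sketch of the handling discussed in section 2.2, the following splits a field into its subfields while keeping repeated subfields, folds '%'-separated occurrences into a single field, and replaces '^' with ASCII 31 for export. The function names are hypothetical, and the sketch assumes that '^' and '%' never occur as ordinary data characters.

```python
SUBFIELD_MARK = "\x1f"  # ASCII 31, the ISO 2709 subfield delimiter


def split_subfields(field):
    """Return (leading_text, [(code, value), ...]) for a '^'-delimited field.

    Text before the first '^' (for example indicators) is returned
    separately. Repeated subfield codes are all kept, in order.
    """
    head, *parts = field.split("^")
    return head, [(part[0], part[1:]) for part in parts if part]


def merge_occurrences(entry):
    """Fold '%'-separated occurrences, as typed at data entry, into one field."""
    return "".join(entry.split("%"))


def to_iso2709(field):
    """Replace the CDS/ISIS '^' marker with the ISO 2709 delimiter."""
    return field.replace("^", SUBFIELD_MARK)


imprint = merge_occurrences("^aLondon%^aMoscow^bLanguage Press^c1990")
print(split_subfields(imprint))
print(repr(to_iso2709(imprint)))
```

Returning a list of pairs rather than a dictionary is what preserves the second ^a subfield, which a lookup keyed on the subfield code would silently drop.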