Bibframe, RDA, MARC, JSON... we help you untangle these concepts!

The exchange of bibliographic data is a long-standing achievement. We could begin our analysis with printed bibliographies and union catalogues on cards (keyword: Mundaneum / Paul Otlet). But for pragmatic reasons, we only mention the MARC format, which has symbolised this exchange of data since the computerisation of libraries. Dating from the late 1960s – long before the web, that is – MARC is still hegemonic today in library systems.

So for a long time now, data sharing has been part of the DNA of libraries. It is also what makes them so different from archives: resources are published, they are owned by other institutions, and therefore the description work can be pooled and shared. This context is changing nowadays with authority repositories, to which we will return later.

How is standard bibliographic exchange performed in libraries today?

We can break down our different standards into levels:

1. The record format: MARC/ISO or MARC/XML

As an example, here is a MARC record in native exchange format (according to the original ISO 2709 standard, designed to save storage space)...

00562nz  a2200169n  4500001001400000003000500014005001700019008004100036035001500077039010600092040002600198072000900224110006100233410004500294667003000339999002300369vtls009951124RERO20140816170306.0970318   a z  babn             ana     d  aA009951124 9a201408161703bVLOADc200902131049d9511c200902131048d9511c200406201733dVLOADy200406201021zVLOAD  aRERO vsmatbfrefrero  as1se2 aAssociation des métiers d'art et d'artisanat du Valais2 aMétiers d'art et d'artisanat du Valais  aDescripteur validé RERO  aVIRTUA50         y

... and here is its equivalent in MARC/XML (which consumes many more characters):

<?xml version="1.0" encoding="UTF-8" ?>
<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
    <marc:record>
        <marc:leader>00562nz  a2200169n  4500</marc:leader>
        <marc:controlfield tag="001">vtls009951124</marc:controlfield>
        <marc:controlfield tag="003">RERO</marc:controlfield>
        <marc:controlfield tag="005">20140816170306.0</marc:controlfield>
        <marc:controlfield tag="008">970318   a z  babn             ana     d</marc:controlfield>
        <marc:datafield tag="035" ind1=" " ind2=" ">
            <marc:subfield code="a">A009951124</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="039" ind1=" " ind2="9">
            <marc:subfield code="a">201408161703</marc:subfield>
            <marc:subfield code="b">VLOAD</marc:subfield>
            <marc:subfield code="c">200902131049</marc:subfield>
            <marc:subfield code="d">9511</marc:subfield>
            <marc:subfield code="c">200902131048</marc:subfield>
            <marc:subfield code="d">9511</marc:subfield>
            <marc:subfield code="c">200406201733</marc:subfield>
            <marc:subfield code="d">VLOAD</marc:subfield>
            <marc:subfield code="y">200406201021</marc:subfield>
            <marc:subfield code="z">VLOAD</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="040" ind1=" " ind2=" ">
            <marc:subfield code="a">RERO vsmat</marc:subfield>
            <marc:subfield code="b">fre</marc:subfield>
            <marc:subfield code="f">rero</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="072" ind1=" " ind2=" ">
            <marc:subfield code="a">s1se</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="110" ind1="2" ind2=" ">
            <marc:subfield code="a">Association des métiers d'art et d'artisanat du Valais</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="410" ind1="2" ind2=" ">
            <marc:subfield code="a">Métiers d'art et d'artisanat du Valais</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="667" ind1=" " ind2=" ">
            <marc:subfield code="a">Descripteur validé RERO</marc:subfield>
        </marc:datafield>
        <marc:datafield tag="999" ind1=" " ind2=" ">
            <marc:subfield code="a">VIRTUA50         y</marc:subfield>
        </marc:datafield>
    </marc:record>
</marc:collection>
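
To make the ISO 2709 layout less opaque, here is a minimal Python sketch that decodes only the beginning of the native record above: the 24-character leader and the directory that follows it. The function name `parse_leader_and_directory` is ours, chosen for the illustration. The field contents themselves are not decoded, because the control characters that delimit them are not printable.

```python
# Leader (24 characters) and directory of the native record shown above.
HEADER = (
    "00562nz  a2200169n  4500"
    "001001400000" "003000500014" "005001700019" "008004100036"
    "035001500077" "039010600092" "040002600198" "072000900224"
    "110006100233" "410004500294" "667003000339" "999002300369"
)


def parse_leader_and_directory(header: str):
    """Return (record_length, base_address, entries) from an ISO 2709 header."""
    leader = header[:24]
    record_length = int(leader[0:5])    # total length of the record
    base_address = int(leader[12:17])   # offset where the field data starts
    # The directory sits between the leader and the base address; the last
    # position before the base address is a field terminator, so it is excluded.
    directory = header[24:base_address - 1]
    entries = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[0:3]
        length = int(entry[3:7])        # field length, terminator included
        start = int(entry[7:12])        # offset relative to the base address
        entries.append((tag, length, start))
    return record_length, base_address, entries


record_length, base_address, entries = parse_leader_and_directory(HEADER)
for tag, length, start in entries:
    print(tag, length, start)
```

Each directory entry is 12 characters long: 3 for the tag, 4 for the field length and 5 for its starting position. This is how a program can jump straight to a field without reading the whole record, at the price of a format that is hard for humans to read.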

2. The internal structure of MARC

These structures are standardised and describe which contents are in which zones. The best known standards today are UNIMARC (title is in field 200) and MARC21 (title is in field 245). Both can be saved in MARC/ISO or MARC/XML.
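
The difference between the container (level 1) and the internal structure (level 2) can be sketched in a few lines of Python: the same MARC/XML parsing code serves both structures, and only the tag that is looked up changes. The helper `first_subfield`, the `TITLE_TAG` mapping and the sample record with its title are hypothetical, invented for this illustration. The tags 200 and 245 come from the text above; reading the title from subfield `a` is an assumption of the sketch.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Tag holding the title in each internal structure (from the text above).
TITLE_TAG = {"MARC21": "245", "UNIMARC": "200"}

# Hypothetical minimal record, invented for this sketch.
SAMPLE = """<marc:record xmlns:marc="http://www.loc.gov/MARC21/slim">
  <marc:datafield tag="245" ind1="1" ind2="0">
    <marc:subfield code="a">An invented title</marc:subfield>
  </marc:datafield>
</marc:record>"""


def first_subfield(record, tag, code):
    """Return the text of the first matching subfield, or None."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    node = record.find(path, MARC_NS)
    return None if node is None else node.text


record = ET.fromstring(SAMPLE)
# Read as MARC21, the record yields its title...
print(first_subfield(record, TITLE_TAG["MARC21"], "a"))
# ...read as UNIMARC, the same record has no field 200 and yields nothing.
print(first_subfield(record, TITLE_TAG["UNIMARC"], "a"))
```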

3. Cataloguing rules

These define the way in which the field contents are entered. For example, we can mention the source of information, the required and optional elements, the use of upper and lower case, the transcription of the text appearing on the resource, the use of specific abbreviations, the lists of authorised terms (roles, types of content, media), etc.

The most widely used bibliographic description rules today are called RDA (Resource Description and Access).

4. Data modelling may represent an extension, a fourth level.

Very often, this level is closely linked to the cataloguing rules. It indicates which entities exist and how they relate to each other. For a long time, libraries described only bibliographic, holdings and item records. In the early 2000s, FRBR modelling, now LRM, placed cataloguing in a broader context of entities: persons, works, places, concepts... These entities are generally shared with descriptions made in archives, museums and on the web in general, and thus open the way to a tremendous expansion of data sharing for which libraries remain the main guarantors. The GND for Cultural Data (GND4C) project is an eloquent example.

How RERO ILS is different

When designing RERO ILS, a fundamental choice was not to base the software directly on the MARC format. As RERO ILS is built on the technologies of the Invenio 3 framework, we naturally adopted JSON as the basis for the data. JSON is a format for structuring data, based on the same principle as XML, but natively supported by many recent programming languages.

To take up the principle of levels mentioned earlier, we propose the following comparison:

Level | RERO ILS | The rest of the world (with exceptions) | Our comments
Record format | JSON | MARC/ISO | We could very well imagine recording MARC21 or UNIMARC in JSON.
Internal structure | Bibframe | MARC21, UNIMARC, ... | Bibframe defines the overall name of the JSON elements and the type of data they contain: do we have a hierarchical Russian doll structure or do we have a simple element with text as a value?
Cataloguing rules | RDA | RDA | The still recent evolution of RDA leads to different levels of implementation depending on the libraries.
Data modelling | LRM (objective) | LRM (objective) | Modelling is implemented differently, depending on library systems, cataloguing standards and discovery interfaces. LRM is an ambitious goal that is difficult to achieve quickly...
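
As a rough illustration of the first comment in the comparison, MARC fields can be carried by JSON without losing tags, indicators or subfield codes. The layout used below (a `fields` list of single-key objects) is one possible convention chosen for this Python sketch, not the format used by RERO ILS; the values are taken from the authority record shown earlier.

```python
import json

# Two fields of the authority record shown earlier, expressed as plain
# Python data. The layout is an assumption made for this sketch.
marc_as_json = {
    "leader": "00562nz  a2200169n  4500",
    "fields": [
        {"003": "RERO"},
        {
            "110": {
                "ind1": "2",
                "ind2": " ",
                "subfields": [
                    {"a": "Association des métiers d'art et d'artisanat du Valais"}
                ],
            }
        },
    ],
}

serialised = json.dumps(marc_as_json, ensure_ascii=False, indent=2)
print(serialised)

# Reading the JSON text back gives exactly the same structure.
restored = json.loads(serialised)
```

The record format changes here, but the internal structure does not: the data is still organised by MARC tags, which is precisely what Bibframe replaces at the next level.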

Concrete example of Bibframe/JSON

The Bibframe/JSON structure is used in both projects, RERO ILS and SONAR, in a very similar way.

Here is a JSON extract from the document "Bulle au pluriel" (see the record in the RERO+ Catalogue or the JSON data in the RERO ILS API). This extract corresponds to the author, namely:

  • in RDA: an agent (person) associated with a manifestation/work (a document)
  • in MARC21: the field 100 or 700
  • in Bibframe: the field contribution
    "contribution": [
      {
        "agent": {
          "$ref": "https://mef.rero.ch/api/agents/idref/254732569",
          "type": "bf:Person"
        },
        "role": [
          "cre"
        ]
      }
    ]

Unlike MARC, Bibframe and JSON allow elements to be nested in a multi-level hierarchy, which makes more granular and structured data possible. Thus, in our example, the key $ref (referring to the author in the IdRef repository) is within the key agent, which is itself within contribution. This granularity overcomes some of the limitations of MARC (for those who remember, for example, the 245$6 or 336$8 linking sub-fields).
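
In code, this nesting translates into plain key lookups. The Python sketch below loads the extract above (wrapped in braces so that it forms a complete JSON object) and walks down to the `$ref` and the role; nothing else is assumed about the rest of the record.

```python
import json

# The extract shown above, wrapped in braces to make it a valid JSON object.
document = json.loads("""
{
  "contribution": [
    {
      "agent": {
        "$ref": "https://mef.rero.ch/api/agents/idref/254732569",
        "type": "bf:Person"
      },
      "role": ["cre"]
    }
  ]
}
""")

for contribution in document["contribution"]:
    agent = contribution["agent"]
    # Three levels deep: contribution -> agent -> $ref
    print(agent["type"], agent["$ref"], contribution["role"])
```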

Why doesn't the author's name appear? The identifier is preferred: this is how Linked Data works. If the name is changed, none of the linked documents has to be changed. However, these documents are reindexed to allow an efficient search based on the new name.
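
The effect of storing an identifier rather than a name can be sketched as follows in Python. The dictionary `authority_store` is a hypothetical in-memory stand-in for the authority repository, and the document titles and name forms are placeholders, not real data; only the `$ref` URL comes from the extract above.

```python
REF = "https://mef.rero.ch/api/agents/idref/254732569"

# Hypothetical stand-in for the authority repository (placeholder names).
authority_store = {REF: {"preferred_name": "Name, Old form"}}

# Two documents that both point to the same agent by identifier only.
documents = [
    {"title": "Document A", "contribution": [{"agent": {"$ref": REF}}]},
    {"title": "Document B", "contribution": [{"agent": {"$ref": REF}}]},
]


def display_names(document, store):
    """Resolve each contribution's $ref to the current preferred name."""
    return [
        store[c["agent"]["$ref"]]["preferred_name"]
        for c in document["contribution"]
    ]


before = [display_names(d, authority_store) for d in documents]

# The name is corrected once, in the authority record only.
authority_store[REF]["preferred_name"] = "Name, New form"

after = [display_names(d, authority_store) for d in documents]
print(before)
print(after)
```

The two documents are never modified: they display the new form simply because the reference is resolved again. Only the search index needs refreshing.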

You said there was RDA? Yes, RDA is recognisable in the vocabulary used in the values, like the code cre (creator) for the role. The agent key comes from Bibframe and happily coincides with the RDA terminology. The full RDA layer is only added in the editor, and is translated according to the language of the interface, in French in the image below.

[Image: editeur_contribution, the contribution in the RERO ILS editor, French interface]

Is RERO ILS interoperable?

We are not going to say otherwise! RERO ILS is interoperable with the rest of the world because it handles the MARC format in import and export. What happens inside the system is not really relevant from an interoperability point of view.

For exchange processes to take place, library systems rely on protocols, like SRU (successor to Z39.50), which describes exactly how data can be searched and retrieved, notably in MARC/XML.

Thus, RERO ILS is able to import in MARC (via the "Import from the web" function) and to export in MARC via SRU. See for example all the documents of Jean Piaget (whose PPN IdRef is 027071170) via the SRU service.
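
For readers who have never seen an SRU request, the Python sketch below builds one without sending it. The parameter names (`operation`, `version`, `query`, `recordSchema`, `maximumRecords`) are those of the SRU standard. The base URL, the schema identifier `marcxml` and the CQL query are assumptions: each server publishes its own address, schemas and search indexes, and this is not the actual RERO ILS endpoint.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint: replace with the address published by the server.
BASE_URL = "https://sru.example.org/sru"


def build_sru_url(base_url, cql_query, schema="marcxml", maximum_records=10):
    """Build an SRU searchRetrieve URL (the request is not sent)."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "recordSchema": schema,  # assumed schema identifier
        "maximumRecords": str(maximum_records),
    }
    return f"{base_url}?{urlencode(params)}"


# The CQL index name below is an assumption; servers define their own.
url = build_sru_url(BASE_URL, 'dc.creator = "Piaget, Jean"')
print(url)

# Decoding the query string gives back the original parameters.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])
```

The response to such a request is an XML document wrapping the matching records, which a client can then read in the requested schema.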

Finally, one could almost say that RERO ILS is naturally interoperable since the project is Open Source.

Why not do what everyone else is doing?

MARC is an extremely well designed standard for its time, and is to some extent scalable: its lifespan proves it! However, web technologies can more easily accommodate new formats such as XML and JSON. This is also and especially the case for the people developing today's applications: they interact with library data to develop institutional websites, mobile applications, web services, etc. Often, these people do not only work with library data, and therefore prefer generalist formats that are widespread in all kinds of domains.

This is one of the strengths of RERO ILS, which works natively with some modern standards such as REST APIs.
