Bibliographic index card records
Although the majority of library catalogues are now digitized, under the hood most still follow an index card paradigm similar to the record shown below for my CiTO paper. That record uses PubMed tag-value pairs to encode key bibliographic information.
Note that in this bibliographic record, there are no hierarchical structures and no explicit relationships between the statements, although there is an implicit presumption that all the tag-value pairs recorded on this card relate to the same article.
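To make the flat structure concrete, here is a minimal Python sketch that reads a card of this kind into a list of tag-value pairs. The tags (PMID, TI, AU, DP) follow the MEDLINE display style used by PubMed, and the values are taken from the citation of the CiTO paper. The exact layout of the card and the `parse_card` helper are assumptions made for illustration.

```python
# Illustrative MEDLINE-style tag-value lines. The values come from the
# citation of the CiTO paper; the exact card layout is an assumption.
CARD = """\
PMID- 20626926
TI  - CiTO, the Citation Typing Ontology
AU  - Shotton D
DP  - 2010
"""


def parse_card(text):
    """Return a list of (tag, value) pairs, one per non-empty line."""
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at the first "- " separator that follows the padded tag.
        tag, _, value = line.partition("- ")
        pairs.append((tag.strip(), value.strip()))
    return pairs


record = parse_card(CARD)
for tag, value in record:
    print(f"{tag}: {value}")
```

The result is just a flat list. Nothing in the data itself says that "Shotton D" is a person or that the pairs describe the same article; that knowledge lives only in the reader's head.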
Bibliographic records as RDF graphs
In contrast, a generic RDF graph encoding such bibliographic information looks like this:
Here, the primary information graph about the paper itself links to other graphs describing the author and the publisher, creating a small web of linked data.
[Note: An introduction to RDF and linked data is given in the first paper in this series, entitled What are linked data?]
Clearly, an RDF graph drawn in this way is not easily machine-readable. However, it can be written out (‘serialized’) as a series of simple machine-readable RDF statements, which, in Turtle (Terse RDF Triple Language) notation, are as follows:
<http://dx.doi.org/10.1186/2041-1480-1-S1-S6>   # URI of the CiTO paper in Journal of Biomedical Semantics
    rdf:type fabio:JournalArticle ;
    dc:title "CiTO, the Citation Typing Ontology" ;
    fabio:hasPublicationYear "2010"^^xsd:gYear ;
    prism:publicationDate "2010-06-22"^^xsd:date ;
    dcterms:bibliographicCitation "Shotton D (2010). CiTO, the Citation Typing Ontology. J. Biomed. Semant. 1,S1: S6." ;
    prism:doi "10.1186/2041-1480-1-S1-S6" ;
    fabio:hasPubMedId "20626926" ;
    dcterms:publisher [
        rdf:type foaf:Organization ;
        foaf:name "BioMed Central" ;
        foaf:homepage <http://www.biomedcentral.com/>
    ] ;
    dcterms:creator [
        rdf:type foaf:Person ;
        foaf:name "David Shotton" ;
        foaf:mbox <mailto:email@example.com> ;
        foaf:workplaceHomepage <http://www.zoo.ox.ac.uk/staff/academics/shotton_dm.htm>
    ] .
[Note: A guide to understanding Turtle for the uninitiated is given in the previous post in this series, entitled Libraries and Linked Data #2: Rough Guide to Turtle.]
Notice the compact and easily comprehensible nature of this encoding. Note also how terms (class and property names) from different ontologies and structured vocabularies have been combined to create these RDF statements. Such use of terms from pre-existing well-used ontologies, such as the Dublin Core Metadata Initiative metadata terms and the Friend of a Friend Vocabulary, is good practice when creating RDF descriptions, because it builds on previous effort where possible, and reduces the number of new ontological descriptions that are required. A list of open linked data vocabularies, useful for finding required terms in existing vocabularies, is given by the Open Knowledge Foundation’s Linked Open Vocabularies site. Those specific to libraries – one of the biggest clusters – are given at http://lov.okfn.org/dataset/lov/details/vocabularySpace_Library.html.
Other RDF statements could be added to the RDF graph given above, for example detailing the author’s institutional affiliation, thereby enriching the information content of this graph of linked data.
If other RDF graphs are published by third parties in which BioMed Central is similarly defined as a publisher, then the CiTO graph given above can be combined automatically with the others to form an interconnected information network – a larger RDF graph of ‘linked data’ about bibliographic entities and their publishers – in which the truth content of each original statement is maintained, thereby enlarging the web of knowledge, the Semantic Web.
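The merging step can be sketched in a few lines of Python by treating each graph as a set of (subject, predicate, object) triples, so that combining graphs is a set union. This is only an illustration under stated assumptions: the publisher is given a URI (here its homepage address, chosen hypothetically), because the blank node used for the publisher in the Turtle above has no global identity and would not merge across graphs. The second article URI and the `published_by` helper are likewise invented for the example.

```python
# Hypothetical URI standing in for BioMed Central (an assumption: the Turtle
# in this post uses a blank node, which cannot be shared between graphs).
BMC = "http://www.biomedcentral.com/"
CITO_PAPER = "http://dx.doi.org/10.1186/2041-1480-1-S1-S6"
OTHER_PAPER = "http://example.org/another-article"  # hypothetical third-party resource

# Each graph is a set of (subject, predicate, object) triples.
cito_graph = {
    (CITO_PAPER, "dc:title", "CiTO, the Citation Typing Ontology"),
    (CITO_PAPER, "dcterms:publisher", BMC),
    (BMC, "foaf:name", "BioMed Central"),
}

third_party_graph = {
    (OTHER_PAPER, "dcterms:publisher", BMC),
    (BMC, "foaf:name", "BioMed Central"),
}

# Merging is a set union: every original statement is kept unchanged,
# and identical statements collapse into one.
merged = cito_graph | third_party_graph


def published_by(graph, publisher):
    """Return the sorted subjects that name `publisher` as their publisher."""
    return sorted(s for s, p, o in graph if p == "dcterms:publisher" and o == publisher)


print(published_by(merged, BMC))
```

Because both graphs use the same identifier for the publisher, a single query over the merged set finds both articles, even though neither source graph mentions the other.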
Of course, RDF is only one of several ways of storing bibliographic data. In the Open Citations Corpus, for example, we store the data internally in BibJSON format, a compact JSON format adapted for bibliographic information, and then convert it to RDF using an XSLT transformation for external exposure, as detailed in a previous post.
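To give a feel for what such a conversion involves, here is a small Python sketch that turns a BibJSON-style record into Turtle. It is not the XSLT pipeline used for the corpus, only an illustration of the mapping. The record shown, its field names (`title`, `year`, `author`, `identifier`) and the `to_turtle` helper are assumptions based on common BibJSON conventions, and prefix declarations are left out, as in the Turtle earlier in this post.

```python
import json

# A BibJSON-style record. The field names follow common BibJSON conventions,
# but this exact record is an assumption, not data taken from the corpus.
record = json.loads("""
{
  "title": "CiTO, the Citation Typing Ontology",
  "year": "2010",
  "author": [{"name": "David Shotton"}],
  "identifier": [{"type": "doi", "id": "10.1186/2041-1480-1-S1-S6"}]
}
""")


def to_turtle(rec):
    """Serialize one record as a Turtle description (prefixes omitted)."""
    doi = next(i["id"] for i in rec["identifier"] if i["type"] == "doi")
    year = rec["year"]
    # json.dumps is used as an approximate way to quote and escape literals.
    title = json.dumps(rec["title"])
    lines = [
        f"<http://dx.doi.org/{doi}>",
        "    rdf:type fabio:JournalArticle ;",
        f"    dc:title {title} ;",
        f'    fabio:hasPublicationYear "{year}"^^xsd:gYear ;',
        f"    prism:doi {json.dumps(doi)} ;",
    ]
    for author in rec["author"]:
        name = json.dumps(author["name"])
        lines.append(f"    dcterms:creator [ rdf:type foaf:Person ; foaf:name {name} ] ;")
    # Close the description: replace the trailing ';' with '.'.
    lines[-1] = lines[-1][:-1] + "."
    return "\n".join(lines)


turtle = to_turtle(record)
print(turtle)
```

Each JSON field maps to one property from an existing vocabulary, and each author object becomes a nested description of a person.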
Shotton, David (2010). CiTO, the Citation Typing Ontology. J. Biomedical Semantics 1 (Suppl. 1): S6. http://dx.doi.org/10.1186/2041-1480-1-S1-S6.