SCoRO, the Scholarly Contributions and Roles Ontology, is a new CERIF-compliant ontology for use by authors, publishers and research administrators. It describes the contributions and roles of scholars, and of the organizations of which they are members, with respect to projects, research investigations and other academic activities, and to the scholarly journal articles and other outputs that result from them.
The need for such a disciplined way of describing scholarly contributions and roles was articulated at the International Workshop on Contributorship and Scholarly Attribution held at Harvard University on 16 May 2012. At that workshop, an early version of the SCoRO ontology was presented. The current version has been refined and improved as a result of the feedback received and the recommendations contained within the workshop’s Final Report.
The present informal system of ascribing scholarly roles and contributions to the authors of academic papers is ill-suited to the needs of the digital age, where machine-readable metadata to describe and accompany on-line articles is desirable.
In particular, the order in which authors’ names appear in a paper’s author list is a highly imprecise and ambiguous method of signaling credit, since its meaning varies significantly between academic disciplines. At least three different conventions are in common usage:
- Alphabetical, widely used in disciplines such as mathematics to indicate a commitment not to distinguish between the relative importance of the different authors;
- Sequence determines credit, where authors are listed in order of declining importance; and
- First author – last author emphasis, the norm in the biomedical sciences, usually interpreted as meaning that the first author did most of the work while the last author is the senior author who has contributed the intellectual driving force, while intermediate authors made less important contributions.
Some journals now insist on author contribution statements at the end of articles, but these generally relate to a different topic, namely investigation contributions, rather than attribution of authorship credit.
For example, the acknowledgements in one Nature paper published in November 2007 are exemplary in their simplicity and clarity:
- A.C. and J.H.H. conducted the observations.
- A.C. processed the data.
- P.W.L. performed the modelling.
- A.C. wrote the main paper.
- P.W.L. wrote the Supplementary Information.
- All authors discussed the results and implications, and commented on the manuscript at all stages.
However, such contribution statements can be more specific, complex and idiosyncratic, as exemplified in another paper from the same issue of Nature:
- J.L., J.R.S. and J.W.L. conceived the Brainbow strategies.
- J.R.S. and J.W.L. supervised the project.
- J.L. built initial constructs and validated them in vitro and in vivo.
- T.A.W. performed all cerebellar axonal tracing and colour profile analysis with programs developed with J.L.
- H.K. performed all live imaging experiments.
- R.W.D. generated Brainbow-1.0 lines expressing cytoplasmic XFPs.
- R.A.B. generated Brainbow-1.1 constructs and lines.
- J.L., T.A.W. and R.W.D. screened mouse lines.
Note in this second case how detailed domain knowledge is required to understand the meaning of some of the contributions, e.g. “conceived the Brainbow strategies”, “built initial constructs”, “validate constructs” or “screened mouse lines”.
Another instance of detailed author attribution occurs in the March 2011 issue of Nature Genetics, and took the following form:
- B.G., J.B., R.C.H. and G.P.P. conceived and designed the study.
- B.G., J.B. and D.M. implemented the process, built and populated the databases.
- K.R.P., F.C.C. and H.F. performed experiments.
- B.K.S., D.J.A., A.N.B., B.C., P.F., A.E.F., A.F., R.G., M.V.E.G., M.G., R.J.G., P.C.G., C.L.H., J.D.H., M.J., P.J., E.K., P.K., S.M., K.M., J.O., A.P., M.N.P., P.P., S.Pa., L.P., M.R., S.S., I.S., M.S., S.L.T., J.T.-S., R.T., T.W., J.S.W., C.W., B.Z. and G.P.P. contributed data.
- B.G., J.B., D.R.H. and S.Ph. analyzed results.
- R.C.H. and G.P.P. supervised data analysis.
- D.M., W.M., C.R., D.H.K.C. and H.W. provided expertise and infrastructure.
- B.G., J.B., D.R.H., K.R.P., S.Ph., R.C.H. and G.P.P. wrote the paper.
The authors of this paper concluded:
“Our project provides the first example of implementing microattribution to incentivise submission of all known genetic variation in a defined system. It has demonstrably increased the reporting of human variants, leading to a comprehensive online resource for systematically describing human genetic variation in the globin genes and other genes contributing to hemoglobinopathies and thalassemias.”
However, one drawback of this system of author contribution statements as practiced is that the contributions of non-authors such as research technicians are poorly acknowledged, despite the fact that they may have contributed significantly to the research investigation whose results are reported in the article.
The Harvard – Wellcome Trust workshop concluded that we need an easier and more straightforward way to define the contributions of both authors and non-authors, in a manner that distinguishes:
- Intellectual contributions, including the conception and design of experiments;
- Experimental contributions, including provision of experimental material, undertaking experiments and analyzing data;
- Organizational contributions, including fund-raising and project management; and
- Authorship contributions, such as drafting the paper and preparing the illustrations.
The workshop participants were not keen on the idea of quantifying numerically the relative effort that different people put into such contributions, e.g. “Dr X contributed 80% and Dr Y contributed 20% of the effort to the contribution of revising the manuscript”. This decision was reached both because contribution efforts could not be assigned with mathematical precision, and because such numbers would be of little practical use.
Rather, it was thought sufficient to say that someone was solely responsible for a particular contribution, played a major role in it, played a minor role in it, or had no part in it.
SCoRO meets this requirement by creating a limited controlled vocabulary of generic terms to describe contributions within these four general categories. It also provides the option of specifying the effort for each contribution, relative to that of others:
- Solely responsible for this contribution.
- Contributed major effort.
- Contributed minor effort.
The default position is that a person played no part in a particular contribution, which is assumed unless one of the other effort categories is specified.
In addition to specifying such contributions, SCoRO distinguishes and permits the separate definition of individuals’ roles, under the following categories:
- Investigation roles, such as Principal Investigator, Research Assistant or Technician;
- Project roles, including Project Leader and Project Manager;
- Data roles, such as Data Creator, Data Manager or Curator; and
- Authorship roles, including Corresponding Author, Senior Author, Article Guarantor, and Consortium Author (to cover entries in author lists such as “. . . and members of the MalariaGen Consortium”).
This distinction between roles and contributions is useful, since, for example, people other than the PI can be active in contributing to the leadership of an investigation, and people other than the formal Data Manager can be involved in managing data.
The basic conceptual structure of SCoRO is as shown in the following diagram:
Legend: The basic concepts within the SCoRO model.
Additionally, SCoRO provides terms for organizational, financial and teaching roles relevant to academic activities which are unrelated to scholarly research and journal article authorship, but which are useful in other circumstances. Publishing roles, for example those of editor and reviewer, have previously been separately covered in PRO, the Publishing Roles Ontology, which is imported into SCoRO.
This import of PRO into SCoRO permits us to employ a standard ontology design pattern called the Time-indexed Value in Context Pattern (TVC; Peroni, Shotton and Vitali 2012), which is already part of PRO. This pattern permits one to specify both the context and the time-period for a particular contribution or role, if required, by introducing one level of indirection between the object and the location, via the class tvc:ValueInTime, as explained in the PRO section of a recent blog post on use of the SPAR ontologies: Libraries and linked data #5: Using the SPAR ontologies to publish bibliographic records.
This ontology design pattern is exemplified in the following RDF statement:
foaf:Agent pro:holdsRoleInTime [ a pro:RoleInTime ; pro:withRole pro:editor ] .
Here, the domain of pro:withRole is not foaf:Agent, but rather an anonymous member of the class pro:RoleInTime, which itself is the range of the property pro:holdsRoleInTime, for which the domain is foaf:Agent. The range of pro:withRole is the class pro:Role, and its sub-classes pro:PublishingRole, scoro:InvestigationalRole, scoro:OrganizationalRole, etc., whose members permit specific roles to be specified. New roles can easily be added, simply by inserting new individual members into the appropriate subclass of pro:Role.
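As a minimal sketch of adding a new role in this way, the following Turtle declares one new individual and then uses it like any existing role. The ex: namespace, the role ex:field-coordinator and the person ex:jane are hypothetical, invented purely for illustration; the pro: and foaf: prefix IRIs are the ones we assume to be in common use and should be checked against each ontology's documentation.

```turtle
# Prefix IRIs for pro: and foaf: are assumptions; scoro: is the URL given in this post.
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf:  <http://xmlns.com/foaf/0.1/> .
@prefix pro:   <http://purl.org/spar/pro/> .
@prefix scoro: <http://purl.org/spar/scoro/> .
@prefix ex:    <http://example.org/> .    # hypothetical namespace

# A new role is a new individual in the appropriate subclass of pro:Role.
# ex:field-coordinator is a hypothetical role, not a SCoRO term.
ex:field-coordinator a scoro:InvestigationalRole ;
    rdfs:label "field coordinator" .

# The new role is then attached through the same RoleInTime indirection.
ex:jane a foaf:Person ;
    pro:holdsRoleInTime [
        a pro:RoleInTime ;
        pro:withRole ex:field-coordinator
    ] .
```

No change to the ontology's classes or properties is needed here; only a new individual is asserted.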
This single step of indirection permits contextual and temporal attributes to be specified, putting that role into context. Exactly the same use of TVC and of individuals within the class scoro:Contribution and its sub-classes may be used for scoro:withContribution, enabling contributions also to be time-limited and given contexts.
The following example uses this design pattern both for two roles and for a contribution. It specifies the context of both roles, the time interval during which the second role of Principal Investigator was valid, and the context of the contribution:
    :ourInvestigation a frapo:Investigation ;
        dcterms:title "Experiments in Semantic Publishing" ;
        frapo:hasOutput :Adventures .
    :Adventures a fabio:JournalArticle ;
        dcterms:bibliographicCitation "Shotton D, Portwin K, Klyne G, Miles A (2009). Adventures in semantic publishing: exemplar semantic enhancement of a research article. PLoS Computational Biology 5: e1000361." ;
        dcterms:creator :Shotton , :Portwin , :Klyne , :Miles ;
        prism:doi "10.1371/journal.pcbi.1000361" .
    :Shotton a foaf:Person ;
        foaf:name "David Shotton" ;
        scoro:hasORCID "0000-0001-5506-523X" ;
        pro:holdsRoleInTime [
            a pro:RoleInTime ;
            pro:withRole scoro:author , scoro:senior-author ;
            scoro:relatesToPublication :Adventures ] ;
        pro:holdsRoleInTime [
            a pro:RoleInTime ;
            pro:withRole scoro:principal-investigator ;
            tvc:atTime [
                a ti:timeInterval ;
                ti:hasIntervalStartDate "2008-05-01"^^xsd:date ;
                ti:hasIntervalEndDate "2009-04-17"^^xsd:date ] ;
            scoro:relatesToEntity :ourInvestigation ] ;
        scoro:makesContribution [
            a scoro:ContributionSituation ;
            scoro:withContribution scoro:conceives-project ;
            scoro:withEffort scoro:major-effort ;
            scoro:hasContributionContext :ourInvestigation ] .
[This RDF is expressed in Turtle. For an introduction to Turtle, see here.]
In the above example, SCoRO is used in combination with FaBiO, the FRBR-aligned Bibliographic Ontology, that provides structured vocabulary terms to characterize scholarly publications, and with FRAPO, the Funding, Research Administration and Projects Ontology, to be described in a subsequent post, that provides structured vocabulary terms to describe research administration, research funding, and the projects and investigations that such funding supports.
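The example above gives the contribution a context but no time period. A contribution might be time-limited along the following lines. This is only a sketch: it assumes that tvc:atTime can be attached to a scoro:ContributionSituation in the same way as to a pro:RoleInTime, the ex: resources and the dates are invented for illustration, and the prefix IRIs other than scoro: are the ones we assume to be in common use.

```turtle
# Prefix IRIs other than scoro: are assumptions; check each ontology's documentation.
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix foaf:  <http://xmlns.com/foaf/0.1/> .
@prefix scoro: <http://purl.org/spar/scoro/> .
@prefix frapo: <http://purl.org/cerif/frapo/> .
@prefix tvc:   <http://www.essepuntato.it/2012/04/tvc/> .
@prefix ti:    <http://www.ontologydesignpatterns.org/cp/owl/timeinterval.owl#> .
@prefix ex:    <http://example.org/> .    # hypothetical namespace

ex:someInvestigation a frapo:Investigation .

# ex:researcher and the dates below are illustrative only.
ex:researcher a foaf:Person ;
    scoro:makesContribution [
        a scoro:ContributionSituation ;
        scoro:withContribution scoro:conceives-project ;
        scoro:withEffort scoro:major-effort ;
        # Assumption: tvc:atTime applies here as it does to pro:RoleInTime.
        tvc:atTime [
            a ti:timeInterval ;
            ti:hasIntervalStartDate "2008-05-01"^^xsd:date ;
            ti:hasIntervalEndDate "2008-06-30"^^xsd:date
        ] ;
        scoro:hasContributionContext ex:someInvestigation
    ] .
```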
SCoRO also makes it easy to assert in a machine-readable manner that two authors on a paper, for example the first and second authors in an author list, have equal principal authorship roles, as shown in the following Turtle statements:
    :author-1 pro:holdsRoleInTime :role1 .
    :role1 a pro:RoleInTime ;
        pro:withRole scoro:principal-author ;
        tvc:withinContext :ourJointPaper .
    :author-2 pro:holdsRoleInTime :role2 .
    :role2 a pro:RoleInTime ;
        pro:withRole scoro:principal-author ;
        tvc:withinContext :ourJointPaper ;
        scoro:isEqualToRoleInTime :role1 .
    :ourJointPaper a fabio:JournalArticle ;
        dcterms:creator :author-1 , :author-2 .
In a similar manner, two contributions can be declared to be the same, using the property scoro:isEqualToContributionSituation to equate one scoro:ContributionSituation with another.
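A sketch of what such a declaration could look like follows, modelled on the equal-authorship statements above. The ex: resources are hypothetical, and the choice of scoro:conceives-project as the shared contribution is purely illustrative.

```turtle
@prefix scoro: <http://purl.org/spar/scoro/> .
@prefix ex:    <http://example.org/> .    # hypothetical namespace

# Two hypothetical people each make a named contribution in the same context.
ex:author-1 scoro:makesContribution ex:contribution1 .
ex:contribution1 a scoro:ContributionSituation ;
    scoro:withContribution scoro:conceives-project ;
    scoro:hasContributionContext ex:ourInvestigation .

ex:author-2 scoro:makesContribution ex:contribution2 .
ex:contribution2 a scoro:ContributionSituation ;
    scoro:withContribution scoro:conceives-project ;
    scoro:hasContributionContext ex:ourInvestigation ;
    # Equate the second contribution situation with the first.
    scoro:isEqualToContributionSituation ex:contribution1 .
```

Naming the contribution situations, rather than using anonymous nodes, is what makes it possible for one to refer to the other.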
A fuller and more accurate graphical description of SCoRO, that shows the use of pro:RoleInTime and scoro:ContributionSituation to permit the inclusion of contextual information, is given in the following Graffoo diagram.
Legend: Graffoo representation of the SCoRO ontology.
SCoRO is written in OWL 2 DL, and imports FOAF, the Friend of a Friend Ontology, and PRO, the Publishing Roles Ontology. As with all SPAR Ontologies, we use content negotiation to handle the SCoRO URL http://purl.org/spar/scoro/, employing LODE, the Live OWL Documentation Environment, to deliver human-readable documentation of SCoRO if this URL is requested by a Web browser, or delivering the scoro.owl file if it is requested by an ontology editor such as Protégé.
As with any ontology, the existence of SCoRO is of little use on its own. People require an easy-to-use tool for the creation of SCoRO-specific metadata detailing the contributions and roles of people in relation to a particular research investigation and the publications arising from it. For this, we have created SCoRF, the Scholarly Contributions Report Form, which makes entry of contributions and attribution metadata an easy task, as described in the next blog post.
Peroni S, Shotton D and Vitali F (2012). Describing roles and statuses and their temporal extents: a general pattern with applications in scholarly publishing. In Proceedings of the 8th International Conference on Semantic Systems (i-Semantics 2012): pages 9-16. http://dx.doi.org/10.1145/2362499.2362502.