In a recent blog post, Heather Piwowar, in discussing the advantages of citing datasets in the reference list of the article, said “No journals have standardized on this approach so far”. However, Pensoft Journals, a publisher that specializes in publishing biodiversity and biological systematics papers, and that has taken the lead in promoting the publication of datasets with DOIs, has exactly such a policy.
Recently, in response to my Data Citation Best Practice Discussion Document  discussed in the preceding blog post, I was invited to work with Pensoft Journals to contribute to and help revise their now-published Data Publishing Policies and Guidelines for Biodiversity Data . This 34-page paper has a three-page section on how to cite data in Pensoft Journals.
While recognising that citations of Genbank and similar bioinformatics datasets are by custom made by placing the database accession number somewhere in the text, with no entry in the reference list of the article, we make the following generic recommendation:
“Data citations may relate either to the author’s own data, or to data created and published by others (“third-party data”). In the former case, the dataset may have been previously published, or may be published for the first time in association with the article that is now citing it. All these types of data should, for consistency, be cited in the same manner.
“As is the norm when citing another research article, any citation of a data publication, including a citation of one’s own data, should always have two components:
- An in-text citation statement containing an in-text reference pointer that directs the reader to a formal data reference in the paper’s reference list.
- A formal data reference within the article’s reference list.
“The data reference in the article’s reference list should contain the minimal components recommended in the DataCite Metadata Kernel v2.0 specification. In DataCite terms: Creator PublicationYear Title Publisher Identifier; alternatively (but meaning the same thing): Author PublicationYear Title DataRepositoryName DOI. These components should be presented in whatever format and punctuation style the journal specifies for its references. The following example demonstrates in general terms what is required.
This paper uses data from the [name] data repository at http://dx.doi.org/***** (Jones et al. 2008a), first described in Jones et al. 2008b.
“Data reference in reference list:
Jones A, Bloggs B, Smith C (2008a). Title of data package. Repository name. doi:*****.
“Article reference in reference list:
Jones A, Saul D, Smith C (2008b). Title of journal article. Journal Volume: Pages. doi:###. ”
Pensoft also recommends that the in-text data citation statement in Pensoft journals should be included in the body of the paper, in a separate section named Data Resources situated after the Material and Methods section. More details are given in the paper .
Furthermore, Pensoft has reached an agreement for cooperation in data hosting and developing of data publishing workflows with GBIF, the Global Biodiversity Information Facility, with the Dryad Data Repository and with the Consortium for Barcode of Life.
Clearly, these Pensoft data citation recommendations, which work fine for on-line journals without a numerical limit on the number of citations, would not be feasible in journal articles with a strict limit to the number of citations, which is why Heather’s emphasis of exploring alternative ways for data citation in such cases is important.
 David Shotton (2011) Data Citation Best Practice Discussion Document. Google Docs. https://docs.google.com/document/d/1kF8-faB72l4dKTLEyx6Z5cIabk68GrJ9GraCtWnK0qQ/edit?hl=en_GB&authkey=CPPW46wL#.
 Penev L, Mietchen D, Chavan V, Hagedorn G, Remsen D, Smith V, Shotton D (2011). Pensoft Data Publishing Policies and Guidelines for Biodiversity Data. Pensoft Publishers, http://www.pensoft.net/J_FILES/Pensoft_Data_Publishing_Policies_and_Guidelines.pdf.