Recent posts

PREPARDE final meeting slides

The PREPARDE project officially finished with a final project meeting on Friday 23rd August, 2013. However, work on the topic of data publication is still in full swing, mainly through the Publishing Data Interest Group of the Research Data Alliance.

In the meantime, our project deliverables are being put up on the Deliverables page and the slides from the final meeting have been attached to this post.

I'd like to thank all the project team and all the other interested people who have given so generously of their time and energy to interact with us on this project and in our workshops. Here's hoping we can work together again on this topic soon!

Cross-linking and workflows workshop report

The third and final workshop for the PREPARDE project was held at Rutherford Appleton Laboratory on the 30th April 2013. It was well attended with over 30 participants and a lot of excellent presentations and discussion on the workshop themes of how to manage workflows for data publication, and how to effectively link between datasets and the papers that cite them.

All the presentations given are now available on the workshop webpage at

One theme that I noticed coming around several times was that of needing a central registry or broker to manage the flow of information between the data repositories and the data journals. At the moment, the beginnings of this exist in the form of the DataCite metadata store, where repositories deposit the DOI-specific metadata, which can then be retrieved by the journal publisher. What is needed is a mechanism for the reverse to happen - i.e. a place where the journal can push the information that a dataset has been cited, and from which a repository can query/pull that citation information.
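To make the push/pull idea concrete, here is a minimal sketch in Python of what such a broker might look like. Everything in it is hypothetical: the class name `CitationBroker`, the methods `push_citation` and `pull_citations`, and the example DOIs are invented for illustration, and it is not a description of the DataCite metadata store or of any existing service. It only assumes that both datasets and articles are identified by DOIs.

```python
from collections import defaultdict


class CitationBroker:
    """Hypothetical in-memory registry linking dataset DOIs to citing articles."""

    def __init__(self):
        # Maps a dataset DOI to the set of article DOIs that cite it.
        self._citations = defaultdict(set)

    def push_citation(self, dataset_doi, article_doi):
        """Journal side: record that an article cites a dataset."""
        # DOIs are case-insensitive, so normalise before storing.
        self._citations[dataset_doi.lower()].add(article_doi.lower())

    def pull_citations(self, dataset_doi):
        """Repository side: list the articles known to cite a dataset."""
        # .get() avoids creating an empty entry for unknown datasets.
        return sorted(self._citations.get(dataset_doi.lower(), set()))


# Example with made-up DOIs: the journal pushes, the repository pulls.
broker = CitationBroker()
broker.push_citation("10.1234/example-dataset", "10.5678/example-article")
print(broker.pull_citations("10.1234/example-dataset"))
```

A real broker would of course need persistence, authentication and agreement on who is allowed to assert a citation, but the two operations above are the core of the information flow discussed at the workshop.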

Data citation as a mechanism for cross-linking also came up several times, with the general feeling of "if it ain't broke, don't fix it!" The consensus (at least in my breakout group) was that citation is a well-established, well-used and well-understood mechanism for linking between articles, so we can piggy-back on that to expand its use to linking datasets to articles as well.

We all took copious notes from our breakout and discussion session, which are in the process of being written up now. Once they're tidied up, they too will be added to the workshop webpage.

I'd like to thank again our wonderful presenters (some of whom even came from abroad!) and our participants for their active engagement and their insights at the workshop. We really appreciate it!

Report from EGU 2013

As mentioned in our previous post, members of the PREPARDE team attended the European Geosciences Union General Assembly in Vienna, and presented the project at a splinter meeting and in an oral presentation on Friday 12th April.

The splinter meeting went well (aside from me accidentally dumping a glass of water onto my poor colleague!) We had a number of attendees who gave us lots of food for thought and feedback on our proposed guidelines for peer review of data and repository accreditation, so many thanks to them!

Both documents are still open for comment, and all comments are welcome!

The data publication oral session itself was very interesting, with Hans Pfeiffenberger highlighting in particular some of the issues that ESSD have experienced in their many years of operating a data journal. We also had excellent talks from Kerstin Lehnert about disciplinary repositories and their role in data publication, and from David Marques about Elsevier's work in cross-linking publications with research data.

The poster session, though thin on the ground, did provide a lot to think about too.

The PREPARDE presentations that we made from both the oral session and the splinter meeting can be found linked to this post.

PREPARDE and EGU - where we'll be

Members of the PREPARDE project team will be attending the European Geosciences Union conference in Vienna, 7-12 April.

In particular, we'll be hosting a splinter meeting SPM1.38 in room R3 on Fri 12th April, 12:15–13:15, where we'll be canvassing people for their opinions on our proposed guidelines for data publication.

We also have an oral presentation booked in session GI1.2: Data Assimilation and Data Publishing in Geosciences on Friday, 12 Apr 2013 in room G1 at 13:30–13:45.

 EGU2013-697 Data publication - policies and procedures from the PREPARDE project Sarah Callaghan, Fiona Murphy, Jonathan Tedds, John Kunze, Rebecca Lawrence, Matthew S. Mayernik, Angus Whyte, and Timothy Roberts

Hope to see people there!

Workshop on Cross-Linking and Workflows

The PREPARDE project is pleased to announce that registration for its workshop on cross-linking and workflows for data publication is now open at

The workshop will be held at Rutherford Appleton Laboratory, Didcot, UK on Tuesday the 30th April.

More information about the workshop is below:

Linking datasets and articles for publication – cross-linking and workflows

Data is being increasingly recognised as the foundation of the scientific record. Data journals and other data publishing models (such as enhanced publications) are opening up data and other research products for further use, re-use and validation. A key part of data publication, and linking data to other scientific outputs, is a formalised and standardised method of permanently cross-linking between the dataset (as held in a repository) and the related article (as hosted in a journal).

The PREPARDE project is funded by JISC to address key issues arising from data publication, including cross-linking between repository and journal. It aims to produce guidelines applicable to a wide range of scientific disciplines and data publication models. The project initially focuses on earth science disciplines, and the Geoscience Data Journal, a partnership between the Royal Meteorological Society and Wiley-Blackwell.

Intended audience: the workshop will interest anyone involved in publishing data, either directly, via a data journal or as part of an enhanced publication

Critical issues the workshop aims to cover include:

  • How can publishers and repositories collaborate to exploit and share metadata about related scientific outputs (i.e. datasets and articles)?
  • Where are the best points in the various journal and repository workflows to establish persistent links and share metadata for data publication?

Participating in the workshop will help you to:

  1. Understand benefits and risks to stakeholders in scholarly publishing likely to arise from different data publishing models.
  2. Shape PREPARDE project guidelines on:- a) workflows for data publication b) data and metadata standards for enabling cross-linking between data repositories and academic publishers.

Please do pass these details on to other interested parties, and please do contact me (sarah.callaghan@…) if you have any questions.


The JISC Managing Research Data Programme funded PREPARDE project invites anyone with an interest to sign up to the new Research DATA-PUBLICATION announce and discuss mailing list.

This list will be used to make announcements and promote discussion to Higher Education and the international research community about research data publication. Topics include technical and scientific peer review, trusted repository accreditation and the challenges for researchers, institutions, funders and publishers.

Creation of this list is a recommended outcome from the recent PREPARDE project workshop which included representatives from many key stakeholders from research, funders, institutions and publishers, and was held in January 2013 at the International Digital Curation Conference in Amsterdam. There will be a write up of this event soon.

And it’s a good idea we hope…!

Thanks, Jonathan Tedds

  • Posted: 2013-02-23 22:14
  • Author: jtedds
  • Categories: (none)
  • Comments (1)

Link Roundup - Data repository accreditation

(Pulling together some useful links on the subject of data repository accreditation so that they're all in the one place)

  • Information about the European Framework for Audit and Certification of Digital Repositories. "The framework will consist of a sequence of three levels, in increasing trustworthiness:
    • Basic Certification is granted to repositories which obtain DSA (Data Seal of Approval) certification;
    • Extended Certification is granted to Basic Certification repositories which in addition perform a structured, externally reviewed and publicly available self-audit based on ISO 16363 or DIN 31644;
    • Formal Certification is granted to repositories which in addition to Basic Certification obtain full external audit and certification based on ISO 16363 or equivalent DIN 31644."
  • Trustworthy Repositories - overview of Trusted Repositories Audit & Certification (TRAC). "In general terms, TRAC:
    • Provides tools for the audit, assessment, and potential certification of digital repositories
    • Establishes documentation requirements required for audit
    • Delineates a process for certification
    • Establishes appropriate methodologies for determining the soundness and sustainability of digital repositories"
  • Data Seal of Approval - "The Data Seal of Approval ensures that in the future, research data can still be processed in a high-quality and reliable manner, without this entailing new thresholds, regulations or high costs." - first level in the Trusted Digital Repositories Framework

OpenAIRE Interoperability workshop, University of Minho, 7-8 February

The OpenAIRE Interoperability Workshop was held at the University of Minho, Portugal, on 7-8 February 2013. I was there wearing a number of different hats: firstly as a member of the OpenAIREplus team, secondly presenting about PREPARDE and our work on workflows and cross-linking, and thirdly as a data scientist and repository manager.

My presentation was videoed and can be seen on the OpenAIRE Vimeo channel. I've attached my slides to this blog post too.

I made it to the workshop on the first day (after a few hiccups with broken planes) just in time for the session on Open Science, Open Data and Repositories, which was part of the University of Minho's Open Access Seminar. This was a really good session for introducing people to key concepts about open data and open access, and even gave us a bit of a history lesson too about how this whole scientific publishing thing got started. (I never knew quite how pivotal Henry Oldenburg was - not only was he the founding editor of the Philosophical Transactions of the Royal Society, but he also pretty much invented the process of pre-publication peer-review.)

The session on OpenAIREplus was also really useful for getting a feel about what OpenAIRE are trying to do to extend their services to include research data, especially in the case where it's linked directly to publications*. Discussions about this were continued on the following day in a splinter group talking about the proposed OpenAIRE guidelines for Data Archive Managers.

I'm very glad to see that these guidelines are based around the DataCite metadata schema, tying in nicely with the work we're doing at BADC and the general direction data citation seems to be going at the moment. The plan is to dig out some case studies of data sets linked to published papers, and I have a couple in mind already, one of which relates to my (very soon to be published) GDJ paper.

OpenAIRE and PREPARDE have a lot of shared interests, so it was very nice to be able to connect with them at this event, and even start talking about what should come next by way of data publication and article-data linking in the future. The calls for Horizon 2020 will be out soon...

*I'm a bit wary of this, as it's almost putting data as second class citizens in comparison to articles, but I do appreciate they have to limit the scope of the project somehow in order to get stuff done.

Supporting Infrastructure Models for Research Data Management (SIM4RDM) Project workshop

Not strictly a PREPARDE event (although our funder, JISC, is one of the partners) but extremely relevant to the bigger picture, I attended the Supporting Infrastructure Models for Research Data Management (SIM4RDM) Project workshop on the 31st Jan in Dublin. SIM4RDM is putting together a framework of evaluation and policy recommendations to inform the decision-making processes of Horizon 2020, the key EU research and innovation funding programme.

Crucially, SIM4RDM is aiming for national and global coverage as well as EU-level initiatives, and research data funding, management, curation, re-use, accreditation (the usual suspects) are all key parts of the picture. There was an interesting EC presentation by Carlos Morais Pires on how it relates to the overall plans for a research data management framework, updates on the project’s progress and methodology to date from key members, as well as a sneak preview on the Research Data Alliance by RDA Council Member (and leader of the group that produced Riding the Wave) John Wood. All in all, a tiring but worthwhile day.

A couple of take-homes:

Collaborations. The SIM4RDM Landscape Report, available on the website, indicates that ‘mixed’ groups such as we have on the JISC funded PREPARDE project - with publishers, researchers, data centre managers, etc all working towards a common goal - are very much the exception rather than the norm. That’s something that would ideally change.

(Related point) Engagement. OK, so we can fill a room with engaged enthusiasts, but there is still a sense of it being largely the same crowd in different rooms. We need to find some potential new converts to preach to.

The workflows and missions espoused by PREPARDE need to become the norm, and I’d very much like to think we can put something in place to support that.

  • Posted: 2013-02-04 09:09 (Updated: 2013-02-04 09:10)
  • Author: Fiona Murphy
  • Categories: (none)
  • Comments (0)

PREPARDE at the International Digital Curation Conference 2013 (IDCC13)

IDCC13 was an important conference for the PREPARDE team, not just because we had an oral presentation in the Data Publication session, but also because it was the venue for the first of our three project workshops.

As always, the conference was very well attended, and digital technology and social media were used to enable those who weren't there physically to at least follow along. Videos of the keynote speeches, blog posts and storifys of tweets can all be found on the conference homepage.

The PREPARDE presentation was titled "Processes and Procedures for Data Publication: a case study in the Geosciences" and raised a lot of interest (slides can be downloaded here) - in fact at that stage of the data publication session I noticed that it was standing room only at the back! Angus Whyte (a PREPARDE team member) has written an excellent overview and summary of the whole session that can be found here.

Thursday 17th January was the PREPARDE workshop, titled: "Data publishing, peer review and repository accreditation: everyone a winner?" (which I've been reliably informed was the first IDCC13 workshop to be sold out). The main focus for this workshop was on the tricky issue of repository accreditation, which was addressed by presenters from a wide variety of viewpoints, including, but not limited to, publishers, professional societies and data repository managers. The slides are available here (scroll down the page to find the workshop title and agenda) and I created a storify of the tweets sent during the workshop here.

Angus and his DCC colleague Alex Ball took loads of notes during the workshop, and a full workshop report will be coming out soon. I'd like to take the opportunity again to thank all of our excellent speakers, and the workshop attendees for their lively input into our discussions during the day!

Report from the American Meteorological Society Annual Meeting

The American Meteorological Society held its Annual Meeting in Austin, Texas earlier this month, which gave Matt Mayernik and me the opportunity to hold forth on various aspects of data citation, peer review, etc.

Matt’s poster was based on current work with NCAR colleagues on ‘What Does Peer Review of Data Sets Mean and What Roles do Data Archiving and Quality Control Have in the Process?’ (attached). The poster compares/contrasts conventional peer review processes with data quality control processes in place at current NCAR data archives. It attracted interest from a number of delegates as well as the Editor of BAMS.

I presented on Geoscience Data Journal and PREPARDE, and Matt on the NCAR Data Citation Initiative, to the Atmospheric Science Librarians International (ASLI) strand of the meeting, which had several sessions around ‘Expanding Data’ and ‘Expanding Publications’. Several other speakers, including John Sandy of Alabama and Gloria Hicks of NSIDC, are working in compatible areas (see full program details here) and there were some lively discussions, particularly in the breaks.

Other than that, there were some opportunities to catch up further on EarthCube, the NSF-driven cyberinfrastructure project for the geosciences, which is of immense interest and potential importance for underpinning the future of data-driven research, particularly from North America. The other key development, from my point of view, was the Town Hall meeting organised by the AMS Board on Data Stewardship, which is working on a wide-ranging position paper. Not only does this encompass data citation and encouraging good data management by producers, it is also looking at the status the ‘data scientist’ has within research institutions, and the influence of funders, policy makers and industry in all aspects of research data origination, use and management. It is simply the most ambitious mission I have yet seen from a learned society. I very much welcome this development and hope we have the opportunity to support its continuing progress.

  • Posted: 2013-01-22 10:35 (Updated: 2013-01-22 16:54)
  • Author: Fiona Murphy
  • Categories: (none)
  • Comments (0)

PREPARDE at the AGU Fall meeting

The PREPARDE project hosted a town hall meeting at the AGU Fall meeting last week, titled: "TH32F. Publishing Research Data: Peer Review, Data Center Accreditation, and Linking", which was sponsored by ESSI.

The description of the meeting was:

Data publication is an increasingly important mechanism for incentivising the sharing of research data. The PREPARDE (Peer Review for Publication & Accreditation of Research Data in the Earth sciences) is working with academic institutions, learned societies, data centers and commercial publishers to investigate barriers and drivers to data publishing and sharing, peer review, and re-use of geoscientific datasets. This town hall meeting will provide an overview of the project's progress to date and invite input from anyone interested in developing long-term sustainable policies, processes, incentives and business models for managing and publishing research data. Key project partners, the publishers and Board Members of the Geoscience Data Journal will also be available for informal discussions.

We had an excellent attendance, with an audience of over 60 people, varying from librarians and data managers to scientific researchers and data producers. We had some very interesting discussions after the project team had done their brief presentation (see slides attached to this post) - a lot of which spilled out into the corridor after the allotted hour for the Town Hall meeting had finished!

I got the distinct impression that data publication is of interest to a lot of people. There was some balking at the notion of using data papers as a method for giving credit to the data producers and managers for their hard work (and to a certain extent I agree, but until we can convince the powers that be that datasets are valuable entities in their own right, we're a bit stuck and have to piggyback on existing article publication methods).

Some areas (like the National Snow and Ice Data Center) already have their own processes in place to review and publish information about the data held in their archives (an example is here), which is excellent - and we in no way want to replace these! Instead we want to offer publication options for those subject areas which aren't as well served, or for those data stored in institutional repositories, where data review might not be possible for the repository managers due to the lack of domain knowledge.

We're still at the early stages of writing our guidelines, but it is gratifying to know that what we're doing is of interest to so many people, and to have so many willing volunteers already to act as reviewers!

(p.s. I've also attached the slides I used for my presentation in the session IN22A. Data Stewardship, Citation With Confidence, and Preparing Next Generation of Data Managers, as they too are related to the project)

Report from the CODATA General Assembly, Taipei, 27-31 October 2012

This is my first ever blogpost - so it's a momentous moment, for me at least.

I recently had the opportunity to attend the CODATA meeting in Taipei, where I presented on PREPARDE, the Geoscience Data Journal and more generally on some of the projects and initiatives which the scholarly publishing industry has been taking part in (see slides attached).

I was kindly invited to work with the CODATA-ICSTI Task Group on Data Citation Standards and Practices for two days as a pre-meeting activity. (PREPARDE Project manager, Sarah Callaghan, is one of the Co-Chairs.) The Task Group is currently compiling a report, to be published next year, which focuses on international good practices in data citation. Incidentally, they're on the look-out for reviewers to help facilitate this process, so do get in touch with one of the Co-Chairs if you want to volunteer.

I also chaired a very interesting session on 'Best Practices and Future Directions in Data Sharing'. We had a preview of the new Creative Commons data licence, an account of the 2011 Canadian Research Data Summit, and a fascinating introduction to a discipline wholly new to me - Computer Supported Cooperative Work - which amongst other things studies cyber infrastructure development as a way of understanding dynamic, emergent human collaborations.

As usual at these sorts of events, I emerged both far better informed than before, as well as humbled by the expertise and achievements I witnessed around me. Personal highlights were an impromptu meeting with Mark Hahnel of Figshare, and the WDS members' Forum, where a number of members, including Kerstin Lehnert of IEDA, gave flash presentations of their work.

The general feeling seems to be that progress - towards the better management, curation, re-integration, linking, etc of research data - is being made, but there is still a long way to go. So any feedback you have on what's going well, or where your pain points are would be tremendously welcome. It's all about communities, collaboration and communication, after all.....

  • Posted: 2012-11-13 14:00 (Updated: 2012-11-13 14:13)
  • Author: Fiona Murphy
  • Categories: (none)
  • Comments (0)

A list of Data Journals (in no particular order)

We don't want to reinvent the wheel when it comes to writing guidelines for data journals, so it makes sense to see what's out there in terms of pre-existing data journals, and what their guidelines are.

Below is a (non-exhaustive, in no particular order) list of the data journals we know of, either through personal experience or through internet searches.

(Thanks to Tom Pollard for the details about Ubiquity Press and Iryna Kuchma and Simon Hodson for pointing out other journals I missed.)

Name of Data Journal

Geoscience Data Journal

Aims and Scope

Geoscience Data Journal provides an Open Access platform where scientific data can be formally published, in a way that includes scientific peer-review. Thus the dataset creator attains full credit for their efforts, while also improving the scientific record, providing version control for the community and allowing major datasets to be fully described, cited and discovered.

An online-only journal, GDJ publishes short data papers cross-linked to – and citing – datasets that have been deposited in approved data centres and awarded DOIs. The journal will also accept articles on data services, and articles which support and inform data publishing best practices.

Repository Criteria

Other notes

Open access for papers; doesn’t mandate open access for datasets.

Name of Data Journal

Earth System Science Data

Aims and Scope

Earth System Science Data (ESSD) is an international, interdisciplinary journal for the publication of articles on original research data(sets), furthering the reuse of high (reference) quality data of benefit to Earth System Sciences. The editors encourage submissions on original data or data collections which are of sufficient quality and potential impact to contribute to these aims.

Repository Criteria

Other notes

Datasets need to be open access and have a persistent ID (e.g. DOI)

Name of Data Journal

Ecological Archives - Data Papers

Aims and Scope

Ecological Archives publishes materials that are supplemental to articles that appear in the ESA journals (Ecology, Ecological Applications, Ecological Monographs, Ecosphere, and Bulletin of the Ecological Society of America), as well as peer-reviewed data papers with abstracts published in the printed journals. Ecological Archives is published in digital, Internet-accessible form.

Repository Criteria

In addition, all authors are encouraged to register their data at ESA's official Data Registry at

Other notes

Page about what ESA considers a data paper and guidelines for reviewers at

Doesn't seem to give any information about what constitutes an appropriate data repository.

Name of Data Journal

Hindawi publishing:

  • Dataset Papers in Agriculture
  • Dataset Papers in Biology
  • Dataset Papers in Chemistry
  • Dataset Papers in Ecology
  • Dataset Papers in Geosciences
  • Dataset Papers in Materials Science
  • Dataset Papers in Medicine
  • Dataset Papers in Nanotechnology
  • Dataset Papers in Neuroscience
  • Dataset Papers in Pharmacology
  • Dataset Papers in Physics

Aims and Scope

The following seems to cover all the dataset journals:

Dataset Papers in Geosciences is a peer-reviewed, open access journal devoted to the publication of dataset papers in all areas of geosciences.

Dataset Papers in Geosciences is part of a series of journals devoted to the dissemination of dataset papers covering a wide range of academic disciplines. In addition to publishing dataset papers, the journal hosts the underlying data that is associated with these papers and makes it accessible to all researchers worldwide.

Repository Criteria

None – the journal hosts the underlying dataset

Name of Data Journal

Journal of Chemical and Engineering Data

Aims and Scope

The Journal of Chemical & Engineering Data is a monthly journal devoted to the publication of experimental data and the evaluation and prediction of property values. It is the only American Chemical Society journal primarily concerned with articles containing experimental data on the physical, thermodynamic, and transport properties of well-defined materials including complex mixtures of known compositions and systems of environmental and biochemical interest.

Repository Criteria

The Journal operates in cooperation with the Thermodynamics Research Center (TRC) of the National Institute of Standards and Technology (NIST) to process manuscripts.

Name of Data Journal

GigaScience

Aims and Scope

GigaScience aims to revolutionize data dissemination, organization, understanding, and use. An online open-access open-data journal, we publish 'big-data' studies from the entire spectrum of life and biomedical sciences. To achieve our goals, the journal has a novel publication format: one that links standard manuscript publication with an extensive database that hosts all associated data and provides data analysis tools and cloud-computing resources.

Repository Criteria

Data and materials release

Submission of a manuscript to GigaScience implies that readily reproducible materials described in the manuscript, including all relevant raw data, will be freely available to any scientist wishing to use them for non-commercial purposes. Nucleic acid sequences, protein sequences, and atomic coordinates should be deposited in an appropriate database in time for the accession number to be included in the published article. In computational studies where the sequence information is unacceptable for inclusion in databases because of lack of experimental validation, the sequences must be published as an additional file with the article.

Provides suggested repositories and citation methods for:

  • Nucleotide sequences
  • Protein sequences
  • Mass spectrometry
  • Structures
  • Chemical structures and assays
  • Functional genomics data (such as microarray, RNA-seq or ChIP-seq data)
  • Computational modeling
  • Plasmids

Other notes

Open access, open data journal.

No APCs at this time

Name of Data Journal

Journal of Physical and Chemical Reference Data

Aims and Scope

Journal of Physical and Chemical Reference Data is published by the American Institute of Physics (AIP) for the National Institute of Standards and Technology (NIST); content is published online daily, collected into quarterly online and printed issues (4 issues per year). The objective of the Journal is to provide critically evaluated physical and chemical property data, fully documented as to the original sources and the criteria used for evaluation, preferably with uncertainty analysis. Critical reviews of measurement techniques may also be included if they shed light on the accuracy of available data in a technical area. Papers reporting correlations of data or estimation methods are acceptable only if they are based on critical data evaluation and if they produce “reference data”—the best available values for the relevant properties. The journal is not intended as a publication outlet for original experimental measurements such as those normally reported in the primary research literature, nor for review articles of a descriptive or primarily theoretical nature.

Repository Criteria

Can’t find any

Name of Data Journal

Biodiversity Data Journal

Aims and Scope

Biodiversity Data Journal (BDJ) is a community peer-reviewed, open-access, comprehensive online platform, designed to accelerate publishing, dissemination and sharing of biodiversity-related data of any kind. All structural elements of the articles – text, morphological descriptions, occurrences, data tables, etc. – will be treated and stored as DATA, in accordance with the Data Publishing Policies and Guidelines of Pensoft Publishers.

Repository Criteria

Best practice recommendations:

  • Deposition of data in an established international repository is always to be preferred to supplementary files published on a journal’s website.
  • Occurrence-by-species records should be deposited through GBIF IPT.
  • Genomic data should be deposited at GenBank, either directly or via an affiliated repository, e.g. Barcode of Life Data Systems (BOLD).
  • Phylogenetic data should be deposited at TreeBASE, either directly or through the Dryad Data Repository.
  • All other biological data, including heterogeneous datasets, should be deposited in the Dryad Data Repository.
  • Repositories not mentioned above, including institutional repositories, may be used at the discretion of the author.
  • Digital Object Identifiers (DOIs) or other persistent links (URLs) to the data deposited in repositories, as well as the name of the repository, should always be published in the paper describing that data resource.

Other Repositories mentioned

  • The Knowledge Network for Biocomplexity (KNB)
  • The National Biological Information Infrastructure
  • DataBasin
  • DataONE
  • The PaleoBiology Database
  • The Research Collaboratory for Structural Bioinformatics (RCSB)’s Protein Data Bank (PDB)
  • The Universal Protein Resource (UniProt)

Other notes

Data publishing policies and guidelines give a good overview of what’s needed in general for data publishing.

Name of Data Journal

F1000 Research

Aims and Scope


Data Publication: F1000 Research promotes publication, refereeing and sharing of full datasets to encourage collaboration and accelerate scientific discovery. Data articles are citable and authors are credited when data are reused.


Data Articles: A dataset (or set of datasets) together with the associated methods/protocol used to generate the data. A Data Article may be published as a stand-alone article, or in conjunction with a Research Article.

Repository Criteria

Can’t find anything, but they are partnering with Dryad, BioSharing and figshare.

APCs include up to 1 GB of data. For 1-5 GB of data with an article, an additional US $200 to cover the storage costs is charged. Beyond 5 GB of data, authors are asked to contact F1000R to discuss the costs.
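To make the storage tiers concrete, here is a minimal sketch of the charges as described above. The function name `extra_data_charge_usd` is made up for the example, and treating exactly 1 GB as included and exactly 5 GB as within the US $200 tier is my reading of the wording rather than anything F1000 Research states.

```python
from typing import Optional


def extra_data_charge_usd(size_gb: float) -> Optional[int]:
    """Extra storage charge in US dollars for data published with an article.

    Returns None when the size is beyond 5 GB, because the cost then has to be
    discussed with the journal. Boundary handling is an assumption.
    """
    if size_gb < 0:
        raise ValueError("data size cannot be negative")
    if size_gb <= 1:
        return 0      # covered by the article processing charge
    if size_gb <= 5:
        return 200    # flat additional storage charge
    return None       # beyond 5 GB: contact the journal


if __name__ == "__main__":
    for size in (0.5, 3, 12):
        print(size, "GB ->", extra_data_charge_usd(size))
```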

Other notes

The information pack at

gives information about the publication model being used.

Name of Data Journal (website)

International Journal of Robotics Research

Aims and Scope


Repository Criteria


Peer Review and hosting Datasets from Data Papers

It is the intent of IJRR to solicit archival quality data, which means that all aspects of the Data Paper and associated dataset will be peer reviewed. Our aim is to provide an archival service to authors, however this is an ongoing process, and while we aim ultimately to host all datasets and their supporting websites, it is anticipated that initially many data papers will refer to websites hosted and author managed.

Other notes

Guidelines for submissions of Data papers

Name of Data Journal

CODATA's Data Science Journal

Aims and Scope

The Data Science Journal is a journal of the Committee on Data for Science and Technology (CODATA) of the International Council for Science (ICSU).

The Data Science Journal is a peer-reviewed, open access, electronic journal publishing papers on the management of data and databases in Science and Technology. The scope of the Journal includes descriptions of data systems, their publication on the internet, applications and legal issues. All of the Sciences are covered, including the Physical Sciences, Engineering, the Geosciences and the Biosciences, along with Agricultural and the Medical Sciences.

The Journal publishes data or data compilations, if the quality of data is excellent or if significant efforts are required in compilation.

Scope of the journal is a long list at

Repository Criteria

It looks like data is stored as part of the supplemental information.


Ubiquity Press:

Name of Data Journals

Aims and Scope

Ubiquity Press metajournals provide a fully open access way to discover research resources that are spread across multiple locations and usually hard to find. Metapapers reward authors for openly archiving their research datasets, software and reports, through citation and impact tracking. The publications encourage new and more efficient research, new collaborations, and reuse by the public (e.g. for teaching and journalism). UP metapapers come in three flavours:

  • Data papers highlight openly archived data with high reuse potential, and provide recognition for the producers of the data.
  • Software papers help you to locate openly archived, reusable code relevant to your research, and provide a mechanism for citing its use.
  • Research reports provide concise summaries of key developments in a field that the community do not otherwise have access to.

All papers are peer reviewed to ensure that the associated resources have been archived in suitable repositories, according to appropriate open standards.

Repository Criteria

Ubiquity Press Metajournals provide a list of repositories recommended for the archiving of datasets. For example:

Other notes

All papers and data are fully open access.

Name of Data Journal

BMC Research Notes

Aims and Scope

BMC Research Notes is an open access journal publishing scientifically sound research across all fields of biology and medicine. The journal provides a home for short publications, case series, and incremental updates to previous work with the intention of reducing the loss suffered by the research community when such results remain unpublished.

BMC Research Notes also encourages the publication of software tools, databases and data sets and a key objective of the journal is to ensure that associated data files will, wherever possible, be published in standard, reusable formats.

Repository Criteria

Under "Publishing data" in

Other Notes

"Authors linking datasets to their publications should include an Availability of supporting data section in their manuscript and cite the dataset in their reference list."

And finally, a model journal:

Name of Journal

Geoscientific Model Development (GMD)

Aims and Scope

Geoscientific Model Development (GMD) is an international scientific journal dedicated to the publication and public discussion of the description, development and evaluation of numerical models of the Earth System and its components. Manuscript types considered for peer-reviewed publication are:

  • Geoscientific model descriptions, from box models to GCMs;
  • Development and Technical papers, describing development such as new parameterisations or technical aspects of running models such as the reproducibility of results;
  • Papers describing new standard experiments for assessing model performance, or novel ways of comparing model results with observational data;
  • Model intercomparison descriptions, including experimental details and project protocols.

Repository Criteria

None specified

Other notes

Not a Data Journal as such, but serves a similar function for the modelling community.

Running a project, or tips, tricks and tools for the project manager

As the project manager, I've spent a lot of my time recently sorting out things which will (hopefully) help the project to run smoothly. So, besides the usual project and workpackage plan, I've got this site and blog set up and have introduced the team to a few other things I've found useful.

  • Every project needs good communications, and having a project-specific mailing list is definitely the way to go, as it's one convenient place to send emails to all the partners, without having to make sure everyone's name is in the "To:" field. I use JISCmail, which is free for members of the academic and research communities, and has never given me any problems.
  • Trying to find a good date and time for meetings/teleconferences is always tricky. I use Doodle to set up sets of dates that people can then express a preference on. Yes, it's a commercial service, but it's free for basic use. I also like their yes/no/maybe options for responses to the poll.
  • At the moment, we're mainly passing documents around via email, which isn't ideal because of the lack of version control etc. (though with only a few people working on the documents at any time, it's manageable). Later on, I'm expecting to make a bit more use of Google Docs (now Google Drive) to share and edit documents on-line.
  • We've got partners spread out across the UK and also in the USA, so a key part of our interaction is the monthly teleconference. We've used Skype (and use it for the weekly project management chat), though the sound quality can be variable, so we've also used a commercial teleconference system. Jury's still out for this particular tool!
  • For keeping track of my "things to do", I use a combination of pen and paper (no need to worry about batteries) and flags and appointments in my Outlook. I have had Remember the Milk recommended as a good tool for managing the to-do list.
  • Finally, this site is itself a wiki, and all the project partners are encouraged to create and edit pages as needed. We still need to sort out a proper, secure, project document repository for all our working documents, but in the meantime, we'll keep building this site as a first stop for information about the project.

So, that's my tools and tips for running a project. If anyone knows of any others, I'd love to hear about them!

Hello and welcome to the PREPARDE blog!

This is just a quick test post to say hello and welcome to the PREPARDE project blog.

The project kicked off at the beginning of July, and we're currently getting started on a lot of interesting work and discussions around our main project themes of scientific peer-review of data, cross-linking between datasets and papers, accreditation of data repositories and workflows for data publication.

In other words, there's not a lot here yet, but there will be!