Changes between Initial Version and Version 1 of KickoffMinutes


Timestamp: 15/08/12 15:42:15
Author: scallagh
== PREPARDE Kick-off Meeting ==
Mon 2 July 2012, University of Leicester (Physics Meeting Room F10)[[BR]]
11.00am – 4.30pm

Present:
 * Dr Sarah Callaghan (British Atmospheric Data Centre) (SC)
 * Dr Rebecca Lawrence (F1000) (RL)
 * Dr Fiona Murphy (Wiley-Blackwell) (via Skype) (FM)
 * Tim Roberts (Wiley-Blackwell) (TR)
 * Dr Jonathan Tedds (University of Leicester) (JT)
 * Dr Angus Whyte (DCC) (AW)
 * Dr Andrew Burnham (University of Leicester) (note taking) (AB)

Present for part of the meeting:
 * Dr Roland Leigh (University of Leicester) (1.00pm onwards) (RoL)
 * John Kunze (California Digital Library) (via Skype) (3.15pm onwards) (JK)
 * Dr Matt Mayernik (NCAR) (via Skype) (3.15pm onwards) (MM)

'''Project Overview'''

(JT) This is a widely scoped project, so there is a need to focus: identify where it can make a difference, the direction, and who can lead where.
The original start date was postponed from 1 June 2012 to 1 July 2012, with an end date of 30 June 2013.
The project budget had to be revised down from the originally proposed £150,000 to £135,000, as agreed by JISC.
There is a revised document and risk report accordingly (distributed).
Project documents and slides – it was agreed that project partners' logos should be included. The DCC, F1000, and new Wiley logos (the latter currently in development) therefore need to be added. There was agreement that logos could be resized accordingly.

'''Geoscience Data Journal'''

(FM) To produce a press release next week.
The editorial board is currently being compiled.
The content of the journal will be a) data papers and b) papers about data.
In relation to papers about data publication, there may be additional funding for this element from CODATA, which is less geoscience-specific.
There will be a fee of £1,000 to publish.
There is the issue of the peer-review process needed to reassure funders that a dataset is of good quality.

(RL) Also looking at this – allowing data-only papers, encouraging deposit of data into repositories.
There is a general move towards linking data to articles.

See the BMJ editorial on open access – "BMJ Editorial: Open Science and Reproducible Research" (27 June 2012, http://blog.datadryad.org/2012/06/27/bmj-editorial-open-science-and-reproducible-research/)

(JT) Springer are considering an open-access, online-only publication to include articles from a range of disciplines regarding data publication. JT has been invited to take a leading role.

(FM) Wants to be at the AGU (American Geophysical Union) conference to engage with the community. To send through details of the meeting.
This will be a chance to meet up with the US partners in this project (w/c 3.12.2012).

'''OJIMS'''

(SC) SC was the OJIMS Project Manager.
OJIMS – the Overlay Journal Infrastructure for Meteorological Sciences JISC project – developed software for overlay journals, i.e. papers about datasets.
The project proved that this would technically work.
The work was quiet for a while until the Geoscience Data Journal picked it up.

All OJIMS project documents are available online.
The project put a technical framework in place for how to publish datasets – a test-bed.
Have covered "how to review datasets" in IJDC papers.
Referred to the NERC Science Information Strategy Data Citation and Publication project (http://www.ijdc.net/index.php/ijdc/article/view/208/277).
Can now mint DOIs (Digital Object Identifiers).
Going out to known authors of good datasets in relation to getting a DOI and publishing.
This will help people to do this, and dataset creators will get credit.

'''Objectives (PowerPoint slides to be forwarded by JT)'''

1. Capture & manage workflows to operate GDJ (repository-controlled & journal-controlled diagram)
The project will involve examining the workflows shown.
Those shown in orange are to be investigated by this project.

(RL) Queried whether there are other approaches. It assumes that data is in a repository first, but what happens where it isn't available?
RL is currently taking another approach with F1000, i.e. people come with a paper and then look at repository issues.

(SC) GDJ is the core test case, so we need to learn from that and look more broadly.
The project should come up with alternative use-case diagrams.

Other issues discussed:
 * Orphan datasets.
 * (SC) Issues with Figshare, e.g. not enough metadata.

2. Develop procedures and policies for authors, reviewers and editors
(FM) Need to clarify what a data paper is.
How to do retrospective linking?

(RL) Guidelines for the new F1000 publication include issues regarding:
 * Whether data is raw or processed.
 * How it has been processed, etc.
The author guidelines are online – RL will forward them.

(JT) Different disciplines will have different levels of data, i.e. beyond simply raw vs. processed.
This was thought to be a possible area for a DCC briefing document, based on a recent ALPSP presentation he had given entitled "What is (research) data?"

'''Work Package 1 – Project Management'''

(SC) Delivery requirements within the first month:
 * Project plan required within a month (with sub-plans).
SC to do the first draft and send it round for comments.

 * Consortium agreement – this needs to come from Leicester (JT).
SC to look for similar documents.
JT to check with Simon Hodson (JISC) whether there is a favoured format for consortium agreements.

 * A web page needs to be on the JISC site.

BADC can set up a wiki, and will split it into the separate work packages.
A project blog will also be important – it requires monthly posts and a "we've started" kick-off post.
There is an issue of where to site all this, as UoL is the lead institute but the Project Manager is based at BADC.
(JT) There should be a team blog, but it isn't necessary to tie this to UoL (e.g. use Drupal, as in the JISC-funded BRISSkit project). A project website will be established at Leicester which points to the wiki and blog, assuming they are hosted elsewhere.
JT is using the "#PREPARDE" hashtag for tweets.

A mid-term report is required after 6 months.
Internal project communications – a project mailing list has been set up, and this should be used for all communications until other facilities have been established.
A monthly teleconference/Skype call will take place (all agreed to use Skype). This will be important to keep things ticking over.
There should also be face-to-face meetings when possible, e.g. when workshops are running.

'''Work Package 2 – Journal and Data Repository workflows'''

(FM) Information about what people need?
How accessible are data repositories to potential publishers?
Have some peer-review guidance documents.
Best way to engage?
See the GDJ home page – author guidelines (and on the submissions page).
Reviewer guidelines not yet published.
What meetings/surveys etc. to gather information?

(SC) Sees the work package slightly differently: it is concerned with the points where interactions with repositories can be made, looking for example at what the BADC workflows are when publishing.
Thinks guidelines fit into the next work package.
A first-pass capture is required before moving further.

The required activity is to a) sit down with those in the organisation, with a sheet of paper, and literally talk through what happens in the publication process, and b) get the results into e-format and send them round for comment.
(SC) Asked AW to do submission flows for IJDC – as a baseline for traditional publishing.

Need to start getting academics involved relatively early to view what is being considered.
(JT) Whilst framing the project in the first month, wondered if there are others to contact/liaise with.
(SC) When we have our own workflows, others may be more likely to comment/share.

Others to contact re their workflows:
a) DRYAD – Ryan Scherle (http://wiki.datadryad.org/Publications). AW to talk to him at an Open Repositories event.
b) Brian Hole (Ubiquity Press, and PI for another new JISCMRD data publication project based around the new Journal of Open Archaeological Data).
c) Pensoft?
(SC) Also need to ask US colleagues to do workflows from their perspective.
SC agreed to lead this work package alongside FM.

'''Work Package 4 – Cross-linking between repositories and data publishers'''

(FM)
Cross-linking is technically possible, but it is a very manual process.
This involves working out what can be automated, places to collect information, and allowing best practice to emerge.
The more specific we are about what is needed, the better.

(SC) There is an issue about where exactly a data citation should be. This needs to be thrashed out with DataCite etc.
(RL) Her own discussions about this concluded that it should be in the reference list. A paper needs to be sent round to publishers to get agreement.
(SC) Agreed, and thought that the citation should also be in the paper's abstract.

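As an illustration of the placement question, a dataset citation in a reference list might look something like the following (a hypothetical sketch loosely in the DataCite recommended style; the creators, dataset title, publisher and DOI are all invented):

```text
Smith, J.; Jones, A. (2012): Example Surface Temperature Dataset, version 1.0.
Example Data Centre. doi:10.nnnn/example-dataset
```
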
(RL) ISB & BioShare are working on unique identifiers and links through a BBSRC-funded project.
The article and the dataset will both have their own DOI.

'''Work Package 3 – Scientific review of datasets'''

(RL)
Through experience of ESA datasets, there are validation points and programmes.
Need a major data provider, e.g. the Met Office or ESA, to get a peer-review process in place for publishing algorithms.
Wants to know how datasets are reviewed, what methods are used, and whether they have been peer-reviewed. This influences whether there is trust in the data.
The aim is a structured way to have confidence in datasets.
See: CEMS – Climate and Environmental Monitoring from Space (http://isic-space.com/driving-innovation/climate-and-environmental-monitoring-from-space/)

There is a knowledge-exchange interest, and the issues move towards "can a validation portal be created?".
Also looking at data review guidelines.

(SC) Referred to an International Journal of Digital Curation (IJDC) article with data quality/metadata quality guidelines etc. (Issue 2, Volume 6, 2011, "Citation and Peer Review of Data: Moving Towards Formal Data Publication" – http://www.ijdc.net/index.php/ijdc/article/view/181/265)

(RoL) Raised the issue of what researchers are looking for to make data trustworthy – how can you trust data?
Suggested a link to the National Centre for Earth Observation (algorithm developers), as they create new datasets and are interested in validation etc.

(?)
Scientific vs. technical validation/review – the latter being what the repository does; e.g. a data scientist won't give a dataset a DOI if it lacks the required metadata or doesn't use accepted terminology.
Generalities vs. specifics of subsets of disciplines/types of data.
(RL) The repository needs to know the outcome of peer review, e.g. "it's rubbish".
(SC) They don't currently know/express an opinion.
There was seen to be a need for domain knowledge in review.

(AW) The NERC Data Value Checklist addresses what makes a dataset scientifically valuable in the long term.

Allocation of DOIs – it was noted that the UKDA allocate a new DOI for major revisions (but not minor revisions), though practice varies.
(SC) Minting DOIs is relatively new, so there is not a lot of experience.

(JT) Concluded that, with so many issues and questions, the key question is which aspects to concentrate on. Will work with RoL, Heiko Balzter and other academics to produce recommendations.

'''Work Package 6 – Stakeholder engagement and dissemination, including external communications'''

(JT) Will be running stakeholder workshops later on, shaped by earlier work.
There is potential for a briefing paper and questionnaire approach to target audiences for workshops.
(SC) Should also include the DCC-organised Research Data Management Forum.
It was agreed that the appropriate format would be to tag onto other events, e.g. as per the IDCC conference.

Possibilities to follow up:
 * STM Association – http://www.stm-assoc.org/ (International Association of Scientific, Technical & Medical Publishers) (FM)
 * Learned societies
 * Royal Astronomical Society (JT)
 * AGU (SC)
 * AMS – American Meteorological Society (FM)
 * Use the GDJ Review Board, as it is now being put together (RL)
 * IPCC (RL)

(SC) It was suggested that a portable "standard workshop" should be created and taken from place to place.
(JT) Developments in this area should be recorded on the wiki.

'''Work Package 5 – Data Repository Accreditation'''

(SC) Leading on this, with a clear idea of initial leads and contacts.
There is to be a first draft report for the IDCC conference in January, focussing on guidelines for publishers to help them identify trustworthy repositories.
A request was made for others to point to any relevant information they are aware of.
Note:
 * The "Data Seal of Approval" (http://www.datasealofapproval.org/)
 * ISO 16363 (http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=56510)

'''3.15pm onwards – American colleagues joined at this point and work package discussions were reviewed'''

a) Matt Mayernik – NCAR
b) John Kunze – Associate Director, University of California Curation Center

'''Work Package 1 – Project Management'''

SC confirmed that a project mailing list had been set up, and that she was writing a project plan.
SC/JT to work on a project website/wiki/collaborative environment. JK/MM confirmed that they were not too concerned about which platform was used for these.
A monthly telecon will be set up, and Leicester will draw up a consortium agreement.

Agreed communications:
a) Telecons to default to 4.00pm UK time monthly, via Skype, days to be decided by Doodle poll.
b) Quarterly or 6-monthly face-to-face meetings.

'''Work Package 2 – Journal and Data Repository workflows'''

(SC) Each project team member should go into their own organisation and look at the publishing workflows – the steps that happen before a DOI is created, the work is frozen, and it is made available.
A workflow comparison is to be conducted, bringing these together to see common procedures and differences, and where crosslinks can be made easily.

(MM) Within NCAR there are many workflows rather than one.
Data management teams are very specific: different labs with very specific work, e.g. the climate modelling team.
Data systems and workflows have grown independently.
We need to see what should cut across all of them, e.g. citation, peer review.
Noted that this is well timed, as this work was needed.

To focus on three groups, which are better organised and have already been contacted and have confirmed interest in this proposal:
a) NCAR Earth Observing Lab (observation data).
b) Climate modelling team (simulation data).
c) Research Data Archive (reference collection of diverse data types).

Suggestions have been made about other US groups which may be appropriate.

(JK) Confirmed that they have their own repository and issue DOIs.

'''Work Package 3 – Scientific review of datasets'''

Working towards reviewer guidance, and will use what already exists (RL evidence).
(RL) How many different flavours will be required? – this will be interesting.

(MM) Aware that this will be tricky, with technical vs. scientific review. There are wide variations in time spent, and in results.
Issues include:
 * Who is qualified to review?
 * Who would you send a dataset to? Peer review implies review by somebody outside the team responsible for creating and archiving the data, but people outside will have less context and may require more than the project documentation.
 * Whether review happens before or after a dataset is made available is an interesting question; after release, users may find issues with the data, e.g. calibration errors, resulting in changes.
 * Common practice at NCAR is to require a login before data can be downloaded, which allows the data archiving teams to inform those who have used the data if there is a dataset change.

(JK) Experiences much less formal review and so feels an outlier in this.
With versioning, the important thing is keeping the version history, whatever the DOI policy.

(RL) Is asked to do peer reviews and they may take days. The issues are therefore:
 * How much time is reasonable?
 * What tools are used?
 * Prioritisation of what is important and needs/doesn't need review.
 * How do you structure this extra work?
RL concluded that the best review is when people actually use the data.

'''Work Package 6 – Stakeholder engagement and dissemination, including external communications'''

Suggested events:
 * Agreed that an IDCC workshop would be a good idea.
 * (MM) American Meteorological Society, January 2013
 * AGU

(SC) Decided against a dedicated project Twitter feed for now, but to use #PREPARDE in personal tweets, show activity on web pages, and get information onto the wiki and point to it via #PREPARDE as soon as possible.

'''Work Package 4 – Cross-linking between repositories and data publishers'''

(RL)
Worth linking into the BBSRC-funded iBioDBCore (Internal BioCreators / Bio-Sharers), which has been running for the last year and a half.
It is a list of all biology repositories, cataloguing the data they take, their back-up plans, how they are funded, etc., with a "BioDBCore stamp".

(SC) A "Publication Roadmap" for CDL is required as a deliverable.
The next step is linking papers to data with a DOI.
What metrics should be used? e.g. usage counters.

(JK) Thomson Reuters are looking at launching a data citation index, but SC noted that this is some way off yet.

'''Work Package 5 – Data Repository Accreditation'''

(SC) The initial step is to do Google searches to find out what is out there, and gather what information we can.
(MM) What exactly will accreditation mean? How active? How much self-certification?
(SC) What we need to do is draw up the requirements for this.

Suggested sources:
 * (FM) NISO – National Information Standards Organization
 * (AW) Digital Preservation Coalition, Center for Research Libraries, the DCC risk-management approach (DRAMBORA)
 * (JT) Compare to the biomedical ISO standards work (UCL pilot group with BRISSkit, http://www.brisskit.le.ac.uk, project led by JT)

SC to draw up timelines in the next month.
     306SC to draw up timelines in the next month.