Moving from bottom-up to top-down: open-science practice in light of a research funder’s perspective

In November 2022, Toby and I attended the Data Science Support Workshop organised by the Wellcome Mental Health Data Prize, which focused on open science and the FAIR principles. Open science is not new to us: all our team members have practised it in some way, and have received some recognition for their contributions to open science[1,2].

To be fair, however, one of the main reasons that we believed (and still believe) in open science is that, in the era of the replication crisis, we all agree science could be done better. As early career researchers (ECRs), we therefore sought out best practices for doing science. But on many occasions, we practised open science in a bottom-up manner. Our institutions, at least back then, provided little guidance or incentive regarding open science. We just did what we believed was appropriate.

Things have changed dramatically since then. Universities and research institutions now provide more streamlined open science training, starting as early as the undergraduate level. Academic societies award prizes that recognise open science practices. More importantly, research funding agencies now list open science as one of their main evaluation criteria. Together, these initiatives have moved open science strikingly from bottom-up toward top-down.

/open_science.jpg

The FAIR Data Principles

Let’s first focus on one of the main aspects of open science, namely open data, which follows the FAIR Data Principles. The FAIR principles were first introduced and published in the journal Scientific Data[3]. FAIR stands for Findable, Accessible, Interoperable, and Reusable.

Findable: data are assigned a globally unique and persistent identifier, so search engines can find them. I believe this is self-explanatory.

Accessible: data should not only be findable but also accessible in an open, free, and low-barrier fashion. It is not ideal when the data you eventually find online sit behind complicated password protection.

Interoperable: data must use a formal, accessible, and broadly applicable language for their representation. Essentially, data need to be documented comprehensively, such that data from different sources can be integrated. You do not want to spend half a week figuring out the meaning of each variable, and whether or not it has been quality controlled.

Reusable: data are richly described with relevant attributes, and ideally meet domain-relevant standards. When sharing data, there is rarely any need to reinvent the wheel when there are domain-relevant standards to follow. Following these standards also makes it a lot easier for others to understand and eventually make use of the data.
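To make the four principles a little more concrete, here is a minimal sketch of how one might check a dataset's metadata against them before sharing. Everything in it is an assumption made for illustration: the `DatasetRecord` fields, the `OPEN_FORMATS` list, and the `fair_gaps` helper are hypothetical names, not part of the FAIR specification or of any Wellcome tooling, and the identifier and URL are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; the field names are illustrative only."""
    identifier: str = ""           # persistent identifier, e.g. a DOI (Findable)
    access_url: str = ""           # where the data can be retrieved (Accessible)
    requires_login: bool = True    # whether access is gated (Accessible)
    file_format: str = ""          # e.g. "tsv" (Interoperable)
    codebook: dict = field(default_factory=dict)  # variable -> description (Interoperable)
    licence: str = ""              # terms of reuse (Reusable)
    community_standard: str = ""   # domain standard followed, if any (Reusable)


# Assumed, non-exhaustive list of open, non-proprietary formats.
OPEN_FORMATS = {"csv", "tsv", "json", "parquet"}


def fair_gaps(record: DatasetRecord) -> list[str]:
    """Return a list of human-readable gaps, one per unmet principle."""
    gaps = []
    if not record.identifier:
        gaps.append("Findable: no persistent identifier")
    if not record.access_url or record.requires_login:
        gaps.append("Accessible: no open, low-barrier access route")
    if record.file_format.lower() not in OPEN_FORMATS or not record.codebook:
        gaps.append("Interoperable: closed format or undocumented variables")
    if not record.licence or not record.community_standard:
        gaps.append("Reusable: missing licence or domain standard")
    return gaps


# Placeholder values throughout; the DOI and URL below are not real.
example = DatasetRecord(
    identifier="doi:10.1234/example",
    access_url="https://example.org/data",
    requires_login=False,
    file_format="tsv",
    codebook={"age": "Age in years at enrolment"},
    licence="CC-BY-4.0",
    community_standard="BIDS",  # one example of a domain standard
)
print(fair_gaps(example))
```

A checklist like this cannot capture everything the principles ask for, but it shows how each letter of FAIR maps onto a piece of metadata that can be verified before release.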

The FAIR Data Principles relevant to the Wellcome Mental Health Data Prize

Wellcome will evaluate open science practice, especially adherence to the FAIR principles, for work developed during the Discovery Phase. Our first thought was: this is excellent! It carries far more weight when open science practices are actually evaluated than when they are merely encouraged. This ensures all projects will meet the open science guidelines to the highest degree possible. Specifically for data tool development, following the FAIR principles will make the data tool (a) findable through a public identifier, (b) accessible with neither extra cost nor password protection, (c) interoperable, in that it can integrate multiple data sources, and (d) reusable thanks to detailed documentation. Ultimately, following the FAIR principles when designing the data tool will maximise its range of users, from researchers and practitioners to patients and policymakers.

In fact, Wellcome, as one of the major research funders in the UK and perhaps also in the world, has been at the forefront of reforming funding policies on open science. Wellcome explicitly states that “We expect our researchers to maximise the availability of research data, software and materials with as few restrictions as possible.”[4] This means research funded by Wellcome ought to share data and code openly by default, unless other restrictions apply (e.g., patient data protected by an additional data privacy policy).

Beyond the UK funding landscape and open data

It wouldn’t make much sense if Wellcome were the only research funder driving open science practice. Fortunately, many more funders are driving these changes, for good and responsible science. For example, the European Research Council (ERC[5]) and the Dutch Research Council (NWO[6]) have both incorporated the FAIR principles into their funding policies, and more funders are moving in this direction. In the foreseeable future, open science practice will become the default way of doing science, rather than something only “good to have.” On a separate note, open data is only one of many layers of open science. Other aspects include open access, open education, and public engagement, all relevant, meaningful, and essential to the Wellcome Mental Health Data Prize, from the Prototyping Phase, through the Sustainability Phase, to public dissemination.

Conclusion

Open science, in particular the FAIR principles in open data, used to be bottom-up in the early days. As a result, researchers and practitioners have had to rely largely on “common sense” about what the right way to proceed ought to be. By contrast, major research funders like Wellcome have started to implement open science practices in a more top-down fashion. Such incentives are crucial for bringing the best of open science to ever more rigorous science.

[1] https://www.escan2022.eu/program/awards/

[2] https://www.bap.org.uk/awardinfo.php?awardinfoID=2&year=2022

[3] https://www.nature.com/articles/sdata201618

[4] https://wellcome.org/grant-funding/guidance/data-software-materials-management-and-sharing-policy

[5] https://erc.europa.eu/manage-your-project/open-science

[6] https://www.nwo.nl/en/research-data-management