-Better software for Better research-
Introduction to the FAIR² principles for Research Software

Romain Thomas
Head of Research Software Engineering,
The University of Sheffield
rse.shef.ac.uk

MADE4Manufacturing Centre for Doctoral Training,
Data science and RSE module,
October 2025

Useful Information:

You can follow the slides on your own laptop ➡️➡️
Slides are freely available on GitHub.
Interrupt me whenever you want.
All blue text is a hyperlink.

Who am I?

Name : Romain Thomas
Role : Head of Research Software Engineering
Previously : Staff Astronomer and Software Project manager at the Very Large Telescope (Chile)
Released/Published a few modules/software:
- dfitspy
- SPARTAN
- SCUBA
- STON (submitted)

Who are we? The teams behind the programme

Research Software Engineering

The Research Software Engineering team is composed of 13 members and collaborates with researchers across the University in building research software. Areas of expertise within the group include: general software development, code optimisation and performance, reproducibility, GPU computing and Deep Learning, High Performance Computing, training, etc…

Who are we? The teams behind the programme

Data Analytics Service

The Data Analytics Service (IT Services) supports research excellence at the University of Sheffield by bridging technical and analytical gaps through consultation, delivering training, and long-term collaboration with research teams. DAS supports researchers with reproducible data analysis, data visualisation, data engineering, machine learning, statistics, big data, research software, web design, and much more.

Who are we? The teams behind the programme

Library’s Scholarly Communications

The Library’s Scholarly Communications Team provides specialist services to support researchers at the University of Sheffield. They offer guidance on making your research outputs open access, and give support on good practice in research data management, copyright and licensing as well as open research more broadly.

Before we start, let me ask you some questions…

Who here has already written some code?
If you opened that code a year from now, how easy would it be for you to run it again?
What would someone else need in order to use it — just the files, or also data, documentation, instructions, dependencies…?
If I wanted to find your code tomorrow, would I be able to? Where would I look?
Would I be able to reuse it for a slightly different purpose than yours?

Why FAIR?

Why OPEN?

Research is a continuous process

“The succession of researchers is comparable to a single person who learns indefinitely.”
Pascal, Pensées, French mathematician, physicist, inventor, philosopher and theologian [1623-1662]

That’s very old…
But still very valid…
And it becomes much more difficult with the complexity of modern research.

Research creates knowledge…that is passed down

“Knowledge is humankind’s most precious treasure. Everything that we accomplished has been done due to the capacity to create a transmissible heritage, which spares each new generation the task of starting from scratch.” B. Sirbey, le grand homme qui apprend.

If we are doing the research we are doing today, it is thanks to the work of previous generations that created the knowledge that we are using now.

And that can be trusted…

Research relies on the ability to trust what has been done before.
This means that a result has been tested, verified and could be reproduced ➡️➡️
Tools and methods used for a particular result are known and shared…

The Turing Way project illustration by Scriberia. Used under a CC-BY 4.0 licence. DOI: 10.5281/zenodo.3332807

What if a generation of researchers stops doing this?

Tools and methods used for a particular result are NOT known and shared…
This means that a result can NOT be tested and verified and can NOT be reproduced.
➡️ It is harder to trust research

The Turing Way project illustration by Scriberia. Used under a CC-BY 4.0 licence. DOI: 10.5281/zenodo.3332807

Are we far from reaching this situation?

Source: Baker M., Nature, 2016

90% said there is a crisis!
More than 70% of researchers have tried and failed to reproduce another scientist’s experiments…
And more than half have failed to reproduce their own experiments.

So how do we get better?

Let’s improve!

Source: www.aalto.fi

And why not start with your software?

Let’s start with a definition: what is research software?

“Source code files, algorithms, scripts, computational workflows and executables that were created during the research process or for a research purpose.”

Barker et al. Scientific Data 9:622 (2022) “Introducing the FAIR Principles for research software”

What is FAIR?

The FAIR principles

The Turing Way project illustration by Scriberia. Used under a CC-BY 4.0 licence. DOI: 10.5281/zenodo.3332807

A guideline for those wishing to enhance the reusability of their data holdings
–Wilkinson et al. (2016)–

The FAIR principles

“Many of the FAIR Guiding Principles can be directly applied to research software by treating software and data as similar digital research objects. However, specific characteristics of software — such as its executability, composite nature, and continuous evolution and versioning — make it necessary to revise and extend the principles.”

Chue Hong, Neil P. et al, FAIR Principles for Research Software (FAIR4RS Principles)

The FAIR principles: what do they say?

Findable: Software, and its associated metadata, is easy for both humans and machines to find
Accessible: Software, and its metadata, is retrievable via standardised protocols

Barker et al. Scientific Data 9:622 (2022) “Introducing the FAIR Principles for research software” DOI: 10.1038/s41597-022-01710-x

FINDABLE: Research software should have a globally unique and persistent identifier (e.g., DOI or a persistent URL) so that it can be easily found and cited. Sufficient metadata should be provided to help users discover the software. This includes descriptions of the software’s function, version information, authorship, and where to access it. The software and its metadata should be indexed in searchable repositories so it can be discovered via common search engines and research infrastructure platforms (e.g., Zenodo, GitHub, or institutional repositories).

ACCESSIBLE: The software should be easily retrievable using the unique identifier. Typically, this involves storing the software in a trusted repository that ensures long-term access. Clear information about the conditions under which the software can be accessed should be available, including open access options, if applicable. This ensures users understand whether they can freely use or adapt the software.

The FAIR principles: what do they say?

Interoperable: Software interoperates with other software by exchanging data and/or metadata, and/or through interaction via application programming interfaces (APIs), described through standards.
Reusable: Software is both usable (can be executed) and reusable (can be understood, modified, built upon, or incorporated into other software)

Barker et al. Scientific Data 9:622 (2022) “Introducing the FAIR Principles for research software” DOI: 10.1038/s41597-022-01710-x

INTEROPERABLE: The software should use standardized data formats and interfaces where possible, allowing it to work with other software, tools, or systems. Clear documentation should be provided so users know how to integrate the software with other tools or systems. Where possible, the software should implement and support established protocols, formats, and APIs that are widely adopted in the research community.

REUSABLE: The software should be well-documented, including clear instructions on how to install, run, and modify it. The metadata should describe how and where the software can be reused, including dependencies, versioning, and requirements. An appropriate open or permissive license should be provided to ensure that others can legally reuse, modify, and redistribute the software. Adhering to coding standards, including the use of tests and continuous integration (CI), enhances the reliability and reusability of the software.

University’s position about FAIR

‘‘We aspire to open research culture that values a diverse range of contributions and adheres to the FAIR principles to enable the results of our research to be of maximum benefit to society (findable, accessible, interoperable and reusable), whilst also respecting circumstances that limit data sharing (for example, due to issues of privacy, non-consent, contractual agreements, legislation or practicality).’’
University of Sheffield, Statement on Open Research

‘‘All researchers, including postgraduate research students, have a personal responsibility to manage effectively the data they create….. All researchers are expected to document research data and software in line with the FAIR principles…..’’
University of Sheffield, Policy on good research and innovation practices

Barriers to FAIR²4RS

fear of prejudice
- important to create a positive culture
fear of ‘theft’
- licensing and citation
technical and time barriers
- support is available!
- only need to learn once
non-commercialisable?
- open source and commercialisation are compatible
- greater impact through open source

Better Science through Better Data (2017) scribe images.

Benefits of FAIR²4RS

Better Science through Better Data (2017) scribe images.

Accelerate research
Increase transparency of research
Increase visibility, citation, reputation and impact
Reduce duplication of effort

How to be FAIR?

FAIR4RS: Think about how you are coding…

Where possible, make your code modular.
Comment your code to make it as clear as possible.
Create and provide tests that others can use.
Follow coding standards.
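As an illustration of these four points, here is a minimal sketch in Python. The function name normalise and the test names are hypothetical, chosen only for this example: one small function that does a single job, a docstring and comments that explain it, and two tests that anyone can run.

```python
"""Hypothetical example: one small, documented and tested analysis step."""


def normalise(values):
    """Rescale a list of numbers linearly to the range [0, 1].

    Raises ValueError if the list is empty or if all values are equal,
    because the rescaling is undefined in both cases.
    """
    if not values:
        raise ValueError("values must not be empty")
    low, high = min(values), max(values)
    if low == high:
        raise ValueError("values must not all be equal")
    # Subtract the minimum, then divide by the range.
    return [(value - low) / (high - low) for value in values]


def test_normalise_maps_extremes_to_zero_and_one():
    assert normalise([2.0, 4.0, 6.0]) == [0.0, 0.5, 1.0]


def test_normalise_rejects_empty_input():
    try:
        normalise([])
    except ValueError:
        return
    raise AssertionError("expected ValueError for empty input")
```

The test functions follow the naming convention that test runners such as pytest look for, but they are plain functions and can also be called directly.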

FAIR4RS: Be open even inside the code!

Where possible and applicable, outputs (even between pieces of code) should use open and accessible data formats, which will help if other researchers only wish to use part of your code.
But do NOT reinvent the wheel! In some research fields data formats are standardised ➡️ if you want people to use your code, use [your] community standards!
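A minimal sketch of passing results between two pieces of code in an open format, here CSV via the Python standard library. The column names sample_id and temperature_k are assumptions made for this example; if your field has a standard format, use that instead.

```python
import csv
import io

# Hypothetical column names, chosen only for this illustration.
FIELDNAMES = ["sample_id", "temperature_k"]


def write_results(rows, handle):
    """Write a list of result dictionaries as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


def read_results(handle):
    """Read the CSV back, converting the temperature column to float."""
    return [
        {"sample_id": row["sample_id"], "temperature_k": float(row["temperature_k"])}
        for row in csv.DictReader(handle)
    ]


rows = [
    {"sample_id": "a1", "temperature_k": 293.15},
    {"sample_id": "a2", "temperature_k": 301.5},
]

# An in-memory buffer stands in for a file on disk.
buffer = io.StringIO()
write_results(rows, buffer)
buffer.seek(0)
round_trip = read_results(buffer)
```

Because the intermediate output is plain CSV, another researcher can pick up the second step only, or open the file in any other tool.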

FAIR4RS: Version your code!

Using a version control platform such as GitHub or GitLab allows you to keep track of the changes you make to your code.
You can release versions of your software, code or scripts directly from GitHub. While it should not be used as a long-term storage place, it gives you a place where your code can be downloaded and where people can contribute.

https://www.sheffield.ac.uk/library/research-data-management/repositories

FAIR4RS: Document your code!

A little poem from A beginner’s guide to writing documentation:

If people don’t know why your project exists, they won’t use it.
If people can’t figure out how to install your code, they won’t use it.
If people can’t figure out how to use your code, they won’t use it.

In practice, GitHub can host documentation as a website (and it is very easy to do!) ➡️➡️
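Documentation often starts inside the code itself. The sketch below shows a docstring that says what a function does, what it expects and how to call it. The function celsius_to_kelvin is a hypothetical example, and the section layout (Parameters, Returns, Examples) is one common convention, not a requirement.

```python
def celsius_to_kelvin(temperature_c):
    """Convert a temperature from degrees Celsius to kelvin.

    Parameters
    ----------
    temperature_c : float
        Temperature in degrees Celsius.

    Returns
    -------
    float
        Temperature in kelvin.

    Examples
    --------
    >>> celsius_to_kelvin(0.0)
    273.15
    """
    return temperature_c + 273.15
```

Calling help(celsius_to_kelvin) in Python displays this text, and documentation generators can typically turn docstrings like this into pages for a hosted website.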

FAIR4RS: Licence your code!

You need to tell people how they can re-use your code.

GPLv3 (GNU General Public License): a free, copyleft licence for software and other kinds of works. It is intended to guarantee your freedom to share and change all versions of a piece of software, so that it remains free software for all its users.
MIT licence: a permissive free software licence. It grants, without limitation, the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the software.

The licence must be made clear in the code repository and in the documentation.

FAIR4RS: Get credit for your work

If people are using your software you should get credit for it.

➡️ State how you want to be credited. You can add it in the documentation and/or create a CITATION.cff file that you add alongside your code (tools are available to generate it).
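To give an idea of what such a file contains, here is a minimal sketch that builds the text of a CITATION.cff file. The helper build_citation_cff and every value passed to it (title, version, date, author) are hypothetical placeholders; the keys follow the Citation File Format 1.2.0, and in practice a dedicated generator tool is the safer route.

```python
def build_citation_cff(title, version, date_released, authors):
    """Return the text of a minimal CITATION.cff file.

    authors is a list of (family_name, given_name) tuples.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f'title: "{title}"',
        f'version: "{version}"',
        f"date-released: {date_released}",
        "authors:",
    ]
    for family_name, given_name in authors:
        lines.append(f'  - family-names: "{family_name}"')
        lines.append(f'    given-names: "{given_name}"')
    return "\n".join(lines) + "\n"


# Placeholder values only: replace them with your own project details.
citation_text = build_citation_cff(
    title="my-analysis-tool",
    version="0.1.0",
    date_released="2025-10-01",
    authors=[("Doe", "Jane")],
)
print(citation_text)
```

Saving this text as CITATION.cff at the top level of the repository lets people, and platforms that read the format, see how you want to be cited.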

FAIR4RS: Publish it!

Generalist software journals
- JOSS: Journal of Open Source Software: Academic journal with a formal peer review process that is designed to improve the quality of the software submitted.
- JORS: Journal of Open Research Software: Features peer reviewed Software Metapapers describing research software with high reuse potential.
- Software Impacts: multidisciplinary, open access, peer-reviewed journal which publishes short articles that describe software which addresses a research challenge.

Some are domain specific:
- Astronomy and Computing
- Journal of Artificial Societies and Social Simulation
- Journal of Statistical Software
- Science of Computer Programming
- Computer Methods and Programs in Biomedicine

You can find a list of potential journals here

The RSE module: what will you learn this year?

Lack of skills for developing software

Do you feel that you have received sufficient training to develop reliable software?

Bob Turner & Paul Richmond, UoS, RSE team, github.com/RSE-Sheffield/sssurvey.

The FAIR²4RS Programme: Overview

Software Lifecycle Planning

Who: R Thomas

When/Length: After this talk today!

Abstract:

When you start writing software it is often very useful to think about the development process and how you will make your software sustainable in the long term. In this module we will introduce important aspects of software development in research: software management plan, licences and dissemination. This module should allow you to ask yourself the right questions when starting a research software project.

Version control: Git, GitHub and GitKraken - From Zero to Hero

Who: Neil Shephard

When/Length: 27/10 & 03/11.

Abstract:

If you’ve never heard of or used version control and Git before this is the course for you. We start by introducing version control and exploring how it can be beneficial to researchers, then we introduce some useful tools and get started with some basic workflow using these tools. We build on those foundations with collaborative exercises that introduce key concepts such as forks, pull requests and branches and give you the chance to get some hands-on experience with using version control in a research setting.

Design your code (and write less of it)

Who: Martin Dyer, Neil Shephard

When/Length: 24/11 & 01/12

Abstract:

The way you write your code will have a massive impact on how easy it is to maintain in the long run. This course on Code Design introduces essential principles and best practices for writing clean and maintainable code. We will learn how we can write clean code, adhering to naming conventions, commenting, and following PEP 8 guidelines. We will then explore some fundamental principles such as DRY, KISS or YAGNI that are important to keep in mind when writing new code and see how we can spend less time touching the code by introducing configuration files and command line interface.

Documentation

Who: Joe Heffer

When/Length: 19/2

Abstract:

Well-documented software promotes reproducibility, maintainability, and increased research impact through wider adoption and citation. This course teaches researchers how to document their software effectively, making it accessible and understandable to others. It covers topics such as writing readable code and usage instructions.

Code Testing

Who: Sylvia Whittle, Michael Foster

When/Length: January (half day)

Format: In person

Abstract:

Does your code work? Are you sure? How do you ensure that it keeps working when you change it? Manually verifying is slow and tedious. Why not automate it? Software testing checks that your code works for you, and when it breaks, it can show you exactly where it broke, without you having to trawl through hundreds of lines of code manually. This course aims to provide you with the tools you need to start automatically ensuring the reliability of your code.

Reproducible Environments

Who: Dan Brady

When/Length: 30/4

Abstract:

Ensuring that others are able to take your code, run it, and are able to produce the same (or equivalent) results is one of the key tenets of FAIR and reproducible research software. This course will provide you with an overview of different ways to make your code reproducible and then focus on virtual environments as a specific tool for computational reproducibility.
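A first step towards a reproducible environment is simply to record what you ran your code with. The sketch below, using only the Python standard library, collects the Python version and the installed packages; the function name describe_environment is hypothetical, and this is only an illustration, not the course material.

```python
import platform
from importlib import metadata


def describe_environment():
    """Collect the Python version, platform and installed package versions."""
    packages = sorted(
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in metadata.distributions()
        # Skip any installation whose metadata has no package name.
        if dist.metadata["Name"]
    )
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": packages,
    }


environment = describe_environment()
print(environment["python"])
for package in environment["packages"]:
    print(package)
```

Saving this record next to your results tells others (and future you) which versions produced them; a virtual environment then lets them install the same versions in isolation.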

RSE website

Contacts:

Tamora James - RSE and FAIR²4RS Programme Manager (t.d.james@sheffield.ac.uk)
Romain Thomas - Head of RSE (romain.thomas@sheffield.ac.uk)

Acknowledgements & References

Thank you to Tamora James for leading the development of this training programme
Thank you to Christopher Wild, Ric Campbell, Farhad Allian, Daniel Brady, Kate O’Neill, Joe Heffer, Jenni Adams, Neil Shephard, Sylvia Whittle and Arfon Smith for dedicating time to prepare all the material!

References
* D. Wilby Lunchbyte talk on the FAIR principles
* T. James, FAIR for research software, Talk OpenFest 2024
* The Turing Way
* B. Sirvey Le grand homme qui apprend
* Chue Hong, Neil P. et al, FAIR principles for Research Software
