
DELIMIT

Deliberative workshops with public members: Establishing trust in the use of synthetic data

Background

Synthetic data (often referred to as ‘mock’ or ‘dummy’ data) is an artificial version of a dataset. It is created at random but replicates the structure and some of the patterns of the original ‘real’ dataset, while attempting to minimise the risk of identifying any specific individual. This type of data can be used to explore the potential usefulness of a ‘real’ dataset and to provide training in its use for those interested in accessing it for research purposes.

Some types of synthetic data pose more risk to confidentiality than others, depending on how closely they match the original dataset. “Low-fidelity” synthetic data carries a lower risk, but may be less useful for researchers.

Need for consultation

Data providers, including the NHS and other UK Government departments, already make some low-fidelity datasets available to researchers. Despite recent work to expand the use of synthetic data and to encourage both researchers and data providers to use these datasets, there has been no widespread consultation with the public.

We are undertaking a large public consultation with members of the public from across the four nations of the United Kingdom. Working with a community engagement agency (Egality), we will recruit a diverse cohort of public members (n=40). We will run four workshops across 2024 to explore public attitudes towards the use of synthetic data for research, including: the perceived benefits and risks of synthetic data; the acceptability of scaling up its use; and the language and techniques for communicating about synthetic data with the public. Workshop content and dissemination strategies will be guided by an expert steering group and public collaborators.

Intended study outputs

Our outputs will include a set of recommendations for researchers and data providers, based on the key themes identified in the first three workshops. In the final workshop we will agree the final recommendations with members before producing a full written report. The recommendations will be relevant to departments releasing, or planning to release, synthetic data and to the researchers who access it. We will also produce our own accessible output (e.g. an infographic) on the topic of synthetic data, based on our recommendations, which will be freely available.

Information

Chief Investigator(s)

Key facts

Start date 1 Mar 2024
End date 31 Jan 2025
Grant value £120,893
Status
  • Set up
