I am an ONS Senior Lecturer in Statistics. Prior to my appointment at Cardiff University I was a Lecturer at Lancaster University and prior to that a Lecturer at the University of Southampton.
My main research areas are missing data and data confidentiality, and the statistical problems that arise from each. I also have interests in Bayesian methods more generally. I enjoy working collaboratively, both with colleagues in academia and with non-academic partners. Some of my previous collaborations have included working with the Office for National Statistics, National Health Service Blood and Transplant, and the Institute of Employment Research in Germany. More details can be found on my research page.
I am an active member of the Royal Statistical Society (RSS). I was Chair of the Medical Section from January 2017 to January 2020 and am currently an RSS Council member. In 2020 I also taught a course for the African Institute for Mathematical Sciences that was supported by the RSS.
Currently funded projects
My research addresses the challenges posed by missing data and data confidentiality. The funded projects I am currently working on are:
"Partial recovery of missing responses - a toolbox for efficient design and analysis when data may be missing not at random" funded by EPSRC (348k)
"Developing synthetic data methods for large confidential administrative databases" funded by ESRC (a 3 year PhD CASE studentship)
Missing data is a common, unavoidable problem that arises in many fields. Key statistical challenges include dealing with the missing data problem most effectively as well as ensuring that appropriate inferences are drawn from the incomplete data. My projects in this area have broadly followed two strands. The first develops multiple imputation methodology, improving techniques both for imputing missing values and for analysing the multiply imputed data sets. The second considers how to design experiments most effectively when missing data in the outcome of the experiment is to be expected.
One of my current research interests in missing data is using optimal design methods to deal with data that is missing not at random, and I have recently been awarded a two-year grant from EPSRC to support this line of research. I also have an ongoing project on the missing data problem that arises in the offline evaluation of bandit methods.
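As a rough illustration of the multiple imputation idea, the following is a minimal Python sketch and not the methodology developed in these projects. It assumes a single numeric variable whose missing values (coded as NaN) are missing completely at random, imputes them several times from a normal model, and pools the estimated mean across the completed data sets using Rubin's combining rules. The function names `rubin_pool` and `impute_and_pool_mean` are hypothetical and chosen only for this example.

```python
import numpy as np


def rubin_pool(estimates, variances):
    """Combine m completed-data estimates and variances with Rubin's rules."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = estimates.size
    pooled = estimates.mean()            # pooled point estimate
    within = variances.mean()            # average within-imputation variance
    between = estimates.var(ddof=1)      # between-imputation variance
    total = within + (1 + 1 / m) * between
    return pooled, total


def impute_and_pool_mean(y, m=20, seed=0):
    """Multiply impute NaN entries of y under a normal model and pool the mean.

    Assumption for this sketch: values are missing completely at random.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    missing = np.isnan(y)
    observed = y[~missing]
    n_obs = observed.size
    estimates, variances = [], []
    for _ in range(m):
        # Draw the model parameters from their posterior (noninformative prior)
        # so that each imputation reflects parameter uncertainty.
        sigma2 = (n_obs - 1) * observed.var(ddof=1) / rng.chisquare(n_obs - 1)
        mu = rng.normal(observed.mean(), np.sqrt(sigma2 / n_obs))
        completed = y.copy()
        completed[missing] = rng.normal(mu, np.sqrt(sigma2), size=missing.sum())
        estimates.append(completed.mean())
        variances.append(completed.var(ddof=1) / completed.size)
    return rubin_pool(estimates, variances)
```

The pooled variance adds the between-imputation spread to the average within-imputation variance, which is how the extra uncertainty due to the missing values enters the final inference.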
Data confidentiality is becoming an ever-increasing issue as the demand for access to different types of data increases. It is important to protect the privacy of the individuals who provide their data, while at the same time preserving the utility of any released data so that it serves a useful purpose to researchers. My main interest in this area is the synthetic data approach. The idea here is to use statistical models to simulate a synthetic data set that is a good reflection of the original data. Because the original data has been partly or fully replaced with synthetic values, confidentiality is protected to an extent. At the same time, provided a plausible model has been used, the statistical properties of the original data should be preserved in the synthetic data.
One of my current research projects in this area is looking at methods to synthesise large confidential administrative databases. The Office for National Statistics are Industrial Partners on this project. The project is funded by an ESRC CASE studentship, which supports a PhD student for three years.
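The basic idea of fully synthetic data can be sketched in a few lines of Python. This is a simplified illustration under strong assumptions, not the method used in my projects: the original data is taken to be numeric and well described by a multivariate normal model, the model is fitted with plug-in estimates of the mean and covariance, and every record is then replaced by a simulated one. The function name `synthesise_gaussian` is hypothetical.

```python
import numpy as np


def synthesise_gaussian(original, n_synth=None, seed=0):
    """Simulate a fully synthetic data set from a fitted multivariate normal.

    Assumption for this sketch: rows are records, columns are numeric
    variables, and a normal model is a plausible description of the data.
    """
    rng = np.random.default_rng(seed)
    original = np.asarray(original, dtype=float)
    n_records = original.shape[0]
    if n_synth is None:
        n_synth = n_records
    # Fit the model: plug-in estimates of the mean vector and covariance matrix.
    mean = original.mean(axis=0)
    cov = np.atleast_2d(np.cov(original, rowvar=False))
    # Replace every original record with a draw from the fitted model.
    return rng.multivariate_normal(mean, cov, size=n_synth)
```

Comparing summary statistics such as means and correlations between the original and synthetic data gives a first, informal check of utility. A sketch like this carries no formal guarantee of confidentiality, and a poorly fitting model would distort the relationships that analysts care about.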