Dr Robin Mitra
Senior Lecturer
- Available as a postgraduate supervisor
Overview
I am an ONS Senior Lecturer in Statistics. Prior to my appointment at Cardiff University I was a Lecturer at Lancaster University and, prior to that, a Lecturer at the University of Southampton.
My main research areas deal with problems arising from missing data and data confidentiality. I also have interests in Bayesian methods more generally. I enjoy working collaboratively, both with colleagues in academia and with non-academic partners. My previous collaborations include work with the Office for National Statistics, National Health Service Blood and Transplant, and the Institute of Employment Research in Germany. More details can be found on my research page.
I am an active member of the Royal Statistical Society (RSS). I was Chair of the Medical Section from January 2017 to January 2020 and am currently an RSS Council member. In 2020 I also taught a course for the African Institute for Mathematical Sciences that was supported by the RSS.
Publications
2020
- Pankhurst, L., Mitra, R., Kimber, A. and Collett, D. 2020. Multiply imputing missing values arising by design in transplant survival data. Biometrical Journal 62(5), pp. 1192-1207. (10.1002/bimj.201800253)
- Mitra, R., Blanchard, S., Dove, I., Tudor, C. and Spicer, K. 2020. Confidentiality challenges in releasing longitudinally linked data. Transactions on Data Privacy 13(2), pp. 151-170.
2019
- Lee, K. M., Biedermann, S. and Mitra, R. 2019. D-optimal designs for multiarm trials with dropouts. Statistics in Medicine 38(15), pp. 2749-2766. (10.1002/sim.8148)
2018
- May, K. M., Biedermann, S. and Mitra, R. 2018. Optimal design for experiments with possibly incomplete variations. Statistica Sinica 28(3), pp. 1611-1632.
- Ghosh, J., Li, Y. and Mitra, R. 2018. On the use of Cauchy prior distributions for Bayesian logistic regression. Bayesian Analysis 13(2), pp. 359-383. (10.1214/17-BA1051)
- May, K., Mitra, R. and Biedermann, S. 2018. Optimal design when outcome values are not missing at random. Statistica Sinica 28(4), pp. 1821-1838.
2017
- Lee, M. C., Mitra, R., Lazaridis, E., Lai, A., Goh, Y. K. and Yap, W. 2017. Data privacy preserving scheme using generalised linear models. Computers and Security 69, pp. 142-154. (10.1016/j.cose.2016.12.009)
2016
- Lee, M. C., Mitra, R., Lazaridis, E., Lai, A. C., Goh, Y. K. and Yap, W. 2016. Statistical disclosure control for data privacy using sequence of generalised linear models. Presented at: 21st Australasian Conference on Information Security and Privacy (ACISP 2016), Melbourne, VIC, Australia, 4-6 July 2016. In: Liu, J. K. and Steinfeld, R. eds. Information Security and Privacy: 21st Australasian Conference, ACISP 2016, Melbourne, VIC, Australia, July 4-6, 2016, Proceedings, Part I, Vol. 9722. ISSN 0302-9743. Springer Verlag, pp. 77-93. (10.1007/978-3-319-40253-6_5)
- Lee, M. C. and Mitra, R. 2016. Multiply imputing missing values in data sets with mixed measurement scales using a sequence of generalised linear models. Computational Statistics & Data Analysis 95, pp. 24-38. (10.1016/j.csda.2015.08.004)
- Mitra, R. and Reiter, J. P. 2016. A comparison of two methods of estimating propensity scores after multiple imputation. Statistical Methods in Medical Research 25(1), pp. 188-204. (10.1177/0962280212445945)
2015
- Rashid, S., Mitra, R. and Steele, R. 2015. Using mixtures of t densities to make inferences in the presence of missing data with a small number of multiply imputed data sets. Computational Statistics and Data Analysis 92, pp. 84-96. (10.1016/j.csda.2015.05.009)
2013
- Hu, J., Mitra, R. and Reiter, J. 2013. Are independent parameter draws necessary for multiple imputation? American Statistician 67(3), pp. 143-149. (10.1080/00031305.2013.821953)
2012
- Ghosh, M., Song, J., Forster, J., Mitra, R. and Mukherjee, B. 2012. On the equivalence of posterior inference based on retrospective and prospective likelihoods: application to a case-control study of colorectal cancer. Statistics in Medicine 31(20), pp. 2196-2208. (10.1002/sim.5358)
2011
- Mitra, R. and Reiter, J. P. 2011. Estimating propensity scores with missing covariate data using general location mixture models. Statistics in Medicine 30(6), pp. 627-641. (10.1002/sim.4124)
2010
- Mitra, R. and Dunson, D. 2010. Two-level stochastic search variable selection in GLMs with missing predictors. International Journal of Biostatistics 6(1), article number: 33. (10.2202/1557-4679.1173)
2009
- Reiter, J. P. and Mitra, R. 2009. Estimating risks of identification disclosure in partially synthetic data. Journal of Privacy and Confidentiality 1(1) (10.29012/jpc.v1i1.567)
2006
- Mitra, R. and Reiter, J. P. 2006. Adjusting survey weights when altering identifying design variables via synthetic data. Presented at: International Conference on Privacy in Statistical Databases, Rome, Italy, 13-15 December 2006. In: Domingo-Ferrer, J. and Franconi, L. eds. Privacy in Statistical Databases. Berlin, Heidelberg: Springer Verlag, pp. 177-188. (10.1007/11930242_16)
Currently funded projects
My main research addresses challenges in dealing with missing data and data confidentiality. The currently funded projects I am working on include:
- "Partial recovery of missing responses - a toolbox for efficient design and analysis when data may be missing not at random", funded by EPSRC (348k)
- "Developing synthetic data methods for large confidential administrative databases", funded by ESRC (a 3 year PhD CASE studentship)
Missing data
Missing data is a common, unavoidable problem that arises in many fields. Key statistical challenges include dealing with the missing data problem most effectively as well as ensuring appropriate inferences are drawn from the incomplete data. My projects in this area have broadly looked at two main areas. The first develops multiple imputation methodology, both to improve techniques for imputing missing values and to analyse the multiply imputed data sets. The second considers how to design experiments most effectively when missing data in the outcome of the experiment is to be expected.
One of my current research interests in missing data is in using optimal design methods to deal with data that are not missing at random, and I have recently been awarded a two-year grant from EPSRC to support this line of research. I also have an ongoing project on handling missing data in the offline evaluation of bandit methods.
Data confidentiality
Data confidentiality is becoming an ever-increasing issue as the demand for access to different types of data increases. It is important to protect the privacy of the individuals who provide their data, while at the same time preserving the utility of any released data so that it serves a useful purpose to researchers. My main interest in this area is the approach of synthetic data. The idea here is to use statistical models to simulate a synthetic data set that is a good reflection of the original data. Because the original data have been partly or fully replaced with synthetic values, confidentiality is protected to an extent. At the same time, provided a plausible model has been used, the statistical properties of the original data should be preserved in the synthetic data.
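To make the synthetic data idea concrete, here is a minimal sketch in Python. It is an illustration under simplifying assumptions, not the method used in any of the projects described here: the data are simulated toy values, the variable names (ages, incomes) and helper functions (fit_line, synthesise) are hypothetical, and the model is a plain linear regression with plug-in parameter estimates. A fuller approach would typically also draw the model parameters from their posterior distribution before simulating.

```python
import random
import statistics


def fit_line(x, y):
    """Least-squares fit of y on x; returns intercept, slope and residual sd."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard deviation on n - 2 degrees of freedom.
    sd = (sum(r * r for r in residuals) / (len(x) - 2)) ** 0.5
    return intercept, slope, sd


def synthesise(x, y, rng):
    """Replace every y with a draw from the regression model fitted to (x, y)."""
    intercept, slope, sd = fit_line(x, y)
    return [rng.gauss(intercept + slope * xi, sd) for xi in x]


rng = random.Random(2020)

# Simulated "confidential" data: income depends linearly on age plus noise.
ages = [rng.uniform(20, 65) for _ in range(2000)]
incomes = [10000 + 500 * a + rng.gauss(0, 3000) for a in ages]

# Partially synthetic data set: ages are kept, incomes are replaced.
synthetic_incomes = synthesise(ages, incomes, rng)

_, slope_original, _ = fit_line(ages, incomes)
_, slope_synthetic, _ = fit_line(ages, synthetic_incomes)
print(f"slope (original):  {slope_original:.1f}")
print(f"slope (synthetic): {slope_synthetic:.1f}")
```

No released income is an original value, yet the age and income relationship estimated from the synthetic data should be close to the one in the original data, which is the trade-off between confidentiality and utility described above.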
One of my current research projects in this area is looking at methods to synthesise large confidential administrative databases. The Office for National Statistics is the industrial partner on this project. The project is funded by an ESRC CASE studentship, which supports a PhD student for three years.