EPSRC DTP PhD in Mathematics: Anomaly Detection for Large Complex Data
|Application deadline||31 July 2018|
|Start date||1 October 2018|
|Level of study||Postgraduate research|
|Award type||PhD studentship|
|Number of studentships||1|
This PhD is designed to develop novel mathematics that bridges linear algebra, statistics and optimization, and to introduce modern techniques for anomaly detection.
The Office for National Statistics (ONS) is the UK’s largest independent producer of official statistics, and is the recognised national statistics institute of the UK. It produces a wide range of statistics on the UK’s economy, society and population. These are used in policy decisions across government, in the allocation of billions of pounds of funding, and by the private, academic and third sectors to inform decision-making.
ONS is currently undergoing an ambitious transformation, moving away from the traditional use of survey data to compile statistics and towards significantly increased use of administrative and other non-survey data, to better meet the needs of users.
The Independent Review of UK Economic Statistics (Bean, 2016) recommended in Strategic Recommendation D that ONS ‘make the most of existing and new data sources and the technologies for dealing with them’, and in Recommended Action 13, that ONS ‘build ONS’s capacity to clean, match and analyse very large datasets’. It also recommends that ONS build its capacity to do this through collaboration with the academic sector.
Large, complex data sources with many variables and mixed data types present a new challenge for anomaly detection as part of the statistical production process. The simple parametric models used for outlier detection in survey data are no longer suitable: they require model assumptions that would become prohibitively complex, do not process large data sets efficiently, and do not allow for mixed variable types.
Linear algebra has long been applied in a wide variety of areas of multivariate statistics, but the last decade has generated a number of new settings in which such techniques are being applied. Examples include the developments in compressed sensing and matrix completion, work pioneered by prominent mathematicians such as Candès (Candès & Tao, 2010), Donoho (Donoho, 2006), Tao (Candès & Tao, 2007) and Tsybakov (Rohde & Tsybakov, 2011).
The escalation of ‘big data’ has given rise to more considered thought on how optimization can inform statistical procedure as the dimensions of the problem grow. A modern trend has been to formulate statistical problems as (approximate) convex optimization problems, for which existing routines can find solutions fairly quickly even in very high dimensions (Boyd & Vandenberghe, 2004).
An interesting question is how close the solution of the approximate convex optimization problem is to the solution of the original statistical problem. This PhD is set in the context outlined above and will tackle the problem of anomaly detection.
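As a toy illustration of that question (not taken from the project itself), consider flagging anomalies in a list of residuals by fitting a sparse vector s to them. Penalising the number of non-zero entries of s gives a non-convex problem; replacing that count with the sum of absolute values gives a convex one. In this one-dimensional-per-entry case both problems have closed-form solutions, so the gap between them can be seen directly. The residual values, the penalty weight `lam` and all function names below are hypothetical and chosen only for the example.

```python
import math


def hard_threshold(y, lam):
    """Minimiser over s of 0.5*(y - s)**2 + lam*[s != 0] (non-convex count penalty).

    Keeping s = y costs lam; setting s = 0 costs 0.5*y**2, so y is kept
    only when |y| > sqrt(2*lam).
    """
    return y if abs(y) > math.sqrt(2.0 * lam) else 0.0


def soft_threshold(y, lam):
    """Minimiser over s of 0.5*(y - s)**2 + lam*|s| (convex absolute-value penalty)."""
    return math.copysign(max(abs(y) - lam, 0.0), y)


# Hypothetical residuals: mostly small noise, with two planted anomalies.
residuals = [0.1, -0.3, 0.2, 4.0, -0.1, 0.05, -5.0, 0.25]
lam = 1.0  # assumed penalty weight, picked by hand for this example

hard = [hard_threshold(y, lam) for y in residuals]
soft = [soft_threshold(y, lam) for y in residuals]

# An observation is flagged as anomalous when its fitted entry is non-zero.
flag_hard = [i for i, s in enumerate(hard) if s != 0.0]
flag_soft = [i for i, s in enumerate(soft) if s != 0.0]

# Largest entry-wise difference between the two solutions.
gap = max(abs(h - s) for h, s in zip(hard, soft))

print(flag_hard, flag_soft)  # [3, 6] [3, 6]
print(gap)                   # 1.0
```

Here the convex problem flags the same two observations as the original one, but shrinks their estimated sizes by `lam`. In realistic, high-dimensional settings no such closed form is available, and the two solutions need not agree on which observations are flagged.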
This project therefore offers you the novel opportunity not only to work on datasets not usually available to academia, but also to do so in a state-of-the-art distributed processing environment.
Datasets that you would work on may include HMRC’s turnover and expenditure data from value added tax returns and HMRC payroll data. ONS is exploring the potential to use these in the compilation of headline economic statistics including gross domestic product (GDP). Robust understanding of these new datasets is crucial in ensuring the quality of market-moving statistics.
Chair in Statistics
Senior Lecturer in Statistics / Director of Admissions
Dr Louisa Nolan (Data Science Campus, Office for National Statistics)
|Tuition fee support||Full UK/EU tuition fees|
|Maintenance stipend||Doctoral stipend matching UK Research Council National Minimum|
|Residency||UK Research Council eligibility conditions apply|
Applicants should have obtained, or be about to obtain, a First or Upper Second Class UK Honours degree, or the equivalent qualifications gained outside the UK. Applicants with a Lower Second Class degree will be considered if they also have a Master’s degree.
Consideration is automatic on applying for the Doctor of Philosophy in Mathematics, with a start date of October.
In the 'Research Proposal and Funding' section of your application, please specify the project title and supervisors of this project and copy the project description in the text box provided.
Please select 'No, I am not self-funding my research' when asked whether you are self-funding your research.
Please add 'PhD in Anomaly Detection for Large Complex Data' when asked 'Please provide the name of the funding you are applying for'.
We reserve the right to close applications early should sufficient applications be received.