I am a PhD student working on neural networks, with a background in data science and mathematics. I speak French and English fluently.
Generative Adversarial Networks for Rare Event Augmentation
Good-quality data are valuable but not always available. Collecting fit-for-purpose real-world data can be difficult, expensive, time-consuming, or simply impossible, and it raises problems of data privacy and security. This project will develop methods for generating synthetic data in data-limited situations, allowing the user to subsequently apply modern data-intensive techniques and hence make better decisions.
Our main tool will be Generative Adversarial Networks (GANs), which are paired neural networks that we will train to simulate rare events. GANs are a class of machine learning systems introduced by Goodfellow et al. in 2014. The key idea is that two neural networks compete against each other in a game, in the sense of mathematical game theory, and this competition improves the performance of both.
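For reference, the game can be written as the minimax objective given by Goodfellow et al. (2014). The symbols are introduced here only for illustration: D(x) is the discriminator's estimated probability that a sample x is real, G(z) is the generator's output for a noise input z, p_data is the distribution of the real data, and p_z is the noise distribution.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_{z}(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
```

The discriminator maximises V by scoring real samples high and generated samples low, while the generator minimises V by producing samples that the discriminator scores as real.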
The generator network tries to synthesise new data that looks like it comes from the same source as our original data set, while the discriminator network tries to spot the fakes among the originals. As they compete, both networks improve, ideally to the point where the generator can synthesise new data indistinguishable from the old. An important feature of this approach is that neural networks can encode extremely complex dependencies in the data, and thus pick up features the user is unaware of.
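The alternating competition described above can be sketched with a deliberately tiny example. This is a toy illustration under stated assumptions, not the project's method: the generator is a linear map G(z) = a*z + b, the discriminator is a logistic unit D(x) = sigmoid(w*x + c), the "real" data are draws from a one-dimensional Gaussian, and gradients are written out by hand using only the Python standard library. The function name `train_toy_gan` and all of its parameters are hypothetical. The generator step uses the non-saturating form (maximise log D(G(z))), a common alternative to minimising log(1 - D(G(z))).

```python
import math
import random


def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(real_mean=4.0, real_std=0.5, steps=2000, lr=0.01, seed=0):
    """Alternate single-sample gradient steps for a 1-D linear GAN (toy sketch).

    Generator:      G(z) = a * z + b, with z ~ N(0, 1)
    Discriminator:  D(x) = sigmoid(w * x + c)
    """
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator parameters (assumed starting values)
    w, c = 0.0, 0.0   # discriminator parameters (assumed starting values)

    for _ in range(steps):
        x = rng.gauss(real_mean, real_std)   # one real sample
        z = rng.gauss(0.0, 1.0)              # one noise sample
        g = a * z + b                        # one generated (fake) sample

        # Discriminator step: gradient ascent on log D(x) + log(1 - D(g)).
        d_real = sigmoid(w * x + c)
        d_fake = sigmoid(w * g + c)
        w += lr * ((1.0 - d_real) * x - d_fake * g)
        c += lr * ((1.0 - d_real) - d_fake)

        # Generator step: gradient ascent on log D(G(z)), using the updated D.
        d_fake = sigmoid(w * g + c)
        grad_g = (1.0 - d_fake) * w          # derivative of log D(g) w.r.t. g
        a += lr * grad_g * z
        b += lr * grad_g

    return a, b, w, c
```

Calling `train_toy_gan()` returns the four learned parameters. A discriminator this simple can only compare locations along the line, so the sketch shows the update pattern rather than a generator capable of matching a complex distribution.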
The aim of the project is to extend the application of GANs to rare event augmentation through an intermediate rare event simulator, allowing a combination of supervised and unsupervised learning. DCWW already use a SMOTE system (Chawla et al., 2002) to address class imbalance problems, and this system will be available as a benchmark for the performance of the GAN data augmentation methodology.
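To make the benchmark concrete, the core idea of SMOTE is to create synthetic minority-class points by interpolating between a minority sample and one of its nearest minority-class neighbours. The sketch below is a simplified illustration of that idea, not DCWW's system and not the full algorithm of Chawla et al.: it picks base samples at random rather than looping over every minority sample, and the function name `smote_sketch` and its parameters are hypothetical. It assumes at least two minority samples and Python 3.8 or later (for `math.dist`).

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)   # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # The k nearest minority-class neighbours of base, excluding base itself.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(p, base))
        neighbour = rng.choice(others[:k])
        # Random point on the line segment between base and the chosen neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(bv + gap * (nv - bv) for bv, nv in zip(base, neighbour))
        )
    return synthetic


# Usage: oversample four rare-event points in two dimensions.
rare_events = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(rare_events, n_synthetic=10, k=2)
```

Because every synthetic point lies on a segment between two existing minority samples, this kind of interpolation cannot produce points outside the region spanned by the observed rare events, which is one of the contrasts a GAN-based benchmark comparison would examine.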