Dr Thomas Connor
- +44 (0)29 2087 4147
Throughout my career to date I have worked on projects featuring a wide array of organisms. This is principally because, rather than focusing on a particular organism, my research is based in the first instance around the fundamental narratives of pathogen evolution. The processes that underpin the diversity we see in the microbial world are often the same whatever species we look at, although the results they produce can be markedly different.
My research is made possible by a combination of technologies that enable us to explore organisms at a resolution that has never before been possible. Firstly, whole genome sequencing, combined with high-quality metadata, provides the datasets from which we can derive answers to the questions that we ask. Secondly, using computational and mathematical approaches, we are able to make sense of the "Big Data" challenge posed by the large, rich datasets that we produce.
Research within my group is therefore characterised by developing and applying population genomics, comparative genomics, and phylogenetics to elucidate the natural histories of microbial pathogens. In a number of cases we have developed tools or approaches to analyse our data. However, in all cases we start with the biological questions, and then develop the approaches needed to answer them. So it could be said that while what we do is broadly bioinformatics, the research focus is on the biology first, and the informatics provides the tools to unlock the data that we produce.
Population and Comparative Genomics
Whole genome sequences provide us with the complete blueprint for the organisms that we are investigating. To understand our organisms of interest, we consider how their genomes vary between organisms (comparative genomics) and how they have changed/evolved over time (population genomics).
Unlike eukaryotic organisms, bacteria have highly variable genomes; they can gain and lose genes at a very high frequency, and members of the same named species may have fewer than half of their genes in common. This genomic plasticity is hugely important, as the genes that vary between strains are often the genes that are associated with characteristics of interest – such as virulence or antimicrobial resistance. Using whole genome sequence data we perform comparative genomics to:
- work out how pathogens are related, in terms of the gene content they share
- work out how they vary in their gene content
- work out how their genetic variation relates to differences in their phenotype (basically their behaviour – such as the seriousness of disease that they cause)
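The first two of these questions can be illustrated with a minimal sketch. It assumes each genome has already been reduced to a set of gene names; the strain names, gene names, and function names below are hypothetical and chosen purely for illustration. It is not the group's own pipeline, only a toy example of splitting gene content into shared ("core") and variable ("accessory") genes and scoring how similar two strains are.

```python
from typing import Dict, Set, Tuple


def core_and_accessory(gene_content: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Split genes into core (present in every strain) and accessory (present in only some)."""
    if not gene_content:
        return set(), set()
    gene_sets = list(gene_content.values())
    core = set.intersection(*gene_sets)   # genes shared by all strains
    pan = set.union(*gene_sets)           # every gene seen in any strain
    return core, pan - core


def jaccard_similarity(a: Set[str], b: Set[str]) -> float:
    """Fraction of the combined gene content that two strains share."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical toy data: three strains and the genes each one carries.
gene_content = {
    "strain_A": {"gyrA", "recA", "rpoB", "toxin1"},
    "strain_B": {"gyrA", "recA", "rpoB", "blaTEM"},
    "strain_C": {"gyrA", "recA", "rpoB", "toxin1", "blaTEM"},
}

core, accessory = core_and_accessory(gene_content)
similarity_ab = jaccard_similarity(gene_content["strain_A"], gene_content["strain_B"])

print("core genes:", sorted(core))
print("accessory genes:", sorted(accessory))
print("A vs B gene-content similarity:", similarity_ab)
```

In this toy example the accessory genes are the ones that differ between strains, which is where characteristics such as virulence or resistance would be looked for.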
We complement comparative genomics with phylogenetics, which enables us to determine the relationships between isolates. By integrating the results from these in silico analyses with phenotypic data produced from in vitro and in vivo experimental work, we are able to derive a better understanding of how and why our organisms of interest cause disease.
While the comparative genomics work is focused on examining the similarities and differences between organisms, and how this relates to the phenotype of organisms, we supplement this by performing population genetic analyses to identify structure within the population, and to infer the recent evolutionary history of strains of interest. This work has been underpinned by a strong, longstanding collaboration with Professor Jukka Corander of the University of Helsinki, with whom I have developed a number of population genetic approaches to analyse bacterial genome-scale datasets (Cheng et al. 2011, Cheng et al. 2013, Marttinen et al. 2012).
I have developed considerable expertise in using these approaches, and to date I have applied them to datasets including those comprising Vibrio cholerae (Mutreja et al. 2011), Salmonella Typhimurium (Mather et al. 2013, Okoro et al. 2012) and Clostridium difficile (He et al. 2013). In these cases, using a population genetic framework called BEAST, we reconstructed the evolutionary history of these organisms not in evolutionary time, but in human-understandable calendar units (years or days). Using these data I have been able to contribute significantly to answering key questions about how and when outbreaks have begun, as well as being able to identify key events in the evolution of the pathogens of interest.
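The idea of placing evolutionary history on a calendar scale can be illustrated with a much simpler technique than BEAST: a root-to-tip regression, in which the genetic distance of each isolate from the root of a tree is regressed against its sampling date. This is a sketch under stated assumptions and not the Bayesian analysis used in the studies above; the function name and the sampling dates and distances are hypothetical toy values.

```python
from typing import Sequence, Tuple


def root_to_tip_regression(
    sampling_years: Sequence[float], root_to_tip: Sequence[float]
) -> Tuple[float, float]:
    """Least-squares fit of root-to-tip distance against sampling year.

    Returns (rate, root_year): the slope is an estimate of substitutions
    per site per year, and the x-intercept estimates the date of the root.
    """
    n = len(sampling_years)
    if n < 2 or n != len(root_to_tip):
        raise ValueError("need at least two isolates with matching dates and distances")
    mean_x = sum(sampling_years) / n
    mean_y = sum(root_to_tip) / n
    sxx = sum((x - mean_x) ** 2 for x in sampling_years)
    if sxx == 0:
        raise ValueError("all isolates share one sampling date; no temporal signal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sampling_years, root_to_tip))
    rate = sxy / sxx
    if rate <= 0:
        raise ValueError("non-positive slope; the data show no clock-like signal")
    intercept = mean_y - rate * mean_x
    root_year = -intercept / rate  # year at which the fitted distance is zero
    return rate, root_year


# Hypothetical toy data: isolates sampled in three different years, with
# root-to-tip distances in substitutions per site.
years = [2000.0, 2005.0, 2010.0]
distances = [0.010, 0.015, 0.020]

rate, root_year = root_to_tip_regression(years, distances)
print("estimated rate (substitutions/site/year):", rate)
print("estimated year of the root:", root_year)
```

A regression of this kind is typically used only as a quick check for temporal signal; a full Bayesian analysis additionally accounts for tree uncertainty and rate variation between lineages.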
Bacteria do not respect borders, and local outbreaks can, and sadly sometimes do, lead to global epidemics. By combining population genomic approaches with excellent metadata, we are able to move beyond simple dated phylogenies towards a greater understanding of how bacteria move in time and space. I have worked extensively on projects that have examined the phylogeography of bacterial pathogens such as Vibrio cholerae, Salmonella Typhimurium and Clostridium difficile, deploying approaches that combine strain metadata and genomic information to derive insight into how and when pathogens of interest have spread around the world.
Cheng L, Connor T R, Aanensen D M, Spratt B G and Corander J (2011) Bayesian semi-supervised classification of bacterial samples using MLST databases. BMC Bioinformatics 12 302.
Cheng L, Connor T R, Siren J, Aanensen D M and Corander J (2013) Hierarchical and spatially explicit clustering of DNA sequences with BAPS software. Mol Biol Evol 30 (5) 1224-1228.
Dziva F, Hauser H*, Connor T R*, van Diemen P M, Prescott G, Langridge G C, Eckert S, Chaudhuri R R, Ewers C, Mellata M, Mukhopadhyay S, Curtiss R, 3rd, Dougan G, Wieler L H, Thomson N R, Pickard D J and Stevens M P (2013) Sequencing and functional annotation of avian pathogenic Escherichia coli serogroup O78 strains reveal the evolution of E. coli lineages pathogenic for poultry via distinct mechanisms. Infect Immun 81 (3) 838-849.
Fookes M, Schroeder G N, Langridge G C, Blondel C J, Mammina C, Connor T R, Seth-Smith H, Vernikos G S, Robinson K S, Sanders M, Petty N K, Kingsley R A, Baumler A J, Nuccio S P, Contreras I, Santiviago C A, Maskell D, Barrow P, Humphrey T, Nastasi A, Roberts M, Frankel G, Parkhill J, Dougan G and Thomson N R (2011) Salmonella bongori provides insights into the evolution of the Salmonellae. PLoS Pathog 7 (8) e1002191.
He M, Miyajima F, Roberts P, Ellison L, Pickard D J, Martin M J, Connor T R, Harris S R, Fairley D, Bamford K B, D'Arc S, Brazier J, Brown D, Coia J E, Douce G, Gerding D, Kim H J, Koh T H, Kato H, Senoh M, Louie T, Michell S, Butt E, Peacock S J, Brown N M, Riley T, Songer G, Wilcox M, Pirmohamed M, Kuijper E, Hawkey P, Wren B W, Dougan G, Parkhill J and Lawley T D (2013) Emergence and global spread of epidemic healthcare-associated Clostridium difficile. Nat Genet 45 (1) 109-113.
Marttinen P, Hanage W P, Croucher N J, Connor T R, Harris S R, Bentley S D and Corander J (2012) Detection of recombination events in bacterial genomes from large population samples. Nucleic Acids Res 40 (1) e6.
Mather A E, Reid S W, Maskell D J, Parkhill J, Fookes M C, Harris S R, Brown D J, Coia J E, Mulvey M R, Gilmour M W, Petrovska L, de Pinna E, Kuroda M, Akiba M, Izumiya H, Connor T R, Suchard M A, Lemey P, Mellor D J, Haydon D T and Thomson N R (2013) Distinguishable epidemics of multidrug-resistant Salmonella Typhimurium DT104 in different hosts. Science 341 (6153) 1514-1517.
Mutreja A, Kim D W, Thomson N R, Connor T R, Lee J H, Kariuki S, Croucher N J, Choi S Y, Harris S R, Lebens M, Niyogi S K, Kim E J, Ramamurthy T, Chun J, Wood J L, Clemens J D, Czerkinsky C, Nair G B, Holmgren J, Parkhill J and Dougan G (2011) Evidence for several waves of global transmission in the seventh cholera pandemic. Nature 477 (7365) 462-465.
Okoro C K, Kingsley R A, Connor T R, Harris S R, Parry C M, Al-Mashhadani M N, Kariuki S, Msefula C L, Gordon M A, de Pinna E, Wain J, Heyderman R S, Obaro S, Alonso P L, Mandomando I, MacLennan C A, Tapia M D, Levine M M, Tennant S M, Parkhill J and Dougan G (2012) Intracontinental spread of human invasive Salmonella Typhimurium pathovariants in sub-Saharan Africa. Nat Genet 44 (11) 1215-1221.