Professor Thomas Connor


School of Biosciences

+44 (0)29 2087 4147
Sir Martin Evans Building, Museum Avenue, Cardiff, CF10 3AX
Media commentator
Available for postgraduate supervision


The research in my lab seeks to understand pathogen variation in order to answer a range of questions, from how pathogens evolve to how and why they spread in local and global outbreaks. These questions focus on a group of gastrointestinal pathogens (E. coli, Salmonella, Shigella, C. difficile) which collectively account for over a billion cases of disease around the world every year. In addition to my work on gastrointestinal pathogens, I also have a strong interest in antimicrobial resistance (particularly amongst the ESKAPE pathogens) and viral evolution (in HIV and influenza).

My research is underpinned by whole genome sequencing, using the data from organisms' genomes to work out how they are related to, and different from, other organisms of interest. This work is heavily computational, and we use and develop mathematical and computational approaches to analyse and interpret the "Big Biological Data" that we and our collaborators generate. This work is also readily translatable, and I currently work closely with the NHS to translate the approaches we develop into diagnostic and surveillance tools that can be used at a local and national level.

Alongside my pathogen research, I also focus on the design and development of research infrastructures to support the analysis of the genomic datasets that we produce. This work has included the design and development of national computational resources in the UK, as well as the design of computational resources used in translational settings in the NHS and industry.

Microbiomes, Microbes and Informatics

The Connor group is part of the recently formed Microbiomes, Microbes and Informatics (MMI) group (webpage under development). The MMI group currently comprises the research groups of Thomas Connor, Esh Mahenthiralingam, Julian Marchesi and Andrew Weightman, and has over 25 active research staff and postgraduate students.

The MMI group is highly research-active, generating over £3.5 million in grant income between 2010 and 2017 and publishing extensively in top journals (cumulative h index > 150, > 400 publications, and > 25,000 citations; source

The four current MMI staff recently moved (June 2017) to a single shared location within a new £1.6 million refurbished area of the Sir Martin Evans Building. This comprises a large class II certified research laboratory, equipment and tissue culture rooms, a group office area and academic offices. The MMI group welcomes approaches by potential fellowship applicants and funded PhD students to host their research and expand our strategic research on Microbiomes, Microbes and Informatics.


Module Leader: BI3252 The ‘omics revolution (Bioinformatics & Functional Genomics)

Biocomputing Research Hub lead

Member of College of Biomedical and Life Sciences Data Strategy Group

Member of Supercomputing Wales Infrastructure Committee

Member of Cardiff Supercomputing Facility Oversight Group

Wales regional lead and technical lead for the Cloud Infrastructure for Microbial Bioinformatics

Bioinformatics Lead for the Public Health Wales Pathogen Genomics Unit.

Interested in joining my lab as a self-funded postgraduate student or a postdoc/fellow? Please contact me by email.

Population and Comparative Genomics

Whole genome sequences provide us with the complete blueprint for the organisms that we are investigating. To understand our organisms of interest, we consider how their genomes vary between organisms (comparative genomics) and how they have changed/evolved over time (population genomics).

Unlike eukaryotic organisms, bacteria have highly variable genomes; they can gain and lose genes at a very high frequency, and members of the same named species may have fewer than half of their genes in common. This genomic plasticity is hugely important, as the genes that vary between strains are often the genes that are associated with characteristics of interest, such as virulence or antimicrobial resistance. Using whole genome sequence data we perform comparative genomics to:

  • work out how pathogens are related, in terms of the gene content they share
  • work out how they vary in their gene content
  • work out how their genetic variation relates to differences in their phenotype (basically their behaviour – such as the seriousness of disease that they cause)
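As a rough illustration of the first two points, gene content can be compared by treating each isolate as a set of genes. The sketch below is a minimal example under stated assumptions, not the lab's actual pipeline: the isolate names and gene lists are invented for illustration, and the function names (`core_genes`, `accessory_genes`, `jaccard`) are hypothetical. Genes present in every isolate are treated as shared ("core") content, the remainder as variable ("accessory") content, and the Jaccard index gives a simple pairwise measure of shared gene content.

```python
from itertools import combinations

# Hypothetical gene content for three isolates (illustrative data only).
gene_content = {
    "isolate_A": {"gyrA", "recA", "rpoB", "stx2", "blaCTX-M"},
    "isolate_B": {"gyrA", "recA", "rpoB", "stx2"},
    "isolate_C": {"gyrA", "recA", "rpoB", "tetA"},
}


def core_genes(content):
    """Genes present in every isolate."""
    return set.intersection(*content.values())


def accessory_genes(content):
    """Genes present in at least one isolate, but not in all of them."""
    return set.union(*content.values()) - core_genes(content)


def jaccard(a, b):
    """Shared genes divided by total distinct genes across two isolates."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_similarity(content):
    """Jaccard similarity for every pair of isolates, keyed by name pair."""
    return {
        (x, y): jaccard(content[x], content[y])
        for x, y in combinations(sorted(content), 2)
    }


if __name__ == "__main__":
    print("core:", sorted(core_genes(gene_content)))
    print("accessory:", sorted(accessory_genes(gene_content)))
    for pair, score in pairwise_similarity(gene_content).items():
        print(pair, round(score, 2))
```

In practice the gene sets would come from annotated genome assemblies rather than hand-written lists, and relating gene content to phenotype (the third point) needs a statistical association step that this sketch does not attempt.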

We complement comparative genomics with phylogenetics, which enables us to determine the relationships between isolates. By integrating the results from these in silico analyses with phenotypic data produced from in vitro and in vivo experimental work, we are able to derive a better understanding of how and why our organisms of interest cause disease.

While the comparative genomics work focuses on the similarities and differences between organisms, and how these relate to phenotype, we supplement it with population genetic analyses to identify structure within the population and to infer the recent evolutionary history of strains of interest. This work has been underpinned by a strong, longstanding collaboration with Professor Jukka Corander of the University of Helsinki, with whom I have developed a number of population genetic approaches to analyse bacterial genome-scale datasets (Cheng et al. 2011, Cheng et al. 2013, Marttinen et al. 2012).

I have developed considerable expertise in these approaches, and to date I have applied them to datasets including those comprising Vibrio cholerae (Mutreja et al. 2011), Salmonella Typhimurium (Mather et al. 2013, Okoro et al. 2012) and Clostridium difficile (He et al. 2013). In these cases, using a population genetic framework called BEAST, we reconstructed the evolutionary history of these organisms not in evolutionary time, but in human-understandable calendar units (years or days). Using these data I have been able to contribute significantly to answering key questions about how and when outbreaks have begun, as well as identifying key events in the evolution of the pathogens of interest.
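To give a feel for how genetic divergence can be placed on a calendar scale, the sketch below uses root-to-tip regression, a much simpler technique than the Bayesian inference performed by BEAST and not a substitute for it. It assumes a strict molecular clock, and every number in it is synthetic: the sampling years, the distances, and the function name `root_to_tip_regression` are all invented for illustration. Each isolate's distance from the root of the tree is regressed against its sampling date; the slope estimates the substitution rate, and the date at which the fitted line reaches zero distance estimates when the root (the most recent common ancestor) existed.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance against sampling date.

    Returns (rate, root_date): the slope in substitutions per site per
    year, and the date at which the fitted line crosses zero distance.
    """
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    rate = sxy / sxx
    intercept = mean_y - rate * mean_x
    root_date = -intercept / rate
    return rate, root_date


# Synthetic example: isolates sampled in four different years, with
# distances generated from an assumed rate of 2e-6 substitutions per
# site per year and an assumed root in 1990.
sampling_years = [2000, 2005, 2010, 2015]
root_to_tip = [2e-6 * (year - 1990) for year in sampling_years]

rate, root_date = root_to_tip_regression(sampling_years, root_to_tip)

if __name__ == "__main__":
    print(f"estimated rate: {rate:.2e} substitutions/site/year")
    print(f"estimated root date: {root_date:.1f}")
```

Real datasets are noisier than this, which is one reason full probabilistic frameworks are used to estimate dates together with their uncertainty.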


Bacteria do not respect borders, and local outbreaks can, and sadly sometimes do, lead to global epidemics. By combining population genomic approaches with excellent metadata, we are able to move beyond simple dated phylogenies towards a greater understanding of how bacteria move in time and space. I have worked extensively on projects that have examined the phylogeography of bacterial pathogens such as Vibrio cholerae, Salmonella Typhimurium and Clostridium difficile, deploying approaches that combine strain metadata and genomic information to derive insight into how and when pathogens of interest have spread around the world.


Cheng L, Connor T R, Aanensen D M, Spratt B G and Corander J (2011) Bayesian semi-supervised classification of bacterial samples using MLST databases. BMC Bioinformatics 12 302.

Cheng L, Connor T R, Siren J, Aanensen D M and Corander J (2013) Hierarchical and spatially explicit clustering of DNA sequences with BAPS software. Mol Biol Evol 30 (5) 1224-1228.

Dziva F, Hauser H*, Connor T R*, van Diemen P M, Prescott G, Langridge G C, Eckert S, Chaudhuri R R, Ewers C, Mellata M, Mukhopadhyay S, Curtiss R, 3rd, Dougan G, Wieler L H, Thomson N R, Pickard D J and Stevens M P (2013) Sequencing and functional annotation of avian pathogenic Escherichia coli serogroup O78 strains reveal the evolution of E. coli lineages pathogenic for poultry via distinct mechanisms. Infect Immun 81 (3) 838-849.

Fookes M, Schroeder G N, Langridge G C, Blondel C J, Mammina C, Connor T R, Seth-Smith H, Vernikos G S, Robinson K S, Sanders M, Petty N K, Kingsley R A, Baumler A J, Nuccio S P, Contreras I, Santiviago C A, Maskell D, Barrow P, Humphrey T, Nastasi A, Roberts M, Frankel G, Parkhill J, Dougan G and Thomson N R (2011) Salmonella bongori provides insights into the evolution of the Salmonellae. PLoS Pathog 7 (8) e1002191.

He M, Miyajima F, Roberts P, Ellison L, Pickard D J, Martin M J, Connor T R, Harris S R, Fairley D, Bamford K B, D'Arc S, Brazier J, Brown D, Coia J E, Douce G, Gerding D, Kim H J, Koh T H, Kato H, Senoh M, Louie T, Michell S, Butt E, Peacock S J, Brown N M, Riley T, Songer G, Wilcox M, Pirmohamed M, Kuijper E, Hawkey P, Wren B W, Dougan G, Parkhill J and Lawley T D (2013) Emergence and global spread of epidemic healthcare-associated Clostridium difficile. Nat Genet 45 (1) 109-113.

Marttinen P, Hanage W P, Croucher N J, Connor T R, Harris S R, Bentley S D and Corander J (2012) Detection of recombination events in bacterial genomes from large population samples. Nucleic Acids Res 40 (1) e6.

Mather A E, Reid S W, Maskell D J, Parkhill J, Fookes M C, Harris S R, Brown D J, Coia J E, Mulvey M R, Gilmour M W, Petrovska L, de Pinna E, Kuroda M, Akiba M, Izumiya H, Connor T R, Suchard M A, Lemey P, Mellor D J, Haydon D T and Thomson N R (2013) Distinguishable epidemics of multidrug-resistant Salmonella Typhimurium DT104 in different hosts. Science 341 (6153) 1514-1517.

Mutreja A, Kim D W, Thomson N R, Connor T R, Lee J H, Kariuki S, Croucher N J, Choi S Y, Harris S R, Lebens M, Niyogi S K, Kim E J, Ramamurthy T, Chun J, Wood J L, Clemens J D, Czerkinsky C, Nair G B, Holmgren J, Parkhill J and Dougan G (2011) Evidence for several waves of global transmission in the seventh cholera pandemic. Nature 477 (7365) 462-465.

Okoro C K, Kingsley R A, Connor T R, Harris S R, Parry C M, Al-Mashhadani M N, Kariuki S, Msefula C L, Gordon M A, de Pinna E, Wain J, Heyderman R S, Obaro S, Alonso P L, Mandomando I, MacLennan C A, Tapia M D, Levine M M, Tennant S M, Parkhill J and Dougan G (2012) Intracontinental spread of human invasive Salmonella Typhimurium pathovariants in sub-Saharan Africa. Nat Genet 44 (11) 1215-1221.