Statistics Research Group

The group is very active both in applications of statistical techniques and in theory.

The main areas of research within the current group are:

  • time series analysis
  • multivariate data analysis
  • applications to market research
  • search algorithms and stochastic global optimisation
  • probabilistic number theory
  • optimal experimental design
  • stochastic processes and random fields with weak and strong dependence
  • diffusion processes and PDE with random data
  • anomalous diffusion
  • Burgers and KPZ turbulence, fractional ordinary and partial differential equations, and statistical inference with higher-order information
  • extreme value analysis.

Various topics in fisheries and medical statistics are also considered, such as errors-in-variables regression.

Collaborations

Statisticians within the School have been prominent in collaborating with researchers in other disciplines. There are strong links with:

  • the School of Medicine, working on applications of multivariate statistics and time series analysis in bioinformatics
  • the School of Engineering, in the areas of image processing and stochastic global optimisation of complex systems
  • the Business School, in the field of analysis of economics time series.

Ongoing international collaborations exist with many universities, including Columbia, Taiwan, Queensland, Aarhus, Roma, Cleveland, Pau, Hokkaido, Boston, Caen, Calabria, Maine, Trento, Nice, Bratislava, Linz, St. Petersburg, Troyes, Vilnius, Siegen, Mannheim, and Copenhagen.

Industrial sponsorship

Significant industrial sponsorship has been obtained from:

  • Procter and Gamble (USA) working on statistical modelling in market research
  • the Biometrics unit of SmithKline Beecham collaborating on different aspects of pharmaceutical statistics
  • ACNielsen/BASES (USA) on applications of mixed Poisson models in studying marketing consumer behaviour
  • General Electric HealthCare on environmental statistics.

In focus

Time series analysis

In recent years a powerful technique of time series analysis has been developed and applied to many practical problems. This technique is based on the singular value decomposition of the so-called trajectory matrix, obtained from the initial time series by the method of delays. It aims to expand the original time series into a sum of a small number of 'independent' and 'interpretable' components.
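The steps described above (trajectory matrix by the method of delays, singular value decomposition, and reconstruction of additive components) can be sketched as follows. This is a minimal illustration, not the group's implementation: the function name `ssa_components`, the parameters `window` and `n_components`, and the use of diagonal averaging to map each rank-one matrix back to a series are assumptions made for the example.

```python
import numpy as np

def ssa_components(series, window, n_components):
    """Split a series into additive components via a basic SVD-of-trajectory-matrix scheme."""
    x = np.asarray(series, dtype=float)
    n = x.size
    k = n - window + 1
    # Trajectory matrix by the method of delays: column j holds x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    components = []
    for i in range(n_components):
        # Rank-one elementary matrix for the i-th singular triple.
        elem = s[i] * np.outer(u[:, i], vt[i])
        # Diagonal averaging: average entries with equal (row + column) index
        # to turn the matrix back into a series of length n.
        comp = np.array([np.mean(elem[::-1].diagonal(d))
                         for d in range(-window + 1, k)])
        components.append(comp)
    return np.array(components)

# Hypothetical example series: linear trend plus a period-12 oscillation.
t = np.arange(60)
x = 0.05 * t + np.sin(2 * np.pi * t / 12)
comps = ssa_components(x, window=12, n_components=12)
```

Summing all components returns the original series; in practice one would keep and group only the leading few, which is where the 'interpretable' components come from.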

Also, spatial analogues of the popular ARMA-type stochastic time series have been developed, based on fractional generalizations of the Laplacian with two fractal indices. These models describe important features of anomalous diffusion processes, such as strong dependence and/or intermittency.

Multivariate statistics

The objective is to develop a methodology for the exploratory analysis of temporal-spatial data of complex structure, with the final aim of constructing suitable parametric models.

The applications include various medical, biological, engineering and economic data. Several market research projects in which the development of statistical models was a substantial part have taken place.

Stochastic global optimisation

Let ƒ be a function given on a d-dimensional compact set X and belonging to a suitable functional class F of multiextremal continuous functions.

We consider the problem of minimising ƒ, that is, of approximating a point x' such that ƒ(x') = min ƒ(x) over x in X, using evaluations of ƒ at specially selected points.
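As a point of reference, the simplest stochastic approach to this problem is pure random search: evaluate ƒ at points drawn uniformly from X and keep the best one. The sketch below assumes X is a box given by per-coordinate bounds, and uses the Rastrigin function as a stand-in multiextremal ƒ. Both choices, and the names `pure_random_search` and `rastrigin`, are illustrative assumptions rather than methods described on this page.

```python
import math
import random

def rastrigin(x):
    """A standard multiextremal test function with global minimum 0 at the origin."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)

def pure_random_search(f, bounds, n_evals, seed=0):
    """Approximate a minimiser of f over a box by uniform random sampling."""
    rng = random.Random(seed)
    best_x, best_val = None, float("inf")
    for _ in range(n_evals):
        # Draw one point uniformly from the box X.
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        val = f(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

bounds = [(-5.12, 5.12), (-5.12, 5.12)]
x_small, f_small = pure_random_search(rastrigin, bounds, n_evals=200)
x_large, f_large = pure_random_search(rastrigin, bounds, n_evals=2000)
```

With a fixed seed the longer run revisits the same first 200 points, so its record value can only improve. More refined methods differ mainly in how the next evaluation points are selected.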

Probabilistic methods in search and number theory

Several interesting features of the accuracy of Diophantine approximations can be expressed in probabilistic terms.

Many Diophantine approximation algorithms produce a sequence of sets F(n), indexed by n, of rational numbers p/q in [0,1]. Famous examples of F(n) are the Farey sequence, the collection of rationals p/q in [0,1] with q ≤ n, and the collection of all n-th continued fraction convergents.
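To make the first example concrete, the Farey sequence of order n lists, in increasing order, the reduced fractions in [0,1] with denominator at most n. A short sketch using the classical next-term recurrence is given below; the function name `farey_sequence` is chosen for this illustration.

```python
from fractions import Fraction

def farey_sequence(n):
    """Return the Farey sequence of order n as a list of Fractions in increasing order."""
    # (a/b, c/d) are two consecutive terms; start from 0/1 and 1/n.
    a, b, c, d = 0, 1, 1, n
    terms = [Fraction(a, b)]
    while c <= n:
        # Next-term recurrence for consecutive Farey fractions.
        k = (n + b) // d
        a, b, c, d = c, d, k * c - a, k * d - b
        terms.append(Fraction(a, b))
    return terms

f5 = farey_sequence(5)
# f5 is 0, 1/5, 1/4, 1/3, 2/5, 1/2, 3/5, 2/3, 3/4, 4/5, 1
```

Neighbouring terms a/b < c/d always satisfy bc - ad = 1, which is one reason the gaps between consecutive terms are convenient to study.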

Stochastic processes

New classes of stochastic processes with Student distributions and various types of dependence structure have been introduced and studied. A particular motivation is the modelling of risk assets with strong dependence through fractal activity time.

The asymptotic theory of estimation of parameters of stochastic processes and random fields has been developed using higher-order information (that is, information on the higher-order cumulant spectra). This theory allows analysis of non-linear and non-Gaussian models with both short- and long-range dependence.

Burgers turbulence problem

Explicit analytical solutions of the Burgers equation with quadratic potential have been derived and used to obtain scaling-law results for the Burgers turbulence problem with quadratic potential and random initial conditions of Ornstein-Uhlenbeck type driven by Lévy noise.

The results have considerable potential for stochastic modelling of observational series from a wide range of fields, such as turbulence or anomalous diffusion.

Topics in medical statistics

Topics in medical statistics currently researched in Cardiff include time-specific reference ranges and errors-in-variables regression. Current research focuses on the search for a unified methodology and approach to the errors-in-variables problem.

Extreme value analysis

Extreme value analysis is a branch of probability and statistics that provides non-parametric procedures for extrapolating beyond the range of the data. The aim is to extrapolate as well as the quality of the data allows, and knowing the limits of such extrapolation is itself an important issue. Its methods are particularly relevant for institutions exposed to high risks, such as financial services and insurance companies or environmental engineering institutions.

Group leader

Professor Anatoly Zhigljavsky

Chair in Statistics

Email:
zhigljavskyaa@cardiff.ac.uk
Telephone:
+44 (0)29 2087 5076

Academic staff

Dr Andreas Artemiou

Lecturer

Email:
artemioua@cardiff.ac.uk
Telephone:
+44 (0)29 2087 0616

Dr Bertrand Gauthier

Lecturer

Email:
gauthierb@cardiff.ac.uk
Telephone:
+44 (0)29 2087 5544

Dr Jonathan Gillard

Reader in Statistics
Director of Admissions

Email:
gillardjw@cardiff.ac.uk
Telephone:
+44 (0)29 2087 0619

Professor Nikolai Leonenko

Professor

Email:
leonenkon@cardiff.ac.uk
Telephone:
+44 (0)29 2087 5521

Dr Andrey Pepelyshev

Senior Lecturer

Email:
pepelyshevan@cardiff.ac.uk
Telephone:
+44 (0)29 2087 5530

Dr Kirstin Strokorb

Lecturer

Email:
strokorbk@cardiff.ac.uk
Telephone:
+44 (0)29 2068 8833

All seminars will commence at 12:10pm in room M/0.34, The Mathematics Building, Cardiff University, Senghennydd Road (unless otherwise stated).

Please contact Dr Timm Oertel for more details regarding Operational Research/WIMCS lectures and Dr Andrey Pepelyshev for more details regarding Statistics lectures.

Seminars

Date / Speaker / Seminar

21 October 2019

Tri-Dung Nguyen (University of Southampton)

To be confirmed

7 October 2019

George Loho (LSE)

To be confirmed

8 April 2019

Dr. Andreas Anastasiou (LSE)

Detecting multiple generalized change-points by isolating single ones

In this talk, we introduce a new approach, called Isolate-Detect (ID), for the consistent estimation of the number and location of multiple generalized change-points in noisy data sequences. Examples of signal changes that ID can deal with are changes in the mean of a piecewise-constant signal and changes in the trend, accompanied by discontinuities or not, in the piecewise-linear model. The method is based on an isolation technique, which prevents the consideration of intervals that contain more than one change-point. This isolation enhances ID’s accuracy as it allows for detection in the presence of frequent changes of possibly small magnitudes. Thresholding and model selection through an information criterion are the two stopping rules described in the talk. A hybrid of both criteria leads to a general method with very good practical performance and minimal parameter choice. Applications of our method to simulated and real-life data sets show its very good performance in both accuracy and speed. The R package IDetect implementing the Isolate-Detect method is available from CRAN.

1 April 2019

Stephen Disney (Cardiff University)

When the Bullwhip Effect is an Increasing Function of the Lead Time

We study the relationship between lead times and the bullwhip effect produced by the order-up-to policy. The usual conclusion in the literature is that longer lead times increase the bullwhip effect; we show that this is not always the case. Indeed, it seems to be rather rare. We achieve this by first showing that a positive demand impulse response leads to a bullwhip effect that is always increasing in the lead time when the order-up-to policy is used to make supply chain inventory replenishment decisions. By using the zeros and poles of the z-transform of the demand process, we reveal when this demand impulse is positive. To make our approach concrete, we study the ARMA(2,2) demand process as a nontrivial example.

22 March 2019

Martina Testori (University of Southampton)

How group composition affects cooperation in fixed networks: can psychopathic traits influence group dynamics?

Static networks have been shown to foster cooperation for specific cost-benefit ratios and numbers of connections across a series of interactions. At the same time, psychopathic traits have been discovered to predict defective behaviours in game theory scenarios. This experiment combines these two aspects to investigate how group cooperation can emerge when changing group compositions based on psychopathic traits. We implemented a modified version of the Prisoner’s Dilemma game which has been demonstrated theoretically and empirically to sustain a constant level of cooperation over rounds. A sample of 190 undergraduate students played in small groups where the percentage of psychopathic traits in each group was manipulated. Groups entirely composed of low psychopathic individuals were compared to communities with 50% high and 50% low psychopathic players, to observe the behavioural differences at the group level. Results showed a significant divergence of the mean cooperation of the two conditions, regardless of the small range of participants’ psychopathy scores. Groups with a large density of high psychopathic subjects cooperated significantly less than groups entirely composed of low psychopathic players, confirming our hypothesis that psychopathic traits affect not only individuals’ decisions but also the group behaviour. This experiment highlights how differences in group composition with respect to psychopathic traits can have a significant impact on group dynamics, and it emphasizes the importance of individual characteristics when investigating group behaviours.

18 March 2019

Joe Paat

The proximity function for IPs

Proximity between an integer program (IP) and a linear program (LP) measures the distance between an optimal IP solution and the closest optimal LP solution. In this talk, we consider proximity as a function that depends on the right-hand side vector of the IP and LP. We analyze how this proximity function is distributed and create a spectrum of probabilistic-like results regarding its value. This work uses ideas from group theory and Ehrhart theory, and it improves upon a recent result of Eisenbrand and Weismantel in the average case. This is joint work with Timm Oertel and Robert Weismantel.

15 March 2019

Prof Philip Broadbridge (La Trobe University)

Shannon entropy as a diagnostic tool for PDEs in conservation form

After normalization, an evolving real non-negative function may be viewed as a probability density. From this we may derive the corresponding evolution law for Shannon entropy. Parabolic equations, hyperbolic equations and fourth-order “diffusion” equations evolve information in quite different ways. Entropy and irreversibility can be introduced in a self-consistent manner and at an elementary level by reference to some simple evolution equations such as the linear heat equation. It is easily seen that the 2nd law of thermodynamics is equivalent to loss of Shannon information when temperature obeys a general nonlinear 2nd order diffusion equation. With the constraint of prescribed variance, this leads to the central limit theorem.

With fourth order diffusion terms, new problems arise. We know from applications such as thin film flow and surface diffusion, that fourth order diffusion terms may generate ripples and they do not satisfy the Second Law. Despite this, we can identify the class of fourth order quasilinear diffusion equations that increase the Shannon entropy.

4 March 2019

Dr. Emrah Demir (Cardiff Business School)

Creating Green Logistics Value through Operational Research

Green logistics is related to producing and dispatching goods in a sustainable way, while paying attention to environmental factors. In a green context, the objectives are not only based on economic considerations, but also aim at minimising other detrimental effects on society and on the environment. A conventional focus in planning the associated activities, particularly for freight transportation, is to reduce expenses and, consequently, increase profitability by considering internal transportation costs. With an ever-growing concern about the environment by governments, markets, and other private entities worldwide, organizations have started to realize the importance of the environmental and social impacts of transportation on other parties and on society.

Efficient planning of freight transportation activities requires a comprehensive look at a wide range of factors in the operation and management of transportation to achieve safe, fast, and environmentally suitable movement of goods. Over the years, the minimization of the total travelled distance has been accepted as the most important objective in the field of vehicle routing and intermodal transportation. However, the interaction of operational research with mechanical and traffic engineering shows that there exist factors which are critical to explain fuel consumption. This triggered the birth of the green vehicle routing and green intermodal studies in operational research. In recent years, the number, quality and the flexibility of the models have increased considerably. This talk will discuss green vehicle routing and green intermodal transportation problems along with models and algorithms which truly represent the characteristics of green logistics.

25 February 2019

Oded Lachish (Birkbeck, University of London)

Smart queries versus property independent queries

In the area of property testing, a central goal is to design algorithms, called tests, that decide, with high probability, whether a word over a finite alphabet is in a given property or far from the property. A property is a subset of all the possible words over the alphabet. For instance, the word can be a book, and the property can be the set of all the books that are written in English - a book is 0.1 far from being written in English if at least 0.1 of its words are not in English. The 0.1 is called the distance parameter and it can be any value in [0,1]. The input of a test is the distance parameter, the length of the input word and access to an oracle that answers queries of the sort: please give me the i'th letter in the word.

The quality of a test is measured by its query complexity, which is the maximum number of queries it uses as a function of the input word length and the distance parameter; ideally this number does not depend on the input length. Tests that achieve this ideal have been discovered for numerous properties. In general, tests that achieve the ideal for different properties differ in the manner in which they select their queries. That is, the choice of queries depends on the property.

In this talk, we will see that for the price of a significant increase in the number of queries it is possible to get rid of this dependency. We will also give scenarios in which this trade-off is beneficial.

18 February 2019 (Time 13:10 - 14:00)

Prof. Gilles Stupfler (University of Nottingham)

Asymmetric least squares techniques for extreme risk estimation

Financial and actuarial risk assessment is typically based on the computation of a single quantile (or Value-at-Risk). One drawback of quantiles is that they only take into account the frequency of an extreme event, and in particular do not give an idea of what the typical magnitude of such an event would be. Another issue is that they do not induce a coherent risk measure, which is a serious concern in actuarial and financial applications. In this talk, I will explain how, starting from the formulation of a quantile as the solution of an optimisation problem, one may come up with two alternative families of risk measures, called expectiles and extremiles. I will give a broad overview of their properties, as well as of their estimation at extreme levels in heavy-tailed models, and explain why they constitute sensible alternatives for risk assessment using some real data applications. This is based on joint work with Abdelaati Daouia, Irène Gijbels and Stéphane Girard.
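The optimisation viewpoint mentioned in the abstract can be illustrated for expectiles: a quantile minimises an asymmetrically weighted absolute loss, and replacing the absolute loss by a squared loss gives the expectile. The sketch below computes an ordinary sample expectile by iteratively reweighted averaging. It is only an illustration of the definition, not the extreme-level estimators discussed in the talk, and the function name `sample_expectile` and the simulated heavy-tailed data are assumptions.

```python
import numpy as np

def sample_expectile(x, tau, tol=1e-12, max_iter=1000):
    """Sample expectile at level tau in (0, 1) via asymmetric least squares."""
    x = np.asarray(x, dtype=float)
    q = x.mean()  # the expectile at tau = 0.5 is the mean
    for _ in range(max_iter):
        # Weight tau above the current value, 1 - tau at or below it.
        w = np.where(x > q, tau, 1.0 - tau)
        # The weighted mean solves the first-order condition for fixed weights.
        q_new = np.sum(w * x) / np.sum(w)
        if abs(q_new - q) <= tol:
            return q_new
        q = q_new
    return q

# Hypothetical heavy-tailed sample (Student t with 3 degrees of freedom).
rng = np.random.default_rng(0)
data = rng.standard_t(3, size=500)
e50 = sample_expectile(data, 0.5)
e95 = sample_expectile(data, 0.95)
```

Unlike a quantile, the resulting value depends on the magnitudes of the observations on both sides, not only on how many fall above it.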

21 January 2019

Stefano Coniglio (University of Southampton)

Bilevel programming and the computation of pessimistic single-leader-multi-follower equilibria in Stackelberg games

We give a very broad overview of bilevel programming problems and their relationship with Stackelberg games, with a focus on two classical limitations of this paradigm: the presence of a single follower and the assumption of optimism.

We then investigate the problem of computing an equilibrium in Stackelberg games with two or more noncooperating followers who react to the strategy chosen by the leader by playing a Nash equilibrium, focusing, in particular, on the pessimistic case where, if the followers' game (parameterized by the leader's strategy) admits multiple Nash equilibria, the followers choose one which minimizes the leader's utility.

We then address the case where the followers are restricted to pure strategies, illustrate some hardness and inapproximability results, and then concentrate on exact solution algorithms.

After proposing a single-level (but undecidable) reformulation for the problem, we propose an exact implicit enumeration algorithm capable of computing the supremum of the problem as well as an alpha-approximate strategy, for any nonnegative alpha.

Experimental results are presented and illustrated, showing the viability of our approach.

11 December 2018

Anatoly Zhigljavsky (Cardiff University)

Multivariate dispersion

3 December 2018

Dr Ilaria Prosdocimi (University of Bath)

Detecting coherent changes in flood risk in Great Britain

Flooding is a natural hazard which has affected the UK throughout history, with significant costs for both the development and maintenance of flood protection schemes and for the recovery of the areas affected by flooding. The recent large repeated floods in Northern England and other parts of the country raise the question of whether the risk of flooding is changing, possibly as a result of climate change, so that different strategies would be needed for the effective management of flood risk. To assess whether any change in flood risk can be identified, one would typically investigate the presence of some changing patterns in peak flow records for each station across the country. Nevertheless, the coherent detection of any clear pattern in the data is hindered by the limited sample size of the peak flow records, which typically cover about 45 years. We investigate the use of multi-level hierarchical models to better use the information available at all stations in a single model which can detect the presence of any sizeable change in the peak flow behaviour at a larger scale. Further, we also investigate the possibility of attributing any detected change to naturally varying climatological variables.

26 November 2018

Prof Benjamin Gess (Max Planck Institute)

Random dynamical systems for stochastic PDE with nonlinear noise

In this talk we will revisit the problem of generation of random dynamical systems by solutions to stochastic PDE. Despite being at the heart of a dynamical system approach to stochastic dynamics in infinite dimensions, most known results are restricted to stochastic PDE driven by affine linear noise, which can be treated via transformation arguments. In contrast, in this talk we will address instances of stochastic PDE with nonlinear noise, with particular emphasis on porous media equations driven by conservative noise. This class of stochastic PDE arises in particular in the analysis of stochastic mean curvature motion, mean field games with common noise and is linked to fluctuations in non-equilibrium statistical mechanics.

Past events

Past Seminars 2017-18

Past Seminars 2016-17

Past Seminars 2015-16