Statistics Research Group

The group is active both in applications of statistical techniques and in theory.

The main areas of research within the current group are:

  • time series analysis
  • multivariate data analysis
  • applications to market research
  • search algorithms and stochastic global optimisation
  • probabilistic number theory
  • optimal experimental design
  • stochastic processes and random fields with weak and strong dependence
  • diffusion processes and PDEs with random data
  • anomalous diffusion
  • Burgers and KPZ turbulence
  • fractional ordinary and partial differential equations, and statistical inference with higher-order information
  • extreme value analysis.

Various topics in fisheries and medical statistics are also considered, such as errors in variables regression.

Collaborations

Statisticians within the School have been prominent in collaborating with researchers in other disciplines. There are strong links with:

  • the School of Medicine, working on applications of multivariate statistics and time series analysis in bioinformatics
  • the School of Engineering, in the areas of image processing and stochastic global optimisation of complex systems
  • the Business School, in the analysis of economic time series.

Ongoing international collaborations exist with many universities, including Columbia, Taiwan, Queensland, Aarhus, Roma, Cleveland, Pau, Hokkaido, Boston, Caen, Calabria, Maine, Trento, Nice, Bratislava, Linz, St. Petersburg, Troyes, Vilnius, Siegen, Mannheim, and Copenhagen.

Industrial sponsorship

Significant industrial sponsorship has been obtained from:

  • Procter and Gamble (USA) working on statistical modelling in market research
  • the Biometrics unit of SmithKline Beecham collaborating on different aspects of pharmaceutical statistics
  • ACNielsen/BASES (USA) on applications of mixed Poisson models in studying marketing consumer behaviour
  • General Electric HealthCare on environmental statistics.

In focus

Time series analysis

In recent years a powerful technique of time series analysis has been developed and applied to many practical problems. The technique is based on the singular value decomposition of the so-called trajectory matrix, which is obtained from the initial time series by the method of delays. It aims to expand the original time series into a sum of a small number of 'independent' and 'interpretable' components.
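
The decomposition described above (commonly known as singular spectrum analysis) can be illustrated with a minimal sketch. This is not the group's own implementation: the function name `ssa_components` and the parameters `window` and `rank` are hypothetical, and the grouping of components that a real analysis would need is omitted.

```python
import numpy as np

def ssa_components(series, window, rank):
    """Return `rank` additive components of `series` (illustrative sketch)."""
    x = np.asarray(series, dtype=float)
    n = x.size
    if not 1 < window < n:
        raise ValueError("window must satisfy 1 < window < len(series)")
    k = n - window + 1
    # Trajectory matrix by the method of delays: column j is x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    # Singular value decomposition of the trajectory matrix.
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    components = []
    for i in range(rank):
        # Rank-one elementary matrix for the i-th singular triple.
        elem = s[i] * np.outer(u[:, i], vt[i])
        # Diagonal averaging: average the entries with equal row + column
        # index to map the matrix back to a series of length n.
        flipped = elem[::-1]
        comp = [flipped.diagonal(d - (window - 1)).mean() for d in range(n)]
        components.append(comp)
    return np.array(components)
```

For a series made of a linear trend plus a single sinusoid the trajectory matrix has rank four, so four components are enough to recover the series when summed.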

Also, spatial analogues of the popular ARMA-type stochastic time series models have been developed, based on fractional generalizations of the Laplacian with two fractal indices. These models describe important features of anomalous diffusion processes, such as strong dependence and/or intermittency.

Multivariate statistics

The objective is to develop a methodology for the exploratory analysis of temporal-spatial data of complex structure, with the final aim of constructing suitable parametric models.

Applications include various medical, biological, engineering and economic data. Several market research projects have taken place in which the development of statistical models was a substantial part.

Stochastic global optimisation

Let ƒ be a function given on a d-dimensional compact set X and belonging to a suitable functional class F of multiextremal continuous functions.

We consider the problem of its minimization, that is, the approximation of a point x' such that ƒ(x') = min ƒ(x), where the minimum is taken over all x in X, using evaluations of ƒ at specially selected points.
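
As a simple illustration of this problem setting, the sketch below uses pure random search, the most basic stochastic global optimisation scheme: points are sampled uniformly in a box X and the best evaluation is recorded. It is not a method attributed to the group. The names `pure_random_search` and `rastrigin` are chosen for illustration, and the Rastrigin function is a standard multiextremal test function rather than one taken from the text.

```python
import math
import random

def rastrigin(x):
    """Standard multiextremal test function; global minimum 0 at the origin."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)

def pure_random_search(f, bounds, n_evals, seed=0):
    """Minimise f over a box by uniform sampling (illustrative sketch).

    bounds is a list of (low, high) pairs, one per coordinate.
    Returns the best point found and its function value.
    """
    rng = random.Random(seed)
    best_x, best_val = None, float("inf")
    for _ in range(n_evals):
        # Draw a point uniformly at random in the box X.
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        val = f(x)
        # Keep a record of the smallest value seen so far.
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

if __name__ == "__main__":
    point, value = pure_random_search(rastrigin, [(-5.12, 5.12)] * 2, 2000)
    print(point, value)
```

More refined methods select the evaluation points adaptively rather than uniformly, which is where the choice of "specially selected points" comes in.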

Probabilistic methods in search and number theory

Several interesting features of the accuracy of diophantine approximations can be expressed in probabilistic terms.

Many diophantine approximation algorithms produce a sequence of sets F(n), indexed by n, of rational numbers p/q in [0,1]. Famous examples of F(n) are the Farey sequence, the collection of rationals p/q in [0,1] with q<=n, and the collection of all n-th continued fraction convergents.
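
To make the first example concrete, the following sketch generates the Farey sequence of order n, that is, the reduced fractions p/q in [0,1] with q<=n in increasing order, using the classical next-term recurrence. The function name `farey` is chosen for illustration only.

```python
from fractions import Fraction

def farey(n):
    """Return the Farey sequence of order n as a list of Fractions."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # a/b and c/d are two consecutive terms; start from 0/1 and 1/n.
    a, b, c, d = 0, 1, 1, n
    seq = [Fraction(a, b)]
    while c <= n:
        # Next-term recurrence: the term after a/b, c/d is (k*c - a)/(k*d - b).
        k = (n + b) // d
        a, b, c, d = c, d, k * c - a, k * d - b
        seq.append(Fraction(a, b))
    return seq

if __name__ == "__main__":
    print([str(q) for q in farey(5)])
```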

Stochastic processes

New classes of stochastic processes with Student distributions and various types of dependence structure have been introduced and studied. A particular motivation is the modelling of risk assets with strong dependence through fractal activity time.

The asymptotic theory of estimation of parameters of stochastic processes and random fields has been developed using higher-order information (that is, information on the higher-order cumulant spectra). This theory allows analysis of non-linear and non-Gaussian models with both short- and long-range dependence.

Burgers turbulence problem

Explicit analytical solutions of the Burgers equation with quadratic potential have been derived and used to establish scaling laws for the Burgers turbulence problem with quadratic potential and random initial conditions of Ornstein-Uhlenbeck type driven by Lévy noise.

Results have considerable potential for stochastic modelling of observational series from a wide range of fields, such as turbulence or anomalous diffusion.

Topics in medical statistics

A number of topics associated with medical statistics are presently researched in Cardiff, including time-specific reference ranges and errors in variables regression. Current research focuses on the search for a unified methodology and approach to the errors in variables problem.

Extreme value analysis

Extreme value analysis is a branch of probability and statistics that provides non-parametric procedures for extrapolation beyond the range of the data. The extrapolation can only be as good as the quality of the data allows, so knowing its limits is also an important issue. Its methods are usually relevant for institutions that are exposed to high risks, for instance financial services and insurance companies or environmental engineering institutions.

Group leader

Professor Anatoly Zhigljavsky

Chair in Statistics

Email
zhigljavskyaa@caerdydd.ac.uk
Telephone
+44 (0)29 2087 5076

Academic staff

Dr Andreas Artemiou

Lecturer

Email
artemioua@caerdydd.ac.uk
Telephone
+44 (0)29 2087 0616

Dr Bertrand Gauthier

Lecturer

Email
gauthierb@caerdydd.ac.uk
Telephone
+44 (0)29 2087 5544

Dr Jonathan Gillard

Senior Lecturer in Statistics

Email
gillardjw@caerdydd.ac.uk
Telephone
+44 (0)29 2087 0619

Professor Nikolai Leonenko

Professor

Email
leonenkon@caerdydd.ac.uk
Telephone
+44 (0)29 2087 5521

Dr Andrey Pepelyshev

Lecturer

Email
pepelyshevan@caerdydd.ac.uk
Telephone
+44 (0)29 2087 5530

Dr Kirstin Strokorb

Lecturer

Email
strokorbk@caerdydd.ac.uk
Telephone
+44 (0)29 2068 8833

Dr Robin Mitra

Senior Lecturer

Email
mitrar5@caerdydd.ac.uk
Telephone
+44 (0)29 2087 5052

Seminars

All seminars will be held virtually via Zoom and commence at 14:10 on Thursdays (unless otherwise stated).

View the seminar calendar of the Statistics and OR group.

The calendar is maintained independently by members of the research groups.

Please contact Dr Timm Oertel for more details regarding Operational Research/WIMCS lectures and Bertrand Gauthier and Kirstin Strokorb for more details regarding Statistics lectures.

Date, speaker and seminar

9 March 2020

Room M/0.40

Almut Veraart (Imperial College London)

Volatility estimation in time and space

The concept of (stochastic) volatility/intermittency is of central importance in many fields of science. In this talk I am going to discuss how stochastic volatility can be introduced in a stochastic model and which properties of the stochastic model have an influence on the methods available for volatility estimation. I will showcase some recent results on how stochastic volatility can be estimated in multivariate non-semimartingale settings and show some first results in extending the classical stochastic volatility concept to spatial/spatio-temporal settings.

The results presented in this talk are based on collaborations with Ole E. Barndorff-Nielsen, Fred Espen Benth, Andrea Granelli, Michele Nguyen and Riccardo Passeggeri.

2 March 2020

Ioannis Kosmidis (Warwick University)

Improved estimation of models for ordinal responses

For the estimation of cumulative link models and adjacent category models for ordinal data, we derive adjustments to the likelihood score functions, whose solution ensures an estimator with smaller asymptotic bias than the maximum likelihood estimator typically has. The form of the adjustments suggests a parameter-dependent adjustment of the multinomial counts, which in turn suggests the solution of the adjusted score equations through iterated maximum likelihood fits on adjusted counts, greatly facilitating implementation.

Like the maximum likelihood estimator, the reduced-bias estimator is found to respect the key invariance properties that make cumulative link models a good choice for the analysis of categorical data. Its additional finiteness and optimal frequentist properties, along with the adequate behaviour of related asymptotic inferential procedures, make the reduced-bias estimator attractive as a default choice for practical applications.

We will also discuss the improved estimation of the adjacent category model, which is another popular model for ordinal data, and how this can be achieved using a modification of the so-called "Poisson trick".

13 February 2020

Time: 14:10 to 15:10

Room M/2.20

Tatiana Benaglia (University of Campinas)

Bayesian Mixture Models for longitudinal data on cognition loss in elderly people

A regression mixture model for elderly people’s cognitive ability up to their death is presented. Cognition is measured over time with standard questionnaires from geriatrics which cover, amongst other things, memory, language and reasoning. The output of such questionnaires is recorded as a score with finitely many possible values. Models for Binomial response variables are discussed here. The mixture specification arises to discriminate between two prevalent behaviours in the data: one group of elderly people presents cognitive decline at a constant rate, whilst the other experiences a spontaneous accelerated decline at some time. The latter aspect is dealt with using nonlinear predictors with random change points. In addition, logit and complementary log-log link functions were used to model the mixture allocation with predictor variables. The study’s goal is to quantify associations between cognition loss and diagnoses of dementias such as Alzheimer’s disease, as well as sociodemographic factors. The proposed model is evaluated on the database provided by Rush University, Chicago, United States, through the Rush Memory and Aging Project from 1997 to 2016.

The talk is based on joint work with Eric Krishna, Hildete P. Pinheiro (Campinas) and Graciela Muniz-Terrera (Edinburgh).

10 February 2020

Xin Liu (University of Bath)

Diversification in Lottery-Like Features and Portfolio Pricing Discounts

I study the asset pricing implications of cumulative prospect theory on portfolio discounts. I extend Barberis and Huang (2008) and show that a portfolio consisting of lottery-like stocks should trade at a discount due to diversification. This discount can be partially mitigated if lottery-like stocks tend to produce extreme payoffs at the same time. I utilize three empirical settings to support this theoretical prediction: the closed-end fund puzzle, the announcement returns of mergers and acquisitions, and conglomerate discounts. My findings support cumulative prospect theory from an alternative perspective and provide a novel and unifying explanation for three seemingly unrelated phenomena.

27 January 2020

Dino Sejdinovic (University of Oxford)

Noise Contrastive Meta-Learning for Conditional Density Estimation using Kernel Mean Embeddings

Current meta-learning approaches focus on learning functional representations of relationships between variables, i.e. estimating conditional expectations in regression. In many applications, however, the conditional distributions cannot be meaningfully summarized solely by expectation (due to e.g. multimodality). We introduce a novel technique for meta-learning conditional densities, which combines neural representation and noise contrastive estimation together with well-established literature in conditional mean embeddings into reproducing kernel Hilbert spaces. The method shows significant improvements over standard density estimation methods on synthetic and real-world data, by leveraging shared representations across multiple conditional density estimation tasks.

Past events

Past seminars 2019-20

Past seminars 2018-19

Past seminars 2017-18

Past seminars 2016-17

Past seminars 2015-16