Data Analytics for Government

We are pleased to offer nine modules from the MSc Data Analytics for Government (MDataGov), available to study on a standalone basis for continuing professional development (CPD).

We have developed this programme in conjunction with the Office for National Statistics (ONS).

The modules on offer include four core and five optional modules. CPD students wishing to accumulate credits towards a qualification (either a Postgraduate Certificate, Diploma, or full MSc) must study and pass the four core modules before studying optional modules.

Modules are only open to UK public sector employees. Please contact us if you are unsure whether this applies to you.

These modules are suitable for those working in public-sector bodies in the UK who want to upskill or further their career, and who are happy to study alongside full-programme MSc students at the University.

Each module is worth either 10 or 20 credits. We are working towards a system that will enable you to accumulate credits towards a postgraduate qualification over a period of time, should you wish.

The MSc is also available as a one-year full-time programme. UK public sector employees are eligible to study the full MSc at a discounted price of £8,725 (for study during the 2021/22 academic year).

Core modules

Credits

10 credit module (reference MAT032)

Dates

Autumn semester

Cost

£550 (for the 2021/22 academic year)

Assessment

Written examination 80% (two hours)

Coursework 20% (two pieces of written work)

It will be assumed that you will be taking the assessment for the module.

Outline description

In this module we cover the fundamentals of survey statistics. In particular:

  • standard methods of drawing samples from finite populations
  • how to make inferences about population characteristics
  • survey-based estimation of population totals and related quantities
  • regression estimation for modelling relationships between variables
  • the principles and methods used to compensate for non-response following survey data collection
  • calibration methods for household surveys
  • index numbers.

Students who have only completed A-level statistics may have to complete some revision before starting.

Objectives

On successful completion of the module, you should be able to:

  • estimate means, totals, proportions, and ratios of population variables from data collected using standard sampling methods
  • adjust estimates to compensate for the effects of unit non-response
  • use calibration to improve estimates from household surveys
  • assess the suitability of a survey for a given estimation problem.

Skills you will practise and develop

  • data analytics: the collection of data using surveys and the estimation of population variables.
  • mathematical reasoning: calculation of probabilities for sampling events.

Syllabus content

  • estimation for simple random sampling, stratified sampling and cluster sampling
  • regression and ratio estimation
  • non-response and imputation of missing values
  • calibration
  • index numbers.
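
As a rough illustration of the estimation topics listed above, the following Python sketch estimates a population total from a stratified simple random sample by scaling each stratum's sample mean by the stratum's population size. The function name, stratum labels and figures are hypothetical and are not taken from the module materials.

```python
from statistics import mean


def stratified_total(strata):
    """Estimate a population total from a stratified simple random sample.

    `strata` maps a stratum name to a pair (N_h, sample_values), where N_h
    is the number of units in that stratum of the population.
    """
    # Each stratum contributes N_h times its sample mean.
    return sum(n_h * mean(values) for n_h, values in strata.values())


# Hypothetical data: number of households per stratum and sampled weekly spend.
example = {
    "urban": (600, [50, 70, 60]),
    "rural": (400, [30, 50]),
}
estimate = stratified_total(example)
print(estimate)
```

Here the urban stratum contributes 600 times its sample mean of 60 and the rural stratum contributes 400 times its sample mean of 40.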

Delivery

Modules will be delivered through blended learning. You will be guided through learning activities appropriate to your module, which may include:

  • online resources that you work through at your own pace (eg videos, web resources, e-books, quizzes)
  • online interactive sessions to work with other students and staff (eg discussions, live streaming of presentations, live-coding, team meetings)
  • face-to-face small group sessions (eg help classes, feedback sessions).

Credits

10 credit module (reference CMT314)

Dates

Autumn and spring semesters

Cost

£550 (for the 2021/22 academic year)

Assessment

A blend of assessment types which may include coursework and portfolio assessments, class tests, and/or formal examinations.

Outline description

This module will introduce core data science concepts, including:

  • the different types of data sources available (administrative data, survey data, open data, big data, etc)
  • how to collect data, including innovative data collection methods, e.g. web scraping
  • the challenges of unstructured data
  • how to treat different data types and undertake basic data analysis (structured and unstructured data)
  • how to present data through basic data visualisations.

Objectives

On successful completion of the module, you should be able to:

  • use the Python programming language to complete a range of programming tasks
  • critically analyse and discuss methods of data collection
  • extract textual and numeric data from a range of sources, including online
  • reflect upon the legal, ethical and social issues relating to data science and its applications.

Skills you will practise and develop

  • fundamental programming in Python
  • reading and writing common data formats
  • data analysis using appropriate libraries.

Syllabus content

  • basic programming in Python: fundamental data types, program control structures, basic language features
  • data extraction and importing; analysis using common libraries (e.g. pandas, NumPy/SciPy)
  • natural language processing using common libraries (e.g. NLTK, spaCy)
  • retrieving data from online sources (web scraping, APIs)
  • data science applications
  • legal issues relating to data science (GDPR)
  • social and ethical issues relating to data science.

Delivery

Modules will be delivered through blended learning. You will be guided through learning activities appropriate to your module, which may include:

  • online resources that you work through at your own pace (eg videos, web resources, e-books, quizzes)
  • online interactive sessions to work with other students and staff (eg discussions, live streaming of presentations, live-coding, team meetings)
  • face-to-face small group sessions (eg help classes, feedback sessions).

Credits

10 credit module (reference CMT315)

Dates

Autumn and spring semesters

Cost

£550 (for the 2021/22 academic year)

Assessment

A blend of assessment types which may include coursework and portfolio assessments, class tests, and/or formal examinations.

Outline description

This will be a practical module, which will consider programming with structured and unstructured data and statistical analysis of this data. You will learn how to analyse both numeric and textual data using a range of computational programming languages.

Objectives

On successful completion of the module you should be able to:

  • use code to extract, store and analyse textual and numeric data
  • carry out data analysis and statistical testing using code
  • critically analyse and discuss methods of data collection, management and storage
  • analyse and visualise textual and numeric data from a range of sources, including online.

Skills you will practise and develop

  • data analysis using appropriate libraries.

Syllabus content

  • basic programming in Python: fundamental data types, program control structures, object-oriented programming and other basic language features
  • data extraction and importing; analysis using common libraries (e.g. pandas, NumPy/SciPy)
  • descriptive statistics
  • hypothesis testing
  • natural language processing using common libraries (e.g. NLTK, spaCy)
  • retrieving data from online sources (web scraping, APIs).

Delivery

Modules will be delivered through blended learning. You will be guided through learning activities appropriate to your module, which may include:

  • online resources that you work through at your own pace (eg videos, web resources, e-books, quizzes)
  • online interactive sessions to work with other students and staff (eg discussions, live streaming of presentations, live-coding, team meetings)
  • face-to-face small group sessions (eg help classes, feedback sessions).

Credits

10 credit module (reference SIT760)

Dates

Spring semester

Cost

£550 (for the 2021/22 academic year)

Assessment

This short module will be assessed through one 2-hour examination during which students will answer two exam questions.

Outline description

This module provides an overview of issues and ideas concerning the scope and organisation of official statistics, as well as its processes and products.

The module provides a general foundation for the more detailed study of these elements and identifies links with other relevant disciplines.

Objectives

On successful completion of the module, you should be able to:

  • Critically evaluate the UK Statistical System and Code of Practice
  • Show clear understanding of Quality Control, Dissemination and Ethical issues relevant to the production and management of Official Statistics.

Skills you will practise and develop

You will have a broad overview of the fundamental issues underlying the organisation of official statistics and be able to apply this knowledge in discussing the relative merits of alternative approaches.

Mandatory topics

  • Overview of the importance of statistics, policy and administrative uses.
  • History of the development of official statistics in the UK.
  • Statistical Legislation
  • Quality
  • Ethics
  • Dissemination

Delivery

Teaching will be delivered through interactive lectures and seminars.

This module will be delivered through a mixture of synchronous and asynchronous activities, as part of this programme’s blended provision, which will include on-campus and online teaching and support.

The precise mode of delivery and details – subject to Welsh Government and Public Health Wales guidance – of the teaching and support activities will be made available at the start of the semester.

There will also be regular virtual ‘office hours’ during which module convenors will be available.

Optional modules

Credits

20 credit module (reference CMT220)

Dates

Spring semester

Cost

£1,100 (for the 2021/22 academic year)

Assessment

A blend of assessment types which may include coursework and portfolio assessments, class tests, and/or formal examinations.

Outline description

Database systems are the most widely used software systems in commerce and industry. Database management systems are used to store and manage the complex integrated information resources of organisations. This module introduces the theoretical and practical issues relating to the design and use of these systems. In addition to the provision of a sound foundation in traditional, second generation database systems, it explores the representation and management of complex information resources with NoSQL database technology.

Objectives

On successful completion of the module, you should be able to:

  • design a relational database, i.e. map conceptual models to efficient representations in a database schema
  • manage relational database systems
  • use SQL to define and query a relational database
  • discuss and evaluate the principles of data integrity, security and concurrency control
  • model and manage information using markup languages
  • describe and evaluate the principles behind other types of database management systems, for example NoSQL.

Skills you will practise and develop

  • understanding the role of information in decision making
  • designing relational databases (including conceptual design, logical design, physical design)
  • evaluation of issues concerning database applications, including security and data integrity
  • modelling information using markup languages (XML and JSON)
  • awareness of the differences between relational and NoSQL databases

Syllabus content

  • introduction to databases
  • information, data and knowledge
  • database systems
  • relational data model
  • Structured Query Language (SQL)
  • relational algebra
  • database design
  • conceptual database design (ER diagrams)
  • logical database design (ER to SQL)
  • physical database design (indexes)
  • security, transactions and concurrency
  • security and integrity
  • transactions and recovery
  • concurrency control
  • markup languages and NoSQL databases
  • XML, XPath and XQuery
  • JSON
  • NoSQL.

Delivery

Modules will be delivered through blended learning. You will be guided through learning activities appropriate to your module, which may include:

  • online resources that you work through at your own pace (eg videos, web resources, e-books, quizzes)
  • online interactive sessions to work with other students and staff (eg discussions, live streaming of presentations, live-coding, team meetings)
  • face-to-face small group sessions (eg help classes, feedback sessions).

Credits

20 credit module (reference CMT218)

Dates

Spring semester

Cost

£1,100 (for the 2021/22 academic year)

Assessment

A blend of assessment types which may include coursework and portfolio assessments, class tests, and/or formal examinations.

Outline description

The aim of this module is to give you an understanding of the processes and tools required to create interactive visualisations and explanations of data. The module will allow you to critically appreciate correct visualisations, and to identify biased or manipulated interpretations. It will cover the practical skills required to create visualisations using tools such as Python and JavaScript, while also examining the theory of design required.

Objectives

On successful completion of the module, you should be able to:

  • describe and discuss the theory behind visualisation design
  • critically analyse visualisations of data
  • examine and explore data to find the best way it can be visually represented
  • create static, animated and interactive visualisations of data
  • critically reflect upon and discuss the merits and shortcomings of your own visualisation work.

Skills you will practise and develop

  • use of appropriate tools for data analysis and visualisation
  • critical analysis of visualisation
  • JavaScript and Python for data access, manipulation, statistical analysis and visualisation.

Syllabus content

  • encoding theory
  • visualisation theory
  • visualisation history
  • current trends in visualisation
  • use of appropriate software tools and libraries for data analysis and visualisation
  • Python: pandas, SciPy, NumPy, Matplotlib, Seaborn, Altair, Bokeh
  • JavaScript: D3, Plotly, Highcharts
  • retrieving and storing data (JSON, CSV) using JavaScript and Python
  • visualisation development.

Delivery

Modules will be delivered through blended learning. You will be guided through learning activities appropriate to your module, which may include:

  • online resources that you work through at your own pace (eg videos, web resources, e-books, quizzes)
  • online interactive sessions to work with other students and staff (eg discussions, live streaming of presentations, live-coding, team meetings)
  • face-to-face small group sessions (eg help classes, feedback sessions).

Credits

10 credit module (reference MAT007)

Dates

Spring semester

Cost

£550 (for the 2021/22 academic year)

Assessment

100% written examination.

Outline description

This module will introduce students to the ways in which Statistics and Operational Research (OR) are used within Government. It will be taught predominantly by staff from the Office for National Statistics (ONS) and Welsh Assembly Government (WAG), and will therefore provide a fascinating first-hand insight into the roles of Statisticians and Operational Researchers within these organisations. Government is a large employer of graduates in Statistics/OR, so this module provides excellent training for students considering a career in Government, or for those interested in the kinds of methods ONS and WAG use to produce important analyses and reports.

Objectives

On successful completion of the module, you should be able to:

  • appreciate the ways in which Statistics and OR are used within Government
  • understand the methods used in undertaking surveys and handling large datasets
  • appreciate the nature of key statistical indicators produced by Government and the various reports produced.

Syllabus content

Session 1 – Introductory Session

This first session will give participants an overview of government statistics. It covers their purpose and key uses, some major statistical series, and the blend of administrative and survey data used. It will also cover the structure and governance of the Government Statistical Service (GSS), including legislation and the Code of Practice.

Session 2 – Questionnaire Design

This is an introductory session on questionnaire design. The aim is to help participants develop an understanding of how surveys are designed and why design is important to the survey process. The session will cover general design principles for questions, response categories, instructions, guidance and overall questionnaire design. The importance and methods of testing questions and questionnaires will also be covered.

Session 3 – Editing and imputation

UK official statistics are generally derived from survey data, often based on large samples of people or businesses. In practice, it is impossible to collect survey data which are complete and without error. The editing process aims to identify and correct errors in the data. Modern editing techniques aim to optimise this process by editing as efficiently as possible whilst maintaining the accuracy of resulting statistics. Imputation deals with the problem of incomplete and missing responses by estimating their expected values. If applied properly, imputation reduces the threat of non-response bias. This session covers the main editing and imputation methods used in official statistics, with examples relating to the population Census and key economic and social statistics.
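
To make the idea of imputation concrete, here is a minimal Python sketch of one simple approach, class-mean imputation, in which a missing value is replaced by the mean of the observed values in the same group. This is only one of many possible methods and is not necessarily the one taught in the session; the function name, field names and data are hypothetical.

```python
from statistics import mean


def impute_class_mean(records, class_key, value_key):
    """Return copies of `records` with missing values (None) replaced by
    the mean of the observed values in the same imputation class."""
    # Collect the observed (non-missing) values for each class.
    observed = {}
    for record in records:
        if record[value_key] is not None:
            observed.setdefault(record[class_key], []).append(record[value_key])
    class_means = {cls: mean(values) for cls, values in observed.items()}

    imputed = []
    for record in records:
        filled = dict(record)
        filled["imputed"] = filled[value_key] is None
        if filled["imputed"]:
            # Raises KeyError if a class has no observed values at all.
            filled[value_key] = class_means[filled[class_key]]
        imputed.append(filled)
    return imputed


# Hypothetical survey responses; None marks a missing turnover value.
responses = [
    {"region": "A", "turnover": 10},
    {"region": "A", "turnover": None},
    {"region": "A", "turnover": 20},
    {"region": "B", "turnover": 40},
    {"region": "B", "turnover": None},
]
completed = impute_class_mean(responses, "region", "turnover")
```

Flagging imputed values, as the `imputed` field does here, keeps it possible to distinguish reported from estimated data in later analysis.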

Session 4 – Index Numbers

Index numbers are a very commonly used way of presenting statistics. High-profile examples include GDP and the Consumer Price Index. Underpinning such important indices are some intriguing concepts and challenges, and this session will cover how these are handled in theory and in practice.
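
As a small worked example of an index number, the Python sketch below computes a Laspeyres price index, which compares the cost of a fixed base-period basket at current prices with its cost at base-period prices. The choice of the Laspeyres formula is an assumption made for illustration, as the session description does not name specific formulae; the items, prices and quantities are hypothetical.

```python
def laspeyres_index(base_prices, current_prices, base_quantities):
    """Laspeyres price index: the cost of the base-period basket at current
    prices, relative to its cost at base-period prices, scaled to 100."""
    base_cost = sum(base_prices[item] * qty for item, qty in base_quantities.items())
    current_cost = sum(current_prices[item] * qty for item, qty in base_quantities.items())
    return 100 * current_cost / base_cost


# Hypothetical basket: prices in pence, quantities bought in the base period.
base_prices = {"bread": 100, "milk": 50}
current_prices = {"bread": 110, "milk": 60}
base_quantities = {"bread": 10, "milk": 20}

index = laspeyres_index(base_prices, current_prices, base_quantities)
print(index)
```

The base-period basket costs 2,000 pence at base prices and 2,300 pence at current prices, so the index is 115, indicating a 15% rise in the cost of that basket.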

Session 5 – Data matching

The advent of high-powered computing has brought about major advances in the processing and analysis of information, and many organisations now maintain large numbers of datasets in vast databases or data warehouses. In sectors such as finance and healthcare, masses of data are generated as by-products of day-to-day activities and processes; however, much of this information can be difficult to harness in a meaningful way.

Data matching is a technique that facilitates the linkage of information from different data sources, making it possible to create rich new virtual datasets composed of data fields taken from a number of existing datasets; datasets which would have previously been analysed separately and in isolation. The mantra of data matching is that “the whole is better than the sum of the parts”, and by bringing together previously disparate datasets, we are able to add value, for example in the ability to study as-yet unknown relationships between sets of variables.
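
The linkage idea can be sketched in Python as exact (deterministic) matching on a normalised key. This is a deliberately simplified assumption: real matching work often has to cope with typing errors and missing identifiers, which this sketch ignores. The field names and records are hypothetical.

```python
def match_key(record):
    """Build a simple match key from a normalised surname and date of birth."""
    return (record["surname"].strip().lower(), record["dob"])


def link(left, right):
    """Link records from two datasets that share the same match key."""
    # Index the right-hand dataset by match key for fast lookup.
    index = {}
    for record in right:
        index.setdefault(match_key(record), []).append(record)

    linked = []
    for record in left:
        for candidate in index.get(match_key(record), []):
            # Combine the fields of both records; left-hand values win on clashes.
            linked.append({**candidate, **record})
    return linked


# Hypothetical datasets that would otherwise be analysed separately.
health = [
    {"surname": "Jones ", "dob": "1980-01-02", "gp_visits": 3},
    {"surname": "Evans", "dob": "1975-05-06", "gp_visits": 1},
]
tax = [
    {"surname": "jones", "dob": "1980-01-02", "income_band": "B"},
    {"surname": "Davies", "dob": "1990-09-09", "income_band": "A"},
]
linked = link(health, tax)
```

Only the Jones records share a key after normalisation, so the result is a single linked record carrying both `gp_visits` and `income_band`.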

Delivery

20 hours of lectures, practical workshops and case studies.

Credits

20 credit module (reference MAT022)

Dates

Autumn semester

Cost

£1,100 (for the 2021/22 academic year)

Assessment

100% written examination.

Outline description

This module will introduce a range of fundamental statistical ideas. The broad aim of the module is to provide students with:

  • an understanding of the mathematical ideas that underpin some fundamental statistical methods
  • proficiency in performing practical data analysis using statistical software
  • the ability to communicate the results of data analysis by written report.

Objectives

On successful completion of the module, you should be able to:

  • formulate problems involving uncertainty within the framework of probability theory
  • understand the conditions under which various statistical methods can be applied
  • summarise a data set using descriptive statistics
  • calculate confidence intervals and perform hypothesis tests
  • identify the sources of variation in data
  • fit linear models to data and evaluate the accuracy of these models
  • perform variable selection and dimension reduction
  • write technical reports to communicate the results of data analysis procedures.

Syllabus content

  • elementary probability
  • descriptive statistics
  • estimation
  • hypothesis testing
  • categorical data
  • correlation
  • analysis of variance
  • regression
  • principal components analysis
  • non-parametric methods.

Delivery

Modules will be delivered through blended learning. You will be guided through learning activities appropriate to your module, which may include:

  • online resources that you work through at your own pace (eg videos, web resources, e-books, quizzes)
  • online interactive sessions to work with other students and staff (eg discussions, live streaming of presentations, live-coding, group meetings)
  • face-to-face small group sessions (eg tutorials, exercise classes, feedback sessions).

Credits

10 credit module (reference MAT005)

Dates

Spring semester

Cost

£550 (for the 2021/22 academic year)

Assessment

100% coursework.

Outline description

Forecasting methods are used in a range of industries and are important tools for both Statisticians and Operational Researchers. This module will introduce students to time series models and associated forecasting methods. It will demonstrate how such models and methods can be implemented to analyse time series data, and help students appreciate their different fields of application. Computer workshops will allow students to build and experiment with practical forecasting tools using data from a variety of applications.

Objectives

On successful completion of the module, you should be able to:

  • fit models for data from a large variety of sources
  • appreciate and use modern methods of statistical inference
  • forecast using a range of methods, including exponential smoothing methods and ARMA and ARIMA models.

Syllabus content

  • time series models: decomposition, analysis and removal of trends and seasonality
  • exponential smoothing methods: single exponential, Holt and Holt-Winters methods
  • autoregressive, moving average and ARMA models
  • non-stationary series: ARIMA models and forecasting using ARIMA models.
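
As an illustration of the simplest method in the list above, the Python sketch below implements single exponential smoothing, where each new level is a weighted average of the latest observation and the previous level. Initialising the level with the first observation is one common convention and is an assumption here; the function name and data are hypothetical.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast from single exponential smoothing.

    `alpha` is the smoothing parameter, with 0 < alpha <= 1; larger values
    give more weight to recent observations.
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")

    level = series[0]  # assumed initialisation: the first observation
    for observation in series[1:]:
        level = alpha * observation + (1 - alpha) * level
    return level


# Hypothetical monthly counts.
forecast = simple_exponential_smoothing([10, 12, 14], alpha=0.5)
print(forecast)
```

With these values the level moves from 10 to 11 and then to 12.5, which is the forecast for the next period. Holt and Holt-Winters methods extend the same recursion with trend and seasonal components.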

Delivery

Modules will be delivered through blended learning. You will be guided through learning activities appropriate to your module, which may include:

  • online resources that you work through at your own pace (eg videos, web resources, e-books, quizzes)
  • online interactive sessions to work with other students and staff (eg discussions, live streaming of presentations, live-coding, group meetings)
  • face-to-face small group sessions (eg tutorials, exercise classes, feedback sessions).

Entry requirements

You must be a UK public sector employee.

You must have a 1st or upper 2nd class UK honours degree or equivalent in a numerate subject, such as mathematics, operational research, statistics, computer science, management science, economics, engineering or a suitable science degree, or equivalent professional experience.

Or, if you are applying solely on the basis of your professional experience, you must have been working in a relevant role for at least a year. If you are unsure as to whether your qualifications or professional experience are relevant, please contact admissions@cardiff.ac.uk.

CPD students wishing to accumulate credits towards the full MSc must pass 120 credits of modules, including the four core modules, before progressing to the 60-credit dissertation.

CPD students are not permitted to study more than 60 credits in an academic year.

Applicants whose first language is not English must meet our English Language requirements.

How to apply

Please use the online application form to apply for one or more modules.

We have provided some Guidance Notes to help clarify and simplify your application experience.

Applications received before Friday 3 September 2021 will be considered for the 2021-22 intake. Applications received after this date will be considered for the 2022-23 intake.

If your organisation is funding the study of your module(s) then you will need to confirm this by uploading a simple sponsorship letter which clearly states the total amount (£) your employer/department will pay towards your studies, so please organise this before you apply.

It will be assumed that you will be taking the assessment for each module. We are currently working on a process for those who do not wish to take the assessment, and we will provide further details during induction.

All the information on this page is correct at the time of publication; however, module details are subject to change.

Contact us

Please contact the Admissions team for further guidance on the application process:

Applicant enquiries