Dr Dawn Knight BA, MA, PhD (Nottingham)

Reader

Email
knightd5@cardiff.ac.uk
Telephone
+44 (0)29 2087 6325
Campuses
3.57, John Percival Building, Colum Drive, Cardiff, CF10 3EU
Available for postgraduate supervision

Overview

I am a member of the Centre for Language and Communication Research, and have worked at Cardiff University since 2015. I have been involved, as Principal Investigator (PI) or Co-Investigator (CI), in a range of externally funded research projects (with circa £3.6m external funding obtained to date). My current projects include the following:

  • 2022-23: CI, Welsh Government funded ‘ThACC – Thesawrws Ar-lein Cymraeg Cyfoes - Using Word Embeddings to Create a Thesaurus of Contemporary Welsh’ project. Working with colleagues from WELSH and Computer Science at Cardiff and Lancaster Universities, the project aims to develop an open-access, freely available online thesaurus of the Welsh language, for Welsh speakers and learners alike. We received £90,000 for this project.
  • 2022-23: PI, AHRC funded ‘FreeTxt: supporting bilingual free-text survey and questionnaire data analysis’. Working with colleagues from Lancaster University, and co-designed and co-constructed with partners Cadw and National Trust Wales, this project aims to create an innovative open-source online free-text analysis tool that enables the quick and easy analysis of English and Welsh language data. We received £100,000 for this project. Visit the project website here.
  • 2022-23: CI, AHRC funded ‘Wild Swimming and Blue Spaces: Mobilising interdisciplinary knowledge and partnerships to combat health inequalities at scale’ project (with Adolphs, Nottingham as PI). This project aims to develop a new mixed methods approach, drawing on corpus linguistics and narrative analysis, to create effective public health messaging (with a focus on the benefits of wild swimming) that includes content from a range of academic disciplines. Ultimately this project will benefit the many individuals and diverse communities who will be enabled to enjoy wild swimming in a safe way to improve health, and to gain an increased awareness of the nature of blue spaces and their role as a community asset. We received £178,000 for this project. Visit the project website here.
  • 2021-24: Co-PI (with Anne O’Keeffe, Mary Immaculate College), AHRC/IRC funded ‘Interactional variation online: harnessing emerging technologies in the digital humanities to analyse online discourse in different workplace contexts’ project. Working with colleagues from Mary Immaculate College, Swansea University, The University of Nottingham, University College Dublin, and University of Aberdeen, the project first aims to examine virtual workplace communication to gain depth of insight into the potential barriers to effective communication. Our second aim is to propose the next generation of frameworks for analysing online discourse and to make these frameworks available to all arts and humanities research and end-user communities. This will lead to a step change in our ability to develop equality of access in online communication. We received £390,000 from AHRC + €270,000 [circa £620,700] from IRC for this project. Visit the project website here.

From 2016 to 2020, I was also PI on the ‘CorCenCC: Corpws Cenedlaethol Cymraeg Cyfoes (The National Corpus of Contemporary Welsh): A community driven approach to linguistic corpus construction’ project. Funded by the ESRC (Economic and Social Research Council) and AHRC (Arts and Humanities Research Council), this £1.8 million inter-disciplinary and multi-institutional project led to the creation of a large-scale, open-source corpus of contemporary Welsh language. Full details of project outputs, including links to the corpus query interface, full corpus dataset, project report, Y Tiwtiadur pedagogic toolkit, CyTag part-of-speech tagger/tag-set and CySemTag semantic tagger/tag-set, can be found on the CorCenCC project website and via the CorCenCC GitHub page.

Details of my other research activities, and previously funded projects, can be found on the 'research' tab of this page. 

Regarding external and professional leadership roles, I was Chair of BAAL (British Association for Applied Linguistics) from 2018-2021. BAAL is a learned society with over 1,300 members internationally, making it the most influential forum for academics and professionals interested in language and applied linguistics within the UK and beyond. For further information see: www.baal.org.uk

    I am currently a member of the Economic and Social Research Council’s (ESRC) Strategic Advisory Network (SAN) - 2021-2024. The SAN is comprised of leading experts from the academic and user communities. It helps the ESRC exploit opportunities and access the voice and expertise of its communities. For further details of the SAN, see here. I am also a member of the AHRC (Arts and Humanities Research Council) Peer Review College (2022-2025) and am currently the Director of Research Funding for ENCAP.

    Biography

    • 2015: Certificate in Advanced Studies in Academic Practice, Newcastle University
    • 2004 – 2009: PhD in Applied Linguistics, The University of Nottingham
      • Thesis title: A multi-modal corpus approach to the analysis of backchanneling behaviour
      • Funding: ESRC +3 award winner
    • 2003 – 2004: MA in Applied Linguistics, The University of Nottingham
    • 2000 – 2003: BA in English Studies, The University of Nottingham

    Professional memberships

    • Associate Fellow of the Higher Education Academy (AFHEA), 2013 – present.
    • Member, BAAL (British Association for Applied Linguistics).
    • Executive Committee member, CRiLLS (Centre for Research in Linguistics and Language Sciences, Newcastle University), 2011 – 2015.
    • Member, CRAL (Centre for Research in Applied Linguistics), 2006 – 2011.
    • Member, IVACS (Inter-Varietal Applied Corpus Studies), 2004 – present
    • Member, AILA (International Association of Applied Linguistics), 2004 – present
    • Member, Language Teaching and Technology; Language Learning and Teaching and iLaB (ICT) research clusters in ECLS, 2012 – 2015.

    Academic positions

    • 2016 – present: Reader in Applied Linguistics, Cardiff University
    • 2015 – 2016: Senior Lecturer in Applied Linguistics, Cardiff University.
    • 2014 – 2015: Senior Lecturer in Applied Linguistics, Newcastle University.
    • 2011 – 2014: Lecturer in Applied Linguistics, Newcastle University.
    • 2009 – 2011: Part-time Research Fellow and lecturer on BA and M-Level home and distance learning modules, The University of Nottingham.
    • 2006 – 2009: Part-time Research Assistant and lecturer on BA and M-Level home and distance learning modules, The University of Nottingham.
    • 2005 – 2006: Full-time Research Assistant, ESRC funded HeadTalk interdisciplinary project, The University of Nottingham.
    • 2004 – 2005: Resident Hall Tutor, Hugh Stewart Hall, The University of Nottingham.

    Speaking engagements

    Invitations to address conferences, workshops and seminars

    • Knight, D. (2022). Enhancing language technology resources in minoritised language contexts: roles and applications of corpora. Invited presentation delivered as part of the Applied Linguistics and Discourse Analysis (ALDS) Seminar Series, December 2022.
    • Knight, D. and O’Keeffe, A. (2022). The contribution of corpus linguistics to research on online interaction. Invited podcast talk, delivered as part of the CorpusCast series, November 2022.
    • O’Keeffe, A., Knight, D. and Fitzgerald, C. (2022). “I think you’re on mute”: Variation in Online Workplace Discourse. Invited presentation delivered as part of the CALS Seminar Series, March 2022, Mary Immaculate College, Ireland.
    • Knight, D. and Fitzgerald, C. (2022). Navigating Virtual Meetings: Multimodality and Variation in Online Professional Discourse. Invited presentation delivered as part of the DiscourseNet Seminar Series, February 2022, Open University.
    • Morris, J., Ezeani, I., Gruffydd, I., Young, K., Davies, L., El-Haj, M. and Knight, D. (2022). Welsh Automatic Text Summarisation. Wales Academic Symposium on Language Technologies 2022, 28 January 2022, Bangor University.
    • Atkins, S. and Knight, D. (2021). Good Practice in Applied Linguistics: a call to action. Invited paper delivered as part of the BAAL Executive Committee Invited Colloquium: Ethics in Social Justice in Applied Linguistics, BAAL (British Association for Applied Linguistics) 2021 conference, Northumbria University, UK.
    • Knight, D. (2020). Ethical considerations for corpus construction: A Welsh language case study. Invited seminar presentation delivered as part of the Centre of Forensic Linguistics seminar series. Aston University, December 2020.
    • Knight, D. (2019). Multimodal Corpus Linguistics: Looking back and thinking forward. Invited keynote delivered at the CLAVIER conference on Knowledge Dissemination and Multimodal Literacy: Research Perspectives on ESP in a Digital World. University of Pisa, November 2019.
    • Knight, D. (2019). The application of corpora: minoritised language contexts: supporting and informing the pedagogic landscape. Invited keynote presentation delivered at the Assessing World Languages conference 2019, University of Macau, Macau, China. 
    • Knight, D. (2019). Invited LTF pre-conference workshop entitled ‘Corpus Linguistics for researchers and practitioners’, delivered at the UKALTA (UK Association for Language Testing and Assessment) conference, November 2019, Swansea University.
    • Knight, D. and Morris, D. (2019). Invited LTF pre-conference workshop entitled ‘Exploring the National Corpus of Contemporary Welsh (CorCenCC): user-driven corpus design for under-resourced languages’, delivered at the UKALTA (UK Association for Language Testing and Assessment) conference, November 2019, Swansea University.
    • Knight, D. (2019). From ECR to PI: some reflections from a decade of Dr-hood. Invited panel member for the PGR BAAL Colloquium, ‘How can an Early-Career Researcher best succeed in Applied Linguistics’, held at the annual BAAL conference, Manchester Met University, August 2019.
    • Knight, D. (2019). Welsh language in healthcare. Invited panel member for the Healthcare Text Analytics Conference, 24-25 April 2019. Cardiff University, Cardiff.
    • Knight, D. (2019). Examining patterns of language use: a guide to WMatrix. Invited workshop delivered as part of the Applied Linguistics Research Seminar Series, Swansea University, 6th March 2019.
    • Knight, D. (2018). Representativeness in CorCenCC: corpus design in minoritised languages. Invited plenary delivered to the JET workshop as part of the French Cognitive Linguistics Association (AFLiCo) conference, 3 – 4 May 2018. Paris, France.
    • Knight, D. (2018). An overview of the CorCenCC Welsh Corpus project. Invited presentation delivered as part of the Applied Linguistics Research Seminar Series, Swansea University, 2nd February 2018.
    • Knight, D. (2018). A corpus approach to free-text analysis: examining the NSS. Invited presentation delivered to members of the Senior Management Team at York St John University, 24th January 2018.
    • Knight, D. (2017). Perseverance pays: reflections on getting your first grant. Invited presentation delivered as part of the Securing your First Research Grant workshop event, 28/11/17, Cardiff University.
    • Knight, D. (2017). A corpus-based analysis of the NSS results. Invited presentation delivered to the Data Professionals Group, 3/7/17, Cardiff University.
    • Knight, D. (2017). Big Data and Corpus Construction Introducing CorCenCC. Invited seminar presentation at the Investigating (with) Big Data event run by the Cardiff University Digital Humanities Network, 24/5/17, Cardiff University.
    • Knight, D. (2017). Research funding and building networks in the Arts, Humanities and Social Sciences: the case of CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes - The National Corpus of Contemporary Welsh). Invited seminar presentation as part of the Cardiff School of Journalism, Media and Cultural Studies 2016/17 research seminar series, 5/4/17, Cardiff University.
    • Knight, D. (2017). Constructing corpora of minoritized languages: A focus on CorCenCC. Invited plenary presentation delivered as part of the Corpus Linguistics in the South Conference, 4/3/17, Birkbeck University.
    • Knight, D. (2016). Constructing E-Language Corpora: a focus on CorCenCC (The National Corpus of Contemporary Welsh). Invited plenary presentation at the 4th Computer-Mediated Communication and Social Media Corpora for the Humanities conference, 27-28/9/16, University of Ljubljana, Slovenia.
    • Knight, D. (2016). Innovations in corpus-based research. Invited seminar presentation at the Tokyo Chapter of the Japanese Association of Language Teachers (JALT) meeting, 9/9/16, Tokyo, Japan.
    • Knight, D. (2016). The application of corpora: supporting and informing the pedagogic landscape. Invited plenary presentation at the InForm Conference, 16/7/16, Durham University.
    • Knight, D. (2016). Corpora and Pedagogy: developing the community-driven National Corpus of Contemporary Welsh. Invited presentation at the Welsh for Adults annual conference, 8/7/16, Cardiff.
    • Knight, D. (2016). The National Corpus of Contemporary Welsh: A community driven approach to linguistic corpus construction. Invited presentation at the UCREL Corpus Research Seminar Series, 9/6/16, Lancaster University.
    • Knight, D. (2015). Dispelling the myths: the ubiquity of corpora in linguistic research. Invited keynote presentation at the annual Cardiff University ENCAP Postgraduate Conference, 2/6/15, Cardiff University.
    • Knight, D. (2015). Multimodal Corpus Linguistics. Invited presentation delivered at the joint seminar between Lund University Humanities Lab and the Linnaeus Centre CCL, 26/5/15, Lund University.
    • Knight, D. (2015). Analysing Literature using Corpora. Invited presentation delivered as part of the Cardiff BookTalk series, 30/4/15, Cardiff University.
    • Knight, D. (2015). Multimodal Corpus Linguistics. Invited presentation delivered as part of the Vlunch Seminar Series, School of Computer Science and Informatics, Cardiff University, 30th April.
    • Knight, D. (2014). Practical applications for corpora. Invited workshop at Welsh tutors conference, 5/12/14, Cardiff University.
    • Knight, D. (2014). (Re)defining context in corpus linguistics. Invited keynote presentation at Information Visualization seminar series, 5/11/14, Potsdam University of Applied Sciences.
    • Knight, D. and Murphy, B. (2014). Exploring the meta in 'meta-data': corpus investigations in sociolinguistic contexts. Invited keynote presentation at IVACS 2014, 13/6/14, Newcastle University.
    • Knight, D. (2013). A corpus-based approach to Digital Discourse. Invited keynote presentation at the BAAL Language and New Media SIG event ‘Research Methods and Approaches for Analysing Social Media’, 22/11/13, Leicester University.
    • Knight, D. (2013). Record – Transcribe – Code – Analyse: Tackling Multimodal Data. Invited keynote presentation at the annual Newcastle University ECLS Postgraduate Conference, 20/6/13, Newcastle University.
    • Knight, D. (2013). Recording and analysing real-life interaction ‘in the wild’. Invited keynote presentation at the Cardiff School of English PhD Applied Linguistics (Lexical Studies) Annual conference, 21/3/13, Cardiff University, Wales, UK.
    • Knight, D. (2013). Gesture and talk ‘in the wild’. Invited keynote presentation at the BAAL Corpus Linguistics SIG event “Building and Mining Small Specialised Corpora”. Edinburgh, 22/2/13.
    • Carter, R. and Knight, D. (2012). CANELC – The Cambridge and Nottingham eLanguage Corpus. Invited keynote presentation at the ELT Insights Seminar, 24/1/13, Cambridge University Press, Cambridge.
    • Knight, D. and Adolphs, S. (2011). Multimodal Corpora for Sign Language Research. Invited keynote presentation at the 2nd Symposium in Applied Sign Linguistics. “Documenting Sign Languages for Learning and Teaching Purposes”. Bristol, June 2011.
    • Knight, D. (2011). Mobile and Location-based Data: Capture, Representation and Analysis. Paper presented at the CAQDAS digital social research showcase event, 23rd March 2011, Oxford, UK.

    Conference papers, workshops, posters and conference demonstrations

    • Knight, D., Ezeani, I., Rayson, P., El-Haj, M. and Morris, S. (2022). FreeTxt: A corpus-based qualitative analysis toolkit. Paper to be presented at the American Association for Applied Linguistics (AAAL) annual conference, March 2023, Portland, Oregon, USA.
    • Knight, D., O’Keeffe, A., Fitzgerald, C., Mark, G., McNamara, J., Adolphs, S., Clarke, L., Cowan, B., Fahey-Palma, T., Farr, F. and Peraldi, S. (2023). Interactional Variation Online (IVO): Corpus Approaches and Applications to Analyzing Multi-modal Collaboration in Virtual Meetings. Paper to be presented at the American Association for Applied Linguistics (AAAL) annual conference, March 2023, Portland, Oregon, USA.
    • Fitzgerald, C., McNamara, J., Knight, D., Mark, G., Flanagan, G., O’Keeffe, A. and Farr, F. (2022). Navigating Online Business Communication: Perspectives and Practice. Paper to be presented at the ABC Regional Conference Europe, Middle East and Africa 2023, Naples, Italy.
    • Knight, D., McClaughlin, E., Adolphs, S., Moffatt, F., Carroll, J., Jackson, J. H., McGowan, S., Pratt, L., Sotirova, V. and Timmons, S. (2022). The linguistic representation of the benefits and risks of wild swimming. Paper to be presented at Public Health Wales Research and Evaluation Conference 2022, Cardiff, Wales.
    • McClaughlin, E., Knight, D., Vilar-Lluch, S., Adolphs, S., Barnard, P., Clos, J., McAuley, D., Nichele, E. and Lang, A. (2022). Community-focused approaches: linguistic evidence for effective public health messaging during the Covid-19 pandemic. Paper to be presented at Public Health Wales Research and Evaluation Conference 2022, Cardiff, Wales.
    • O’Keeffe, A., Fitzgerald, C., Mark, G., McNamara, J., Knight, D., Adolphs, S., Clarke, L., Cowan, B., Fahey-Palma, T., Farr, F. and Peraldi, S. (2022). We’ll come back to Tom when he’s back in real life: A multimodal corpus analysis of virtual meetings. Paper presented at the Irish Association of Applied Linguistics Conference, September 2022, Munster Technological University, Ireland.
    • Knight, D., Fitzgerald, C., Mark, G., McNamara, J., O’Keeffe, A., Adolphs, S., Clarke, L., Cowan, B., Fahey-Palma, T., Farr, F. and Peraldi, S. (2022). You’re on mute: examining virtual workplace communication. Paper presented at the British Association for Applied Linguistics Conference, September 2022, Dublin, Ireland.
    • McClaughlin, E., Nichele, E., Adolphs, S., Barnard, P., Clos, J., Knight, D., MacAuley, D. and Lang, A. (2021). Coronavirus Discourses: using Corpus Linguistic tools to investigate the reception of public health messaging. Paper presented at the British Association for Applied Linguistics Conference, September 2022, Dublin, Ireland.
    • Knight, D., Ezeani, I., El-Haj, M. and Rayson, P. (2022). FreeTxt: a corpus-based qualitative text analysis toolkit. Paper presented at the British Association for Applied Linguistics Conference, September 2022, Dublin, Ireland.
    • Adolphs, S., Knight, D., Nichele, E., McClaughlin, E., Clos, J., MacAuley, D., Barnard, P. and Lang, A. (2022). Advances in Privacy-Preserving Analysis of Online Communication Data for Health Message Designers: Coronavirus Discourses in the UK. Abstract submitted to the International and Interdisciplinary Conference on Communication, Medicine and Ethics, July 2022, The Hong Kong Polytechnic University, Hong Kong.
    • Mark, G., Fitzgerald, C., McNamara, J., Knight, D., O’Keeffe, A., Adolphs, S., Clarke, L., Cowan, B., Fahey-Palma, T., Farr, F. and Peraldi, S. (2022). A Multimodal Analysis of Virtual Workplace Discourse: From corpus to classroom applications. Workshop abstract submitted to the Teaching and Language Corpora Conference, July 2022, Limerick University, Ireland.
    • Knight, D., Tovey-Walsh, B., Prosser, H., Davies, E., Morris, S. and Fitzpatrick, T. (2022). Developing vocabulary lists for adult learners of Welsh: a user-driven iterative approach. Abstract submitted to the Teaching and Language Corpora Conference, July 2022, Limerick University, Ireland.
    • Barnard, P., Clos, J., McClaughlin, E., Nichele, E., Adolphs, S., Knight, D., MacAuley, D. and Lang, A. (2022). Public health messaging for at risk populations: a UK-based case study. Paper presented at the American Association for Applied Linguistics (AAAL) annual conference, March 2022, Pittsburgh, USA.
    • Clos, J., Barnard, P., McClaughlin, E., Nichele, E., Adolphs, S., Knight, D., MacAuley, D. and Lang, A. (2022). Borders in coronavirus discourses: feedback on UK public health messages from readers of online news. Paper presented at the American Association for Applied Linguistics (AAAL) annual conference, March 2022, Pittsburgh, USA.
    • Morris, J., Ezeani, I., Gruffydd, I., Young, K., Davies, L., El-Haj, M. and Knight, D. (2022). Welsh Automatic Text Summarisation. Wales Academic Symposium on Language Technologies 2022, 28 January 2022, Bangor University.
    • Knight, D., Morris, S. and Fitzpatrick, T. (2021). Building the National Corpus of Contemporary Welsh: from vision to reality. Paper presented at the BAAL (British Association for Applied Linguistics) 2021 conference, Northumbria University, UK.
    • Knight, D., Morris, S. and Fitzpatrick, T. (2021). Corpus Design and Construction in Minoritised Language Contexts. Paper presented at the 12th International Corpus Linguistics Conference, July 2021, University of Limerick, Ireland.
    • Ezeani, I., Neale, S., Piao, S., Rayson, P., Donnelly, K. and Knight, D. (2019). Embedding-Based MultiTask Learning for Welsh Part-of-Speech and Semantic Taggers. Proceedings of the Association for Computational Linguistics (ACL) conference 2019, Florence, July 2019.
    • Knight, D., Morris, S., Fitzpatrick, T., Morris, J., Rayson, P., Spasić, I., Thomas, E.M., Lovell, A., Ezeani, I., Neale, S., Needs, J., Piao, S., Rees, M., Williams, L., Arman, L., Muralidaran, V., Tovey, B., Anthony, L., Cobb, T.M., Deuchar, M., Donnelly, K., McCarthy, M. and Scannell, K. (2019). Introducing CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes - The National Corpus of Contemporary Welsh). Half-day workshop delivered at the 10th International Corpus Linguistics Conference, July 2019, Cardiff University, UK.
    • Thomas, E.M., Needs, J., Fitzpatrick, T., Stonelake, M., Lovell, A., Cobb, T.M., Knight, D., Spasić, I., Kurteva, A., Neale, S., Teahan, W. and Davies, J. (2019). Developing corpus-driven pedagogical toolkits in a minority language context: the case of Welsh. Poster presented at the 10th International Corpus Linguistics Conference, July 2019, Cardiff University, UK.
    • Knight, D., Morris, S. and Fitzpatrick, T. (2019). Designing and constructing a National Corpus of Contemporary Welsh: some challenges and opportunities for minoritized languages. Paper presented at the 10th International Corpus Linguistics Conference, July 2019, Cardiff University, UK.
    • Williams, L., Neale, S. and Knight, D. (2019). CorCenCC: A web scraping approach to collecting e-language corpora. Poster presented at the 10th International Corpus Linguistics Conference, July 2019, Cardiff University, UK.
    • Neale, S., Anthony, L., Spasic, I. and Knight, D. (2019). Developing the CorCenCC Corpus Query Tools. Paper presented at the 10th International Corpus Linguistics Conference, July 2019, Cardiff University, UK.
    • Piao, S., Neale, S., Rayson, P., Knight, D., Donnelly, K. and Ezeani, I. (2019). Construction of Welsh language resources for corpus annotation tools. Paper presented at the 10th International Corpus Linguistics Conference, July 2019, Cardiff University, UK.
    • Knight, D., Morris, S. and Fitzpatrick, T. (2019). Peaks and troughs: reflections on securing and managing large academic research projects. Paper presented at the IVACS (Inter-Varietal and Applied Corpus Studies) one-day symposium on Corpus Linguistics and the Classroom, February 2019, TU Dortmund University, Germany.
    • Knight, D., Morris, S., Fitzpatrick, T., Morris, J., Rayson, P., Spasić, I., Thomas, E.M., Neale, S., Needs, J., Piao, S., Rees, M. and Williams, L. (2018). CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes – National Corpus of Contemporary Welsh): The journey so far. Paper presented at the IVACS (Inter-Varietal and Applied Corpus Studies) one-day symposium on Corpus Linguistics and the Classroom, February 2019, TU Dortmund University, Germany.
    • Knight, D., Morris, S. and Fitzpatrick, T. (2018). From vision to reality: reflections on securing and managing large research projects. Paper presented at the BAAL (British Association for Applied Linguistics) 2018 conference, York St. John University, UK.
    • Knight, D., Morris, S., Fitzpatrick, T., Morris, J., Rayson, P., Spasić, I., Thomas, E.M., Neale, S., Needs, J., Piao, S., Rees, M. and Williams, L. (2018). CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes – National Corpus of Contemporary Welsh): A demonstration. Paper presented at the BAAL (British Association for Applied Linguistics) 2018 conference, York St. John University, UK.
    • Knight, D. (2018). Corpus Design and Construction: the challenges faced by minoritized language. Paper presented at the IVACS (Inter-Varietal Applied Corpus Studies) 2018 conference, Valletta, Malta.
    • Handford, M. and Knight, D. (2018). Conceptualising context in corpus linguistics: moving beyond metadata. Paper presented at the IVACS (Inter-Varietal Applied Corpus Studies) 2018 conference, Valletta, Malta.
    • Morris, S., Knight, D. and Fitzpatrick, T. (2018). CorCenCC: applying the sociolinguistics of new speakers within a contemporary corpus of Welsh. Paper presented at the IVACS (Inter-Varietal Applied Corpus Studies) 2018 conference, Valletta, Malta.
    • Piao, S., Rayson, P., Knight, D. and Watkins, G. (2018). Towards a Welsh Semantic Annotation System. Paper to be presented at the LREC (Language Resources Evaluation) 2018 Conference, May 2018, Miyazaki, Japan.
    • Neale, S., Donnelly, K., Watkins, G. and Knight, D. (2018). Leveraging Lexical Resources and Constraint Grammar for Rule-Based Part-of-Speech Tagging in Welsh. Paper to be presented as part of the LREC (Language Resources Evaluation) 2018 Conference, May 2018, Miyazaki, Japan.
    • Rees, M., Needs, J., Williams, L., Morris, S. and Knight, D. (2018). My Welsh is rubbish: Corpus data collection in a lesser-used language context – some challenges. Paper presented at the IVACS (Inter-Varietal and Applied Corpus Studies) one-day symposium on Corpus Research in Challenging Contexts, February 2018, Maynooth University.
    • Needs, J., Rees, M., Williams, L., Morris, S. and Knight, D. (2018). Representing contemporary Welsh – who speaks it, where, when and how: Designing, collecting and transcribing CorCenCC’s spoken component. Paper presented at the IVACS (Inter-Varietal and Applied Corpus Studies) one-day symposium on Corpus Research in Challenging Contexts, February 2018, Maynooth University.
    • Knight, D., Fitzpatrick, T. and Morris, S. (2017). CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes - The National Corpus of Contemporary Welsh): An overview. Paper presented as part of the annual British Association for Applied Linguistics (BAAL) conference, September 2017, University of Leeds.
    • Morris, S., Fitzpatrick, T. and Knight, D. (2017). Creating pedagogic wordlists in an under-resourced language. Poster presented as part of the annual British Association for Applied Linguistics (BAAL) conference, September 2017, University of Leeds.
    • Rees, M., Watkins, G., Needs, J., Morris, S. and Knight, D. (2017). Creating a Bespoke Corpus Sampling Frame for a Minoritised Language: CorCenCC, the National Corpus of Contemporary Welsh. Paper presented at the CL2017 conference, University of Birmingham, Birmingham, 24-28 July 2017.
    • Piao, S., Rayson, P., Watkins, G., Knight, D. and Donnelly, K. (2017). Towards a Welsh Semantic Tagger: Creating Lexicons for A Resource Poor Language. Paper presented at the Corpus Linguistics Conference 2017, July 2017, University of Birmingham.
    • Piao, S., Rayson, P., Knight, D., Watkins, G. and Donnelly, K. (2017). Towards a Welsh Semantic Tagger: Creating Lexicons for A Resource Poor Language. Paper presented at the CL2017 conference, University of Birmingham, Birmingham, 24-28 July 2017.
    • Needs, J., Knight, D., Morris, S., Fitzpatrick, T., Thomas, E. and Neale, S. (2017). "How will you make sure the material is suitable for children?": User-informed design of Welsh corpus-based learning/teaching tools. Paper presented at the CL2017 conference, University of Birmingham, Birmingham, 24-28 July 2017.
    • Neale, S., Spasić, I., Needs, J., Watkins, G., Morris, S., Fitzpatrick, T., Marshall, L. and Knight, D. (2017). The CorCenCC Crowdsourcing App: A Bespoke Tool for the User-Driven Creation of the National Corpus of Contemporary Welsh. Paper presented at the CL2017 conference, University of Birmingham, Birmingham, 24-28 July 2017.
    • Knight, D., Morris, S., Fitzpatrick, T. and Anthony, L. (2016). Charting the vocabulary of a minoritised language: Challenges and opportunities in the creation and application of the National Corpus of Contemporary Welsh. Paper presented at the Vocab@Tokyo international conference, September 2016, Tokyo, Japan.
    • Fitzpatrick, T., Knight, D. and Morris, S. (2016). Creating pedagogical wordlists: a comparison of thematic and corpus approaches. Poster presented at the Pacific Second Language Research Forum (PacSLRF2016), September 2016, Tokyo, Japan.
    • Knight, D., Fitzpatrick, T. and Morris, S. (2016). CorCenCC - Corpws Cenedlaethol Cymraeg Cyfoes (The National Corpus of Contemporary Welsh). WISERD (Wales Institute of Social and Economic Research, Data and Methods), July 2016, Swansea University.
    • Handford, M. and Knight, D. (2016). Corpus-informed discourse analysis: a methodology for exploring context in spoken corpora. Paper presented at the IVACS 2016 conference, June 2016, Bath Spa University.
    • Knight, D., Neale, S., Spasic, I., Morris, S. and Fitzpatrick, T. (2016). Crowdsourcing corpus construction: contextualizing plans for CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes - The National Corpus of Contemporary Welsh). Paper presented at the IVACS 2016 conference, June 2016, Bath Spa University.
    • Needs, J., Rees, M., Watkins, G., Morris, S., Knight, D. and Fitzpatrick, T. (2016). CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes – The National Corpus of Contemporary Welsh): Challenges and applications in a minoritised language context. Paper presented at the IVACS 2016 conference, June 2016, Bath Spa University.
    • Piao, S., Rayson, P., Archer, D., Bianchi, F., Dayrell, C., El-Haj, M., Jiménez, R-M., Knight, D., Křen, M., Löfberg, L., Nawab, R., Shafi, J., Teh, P-L. and Mudraya, O. (2016). Lexical Coverage Evaluation of Large-scale Multilingual Semantic Lexicons for Twelve Languages. Paper delivered at the LREC (Language Resources Evaluation) 2016 Conference, May 2016, Slovenia.
    • Adolphs, S., Knight, D., Ofemile, A. and Clark, L. (2016). Crowdsourcing new communities of discourse: analysing human-computer interaction in different contexts. Paper presented at the AAAL (American Association of Applied Linguistics) conference, March 2016, Florida, USA.
    • Knight, D. and Adolphs, S. (2015). Language in the Digital Age: Revisiting the Speech-Writing Continuum. Paper delivered as part of the Topics in Corpus Linguistics for Social Media Research workshop, Corpus Linguistics 2015, July 2015, Lancaster University.
    • Knight, D. (2014). The ‘spokenness’ of e-language. Paper presented at the Corpus Linguistics in the South Conference, November 2014, Reading University.
    • Seedhouse, P. and Knight, D. (2014). Using technology to solve a research problem. Paper presented at the British Association for Applied Linguistics (BAAL) annual meeting, September 2014, Warwick University.
    • Adolphs, S. and Knight, D. (2014). Capturing Formulaic Sequences 'in the wild'. Paper presented as part of the New Insights into the Acquisition, Assessment and Pedagogy of Formulaic Language colloquium held at the American Association for Applied Linguistics conference, 23/3/14, Portland, Oregon.
    • Knight, D. (2013). Gesture and talk ‘in the wild’. Paper presented at the American Association for Applied Linguistics 2013 Conference (AAAL), March 2013, Dallas, Texas.
    • Knight, D. (2012). Gesture and talk ‘in the wild’. Paper presented at the British Association for Applied Linguistics (BAAL) annual meeting, September 2012, Southampton University.
    • Adolphs, S. and Knight, D. (2012). Formality and professional discourse in online contexts. Paper presented at the British Association for Applied Linguistics (BAAL) annual meeting, September 2012, Southampton University.
    • Knight, D. and Adolphs, S. (2012). CANELC: Cambridge and Nottingham eLanguage Corpus. Paper presented at the Inter-Varietal Applied Corpus Studies (IVACS) symposium, January 2012, Cambridge University.
    • Walsh, S. and Knight, D. (2012). Investigating small group teaching in a higher education context. Paper presented at the CUP English Profile Seminar, February 2012, Cambridge University.
    • Knight, D. and Walsh, S. (2012). Investigating small group teaching in a higher education context. Paper presented at the Inter-Varietal Applied Corpus Studies (IVACS) symposium, January 2012, Cambridge University.
    • Knight, D., Mullany, L., Adolphs, S., Harvey, K., Hunt, D., Smith, C. and Atkins, S. (2011). New Developments in multi-modal Corpus Analysis. Paper presented at World Congress of Applied Linguistics (AILA), August 2011, Beijing, China.
    • Knight, D. and Adolphs, S. (2011). Experiencing space and place: A multi-modal corpus approach. Paper presented at the BAAL annual conference, September 2011, Bristol UWE.
    • Knight, D. and Adolphs, S. (2010). Space, place and SMS: capturing context and network in multimodal corpus development. Paper presented at the BAAL annual conference, September 2010, Edinburgh.
    • Adolphs, S., Carter, R. and Knight, D. (2010). Second phase multi-modal corpora: Heterogeneous datasets for linguistic analysis. Paper delivered at the 5th Inter-Varietal Applied Corpus Studies (IVACS) conference, June 2010, University of Edinburgh.
    • Knight, D., Tennent, P., Adolphs, S. and Carter, R. (2010). Developing heterogeneous corpora using the Digital Replay System (DRS). Paper delivered at the LREC 2010 (Language Resources Evaluation Conference) Workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, May 2010, Malta.
    • Knight, D., Carter, R. and Adolphs, S. (2010). Corpora and context: A discussion of ‘Thrill’. Paper presented at the 31st ICAME conference, May 2010, Giessen University, Germany.
    • Knight, D. (2010). Language, Corpora and Context: A 'Thrilling' Case Study. Paper presented at the IVACS Annual Research Symposium, January 2010, University of Leeds.
    • Adolphs, S. and Knight, D. (2009). Language, Corpus and Context: Record, represent and replay. Presentation delivered as part of the Second and Foreign Language Pedagogy Seminar Series, October 2009, School of Education, Nottingham University.
    • Carter, R., Adolphs, S. and Knight, D. (2009). Language, Corpus and Context: ubiquitous computing and corpus development. Paper delivered at the BAAL annual conference, September 2009, Newcastle.
    • Knight, D., Adolphs, S., Carter, R. and Tennent, P. (2009). A multi-modal approach to the construction and analysis of spoken corpora. A two-hour workshop co-ordinated at the Corpus Linguistics 2009 conference, July 2009, Liverpool University.
    • Knight, D. (2009). Collecting and collating heterogeneous datasets for multi-modal corpora. Paper presented at the Corpus Linguistics 2009 conference, July 2009, Liverpool University.
    • Adolphs, S., Carter, R., Knight, D., Brundell, P. and Tennent, P. (2009). Constructing and interrogating linguistic corpora using heterogeneous datasets. A two-hour workshop co-ordinated at the 5th International Conference on e-Social Science (ICeSS), Cologne, June 2009.
    • Adolphs, S., Knight, D. and Carter, R. (2009). Redefining context in communication: a multi-modal perspective. Paper presented at the 30th ICAME conference, May 2009, Lancaster University.
    • Knight, D., Adolphs, S. and Carter, R. (2009). Multi-modal corpus construction and analysis. Poster presented at the 30th ICAME conference, May 2009, Lancaster University.
    • Knight, D. and Adolphs, S. (2009). Corpus Perspectives: from production to reception. Paper presented at the Inter-Varietal Applied Corpus Studies (IVACS) Annual Research Symposium, January 2009, University of Edinburgh.
    • Adolphs, S. and Knight, D. (2008). Analysing Discourse Markers: A Multi-Modal Approach. Paper presented at the BAAL 2008 annual conference, September 2008, University of Swansea.
    • Brundell, P., Tennent, P., Greenhalgh, C., Knight, D., Crabtree, A., O’Malley, C., Ainsworth, S., Clarke, D., Carter, R. & Adolphs, S. (2008). Digital Replay system (DRS): A Tool for Interaction Analysis. Paper delivered at the International Conference for the Learning Sciences 2008 (ICLS), Utrecht, The Netherlands. June-July 2008.
    • Brundell, P., Knight, D., Tennent, P., Naeem, A., Adolphs, S., Ainsworth, S., Carter, R., Clarke, D., Crabtree, A., Greenhalgh, C., O’Malley, C., Pridmore, T. and Rodden, T. (2008). The experience of using Digital Replay System for social science research. Paper presented at the 4th International Conference on e-Social Science (ICeSS), the University of Manchester, June 2008.
    • Knight, D. and Evans, D. (2008). Multi-Modal Corpora, Discourse and Gesture. Paper presented at AAAL 2008, Washington DC, US.
    • Knight, D. (2008). Gesturing power in dyadic conversations: A study of academic supervisory meetings. Paper delivered at the 4th Inter-Varietal Applied Corpus Studies (IVACS) conference, June 2008, University of Limerick.
    • Knight, D. and Tennent, P. (2008). Introducing DRS: A tool for the future of Corpus Linguistic research and analysis. Poster presentation with demo, delivered at the 6th Language Resources and Evaluation Conference (LREC), Palais des Congrés Mansour Eddahbi, Marrakech, Morocco.
    • Knight, D., Adolphs, S., Tennent, P. and Carter, R. (2008). The Nottingham Multi-Modal Corpus: A Demonstration. Paper delivered during the ‘Multimodal Corpora’ workshop held at the 6th Language Resources and Evaluation Conference (LREC), Palais des Congrés Mansour Eddahbi, Marrakech, Morocco.
    • Adolphs, S., Knight, D. and Evans, D. (2007). Multi-modal corpora. Presentation delivered as part of the CRAL seminar series, University of Nottingham, October 2007.
    • Adolphs, S. and Knight, D. (2007). HeadTalk. Presentation delivered as part of the Technologies for Enhancing Visual Methods workshop at the 3rd International eSocial Science Conference (ICeSS), University of Michigan, US.
    • Tennent, P. and Knight, D. (2007). Multi-modal corpora: Adapting gesture recognition techniques for linguistic analysis. Poster delivered at the 3rd International eSocial Science Conference (ICeSS), October 2007, University of Michigan, US.
    • Adolphs, S., Carter, R., Knight, D. and Evans, D. (2007). e-Social Science and Applied Linguistics: a multimodal corpus case study. Paper delivered at the New Horizons in Linguistics symposium, September 2007, University of Oxford.
    • Knight, D., Evans, D., Carter, R. and Adolphs, S. (2007). Multi-modal corpus design, construction and use. Paper delivered at the BAAL 2007 annual conference, September 2007, University of Edinburgh.
    • Knight, D., Evans, D., Adolphs, S. and Carter, R. (2007). Approaching the problems: Capturing, coding and analysing gesture in multi-modal communication data. Paper delivered at the Corpus Linguistics 2007 Conference, July 2007, University of Birmingham.
    • Knight, D. (2007). HeadTalk: The development and exploration of multi-modal linguistic corpora. Paper delivered at the Annual Nottingham University Postgraduate Symposium, 2007.
    • Knight, D. (2006). Little old ladies and dodgy old men: An exploration of the representation of old age in everyday spoken discourse. Paper delivered at GLoBE conference, September 2006, University of Warsaw.
    • Carter, R., Knight, D. and Adolphs, S. (2006). Head-talk: Towards a Multi-Modal Corpus. Paper delivered at the BAAL 2006 annual conference, September 2006, University College, Cork.
    • Knight, D., Bayoumi, S., Mills, S., Crabtree, A., Adolphs, S., Pridmore, T. and Carter, R. (2006). Beyond the Text: Construction and Analysis of Multi-Modal Linguistic Corpora. Paper presented at 2nd International Conference on e-Social Science (ICeSS), Manchester, 28 - 30 June 2006.
    • French, A., Wright, M., Greenhalgh, C., Knight, D., Brundell, P., O'Malley, C., Ainsworth, S., Clarke, D. and Rodden, T. (2006). ‘Replaytool’ software in practice. Poster presented at the 2nd annual international eSocial Science Conference (ICeSS), June 2006, Manchester University.
    • Knight, D., Adolphs, S. and Carter, R. (2006). The Multi-Modal Corpus: Coding and representing data- the issues. Paper delivered at the 3rd Inter-Varietal Applied Corpus Studies conference, May 2006, University of Nottingham.
    • Knight, D. and Adolphs, S. (2006). Analysing Spoken Corpora: Methodological Issues and Technological Challenges. Delivered at the BAAL SIG Seminar (Special Interest Group: Corpus Linguistics), April 2006, The Open University, Milton Keynes.
    • Knight, D. (2006). Developing a Multi-Modal Corpus: Data Coding Issues. Paper delivered at the Inter-Varietal Applied Corpus Studies (IVACS) Annual Research Symposium, February 2006, University of Limerick.

    Committees and reviewing

    • Editorial board member of Applied Linguistics (journal, 2021+)
    • Ambassador of the Data Innovation Research Institute (DIRI) at Cardiff University. In this role I lead a special interest group (SIG) that facilitates deep interdisciplinary collaboration across the University in the area of data science (2018+).
    • Editorial board member of Elements in Corpus Linguistics (book series) published by Cambridge University Press.
    • Lead organiser and Chair of the 2020 online BAAL conference. Over 400 members of the association registered to participate in this conference.
    • Lead organiser of the biannual International Corpus Linguistic Conference (CL2019), a 5-day globally leading conference for academics working within this discipline (2018-2019).
    • Member of the ESRC’s Centres for Doctoral Training (CDT) Peer Review College (2016+)
    • Honorary Visiting Fellow at the Centre for Research in Applied Linguistics (CRAL), The University of Nottingham (May–July 2018, during Research Leave)
    • Visiting Researcher at the Department of English Language and Applied Linguistics, Swansea University (April–July 2018, during Research Leave)
    • General Secretary for BAAL, the British Association for Applied Linguistics (2013 - 2018); Meetings Secretary for BAAL (2010-2013); Postgraduate Development and Liaison Officer for BAAL (2007-2009).
    • Co-organiser of the IVACS (Inter-Varietal and Applied Corpus Studies) 2006 and IVACS 2014 conferences.
    • Editor (with Professor Svenja Adolphs) of the Routledge Handbook of English Language and the Digital Humanities [under contract].
    • Reviews Editor for the Yearbook of Corpus Linguistics and Pragmatics, 2012-2015 (Springer Verlag).
    • Editorial board member for the journal Discourse, Context and Media
    • Reviewer for International Journal of Corpus Linguistics (IJCL), Journal of Pragmatics, Context and Discourse, Corpora Journal and the BAAL annual book prize.
    • Programme committee member: Big Data and Natural Language Processing workshop hosted at IEEE Big Data, December 2016.
    • Programme committee member: 9th International Corpus Linguistics conference, July 2017, University of Birmingham; Challenges in the Management of Large Corpora + Big Data and Natural Language Processing joint meeting, July 2017, University of Birmingham.
    • Advisory Editorial Board member for the Journal of Corpus Linguistics and Pragmatics (Springer Verlag).
    • Advisory board member for Language, Texts and Society (LTS) – a journal produced at the University of Nottingham.
    • Advisory board member for CLiC – a corpus tool for the analysis of literary texts, led by Professor Mahlberg, University of Birmingham (funded by the AHRC).

    Publications

    2023

    2022

    2021

    2020

    2019

    • Ezeani, I., Piao, S., Neale, S., Rayson, P. and Knight, D. 2019. Leveraging pre-trained embeddings for Welsh Taggers. Presented at: 4th Workshop on Representation Learning for NLP, Florence, Italy, July 2019. ACL Anthology: Proceedings of the 4th Workshop on Representation Learning for NLP, Vol. W19-43. Association for Computational Linguistics. (10.18653/v1/W19-4332)
    • Spasic, I., Owen, D., Knight, D. and Artemiou, A. 2019. Unsupervised multi-word term recognition in Welsh. Presented at: Celtic Language Technology Workshop 2019, Dublin, Ireland, 19 August 2019. In: Lynn, T. et al. eds. Proceedings of the Celtic Language Technology Workshop. European Association for Machine Translation.

    2018

    2017

    2016

    2015

    2014

    2013

    2011

    2010

    • Knight, D., Tennent, P., Adolphs, S. and Carter, R. 2010. Developing heterogeneous corpora using the Digital Replay System (DRS). Presented at: Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, Malta, 18 May 2010. In: Kipp, M. et al. eds. Proceedings of the LREC 2010 (Language Resources Evaluation Conference) Workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality, May 2010, Malta. European Language Resources Association, pp. 16-21.
    • Adolphs, S. and Knight, D. 2010. Building a spoken corpus: What are the basics? In: O’Keeffe, A. and McCarthy, M. eds. The Routledge handbook of corpus linguistics. Routledge handbooks in applied linguistics. Oxford: Routledge.

    2009

    2008

    2006

    • Knight, D., Bayoumi, S., Mills, S., Crabtree, A., Adolphs, S., Pridmore, T. and Carter, R. 2006. Beyond the text: construction and analysis of multi-modal linguistic corpora. Presented at: 2nd International Conference on e-Social Science, Manchester, UK, 28-30 June 2006. Proceedings of the 2nd International Conference on e-Social Science, Manchester, 28 - 30 June 2006. ICeSS.

    Research interests:

    I am an applied linguist whose research interests lie in the areas of corpus linguistics, discourse analysis, and multimodality. I have expertise in developing methodological frameworks for crowdsourcing data collection for corpus compilation and in developing methodological approaches to the construction of minoritised language corpora. One contribution of my work has been to pioneer the development of a new research area in applied linguistics: multimodal corpus-based discourse analysis. This has included the introduction of a novel methodological approach to the analysis of the relationships between language and gesture-in-use based on large-scale real-life records of interaction (corpora).

    I also have an interest in examining the use of wearable technologies as a means of capturing language, gesture and embodied actions in naturally occurring interaction (‘in the wild’) and in developing methodological and technical frameworks for crowdsourcing data collection for corpus compilation. I am also interested in examining and developing methodological approaches to the construction of corpora in minoritised language contexts (refer to the CorCenCC webpage for more details).

    More details of my recent research projects can be found on my main University homepage. 

    Research funding:

    • 2022: £90,000 received from the Welsh Government for the ‘ThACC – Thesawrws Ar-lein Cymraeg Cyfoes - Using Word Embeddings to Create a Thesaurus of Contemporary Welsh’ project. Working with colleagues from WELSH and Computer Science at Cardiff and Lancaster Universities (with Morris as PI - I am one of the CIs), the project aims to develop an open-access, freely available online thesaurus of the Welsh language, for Welsh speakers and learners alike.
    • 2022: £178,000 received from the AHRC for the ‘Wild Swimming and Blue Spaces: Mobilising interdisciplinary knowledge and partnerships to combat health inequalities at scale’ project (with Adolphs, Nottingham as PI - I am one of the CIs). This project aims to develop a new mixed methods approach, drawing on corpus linguistics and narrative analysis, to create effective public health messaging (with a focus on the benefits of wild swimming) that includes content from a range of academic disciplines. Visit the project website here.
    • 2022: £100,000 received from the AHRC for the ‘FreeTxt: supporting bilingual free-text survey and questionnaire data analysis’ project. I am PI on this project. Working with colleagues from Lancaster University, and co-designed and co-constructed with partners Cadw and National Trust Wales, this project aims to create an innovative open-source online free-text analysis tool that enables the quick and easy analysis of English and Welsh language data: FreeTxt. Visit the project website here.
    • 2021: £14,988 received from the ESRC Impact Acceleration Account (IAA). This is for a project, working with the National Centre for Learning Welsh, that will support the creation of vocabulary lists based on data extracted from CorCenCC (National Corpus of Contemporary Welsh).
    • 2021: £90,000 received from the Welsh Government for the ‘Welsh Automatic Text Summarisation’ project. Working with colleagues from WELSH and Computer Science at Cardiff and Lancaster Universities, the project aims to build a summarisation tool that will allow professionals to quickly summarise long documents for efficient presentation. Visit the project website here.
    • 2021: £450,000 received from AHRC for the 'Coronavirus Discourses: linguistic evidence for effective public health messaging' project. Developed in partnership with Public Health England, Public Health Wales and NHS Education for Scotland, this project addressed key challenges that the coronavirus pandemic presents in relation to understanding the flow and impact of public health messages as reflected in public and private discourses. Led by Svenja Adolphs (Nottingham - I was CI on this project), this interdisciplinary project carried out the first large scale analysis of the trajectories of public health messages relating to the coronavirus pandemic in the UK [£465,000]. Visit the project website here
    • 2020: £90,000 received from the Welsh Government for the 'Learning English-Welsh bilingual embeddings and applications in text categorisation' project. This was an interdisciplinary project involving Irena Spasić, Padraig Corcoran, Luis Espinosa-Anke (School of Computer Science and Informatics – COMSC) and Geraint Palmer (School of Mathematics) as Co-Investigators (CIs). I was PI on this project. For more information, see here.
    • 2019: £90,000 received from the Welsh Government for the ‘Welsh words by numbers: “Wales” + “capital” = “Cardiff”’ project (focusing on word embeddings for Welsh). I am a CI on this project. 
    • 2019: £2,100 received for the internally funded CUROP project entitled ‘FreeTxt: analysing free-text comments using a corpus-based approach’. I was PI on this project. 
    • 2019: £20,000 received from the Welsh Government for the Welsh Stemmer project. I was CI on this project, with Irena Spasić (Cardiff) as PI.
    • 2018: £2,100 received for the internally funded CUROP project entitled ‘Corpws Cenedlaethol Cymraeg Cyfoes: National Corpus of Contemporary Welsh – a focus on spoken data’. I was PI on this project (with Lowri Williams).
    • 2018: £2,100 received for the internally funded CUROP project entitled ‘Corpws Cenedlaethol Cymraeg Cyfoes: National Corpus of Contemporary Welsh – semantic tagging and data annotation’. I was PI on this project (with Paul Rayson).
    • 2017: £19,964 received from the Grant Cymraeg 2050 fund to automatically construct a WordNet for Welsh, a lexical database in which words are grouped into sets of synonyms (synsets), which are then organised into a network of lexico-semantic relationships. I was CI on this project. 
    • 2017: £2,000 received (as PI) from the British Council in support of a launch event for the CorCenCC project (held on 28th February 2017).
    • 2016-19: £1,800,000 received from the ESRC and AHRC for the CorCenCC project (Corpws Cenedlaethol Cymraeg Cyfoes (The National Corpus of Contemporary Welsh): A community driven approach to linguistic corpus construction). I am PI on this project. 
    • 2016: £1,600 received for the internally funded CUROP project entitled ‘Analysis on non-verbal communication in construction industry interactions’. I was CI on this project (with Mike Handford).
    • 2015: £24,999 received from the AHSS (College of Arts, Humanities and Social Sciences) Network Digital Humanities Initiator Bid. The aim of this network bid is to build significant capacity in Digital Humanities at Cardiff University. I was CI on this project.
    • 2014: £3,850 received from the Newcastle University Faculty Research Fund for a project entitled ‘Crowdsourcing data collection for corpus compilation: Scoping methods for the future’ (with Patrick Olivier).
    • 2013: £3,900 received from the Newcastle University Faculty Bid Preparation Fund for Corpws Cenedlaethol Cymraeg (CorCenCC) to support the development of the bid application.
    • 2013: £17,500 funding received from the British Council Aptis Research Grants for a project entitled ‘Characterising interactional competence in higher education small group talk’. I am a Co-I on this project with Steve Walsh (PI) and Paul Seedhouse.
    • 2012: £3,920 received from the Newcastle University Faculty Research Fund for a pilot project entitled ‘Gesture and talk ‘in the wild’’ (with Professor Olivier).

    Research experience/positions:

    • Research Fellow on Crowd Sourcing: A Toolkit-based Approach (2010-2011). RCUK Grant EP/G065802/1 Horizon Digital Economy Research. Work carried out at The University of Nottingham.
    • Research Associate on DReSS II (Understanding Digital Records for eSocial Science) (2008-2011). ESRC Grant No. RES-149-25-1067. Work carried out at The University of Nottingham.
    • Research Assistant on DReSS I (Understanding Digital Records for eSocial Science) (2005-2008). ESRC Grant No. RES-149-25-0035. RA on Headtalk (2005-2006). ESRC Grant No. RES-149-25-1016. Work carried out at The University of Nottingham.
    • I have also been involved in work with the Cambridge University Press (CUP) on the English Profile (EP) Project and from 2009-2012 I was involved in the construction of CANELC, the Cambridge and Nottingham e-Language Corpus (working with CUP and staff from the University of Nottingham), the first large-scale corpus of digital discourse.

    Refer to my biography page for a list of invited plenaries and conference presentations. 

    Supervision

    • Corpus linguistics
    • Corpus pragmatics
    • Language use in context
    • Non-verbal communication
    • Discourse analysis
    • Digital interaction (‘E-language’)

    In addition to the students listed above, I also supervised the RAs involved in work on the CorCenCC project and am a co-supervisor of PhD students Vigneshwaran Muralidaran (with Irena Spasic, School of Computer Science and Informatics) and Debora Cabral Lima (with Christopher Heffer, ENCAP).

    Current supervision

    David Griffin

    Research student

    Emily Powell

    Research student

    Jennifer Jordan-Grote

    Research student

    Kate Barber

    Research student

    Past projects

    I previously supervised:

    • Shanru Yang, who completed her PhD between 2011 and 2014 at Newcastle University. Her thesis was entitled: Investigating discourse markers in Chinese college EFL teacher talk: A multi-layered analytical approach. This was a co-supervision with Steve Walsh.
    • Rezan Mohammed Alharbi, who completed her PhD in 2016 at Newcastle University. Her thesis was entitled: Acquisition of Lexical Collocations: A corpus-assisted contrastive analysis and translation approach. This was a co-supervision with Mei Lin.