Mapping news content

26 November 2012

A smart computing technique has helped researchers at Cardiff and Bristol universities analyse news content in 2.5 million articles and could provide a new platform for social science research in the future.

Led by Professor Nello Cristianini of Bristol University's Intelligent Systems Laboratory and Professor Justin Lewis, Head of Cardiff's School of Journalism, Media and Cultural Studies, the study used artificial intelligence to examine 498 different English-language newspapers over a ten-month period.

Usually, news content analysis is labour-intensive and limited in both the sample sizes that can be processed and the kinds of questions that can be addressed. The approach used in this study automated some tasks, enabling the research team to analyse data sets significantly larger than those normally used.

The researchers found that:

  • As expected, readability measures show that online tabloid newspapers are more readable than broadsheets and use more sentimental language. Among 15 US and UK newspapers, The Sun is the easiest to read, comparable to the BBC's children's news programme, Newsround, while The Guardian is the most difficult to read. 'Sport' and 'Arts' were the most readable topics while 'Politics' and 'Environment' were the least readable. 
  • The Sun is also the most likely to use adjectives with sentiment, while The Wall Street Journal uses the fewest emotional adjectives.
  • The study found that men dominated the content of newspapers during the period analysed. A ranking of topics by the gender bias of their articles found that 'Sport' and 'Financial' articles were the most male-biased, with sports news mentioning men eight times more often than women. 'Fashion' and 'Arts' were the least biased, with 'Fashion' being one of the few topics featuring equal proportions of men and women.
  • The most appealing topics to online readers were 'Disasters', 'Crime' and 'Environment', while the least appealing topics were 'Fashion', 'Markets' and 'Prices'. The researchers also found that popular articles tend to be more readable and more linguistically subjective.

Nello Cristianini, Professor of Artificial Intelligence at the University of Bristol, speaking about the research, said: "The automation of many tasks in news content analysis will not replace the human judgement needed for fine-grained, qualitative forms of analysis, but it allows researchers to focus their attention on a scale far beyond the sample sizes of traditional forms of content analysis."

Professor Justin Lewis, Head of the School of Journalism, Media and Cultural Studies at Cardiff, added: "Even some of the more predictable findings give us pause for thought. The extent to which news is male dominated shows how far we are from gender equity across most areas of public life. The fact that articles about politics are the least readable might also explain widespread public disengagement."

The study is published online in Digital Journalism.