We present a new method for identifying linked news stories within a large collection of articles. It uses Information Retrieval (IR) techniques to measure the textual closeness between pairs of articles, then applies a network approach using Infomap to optimize the partition of the collection into distinct stories. We distinguish IR approaches from other popular approaches to the quantitative analysis of text, including dictionary methods, supervised methods, and automated clustering methods. We argue for the value of IR approaches as a means of quantitatively analysing textual data, particularly when trying to identify small amounts of relevant text within a very large corpus. This paper serves both as a demonstration of how a real-world research question can benefit from the application of IR techniques and as a substantive contribution to computational research projects whose unit of analysis is the story rather than the article.
Tom Nicholls, Blavatnik School of Government, University of Oxford
Jonathan Bright, Oxford Internet Institute, University of Oxford
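
The two-stage idea in the abstract (score pairs of articles for textual closeness, then partition the resulting network into stories) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes TF-IDF cosine similarity as the closeness measure, which the abstract does not specify, and it swaps in simple connected components for Infomap so that the example needs only the standard library. The function names, the sample articles, and the 0.3 threshold are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())
    # Smoothed inverse document frequency (an assumed weighting scheme).
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_stories(docs, threshold=0.3):
    """Link article pairs whose similarity meets the threshold, then
    return the connected components as lists of article indices.

    Connected components stand in for Infomap here; a community
    detection algorithm would split loosely chained groups that this
    simpler step merges.
    """
    vectors = tfidf_vectors(docs)
    parent = list(range(len(docs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(docs)), 2):
        if cosine(vectors[i], vectors[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical sample articles: two about a flood, two about interest rates.
articles = [
    "flood warnings issued as river levels rise across the county",
    "river levels rise further and flood warnings remain across the county",
    "central bank lifts interest rates to curb inflation",
    "interest rates lifted again while central bank fights inflation",
]
print(group_stories(articles))  # [[0, 1], [2, 3]]
```

Comparing every pair is quadratic in the number of articles, so a corpus of realistic size would need an index or a candidate-filtering step before the pairwise scoring. The threshold also matters: set too low, unrelated articles chain together into one component, which is the kind of over-merging a partition-optimizing method such as Infomap is meant to address.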