Predicting flu epidemics using Twitter and historical data

Giovanni Stilo, Paola Velardi, Alberto E. Tozzi, Francesco Gesualdo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Recently there has been growing attention to the use of web and social data to improve traditional prediction models in politics, finance, marketing and health. Although a correlation between observed phenomena and related social data has been demonstrated in many cases, the effectiveness of social data for long-term or even mid-term predictions has not been shown. In epidemiological surveillance, the problem is compounded by the fact that infectious disease models (such as susceptible-infected-recovered-susceptible, SIRS) are very sensitive to current conditions, so that small changes can produce remarkable differences in future outcomes. Unfortunately, current or nearly current conditions keep changing as data are collected and updated by epidemiological surveillance organizations. In this paper we show that the time series of Twitter messages reporting a combination of symptoms that match the influenza-like illness (ILI) case definition represents a more stable and reliable source of information on "current conditions", to the point that it can replace, rather than simply integrate, official epidemiological data. We estimate the effectiveness of these data at predicting current and past flu seasons (17 seasons overall), in combination with official historical data on past seasons, obtaining an average correlation of 0.85 over a period of 17 weeks covering the flu season.
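The sensitivity of SIRS-type models to current conditions, and the use of correlation over a 17-week window as an evaluation measure, can be illustrated with a minimal sketch. This is not the authors' model or code: the function names, the parameter values (`beta`, `gamma`, `xi`), the daily Euler step and the choice of Pearson correlation are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def simulate_sirs(i0, beta=0.5, gamma=0.25, xi=0.01, weeks=17, steps_per_week=7):
    """Euler-integrate a SIRS model on population fractions.

    All parameter values are illustrative assumptions, not taken from the paper.
    Returns the infected fraction at the end of each week.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    weekly_infected = []
    for _ in range(weeks):
        for _ in range(steps_per_week):
            new_infections = beta * s * i   # S -> I
            new_recoveries = gamma * i      # I -> R
            lost_immunity = xi * r          # R -> S
            s += lost_immunity - new_infections
            i += new_infections - new_recoveries
            r += new_recoveries - lost_immunity
        weekly_infected.append(i)
    return weekly_infected


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Two runs that differ only in the assumed initial infected fraction.
baseline = simulate_sirs(i0=0.0010)
perturbed = simulate_sirs(i0=0.0012)

for week, (a, b) in enumerate(zip(baseline, perturbed), start=1):
    print(f"week {week:2d}: baseline={a:.4f} perturbed={b:.4f}")
print(f"correlation over 17 weeks: {pearson(baseline, perturbed):.3f}")
```

Comparing the two printed trajectories shows how a small change in the assumed starting condition shifts the weekly curve, which is the kind of instability the abstract attributes to revisions of official surveillance data.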

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 164-177
Number of pages: 14
Volume: 8609 LNAI
ISBN (Print): 9783319098906
DOIs: https://doi.org/10.1007/978-3-319-09891-3_16
Publication status: Published - 2014
Event: 2014 International Conference on Brain Informatics and Health, BIH 2014 - Warsaw, Poland
Duration: Aug 11 2014 to Aug 14 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8609 LNAI
ISSN (Print): 03029743
ISSN (Electronic): 16113349

Other

Other: 2014 International Conference on Brain Informatics and Health, BIH 2014
Country: Poland
City: Warsaw
Period: 8/11/14 to 8/14/14

Keywords

  • epidemiological surveillance
  • predictability of health-related phenomena
  • Twitter mining

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

    Stilo, G., Velardi, P., Tozzi, A. E., & Gesualdo, F. (2014). Predicting flu epidemics using Twitter and historical data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8609 LNAI, pp. 164-177). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8609 LNAI). Springer Verlag. https://doi.org/10.1007/978-3-319-09891-3_16