Multi-document summarization based on the Yago ontology

Elena Baralis, Luca Cagliero, Saima Jabeen, Alessandro Fiori, Sajid Shah

Research output: Contribution to journal (Article)

Abstract

Sentence-based multi-document summarization is the task of generating a succinct summary of a document collection, which consists of the most salient document sentences. In recent years, the increasing availability of semantics-based models (e.g., ontologies and taxonomies) has prompted researchers to investigate their usefulness for improving summarizer performance. However, semantics-based document analysis is often applied as a preprocessing step, rather than integrating the discovered knowledge into the summarization process. This paper proposes a novel summarizer, namely Yago-based Summarizer, that relies on an ontology-based evaluation and selection of the document sentences. To capture the actual meaning and context of the document sentences and generate sound document summaries, an established entity recognition and disambiguation step based on the Yago ontology is integrated into the summarization process. The experimental results, which were achieved on the DUC'04 benchmark collections, demonstrate the effectiveness of the proposed approach compared to a large number of competitors as well as the qualitative soundness of the generated summaries.

Original language: English
Pages (from-to): 6976-6984
Number of pages: 9
Journal: Expert Systems with Applications
Volume: 40
Issue number: 17
Publication status: Published - 2013

Keywords

  • Document summarization
  • Entity recognition
  • Text mining

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Engineering (all)
