Information extraction from Italian medical reports: An ontology-driven approach

Natalia Viani, Cristiana Larizza, Valentina Tibollo, Carlo Napolitano, Silvia G Priori, Riccardo Bellazzi, Lucia Sacchi

Research output: Contribution to journal, Article, peer-review


OBJECTIVE: In this work, we propose an ontology-driven approach to extract events and their attributes from episodes of care described in medical reports written in Italian. For this language, shared resources for clinical information extraction are not easily accessible.

MATERIALS AND METHODS: The corpus considered in this work includes 5432 non-annotated medical reports belonging to patients with rare arrhythmias. To guide the information extraction process, we built a domain-specific ontology that includes the events and the attributes to be extracted, with related regular expressions. The ontology and the annotation system were constructed on a development set, while the performance was evaluated on an independent test set. As a gold standard, we considered a manually curated hospital database named TRIAD, which stores most of the information written in reports.
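To make the idea of an ontology that carries "related regular expressions" more concrete, the following is a minimal sketch in Python. It is not the authors' ontology or annotation system: the event classes (`ECG`, `Holter`), attribute names, patterns, and the sample sentence are all hypothetical and chosen only for illustration. The sketch assumes that each event class has a trigger pattern, and that attribute patterns are searched only within the sentence where the trigger was found, so that extracted values stay linked to their event.

```python
import re
from dataclasses import dataclass, field

# Hypothetical mini-ontology (illustrative only, not taken from the paper):
# each event class has a trigger pattern and a set of attribute patterns.
ONTOLOGY = {
    "ECG": {
        "trigger": r"\b(?:ECG|elettrocardiogramma)\b",
        "attributes": {
            "heart_rate_bpm": r"\bFC\s*:?\s*(\d{2,3})\s*bpm\b",
            "qtc_ms": r"\bQTc\s*:?\s*(\d{3})\s*ms\b",
        },
    },
    "Holter": {
        "trigger": r"\bholter\b",
        "attributes": {
            "duration_h": r"\b(\d{1,2})\s*ore\b",
        },
    },
}


@dataclass
class Event:
    event_type: str
    sentence: str
    attributes: dict = field(default_factory=dict)


def extract_events(text, ontology=ONTOLOGY):
    """Return events whose trigger matches a sentence, with linked attributes."""
    events = []
    # Naive sentence split: a period followed by whitespace (an assumption).
    for sentence in re.split(r"(?<=\.)\s+", text.strip()):
        for event_type, spec in ontology.items():
            if not re.search(spec["trigger"], sentence, re.IGNORECASE):
                continue
            attrs = {}
            # Attributes are searched only in the triggering sentence,
            # which links each value to its event.
            for name, pattern in spec["attributes"].items():
                match = re.search(pattern, sentence, re.IGNORECASE)
                if match:
                    attrs[name] = int(match.group(1))
            events.append(Event(event_type, sentence, attrs))
    return events


# Invented sample text in the style of a cardiology report.
report = (
    "ECG: ritmo sinusale, FC 62 bpm, QTc 470 ms. "
    "Holter 24 ore: rari battiti ectopici ventricolari."
)
events = extract_events(report)
for event in events:
    print(event.event_type, event.attributes)
```

Keeping the patterns inside the ontology structure, rather than in the extraction code, is what would allow such a resource to be enriched or translated without touching the matching logic.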

RESULTS: The proposed approach performs well on the considered Italian medical corpus, with a percentage of correct annotations above 90% for most of the clinical events considered. We also assessed whether the system could be adapted to another language (English), with promising results.

DISCUSSION AND CONCLUSION: Our annotation system relies on a domain ontology to extract and link information in clinical text. We developed an ontology that can be easily enriched and translated, and the system performs well on the considered task. In the future, it could be used to automatically populate the TRIAD database.

Original language: English
Pages (from-to): 140-148
Number of pages: 9
Journal: International Journal of Medical Informatics
Publication status: Published - Mar 2018

