Terminology-driven Radiological Information Extraction from Clinical Narratives in Multilingual Corpora

PIVETTI, SUSANNA; GIACOMINI, MAURO
2013-01-01

Abstract

The great majority of Information Extraction (IE) tools can be applied only to English texts, because their syntactic algorithms are tailored to that language. Moreover, the few multilingual applications reported in the literature are not suitable for real-time use. With the objective of providing non-English languages with the tools needed in modern medical information processing, this paper presents a real-time IE system that tags medical corpora in some non-English languages with UMLS concepts. The NLP application was evaluated with respect to both extraction accuracy and execution time on a subset of a corpus of 450,000 textual radiological reports written in Italian. The Automatic Term Recognition results were found to be superior to those observed in similar studies focused on non-English languages. The tool achieves a throughput of 26K bytes of text per second.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/568919