Validation of an Automated System for the Extraction of a Wide Dataset for Clinical Studies Aimed at Improving the Early Diagnosis of Candidemia

Daniele Roberto Giacobbe; Sara Mora; Alessio Signori; Giorgia Brucci; Cristina Campi; Sabrina Guastavino; Cristina Marelli; Alessandro Limongelli; Antonio Vena; Malgorzata Mikulska; Anna Marchese; Antonio Di Biagio; Mauro Giacomini; Matteo Bassetti
2023-01-01

Abstract

There is increasing interest in assessing whether machine learning (ML) techniques could further improve the early diagnosis of candidemia among patients with a consistent clinical picture. The objective of the present study was to validate the accuracy of a system that automatically extracts a large number of features of candidemia and/or bacteremia episodes from hospital laboratory software, as the first phase of the AUTO-CAND project. Manual validation was performed on a representative, randomly extracted subset of 381 episodes of candidemia and/or bacteremia. For all variables, the automated organization of laboratory and microbiological data into structured features yielded ≥99% correct extractions (confidence interval < ±1%). The final automatically extracted dataset consisted of 1338 episodes of candidemia (8%), 14,112 episodes of bacteremia (90%), and 302 episodes of mixed candidemia/bacteremia (2%). This dataset will serve to assess the performance of different ML models for the early diagnosis of candidemia in the second phase of the AUTO-CAND project.
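The figures in the abstract can be checked with a few lines of arithmetic. The sketch below recomputes the share of each episode type from the reported counts and estimates the margin of error for a 99% proportion observed on 381 episodes. The 95% confidence level (z = 1.96) and the normal-approximation (Wald) formula are assumptions made here for illustration; the abstract does not state which interval method the authors used, and the function name `wald_margin` is hypothetical.

```python
import math

# Episode counts as reported in the abstract.
counts = {"candidemia": 1338, "bacteremia": 14112, "mixed": 302}
total = sum(counts.values())

# Share of each episode type, rounded to whole percentages.
shares = {name: round(100 * n / total) for name, n in counts.items()}


def wald_margin(p: float, n: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation interval for a proportion.

    Assumption: z = 1.96 (95% confidence); the source does not state the
    confidence level or the interval method.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Margin for 99% correct extractions on the 381 validated episodes.
margin = wald_margin(0.99, 381)

if __name__ == "__main__":
    print(f"total episodes: {total}")
    for name, pct in shares.items():
        print(f"{name}: {pct}%")
    print(f"approximate margin of error: {100 * margin:.2f} percentage points")
```

Under these assumptions the margin comes out just below one percentage point, which is consistent with the "< ±1%" reported for the validation subset.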
Files in this item:
Validation of an Automated System for the Extraction of a Wide Dataset for Clinicalpdf.pdf
Access: open access
Type: Post-print document
Size: 906.9 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1112558