APPLICATION OF MACHINE LEARNING IN THE DIAGNOSTIC WORK-UP OF TELOMERE BIOLOGY DISORDERS

MASSACCESI, ERIKA
2024-05-28

Abstract

Telomere biology disorders (TBDs) are a heterogeneous group of diseases characterized by germline mutations in genes encoding proteins involved in telomere length (TL) homeostasis. Telomere shortening has been shown to cause serious, multifaceted degenerative diseases, including bone marrow failure; lung, liver, and vascular disease; abnormalities of the skin, hair and other skin appendages, oral cavity, eyes, and gastrointestinal tract; and an increased risk of cancer, primarily epithelial cancers of the head and neck. The broad spectrum of TBDs, their highly variable and often misleading clinical phenotype, and their incomplete penetrance make diagnosis very challenging, especially in patients who lack the classical features and whose genetic findings are inconclusive. In this study, we developed an alternative to the classical diagnostic work-up for TBDs, with the aim of providing a useful tool for cryptic cases in which the appropriate diagnostic pathway is unclear. Given the growing use of machine learning (ML) in clinical research, we applied it to our cohort of patients followed at the Hematology Unit of the Giannina Gaslini Institute from 1989 to 2023. Inclusion criteria were persistent cytopenia, suspected or confirmed TBD even in the absence of cytopenia, or a family history of TBD. Both main classes of ML algorithms, supervised and unsupervised, were used. A strength of our work is that, using two different and independent ML approaches, we were able to characterize a heterogeneous group of patients by defining well-defined clusters (unsupervised analysis) and by building a prediction score (supervised analysis) from typical and atypical clinical and biochemical findings. Another important strength is the novelty of the approach: to the best of our knowledge, ML has never been applied to rare diseases such as TBDs, and it may open new avenues of analysis for these and other rare disorders.
ML is still an experimental tool that requires further training and validation before it can become an established diagnostic instrument. It can nonetheless prompt additional patient monitoring and regular, accurate re-evaluation of variants of uncertain significance (VUS), since the interpretation of a variant may change over time as more information is acquired. Our future aim is to train and validate a new model on a larger number of cases by including patients referred to other hospitals.
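To make the distinction between the two analyses concrete, the following is a minimal sketch and not the pipeline used in the thesis. The abstract does not name the algorithms, so k-means and logistic regression are used here purely as stand-ins for an unsupervised and a supervised method. The two features, the values, and the labels are made-up toy data, and every function name is hypothetical.

```python
import math

# Toy, made-up values (NOT patient data). Each row is one hypothetical patient:
# (telomere-length z-score, platelet-count z-score). Both features are assumptions.
FEATURES = [
    (-2.1, -1.2), (-1.8, -0.9), (-2.4, -1.5), (-1.9, -1.1), (-2.2, -0.8), (-1.7, -1.3),
    (0.4, 0.6), (0.7, 0.2), (0.1, 0.5), (0.6, 0.9), (0.3, 0.1), (0.8, 0.4),
]
# Hypothetical outcome labels: 1 = confirmed TBD, 0 = TBD excluded.
OUTCOMES = [1] * 6 + [0] * 6


def kmeans(points, k, iters=20):
    """Unsupervised step: plain k-means, no outcome labels are used."""
    # Deterministic start: centroids are taken at evenly spaced indices.
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment: each point goes to the nearest centroid (squared distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update: each centroid moves to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def predict_score(x, weights, bias):
    """Supervised score: logistic function of a weighted sum, between 0 and 1."""
    z = bias + sum(w * xj for w, xj in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(points, outcomes, lr=0.5, epochs=500):
    """Supervised step: logistic regression fitted by batch gradient descent."""
    weights = [0.0] * len(points[0])
    bias = 0.0
    n = len(points)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(points, outcomes):
            err = predict_score(x, weights, bias) - y  # gradient of the log-loss
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


if __name__ == "__main__":
    cluster_labels, _ = kmeans(FEATURES, k=2)
    weights, bias = fit_logistic(FEATURES, OUTCOMES)
    for x, cluster in zip(FEATURES, cluster_labels):
        print(x, "cluster", cluster, "score", round(predict_score(x, weights, bias), 3))
```

In this sketch the clustering step sees only the features, whereas the score is fitted against known outcomes. That is the sense in which the two analyses are independent of each other.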
Keywords: telomere biology disorders; machine learning

Type: Doctoral thesis

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1176016