How do implementation bugs affect the results of machine learning algorithms?
Maurizio Leotta; Filippo Ricca; Nicoletta Noceti
2019-01-01
Abstract
Applications based on machine learning (ML) are growing in popularity in a multitude of contexts, such as medicine, bioinformatics, and finance. However, there is a lack of established approaches and strategies for assuring the reliability of this category of software. This is a serious concern, since our society now relies on (potentially) unreliable applications that could cause, in extreme cases, catastrophic events (e.g., loss of life due to a wrong diagnosis by an ML-based cancer classifier). In this paper, as a preliminary step towards a solution to this problem, we used automatic mutations to mimic realistic bugs in the code of two machine learning algorithms, Multilayer Perceptron and Logistic Regression, with the goal of studying the impact of implementation bugs on their behaviour. Unexpectedly, our experiments show that about two thirds of the injected bugs are silent, since they do not influence the results of the algorithms; only in the remaining cases do the bugs emerge as runtime errors, exceptions, or changes in prediction accuracy. Moreover, we also discovered that about 1% of the bugs are extremely dangerous, since they drastically affect prediction quality only in rare cases and with specific datasets, which increases the possibility of them going unnoticed.
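The abstract describes injecting automatic mutations into the code of ML algorithms and checking whether the results change. As a rough illustration of that idea (not the paper's actual tooling, mutation operators, or datasets), the following sketch trains a minimal logistic regression alongside a mutant with a flipped gradient-update sign, a classic mutation operator, and compares their accuracies; `train_logreg` and `grad_sign` are hypothetical names introduced here.

```python
import numpy as np

def train_logreg(X, y, lr=0.1, epochs=200, grad_sign=+1.0):
    """Minimal logistic regression via gradient descent.

    `grad_sign` simulates a hypothetical mutation that flips the
    sign of the weight update; +1.0 is the original, correct code.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        z = np.clip(X @ w + b, -60, 60)   # clip to avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(-z))      # sigmoid
        grad_w = X.T @ (p - y) / len(y)   # gradient of the log-loss
        grad_b = np.mean(p - y)
        w -= grad_sign * lr * grad_w      # mutated when grad_sign == -1.0
        b -= grad_sign * lr * grad_b
    return w, b

def accuracy(w, b, X, y):
    return np.mean(((X @ w + b) > 0).astype(int) == y)

# Toy linearly separable dataset (a stand-in for real benchmark data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

w0, b0 = train_logreg(X, y)                  # original code
w1, b1 = train_logreg(X, y, grad_sign=-1.0)  # mutant: flipped update sign
print(f"original accuracy: {accuracy(w0, b0, X, y):.2f}")
print(f"mutant   accuracy: {accuracy(w1, b1, X, y):.2f}")
# A mutant whose accuracy matches the original would be "silent";
# a large accuracy drop, as here, makes the bug observable.
```

In the study's terms, a mutant that neither crashes nor measurably changes accuracy on the datasets used would count among the silent bugs, which is why detecting them requires more than observing program output on a single dataset.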
File | Size | Format
---|---|---
p1304-leotta.pdf (closed access; published version) | 1.97 MB | Adobe PDF