How do implementation bugs affect the results of machine learning algorithms?

Leotta, Maurizio; Ricca, Filippo; Noceti, Nicoletta
2019-01-01

Abstract

Applications based on machine learning (ML) are growing in popularity in a multitude of contexts such as medicine, bioinformatics, and finance. However, there is a lack of established approaches and strategies for assuring the reliability of this category of software. This is a serious concern, since society now relies on potentially unreliable applications that could cause, in extreme cases, catastrophic events (e.g., loss of life due to a wrong diagnosis by an ML-based cancer classifier). In this paper, as a preliminary step towards addressing this problem, we used automatic mutations to mimic realistic bugs in the code of two machine learning algorithms, Multilayer Perceptron and Logistic Regression, with the goal of studying the impact of implementation bugs on their behaviour. Unexpectedly, our experiments show that about two thirds of the injected bugs are silent, since they do not influence the results of the algorithms; only in the remaining cases do the bugs emerge as runtime errors, exceptions, or modified prediction accuracy. Moreover, we discovered that about 1% of the bugs are extremely dangerous, since they drastically affect the quality of the predictions only in rare cases and with specific datasets, increasing the possibility of going unnoticed.
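The abstract describes the methodology but does not show what an injected mutation looks like in practice. The following is a minimal illustrative sketch, not the authors' actual mutation tool or subject code: a toy logistic-regression trainer (all names here are hypothetical) into which a single arithmetic-operator mutation is injected (`-=` replaced by `+=` in the weight update), after which the accuracy of the original and the mutant is compared, mirroring the kind of behavioural comparison the study performs.

```python
# Illustrative sketch only (assumed toy code, not the paper's harness):
# inject one operator mutation into a minimal logistic-regression trainer
# and compare the test accuracy of the original vs. the mutant.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, lr=0.1, epochs=200, mutated=False):
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)
        grad_w = X.T @ (p - y) / len(y)
        grad_b = np.mean(p - y)
        if mutated:
            w += lr * grad_w   # mutation: '-' replaced by '+' (gradient ascent)
        else:
            w -= lr * grad_w   # original: gradient descent
        b -= lr * grad_b
    return w, b

def accuracy(w, b, X, y):
    return np.mean((sigmoid(X @ w + b) >= 0.5) == y)

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)   # linearly separable toy labels
Xtr, ytr, Xte, yte = X[:300], y[:300], X[300:], y[300:]

for mutated in (False, True):
    w, b = train_logreg(Xtr, ytr, mutated=mutated)
    print("mutant" if mutated else "original", accuracy(w, b, Xte, yte))
```

In this sketch the original trainer classifies the held-out points well while the mutant's accuracy collapses, i.e., the bug is not silent. By contrast, a mutation in a code path that does not affect the final weights would leave the accuracy unchanged, which is consistent with the paper's finding that about two thirds of injected bugs are silent.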
ISBN: 9781450359337
Files in this record:
p1304-leotta.pdf (editorial version) — Adobe PDF, 1.97 MB, closed access

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/946394
Citations
  • PMC: ND
  • Scopus: 3
  • Web of Science: 3