Fine-hearing Google Home: why silence will not protect your privacy

Caputo D.;Verderame L.;Ranieri A.;Merlo A.;Caviglione L.
2020-01-01

Abstract

Smart speakers and voice-based virtual assistants are used to retrieve information, interact with other devices, and command a variety of Internet of Things (IoT) nodes. To this aim, smart speakers and voice-based assistants typically take advantage of cloud architectures: the user's vocal commands are sampled, sent through the Internet to be processed, and transmitted back for local execution, e.g., to activate an IoT device. Unfortunately, even if privacy and security are enforced through state-of-the-art encryption mechanisms, the features of the encrypted traffic, such as the throughput, the size of protocol data units, or the IP addresses, can leak critical information about the habits of the users. From this perspective, in this paper we showcase this kind of risk by exploiting machine learning techniques to develop black-box models that classify traffic and implement privacy-leaking attacks automatically. We prove that such traffic analysis makes it possible to detect the presence of a person in a house equipped with a Google Home device, even if that person does not interact with the smart device. We also present a set of experimental results collected in a realistic scenario, and propose possible countermeasures.
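The side-channel attack the abstract describes, inferring user presence from features of encrypted traffic, can be illustrated with a minimal, self-contained sketch. The feature set (mean PDU size, throughput, packets per window) and the synthetic distributions below are illustrative assumptions, not the paper's dataset or feature set, and a simple nearest-centroid classifier stands in for the black-box machine-learning models the authors actually train:

```python
# Hedged sketch: presence detection from encrypted-traffic features.
# The feature distributions are invented placeholders, NOT measured data.
import random

random.seed(42)


def traffic_window(person_present):
    """Synthetic per-window side-channel features of encrypted traffic:
    [mean PDU size (bytes), throughput (kB/s), packets per window]."""
    if person_present:
        # Assumed: an occupied house produces heavier, chattier traffic.
        return [random.gauss(900, 60), random.gauss(40.0, 5.0), random.gauss(120, 15)]
    # Assumed: an empty house yields only sparse keep-alive traffic.
    return [random.gauss(300, 60), random.gauss(5.0, 2.0), random.gauss(20, 8)]


def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(len(rows[0]))]


def train(samples_per_class=200):
    """Learn one centroid per class from labelled traffic windows."""
    present = [traffic_window(True) for _ in range(samples_per_class)]
    absent = [traffic_window(False) for _ in range(samples_per_class)]
    return {"present": centroid(present), "absent": centroid(absent)}


def classify(model, features):
    """Assign a window to the class with the nearest centroid."""
    def sq_dist(c):
        return sum((f - c[i]) ** 2 for i, f in enumerate(features))
    return min(model, key=lambda label: sq_dist(model[label]))


model = train()
print(classify(model, traffic_window(True)))
```

The key point the example makes is the one the abstract argues: no decryption is needed, since coarse statistics of the ciphertext stream alone separate the two behavioural classes.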
Files in this record:
File: jowua-v11n1-4.pdf (publisher's version, open access, Adobe PDF, 845.32 kB)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1015224
Citations
  • Scopus: 23