AllertaLOM: privacy analysis of a COVID-19 tracking app

APPLICATION CONTEXT

AllertaLOM (GooglePlay) is a multi-service Android and iOS application developed by Lombardy, an administrative region of Italy.
The main purpose of AllertaLOM is to provide a single point of information for alerts and notifications about the territory. Its scope is therefore restricted to the Lombardy region alone.

Between March and April 2020, the scope of the application was extended to address the emerging COVID-19 threat. A new feature, called "CERCACOVID", was developed. It tracks users' clinical conditions through the voluntary submission of a questionnaire.

The application was promoted through an SMS campaign that pushed an advertisement (a bulk text message) to all phones registered on Lombardy's mobile cells, without any user consent. The message read:

Regione Lombardia-CercaCovid: scarica app AllertaLOM e compila ogni giorno il questionario anonimo sul tuo stato di salute. Aiuterai a tracciare mappa contagio.

(EN translation) Regione Lombardia-CercaCovid: download the AllertaLOM app and fill out the anonymous questionnaire on your health every day. You will help draw the contagion map.

Here you can download the official document used to legitimize the campaign, while at this address you can find more details about the "CERCACOVID" project and the AllertaLOM application.

PRIVACY POLICY

Since the application collects sensitive clinical information about people, the authors tried to minimize the negative impact on public opinion by using the term "anonymous" in several places in both the advertising campaign and the privacy policy.

The application clearly does more than collect the clinical status of citizens: it also keeps track of it over time, since submission is encouraged on a daily basis. An accidental disclosure, should the data be anonymized incorrectly (or not at all), would have severe negative effects on people's privacy.

Relevant sections of the displayed Terms of Service document are:

  • [..] AllertaLOM ospita al proprio interno un servizio di raccolta dati anonimi [..]
    [..] AllertaLOM hosts an anonymous data collection service [..]
  • [..] i dati raccolti in forma anonima attraverso il questionario CercaCOVID restano di proprieta' dell'Ente che potra' utilizzarli per i predetti scopi di indagine statistica e analisi finalizzata al contrasto della diffusione del contagio [..]
    [..] the data collected anonymously through the CercaCOVID questionnaire remain the property of the Body, which will be able to use them for the aforementioned purposes of statistical investigation and analysis aimed at countering the spread of the infection [..]
  • [..] in virtu' della forma anonima e statistica del questionario, l'ente non e' in grado di associare una specifica sintomatologia all'identita' di chi ha compilato le informazioni [..]
    [..] by virtue of the anonymous and statistical form of the questionnaire, the body is not able to associate a specific symptomatology with the identity of the person who compiled the information [..]

The relevant section of the displayed Privacy Policy document is:

  • [..] In particolare, ad ogni utente dell'applicazione e' associata una chiave generata casualmente, che viene impiegata per raccordare le informazioni relative alla sintomatologia, nel caso in cui questi voglia aggiornarne lo stato nel tempo. [...]
    [..] In particular, each user of the application is associated with a randomly generated key, which is used to link the information relating to the symptomatology, in case they want to update their status over time. [...]

RANDOMLY GENERATED KEY

If the above requirements were met, the AllertaLOM application would actually respect citizens' privacy.
As a privacy and information security researcher, but also as a "user", I am quite interested in understanding which method the developers used to create the key. In the scenario depicted by the Privacy Policy, it could be the most critical piece of the puzzle.

IONIC FRAMEWORK

The developers took advantage of the well-known open source Ionic Framework, so static code analysis did not require any disassembly or reverse engineering. The Android APK can simply be unzipped, and the whole application code (plain JavaScript, not even obfuscated) can be read with any text editor from the directory /assets/www.

This analysis is based on version 1.6.0 of the application.

THE UUID

Two functions are actually called when sending the data to the backend: getPersonalRequestParams() and getHealthRequestParams().
Interestingly, getPersonalRequestParams contains a piece of code which caught my attention:

o.prototype.getPersonalRequestParams = function(o, i) {
  return {
    // device identifier; falls back to a constant string or a random number when unavailable
    deviceUuid: this.device.uuid || (this.plt.is("cordova") ? "BAD-DEVICEUUID" : Math.random().toString()),
    firebaseToken: this.firebase.fcmToken,
    cf: o.cf || null, // this is not used, in the version analyzed it is always null
    nome: o.nome || null, // this is not used, in the version analyzed it is always null
    cognome: o.cognome || null, // this is not used, in the version analyzed it is always null
    sesso: o.sesso,
    eta: o.eta,
    patologie: o.patologie,
    profilo: i
  };
};

The device UUID (Universally Unique Identifier) is sent back to the server at every survey submission. The only exception is an error case, in which the Ionic framework "device plugin" cannot retrieve the device's unique identifier: the value then falls back to the constant string "BAD-DEVICEUUID" on a Cordova build, or to a random number otherwise.
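The three possible outcomes of that expression can be reproduced in isolation. The following is a minimal sketch, not code from the application: resolveDeviceUuid is a hypothetical helper whose two parameters stand in for this.device.uuid and this.plt.is("cordova"), and the sample identifier is made up.

```javascript
// Hypothetical standalone restatement of the deviceUuid expression.
// `deviceUuid` stands in for this.device.uuid,
// `isCordova` stands in for this.plt.is("cordova").
function resolveDeviceUuid(deviceUuid, isCordova) {
  return deviceUuid || (isCordova ? "BAD-DEVICEUUID" : Math.random().toString());
}

// Normal case: the platform identifier (made-up value) is sent unchanged.
const onDevice = resolveDeviceUuid("a1b2c3d4e5f60718", true);

// Plugin failure inside a Cordova build: every affected user shares one constant.
const pluginFailure = resolveDeviceUuid(undefined, true);

// Non-Cordova context: a different random value on every call.
const nonCordova = resolveDeviceUuid(undefined, false);

console.log({ onDevice, pluginFailure, nonCordova });
```

Only the first branch yields a stable per-device value, and it is the branch taken whenever the plugin works as intended.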

What is the UUID and why is it relevant to users' privacy?

Every Android and iOS device provides applications with an identifier (a code very similar in purpose to a serial number) that is unique to that device.
Despite being "randomly generated" by the operating system, under many circumstances it never changes for the entire life of the mobile device. Another important detail is that the application does not need any special permission to read this code.

Can the UUID be used to actively identify a user?

It depends on the operating system installed on the device. For example, as described in the official Android documentation, older versions of Android (< API level 26) do not generate a UUID with a "per APK signing key" scope, so on those versions every application on the device reads the same value. This value can therefore definitely be used to identify the user, for example by pairing the UUID with other user details read by another installed application.
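To make the pairing risk concrete, here is a minimal sketch of how two datasets sharing the same identifier could be joined. Everything in it is an assumption made for illustration: the records are invented, linkByDeviceUuid is a hypothetical helper, and the shape of the patologie field (an array) is guessed. Only the field names deviceUuid, eta, sesso and patologie come from the code shown earlier.

```javascript
// All records below are invented for illustration only.
// Survey rows as a backend might store them, keyed by the device UUID.
const surveySubmissions = [
  { deviceUuid: "a1b2c3d4e5f60718", eta: 42, sesso: "F", patologie: ["asma"] },
  { deviceUuid: "0f9e8d7c6b5a4321", eta: 67, sesso: "M", patologie: [] },
];

// Hypothetical data held by an unrelated app that read the same device-wide UUID.
const otherAppUsers = [
  { deviceUuid: "a1b2c3d4e5f60718", email: "alice@example.com" },
];

// Join the two datasets on the shared identifier.
function linkByDeviceUuid(submissions, knownUsers) {
  const identityByUuid = new Map(knownUsers.map((u) => [u.deviceUuid, u.email]));
  return submissions
    .filter((s) => identityByUuid.has(s.deviceUuid))
    .map((s) => ({ email: identityByUuid.get(s.deviceUuid), ...s }));
}

const linked = linkByDeviceUuid(surveySubmissions, otherAppUsers);
console.log(linked);
```

The "anonymous" survey row becomes attributable as soon as any second party holds the same identifier next to an identity.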

Given the sensitivity of the content managed by AllertaLOM, which relates to COVID-19 infection, I would expect the developers to adopt a more privacy-aware device identification method. One example is a truly random ID that is deleted when the application is uninstalled and regenerated every time the software is re-installed.
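One way such an install-scoped identifier could look is sketched below. This is not the application's code: getInstallId, createMemoryStorage and the storage key are hypothetical names, and the sketch assumes the storage it is given is wiped when the application is uninstalled. It uses Node.js's built-in crypto.randomUUID so that it runs standalone; inside a WebView, the page's localStorage and the Web Crypto equivalent could be passed in instead.

```javascript
const { randomUUID } = require("crypto"); // Node.js built-in (14.17+)

// Hypothetical helper: returns a random identifier tied to this installation.
// `storage` needs getItem/setItem, like the Web Storage API.
function getInstallId(storage, generate = randomUUID) {
  const KEY = "cercacovid.installId"; // hypothetical storage key
  let id = storage.getItem(KEY);
  if (id === null) {
    id = generate(); // random, not derived from any device property
    storage.setItem(KEY, id);
  }
  return id;
}

// Minimal in-memory stand-in for app-private storage, used for the demo.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (k) => (data.has(k) ? data.get(k) : null),
    setItem: (k, v) => { data.set(k, String(v)); },
  };
}

const firstInstall = createMemoryStorage();
const idA = getInstallId(firstInstall);
const idAgain = getInstallId(firstInstall); // same installation: stable value

const reinstall = createMemoryStorage(); // empty storage simulates a re-install
const idB = getInstallId(reinstall);

console.log({ idA, idAgain, idB });
```

Because the value is random rather than derived from the device, it still links successive submissions from one installation, but nothing outside the application can reproduce it.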

ADDITIONAL CONSIDERATIONS

Using an application that was not specifically developed to manage sensitive information (such as a person's clinical history) increases the risk of accidental disclosure.

The method used to pair a survey submission with a device is not privacy-friendly: the UUID can identify the device in any case, even after the application has been deleted (and even on newer Android versions, as the UUID is "generated" and not randomly picked).

Since AllertaLOM originally had a different purpose, the Android Manifest contains permissions which, even if not used in the COVID-19 survey context, could be activated or exploited by future versions of the application.
More specifically:

  • android.permission.GET_ACCOUNTS
  • android.permission.CALL_PHONE

CONCLUSION

Although this is not a complete analysis of the AllertaLOM mobile application, from a security perspective it appears that the COVID-19 survey functionality was introduced hastily, without much attention to the privacy of its users.
The developers could have used real anonymization techniques. In this version we can barely talk about pseudo-anonymization, and only if the device meets specific requirements.