notes on tracking apps for covid-19

Before we even start this, there is a fundamental limitation here: adoption.

In the United States, Facebook has about 180 million active users while the whole population is roughly 300 million. That’s active users, not app installs. Since a handful of people is enough to spread the virus, even that level of adoption won’t let us do complete contact tracing. Not to mention the people without smartphones. So, to start with, any app that tries to do contact tracing should communicate clearly that it is imperfect: a coarse measure whose results are full of false negatives.
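To put a rough number on that limitation, here is a minimal sketch. It assumes that a contact is only visible when both people run the app, and that adoption is independent and evenly spread across the population (both are simplifying assumptions of mine, not measured facts). The function name `pair_coverage` is made up for illustration.

```python
def pair_coverage(adoption_rate: float) -> float:
    """Fraction of contacts where BOTH people run the app.

    Assumes adoption is independent and uniform across the population,
    which is a simplification.
    """
    if not 0.0 <= adoption_rate <= 1.0:
        raise ValueError("adoption_rate must be between 0 and 1")
    return adoption_rate ** 2


# Facebook-level adoption, using the rough figures from the text.
facebook_level = 180 / 300
print(f"{facebook_level:.0%} adoption -> "
      f"{pair_coverage(facebook_level):.0%} of contacts visible")
```

Under these assumptions, even Facebook-level adoption leaves roughly two thirds of contacts invisible to the app.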

To that extent, I think attempts that use aggregated data on people’s movement are more useful for assessing the risk of infection at a macro scale.

This also depends on the phase of the spread, and I think it is too late for contact tracing. The infection rate in New York is already so high that the only way to protect yourself is to shelter at home, which a lot of places are already practicing. If the app is safe and avoids collecting location data close to your home, it won’t do anything except drain power. I think it’s still useful to build this in preparation for the next disaster, but that raises a whole new question about maintaining the software and keeping the adoption rate high.

And there is the testing rate. While countries are picking up speed and innovation is happening, there still aren’t enough tests for everyone. Ultimately, we should have periodic testing.

Let’s say we still want to make an app. Let’s start with the worst-case scenario: zero privacy.

Figure 1: current status

Fig. 1 (normal) does nothing to protect privacy. It pulls whatever location data and timestamps are available from the SDK, attaches device IDs, and sends everything to the server in plain text. If the entity that holds the central server forces users to install the app (or wear a wristband), you get a full surveillance system. From a United States perspective, the majority of people will think this is bad, but it does provide enough information to control the situation. This is basically the situation we have now: even if you avoid installing that particular app, there are numerous ways to track individuals. But it’s not all bad; researchers with proper knowledge can use that data to provide useful insights. I just think I’m personally not qualified to handle that data and the responsibility that comes with it, but with a structured protocol and legislation I think we can handle it. [1]

Still, you can easily swap phones within your family or spoof your location and mess up the data. Or your kids can play with other kids at the playground, untracked, and later infect their grandparents.

Figure 2: does this work?

So if the first method (where we are now) is what we want to avoid, let’s try a non-harmful, or at least less invasive, method. Fig. 2 (non-fancy) changes two things about the first setting. The first change is to stop collecting precise location data directly from disease carriers: ask for consent, and reduce the data to redacted points. Noise can be added, the links between points can be cut, or each point can be replaced with a geofence large enough that people are comfortable sharing it. A precise location can be coarsened by cutting digits from the coordinates. Health organizations can use a designated protocol to protect their patients’ privacy. Of course, if the data is blurred too much, it’s useless.
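As a rough illustration of the redaction ideas above (cutting digits, adding noise, and cutting the links between points), here is a minimal sketch. The function names and the default parameters are my own assumptions for illustration, not part of any existing protocol. For scale, two decimal places of latitude correspond to roughly 1.1 km.

```python
import math
import random


def truncate_coordinate(value: float, decimals: int) -> float:
    """Cut digits after `decimals` decimal places (truncate, don't round)."""
    factor = 10 ** decimals
    return math.trunc(value * factor) / factor


def redact_point(lat, lon, decimals=2, noise_deg=0.0, rng=None):
    """Optionally add uniform noise, then cut digits from both coordinates."""
    rng = rng or random.Random()
    if noise_deg:
        lat += rng.uniform(-noise_deg, noise_deg)
        lon += rng.uniform(-noise_deg, noise_deg)
    return truncate_coordinate(lat, decimals), truncate_coordinate(lon, decimals)


def redact_trace(points, decimals=2):
    """Redact every (lat, lon) point, merge duplicates, and discard the
    visiting order so that points can no longer be linked into a path."""
    return sorted({redact_point(lat, lon, decimals) for lat, lon in points})


# Hypothetical trace around lower Manhattan.
trace = [(40.7128, -74.0060), (40.7139, -74.0001), (40.7306, -73.9866)]
print(redact_trace(trace))
```

The `decimals` parameter is where the trade-off lives: fewer digits means more privacy and less usefulness for tracing.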

The second change is that we no longer collect other users’ geolocation data; instead, the comparison happens locally on each device. People can confirm that their information sits on their own device by looking at a file, and other app developers can audit the code to make sure there are no back doors.
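The local comparison could look something like the sketch below. It assumes that the health organization publishes redacted cells made of truncated coordinates plus a coarse time bucket, and that the device builds cells from its own history in the same way. The cell format, the function names, and the one-hour bucket are all hypothetical choices of mine.

```python
import math
from datetime import datetime, timezone


def to_cell(lat, lon, when, decimals=2, bucket_hours=1):
    """Map a visit to a coarse (lat, lon, time) cell made of integers.

    `when` should be a timezone-aware datetime so that every device
    computes the same bucket.
    """
    factor = 10 ** decimals
    time_bucket = int(when.timestamp() // (bucket_hours * 3600))
    return (math.trunc(lat * factor), math.trunc(lon * factor), time_bucket)


def find_overlaps(local_history, published_cells, **cell_options):
    """Return the local visits that fall in a published cell.

    Runs entirely on the device: only `published_cells` is downloaded,
    and `local_history` is never uploaded.
    """
    published = set(published_cells)
    return [visit for visit in local_history
            if to_cell(*visit, **cell_options) in published]


# Hypothetical data: one published cell, three local visits.
utc = timezone.utc
published = {to_cell(40.7128, -74.0060, datetime(2020, 4, 1, 10, 5, tzinfo=utc))}
history = [
    (40.7139, -74.0001, datetime(2020, 4, 1, 10, 30, tzinfo=utc)),  # same cell, same hour
    (40.7306, -73.9866, datetime(2020, 4, 1, 10, 30, tzinfo=utc)),  # different place
    (40.7139, -74.0001, datetime(2020, 4, 1, 12, 0, tzinfo=utc)),   # different hour
]
print(find_overlaps(history, published))
```

Exact cell matching is crude: two people standing on either side of a cell border will not match, which is one more source of false negatives.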

There is nothing fancy here, but it looks like it will work. Even so, it falls short. We lose data by asking for consent from people diagnosed positive. Let’s add another assumption, that everyone says yes. Even if the patient says yes, we can’t be sure that the user will keep using the app.

The app may be deleted, or location sharing may be turned off. Leaving your phone in the car while you’re getting groceries is another easy thing to do. Showing risk to support preventive action may not be the core purpose of contact tracing, but this is a big loss relative to what other users expect: people want to know which grocery store is safe while they still feel healthy.

There’s another trade-off in letting users save their location on their own devices: they become vulnerable to individual attacks. Right now, if something happens, we can hold responsible the big corporation that deploys the technology. But once it’s each individual’s task to manage their own data, it’s like life without banks, where everyone needs to maintain their own safe at home.

So far we have these assumptions:

  • The adoption rate is very high, at Facebook level. (I don’t have it installed on my phone.) People are well informed that the app is imperfect and that it is not an app for preventive action.

  • The app (or the SDK) is ready at a very early stage, and everyone is aware of the outbreak and has been saving their data since at least 14 days before the first patient is diagnosed positive.

  • Everyone says yes and is comfortable sharing their data through a redaction method that exposes their location to the public at a resolution high enough to be useful for contact tracing.

  • Everyone is capable of protecting their own data, stored on their own devices.

At this point, it’s too hypothetical and academic. I’m not comfortable claiming that this will be practical, or even that it will make the situation better. We’re talking about something serious that could affect people’s behavior.

Change the research question

So maybe the research question, making an app for privacy-safe contact tracing, was too much of a leap. Maybe other attempts just wanted to do contact tracing and prevent deaths, leaving aside privacy issues and potential threats. We don’t want to give up the privacy part, because that’s what we happen to be interested in. Then the only path left is to relax the contact tracing part. You still need to think about a good way of handling geolocation data. Instead of rushing to make a half-baked app, you can think about how to build a sustainable relationship between the user and her own data. It’s obviously less sexy, but it is just as difficult and just as important for a safer world.

Figure 3: gamification?

  1. Provided, of course, that the general public lets us do so. My feeling is that science is gradually losing trust.