You have invited your friends, your partner or your family to spend the afternoon at your home, and you have a perfectly normal conversation in the living room. When night comes and the guests leave, you relax on the sofa, pick up your smartphone and start checking the latest posts on your favorite social network, such as Twitter or Instagram. Suddenly, boom! An ad appears and your face turns pale: it is related to the conversation you had in the afternoon. You wonder: Is it a coincidence? Are they listening? I’m sure that this or a similar situation has happened to you more than once. But is this real? I can tell you that the answer is yes. If you’re starting to feel afraid, keep calm: in this story we’re going to explore and crack this mystery.
In my previous article, where I explored the applicability of Big Data and Artificial Intelligence to disease control, specifically in the case of COVID-19, I mentioned that one of the main data sources could be the location of our mobile devices, since it would let us reproduce the path followed by infected citizens and calculate the infection hot spots. Our mobile device has become an essential accessory in our daily routine; it goes with us everywhere: home, the supermarket, work or trips. These devices are a double-edged sword. On the one hand, they are a powerful tool, since they put a world of possibilities a click away. We use our smartphones for everything: we chat with our colleagues, we share a photo on Instagram, we buy tickets for next month’s concert or we buy clothes. On the other hand, we must keep in mind that everything we do on the Internet can be tracked. Even when we are not using our smartphones, they are still active: they work 24/7, capturing information through their sensors.
What do big companies know about me?
At this point, I’m sure you’re asking yourself: What do they know about me? As I mentioned in my first article, behind the vast amount of data that is generated every day, some companies try to find high-value diamonds that they can use. Information giants such as Google or Facebook hold the top positions in the markets because, among other things, they use the knowledge they obtain to offer new, valuable services to their users and clients. So we will focus on analyzing how much these two companies know about us.
Google has a complete set of very different services: Search, Maps, Translate, Chrome and YouTube are the most used but, if you have a look at this page, you will agree with me that, nowadays, it is practically impossible not to use a Google service. Now, open Google Ad settings. If you are not logged in, please log in with your Google account (I’m sure you have one) and then you’ll see something like this:
It will probably be on as that is the default configuration unless you have modified it. So, scroll down and check all the interests that Google thinks you have. Here are the first 10 categories for me (8 of them are right):
The other company we’re taking as an example is Facebook. Facebook was born as a social network but quickly expanded its catalog of products and bought other companies. At this point you might say: I never had a Facebook account, so they can’t know anything about me. But let me ask you: Do you chat on WhatsApp? If your answer is yes, which I imagine it will be, then you are a Facebook user, since Facebook bought WhatsApp in 2014 for 19 billion dollars. Almost nothing. Besides, Facebook also owns the social network Instagram.
The privacy and data policies of Facebook and Instagram are the same because they are the same company so, in this example, I’ll use Instagram. If you’re an Instagram user, please take your smartphone, open the application and go to Settings > Security > Access Data. In this section, you will be able to see all the information that Instagram keeps about you. Under the General Data Protection Regulation (GDPR), adopted in the European Union in 2016 and applicable since May 2018, you have the right to know all the information that a third party holds about you.
However, this time, we are just going to scroll to the bottom of the page and access the section Ads interests:
Once it loads, you will see a complete list of all the categories of ads that Facebook has associated with your profile based on your activity on Facebook products. Here are the first 10 interests for me (again, 8 are correct):
- Health & wellness
- American football
- Consumer electronics
If you have read the complete lists of “your interests” on both Google and Instagram, you’ll have been able to verify that they have a very high success rate. Although they may not have been right in every case, in my own lists 8 out of the first 10 categories were correct on each platform, so we can say that about 80% are right. Also, from what I have been able to verify, some categories do not refer to me directly but to people close to me, so the ads in those categories could also be effective.
How can they do it?
As I’ve already mentioned, ad personalization is done based on the information these companies have about you. These data can be obtained mainly from these two sources:
- The activity that you carry out in their applications.
Think about this well-known saying: “If the product is free, the product is you”. Companies like Google or Facebook offer most of their digital services for free. Like the data giants they are, they continually measure user activity on their products to improve existing services, offer new ones, or sell this information to third parties. The searches you do on google.com, the time you spend looking at a photo, the number of times you visit a profile … everything is being measured.
- Data offered by third parties.
On the other hand, focusing just on the “mobile” world, data-driven companies often develop SDKs (Software Development Kits), called trackers, and make them available to application developers. SDKs are pieces of code that application developers can easily embed in their applications, and trackers use them to capture data and send it to the tracker company’s servers. These pieces of code are set up to track your activity whether you’re in or out of the app. We could say that a tracker is a next-generation cookie.
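To make the idea more concrete, here is a minimal sketch of what a tracker might do once it is embedded in an app. Everything in it is hypothetical: the `TrackerSDK` class, the event names and the payload fields are invented for illustration and do not reflect any real vendor’s SDK. It only queues events and serializes them, so nothing is actually sent over the network.

```python
import json
import time
import uuid


class TrackerSDK:
    """Hypothetical tracker: records user events and batches them for upload."""

    def __init__(self, app_id):
        self.app_id = app_id
        # Assumption: a random per-install identifier, similar in spirit
        # to an advertising ID.
        self.device_id = str(uuid.uuid4())
        self._queue = []

    def track(self, event_name, **properties):
        """Record one user action together with a timestamp."""
        self._queue.append({
            "event": event_name,
            "timestamp": time.time(),
            "properties": properties,
        })

    def build_batch(self):
        """Return the queued events as a JSON string and empty the queue.

        A real tracker would send this body to its own servers; this
        sketch deliberately stops before any network call.
        """
        body = json.dumps({
            "app_id": self.app_id,
            "device_id": self.device_id,
            "events": self._queue,
        })
        self._queue = []
        return body


# Usage: the host app calls the tracker whenever something interesting happens.
sdk = TrackerSDK(app_id="com.example.movies")
sdk.track("screen_view", screen="movie_detail")
sdk.track("search", query="science fiction")
print(sdk.build_batch())
```

The point of the sketch is how little the app developer has to do: a couple of one-line calls are enough for every screen you open or search you type to end up in a payload tied to a stable device identifier.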
The Exodus project
I would like to introduce the Exodus project. Exodus analyzes Android applications looking for embedded trackers. Then, it builds a report listing all of those trackers and the permissions that the application requests. You can look up the report for any Android application on this link. They also have an application you can install on your Android device. I suggest you search for some of the applications you’ve installed and ask yourself whether they really need the permissions they’re requesting.
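A simplified way to picture this kind of analysis is matching the class names found inside an app against a list of known tracker signatures. The sketch below works under that assumption. The `TRACKER_SIGNATURES` table and the `find_trackers` function are my own illustrative names, the package prefixes are examples, and none of this is the actual Exodus code or database.

```python
# Illustrative signatures only: package prefixes mapped to tracker names.
# Assumption for this example, not the real Exodus signature database.
TRACKER_SIGNATURES = {
    "com.google.firebase.analytics": "Google Firebase Analytics",
    "com.facebook.appevents": "Facebook Analytics",
    "com.comscore": "ComScore",
}


def find_trackers(class_names):
    """Return the sorted names of trackers whose prefix matches a class name."""
    found = set()
    for class_name in class_names:
        for prefix, tracker in TRACKER_SIGNATURES.items():
            # Match the package itself or anything nested below it.
            if class_name == prefix or class_name.startswith(prefix + "."):
                found.add(tracker)
    return sorted(found)


# Hypothetical class list, as it might be extracted from an app's compiled code.
app_classes = [
    "com.example.movies.MainActivity",
    "com.google.firebase.analytics.FirebaseAnalytics",
    "com.comscore.Analytics",
]
print(find_trackers(app_classes))
```

Because this approach only looks at what is packaged inside the app, it tells you which trackers are present, not whether or when they actually run.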
Let’s see an example: IMDb, the popular film database that is part of Amazon. On IMDb, you can find all kinds of information related to the entertainment industry (movies, series, shows, events …) and it has a large database of user ratings and reviews. This application has 11 trackers from different providers; this is a high-level classification of the most relevant ones:
- Amazon Advertisement (Advertisement)
- Amazon Analytics (Usage)
- ComScore (Advertisement)
- Facebook Analytics (Usage)
- Facebook Share (Usage)
- Google Ads (Advertisement)
- Google DoubleClick (Usage)
- Google Firebase Analytics (Usage)
- Tune (Advertisement)
These trackers capture metrics and information for two purposes: advertising and usage analysis. As we can see, most of them belong to well-known companies such as Amazon, Facebook or Google, but there are also other companies, like Tune or ComScore, that develop trackers to capture metrics on our devices and offer marketing services to application owners.
Now, let’s have a look at the permissions that IMDb uses. Everything seems OK but, wait! IMDb has this permission: ACCESS_COARSE_LOCATION. This leads to the following question: Does the IMDb application need access to my coarse location? It is a rhetorical question, but bear in mind that we are continually exposed to this kind of situation. It’s not surprising to see apps that request permissions like location or microphone when they don’t need them. And even if they did need them, they could also use them for other purposes…
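You can ask the same question about any app in a systematic way. The sketch below reads the `uses-permission` entries from an Android manifest and flags the ones that are not on a list of permissions we consider reasonable for that app. The manifest, the package name and the `EXPECTED` set are made up for the example, and I am assuming the manifest has already been decoded to plain XML (inside an APK it is stored in a binary format).

```python
import xml.etree.ElementTree as ET

# XML namespace used by the android: attributes in a manifest.
ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Hypothetical, already-decoded manifest for an imaginary movie app.
MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.movies">
    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.ACCESS_NETWORK_STATE" />
    <uses-permission android:name="android.permission.ACCESS_COARSE_LOCATION" />
</manifest>"""

# Assumption: what we judge a movie database app reasonably needs.
EXPECTED = {
    "android.permission.INTERNET",
    "android.permission.ACCESS_NETWORK_STATE",
}


def requested_permissions(manifest_xml):
    """List the permission names declared in the manifest, in document order."""
    root = ET.fromstring(manifest_xml)
    return [el.get(f"{{{ANDROID_NS}}}name") for el in root.iter("uses-permission")]


def unexpected_permissions(manifest_xml, expected):
    """Return the requested permissions that are not in the expected set."""
    return [p for p in requested_permissions(manifest_xml) if p not in expected]


print(unexpected_permissions(MANIFEST, EXPECTED))
```

The hard part is not the code but the `EXPECTED` list: deciding what an app legitimately needs is a judgment call, and that is exactly the question worth asking before you tap “Allow”.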
Although this analysis has been done for Android applications, thanks to the reports that the Exodus project provides, we must bear in mind that this situation is not exclusive to Android. In the case of Apple, its DRM does not allow this type of research, but the developers of tracker SDKs usually build them for the different mobile platforms, so we could be equally exposed.
In conclusion, I would like to emphasize that, on many occasions when we use the Internet, we are not aware of what we are exposed to. In general, we accept terms and conditions of use that we do not read, or whose implications we do not understand. I would like to remark again: nothing is free. But underneath all this technology are the people who use it and, although it may seem like an outrage against our privacy, I think that deep down we are happy that Netflix recommends movies to watch, that Spotify comes up with a new list of songs to listen to or that Amazon recommends products to buy. After all, those services make life easier so, is it so bad?