Over the past three months we’ve learnt that NSA uses third-party tracking cookies for surveillance (1, 2). These cookies, provided by a third-party advertising or analytics network (e.g. doubleclick.com, scorecardresearch.com), are ubiquitous on the web, and tag users’ browsers with unique pseudonymous IDs. In a new paper, we study just how big a privacy problem this is. We quantify what an observer can learn about a user’s web traffic by purely passively eavesdropping on the network, and arrive at surprising answers.
At first sight it doesn’t seem possible that eavesdropping alone can reveal much. First the eavesdropper on the Internet backbone sees millions of HTTP requests and responses. How can he associate the third-party HTTP request containing a user’s cookie with request to the first-party web page that the browser visited, which doesn’t contain the cookie? Second, how can visits to different first parties be linked to each other? And finally, even if all the web traffic for a single user can be linked together, how can the adversary go from a set pseudonymous cookies to the user’s real-world identity?
The diagram illustrates how the eavesdropper can use multiple third-party cookies to link traffic. When a user visits ‘www.exampleA.com,’ the response contains the embedded tracker X, with an ID cookie ‘xxx’. The visits to exampleA and to X are tied together by IP address, which typically doesn’t change within a single page visit . Another page visited by the same user might embed tracker Y bearing the pseudonymous cookie ‘yyy’. If the two page visits were made from different IP addresses, an eavesdropper seeing these cookies can’t tell that the same browser made both visits. But if a third page, however, embeds both trackers X and Y, then the eavesdropper will know that IDs ‘xxx’ and ‘yyy’ belong to the same user. This method applied iteratively has the potential of tying together a lot of the traffic of a single user.
By Dillon Reisman
Source and read the full paper: