The extent of online tracking

Does Google know what you’re doing online? Eighty-eight percent of the time, the answer might be yes. Earlier this month a group of graduate students at UC Berkeley released a detailed report on the extent of online tracking and the disconnect between reality and user expectations. The data in the report was collected from various sources, including examinations of popular and randomly selected websites, surveys, and Freedom of Information Act (FOIA) requests. The website examinations searched for the presence of “web bugs,” the term the researchers apply to the various methods third-party tracking services use to monitor activities online.

The researchers found that every one of the top 100 websites contained at least one web bug. They also found that just a few third-party tracking services were represented across a wide selection of sites. For example, in an examination of nearly 400,000 sites, 88% were found to serve Google tracking cookies. A small group of companies is therefore aggregating a great deal of data. The researchers also examined and categorized the privacy policies of the top 50 websites and found them to be largely contradictory or confusing. For example, all of the websites say that they won’t share the information collected about users with third parties, but they allow exceptions for affiliates and contractors. When the researchers investigated the affiliate relationships of these companies, they found that the websites in question had an average of 297 affiliates each. Further, only 36 of the top 50 sites mentioned third-party trackers in their privacy policies, and in each case the policy stated that the practices of third parties were outside its scope.

The report highlights a significant problem with privacy online: a lack of transparency about what information is collected about users and what is done with it. Users have access only to a vaguely worded privacy policy that may even mislead those who read it. Third-party tracking is particularly problematic: on top of these usual barriers, the user is unlikely to know it is happening at all, since it is normally accomplished using tiny, invisible images or JavaScript code. Additionally, it is seldom clear what will become of the data in the long term. Is it deleted once it is no longer useful, or retained for as-yet-unknown uses? The report’s major recommendation is for regulation requiring companies to provide access to collected data and explicit notice of the purposes and destinations of all data.

The recommendations are not brand-new, and two arguments have generally been raised against them: that such requirements would frustrate business online, and that they would be of little benefit to users. On the first, I agree that the requirements may frustrate certain kinds of online businesses, but I believe that in the long run they will benefit the majority. Greater transparency and accountability will increase user trust in online businesses. In the short term this may affect the revenue streams from targeted advertising and data collection, but it is not impossible for businesses to find new sources of income or adapt the old ones to respect the privacy of users online. The second argument is usually that users don’t really want the kind of privacy the report is advocating and are content as long as no actual harm comes from the collection of information. The surveys and FTC complaints summarized in the Berkeley report contradict this argument. Users are concerned about their privacy and want greater control. The history of FTC complaints shows that when users are made aware of invasions of their privacy, they act to hold the company accountable.

  1. From the report, it looks like many of the types of information collected by these sites (such as IP addresses and contact info) would not be deemed to carry significant privacy interests, meaning that they would likely be fair game in the absence of any agreements to the contrary. Judging from a string of recent court decisions on online privacy, it seems that, at least in Ontario (though these cases had to do with public law), the list of what we can reasonably expect to remain private is shrinking. Though I personally don’t have major qualms with most of the current practices of information gathering online (since I don’t believe the data will ever be used for any sinister purposes and I think most of it becomes essentially anonymous anyway), I can understand that many people would feel uncomfortable with the sheer volume of data being collected. I agree that the major recommendations regarding transparency would be a good start for informing users of what they are consenting to, but I wonder how much of an effect that would have if users are still arguably forced to agree to the same terms of service on these popular sites.

  2. The scarier picture is the extent of third-party tracking. It’s not just that Google (sorry, Google, for picking on you) knows my name and IP address; they also know about 88% of the websites I’ve ever visited. If a site is using Google Analytics, Google even knows things like how long it takes me to read a page. And of course, they possibly have access to my email and calendar and even medical data. It’s a little creepy, to say the least.

    And, of course, you’re right that it all comes down to privacy policies and lack of choice in the end. But I think that with some public awareness, you might see privacy become a selling point.

