Google Books and Privacy

Michael Perry is a graduate of the School of Information at the University of Michigan, where he specialized in information policy.

Google Books was first introduced in late 2003.  The project’s goal was to scan books and make their text searchable online, so that users could look for specific terms within them.  To build its digital library, Google partnered with major universities, including the University of Michigan and Harvard University, and with the New York Public Library.

Copyright Infringement

In 2005, the Authors Guild and the Association of American Publishers (AAP) filed separate lawsuits claiming copyright infringement.  The complaints centered on the scanning itself: Google was making a digital copy of each book without permission from the copyright owner.  Google defended the practice as fair use.  It argued that scanning the entire book was necessary to make searching possible, and that users never had access to the full digital scan; only a few lines of text were displayed to them.  In October 2008, the parties announced a settlement.  Under the agreement, a Book Rights Registry would be created to represent the interests of the copyright holders (known as the Rightsholders).  Google agreed to pay the Rightsholders $45 million for the books it had scanned without permission.  For books scanned after May 5, 2009, the parties would participate in a revenue-sharing venture, with 63% of the revenue going to the Rightsholders and 37% going to Google.  The settlement, however, is still pending judicial approval.

Privacy Concerns

One issue with the settlement was Google’s privacy policy.  Initially, Google did not have one for this service.  In 2007, a Library Journal article raised the concern that, without strong privacy protections designed into the system, Google could cross-reference information from its many platforms (Gmail, Google Search, etc.) and combine it with a user’s book searches.  The privacy debate began in earnest in early 2009, when the American Library Association (ALA) voiced its concerns about user privacy.1 Google’s initial response was underwhelming: it said that the settlement contained no privacy component because the end product had not yet been developed.  However, in July 2009, Google announced that it was investigating how to protect user privacy, and by September it had released a policy specific to Google Books.

Google’s Privacy Policies

Google has a general privacy policy but has created specific policies for particular services.  For example, Google Health, which stores an individual’s medical records, keeps the records and logs of those accounts completely separate from Google’s other services, effectively siloing the information.  Google Latitude, a service that allows mobile users to display their location, requires users to opt in before their location is shown.  Google Maps takes steps to blur individual faces that appear in Street View.2 Privacy advocates are pushing for a similar, service-specific approach to Google Books.  The difficulty is that Google Books is a hybrid of an online bookstore and a virtual library.  At a 2010 Fairness Hearing on the 2008 settlement agreement, Judge Chin asked about the difference between Google Books and Amazon.  Cindy Cohn, a representative of the Electronic Frontier Foundation (EFF), replied that Amazon can know only which book you purchase, whereas Google can track every page that is viewed.

Is it possible to extend the privacy expectations associated with libraries to a Google Books privacy policy?  The answer is: partially.  The American Civil Liberties Union (ACLU) and the EFF have urged Google to provide stronger privacy protection.  The EFF proposed that Google delete the log of a user’s search results within 30 days.  Not only would this minimize the potential for abuse of the search records, but it would also be in keeping with other Google policies; Google Health, for example, deletes logs after two weeks.  The proposed deletion period is reasonable: it allows Google to analyze data to improve its search algorithms while addressing the user’s (valid) expectation of privacy while searching for books.  Finally, as one observer commented during the Fairness Hearing, Judge Chin is looking for a compromise to address the privacy concerns.

This balancing act may not meet the legal standards that physical public libraries must meet.  However, privacy in the Google Books service has a much higher profile today.  Advocates for privacy have momentum on their side, especially as users become more aware of the privacy policies of online services (one only has to look at users’ reaction to Facebook’s privacy stance).  It remains to be seen what final form the Google Books privacy policy will take.  But there will be a privacy component to the Google Books settlement, and it will more closely resemble that of a library than that of a bookseller.


1 "ALA Reluctantly Accepts Google Book Deal, but Internet Archive Finds it Untenable." Washington Internet Daily 22 Apr. 2009: n. pag.

2 "What the web knows about you." Consumer Privacy 29 Nov. 2009: 28.