Author Archive

“Databuse” as the Future of Privacy?

Is “privacy” such a broad concept as to be meaningless from a legal and policy perspective? On Tuesday, October 14th, the Center for Democracy & Technology hosted a conversation with Benjamin Wittes and Wells Bennett, frequently of the national security blog, Lawfare, to discuss their recent scholarship on “databuse” and the scope of corporate responsibilities for personal data.

Coming from a world of FISA and ECPA, and the detailed statutory guidance that accompanies privacy in the national security space, Wittes noted that privacy law on the consumer side is vague and amorphous, and largely “amounts to don’t be deceptive and don’t be unfair.” Part of the challenge, as number privacy scholars have noted, is that privacy encompasses a range of different social values and policy judgments. “We don’t agree what value we’re protecting,” Wittes said, explaining that government privacy policies have values and distinctions such as national borders and citizen/non-citizen than mean something.

Important distinctions are much less easier to find in consumer privacy. Wittes’ initial work on “databuse” in 2011 was considerably broader and more provocative, applying to all data controllers — first and third party, but his follow-up work with Bennett attempted to limit its scope to the duties owed to consumers exclusively by first parties. According to the pair, this core group of duties “lacks a name in the English language” but “describe a relationship best seen as a form of trusteeship.”

Looking broadly at law and policy around data use, including FTC enforcement actions, the pair argue that there is broad consensus that corporate custodians face certain obligations when holding personal data, including (1) obligations to keep it secure, (2) obligations to be candid and straightforward with users about how their data is being exploited, (3) obligations not to materially misrepresent their uses of user data, and (4) obligations not to use them in fashions injurious to or materially adverse to the users’ interests without their explicit consent. According to Wittes, this core set of requirements better describes reality than any sort of “grandiose conception of privacy.”

“When you talk in the broad language of privacy, you promise consumers more than the legal and enforcement system can deliver,” Wittes argued. “If we want useful privacy policy, we should focus on this core,” he continued, noting that most of these requirements are not directly required by statute.

Bennett detailed how data uses fall into three general categories. The first, a “win/win” category,” describes where the interests of business and consumers align, and he cited the many uses of geolocation information on mobile devices as a good example of this. The second category reflects cases where businesses directly benefit but consumers face a neutral value proposition, and Bennett suggested online behavioral advertising fit into this second category. Finally, a third category of uses are when businesses benefit at consumer’s expense, and he argued that regulatory action would be appropriate to limit these behaviors.

Bennett further argued that this categorization fit well with FTC enforcement actions, if not the agency’s privacy rhetoric. “FTC report often hint at subjective harms,” Bennett explained, but most of the Commission’s actions target objective harms to consumers by companies.

However, the broad language of “privacy” distorts what harms the pair believe regulators — and consumers, as well — are legitimately concerned about. Giving credit to CDT for initially coining the term “databuse,” Wittes defines the term as follows:

[T]he malicious, reckless, negligent, or unjustified handling, collection, or use of a person’s data in a fashion adverse to that person’s interests and in the absence of that person’s knowing consent. . . . It asks not to be left alone, only that we not be forced to be the agents of our own injury when we entrust our data to others. We are asking not necessarily that our data remain private; we are asking, rather, that they not be used as a sword against us without good reason.

CDT’s Justin Brookman, who moderated the conversation, asked whether (or when) price discrimination could turn into databuse.

“Everyone likes [price discrimination] when you call it discounts,” Wittes snarked, explaining that he was “allergic to the merger of privacy and antidiscrimination laws.” Where personal data was being abused or unlawful discrimination was transpiring, Wittes supported regulatory involvement, but he was hesitant to see both problems as falling into the same category of concern.

The conversation quickly shifted to a discussion of the obligations of third parties — or data brokers generally — and Wittes and Bennett acknowledged they dealt with the obligations of first parties because its an easier problem. “We punted on third parties,” they conceded, though Wittes’ background in journalism forced him to question how “data brokers” were functionally different from the press. “I haven’t thought enough about the First Amendment law,” he admitted, but he wasn’t sure what principle would allow advocates to divine “good” third parties and “bad” third parties.

But if the pair’s theory of “databuse” can’t answer every question about privacy policy, at least we might admit the term should enter the privacy lexicon.

-Joseph Jerome, Policy Counsel

Thoughts on the Data Innovation Pledge

Yesterday, as he accepted the IAPP Privacy Vanguard award, Intel’s David Hoffman made a “data innovation pledge” that he would work only to promote ethical and innovative uses of data. As someone who only relatively recently entered the privacy world by diving headfirst into the sea of challenges surrounding big data, I think an affirmative pledge of the sort David is proposing is a great idea.

While pledges can be accused of being mere rhetorical flourishes, words do matter. A simple pledge can communicate a good deal — and engage the public in a way that drives conversation forward. Think of Google’s early motto to “don’t be evil.” For years this commitment fueled a large reservoir of trust that many have for the company. With every new product and service that Google releases, its viewed through the lens of whether or not its “evil.” That places a high standard on the folks at Google, and for that, we should pleased.

Of course, pledges present obligations and challenges. Using data only for good presents a host a new questions. As FPF explored in our whitepaper on benefit-risk analysis for Big Data, there are different aspects to consider when evaluating the benefits of data use — and some of these factors are largely subjective. Ethics is a broad field, and it also exposes the challenging philosophical underpinnings of privacy.

The very concept of privacy has always been a philosophical conundrum, but so much of the rise of the privacy profession has focused on compliance issues and the day-to-day reality of data protection. But today, we’re swimming in a sea of data, and all of this information makes us more and more transparent to governments, industry, and each other. It’s the perfect catalyst to consider what the value of “privacy” truly is. Privacy as an information on/off switch may be untenable, but privacy as a broader ethical code makes a lot of sense.

There are models to learn from. As David points out, other professions are bound by ethical codes, and much of that seeps into how we think about privacy. Doctors not only pledge to do no harm, but they also pledge to keep our confidences about our most embarrassing or serious health concerns. Questionable practices around innovation and data in the medical field led to review boards to protect patients and human test subjects and reaffirmed the role of every medical professional to do no evil.

Similar efforts are needed today as everything from our wristwatches to our cars is “datafied.” In particular, I think about all of the debates that have swirled around the use of technology in the classroom in recent years. A data innovation pledge could help relieve worried parents. If Monday’s FTC workshop is indication, similar ethical conversations may even be needed for everyday marketing and advertising.

The fact is that there are a host of different data uses that could benefit from greater public confidence. A data innovation pledge is a good first start. There is no question that companies need to do more to show the public how they are promoting innovative and ethical uses of data. Getting that balance right is tough, but here’s to privacy professionals helping to lead that effort!

-Joseph Jerome, Policy Counsel

FTC Wants Tools to Increase Transparency and Trust in Big Data

However we want to define “Big Data” – and the FTC’s latest workshop on the subject suggests a consensus definition remains elusive – the path forward seems to call for more transparency and the establishment of firmer frameworks on the use of data. As Chairwoman Ramirez suggested in her opening remarks, Big Data calls for a serious conversation about “industry’s ethical obligations as stewards of information detailing nearly every facet of consumers’ lives.”

Part of that challenge is that some Big Data uses are often “discriminatory”. Highlighting findings from his paper on Big Data and discrimination, Solon Barocas began the workshop by noting that whole point of data mining is to differentiate and to draw distinctions. In effect, Big Data is a rational form of discrimination, driven by apparent statistical relationships rather than any capriciousness. When humans introduce unintentional biases into the data, there is no ready solution at a technical or legal level. Barocas called for a conversation for lawyers and public policy makers to have a conversation with the technologists and computer scientists working directly with data analytics – a sentiment echoed when panelists realized a predictive analytics conference was going on simultaneously across town.

But the key takeaway from the workshop wasn’t that Big Data could be used as tool to exclude or include. Everyone in the civil rights community agreed that data could be a good thing, and a number of examples were put forth to suggest once more that data had the potential to be used for good or for ill. Pam Dixon of the World Privacy Forum classifying individuals creates a “data paradox,” where the same data can be used to help or to harm that individual. For our part, FPF released a report alongside the Anti-Defamation League detailing Big Data’s ability to combat discrimination. Instead, there was considerable desire to understand more about industry’s approach to big data. FTC staff repeatedly asked not just for more positive uses of big data by the private sector, but inquired as to what degree of transparency would help policy makers understand Big Data decision-making.

FTC Chief Technologist Latanya Sweeney followed up her study that suggested web searches for African-American names were more likely than searches of white-sounding names to return ads suggesting the person had an arrest record by looking at credit card advertising and website demographics. Sweeney presented evidence that advertisements for harshly criticized credit cards were often directed to the homepage of Omega Psi Phi, a popular black fraternity.

danah boyd observed that there was a general lack of transparency about how Big Data is being used within industry, for a variety of complex reasons. FTC staff and Kristin Amerling from Senate Commerce singled out the opacity surrounding the practices of data brokers when describing some of the obstacles being faced when policy makers try to under how Big Data is being used.

Moreover, while consumers and policy makers are trying to grapple with what companies are doing with their streams of data, industry is also placed in the difficult position of making huge decisions about how that data can be used. For example, boyd cited the challenges JPMorgan Chase faces when using analytics to evaluate human trafficking. She applauded the positive work the company was doing, but noted that expecting it to have the ability or expertise to effectively intervene in trafficking perhaps asks too much. They don’t know when to intervene or whether to contact law enforcement or social services.

These questions are outside the scope of their expertise, but even general use of Big Data can prove challenging for companies. “A lot of the big names are trying their best, but they don’t always know what the best practices should be,” she concluded.

FTC Commissioner Brill explained that her support for a legislative approach to increase transparency and accountability among data brokers, their data sources, and their consumers, was to help consumers and policy makers “begin to understand how these profiles are being used in fact, and whether and under what circumstances they are harming vulnerable populations.” In the meantime, she encouraged industry to take more proactive steps. Specifically, she recommended again that data brokers explore how their clients are using their information and take steps to prevent any inappropriate uses and further inform the public. Companies can begin this work now, and provide all of us with greater insight into – and greater assurances about – their models,” she concluded.

A number of legal regimes may already apply to Big Data, however. Laws that govern the provision of credit, housing, and employment will likely play a role in the Big Data ecosystem. Carol Miaskoff at the Equal Employment Opportunity Commission suggested there was real potential with Big Data to gather information about successful employees and use that to screen people for employment in a way that exacerbates prejudices built into the data. Emphasizing his recent white paper, Peter Swire suggested there were analogies to be made between sectoral regulation in privacy and sectoral legislation in anti-discrimination law. With existing legal laws in place, he argued that it was past time to “go do the research and see what those laws cover” in the context of Big Data.

“Data is the economic lubricant of the economy,” the Better Business Bureau’s C. Lee Peeler argued, and he supported the FTC’s continued efforts to explore the subject of Big Data. He cited earlier efforts by the Commission to examine inner-city marketing practices, which produced a number of best practices still valid today. He encouraged the FTC to look at what companies are doing with Big Data on a self-regulatory basis as a basis for developing workable solutions to potential problems.

So what is the path forward? Because Big Data is, in the words of Promontory’s Michael Spadea, a nascent industry, there is a very real need for guidelines on not just how to evaluate the risks and benefits of Big Data but also how to understand what is ethically appropriate for business. Chris Wolf highlighted FPF’s recent Data-Benefit Analysis and suggested companies were already engaged in detailed analysis of the use of Big Data, though everyone recognized that businesses practices and trade secrets precluded making much of this public.

FTC staff noted there was a “transparency hurdle” to get over in Big Data. Recognizing that “dumping tons of information” onto consumers would be unhelpful, staff picked up on Swire’s suggestion that industry needed some mechanism to justify what is going on to either regulators or self-regulatory bodies. Spadea argued that “the answer isn’t more transparency, but better transparency.” The Electronic Frontier Foundation’s Jeremy Gillula recognized the challenge companies face revealing their “secret sauce,” but encouraged them to look at more way to give consumer more general information about what was going on. Otherwise, he recommended, consumers ought to collect big data on big data and turn data analysis back on data brokers and industry at large through open-source efforts.

At the same time, Institutional Review Boards, which are used in human subject testing research, were again proposed as a model for how companies can begin affirmatively working through these problems. Citing a KPMG report, Chris Wolf insisted that strong governance regimes, including “a strong ethical code, along with process, training, people, and metrics,” were essential to confront the many ethical and philosophical challenges that flirted around the day’s discussions.

Jessica Rich, the Director on the FTC’s Consumer Protection Bureau, cautioned that the FTC would be watching. In the meantime, industry is on notice. The need for clearer data governance frameworks is clear, and careful consideration of Big Data project should be both reflexive and something every industry privacy professional talks about.

-Joseph Jerome, Policy Counsel

Relevant Reading from the Workshop:


A Path Forward for Big Data

How should privacy concerns be weighed against the benefits of big data? This question has been at the heart of policy debates about big data all year, from the President’s announcement of the White House Big Data review in January to the FTC’s latest workshop looking at big data’s ability to exclude or include. Answering this question could very well present the biggest public policy challenge of our time, and the need to face that challenge is growing.

Increasingly, there are new worries that big data is being used in ways that are unfair to some people or classes of people. Resolving those worries and ensuring that big data is being used fairly and legitimately is a challenge should be a top priority for industry and government alike.

Today, FPF is releasing two papers that we hope will help frame the big data conversation moving forward and promote better understanding of how big data can shape our lives. These papers provide a practical guide for how benefits can be assessed in the future, but they also show how data is already is being used in the present.  FPF Co-Chairman Christopher Wolf will discuss key points from these papers at the Federal Trade Commission public workshop entitled “Big Data: A Tool for Inclusion or Exclusion?” in Washington on Monday, September 15.

We are also releasing a White Paper which is based on comments that will be presented at the FTC Workshop by  Peter Swire, Nancy J. & Lawrence P. Huang Professor of Law and Ethics, Georgia Institute of Technology.  Swire, also Senior Fellow at FPF, draws lessons from fair lending law that are relevant for online marketing related to protected classes.

The papers are entitled:

*                      *                      *

The world of big data is messy and challenging. The very term “big data” means different things within different contexts. Any successful approach to the challenge of big data must recognize that data can be used in a variety of different ways. Some of these uses are clearly beneficial, some of them clearly are problematic, some are for uses that some believe beneficial and others believe to be harmful.  Some uses have no real impact on individuals at all. We hope these documents can offer new ways to look at big data in order to ensure that it is only being used for good.

Big Data: A Benefit and Risk Analysis

Privacy professionals have become experts at evaluating risk, but moving forward with big data will require rigorous analysis of project benefits to go along with traditional privacy risk assessments. We believe companies or researchers need tools that can help evaluate the cases for the benefits of significant new data uses.  Big Data: A Benefit and Risk Analysis is intended to help companies assess the “raw value” of new uses of big data. Particularly as data projects involve the use of health information or location data, more detailed benefit analyses that clearly identify the beneficiaries of a data project, its size and scope, and that take into account the probability of success and evolving community standards are needed.   We hope this guide will be a helpful tool to ensure that projects go through a process of careful consideration.

Identifying both benefits and risks is a concept grounded in existing law. For example, the Federal Trade Commission weighs the benefits to consumers when evaluating whether business practices are unfair or not. Similarly, the European Article 29 Data Protection Working Party has applied a balancing test to evaluate legitimacy of data processing under the European Data Protection Directive. Big data promises to be a challenging balancing act.

Big Data: A Tool for Fighting Discrimination and Empowering Groups

Even as big data uses are examined for evidence of facilitating unfair and unlawful discrimination, data can help to fight discrimination. It is already being used in myriad ways to protect and to empower vulnerable groups in society. In partnership with the Anti-Defamation League, FPF prepared a report that looked at how businesses, governments, and civil society organizations are leveraging data to provide access to job markets, to uncover discriminatory practices, and to develop new tools to improve education and provide public assistance.  Big Data: A Tool for Fighting Discrimination and Empowering Groups explains that although big data can introduce hidden biases into information, it can also help dispel existing biases that impair access to good jobs, good education, and opportunity.

Lessons from Fair Lending Law for Fair Marketing and Big Data

Where discrimination presents a real threat, big data need not necessary lead us to a new frontier. Existing laws, including the Equal Credit Opportunity Act and other fair lending laws, provide a number of protections that are relevant when big data is used for online marketing related to lending, housing, and employment. In comments to be presented at the FTC public workshop, Professor Peter Swire will discuss his work in progress entitled Lessons from Fair Lending Law for Fair Marketing and Big Data. Swire explains that fair lending laws already provide guidance as to how to approach discrimination that allegedly has an illegitimate, disparate impact on protected classes. Data actually plays an important role in being able to assess whether a disparate impact exists! Once a disparate impact is shown, the burden shifts to creditors to show their actions have a legitimate business need and that no less reasonable alternative exists. Fair lending enforcement has encouraged the development of rigorous compliance mechanisms, self-testing procedures, and a range of proactive measures by creditors.

*                      *                      *

There is no question that big data will require hard choices, but there are plenty of avenues for obtaining the benefits of big data while avoiding – or minimizing – any risks. We hope the following documents can help shift the conversation to a more nuanced and balanced analysis of the challenges at hand.

To contact the authors to discuss any of the papers or issues related to privacy and big data, contact

Data Protection Law Errors in Google Spain LS, Google Inc. v. Agencia Espanola de Proteccion de Datos, Mario Costeja Gonzalez

The following is a guest post by Scott D. Goss, Senior Privacy Counsel, Qualcomm Incorporated, addressing the recent “Right to be Forgotten” decision by the European Court of Justice.

There has been quite a bit of discussion surrounding the European Court of Justice’s judgment in Google Spain LS, Google Inc. v. Agencia Espanola de Proteccion de Datos (AEPD), Mario Costeja Gonzalez.  In particular, some interesting perspectives have been shared by Daniel Solove, Ann Cavoukian and Christopher Wolf, and Martin Husovec.  The ruling has been so controversial, newly appointed EU Justice Commissioner, Martine Reicherts delivered a speech defending it.  I’d like to add to the discussion.[1]  Rather than focusing on the decision’s policy implications or on the practicalities of implementing the Court’s ruling, I’d like to instead offer thoughts on a few points of data protection law.

To start, I don’t think “right to be forgotten” is an apt description of the decision, and instead distorts the discussion.  Even if Google were to follow the Court’s ruling to the letter, the information doesn’t cease to exist on the Internet.  Rather, the implementation of the Court’s ruling just makes internet content linked to peoples’ names harder to find.  The ruling, therefore, could be thought of as, “the right to hide”.  Alternatively, the decision could be described as, “the right to force search engines to inaccurately generate results.”  I recognize that such a description doesn’t roll off the tongue quite so simply, but I’ll explain why that description is appropriate below.

I believe the Court made a few important legal errors that should be of interest to all businesses that process personal data.  First was the Court’s determination that Google was a “controller” as defined under EU data protection law and second was the application of the information relevance question.    Then, I’ll explain why “the right to force search engines to inaccurately generate results” may be a more appropriate description of the Court’s ruling.

1.       “Controller” status must be determined from the activity giving rise to the complaint

To understand how the Court erred in determining that Google is a “controller” in this case, it helps to understand how search engines work.  At a conceptual level, search is comprised of three primary data processing activities:  (i) caching all the available content, (ii) indexing the content, and (iii) ranking the content.  During the initial caching phase, a search engine’s robot minions scour the Internet noting all the content on the Internet and its location.  The cache can be copies of all or parts of the web pages on the Internet.  The cache is then indexed to enable much faster searching by sorting the content.  Indexing is important because without it searching would take immense computing power and significant time for each page of the Internet to be examined for users’ search queries.  Finally, the content within the index is ranked for relevance.

From a data processing perspective, I believe that caching and indexing achieve two objective goals: Determining the available content of the internet and where can it be found.  Tellingly, the only time web pages are not cached and indexed is when website publishers, not search engines, include a special code on their web pages instructing search engines to ignore the page.  This special code is called robots.txt

The web pages that are cached and indexed could be the text of the Gettysburg address, the biography of Dr. Martin Luther King, Jr., the secret recipe for Coca-Cola, or newspaper articles that include peoples’ names.  It is simply a fact that the letters comprising the name “Mario Costeja Gonzalez” could be found on certain web pages.  Search engines cannot control that fact any more than they could take a picture of the sky and be said to control the clouds in the picture.

After creating the cache and index, the next step involves ranking the content.  Search engine companies employ legions of the world’s best minds and immense resources to determine rank order.  Such ranking is subjective and takes judgment.  Arguably, ranking search results could be considered a “controller” activity, but the ranking of search results was never at issue in the Costeja Gonzalez case.  This is a key point underlying the Court’s errors.  Mr. Costeja Gonzalez’s complaint was not that Google ranked search results about him too high (i.e., Google’s search result ranking activity), but rather that the search engine indexed the information at all.  The appropriate question, therefore, is whether Google is the “controller” of the index.  The question of whether Google’s process of ranking search results confers “controller” status on Google is irrelevant.  The Court’s error was to conflate Google’s activity of ranking search results with its caching and indexing of the Internet.

Some may defend the Court by arguing that controller status of some activities automatically anoints controller status on all activities.  This would be error.  The Article 29 Working Party opined,

[T]he role of processor does not stem from the nature of an entity processing data but from its concrete activities in a specific context.  In other words, the same entity may act at the same time as a controller for certain processing operations and as a processor for others, and the qualification as controller or processor has to be assessed with regard to specific sets of data or operations.

Opinion 1/2010 on the concepts of “controller” and “processor”, page 25, emphasis added.  In this case, Mr. Costeja Gonzalez’s complaint focused on the presence of certain articles about him in the index.  Therefore, the “concrete activities in a specific context” is the act of creating the index and the “specific sets of data” is the index itself.  The Article 29 WP went on to give an example of an entity acting as both a controller and a processor of the same data set:

An ISP providing hosting services is in principle a processor for the personal data published online by its customers, who use this ISP for their website hosting and maintenance.  If however, the ISP further processes for its own purposes the data contained on the websites then it is the data controller with regard to that specific processing.

I submit that creation of the index is analogous to an ISP hosting service.  In creating an index, search engines create a copy of everything on the Internet, sort it, and identify its location.  These are objective, computational exercises, not activities where the personal data is noted as such and treated with some separate set of processing.  Following the Article 29 Working Party opinion, search engines could be considered processors in the caching and indexing of Internet content because such activities are mere objective and computational exercises, but controllers in the ranking of the content due to the subjective and independent analysis involved.

Further, as argued in the Opinion of Advocate General Jaaskinen, a controller needs to recognize that they are processing personal data and have some intention to process it as personal data. (See paragraph 82).  It is the web publishers who decide what content goes into the index.  Not only do they have discretion in deciding to publish the content on the Internet in the first instance, but they also have the ability to add the robots.txt code to their web pages which directs search engines to not cache and index.  The mark of a controller is one who “determines the purposes and means of the processing of personal data.” (Art. 2, Dir. 95/46 EC).  In creation of the index, rather than “determining”, search engines are identifying the activities of others (website publishers) and heeding their instructions (use or non-use of robots.txt).  I believe such processing cannot, as a matter of law, rise to the level of “controller” activities.

Finding Google to be a “controller” may have been correct if either the facts or the complaint had been different.  Had Mr. Costeja Gonzalez produced evidence that: (i) the web pages he wanted removed contained the “robots.txt” instruction or, (ii) the particular web pages were removed from the Internet by the publisher but not by Google in its search results, then it may be appropriate to hold Google as a “controller” due to these independent activities.  Such facts would be similar to the example given by the Article 29 Working Party of an ISP’s independent use of personal data maintained by its web hosting customers.  Similarly, had Mr. Costeja Gonzalez’s complaint been that search results regarding his prior bankruptcy been ranked too high, then I could understand (albeit I may still disagree) that Google would be found to be a controller.  But that was not his complaint.  His complaint was that certain information was included in the index at all – and for that, I believe, Google should have no more control over than it has in the content of the Internet itself.

2.       “Relevance” of Personal Data must be evaluated in light of the purpose of the processing.

The Court’s second error arose in the application of the controller’s obligations.  Interestingly, after finding that Google is the controller of the index, the Court incorrectly applied the relevancy question.  To be processed legitimately, personal data must be “relevant and not excessive in relation to the purposes for which they are collected and/or further processed.” Directive 95/46 EC, Article 6(c) (emphasis added). Relevancy is thus a question in relation to the purpose of the controller – not as to the data subject, a customer, or anyone else.  The purpose of the index, in Google’s own words, is to “organize the world’s information and make it universally accessible and useful.” (  With that purpose in mind, all information on the Internet is, by definition, relevant.  While clearly there are legal boundaries to the information that Google can make available, the issue is whether privacy law contains one of those boundaries.  I suggest that in the context of caching and creating an index of the Internet, it is not.

The court found that Google legitimizes its data processing under the legitimate interest test of Article 7(f) of the Directive.  Google’s legitimate interests must be balanced against the data subjects’ fundamental rights under Article 1(1).  Since Article 1(1) provides no guidance as to what those rights are (other than “fundamental”), the Court looks to subparagraph (a) of the first paragraph of Article 14.  That provision provides data subjects with a right to object to data processing of their personal data, but offers little guidance as to when controllers must oblige.  Specifically, it provides in cases of legitimate interest processing, a data subject may,

“object at any time on compelling legitimate grounds relating to his particular situation to the processing of data relating to him, save where otherwise provided by national legislation.  Where there is a justified objection, the processing instigated by the controller may no longer involve those data.”

What are those “compelling legitimate grounds” for a “justified objection”?  The Court relies on Article 12(b) “the rectification, erasure, or blocking of data the processing of which does not comply with the provisions of this Directive, in particular because of the incomplete or inaccurate nature of the data”.  It is here the Court erred.

The Court took the phrase “incomplete or inaccurate nature of the data” and erroneously applied it to the interests of the data subject.  Specifically, the Court held that the question is whether the search results were “incomplete or inaccurate” representations of the data subject as he/she exists today.  I submit that was not the intent of Article 12(b).  Rather, Article 12(b) was referring back to the same use of that phrase in Article 6 providing that:

“personal data must be: . . .(d) accurate and, where necessary, kept up to date; every reasonable step must be taken to ensure that data which are inaccurate or incomplete, having regard to the purposes for which they were collected or for which they are further processed, are erased or rectified.

The question is not whether the search results are “incomplete or inaccurate” representations of Mr. Costeja Gonzalez, but whether the search results are inaccurate as to the purpose of the processing.  The purpose of the processing is to copy, sort, and organize the information on the internet.   In this case, queries for the characters “Mario Costeja Gonzalez,” displayed articles that he admits were actually published on the Internet.  Such results, therefore, are by definition not incomplete or inaccurate as to the purpose of the data processing activity.  To put it simply, the Court applied the relevancy test to the wrong party (Mr. Costeja Gonzalez) as opposed to Google and the purpose of its index.

To explain by analogy, examine the same legal tests applied to a credit reporting agency.  A credit reporting analogy is helpful because it also has at least three parties involved in the transaction.  In the case of the search engine, those parties are the search engine, the data subject, and the end user conducting the search.  In the case of credit reporting, the three parties involved are the credit ratings businesses, the consumers who are rated (i.e., the data subjects), and the lenders and other institutions that purchase the reports.  It is well-established law that consumers can object to information used by credit ratings businesses as being outdated, irrelevant, or inaccurate.  The rationale for this right is found in Article 12(f) and Article 6 in relation to the purpose of the credit reporting processing activity.

The purpose of credit reporting is to provide lenders an opinion on the credit worthiness of the data subject.  The credit ratings business must take care that the information they use is not “inaccurate or incomplete” or they jeopardize the purpose of their data processing by generating an erroneous credit score.  For example, if a credit reporting agency collected information about consumers’ height or weight, consumers would be able to legitimately object.  Consumers’ objections would not be founded on the fact that the information is not representative of who they are – indeed such data may be completely accurate and current.  Instead, consumers’ objections would be founded on the fact that height or weight are not relevant for the purpose of assessing consumers’ credit worthiness.

Returning to the Costeja Gonzalez case, the issue was whether the index (not the ranking of such results) should include particular web pages containing the name of Mr. Costeja Gonzalez.  Since the Court previously determined that Google was the “controller” of the index (which I contend was error), the Court should have determined Google’s purpose of the index and then set the inquiry as to whether the contested web pages were incorrect, inadequate, irrelevant or excessive as to Google’s purpose.  As discussed above, Goolge’s professed goal is to enable the discovery of the world’s information and to that end the purpose of the index is to, as much as technologically possible, catalog the entire Internet – all the good, bad, and ugly.  For that purpose, any content on the Internet about the key words “Mario Costeja Gonzalez” is, by definition, not incorrect, inadequate, irrelevant or excessive because the goal is to index everything.  Instead, however, the Court erred by asking whether the web pages were incorrect, inadequate, irrelevant or excessive as to Mr. Costeja Gonzalez, the data subject.  Appling the relevancy question as to Mr. Costeja Gonzalez is, well, not relevant.

Some may argue that the Court recognized the purpose of the processing when making the relevancy determination by finding that Mr. Costeja Gonzalez’s rights must be balanced against the public’s right to know.  By including the public’s interest in the relevancy evaluation, some may argue, the Court has appropriately directed the relevancy inquiry to the right parties.  I disagree.  First, I do not believe it was appropriate to inquire as to the relevancy of the links vis-a-vis Mr. Costeja Gonzalez in the first instance and, therefore, to balance it against other interests (in this case the public) does not cure the error.  Secondly, to weigh the interests of the public, one must presume that the purpose of searching for individuals’ names is to obtain correct, relevant, not inadequate and not excessive information.  I do not believe such presumptions are well-founded.  For example, someone searching for “Scott Goss” may be searching for all current, relevant, and non-excessive information about me.  On the other hand, fifteen years from now perhaps someone is searching for all privacy articles written in 2014 and they happened to know that I wrote one and so searched using my name.  One cannot presume to know the purpose of an individual’s search query other than a desire to have access to all the information on the Internet containing the query term.

If not the search engines, where would it be pertinent to ask the question of whether information on the Internet about Mr. Costeja Gonzalez was incorrect, inadequate, irrelevant or excessive?  The answer to this question is clear:  it is the entities that have undertaken the purpose of publishing information about Mr. Costeja Gonzales.  Specifically, website publishers process personal data for the purpose of informing their readers about those individuals.  The website publishers, therefore, have the burden to ensure that such information is not incorrect, inadequate, irrelevant or excessive as to Mr. Costeja Gonzales.  That there may be an exception in data protection law for web publishers, does not mean that courts should be free to foist obligations onto search engines.

3.       The right to force search engines to inaccurately generate results

Finally, the “right to force search engines to inaccurately generate results” is, I believe, an apt description of the ruling.  A search engine’s cache and index is supposed to contain the entire web’s information that web publishers want the world to know.  Users expect that search engines will identify all information responsive to their queries when they search.  Users further expect that search engines will rank all the results based upon their determination of the relevancy of the results in relation to the query.  The Court’s ruling forces search engines to generate an incomplete list of search results by gathering all information relevant to the search and then pretending that certain information on the Internet doesn’t exist on the Internet at all.  The offending content is still on the Internet, people just cannot rely on finding the content using individuals’ names entered into search engines (at least the search engines on European country-coded domains).


[1] These thoughts are my own and not the company for which I work and I do not profess to be an expert in search technologies or the arguments made by the parties in the case.

Privacy Calendar

6:00 pm Consumer Action’s 43rd Annual Aw... @ Google
Consumer Action’s 43rd Annual Aw... @ Google
Oct 21 @ 6:00 pm – Oct 21 @ 8:00 pm
To mark its 43rd anniversary, Consumer Action’s Annual Awards Reception on October 21, 2014, will celebrate the theme of “Train the Trainer.” Through the power of individual and small group trainings, Consumer Action each year is[...]
9:00 am Web Privacy & Transparency Confe... @ Princeton University
Web Privacy & Transparency Confe... @ Princeton University
Oct 24 @ 9:00 am – 4:00 pm
On Friday, October 24, 2014, the Center for Information Technology Policy (CITP) at Princeton University is hosting a public conference on Web Privacy and Transparency. It will explore the quickly emerging area of computer science research that[...]
4:00 pm Big Data and Privacy: Navigating... @ Schulze Hall
Big Data and Privacy: Navigating... @ Schulze Hall
Oct 29 @ 4:00 pm – 7:00 pm
The rapid emergence of “big data” has created many benefits and risks for businesses today. As data is collected, stored, analyzed, and deployed for various business purposes, it is particularly important to develop responsible data[...]
9:00 am The Privacy Act @40: A Celebrati... @ Georgetown Law
The Privacy Act @40: A Celebrati... @ Georgetown Law
Oct 30 @ 9:00 am – 5:30 pm
The Privacy Act @40 A Celebration and Appraisal on the 40th Anniversary of the Privacy Act and the 1974 Amendments to the Freedom of Information Act October 30, 2014 Agenda 9 – 9:15 a.m. Welcome[...]
all-day George Washington Law Review 201... @ George Washington University Law School
George Washington Law Review 201... @ George Washington University Law School
Nov 7 – Nov 8 all-day
Save the date for the GW Law Review‘s Annual Symposium, The FTC at 100: Centennial Commemorations and Proposals for Progress, which will be held on Saturday, November 8, 2014, in Washington, DC. This year’s symposium, hosted in[...]
10:15 am You Are Here: GPS Location Track... @ Mauna Lani Bay Hotel & Bungalows
You Are Here: GPS Location Track... @ Mauna Lani Bay Hotel & Bungalows
Nov 11 @ 10:15 am
EFF Staff Attorney Hanni Fakhoury will present twice at the Oregon Criminal Defense Lawyers Association’s Annual Sunny Climate Seminar. He will give a presentation on government location tracking issues and then participate in a panel[...]
all-day IAPP Practical Privacy Series 2014
IAPP Practical Privacy Series 2014
Dec 2 – Dec 3 all-day
Government and FTC and Consumer Privacy return to Washington, DC. For more information, click here.

View Calendar