Author Archive

Privacy Chutzpah: A Story for the Onion?

I recently received an email promoting a campaign by a group called Some of Us, an organization that generates petitions opposing various activities of large companies. This campaign was directed at Facebook, calling on the social network not to sell user data to advertisers. Facebook has recently announced plans to allow advertisers to target ads to Facebook users based on the web sites those users have visited. Facebook is not selling user data to advertisers, but I can understand the confusion. Behavioral advertising is complicated, and although selling user data to advertisers is very different from choosing ads for users based on their web surfing, it’s not uncommon for critics to use broad language to blast targeted ads in general.

[Video: “How Ads Work on Facebook,” from Facebook on Vimeo]

The surprise was what I found when I examined the privacy policy for the Some of Us site. In a move worthy of an Onion fake news story, the Some of Us policy discloses that it works with ad networks to retarget ads to users around the web after they visit the Some of Us site. Yup! Some of Us does exactly what it is calling on users to protest to Facebook about. A quick scan of the site using the popular tracking cookie scanner Ghostery finds the code for several ad companies, including leading data broker Acxiom.

Some of Us also complains that the Facebook opt-out process, in which Facebook links users to the industry’s central opt-out site at aboutads.info, is too tedious. But Some of Us doesn’t even bother to provide its visitors with a link or a URL to opt out, as the behavioral advertising code enforced by the Better Business Bureau requires. Some of Us just tells visitors they can visit the Network Advertising Initiative opt-out page, leaving them to figure out how to find that page on their own.

It gets better. Some of Us solicits users’ emails and names for petitions, but only if you read the site privacy policy will you learn that signing a petition adds you to the email list for future emails from Some of Us about other causes. The site privacy policy also explains the use of email web bugs that enable Some of Us to personally track if and when the recipients of its emails open and read them.

I am used to reading media stories blasting behavioral ads, published on newspaper home pages embedded with dozens of web trackers. Reporters don’t run the web sites of newspapers, and although they might want to consider whether the ad tracking they consider odious is funding their salaries, they can credibly argue that the business side of media and reporting are separate worlds. But how can an advocacy group blast behavioral ads while targeting behavioral ads at users who come to sign a petition against behavioral ads?!

I signed the petition and was immediately taken to a page where Some of Us encouraged me to share the news with my friends on Facebook.

-Jules Polonetsky, Executive Director

This post originally appeared on LinkedIn

Mexico Takes Step Toward Data Privacy Interoperability

Last week, Mexico’s Federal Institute for Access to Information and Data Protection (IFAI) hosted an event in Mexico City to discuss the recently announced “Parameters of Self-Regulation for the Protection of Personal Data.” FPF participated in this workshop along with representatives from the Mexican government, TRUSTe, EuroPriSe and the Better Business Bureau.

As described in opening remarks by the Secretary for Data Protection, under the new regulation IFAI now has the authority to recognize codes of conduct for data protection and has developed a process through which an organization can be recognized as a certifying body for these codes. Under this process, the Mexican Accreditation Agency will evaluate applicant organizations against a set of recognition criteria. Successful applicants will then receive formal recognition as certifying entities from the Ministry of the Economy.

This approach mirrors the process developed as part of the Asia-Pacific Economic Cooperation (APEC) Cross Border Privacy Rules (CBPR) system in several key ways. First, the certifying organizations contemplated under this approach serve the same function as “Accountability Agents” under the CBPR system. Second, both approaches require formal recognition based on established criteria. Third, the standards against which these organizations will certify companies are both keyed to Mexico’s Federal Law on the Protection of Personal Information (the legal basis for Mexico’s participation in the CBPR system). Given these parallels in both process and substance, a company that receives CBPR certification in Mexico should also be able to attain recognition under this approach. But perhaps most importantly, CBPR certification should allow a company to avail itself of the incentives offered under Mexican law.

Article 68 of the implementing regulations of the privacy law encourages the development of self-regulatory frameworks and states that participation in a recognized framework (such as the CBPR system) will be taken into account by IFAI in determining any reduction in sanctions in the event of a violation of the privacy law.

What makes this development so critical to global interoperability is that it serves as a model for how other APEC member economies can incorporate an enforceable code of conduct based on an international standard into a legal regime – including by extending express benefits to certified companies. It remains to be seen how other APEC economies will manage this task – but Mexico’s approach offers a promising start.

-Josh Harris, Policy Director 

“Gambling? In This Casino?” Jules and Omer on the Facebook Experiment

Today, Re/code ran an essay by Jules Polonetsky and Omer Tene, offering their take on Facebook’s now-infamous experiment looking at the effects of tweaking the amount of positive or negative content in a user’s News Feed:

As the companies that serve us play an increasingly intimate role in our lives, understanding how they shape their services to influence users has become a vexing policy issue. Data can be used for control and discrimination or utilized to support fairness and freedom. Establishing a process for ethical decision-making is key to ensuring that the benefits of data exceed their costs.

FPF Statement on Today’s Joint Subcommittee Hearing on Education Privacy

One of the most important sections of the Administration’s recent report on Big Data focused on education technology and privacy. The report noted the need to ensure that innovations in educational technology, including new approaches and business models, have ample opportunity to flourish.

These technologies offer robust tools to improve teaching and instructional methods; diagnose students’ strengths and weaknesses and adjust materials and approaches for individual learners; identify at-risk students so teachers and counselors can intervene early; and rationalize resource allocation and procurement decisions. Today, students can access materials, collaborate with each other, and complete homework all online.

Some of these new technologies and uses of data raise privacy concerns. Schools may not have the proper contracts in place to protect data and restrict uses of information by third parties. Many school officials may not even have an understanding of all the data they hold. As privacy expert Daniel Solove has noted, privacy infrastructure in K-12 schools is lacking. Without this support, some schools and vendors may not understand their obligations under student privacy laws such as COPPA, FERPA, and PPRA.

The Future of Privacy Forum believes it is critical that schools are provided with the help needed to build capacity for data governance, train essential personnel, and conduct basic auditing. Schools must also ensure greater data transparency to engender trust, tapping into innovative solutions such as digital backpacks and providing parent-friendly communications that explain how technology and data are used in schools.

Representatives Jared Polis and Luke Messer have called for bipartisan action on student data privacy, and the Future of Privacy Forum looks forward to working with them on their efforts.

Without measures to help parents see clearly how data are used to help their children succeed, the debate about data in education will remain polarized. With such measures in place, ed tech can be further harnessed to bridge educational inequalities, better tailor solutions for individual student needs, and provide objective metrics for measurement and improvement.

Striking a nuanced and thoughtful balance between harnessing digital innovation in education and protecting student privacy will help ensure trust, transparency, and progress in our education paradigm for years to come.

-Jules Polonetsky, Executive Director

Making Perfect De-Identification the Enemy of Good De-Identification

This week, Ann Cavoukian and Dan Castro waded into the de-identification debate with a new whitepaper, arguing that the risk of re-identification has been greatly exaggerated and that de-identification will play a central role in the age of big data. FPF has repeatedly called for informed conversations about what practical de-identification requires. Part of the challenge is that terms like de-identification or “anonymization” have come to mean very different things to different stakeholders, but privacy advocates have effectively made perfection the enemy of the good when it comes to de-identifying data.

Cavoukian and Castro highlight the oft-cited re-identification of Netflix users as an example of how re-identification risks have been overblown. Researchers were able to compare data released by Netflix with records available on the Internet Movie Database in order to uncover the identities of Netflix users. While this example highlights the challenges facing organizations when they release large public datasets, it is easy to overlook that only two out of 480,189 Netflix users were successfully identified in this fashion. That’s a 0.0004 percent re-identification rate – only a little bit worse than anyone’s odds of being struck by lightning.*
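For readers who want to check the math, here is a quick back-of-the-envelope calculation in Python. The 1-in-1,000,000 annual lightning figure is an assumption on my part (a commonly cited U.S. estimate), not a number from the Netflix study:

# Back-of-the-envelope check: Netflix re-identification rate vs. an assumed
# ~1-in-1,000,000 annual risk of being struck by lightning.
reidentified_users = 2
dataset_size = 480_189

reid_rate = reidentified_users / dataset_size   # ~0.0000042, i.e. ~0.0004 percent
annual_lightning_risk = 1 / 1_000_000           # assumed figure, ~0.0001 percent

print(f"Re-identification rate: {reid_rate:.6%}")              # ~0.000417%
print(f"Annual lightning risk:  {annual_lightning_risk:.6%}")  # 0.000100%
print(f"Ratio: {reid_rate / annual_lightning_risk:.1f}x")      # ~4.2x, i.e. "a little bit worse"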

De-identification’s limitations are often conflated with a lack of trust in how organizations handle data in general. Most of the big examples of re-identification, like the Netflix example, focus on publicly released datasets. When data is released into the wild, organizations need to be extremely careful; once data is out there, anyone with the time, energy, or technological capability can try to re-identify the dataset. There’s no question that companies have made mistakes when it comes to making their data widely available to the public.

But focusing on publicly released information does not describe the entire universe of data that exists today. In reality, much data is never released publicly. Instead, de-identification is often paired with a variety of administrative and procedural safeguards that govern how individuals and organizations can use data. When these safeguards are used in combination, bad actors must (1) circumvent the administrative restraints and (2) then re-identify the data before getting any value from their malfeasance. As a matter of simple statistics, the probability of breaching both sets of controls and successfully re-identifying data in a non-public database is low.
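As a rough illustration of that point, here is a minimal sketch. The probabilities below are purely hypothetical, chosen only to show how the two hurdles compound; they are not estimates from any study:

# Purely hypothetical probabilities, for illustration only. An attacker must
# clear BOTH hurdles, so (assuming rough independence) the joint probability
# is far lower than either one alone.
p_defeat_admin_controls = 0.01   # hypothetical: 1-in-100 chance of circumventing safeguards
p_reidentify_record = 0.001      # hypothetical: 1-in-1,000 chance of re-identifying a record

p_both = p_defeat_admin_controls * p_reidentify_record
print(f"Joint probability: {p_both:.6f}")   # 0.000010, i.e. about 1 in 100,000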

De-identification critics remain skeptical. Some have argued that any potential ability to reconnect information to an individual’s personal identity suggests inadequate de-identification. Perfect unlinkability may be an impossible standard, but this argument is less an attack on the efficacy of de-identification than it is a manifestation of a lack of trust. When some in industry suggest we ignore privacy altogether, it becomes easier for critics to distrust how businesses protect data. Fights about de-identification thus become a proxy for how much to trust industry.

In the process, discussions about how to advance practical de-identification are lost. As a privacy community, we should fight over exactly what de-identification means. FPF is currently engaged in just such a scoping project. Recognizing that there are many different standards for how academics, advocates, and industry understand “de-identified” data should be the start of a serious discussion about what we expect out of de-identification, not casting aside the concept altogether. Perfect de-identification may be impossible, but good de-identification isn’t.

-Joseph Jerome, Policy Counsel

* Daniel Barth-Jones notes that I’ve compared the Netflix re-identification study to the annual risk of being hit by lightning and responds as follows:

This was an excellent and timely piece, but there’s a fact that should be corrected because this greatly diminishes the actual impact of the statistic you’ve cited. The article cites the fact that only two out of 480,189 Netflix users were successfully identified using the IMDb data, which rounds to a 0.0004 percent (i.e., 0.000004 or 1/240,000) re-identification risk. This is correct, but the piece then goes on to say “that’s only a little bit worse than anyone’s odds of being struck by lightning,” which, without further explanation, is likely to be misconstrued.

The blog author cites the annual risk of being hit by lightning (which is, of course, exceedingly small). However, the way most people probably think about lightning risk is not “what’s the risk of being hit in the next year,” but rather “what’s my risk of ever being hit by lightning?” While estimates of the lifetime risk of being hit by lightning vary slightly (according to the precision of the formulas used to calculate this estimate), one’s lifetime odds of being hit by lightning are somewhere between 1 in 6,250 and 1 in 10,000. So even if you go with the more conservative number here, the risk of being re-identified by the Netflix attack was only 1/24 of your lifetime risk of being hit by lightning (assuming you’ll make it to age 80 without something else getting you). This is truly a risk at a magnitude that no one rationally worries about.

Although the evidence-base provided by the Netflix re-identification was extremely thin, the algorithm is intelligently designed and it will be helpful to the furtherance of sound development of public policy to see what the re-identification potential is for such an algorithm with a real-world sparse dataset (perhaps medical data?) for a randomly selected data sample when examined with some justifiable starting assumptions regarding the extent of realistic data intruder background knowledge (which should reasonably account for practical data divergence issues).
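For completeness, here is the same comparison spelled out in code, using the figures cited in the note above (the 1-in-10,000 lifetime figure is the conservative end of the range Barth-Jones gives):

# Netflix re-identification risk vs. the conservative 1-in-10,000 estimate of
# the lifetime risk of being struck by lightning, per the note above.
netflix_reid_risk = 1 / 240_000       # i.e., 2 out of 480,189 users
lifetime_lightning_risk = 1 / 10_000  # conservative end of the cited 1-in-6,250 to 1-in-10,000 range

ratio = netflix_reid_risk / lifetime_lightning_risk
print(f"Re-identification risk is about 1/{round(1 / ratio)} of the lifetime lightning risk")  # ~1/24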


Privacy Calendar

Jul 24 (Thu), 9:30 am – 11:00 am: The Federal Trade Commission and Its Section 5 Authority: Prosecutor, Judge, and Jury @ Rayburn House Office Building, Room 2154
The House Oversight and Government Reform Committee will be holding a hearing on the Federal Trade Commission and its Section 5 authority.

Sep 15 (Mon), all day: Big Data: A Tool for Inclusion or Exclusion? @ Constitution Center
The Federal Trade Commission will host a public workshop entitled “Big Data: A Tool for Inclusion or Exclusion?” in Washington on September 15, 2014, to [...]

Sep 17 (Wed) – Sep 19, all day: IAPP Privacy Academy and CSA Congress 2014 @ San Jose Convention Center
This fall, the International Association of Privacy Professionals (IAPP) and Cloud Security Alliance (CSA) are bringing together the IAPP Privacy Academy and the CSA Congress [...]

Oct 21 (Tue), 6:00 pm – 8:00 pm: Consumer Action’s 43rd Annual Awards Reception @ Google
To mark its 43rd anniversary, Consumer Action’s Annual Awards Reception on October 21, 2014, will celebrate the theme of “Train the Trainer.” Through the power of [...]

Jan 28 (Wed), all day: Data Privacy Day
“Data Privacy Day began in the United States and Canada in January 2008, as an extension of the Data Protection Day celebration in Europe. The [...]
