MAC Addresses and De-Identification

MAC Addresses and De-Identification

Location analytics companies log the hashed MAC address of mobile devices in range of their sensors at airports, malls, retail locations, stadiums and other venues. They do so primarily in order to create statistical reports that provide useful aggregated information such as average wait times on line, store “hot spots,” and the percentage of devices that never make it into a zone that includes a checkout register. FPF worked with the leading companies providing these services to create an enforceable Mobile Location Code of Conduct that restricts discriminatory uses of data, creates a central opt out, promotes in-store notice and other protections. We filed comments last week with the FTC describing the program in detail.

The only data transmitted by mobile devices that most location companies can log is the MAC address – the Wi-Fi or Bluetooth identifier devices broadcast when Wi-Fi or Bluetooth is turned on. The privacy debate around the use of this technology and the Code has centered on the sensitivity of logging and maintaining hashed MAC addresses, and hinges on whether a MAC address should be considered personal information.

Is a MAC address personal information? Well, it is linked to individual consumer devices, either as a consistent Wi-Fi or Bluetooth identifier. If enough data is linked to any consistent identifier over time, it is in the realm of technical possibility that the identity of a user can be ascertained. If there was a commercially-available database of MAC addresses, it is possible that such a database could be used to identify users. We are not aware of any such MAC address look-up database. But we do recognize that the data collected is linked to a specific device. For this reason, the Code of Conduct treats hashed MAC addresses associated with unique devices as something in between fully anonymized data and explicitly personal data. This reflects the view that Professor Daniel Solove posited effectively when he argued that PII exists not as a binary, but on a spectrum, with no risk of identification at one end, and individual identification at the other. In many real-world instances of data collection, the privacy standards in place reflect where the data lies on this spectrum; they consist not only of technical measures to protect the data, but also internal security and administrative controls, as well as enforceable legal commitments. In the case of Mobile Location Analytics, many companies are confident that by hashing MAC addresses, keeping them under administrative and security controls, and publicly committing not to attempt to identify users, they have adequately de-identified the data they log.

However, it is important to understand, that Code does NOT take the position that hashing MAC addresses amounts to a de-identification process that fully resolves privacy concerns. According to the Code, data is only considered fully “de-identified” where it may not reasonably be used to infer information about or otherwise be linked to a particular consumer, computer, or other device. To qualify as de-identified under the Code, a company must take measures such as aggregating data, adding noise to data, or statistical sampling. These are considered to be reasonable measures that de-identify data under the Code, as long as an MLA company also publicly commits not to try to re-identify the data, and contractually prohibits downstream recipients from trying to re-identify it. To assure transparency, any company that does de-identify data in this way must describe how they do so in their privacy policy.

As most of the companies involved in mobile location analytics do indeed link hashed MAC addresses to individual devices, the data they collect to track devices over time does not qualify as strictly “de-identified” under the Code and the data they collect is not exempt from the Code. Rather, the companies collect and use what the Code terms “de-personalized” data.* De-personalized data is defined in the Code as data that can be linked to a particular device, but cannot reasonably be linked to a particular consumer. Companies using de-personalized data must:

    1. take measures to ensure that the data cannot reasonably be linked to an individual (for instance, hashing a MAC address or deleting personally identifiable fields);
    2. publicly commit to maintain the data as de-personalized; and
    3. contractually prohibit downstream recipients from attempting to use the data to identify a particular individual.

When companies hash MAC addresses, they are thus fully subject to the Codes requirements, including signage, consumer choice, non-discrimination.

Different kinds of data on the PII/non-PII spectrum — given the inherent risks and benefits of each — merit a careful consideration of the combination of reasonable technical encryption and administrative measures and legal commitments that would be most suitable. After all, if “completely unidentifiable by any technical means, no matter how complex or unlikely” were the standard for the use of any data in the science and business worlds, much valuable research and commerce would come to an end. The MLA Code represents a pragmatic view that allows vendors to provide a service that is useful for businesses and consumers, while applying responsible privacy standards.

* Suggestions for a better term that de-personalized are welcomed. We considered “pseudonymized” but found the term awkward.

Comments

Posted On
Mar 27, 2014
Posted By
Jim Fenton

There are at least two issues you haven’t addressed:

1. The ability of location analytics compaies to aggregate results from more than one of their customers. If they’re simply providing a service to retailer A and retailer B, they should hash the MAC addresses from the two stores differently (using different retailer-specific values appended to the MAC addresses). But if they’re planning to aggregate the results from multiple retailers, they’re creating a more comprehensive profile of the consumer, and that creates a greater privacy concern.

2. None of the hashing discussion addresses requests that might be obtained from the government about information relating to the activities of a given MAC address. The given MAC address could, of course, just be hashed and compared with the database. While such requests might be entirely legitimate, the location analytics companies should not create a false expectation that hashing addresses this sort of potential privacy concern and should explicitly state how long individual records are kept.

Leave a Reply


Privacy Calendar

Oct
24
Fri
9:00 am Web Privacy & Transparency Confe... @ Princeton University
Web Privacy & Transparency Confe... @ Princeton University
Oct 24 @ 9:00 am – 4:00 pm
On Friday, October 24, 2014, the Center for Information Technology Policy (CITP) at Princeton University is hosting a public conference on Web Privacy and Transparency. It will explore the quickly emerging area of computer science research that[...]
Oct
29
Wed
4:00 pm Big Data and Privacy: Navigating... @ Schulze Hall
Big Data and Privacy: Navigating... @ Schulze Hall
Oct 29 @ 4:00 pm – 7:00 pm
The rapid emergence of “big data” has created many benefits and risks for businesses today. As data is collected, stored, analyzed, and deployed for various business purposes, it is particularly important to develop responsible data[...]
Oct
30
Thu
9:00 am The Privacy Act @40: A Celebrati... @ Georgetown Law
The Privacy Act @40: A Celebrati... @ Georgetown Law
Oct 30 @ 9:00 am – 5:30 pm
The Privacy Act @40 A Celebration and Appraisal on the 40th Anniversary of the Privacy Act and the 1974 Amendments to the Freedom of Information Act October 30, 2014 Agenda 9 – 9:15 a.m. Welcome[...]
Nov
7
Fri
all-day George Washington Law Review 201... @ George Washington University Law School
George Washington Law Review 201... @ George Washington University Law School
Nov 7 – Nov 8 all-day
Save the date for the GW Law Review‘s Annual Symposium, The FTC at 100: Centennial Commemorations and Proposals for Progress, which will be held on Saturday, November 8, 2014, in Washington, DC. This year’s symposium, hosted in[...]
Nov
11
Tue
10:15 am You Are Here: GPS Location Track... @ Mauna Lani Bay Hotel & Bungalows
You Are Here: GPS Location Track... @ Mauna Lani Bay Hotel & Bungalows
Nov 11 @ 10:15 am
EFF Staff Attorney Hanni Fakhoury will present twice at the Oregon Criminal Defense Lawyers Association’s Annual Sunny Climate Seminar. He will give a presentation on government location tracking issues and then participate in a panel[...]
Nov
12
Wed
all-day PCLOB Public Meeting on “Definin... @ Washington Marriott Hotel
PCLOB Public Meeting on “Definin... @ Washington Marriott Hotel
Nov 12 all-day
The Privacy and Civil Liberties Oversight Board will conduct a public meeting with industry representatives, academics, technologists, government personnel, and members of the advocacy community, on the topic: “Defining Privacy.”   While the Board will[...]
Nov
20
Thu
all-day W3C Workshop on Privacy and User... @ Berlin, Germany
W3C Workshop on Privacy and User... @ Berlin, Germany
Nov 20 – Nov 21 all-day
The Workshop on User Centric App Controls intents to further the discussion among stakeholders of the mobile web platform, including researchers, developers and service providers. This workshop serves to investigate strategies toward better privacy protection[...]

View Calendar