Google’s FLoC & Its Impact on Privacy

For more than two decades, the third-party cookie has underpinned a multi-billion-dollar advertising surveillance industry in which netizens are followed across the web, profiled, and retargeted based on their online activity. Although the technology worked in favour of marketers, it grew beyond users’ control, permeating their browsing and breaching their privacy, until a broad consensus emerged that it should be reined in.

FLoC

Citing privacy risks, several browsers, including Firefox and Safari, have started phasing out third-party cookies by default. With the cookie’s departure, the days of cookie-based personalised advertising are also ending, leaving a void to be filled. Chrome, still reliant on cookies, has to find its footing while the privacy landscape shifts rapidly and new legal and regulatory frameworks are enacted across the world.

Google’s Federated Learning of Cohorts (FLoC) is part of the Privacy Sandbox, a Google-led initiative that lets websites request certain bits of information about users without overstepping the mark. Besides FLoC, the Privacy Sandbox covers other technologies too, for example for preventing ad fraud, for helping website developers analyse incoming traffic, and for measuring advertising effectiveness.

In March 2021, almost a year after Safari allowed users to turn off third-party cookies, Google announced that it would put an end to third-party trackers by 2022 and replace them with the Privacy Sandbox’s Federated Learning of Cohorts (FLoC for short). The idea behind FLoC is to serve ads based on users’ interests without revealing their browsing history to advertisers. FLoC replaces third-party cookies with a “cohort” identifier, which groups users with similar interests.

Tracking mechanism of FLoC

Reportedly, FLoC will use the SimHash algorithm, which was originally created for Google’s web crawlers to detect nearly identical web pages. With FLoC, users’ browsing history remains private: instead of tracking each user’s browsing history as cookies do, FLoC categorises users with similar browsing behaviour into numbered “cohorts”.
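To make the idea concrete, here is a minimal SimHash sketch in Python. It is an illustration only, not Chrome’s implementation: using visited domain names as the input features, SHA-256 as the underlying hash, and a 16-bit fingerprint are all assumptions made for this example.

```python
import hashlib


def simhash(features, bits=16):
    """Return a `bits`-wide SimHash fingerprint (bits <= 64) of an iterable of strings."""
    counts = [0] * bits
    for feature in features:
        digest = hashlib.sha256(feature.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")  # use the first 64 bits of the hash
        for i in range(bits):
            # Each feature votes +1 or -1 on every bit position.
            counts[i] += 1 if (value >> i) & 1 else -1
    fingerprint = 0
    for i, count in enumerate(counts):
        if count > 0:  # the majority vote decides each output bit
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


# Hypothetical browsing histories (assumed domain names, for illustration only).
history_a = ["shoestore.example", "dailynews.example", "trails.example"]
history_b = ["shoestore.example", "dailynews.example", "recipes.example"]
distance = hamming_distance(simhash(history_a), simhash(history_b))
```

Because every feature votes on every bit, histories that share most of their sites tend to produce fingerprints that differ in only a few bits, which is what lets similar browsers be grouped together without the raw history leaving the device.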

Each cohort, or simply group, contains thousands of users. This method hides individuals in the group and relies on on-device processing to keep a person’s web history private in the browser. Since this happens locally, on users’ devices, their data is not stored on a server, and server-side storage is one of the privacy concerns linked with third-party cookies.

Cohort formation

According to the proposed model, each week an individual’s browser reviews the sites the individual has visited and assigns itself to a cohort accordingly. Each cohort assignment reflects up to a week of interest and behaviour data and is updated weekly based on the prior week’s data.

FLoC assigns an anonymised ID to an individual’s accumulated browsing history and adds the browser to a group of other browsers with similar behaviour, whose overall patterns are accessible to advertisers. (Note: The websites visited, along with their contents, influence the user “clustering”.)

Let’s understand how Google Chrome’s algorithms assign users a common “cohort” with an example, but before that let’s get acquainted with the different parties involved in the process:

  • The advertiser (a company that pays for advertising), let’s say an online shoe retailer: shoestore.example
  • The publisher (a site that sells ad space), let’s say a news site: dailynews.example
  • The adtech platform (one that provides software and tools to deliver advertising): adnetwork.example

For this particular case, let’s call the users Brad and Angelina; their browsers belong to the same cohort, 1234. (Note: Names are random. With FLoC, names and individual identities are not revealed to the advertisers, publishers, or adtech platforms. Also, think of cohorts as a grouping of browsing activity, not a collection of people.)

Let’s see the different layers of serving ads:

  1. FLoC service: The FLoC service used by the browser builds a mathematical model with thousands of “cohorts”, each representing thousands of web browsers with similar browsing histories. Each cohort is issued an ID.
  2. Browser: From the FLoC service, Brad’s browser gets data describing the FLoC model. Using the FLoC model’s algorithm, Brad’s browser works out which cohort corresponds most closely to its own browsing history, which in this case is 1234. (Note: Brad’s browser doesn’t share any data with the FLoC service.) Similarly, Angelina’s browser calculates its cohort ID and associates itself with 1234. (Note: Angelina’s browsing history is different from Brad’s yet close enough to belong to the same cohort.)
  3. Advertiser: Brad, looking for hiking boots, visits shoestore.example. The site fetches cohort 1234 from Brad’s browser and registers that someone from cohort 1234 showed an interest in hiking boots. The site also registers further interest in its products from the same cohort, as well as from other cohorts, and periodically aggregates this data and shares it with its adtech platform, adnetwork.example.
  4. Publisher: Angelina visits dailynews.example, where the site asks Angelina’s browser for its cohort. The site then makes a request to its adtech platform, adnetwork.example, for an ad, including Angelina’s browser’s cohort, 1234.
  5. Adtech platform: adnetwork.example selects an ad suitable for Angelina by combining the data it has acquired from the publisher and the advertiser: Angelina’s cohort (1234), provided by dailynews.example, and the data on cohorts and product interests provided by shoestore.example. adnetwork.example selects an ad for hiking boots for Angelina, and dailynews.example displays the ad.
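The steps above can be sketched as a small simulation. This is a toy model under stated assumptions: the AdNetwork class, its method names, and the “most frequently reported product wins” rule are hypothetical and are not part of the FLoC proposal.

```python
from collections import Counter, defaultdict


class AdNetwork:
    """Hypothetical stand-in for adnetwork.example."""

    def __init__(self):
        # cohort ID -> how often each product interest was reported for it
        self.interest_by_cohort = defaultdict(Counter)

    def report_interest(self, cohort_id, product):
        """Called by an advertiser site: a visitor from `cohort_id` viewed `product`."""
        self.interest_by_cohort[cohort_id][product] += 1

    def select_ad(self, cohort_id, default="generic ad"):
        """Called by a publisher site: pick an ad using only the cohort ID."""
        interests = self.interest_by_cohort.get(cohort_id)
        if not interests:
            return default
        product, _count = interests.most_common(1)[0]
        return f"ad for {product}"


network = AdNetwork()

# Step 3: Brad's browser (cohort 1234) views hiking boots on shoestore.example.
network.report_interest(1234, "hiking boots")

# Steps 4 and 5: Angelina's browser (also cohort 1234) visits dailynews.example,
# which forwards only the cohort ID to the adtech platform.
ad_for_angelina = network.select_ad(1234)
```

Note that the only value crossing site boundaries here is the cohort ID; neither Brad’s nor Angelina’s identity or browsing history appears anywhere in the exchange.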

Impact on advertisers and publishers

Now that people are becoming more and more privacy-conscious, switching to cohorts is better seen as a go-to strategy for marketers than as a way of future-proofing marketing strategies, for there is nothing new about cohorts. In fact, the very concept around which FLoC is built, clustering large groups of people with a shared interest in a way that preserves their privacy, has been a marketing principle for a very long time.

Cohorts impose the same limitations on advertisers and publishers as third-party cookies did: insufficient, time-bound, browser-level insights into their audience. Advertisers see only the cohort an individual belongs to, without any information about the characteristics that link its members. With FLoC, advertisers and publishers should not expect to deliver bespoke experiences to individuals as they could with third-party trackers.

Marketers did not expect that capitalising on and monetising data, and building billion-dollar companies on it, would so soon come to hinge on privacy. But if anything was certain about advertising’s future, it was that cohorts would come into their own. According to the Google blog, simulation tests of the principles defined in Chrome’s FLoC proposal show that advertisers can expect at least 95% of the conversions per dollar spent compared with cookie-based advertising.

Source : https://blog.google/products/ads-commerce/2021-01-privacy-sandbox/

The combination of cohorts and probabilistic data, that is, identifying users by matching them with a known user who exhibits similar browsing behaviour, is a well-established concept within many of the world’s largest enterprises, but it has not received mainstream attention until now. Probabilistic onboarding is all about structuring cohorts and finding new customers. This business strategy, which lies at the heart of Google’s FLoC, cannot be overlooked in the quest for personalisation.

With the implementation of FLoC, Google wants advertisers and publishers to begin tracking user activity with their own first-party cookies rather than depending on third-party data. Marketers’ answer to FLoC will be to leverage first-party data, which will no longer be optional but will form the core of any successful marketing strategy for creating better customer experiences and optimising marketing efforts.

“73% of consumers are willing to share more data if a company is transparent about how and why it is used.”

Privacy analysis of FLoC

Numerous privacy issues with FLoC have drawn public attention well before launch. We address a few of them here:

Cohort IDs can be used for tracking

According to Firefox CTO Eric Rescorla, cohorts will likely consist of thousands of users at most. Tracking companies can employ browser fingerprinting to narrow down the list of potential users in a cohort to just a few very quickly. To do so, trackers would only require “a relatively small amount of information” when combined with a FLoC cohort.

This is possible in a number of ways:

Browser Fingerprinting

Even though users’ local browsing data is not shared (only the cohort information is transmitted), the cohort ID can be combined with other data exposed by the browser to create a unique fingerprint of each person.

Each detail that varies between users, such as browser type, operating system, language, or country, can help distinguish one user from another. If a fingerprinting technique divides a cohort of about 10,000 users into 5,000 groups, the number of users in each pair of FLoC cohort and fingerprinting group falls to single digits, which makes it easy to identify people individually.
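The arithmetic behind this can be sketched in a few lines of Python. The sketch assumes that users are spread evenly across fingerprinting groups, which real populations are not, so the figures are rough averages rather than guarantees; the function names are made up for this example.

```python
import math


def average_anonymity_set(cohort_size, fingerprint_groups):
    """Average number of cohort members sharing one fingerprint (uniform spread assumed)."""
    return cohort_size / fingerprint_groups


def bits_to_single_out(cohort_size):
    """Bits of extra identifying information needed to isolate one member of a cohort."""
    return math.log2(cohort_size)


users_per_group = average_anonymity_set(10_000, 5_000)  # 2.0 users on average
extra_bits = bits_to_single_out(10_000)                 # a little over 13 bits
```

In other words, under this assumption a tracker needs only around 13 bits of additional information, which a handful of browser attributes can plausibly supply, to narrow a 10,000-member cohort down to one browser.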

Although this may not be feasible with very large cohorts, it does not free FLoC from the risk of individual targeting.

Multiple visits

People’s interests online aren’t constant, and neither are their FLoC IDs, which are recomputed every week. If a tracker succeeds in using other already available information to link a user’s multiple visits over time, it can distinguish individual users by combining their FLoC IDs from week 1, week 2, and so on.
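A toy example shows why a sequence of weekly cohort IDs is far more identifying than any single one. The data below is entirely made up, and the sketch assumes the tracker has already observed which browsers reported which cohort ID in each week.

```python
def narrow_candidates(members_by_week, observed_cohorts):
    """Intersect the weekly cohort memberships that match one visitor's observed IDs."""
    candidates = None
    for week_members, cohort_id in zip(members_by_week, observed_cohorts):
        members = week_members.get(cohort_id, set())
        candidates = set(members) if candidates is None else candidates & members
    return candidates if candidates is not None else set()


# Hypothetical observations: cohort ID -> browsers seen with that ID in a given week.
week1 = {1234: {"u1", "u2", "u3", "u4"}, 5678: {"u5", "u6"}}
week2 = {4321: {"u2", "u4", "u5"}, 8765: {"u1", "u3", "u6"}}
week3 = {1111: {"u4", "u6"}, 2222: {"u1", "u2", "u3", "u5"}}

# One returning visitor reported cohorts 1234, 4321 and 1111 in successive weeks.
remaining = narrow_candidates([week1, week2, week3], [1234, 4321, 1111])
# remaining == {"u4"}: four candidates after week 1, two after week 2, one after week 3
```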

This poses a serious de-anonymisation risk, as FLoC restores a form of cross-site tracking even for users who have anti-tracking mechanisms enabled.

The project’s GitHub page states, “Sites that know a person’s PII (e.g., when people sign in using their email address) could record and reveal their cohorts. This means that information about an individual’s interest may eventually become public.” In other words, FLoC’s technology will share personal data with existing trackers that already identify users.

Source : https://github.com/WICG/floc

FLoC exposes far more information than necessary

With cookie-based tracking, a site interested in learning users’ interests has to participate in tracking the user across a large number of sites or work with some of the big trackers. FLoC removes that requirement.

Because FLoC IDs are common across all sites, they become a shared key that trackers can use to link data from external sources. A tracker with a large first-party interest database could build a service that answers questions about the interests of a given FLoC ID, such as “Do people with this cohort ID like pizza?” All a site needs to do is call the FLoC API to fetch the cohort ID and then look it up in the service.

Also, this ID can be combined with fingerprinting data to learn a lot more about a user. For example, “Do people who have this cohort ID, live in India and use Safari have any affinity for a certain product?”
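A minimal sketch of such a lookup service might look like the following. The CohortInterestService class and its methods are hypothetical, invented for this example; the point is only that the cohort ID works as a join key across otherwise unrelated data sets.

```python
from collections import defaultdict


class CohortInterestService:
    """Hypothetical tracker-side service keyed by FLoC cohort ID."""

    def __init__(self):
        self._visitors = defaultdict(int)  # cohort ID -> visits observed
        self._likes = defaultdict(int)     # (cohort ID, topic) -> visits showing that interest

    def record_visit(self, cohort_id, topics):
        """Store the interests seen in one first-party visit from `cohort_id`."""
        self._visitors[cohort_id] += 1
        for topic in set(topics):
            self._likes[(cohort_id, topic)] += 1

    def affinity(self, cohort_id, topic):
        """Fraction of observed visits from the cohort that showed the interest."""
        visitors = self._visitors.get(cohort_id, 0)
        if visitors == 0:
            return None  # nothing is known about this cohort
        return self._likes.get((cohort_id, topic), 0) / visitors


service = CohortInterestService()
service.record_visit(1234, ["pizza", "hiking"])
service.record_visit(1234, ["pizza"])
service.record_visit(1234, ["news"])

# Any site that fetches a visitor's cohort ID can now ask the pizza question.
pizza_affinity = service.affinity(1234, "pizza")  # two of three observed visits
```

Extending the key from the cohort ID alone to a tuple such as (cohort ID, country, browser) would give the finer-grained query described above, at the cost of smaller and therefore more identifying groups.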

Safety of sensitive information

Google has proposed suppressing FLoC cohorts that it finds closely linked with “sensitive” topics. In a whitepaper entitled “Measuring Sensitivity of Cohorts Generated by the FLoC API”, Google details its strategy for keeping sensitive data safe. Read more about the whitepaper at Sensitivity of Cohorts.pdf (google.com).

If Google finds that users in a given cohort frequently visit a set of sites with sensitive information, Chrome will return an empty cohort ID for that cohort. In addition, Google will remove sites that it finds sensitive from the FLoC computation.
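The two safeguards can be pictured roughly as follows. This is a simplified sketch and does not reproduce the criterion in Google’s whitepaper: the rule used here (suppress a cohort whose rate of sensitive visits exceeds the overall rate by more than a fixed margin), the margin itself, and the function names are all assumptions made for illustration.

```python
def exposed_cohort_id(cohort_id, cohort_sensitive_rate, population_sensitive_rate,
                      max_excess=0.1):
    """Return the cohort ID, or None (an empty ID) if the cohort looks sensitive."""
    excess = cohort_sensitive_rate - population_sensitive_rate
    return cohort_id if excess <= max_excess else None


def filter_history(history, sensitive_sites):
    """Drop sites deemed sensitive before the cohort is computed."""
    return [site for site in history if site not in sensitive_sites]


# Hypothetical rates: share of page visits that fall on sensitive sites.
ordinary = exposed_cohort_id(1234, cohort_sensitive_rate=0.12, population_sensitive_rate=0.10)
suppressed = exposed_cohort_id(5678, cohort_sensitive_rate=0.45, population_sensitive_rate=0.10)

cleaned = filter_history(["dailynews.example", "clinic.example"], {"clinic.example"})
```

Both functions depend on someone deciding in advance which sites count as sensitive, which is exactly where such a scheme is hardest to get right.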

However, complications in categorising sensitive information make Google’s defence mechanism hard to execute: people disagree over what qualifies as sensitive for them, the list of sensitive categories is incomplete, and non-sensitive sites can correlate with sensitive ones.

Although Google has proposed several countermeasures to mitigate sensitive-data problems, including making FLoC opt-in for websites and suppressing cohorts associated with sensitive topics, Firefox finds them insufficient.

Addressing this issue, Rescorla wrote, “While these mitigations seem useful, they seem to mostly be improvements at the margins, and don’t address the basic issues described above, which we believe require further study by the community.”

Underscoring the importance of protecting sensitive data in the post-cookie era, Marshall Vale, product manager for Google’s Privacy Sandbox, writes: “Before a cohort becomes eligible, Chrome analyses it to see if the cohort is visiting pages with sensitive topics, such as medical websites or websites with religious content, at a high rate. If so, Chrome ensures that the cohort isn’t used, without learning which sensitive topics users were interested in.”

Source: https://blog.google/products/chrome/privacy-sustainability-and-the-importance-of-and/

FLoC is drawing criticism, for obvious reasons

FLoC is being tested only in countries where the GDPR and the ePrivacy Directive do not apply. The FLoC trial in the European Union has been put on hold over concerns about GDPR and ePrivacy compliance. FLoC lacks a consent mechanism that lets users opt out of having their interest and behavioural data used for advertising.

The other point of confusion concerns the roles of Data Controller and Data Processor as defined in the EU GDPR. It is unclear who would serve as the Data Controller and who as the Data Processor in the creation of cohorts.

Source : Google Will Not Run FLoC Origin Tests In Europe Due To GDPR Concerns (At Least For Now) | AdExchanger

According to Malwarebytes, millions of Chrome users were automatically made part of the FLoC pilot without being informed. Despite its rhetoric about safeguarding user privacy, Google started testing FLoC without sending individual notifications to users. Chrome users have no dedicated way to opt out; instead, they have to block all third-party cookies to leave the trial.

Source: https://blog.malwarebytes.com/cybercrime/privacy/2021/04/millions-of-chrome-users-quietly-added-to-googles-floc-pilot/

In the Electronic Frontier Foundation (EFF) post “Google’s FLoC Is a Terrible Idea”, author Bennett Cyphers argues that Google is presenting a false dichotomy when it comes to privacy: “Instead of re-inventing the tracking wheel, we should imagine a better world without the myriad problems of targeted ads.” Users’ options, he argues, should not be reduced to “You either have old tracking or new tracking”.

Source: https://www.eff.org/deeplinks/2021/03/googles-floc-terrible-idea

Privacy advocates such as DuckDuckGo and the Brave browser take issue with all forms of tracking. Pointing out that Google’s tracking via FLoC is not optional, DuckDuckGo has spoken out against the new tracking technology and is bringing FLoC-blocking features to its search engine and its Chrome browser extension. Brave said that FLoC promotes a false notion of what privacy is and why privacy is important.

Amazon has also reportedly blocked Google’s FLoC tests from collecting data from its digital properties, such as Amazon.com, WholeFoods.com and Zappos.com. This is a major step, as it denies FLoC the ability to collect valuable shopper data within Amazon’s e-commerce ecosystem.

Source : Amazon Allegedly Blocking Google FLoC: Reports 06/17/2021 (mediapost.com)

Google FLoC and the European Union antitrust probe

The European Commission has already confirmed in writing that Google’s use of data to fuel its adtech business is a focus of its ongoing antitrust investigation. Google’s attempt to kill off third-party cookies with its Federated Learning of Cohorts system also falls within the scope of the EU’s adtech competition probe.

Source : EU’s Margrethe Vestager Confirms That Google’s Planned Removal Of Third-Party Cookies Is An Antitrust Concern | AdExchanger

Conclusion

These changes are being driven by consumer demands for privacy and control over their personal information.

But consumer demand is not the only motivator. New legislative requirements and clear guidance from regulators also signal that it is time to move away from the tracking-based advertising ecosystem. Each successive piece of privacy legislation has dealt a further blow to cookies. Regulatory guidance stemming from the EU General Data Protection Regulation and the ePrivacy Directive ushered in new approaches to cookie banners, requiring specific, unambiguous and freely given consent. Cookie notice banners were in, then they were out. Cookie walls became a problem; toggles replaced them.

The regulations under the California Consumer Privacy Act (CCPA) of 2018 require businesses to honour user-enabled global privacy controls that send opt-out signals, which necessitates a browser-based opt-out tool.

For more information, read the California Consumer Privacy Act of 2018 at Codes Display Text (ca.gov)

The California Privacy Rights Act (CPRA) included cookies in the definition of personal information. This raises the interesting possibility that deploying third-party cookies for advertising constitutes a sale, requiring an opt-out button.

The California Privacy Rights Act, which comes into effect in January 2023, has brought more clarity to the rules for third-party cookies, granting consumers the right to opt out of the sharing of their personal information for cross-context behavioural advertising. The European Union’s proposed ePrivacy Regulation requires explicit consent prior to the placement of a cookie but allows users to “whitelist” certain providers via browser settings to avoid constant cookie requests. This legislative progression has pushed not only for increasingly granular cookie consent but also towards platform-based control.

For more information, read the California Consumer Privacy Rights Act at https://oag.ca.gov/system/files/initiatives/pdfs/19-0021A1 %28Consumer Privacy – Version 3%29_1.pdf

Users’ growing privacy awareness and their intent to safeguard their personal data do not benefit targeted advertisers or Google. The underlying reason for all the hullabaloo about FLoC is that, in its current form, which is still in the testing phase, it would carry a number of privacy risks if deployed.

At first glance FLoC appears to be a win-win for advertisers, publishers and internet users, but there is more to it than easy execution and Google’s dream of dominance in the adtech world, as we shall see once Google finally unveils its long-awaited advertising technique and the market responds to it.

While FLoC has recently been a source of uncertainty for marketers, it is time for them to get serious about a first-party data strategy, which is the future of digital marketing.

We at Data Secure (www.datasecure.ind.in) can help you understand privacy and trust when dealing with data, and we provide privacy training and awareness sessions to build on what you already know about privacy.

For any demo/presentation of solutions on Data Privacy and Privacy Management as per EU GDPR, CCPA or Draft India PDPB 2019 and Secure Email transmission, kindly write to us at info@datasecure.ind.in.

For downloading the various Global Privacy Laws kindly visit the Resources page in DATA SECURE – Privacy Automation Solution
