Data Sanitization

Firefox has several Data Sanitization features. They allow users to clear preferences and website data. Clearing data is an essential feature for user privacy. There are two major privacy issues data clearing helps mitigate:

  1. Websites tracking the user via web-exposed APIs and storage. This can be traditional storage, e.g. localStorage or cookies. However, sites can also abuse other browser state, e.g. caches, as supercookies to persist data in the browser.

  2. Attackers who have control over a computer can exfiltrate data from Firefox, such as history, passwords, etc.

Protection Background

What similar protections do other browsers have?

All major browsers implement data clearing features (Chrome, Edge, Safari, Brave). They usually include a way for users to clear site data within a configurable time-span along with a list of data categories to be cleared.

Chrome, Edge and Brave all share Chromium’s data clearing dialog with minor adjustments. Notably, Brave extends it with a clear-on-shutdown mechanism similar to Firefox, while Chrome only supports clearing site data on shutdown.

Safari’s history clearing feature only allows users to specify a time span. It does not allow filtering by categories, but clears all website related data.

All browsers allow fine grained control over website cookies and storages via the developer tools.

Is it standardized?

This is a browser UX feature and is therefore not standardized. It is not part of the web platform.

There is a standardized HTTP header that sites can send to clear associated browser cache, cookies and storage: Clear-Site-Data. However, Firefox no longer allows sites to clear caches via the header since Bug 1671182.

How does it fit into our vision of “Zero Privacy Leaks?”

Clearing site data protects users against various tracking techniques that rely on browser state to (re-)identify users. While Total Cookie Protection covers many cross-site tracking scenarios, clearing site data can additionally protect against first-party tracking and other tracking methods that bypass TCP such as navigational tracking.

Firefox Status

What is the ship state of this protection in Firefox?

This long-standing set of features ships in Release in the default ETP mode. In Firefox 91 we introduced Enhanced Cookie Clearing, which makes use of TCP’s cookie jars. This feature only benefits users who have TCP enabled, which is not enabled by default.

Is there outstanding work?

Since Bug 1422365 the ClearDataService provides a common interface to clear data of various storage implementations. However, we don’t have full coverage of all browser state yet. There are several smaller blind spots, most of which are listed in this meta bug. There is also a long backlog of data sanitization bugs here.

The clear history dialog has an intuitive, modern UI that gives users an easy way to clear their data and the confidence that their privacy is protected in Firefox. The UI changes were undertaken in this meta bug.

Data clearing can take a long time on bigger Firefox profiles. Since these operations mostly run on the main thread, this can lock up the UI making the browser unresponsive until the operation has completed.

Generally it would be worth revisiting cleaner implementations in the ClearDataService and beyond to see where we can improve clearing performance.

Slow data clearing is especially problematic on shutdown. If the sanitize-on-shutdown feature takes too long to clear storage, the parent process will be terminated, resulting in a shutdown crash. Bug 1756724 proposes a solution to this: We could show a progress dialog when clearing data. This way we can allow a longer shutdown phase, since the user is aware that we’re clearing data.

Important outstanding bugs:

Existing Documentation

-

Technical Information

Feature Prefs

Pref | Description
places.forgetThisSite.clearByBaseDomain | Switches “Forget about this site” to clear for the whole base domain rather than just the host.
privacy.sanitize.sanitizeOnShutdown | Whether to clear data on Firefox shutdown.
privacy.clearOnShutdown.* | Categories of data to be cleared on shutdown. True = clear category. Data is only cleared if privacy.sanitize.sanitizeOnShutdown is enabled.
privacy.clearHistory.* | Categories of data to be cleared in the clear history or browser context. True = clear category.
privacy.clearSiteData.* | Categories of data to be cleared in the clear site data context. True = clear category.
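
The way the shutdown prefs gate each other can be sketched as follows. This is a minimal, self-contained model and not Firefox code: the pref store is a plain Map, the function name is hypothetical, and the category names "cookies" and "cache" are only illustrative examples of the privacy.clearOnShutdown.* branch.

```javascript
// Toy pref store; in Firefox these values would come from the preference service.
const prefs = new Map([
  ["privacy.sanitize.sanitizeOnShutdown", true],
  ["privacy.clearOnShutdown.cookies", true], // illustrative category name
  ["privacy.clearOnShutdown.cache", false], // illustrative category name
]);

const SHUTDOWN_BRANCH = "privacy.clearOnShutdown.";

// Hypothetical helper: returns the category names that would be cleared on shutdown.
function categoriesToClearOnShutdown(prefStore) {
  // The master switch gates every category pref.
  if (prefStore.get("privacy.sanitize.sanitizeOnShutdown") !== true) {
    return [];
  }
  const categories = [];
  for (const [name, value] of prefStore) {
    if (name.startsWith(SHUTDOWN_BRANCH) && value === true) {
      categories.push(name.slice(SHUTDOWN_BRANCH.length));
    }
  }
  return categories;
}

console.log(categoriesToClearOnShutdown(prefs));
```

With the master switch set to false the function returns an empty list, regardless of the individual category prefs.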

How does it work?

The following section lists user-facing data sanitization features in Firefox, along with a brief description and a diagram of how they tie into the main clearing logic in nsIClearDataService.

The recent revamp of the clear history dialog consolidated the various entry points so that they all use the same dialog.

Clear browsing data and cookies

[Image: new clear history dialog]

Sanitize on Shutdown

  • Can be enabled via about:preferences#privacy => History: Firefox will: Use custom settings for history => Check “Clear history when Firefox closes”

    • After Bug 1681493 it can also be controlled via the checkbox “Delete cookies and site data when Firefox is closed”

  • On shutdown of Firefox, will clear all data for the selected categories. The list of categories is defined in Sanitizer.sys.mjs

  • Categories are the same as for the “Clear recent history” dialog

  • Exceptions

    • Sites which have a “cookie” permission set to ACCESS_SESSION always get cleared, even if sanitize-on-shutdown is disabled

    • Sites which have a “cookie” permission set to ACCESS_ALLOW are exempt from data clearing

    • Caveat: When “site settings” is selected in the categories to be cleared, the Sanitizer will remove exception permissions too. This results in the above exceptions being cleared.

  • Uses PrincipalsCollector to obtain a list of principals which have site data associated with them

  • getAllPrincipals queries the QuotaManager, the cookie service and the service worker manager for principals

  • The list of principals obtained is checked for permission exceptions. Principals with a “cookie” permission set to ACCESS_ALLOW are removed from the list.

  • Sanitizer.sys.mjs calls the ClearDataService to clear data for every principal from the filtered list

  • Source

flowchart TD
  A[Clear History] & B[Clear Site Data] & C[Clear On Shutdown] -->|init| D[sanitizeDialog.js]
  D -->|Clear| E[Sanitizer.sys.mjs]
  E --> F[ClearDataService.sys.mjs]
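
The permission exceptions for sanitize-on-shutdown can be illustrated with a small sketch. It is a simplified model under stated assumptions: origins are plain strings standing in for principals, permission values are modelled as strings rather than the numeric constants Firefox uses, and originsToClearOnShutdown is a hypothetical helper, not a function in Sanitizer.sys.mjs. The caveat about the “site settings” category removing exceptions is not modelled.

```javascript
// Permission values are modelled as strings here; Firefox uses numeric constants.
const ACCESS_ALLOW = "ACCESS_ALLOW";
const ACCESS_SESSION = "ACCESS_SESSION";

// Hypothetical helper: decide which origins get cleared on shutdown.
// `origins` stands in for the collected principals, `cookiePermissions`
// maps an origin to its "cookie" permission, if it has one.
function originsToClearOnShutdown(origins, cookiePermissions, sanitizeOnShutdown) {
  return origins.filter(origin => {
    const permission = cookiePermissions.get(origin);
    if (permission === ACCESS_SESSION) {
      return true; // always cleared, even when sanitize-on-shutdown is off
    }
    if (!sanitizeOnShutdown) {
      return false;
    }
    return permission !== ACCESS_ALLOW; // an explicit allow exempts the origin
  });
}

const origins = ["https://a.example", "https://b.example", "https://c.example"];
const cookiePermissions = new Map([
  ["https://a.example", ACCESS_ALLOW],
  ["https://b.example", ACCESS_SESSION],
]);

// Feature enabled: b.example and c.example are cleared, a.example is exempt.
console.log(originsToClearOnShutdown(origins, cookiePermissions, true));
// Feature disabled: only the session-only origin b.example is cleared.
console.log(originsToClearOnShutdown(origins, cookiePermissions, false));
```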

Forget About this Site

  • Accessible via hamburger menu => History => Context menu of an item => Forget About This Site

  • Clears all data associated with the base domain of the selected site

  • [With TCP] Also clears data of any third-party sites embedded under the top level base domain

  • The goal is to remove all traces of the associated site from Firefox

  • Clears [flags]

    • History, session history, download history

    • All caches

    • Site data (cookies, dom storages)

    • Encrypted Media Extensions (EME)

    • Passwords (See Bug 702925)

    • Permissions

    • Content preferences (e.g. page zoom level)

    • Predictor network data

    • Reports (Reporting API)

    • Client-Auth-Remember flag, Certificate exceptions

    • Does not clear bookmarks

  • Source

flowchart TD
  A[Places controller.js] --> B[removeDataFromBaseDomain]
  B --> C[ForgetAboutSite.sys.mjs]
  C --> D[ClearDataService.sys.mjs]
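
To make the base domain behaviour concrete, here is a toy model of a store that holds both unpartitioned entries and entries partitioned under a top-level site. All names are hypothetical, and naiveBaseDomain is a deliberate simplification: Firefox derives base domains from the Public Suffix List, whereas this sketch simply keeps the last two labels of the host.

```javascript
// Naive stand-in for a base domain lookup: keeps the last two labels.
// This is wrong for suffixes such as "co.uk"; it is for illustration only.
function naiveBaseDomain(host) {
  return host.split(".").slice(-2).join(".");
}

// Each entry records the host that owns it and, if partitioned, the
// top-level base domain it was stored under.
const entries = [
  { host: "www.example.com", partitionKey: null },
  { host: "cdn.example.com", partitionKey: null },
  { host: "tracker.test", partitionKey: "example.com" }, // third party embedded under example.com
  { host: "tracker.test", partitionKey: "other.org" },
  { host: "www.other.org", partitionKey: null },
];

// Keep only the entries that are unrelated to the given base domain:
// neither owned by it nor partitioned under it.
function removeByBaseDomain(store, baseDomain) {
  return store.filter(
    entry =>
      naiveBaseDomain(entry.host) !== baseDomain &&
      entry.partitionKey !== baseDomain
  );
}

const remaining = removeByBaseDomain(entries, "example.com");
console.log(remaining.map(e => `${e.host} (partition: ${e.partitionKey})`));
```

In this model, forgetting example.com removes both of its own hosts and the tracker.test entry partitioned under it, while the tracker.test entry stored under other.org survives.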


Manage Cookies and Site Data

  • Accessible via about:preferences#privacy => Cookies and Site Data => Manage Data

  • Clears [flags]

    • Cookies

    • DOM storages

    • EME

    • Caches: CSS, Preflight, HSTS

  • Lists site cookies and storage grouped by base domain.

  • Clearing data on a more granular (host or origin) level is not possible. This is a deliberate decision to make this UI more thorough in cleaning and easier to understand. Users who need very granular data management capabilities can install an add-on or use the devtools.

  • Allows users to clear storage for specific sites, or all sites

  • [With TCP] Also clears data of any third-party sites embedded under the top level base domain

  • Collects list of sites via SiteDataManager.getSites

  • Before removal, prompts via SiteDataManager.promptSiteDataRemoval

  • On removal calls SiteDataManager.removeAll() if all sites have been selected or SiteDataManager.remove() passing a list of sites to be removed.

  • Source
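
The choice between the two removal calls can be sketched like this. The manager below is a hypothetical stand-in that only records which method was called; it does not reproduce the real SiteDataManager signatures, and removeSelectedSites is an illustrative helper name.

```javascript
// Hypothetical stand-in for the site data manager; it only records calls.
function makeFakeSiteDataManager() {
  const calls = [];
  return {
    calls,
    removeAll() {
      calls.push({ method: "removeAll" });
    },
    remove(sites) {
      calls.push({ method: "remove", sites: [...sites] });
    },
  };
}

// Use the bulk path when every listed site is selected, otherwise pass
// only the selected sites to remove().
function removeSelectedSites(manager, listedSites, selectedSites) {
  const selected = new Set(selectedSites);
  if (selected.size === 0) {
    return; // nothing to do
  }
  const allSelected = listedSites.every(site => selected.has(site));
  if (allSelected) {
    manager.removeAll();
  } else {
    manager.remove(listedSites.filter(site => selected.has(site)));
  }
}

const manager = makeFakeSiteDataManager();
const listed = ["example.com", "other.org", "tracker.test"];
removeSelectedSites(manager, listed, ["other.org"]);
removeSelectedSites(manager, listed, listed);
console.log(manager.calls);
```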

Clear Cookies and Site Data

  • Accessible via the identity panel (click on lock icon in the URL bar)

  • Clears [flags]

    • Cookies

    • DOM storages

    • EME

    • Caches: CSS, Preflight, HSTS

  • Button handler method: clearSiteData

  • Calls SiteDataManager.remove() with the base domain of the currently selected tab

  • The button is only shown if a site has any cookies or quota storage. This is checked here.

  • Source

Figure: A broad overview of the different data clearing features accessible via about:preferences#privacy.

The user can clear data on demand or choose to clear data on shutdown. For the latter, the user may add exceptions so that specific origins are never cleared or are always cleared on shutdown.

ClearDataService

This service serves as a unified module to hold all data clearing logic in Firefox / Gecko. Callers can use the nsIClearDataService interface to clear data. From JS the service is accessible via Services.clearData.

To specify which state to clear, pass a combination of flags into aFlags.
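
Combining flags is ordinary bitmask arithmetic. The sketch below is standalone and runs outside Firefox; the constant names follow the CLEAR_* naming style of the interface, but the numeric values are made up for illustration and should not be taken as the real values defined on nsIClearDataService.

```javascript
// Illustrative flag values only; the real constants live on the interface.
const CLEAR_COOKIES = 1 << 0;
const CLEAR_DOM_STORAGES = 1 << 1;
const CLEAR_HISTORY = 1 << 2;

// Combine categories with bitwise OR. The result is what would be
// passed as aFlags.
const flags = CLEAR_COOKIES | CLEAR_DOM_STORAGES;

// Test membership with bitwise AND.
function hasFlag(combined, flag) {
  return (combined & flag) !== 0;
}

console.log(hasFlag(flags, CLEAR_COOKIES));
console.log(hasFlag(flags, CLEAR_HISTORY));
```

Because each category occupies its own bit, a caller can request any subset of categories in a single call.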

Every category of browser state should have its own cleaner implementation which exposes the following methods to the ClearDataService:

  • deleteAll: Deletes all data owned by the cleaner.

  • deleteByPrincipal: Deletes data associated with a specific principal.

  • deleteByBaseDomain: Deletes all entries which are associated with the given base domain. This includes data partitioned by Total Cookie Protection.

  • deleteByHost: Clears data associated with a host. Does not clear partitioned data.

  • deleteByRange: Clears data which matches a given time range.

  • deleteByLocalFiles: Deletes data held for local files and other hostless origins.

  • deleteByOriginAttributes: Clears entries which match an OriginAttributesPattern.

Some of these methods are optional. See comment here.

If a cleaner does not support a specific method, we will usually try to fall back to deleteAll. For privacy reasons, when we can’t target individual entries we prefer to over-clear storage rather than under-clear it or not clear it at all.
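
The fallback rule can be sketched as a small dispatcher. This is an assumption-laden illustration, not the ClearDataService implementation: makeCleaner and runCleaner are hypothetical helpers, the cleaners are synchronous for brevity, and the cleaner name "someCache" is invented.

```javascript
// Build a toy cleaner. Every cleaner has deleteAll; deleteByHost is optional.
function makeCleaner(name, log, supportsHost) {
  const cleaner = {
    deleteAll() {
      log.push(`${name}: deleteAll`);
    },
  };
  if (supportsHost) {
    cleaner.deleteByHost = host => {
      log.push(`${name}: deleteByHost(${host})`);
    };
  }
  return cleaner;
}

// Call the targeted method when the cleaner has it, otherwise over-clear
// by falling back to deleteAll.
function runCleaner(cleaner, method, ...args) {
  if (typeof cleaner[method] === "function") {
    return cleaner[method](...args);
  }
  return cleaner.deleteAll();
}

const log = [];
const cleaners = [
  makeCleaner("cookies", log, true),
  makeCleaner("someCache", log, false), // no host-level support
];
for (const cleaner of cleaners) {
  runCleaner(cleaner, "deleteByHost", "example.com");
}
console.log(log);
```

The second cleaner cannot target a single host, so the dispatcher clears everything it owns instead of leaving the host's data behind.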

Figure: Overview of the most important cleaning methods of the ClearDataService called by other Firefox / Gecko components. deleteDataFromPrincipal is called programmatically, while user exposed data clearing features clear by base domain, host or all data.