The about:home startup cache

By default, a user’s browser session starts with a single window and a single tab, pointed at about:home. Loading about:home as quickly as possible is therefore important for a fast overall startup experience.

about:home, which is functionally identical to about:newtab, is generated dynamically: the parent process calculates an appropriate state object and passes it down to a content process, where the React library uses it to render the final interactive page. This is problematic during the startup sequence, as calculating that initial state can be computationally expensive and requires multiple reads from the disk.

The about:home startup cache is an attempt to address this expense. It works by assuming that between browser sessions, about:home usually doesn’t need to change.

Components of the about:home startup cache mechanism

There are four primary components to the cache mechanism:

The HTTP Cache

The HTTP cache is normally used for caching webpages retrieved over the network, but seemed like the right fit for storage of the about:home cache as well.

The HTTP cache is usually queried by the networking stack when browsing the web. It is, however, not typically queried when accessing chrome:// or resource:// URLs, so for the about:home case we have to query it manually ourselves. This means giving about:home special capabilities for populating and reading from the HTTP cache. In order to avoid potential security issues, this requires that we sequester about:home / about:newtab in their own special content process. The “privileged about content process” exists for this purpose, and is also used for about:logins and about:certificate.

The HTTP cache lives in the parent process, and so any read and write operations need to be initiated in the parent process. Thankfully, however, the HTTP cache accepts data using nsIOutputStream and serves it using nsIInputStream. We can send nsIInputStream over the message manager, and convert an nsIInputStream into an nsIOutputStream, so we have everything we need to efficiently communicate with the “privileged about content process” to save and retrieve page data.

For more detail, see the official documentation for the HTTP cache.

AboutHomeStartupCache

This singleton component lives inside of BrowserGlue to avoid having to load yet another JSM out of the omni.ja file in the parent process during startup.

AboutHomeStartupCache is responsible for feeding the “privileged about content process” with the nsIInputStream instances that it needs to present the initial about:home document. It is also responsible for populating the cache with updated versions of about:home that are sent by the “privileged about content process”.

Since accessing the HTTP cache is asynchronous, there is an opportunity for a race: the cache entry can become available either before or after the initial about:home is requested. To accommodate both cases, the AboutHomeStartupCache constructs nsIPipe instances, which it sends down to the “privileged about content process” as soon as one launches.

If the HTTP cache entry is already available when the process launches, and cached data is available, we connect the cache to the nsIPipe instances to stream the data down to the “privileged about content process”.

If the HTTP cache is not yet available, we hold references to those nsIPipe instances, and wait until the cache entry is available. Only then do we connect the cache entry to the nsIPipe instances to send the data down to the “privileged about content process”.
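The two orderings described above can be modelled with a small sketch in plain JavaScript. This is an illustration only and not the actual AboutHomeStartupCache code: the class name, the method names, and the use of arrays in place of nsIPipe instances are all assumptions made for the example.

```javascript
// Hypothetical model of the race between the cache lookup and the process
// launch. None of these names come from the Firefox source.
class StartupCacheFeederSketch {
  constructor() {
    this.cacheResolved = false; // true once the simulated cache lookup completes
    this.cacheEntry = null; // { page, script } or null when nothing is cached
    this.pendingPipes = null; // pipes held while waiting for the cache entry
  }

  // Called when the privileged about content process launches.
  // `pipes` stands in for the nsIPipe instances: { page: [], script: [] }.
  onProcessLaunched(pipes) {
    if (this.cacheResolved) {
      this.connect(pipes);
    } else {
      this.pendingPipes = pipes;
    }
  }

  // Called when the asynchronous cache lookup completes.
  onCacheEntryAvailable(entry) {
    this.cacheResolved = true;
    this.cacheEntry = entry;
    if (this.pendingPipes !== null) {
      const pipes = this.pendingPipes;
      this.pendingPipes = null;
      this.connect(pipes);
    }
  }

  // Stream the cached data into the pipes, if there is any cached data.
  connect(pipes) {
    if (this.cacheEntry === null) {
      return;
    }
    pipes.page.push(this.cacheEntry.page);
    pipes.script.push(this.cacheEntry.script);
  }
}

// Process launches first, cache entry arrives later.
const lateCacheFeeder = new StartupCacheFeederSketch();
const lateCachePipes = { page: [], script: [] };
lateCacheFeeder.onProcessLaunched(lateCachePipes);
lateCacheFeeder.onCacheEntryAvailable({ page: "<html>", script: "hydrate()" });
console.log(lateCachePipes);
```

In this model, either call order leaves the pipes with the same contents, which is the property the real mechanism needs.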

AboutNewTabService

The AboutNewTabService is used by the AboutRedirector in both the parent and content processes to determine how to handle attempts to load about:home and about:newtab.

There are distinct versions of the AboutNewTabService: one for the parent process (BaseAboutNewTabService), and one for content processes (AboutNewTabChildService, which inherits from BaseAboutNewTabService).

The AboutRedirector, when running inside of a “privileged about content process”, knows to direct attempts to load about:home to AboutNewTabChildService’s aboutHomeCacheChannel method. This method is then responsible for choosing whether to return an nsIChannel for the cached document or one for the dynamically generated version of about:home.

AboutHomeStartupCacheChild

This singleton component lives inside of the “privileged about content process”, and is initialized as soon as the message is received from the parent that includes the nsIInputStream instances that will be used to potentially load from the cache.

When the AboutRedirector in the “privileged about content process” notices that a request has been made to about:home, it asks nsIAboutNewTabService to return a new nsIChannel for that document. The AboutNewTabChildService then checks to see if the AboutHomeStartupCacheChild can return an nsIChannel for any cached content.

If, at this point, nothing has been streamed from the parent, we fall back to loading the dynamic about:home document. This might occur if the cache doesn’t exist yet, or if we were too slow to pull it off of the disk. Subsequent attempts to load about:home will bypass the cache and load the dynamic document instead. This is true even if the privileged about content process crashes and a new one is created.
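The fallback rule can be sketched as a small state machine. This is a simplified model under stated assumptions, not the real AboutHomeStartupCacheChild: the class, its methods, and the returned objects are hypothetical, and the sketch covers a single process only, so it does not model the fallback surviving a content process crash.

```javascript
// Hypothetical model of the child-side decision between the cached and the
// dynamic about:home document. Names are invented for this illustration.
class StartupCacheChildSketch {
  constructor() {
    this.streams = null; // set once the parent has streamed cached data down
    this.useDynamicOnly = false; // becomes true after the first fallback
  }

  // Called when cached data arrives from the parent.
  onStreamsFromParent(pageStream, scriptStream) {
    this.streams = { pageStream, scriptStream };
  }

  // Called for each attempt to load about:home.
  channelForAboutHome() {
    if (this.useDynamicOnly || this.streams === null) {
      // Too late or nothing cached: fall back, and keep falling back.
      this.useDynamicOnly = true;
      return { source: "dynamic" };
    }
    return { source: "cache", body: this.streams.pageStream };
  }
}

const slowParentChild = new StartupCacheChildSketch();
const firstLoad = slowParentChild.channelForAboutHome(); // nothing streamed yet
slowParentChild.onStreamsFromParent("<html>", "hydrate()"); // arrives too late
const secondLoad = slowParentChild.channelForAboutHome();
console.log(firstLoad.source, secondLoad.source);
```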

The AboutHomeStartupCacheChild is also responsible for generating the cache. Periodically, the AboutNewTabService will send down the most up-to-date state for about:home from the parent process, and then the AboutHomeStartupCacheChild will generate document markup using ReactDOMServer within a ChromeWorker. After that’s generated, the “privileged about content process” will send up nsIInputStream instances for both the markup and the script for the initial page state. The AboutHomeStartupCache singleton inside of BrowserGlue is responsible for receiving those nsIInputStream instances and persisting them in the HTTP cache for the next start.

What is cached?

Two things are cached:

  1. The raw HTML markup of about:home.

  2. A small chunk of JavaScript that “hydrates” the markup through the React libraries, allowing the page to become interactive after painting.

The JavaScript being cached cannot be put directly into the HTML markup as inline script due to the CSP of about:home, which does not allow inline scripting. Instead, we load a script from about:home?jscache. This goes through the same mechanism used for retrieving the HTML document from the cache, but instead pulls down the cached script.

If the HTML markup is cached, then we presume that the script is also cached. We cannot cache one and not the other. If only one cache exists, or only one has been sent down to the “privileged about content process” by the time the about:home document is requested, then we fall back to loading the dynamic about:home document.
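As a rough illustration of the two rules above (the ?jscache query selects the script, and the page and script are served as a pair or not at all), consider the following sketch. The function name and the shape of the `received` object are assumptions made for the example; only the about:home and about:home?jscache URLs come from the text.

```javascript
// Hypothetical lookup for a cached resource. Returns null when the caller
// should fall back to the dynamic about:home document.
function cachedResponseFor(uriSpec, received) {
  // The page and the script must both be present; one without the other
  // is treated the same as having no cache at all.
  if (received.page == null || received.script == null) {
    return null;
  }
  // Everything after the first "?" is the query; "jscache" selects the script.
  const queryStart = uriSpec.indexOf("?");
  const query = queryStart === -1 ? "" : uriSpec.slice(queryStart + 1);
  return query === "jscache" ? received.script : received.page;
}

const receivedPair = { page: "<html>cached</html>", script: "hydrate()" };
console.log(cachedResponseFor("about:home", receivedPair));
console.log(cachedResponseFor("about:home?jscache", receivedPair));
console.log(cachedResponseFor("about:home", { page: "<html>", script: null }));
```

Checking for the pair before looking at the query means a request for either resource falls back when the other one is missing.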

Refreshing the cache

The cache is refreshed periodically by having ActivityStreamMessageChannel tell AboutHomeStartupCache when it has sent any messages down to the preloaded about:newtab. In general, such messages are a good hint that something visual has updated for the next about:newtab, and that the cache should probably be refreshed.

AboutHomeStartupCache debounces notifications about such messages, since they tend to be bursty.
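A minimal sketch of such debouncing is shown below. It is a generic trailing-edge debounce, not the implementation used by AboutHomeStartupCache; the function name, the delay value, and the injectable `timers` parameter are assumptions introduced for the example.

```javascript
// Hypothetical debounce helper: a burst of notifications results in a single
// refresh, run once `delayMs` has passed without a further notification.
function makeDebouncedRefresh(
  refresh,
  delayMs,
  timers = {
    set: (fn, ms) => setTimeout(fn, ms),
    clear: handle => clearTimeout(handle),
  }
) {
  let handle = null;
  return function onMessageSent() {
    if (handle !== null) {
      timers.clear(handle); // a newer message supersedes the pending refresh
    }
    handle = timers.set(() => {
      handle = null;
      refresh();
    }, delayMs);
  };
}

// Usage with a made-up delay; the real delay is not stated in this document.
const notifyCacheOfMessage = makeDebouncedRefresh(() => {
  console.log("refreshing the about:home cache");
}, 50);
notifyCacheOfMessage();
notifyCacheOfMessage();
notifyCacheOfMessage();
```

Passing the timer functions in as a parameter is a design choice for this sketch only; it makes the behavior easy to exercise with fake timers.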

Invalidating the cache

It’s possible that the composition or layout of about:home will change from release to release. When this occurs, it might be desirable to invalidate any existing cache for a user, so that they don’t see an outdated about:home on startup.

To do this, we set a version number on the cache entry, and check on startup that it matches the value we expect. If the version number does not match, then the cache is discarded and the about:home document will be rendered dynamically.

The version number is currently set to the application build ID. This means that the cache is invalidated on the first restart after a browser update is applied.
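The version check can be sketched as follows. The entry shape, the function name, and the build ID strings are invented for the illustration; the only behavior taken from the text is that a mismatch discards the cache.

```javascript
// Hypothetical version check: the cached entry is only used when its stored
// version equals the build ID of the running application.
function readCacheIfCurrent(entry, currentBuildId) {
  if (entry === null || entry.version !== currentBuildId) {
    return null; // caller renders about:home dynamically instead
  }
  return { page: entry.page, script: entry.script };
}

// Made-up build IDs, used only to show a match and a mismatch.
const storedEntry = { version: "build-A", page: "<html>", script: "hydrate()" };
console.log(readCacheIfCurrent(storedEntry, "build-A") !== null);
console.log(readCacheIfCurrent(storedEntry, "build-B") !== null);
```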

Handling errors

about:home is typically the first thing that the user sees upon starting the browser. It is critically important that it function quickly and correctly. If anything happens to go wrong when retrieving or saving to the cache, we should fall back to generating the document dynamically.

As an example, it’s theoretically possible for the browser to crash while in the midst of saving to the cache. In that case, we might have a partial document saved, or a partial script saved, neither of which is acceptable.

Thankfully, the HTTP cache was designed with resilience in mind, so partially written entries are automatically discarded, which allows us to fall back to the dynamic page generation mode.

As additional redundancy on top of that resilience, we also make sure to create a new nsICacheEntry every time the cache is populated, and write the version metadata as the last step. Since the version metadata is written last, a missing version when we try to load the cache tells us that the writing of the page and the script did not complete, and that we should fall back to dynamically rendering the page.
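The write-ordering idea can be illustrated with a small model. Plain objects stand in for nsICacheEntry here, and the function names and the `stopAfterSteps` parameter (used to simulate an interrupted write) are assumptions made for the sketch.

```javascript
// Hypothetical model of populating a fresh entry with the version written last.
// `stopAfterSteps` simulates a crash after that many write steps have run.
function populateEntrySketch(page, script, buildId, stopAfterSteps = 3) {
  const entry = {}; // a new entry is created for every population
  const steps = [
    () => { entry.page = page; },
    () => { entry.script = script; },
    () => { entry.version = buildId; }, // version metadata goes last
  ];
  for (let i = 0; i < steps.length && i < stopAfterSteps; i++) {
    steps[i]();
  }
  return entry;
}

// An entry without version metadata is treated as an incomplete write.
function isEntryComplete(entry) {
  return typeof entry.version === "string";
}

const finishedEntry = populateEntrySketch("<html>", "hydrate()", "build-A");
const interruptedEntry = populateEntrySketch("<html>", "hydrate()", "build-A", 2);
console.log(isEntryComplete(finishedEntry), isEntryComplete(interruptedEntry));
```

Because the version is the final step, any interruption before it leaves an entry that the completeness check rejects, whichever of the earlier writes had finished.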