Worker’s Life-Cycle and WorkerRefs
Worker, as a thread programming model, was introduced to the Web so that pages can use the available computing power efficiently. As in regular thread programming, a Worker can be created and deleted whenever needed.
Since a Worker can be deleted at any time, developers of APIs on Workers should handle shutdown behavior carefully. Otherwise, memory problems such as UAF (use-after-free) and memory leaks, or shutdown hangs, are likely. In addition, debugging these issues on Workers is sometimes difficult: the crash stack might not provide useful information, and a bug might need a special sequence to reproduce because it is a thread-interleaving problem. To avoid these troubles, it helps to keep the Worker’s life cycle and the proper use of WorkerRefs in mind.
Worker Life-Cycle
The Worker’s life cycle is maintained by a status machine in the WorkerPrivate class. A Worker can be in one of the following statuses
Pending
Running
Closing
Canceling
Killing
Dead
The following briefly describes what is done in each status.
Pending:
This is the initial status of a Worker.
The Worker’s initialization is done in this status, both on the parent thread (the main thread or the parent worker’s thread) and on the worker thread.
Worker’s initialization starts from its parent thread, which includes
Get WorkerLoadInfo from parent window/worker
Create a WorkerPrivate for the Worker
Register the Worker in the RuntimeService object
Initialize a thread (the worker thread) for the Worker, and dispatch a WorkerThreadPrimaryRunnable to the worker thread
Connect debugger
Dispatch CompileScriptRunnable to the worker thread
Before the worker thread starts running runnables, the Worker may already have been exposed to its parent window/worker, so the parent can send messages to the Worker through the postMessage() method. If the Worker is not in the “Running” status yet, these runnables are kept in WorkerPrivate::mPreStartRunnables.
When WorkerThreadPrimaryRunnable starts executing on the worker thread, it continues the initialization on the worker thread, which includes
Build the connection between WorkerPrivate and the worker thread, then move the WorkerPrivate::mPreStartRunnables to the worker thread’s event queue.
Initialize the PerformanceStorage for the Worker.
Start the Cycle-Collector for the Worker.
Initialize the JS context for the Worker.
Call WorkerPrivate::DoRunLoop() to consume the Runnables in the worker thread’s event queue.
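The buffering of pre-start runnables described above can be sketched with a small single-threaded model. This is only an illustration under simplifying assumptions: ToyPreStartWorker, Dispatch(), AttachThread(), and DrainEventQueue() are hypothetical names and do not correspond to the real WorkerPrivate interface, and no real threads are involved.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical single-threaded model; this is not the Gecko implementation.
class ToyPreStartWorker {
 public:
  using Runnable = std::function<void()>;

  // Called by the parent, for example as a result of postMessage().
  // Before the worker thread is attached, runnables are only buffered.
  void Dispatch(Runnable aRunnable) {
    if (mThreadAttached) {
      mEventQueue.push_back(std::move(aRunnable));
    } else {
      mPreStartRunnables.push_back(std::move(aRunnable));
    }
  }

  // Models the first worker-thread initialization step: connect to the
  // thread, then move the buffered runnables, in order, to the event queue.
  void AttachThread() {
    assert(!mThreadAttached);
    mThreadAttached = true;
    for (Runnable& runnable : mPreStartRunnables) {
      mEventQueue.push_back(std::move(runnable));
    }
    mPreStartRunnables.clear();
  }

  // Models the run loop consuming the event queue.
  // Returns the number of runnables that were executed.
  std::size_t DrainEventQueue() {
    std::size_t executed = 0;
    while (!mEventQueue.empty()) {
      Runnable runnable = std::move(mEventQueue.front());
      mEventQueue.pop_front();
      runnable();
      ++executed;
    }
    return executed;
  }

  std::size_t BufferedCount() const { return mPreStartRunnables.size(); }

 private:
  bool mThreadAttached = false;
  std::vector<Runnable> mPreStartRunnables;  // kept while still "Pending"
  std::deque<Runnable> mEventQueue;          // the worker thread's queue
};
```

In this model, two runnables dispatched before AttachThread() stay buffered, and a later DrainEventQueue() runs them before any runnable dispatched afterwards, so the dispatch order is preserved.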
Running:
This is the status in which the Worker starts to execute runnables on the worker thread.
Once the Worker gets into “Running”,
Enable the memory reporter
Start the GC timer.
“Running” is the status where we play with the Worker. At this time point, we can
Create WorkerRefs to get the Worker shutdown notifications and run shutdown cleanup jobs through the registered callback.
Create a sync-eventLoop to make the worker thread wait for another thread’s execution.
Dispatch events to the WorkerGlobalScope to trigger event callbacks defined in the script.
We will talk about WorkerRef and Sync-EventLoop in detail later.
Closing:
This is a special status for DedicatedWorker and SharedWorker, entered when DedicatedWorkerGlobalScope.close()/SharedWorkerGlobalScope.close() is called.
When Worker enters into the “Closing” status,
Cancel all Timeouts/TimeIntervals of the Worker.
Do not allow BroadcastChannel.postMessage() on the WorkerGlobalScope.
The Worker stays in the “Closing” status until all sync-eventLoops of the Worker are closed.
Canceling:
When Worker gets into the “Canceling” status, it starts the Worker shutdown steps.
Set the WorkerGlobalScope(nsIGlobalObject) as dying.
This means events are no longer dispatched to the WorkerGlobalScope, and the callbacks of pending, already-dispatched events are not executed.
Cancel all Timeouts/TimeIntervals of the Worker.
Notify WorkerRef holders and child Workers.
This lets the WorkerRef holders and child Workers start their shutdown jobs.
Abort the script immediately.
Once all sync-eventLoops are closed,
Disconnect the EventTarget/WebTaskScheduler of the WorkerGlobalScope
Killing:
This is the status in which the destruction of the Worker starts
Shutdown the GC Timer
Disable the memory reporter
Switch the status to “Dead”
Cancel and release the remaining WorkerControlRunnables
Exit the WorkerPrivate::DoRunLoop()
Dead:
After the Worker quits the main event loop, it continues the shutdown process
Release the remaining WorkerDebuggerRunnables
Unroot the WorkerGlobalScope and WorkerDebugGlobalScope
Trigger GC to release GlobalScopes
Shutdown the Cycle-Collector for Worker
Dispatch TopLevelWorkerFinishRunnable/WorkerFinishRunnable to the parent thread
Disable/Disconnect the WorkerDebugger
Unregister the Worker in the RuntimeService object
Release WorkerPrivate::mSelf and WorkerPrivate::mParentEventTargetRef
The WorkerPrivate is supposed to be released after its self-reference is nullified.
Dispatch FinishedRunnable to the main thread to release the worker thread.
How to shutdown a Worker
Normally, the following situations make a Worker start shutting down.
Worker is GC/CCed.
Navigating to another page.
Worker is idle for a while. (Note that idle is not a status of a Worker; it is a condition within the “Running” status.)
self.close() is called in the worker’s script.
Worker.terminate() is called in its parent’s script.
Firefox shutdown.
Worker Status Flowchart
This flowchart shows how the status of a Worker is changing.
When the WorkerThreadPrimaryRunnable calls WorkerPrivate::DoRunLoop on the worker thread, the status changes from “Pending” to “Running.” If Firefox shutdown happens before entering into “Running,” the status directly changes from “Pending” to “Dead.”
When a Worker is in “Running,” a status change can only be caused by a request to shut down the Worker. The status switches to “Closing” in the special case that the worker’s script calls self.close(). Otherwise, the status switches to “Canceling.” A “Closing” Worker switches to “Canceling” when all sync-eventLoops are completed.
A “Canceling” Worker switches its status to “Killing” when the following requirements are fulfilled.
No WorkerRefs, no child Workers, no Timeouts, and no sync-eventLoops
No pending runnables in the worker thread’s main event queue, and no pending control runnables or debugger runnables
The status switches from “Killing” to “Dead” automatically.
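The transitions described in this section can be condensed into a small transition function. The sketch below is a simplified model, not the WorkerPrivate code: ToyStatus, ToyEvent, NextStatus(), and WalkSelfClosePath() are hypothetical names, each event stands for a condition named in the text, and any combination the text does not describe is assumed to leave the status unchanged.

```cpp
#include <cassert>
#include <initializer_list>

// The six statuses, in life-cycle order (hypothetical model type).
enum class ToyStatus { Pending, Running, Closing, Canceling, Killing, Dead };

// Conditions named in the text, modeled as events (hypothetical).
enum class ToyEvent {
  RunLoopStarted,      // WorkerThreadPrimaryRunnable called DoRunLoop()
  FirefoxShutdown,     // the browser is shutting down
  SelfCloseCalled,     // the worker script called self.close()
  ShutdownRequested,   // any other shutdown request, e.g. terminate()
  SyncLoopsCompleted,  // all sync-eventLoops are completed
  NothingPending,      // no WorkerRefs, children, timeouts, sync-eventLoops,
                       // or pending runnables remain
};

// Returns the status that follows aStatus when aEvent happens.
// Assumption: combinations not described in the text keep the status.
inline ToyStatus NextStatus(ToyStatus aStatus, ToyEvent aEvent) {
  switch (aStatus) {
    case ToyStatus::Pending:
      if (aEvent == ToyEvent::RunLoopStarted) return ToyStatus::Running;
      if (aEvent == ToyEvent::FirefoxShutdown) return ToyStatus::Dead;
      break;
    case ToyStatus::Running:
      if (aEvent == ToyEvent::SelfCloseCalled) return ToyStatus::Closing;
      if (aEvent == ToyEvent::ShutdownRequested ||
          aEvent == ToyEvent::FirefoxShutdown) {
        return ToyStatus::Canceling;
      }
      break;
    case ToyStatus::Closing:
      if (aEvent == ToyEvent::SyncLoopsCompleted) return ToyStatus::Canceling;
      break;
    case ToyStatus::Canceling:
      if (aEvent == ToyEvent::NothingPending) return ToyStatus::Killing;
      break;
    case ToyStatus::Killing:
      return ToyStatus::Dead;  // automatic, independent of the event
    case ToyStatus::Dead:
      break;
  }
  return aStatus;
}

// Usage sketch: the path taken when the worker script calls self.close().
inline ToyStatus WalkSelfClosePath() {
  ToyStatus status = ToyStatus::Pending;
  for (ToyEvent event :
       {ToyEvent::RunLoopStarted, ToyEvent::SelfCloseCalled,
        ToyEvent::SyncLoopsCompleted, ToyEvent::NothingPending}) {
    status = NextStatus(status, event);
  }
  assert(status == ToyStatus::Killing);
  return NextStatus(status, ToyEvent::NothingPending);  // Killing to Dead
}
```

Modeling “Killing” to “Dead” as unconditional mirrors the statement that this last switch happens automatically, while every earlier step waits for a specific condition.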
WorkerRefs
Since a Worker’s shutdown can happen at any time, knowing when the shutdown starts is important for development, especially for releasing resources and completing operations during the Worker shutdown phase. Therefore, WorkerRefs were introduced to deliver the notification of the Worker’s shutdown. When a Worker enters the “Canceling” status, it notifies the corresponding WorkerRefs, which execute their registered callbacks on the worker thread. The WorkerRef holder completes its shutdown steps synchronously or asynchronously in the registered callback and then releases the WorkerRef.
Four types of WorkerRefs are introduced, distinguished by the following requirements.
Whether the WorkerRef should block the Worker’s shutdown
Whether the WorkerRef should block cycle-collection of the Worker
Whether the WorkerRef needs to be held on other threads
WeakWorkerRef
WeakWorkerRef, as its name suggests, is a “weak” reference: it releases its internal reference to the Worker immediately after its registered callback finishes executing. Therefore, WeakWorkerRef does not block the Worker’s shutdown. In addition, holding a WeakWorkerRef does not block GC/CC of the Worker. This means a Worker can still be cycle-collected even if there are WeakWorkerRefs to it.
WeakWorkerRef is ref-counted, but not thread-safe.
WeakWorkerRef is designed for just getting the Worker’s shutdown notification and completing shutdown steps synchronously.
StrongWorkerRef
Unlike WeakWorkerRef, StrongWorkerRef does not release its internal reference to the Worker after the callback execution. StrongWorkerRef’s internal reference is released when the StrongWorkerRef gets destroyed. That means StrongWorkerRef allows its holder to determine when to release the Worker by nulling the StrongWorkerRef. This also makes StrongWorkerRef’s holder block the Worker’s shutdown.
When using StrongWorkerRef, resource cleanup might involve multiple threads and asynchronous behavior, so the timing of releasing the StrongWorkerRef is crucial to avoid memory problems such as UAF or leaks. A StrongWorkerRef must eventually be released; otherwise, a shutdown hang is to be expected.
StrongWorkerRef also blocks GC/CC of the Worker. As long as there is a StrongWorkerRef to the Worker, GC/CC will not collect the Worker.
StrongWorkerRef is ref-counted, but not thread-safe.
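The difference between the weak and strong behavior described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the real WorkerRef classes: ToyRefWorker, RegisterRef(), ReleaseRef(), NotifyCanceling(), and IsShutdownBlocked() are hypothetical names, refs are represented by plain integer ids instead of ref-counted objects, and everything runs on one thread.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical model of a Worker that tracks weak and strong refs.
class ToyRefWorker {
 public:
  using RefId = int;
  using Callback = std::function<void()>;

  // Registers a ref with its shutdown callback and returns its id.
  RefId RegisterRef(bool aIsStrong, Callback aCallback) {
    const RefId id = mNextId++;
    mRefs[id] = Entry{aIsStrong, std::move(aCallback)};
    return id;
  }

  // The holder gives up its ref. Releasing an unknown id is a no-op.
  void ReleaseRef(RefId aId) { mRefs.erase(aId); }

  // Models entering "Canceling": run each registered callback once.
  void NotifyCanceling() {
    assert(!mCanceling);
    mCanceling = true;
    // Iterate over a snapshot because callbacks may release refs.
    std::vector<RefId> ids;
    for (const auto& entry : mRefs) ids.push_back(entry.first);
    for (RefId id : ids) {
      auto it = mRefs.find(id);
      if (it == mRefs.end()) continue;  // released by an earlier callback
      const bool isStrong = it->second.mIsStrong;
      Callback callback = std::move(it->second.mCallback);
      if (callback) callback();
      // Weak: the link to the worker is dropped right after the callback.
      // Strong: the link stays until the holder calls ReleaseRef().
      if (!isStrong) mRefs.erase(id);
    }
  }

  // Only strong refs keep the worker from finishing its shutdown.
  bool IsShutdownBlocked() const {
    for (const auto& entry : mRefs) {
      if (entry.second.mIsStrong) return true;
    }
    return false;
  }

  std::size_t RegisteredRefs() const { return mRefs.size(); }

 private:
  struct Entry {
    bool mIsStrong;
    Callback mCallback;
  };
  std::map<RefId, Entry> mRefs;
  RefId mNextId = 0;
  bool mCanceling = false;
};
```

After NotifyCanceling(), a weak entry is gone without any action from its holder, while a strong entry keeps IsShutdownBlocked() true until its holder releases it, which is the reason a forgotten strong ref leads to a shutdown hang.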
ThreadSafeWorkerRef
ThreadSafeWorkerRef is an extension of StrongWorkerRef. The difference is that a ThreadSafeWorkerRef holder can be on another thread. Since it is an extension of StrongWorkerRef, it has the same characteristics: its holder blocks the Worker’s shutdown, and it also blocks GC/CC of the Worker.
As with StrongWorkerRef, the release timing of a ThreadSafeWorkerRef is important for avoiding memory problems. Besides the release timing, note that the callback is executed on the worker thread, not on the holder’s owning thread.
ThreadSafeWorkerRef is ref-counted and thread-safe.
IPCWorkerRef
IPCWorkerRef is a special WorkerRef for IPC actors that bind their life cycle to the Worker’s shutdown notification. (In the current codebase, the Cache API and the Client API use IPCWorkerRef.)
Some IPC shutdown needs to happen in a special sequence during the Worker’s shutdown, and the Worker must be kept alive until that sequence completes, so IPCWorkerRef blocks the Worker’s shutdown. However, IPC shutdown does not need to block GC/CC of the Worker.
IPCWorkerRef is ref-counted, but not thread-safe.
The following table compares the WorkerRefs
Property | WeakWorkerRef | StrongWorkerRef | ThreadSafeWorkerRef | IPCWorkerRef |
Holder thread | Worker thread | Worker thread | Any thread | Worker thread |
Callback execution thread | Worker thread | Worker thread | Worker thread | Worker thread |
Blocks the Worker’s shutdown | No | Yes | Yes | Yes |
Blocks GC/CC of the Worker | No | Yes | Yes | No |
WorkerRef Callback
A WorkerRef callback can be registered when creating a WorkerRef. The callback is responsible for releasing the resources related to the WorkerRef’s holder, for example, resolving/rejecting the promises created by the WorkerRef’s holder. The cleanup might be synchronous or asynchronous, depending on how complicated the functionality involved is. For example, Cache APIs might need to wait until the operation finishes on the IO thread and then release the main-thread-only objects on the main thread.
To avoid memory problems, keep the following in mind when writing a WorkerRef callback
Don’t release WorkerRef before finishing cleanup steps. (UAF)
Don’t forget to release related resources. (Memory leak)
Don’t forget to release the WorkerRef (StrongWorkerRef/ThreadSafeWorkerRef/IPCWorkerRef). (Shutdown hang)
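The three rules above amount to an ordering: finish the cleanup, release the related resources, and only then release the WorkerRef. The following sketch shows that ordering for an asynchronous cleanup. It is a toy model under stated assumptions: ToyKeepAlive stands in for a strong WorkerRef, ToyCleanupHolder for a WorkerRef holder, and OnWorkerShutdown()/OnCleanupFinished() are hypothetical names for the registered callback and for the completion of the asynchronous work.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Records the order of events so the sequence can be inspected.
using ToyLog = std::vector<std::string>;

// Stand-in for a strong WorkerRef (hypothetical): while an instance
// exists, the counter it increments models a blocked worker shutdown.
struct ToyKeepAlive {
  ToyKeepAlive(int& aBlockers, ToyLog& aLog)
      : mBlockers(aBlockers), mLog(aLog) {
    ++mBlockers;
  }
  ~ToyKeepAlive() {
    --mBlockers;
    mLog.push_back("ref released");
  }
  ToyKeepAlive(const ToyKeepAlive&) = delete;
  ToyKeepAlive& operator=(const ToyKeepAlive&) = delete;

  int& mBlockers;
  ToyLog& mLog;
};

// Hypothetical holder that owns a resource and a keep-alive ref.
class ToyCleanupHolder {
 public:
  ToyCleanupHolder(int& aBlockers, ToyLog& aLog)
      : mLog(aLog),
        mResource(std::make_unique<std::string>("pending promise")),
        mRef(std::make_unique<ToyKeepAlive>(aBlockers, aLog)) {}

  // The registered shutdown callback: it only starts the cleanup.
  // The ref is kept, because releasing it here could let the worker
  // go away while the cleanup still uses it (UAF).
  void OnWorkerShutdown() {
    mLog.push_back("cleanup started");
    mCleanupStarted = true;
  }

  // Called when the asynchronous part of the cleanup has completed.
  void OnCleanupFinished() {
    assert(mCleanupStarted);
    mResource.reset();  // release related resources (avoids a leak)
    mLog.push_back("resource released");
    mRef.reset();  // release the ref last (avoids a shutdown hang)
  }

  bool HoldsRef() const { return mRef != nullptr; }

 private:
  ToyLog& mLog;
  std::unique_ptr<std::string> mResource;
  std::unique_ptr<ToyKeepAlive> mRef;
  bool mCleanupStarted = false;
};
```

In this model the blocker count stays at one between OnWorkerShutdown() and OnCleanupFinished(), and the log ends with the resource release followed by the ref release.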