MIR optimizations from a thousand feet¶
MIR is the intermediate representation (IR) used by IonMonkey, SpiderMonkey's optimizing JIT compiler.
This is a quick summary of all the MIR passes as of Feb 2021. The italicized passes are classic optimizations that are likely to be extensively covered in a compiler textbook. Non-italicized passes are either JS-specific, or too trivial to cover.
The state of the MIR after each of these passes can be visualized using iongraph.
Static Single Assignment (SSA) is a form of IR in which every value is defined exactly once. It has several nice properties for optimization. Note: SSA is why we have phi nodes.
Prune Unused Branches¶
What it says on the tin: prunes away branches that are never taken.
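One way a branch becomes prunable is when its condition turns out to be a compile-time constant. A minimal sketch in Python, using a hypothetical dictionary encoding of a block's terminator (not SpiderMonkey's actual data structures):

```python
def prune_branch(term):
    """term: {'cond': bool-or-name, 'then': label, 'else': label}.
    If the condition is a known constant, the untaken successor is never
    reached, so the conditional branch becomes an unconditional goto."""
    if isinstance(term['cond'], bool):
        taken = term['then'] if term['cond'] else term['else']
        return {'goto': taken}
    return term  # condition unknown: keep the branch

# The false arm B2 is never taken, so the branch collapses:
print(prune_branch({'cond': True, 'then': 'B1', 'else': 'B2'}))  # {'goto': 'B1'}
```

The real pass can also prune branches based on other evidence that they are never taken; this only illustrates the shape of the transformation.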
Fold Empty Blocks¶
A simple cleanup pass to get rid of empty blocks with one predecessor and one successor by folding them into their successor.
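A toy version of the idea, over a hypothetical CFG where each single-successor block maps to its successor:

```python
def fold_empty_blocks(succ, empty):
    """succ: {block: successor} for single-successor blocks.
    empty: blocks with no instructions, one predecessor, one successor.
    Redirect every edge targeting an empty block to that block's
    successor, then drop the empty blocks."""
    def resolve(b):
        while b in empty:          # chase chains of empty blocks
            b = succ[b]
        return b
    return {b: resolve(s) for b, s in succ.items() if b not in empty}

# B0 -> B1 (empty) -> B2 folds down to B0 -> B2:
print(fold_empty_blocks({'B0': 'B1', 'B1': 'B2', 'B2': 'exit'}, {'B1'}))
```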
Split Critical Edges¶
In subsequent passes, we may choose to move code around. In preparation, this pass adds empty blocks along critical edges, so that we have a safe place to put those instructions.
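A critical edge is one that runs from a block with multiple successors to a block with multiple predecessors; an instruction moved onto such an edge would otherwise have no block of its own to live in. A sketch over a hypothetical successor-list CFG:

```python
def split_critical_edges(succs):
    """succs: {block: [successors]}. Insert a fresh empty block on every
    edge whose source has several successors and whose target has
    several predecessors."""
    preds = {}
    for b, ss in succs.items():
        for s in ss:
            preds.setdefault(s, []).append(b)
    out = {b: list(ss) for b, ss in succs.items()}
    count = 0
    for b, ss in succs.items():
        for i, s in enumerate(ss):
            if len(ss) > 1 and len(preds[s]) > 1:  # critical edge b -> s
                name = f'Split{count}'
                count += 1
                out[name] = [s]      # the new empty block jumps to s
                out[b][i] = name     # b now goes through the new block
    return out

# A -> {B, C} plus B -> C makes the edge A -> C critical:
print(split_critical_edges({'A': ['B', 'C'], 'B': ['C'], 'C': []}))
```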
Renumber Blocks¶
This pass literally just reassigns block numbers.
Eliminate Phis¶
After some of the above optimizations, some of our phi nodes may be redundant, like `b = phi(a, a)`. This pass cleans those up.
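The core check is tiny; a sketch over a hypothetical operand list:

```python
def simplify_phi(operands):
    """A phi whose operands are all the same value, e.g. b = phi(a, a),
    is redundant: every incoming path supplies the same definition, so
    uses of b can use that definition directly. Returns the replacement
    value, or None if the phi is a real join point."""
    unique = set(operands)
    return unique.pop() if len(unique) == 1 else None

print(simplify_phi(['a', 'a']))   # 'a'  -- redundant, replace with a
print(simplify_phi(['a', 'b']))   # None -- genuine merge, keep the phi
```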
Scalar Replacement¶
If a function allocates and uses an object, but we can prove that the object never escapes the function, then we can sometimes avoid the allocation entirely by tracking each of the object’s components (fields in C/C++; slots/elements in JS) individually.
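A before/after illustration, written in Python for concreteness (the MIR pass does this on object slots and elements, not dictionaries):

```python
# Before: a temporary object exists only to group two values.
def hypot_with_object(a, b):
    p = {'x': a, 'y': b}                 # allocation
    return (p['x'] ** 2 + p['y'] ** 2) ** 0.5

# After scalar replacement: p never escapes, so each of its components
# becomes an ordinary local and the allocation disappears entirely.
def hypot_scalarized(a, b):
    p_x, p_y = a, b                      # slots tracked individually
    return (p_x ** 2 + p_y ** 2) ** 0.5

print(hypot_with_object(3, 4), hypot_scalarized(3, 4))   # 5.0 5.0
```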
Apply Types¶
Each type of MIR node has a TypePolicy defining what type of input it accepts. If necessary, this pass inserts (potentially fallible) conversions to guarantee that the types work out.
Alias Analysis¶
Alias analysis determines whether two instructions may use or modify the same memory. If they may, then they cannot be reordered with respect to each other, because that could change the semantics of the program.
Global Value Numbering¶
Global Value Numbering is a classic optimization for finding places where we compute the same value multiple times, and eliminating the redundancy.
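A simplified straight-line sketch of value numbering (real GVN works globally, across the dominator tree, and also folds and simplifies nodes); the triples here are a toy stand-in for MIR instructions:

```python
def value_number(instrs):
    """instrs: [(dest, op, operands)]. If an (op, operands) combination
    was already computed, the later instruction is redundant and becomes
    a copy of the earlier result."""
    seen, out = {}, []
    for dest, op, args in instrs:
        key = (op, tuple(args))
        if key in seen:
            out.append((dest, 'copy', [seen[key]]))  # reuse earlier value
        else:
            seen[key] = dest
            out.append((dest, op, args))
    return out

prog = [('t1', 'add', ['a', 'b']),
        ('t2', 'add', ['a', 'b']),    # same value as t1: redundant
        ('t3', 'mul', ['t1', 't2'])]
print(value_number(prog))
```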
Loop-Invariant Code Motion¶
Loop-Invariant Code Motion finds instructions in a loop that will compute the same value every time and hoists them out of the loop.
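A sketch over a toy loop body, assuming we already know which names vary per iteration (in a real compiler that information comes from SSA and loop analysis, and only side-effect-free instructions may be hoisted):

```python
def hoist_invariants(body, varying):
    """body: [(dest, op, operands)]; varying: names that change per
    iteration (e.g. the loop counter). An instruction whose operands are
    all loop-invariant computes the same value every iteration, so it
    can move to the loop preheader."""
    preheader, new_body = [], []
    for dest, op, args in body:
        if all(a not in varying for a in args):
            preheader.append((dest, op, args))   # invariant: hoist it
        else:
            varying.add(dest)                    # depends on the loop
            new_body.append((dest, op, args))
    return preheader, new_body

pre, body = hoist_invariants(
    [('t1', 'mul', ['x', 'y']),      # x, y defined outside the loop
     ('t2', 'add', ['i', 't1'])],    # i is the loop counter
    varying={'i'})
print(pre, body)
```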
This pass inserts beta nodes in preparation for range analysis. A beta node attaches the range information implied by a dominating branch (for example, that `x < 10` holds along the true edge) to the uses of a value on that path. This particular approach to range analysis is taken from a paper by Gough and Klaeren.
Range Analysis¶
A classic optimization that determines the possible range of values a definition can take on. Used to implement many of the following passes.
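The core abstraction is interval arithmetic, sketched minimally below (the real implementation also tracks things like fractional parts, negative zero, and overflow):

```python
class Range:
    """The possible values of a definition, as a closed interval."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def add(self, other):
        # The range of a sum is the sum of the endpoint ranges.
        return Range(self.lo + other.lo, self.hi + other.hi)

    def fits_int32(self):
        # When this holds, later passes can safely use integer
        # arithmetic instead of double arithmetic.
        return -2**31 <= self.lo and self.hi <= 2**31 - 1

r = Range(0, 100).add(Range(1, 1))
print(r.lo, r.hi, r.fits_int32())   # 1 101 True
```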
Remove beta nodes now that we don’t need them.
RA Check UCE¶
Check to see if range analysis has made any code eligible for Unreachable Code Elimination.
Strength-reduce double arithmetic to integer arithmetic if range analysis says it’s okay.
If we compute a value that will only be used along some paths, and we could recover the value if one of the other paths bailed out, then we can postpone the computation of that value until we are sure we will need it.
Remove Unnecessary Bitops¶
Remove bitwise operators that don’t do anything (like `x | 0` on integer input).
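A sketch of the identities involved (a hypothetical helper; the pass matches these patterns on typed MIR nodes, not strings):

```python
def is_redundant_bitop(op, rhs_const):
    """True if `lhs op rhs_const` returns lhs unchanged for every
    int32 lhs, so the operation can be removed."""
    return ((op == '|' and rhs_const == 0) or    # x | 0  == x
            (op == '^' and rhs_const == 0) or    # x ^ 0  == x
            (op == '&' and rhs_const == -1))     # x & -1 == x

# x | 0 is the identity on int32 input; x & 0 is not (it produces 0):
print(is_redundant_bitop('|', 0), is_redundant_bitop('&', 0))   # True False
```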
Fold Linear Arithmetic Constants¶
Fold `a + constant1 + constant2` into `a + (constant1 + constant2)`.
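A toy version of the rewrite over nested expression tuples, ignoring overflow (which the real pass must account for on int32 arithmetic):

```python
def fold_linear_constants(expr):
    """expr: ('add', lhs, rhs) tuples with constants on the right.
    Rewrite ('add', ('add', a, c1), c2) as ('add', a, c1 + c2) when
    c1 and c2 are integer constants."""
    if (isinstance(expr, tuple) and expr[0] == 'add'
            and isinstance(expr[2], int)
            and isinstance(expr[1], tuple) and expr[1][0] == 'add'
            and isinstance(expr[1][2], int)):
        _, (_, a, c1), c2 = expr
        return ('add', a, c1 + c2)   # one add instead of two
    return expr

print(fold_linear_constants(('add', ('add', 'a', 1), 2)))   # ('add', 'a', 3)
```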
Effective Address Analysis¶
This was added to ensure that we generated good code for memory accesses in asm.js.
Shuffle instructions around within a block to reduce the lifetime of intermediate values and reduce register pressure. This is a relatively simple version of instruction scheduling.
Make loops contiguous¶
Reorder blocks so that all the blocks in a loop are generated in one contiguous chunk, which is good for cache locality.
Edge Case Analysis (Late)¶
A place to check for edge cases after code has stopped being moved around. Currently used for checking whether some instructions need to handle negative zero.
Bounds Check Elimination¶
A classic optimization that eliminates bounds checks if we can prove at compile time that they can’t fail.
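A sketch of the decision, assuming index ranges come from range analysis (a hypothetical helper, not SpiderMonkey's API):

```python
def can_remove_bounds_check(index, length_min):
    """index: (lo, hi) range of the index from range analysis;
    length_min: the smallest possible length of the array. The check is
    removable only when 0 <= index < length holds on every path."""
    lo, hi = index
    return lo >= 0 and hi < length_min

print(can_remove_bounds_check((0, 7), 8))   # True: provably in bounds
print(can_remove_bounds_check((0, 7), 4))   # False: may be out of bounds
```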
Loading a NaN-boxed value and then unboxing it can be slightly more efficient if we do both operations at once.
Add KeepAlive Instructions¶
While we access the slots or elements of an object, we have to ensure that the object itself is not collected by the GC. See bug 1160884.
After optimization is over, we lower our IR from MIR to LIR to do register allocation and code generation.