How to perftest a model
For each model running inside Firefox, we want to determine its performance in terms of speed and memory usage and track it over time.
To do so, we use the Perfherder infrastructure to gather the performance metrics.
Adding a new performance test is done in two steps: 1. making it work locally, 2. adding it to Perfherder.
Run locally
To test the performance of a model, you can add a new file under tests/browser with the following structure and adapt it to your needs:
"use strict";
// unfortunately we have to write a full static structure here
// see https://bugzilla.mozilla.org/show_bug.cgi?id=1930955
const perfMetadata = {
owner: "GenAI Team",
name: "ML Test Model",
description: "Template test for latency for ml models",
options: {
default: {
perfherder: true,
perfherder_metrics: [
{ name: "pipeline-ready-latency", unit: "ms", shouldAlert: true },
{ name: "initialization-latency", unit: "ms", shouldAlert: true },
{ name: "model-run-latency", unit: "ms", shouldAlert: true },
{ name: "pipeline-ready-memory", unit: "MB", shouldAlert: true },
{ name: "initialization-memory", unit: "MB", shouldAlert: true },
{ name: "model-run-memory", unit: "MB", shouldAlert: true },
{ name: "total-memory-usage", unit: "MB", shouldAlert: true },
],
verbose: true,
manifest: "perftest.toml",
manifest_flavor: "browser-chrome",
try_platform: ["linux", "mac", "win"],
},
},
};
requestLongerTimeout(120);
add_task(async function test_ml_generic_pipeline() {
const options = {
taskName: "feature-extraction",
modelId: "Xenova/all-MiniLM-L6-v2",
modelHubUrlTemplate: "{model}/{revision}",
modelRevision: "main",
};
const args = ["The quick brown fox jumps over the lazy dog."];
await perfTest("example", options, args);
});
Then add the file in perftest.toml and rebuild with ./mach build.
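Registering the file is a one-line addition to the manifest. As a sketch, assuming the template above was saved as `browser_ml_example_perf.js` (a hypothetical name), the mochitest-style TOML manifest lists each test file as a bare table header:

```toml
# toolkit/components/ml/tests/browser/perftest.toml
# (sketch; the file name below is a hypothetical example)
["browser_ml_example_perf.js"]
```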
The test loads the models it uses from the local disk, so you need to prepare them first. We provide a script to automate this:
$ mach python toolkit/components/ml/tests/tools/create_local_hub.py --list-models
Available git-based models from the YAML:
- xenova-all-minilm-l6-v2 -> path-prefix: onnx-models/Xenova/all-MiniLM-L6-v2/main/
- mozilla-ner -> path-prefix: onnx-models/Mozilla/distilbert-uncased-NER-LoRA/main/
- mozilla-intent -> path-prefix: onnx-models/Mozilla/mobilebert-uncased-finetuned-LoRA-intent-classifier/main/
- mozilla-autofill -> path-prefix: onnx-models/Mozilla/tinybert-uncased-autofill/main/
- distilbart-cnn-12-6 -> path-prefix: onnx-models/Mozilla/distilbart-cnn-12-6/main/
- qwen2.5-0.5-instruct -> path-prefix: onnx-models/Mozilla/Qwen2.5-0.5B-Instruct/main/
- mozilla-smart-tab-topic -> path-prefix: onnx-models/Mozilla/smart-tab-topic/main/
- mozilla-smart-tab-emb -> path-prefix: onnx-models/Mozilla/smart-tab-embedding/main/
(Use `--model <key>` to clone one of these repositories.)
You can then use `--model` to download a model locally, specifying the local MOZ_FETCHES_DIR directory either via the environment variable or via the `--fetches-dir` command-line argument:
$ mach python toolkit/components/ml/tests/tools/create_local_hub.py --model mozilla-smart-tab-emb --fetches-dir ~/ml-fetches
Found existing file /Users/tarekziade/Dev/fetches/ort-wasm-simd-threaded.jsep.wasm, verifying checksum...
Existing file's checksum matches. Skipping download.
Updated Git hooks.
Git LFS initialized.
Cloning https://huggingface.co/Mozilla/smart-tab-embedding into '/Users/tarekziade/Dev/fetches/onnx-models/Mozilla/smart-tab-embedding/main'...
Cloning in '/Users/tarekziade/Dev/fetches/onnx-models/Mozilla/smart-tab-embedding/main'...
Checked out revision '2278e76f67ada584cfd3149fd2661dad03674e4d' in '/Users/tarekziade/Dev/fetches/onnx-models/Mozilla/smart-tab-embedding/main'.
Once done, you should be able to run the test locally with:
$ MOZ_FETCHES_DIR=~/ml-fetches ./mach perftest toolkit/components/ml/tests/browser/browser_ml_engine_perf.js --mochitest-extra-args=headless
Note that MOZ_FETCHES_DIR must be an absolute path to the root fetches directory.
Add to the CI
To add the test to the CI, add an entry with a unique name that starts with `ml-perf` in each of the following files:

- taskcluster/kinds/perftest/linux.yml
- taskcluster/kinds/perftest/windows11.yml
- taskcluster/kinds/perftest/macos.yml
Example for Linux:
ml-perf:
  fetches:
    fetch:
      - ort.wasm
      - ort.jsep.wasm
      - ort-training.wasm
      - xenova-all-minilm-l6-v2
  description: Run ML Models Perf Tests
  treeherder:
    symbol: perftest(linux-ml-perf)
    tier: 2
  attributes:
    batch: false
    cron: false
  run-on-projects: [autoland, mozilla-central]
  run:
    command: >-
      mkdir -p $MOZ_FETCHES_DIR/../artifacts &&
      cd $MOZ_FETCHES_DIR &&
      python3 python/mozperftest/mozperftest/runner.py
      --mochitest-binary ${MOZ_FETCHES_DIR}/firefox/firefox-bin
      --flavor mochitest
      --output $MOZ_FETCHES_DIR/../artifacts
      toolkit/components/ml/tests/browser/browser_ml_engine_perf.js
You also need to make the models your test uses (like the ones you've downloaded locally) available in the CI, by adding entries in taskcluster/kinds/fetch/onnxruntime-web-fetch.yaml and referencing them in the task's fetches section.
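As a sketch of such a fetch entry, assuming it follows the git-type fetch schema used elsewhere in taskcluster (the field layout below is an assumption; the repository URL, revision, and path-prefix are taken from the smart-tab-embedding example shown earlier):

```yaml
# taskcluster/kinds/fetch/onnxruntime-web-fetch.yaml (sketch, not verbatim)
mozilla-smart-tab-emb:
  description: smart-tab-embedding model for ML perf tests
  fetch:
    type: git
    repo: https://huggingface.co/Mozilla/smart-tab-embedding
    revision: 2278e76f67ada584cfd3149fd2661dad03674e4d
    path-prefix: onnx-models/Mozilla/smart-tab-embedding/main/
```

The key (here `mozilla-smart-tab-emb`) is what you list under `fetches.fetch` in the `ml-perf` task entry.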
Once this is done, try it out with:
$ ./mach try perf --single-run --full --artifact
You should then see the results in Treeherder.