How to perftest a model

For each model running inside Firefox, we want to measure its speed and memory usage and track those metrics over time.

To do so, we use the Perfherder infrastructure to gather the performance metrics.

Adding a new performance test is done in two steps:

  1. Make it work locally.

  2. Add it to the CI so the results are reported to Perfherder.

Run locally

To test the performance of a model, add a new file under toolkit/components/ml/tests/browser with the following structure and adapt it to your needs:

"use strict";

// unfortunately we have to write a full static structure here
// see https://bugzilla.mozilla.org/show_bug.cgi?id=1930955
const perfMetadata = {
  owner: "GenAI Team",
  name: "ML Test Model",
  description: "Template test for latency for ml models",
  options: {
    default: {
      perfherder: true,
      perfherder_metrics: [
        { name: "pipeline-ready-latency", unit: "ms", shouldAlert: true },
        { name: "initialization-latency", unit: "ms", shouldAlert: true },
        { name: "model-run-latency", unit: "ms", shouldAlert: true },
        { name: "pipeline-ready-memory", unit: "MB", shouldAlert: true },
        { name: "initialization-memory", unit: "MB", shouldAlert: true },
        { name: "model-run-memory", unit: "MB", shouldAlert: true },
        { name: "total-memory-usage", unit: "MB", shouldAlert: true },
      ],
      verbose: true,
      manifest: "perftest.toml",
      manifest_flavor: "browser-chrome",
      try_platform: ["linux", "mac", "win"],
    },
  },
};

requestLongerTimeout(120);

add_task(async function test_ml_generic_pipeline() {
  const options = {
    taskName: "feature-extraction",
    modelId: "Xenova/all-MiniLM-L6-v2",
    modelHubUrlTemplate: "{model}/{revision}",
    modelRevision: "main",
  };

  const args = ["The quick brown fox jumps over the lazy dog."];
  await perfTest("example", options, args);
});

Then add the file to perftest.toml and rebuild with ./mach build.
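
The manifest entry is just a new section naming the test file. A minimal sketch, assuming the new test is the browser_ml_engine_perf.js file used below and that the manifest already has a [DEFAULT] section:

# perftest.toml (excerpt)
[DEFAULT]
support-files = ["head.js"]  # example shared option, keep whatever is already there

["browser_ml_engine_perf.js"]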

The test fetches the models it uses from the local disk, so you need to prepare them first:

  • Create a directory with a subdirectory called onnx-models.

  • Download all the models into that subdirectory.

The directory follows an organization/name/revision structure. To make the previous example work, download the model files locally under <ROOT>/onnx-models/Xenova/all-MiniLM-L6-v2/main.

Example:

cd <ROOT>
git lfs install
git clone -b main https://huggingface.co/Xenova/all-MiniLM-L6-v2 onnx-models/Xenova/all-MiniLM-L6-v2/main/
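
After cloning, the layout should look roughly like this (the exact file list depends on the model; what matters is the organization/name/revision path):

<ROOT>/
└── onnx-models/
    └── Xenova/
        └── all-MiniLM-L6-v2/
            └── main/
                ├── config.json
                ├── tokenizer.json
                └── onnx/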

Once done, you should be able to run the test locally with:

MOZ_FETCHES_DIR=<ROOT> ./mach perftest toolkit/components/ml/tests/browser/browser_ml_engine_perf.js --mochitest-extra-args=headless

Note that MOZ_FETCHES_DIR must be an absolute path to the <ROOT> directory that contains onnx-models.

Add in the CI

To add the test to the CI, you need to add an entry in each of the following files:

  • taskcluster/kinds/perftest/linux.yml

  • taskcluster/kinds/perftest/windows11.yml

  • taskcluster/kinds/perftest/macos.yml

Each entry must have a unique name that starts with ml-perf.

Example for Linux:

ml-perf:
    fetches:
        fetch:
            - ort.wasm
            - ort.jsep.wasm
            - ort-training.wasm
            - xenova-all-minilm-l6-v2
    description: Run ML Models Perf Tests
    treeherder:
        symbol: perftest(linux-ml-perf)
        tier: 2
    attributes:
        batch: false
        cron: false
    run-on-projects: [autoland, mozilla-central]
    run:
        command: >-
            mkdir -p $MOZ_FETCHES_DIR/../artifacts &&
            cd $MOZ_FETCHES_DIR &&
            python3 python/mozperftest/mozperftest/runner.py
            --mochitest-binary ${MOZ_FETCHES_DIR}/firefox/firefox-bin
            --flavor mochitest
            --output $MOZ_FETCHES_DIR/../artifacts
            toolkit/components/ml/tests/browser/browser_ml_engine_perf.js

You also need to make the models your test uses (like the ones you downloaded locally) available in the CI, by adding entries to taskcluster/kinds/fetch/onnxruntime-web-fetch.yaml and referencing them in the fetches section of the task above.
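
The exact schema should be checked against the existing entries in that file, but as a rough sketch, a static-url fetch for a model archive could look like the following, where the entry name, URL, sha256, and size are placeholders:

xenova-all-minilm-l6-v2:
    description: all-MiniLM-L6-v2 ONNX model used by the ML perf tests
    fetch:
        type: static-url
        # Placeholder URL: point this at an archive of the model files.
        url: https://example.com/xenova-all-minilm-l6-v2.tar.zst
        sha256: 0000000000000000000000000000000000000000000000000000000000000000
        size: 12345678
        # Unpack under the organization/name/revision path the test expects.
        add-prefix: onnx-models/Xenova/all-MiniLM-L6-v2/main/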

Once this is done, try it out with:

./mach try perf --single-run --full --artifact

You should then see the results in Treeherder.