Tutorial: 2026-06-22

Newest entry on top.

Visual-regression in CI — generate + commit Linux baselines

Followed up on the "CI now runs the full suite" work: visual-regression was the one suite skipped on CI (Windows-only *-chromium-win32.png baselines). Owner offered local Docker to generate Linux baselines.

Tried Docker first, pivoted. Ran mcr.microsoft.com/playwright:v1.61.0-noble detached (anonymous volumes over node_modules so the container's Linux installs wouldn't clobber the Windows tree), but the image pull crawled — the big browser layer made no progress for ~12 min (download happens inside Docker's WSL2 VM, bottlenecked by the connection). Killed it and pivoted to generating baselines on GitHub Actions — faster, and an exact match for the ubuntu-latest e2e job that consumes them (better than any local image).

What landed. playwright.config.js: browser by OS (process.platform === "win32" ? "chrome" : undefined) so Windows uses system Chrome and Linux/CI uses bundled chromium; visual skip re-keyed from CI to PLAYWRIGHT_SKIP_VISUAL (escape hatch). New .github/workflows/visual-baselines.yml (workflow_dispatch): on ubuntu-latest, npm ci + web-app npm ci + npx playwright install --with-deps chromium + playwright test visual.spec.js --update-snapshots, uploads tests/e2e/visual.spec.js-snapshots/ as the linux-visual-baselines artifact. Sequencing kept everything green: pushed the config + workflow with the e2e job temporarily setting PLAYWRIGHT_SKIP_VISUAL=1 (dev green), FF'd master so workflow_dispatch could find the file on the default branch, dispatched it, downloaded the three *-chromium-linux.png, committed them, and removed the temp skip so the e2e job now runs visual against them.
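
The config change described above can be sketched as a small pure function. This is an illustration under assumptions, not the project's actual config: resolveBrowserSettings is a hypothetical name, and treating any non-empty PLAYWRIGHT_SKIP_VISUAL value as "skip" is a guess at the semantics. Only the platform ternary and the visual.spec.js glob come from these notes.

```javascript
// Sketch only: resolveBrowserSettings is a hypothetical helper, not project code.
function resolveBrowserSettings(platform, env) {
  return {
    // System Chrome on Windows; undefined lets Playwright fall back to its bundled Chromium.
    channel: platform === "win32" ? "chrome" : undefined,
    // Escape hatch (assumed semantics): any non-empty value skips the visual spec.
    testIgnore: env.PLAYWRIGHT_SKIP_VISUAL ? ["**/visual.spec.js"] : [],
  };
}

console.log(resolveBrowserSettings("linux", {}));
console.log(resolveBrowserSettings("win32", { PLAYWRIGHT_SKIP_VISUAL: "1" }));
```

In a real playwright.config.js the two arguments would presumably be process.platform and process.env, with the results fed into use.channel and testIgnore. Keeping the decision in a function makes it easy to check both branches without touching the environment.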

Landmine (again): restoring web-app deps with a plain npm --prefix web-app install re-pruned the cross-platform optional entries from web-app/package-lock.json (−25 lines). Not committed — git checkout -- web-app/package-lock.json restored the clean-room lock. Incremental installs keep doing this; only a clean-room/full resolve is safe for that lockfile.

Expand CI to run the full test suite (Vitest + Playwright)

Owner flagged that CI must not miss what we run locally unless absolutely necessary — ci.yml had only been running lint + format:check + smoke + the web-app build, skipping the 2.6.0 Vitest + Playwright suites. Expanded ci.yml to three jobs: check (+ npm run test:unit), web-app (+ npm --prefix web-app run test), and a new e2e job (root + web-app npm ci, npx playwright install --with-deps chromium, npm run test:e2e). Verified the Vitest suites locally first: test:unit 88 passed, test:web 30 passed.

The one necessary exception is visual-regression. toHaveScreenshot baselines are OS-specific and committed for Windows only (tests/e2e/visual.spec.js-snapshots/*-chromium-win32.png); on Linux CI they'd never match. So playwright.config.js now testIgnores **/visual.spec.js when process.env.CI is set, and the same guard drops channel: "chrome" on CI (system Chrome locally → bundled chromium on CI, the standard Playwright CI browser). Confirmed by loading the config under CI=1: testIgnore = ["**/visual.spec.js"], channel = undefined. To run visual in CI later, commit Linux baselines (test:e2e:update on Linux) and drop the guard. Notes updated (status, deployment, changelog).

Unbreak CI and ship dev → master (FF)

Owner asked to fast-forward master to the latest CI-green dev commit and keep master FF-only (mirroring the sibling project: master only ever fast-forwards to a dev HEAD that has passed GitHub CI). Found the hold-up: dev HEAD (and the last 12 commits) were red on GitHub CI, so nothing recent was shippable.

Diagnosis. Both CI jobs died in ~2s at the install step — not test failures. npm ci errored EUSAGE: package-lock.json out of sync with package.json (missing @emnapi/* and platform native bindings pulled in by the 2.6.0 Vitest/Rolldown deps). The last green CI run was 3d36635 (2026-06-20), ~50 commits back, before the DPL v3 work and the test suite.

Fix (build/style only, no version bump). Regenerated root + web-app lockfiles (Node 24 / npm 11); local npm ci then passes the sync check (root fully; web-app only hit a Windows-only EPERM unlinking a locked Rolldown .node, irrelevant to Linux CI). Landmine: the first regen was an incremental npm install on top of the Windows node_modules, so it omitted the Linux-only optional subtree and CI still failed (Missing: @emnapi/core@1.11.1). Fix that actually works: delete the lockfile (+ node_modules) and do a fresh full resolve from registry metadata, which records every platform's optional deps (@emnapi/*, wasm32-wasi + linux-* native bindings). Used --package-lock-only for web-app to avoid the locked .node. Documented in reference/. Also found format:check red — ~40 source/test files (e.g. src/core/nodeLoader.js, src/core/stages/dynamicPrompt.js, the new tests/**) were never Prettier-formatted; CI never got past install to flag them. Ran prettier --write. The other ~17 flagged files were just CRLF noise (confirmed via --end-of-line auto). Verified locally: lint 0 errors, format:check clean, smoke green, web-app build green — the full CI gate.
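
One way to catch the pruned-lockfile landmine before pushing is to inspect which operating systems the lockfile's optional packages cover. The sketch below is an assumption-laden illustration: lockedPlatforms is a hypothetical helper, the sample lockfile is made up, and it relies on the lockfile's packages entries carrying optional and os fields for platform-specific bindings. Packages without an os field (which may include the @emnapi/* entries) are not covered by this check.

```javascript
// Hypothetical helper: lists the operating systems named by optional lockfile entries.
function lockedPlatforms(lock) {
  const seen = new Set();
  for (const entry of Object.values(lock.packages ?? {})) {
    if (entry.optional && Array.isArray(entry.os)) {
      for (const os of entry.os) seen.add(os);
    }
  }
  return [...seen].sort();
}

// Made-up lockfile fragment, for illustration only.
const sampleLock = {
  lockfileVersion: 3,
  packages: {
    "": { name: "example" },
    "node_modules/example-binding-linux-x64": { optional: true, os: ["linux"], cpu: ["x64"] },
    "node_modules/example-binding-win32-x64": { optional: true, os: ["win32"], cpu: ["x64"] },
  },
};

console.log(lockedPlatforms(sampleLock)); // [ 'linux', 'win32' ]
```

Against a real file, the input would come from JSON.parse(fs.readFileSync("package-lock.json", "utf8")). A result that lists only win32 after an install on Windows would suggest the Linux subtree was dropped.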

Ship. master had been intentionally held at the pre-revival 241a148; this lifts that hold. master fast-forwarded to the green dev HEAD (structurally clean — master was a strict ancestor, 0 unique commits vs dev's 106). Note: ci.yml only runs lint + format:check + smoke + web-app build — it does not run the Vitest/Playwright suites (a gap worth closing later; CLAUDE.md implies they run in CI).

Then the deploy pipelines went red. The first master push fired release.yml + pages.yml for the first time ever (deployments had been held), exposing two pre-existing breakages. (1) npm run docs aborted: JSDoc's catharsis parser rejects TypeScript-style types, here arrow types (n)=>(…) and optional record keys {k?:T} in listManifest.js, promptFilesAndSuggestions.js, dynPromptManifest.js. Rewrote them in JSDoc-native form (function(string): (string[]|null), (T|undefined)); npm run docs now exits 0 (fixes Pages + the Release docs-zip step). (2) release.yml's verify step ran npm test, which 2.6.0 redefined to include the web-app jsdom suite (deps not installed there); realigned it to npm run lint + npm run smoke (its original intent, and consistent with the CI gate). CRLF landmine: the git checkout master ↔ dev round-trip re-checked out the whole tree as CRLF (autocrlf=true), so format:check flagged 509 files locally, all phantom; prettier --end-of-line auto --check . is clean and git stores LF, so CI/Linux is unaffected.
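
To make the JSDoc rewrite in (1) concrete, here is a small sketch of the two annotation styles side by side. The function and parameter names are hypothetical and not taken from the three files; only the general shapes (arrow type to function(...) type, optional key to an explicit union with undefined) follow the notes above.

```javascript
// TypeScript-style annotations of the kind the catharsis parser rejected:
//   {(name: string) => (string[]|null)}   arrow function type
//   {{limit?: number}}                    optional record key

/**
 * Illustrative function, annotated in JSDoc-native form.
 * @param {function(string): (string[]|null)} lookup Maps a list name to its entries.
 * @param {string} name List to read.
 * @param {{limit: (number|undefined)}} options Optional cap on the result size.
 * @returns {string[]} Entries, or an empty array for an unknown list.
 */
function listEntries(lookup, name, options) {
  const entries = lookup(name) ?? [];
  return options.limit === undefined ? entries : entries.slice(0, options.limit);
}

const demoLookup = (n) => (n === "artists" ? ["a", "b", "c"] : null);
console.log(listEntries(demoLookup, "artists", { limit: 2 })); // [ 'a', 'b' ]
```

The rejected forms are kept in a plain // comment rather than inside the docblock, since tags placed in the docblock would themselves be parsed.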

Full automated test suite — Vitest + Playwright (2.6.0)

Added comprehensive testing to the project, which previously had only ESLint + the import smoke test.

What landed

  • Two Vitest suites. Node-side under tests/ (vitest.config.js, environment: node) covering unit, integration, snapshot, contract, and bug-regression; jsdom SPA suite under web-app/tests/ (web-app/vitest.config.js, which mergeConfigs vite.config.js so import.meta.glob + the lodash alias resolve as in the real build) covering unit, component/UI (React Testing Library), contract, and a real-data integration test of the browser engine facade.
  • Playwright (playwright.config.js) builds the SPA and serves dist/ via vite preview, then runs E2E (home.spec.js), visual-regression (visual.spec.js, stable chrome with the random suggestion masked), and @axe-core/playwright accessibility specs (accessibility.spec.js).
  • Helpers: tests/helpers/seededRandom.js (mulberry32 + withSeed) and tests/helpers/fakeLoader.js (in-memory loader implementing the engine's loader contract).
  • Scripts: test:unit, test:web, test:e2e, test:e2e:update, test:all, *:coverage; npm test now = lint + smoke + Node + SPA suites. .gitignore + ESLint ignores updated for coverage/, playwright-report/, test-results/ (configs, the tests/ suite, and committed visual baselines stay tracked).
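
The seeded-random helper listed above can be sketched as follows. mulberry32 is a widely circulated 32-bit PRNG; the withSeed signature shown here (a seed plus a synchronous callback, with Math.random swapped for the duration) is an assumption about how tests/helpers/seededRandom.js works, not a copy of it.

```javascript
// mulberry32: small seeded PRNG returning floats in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) | 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Assumed shape: run fn with Math.random replaced by a seeded generator, then restore it.
function withSeed(seed, fn) {
  const original = Math.random;
  Math.random = mulberry32(seed);
  try {
    return fn();
  } finally {
    Math.random = original; // restored even if fn throws
  }
}

const first = withSeed(42, () => [Math.random(), Math.random()]);
const second = withSeed(42, () => [Math.random(), Math.random()]);
console.log(first[0] === second[0] && first[1] === second[1]); // true
```

This version only suits synchronous callbacks: with an async fn, the finally block would restore Math.random before the awaited work runs.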

Scope decision (owner instruction this session): the legacy classic server is being actively phased out, so tests target the active core engine + SPA only. Only the pure stages the core engine still imports (cleanup.js, prompt-salt.js) are tested from prompt-modules/. Recorded in memory.

Landmine found + documented: lodash captures Math.random at import, so _.random/_.sample/_.shuffle can't be stubbed by overriding Math.random. Tests touching lodash randomness assert invariants (token counts/shape) or use single-entry lists; only the DPL renderer (its own Math.random RNG) is seeded. Three first-draft tests failed on this and were rewritten. (Also: keywordAlias/artistAlias are the identity — "keyword"/"artist" — the alias indirection happens in the list store, not the repeater.)
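
The capture problem can be reproduced without lodash. In this sketch, pickCaptured stands in for a library that stores a reference to Math.random when it loads, and pickLive for code that looks it up on every call; both names are hypothetical. A counting stub shows which of the two actually sees the override.

```javascript
// Stand-in for a library that grabs Math.random once, at load time.
const capturedRandom = Math.random;
const pickCaptured = (items) => items[Math.floor(capturedRandom() * items.length)];

// Stand-in for code that resolves Math.random on every call.
const pickLive = (items) => items[Math.floor(Math.random() * items.length)];

const originalRandom = Math.random;
let stubCalls = 0;
Math.random = () => {
  stubCalls += 1;
  return 0; // always selects index 0
};

const items = ["a", "b", "c"];
const liveResult = pickLive(items);
const callsAfterLive = stubCalls; // the stub was used
pickCaptured(items);
const callsAfterCaptured = stubCalls; // unchanged: the captured reference bypassed the stub

Math.random = originalRandom;
console.log(liveResult, callsAfterLive, callsAfterCaptured); // a 1 1
```

This is why asserting invariants (counts, shape) or using single-entry lists is the practical route for code paths that go through a captured generator.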

Result: 118 Vitest tests green (88 Node + 30 SPA) plus 8 Playwright specs green (E2E + visual + a11y).

Playwright launch saga (Windows): the bundled Chrome-for-Testing build refused to launch — first spawn UNKNOWN, then (after the headless-shell + a clean --force reinstall) a persistent side-by-side ("SxS") error: chrome.exe's dependent assembly 149.0.7827.55 couldn't be found. The VC++ 2015–2022 runtime was already installed, so it wasn't a missing redistributable, and a clean re-extract didn't help — the bundled build just doesn't run on this machine. Fix: point Playwright at the system Google Chrome (channel: "chrome", v149.0.7827.115 — version-matched to the Chromium 149 the runner targets). The suite then passed 8/8 and wrote the visual baselines (tests/e2e/visual.spec.js-snapshots/, committed). A second run with no --update-snapshots confirmed the baselines are stable. CI can drop the channel to use the bundled browser.

Docs: rewrote notes/plans/testing.md, updated notes/status.md (open issues + build/run health), notes/reference/dependencies.md (test tooling table), CLAUDE.md (Build/Run/Verify + landmine), and the 2.6.0 changelog. Bumped VERSION + package.json to 2.6.0.