|
Pokered Save Editor 2
Pokemon Red & Blue save file editor - Qt 6 C++/QML
|
Created 2026-06-07. Status: Phases 1–2 built, run & ALL GREEN on the real Qt 6.11 kit (3/3 suites, 31 field assertions) — 2 real bugs found and fixed (Daycare empty-destructor crash; bank-2 checksum off-by-one). Tests live under projects/tests/; later phases here are still the design to build against. QML SCREEN SMOKE TEST + a first COMPREHENSIVE GUI SUITE are now BUILT (2026-06-13): tst_qml_screens loads every screen and fails on any QML warning/error (main gated on it), and a real-app GUI suite on the guiapp.h harness — tst_gui_navigation (sweep every screen on populated
Phase 2 added (2026-06-07): tests/savefile/tst_fields.cpp — per-field expand/flatten coverage, all green. Two styles per field: VALUE round-trip (set on the model → flatten → re-expand → assert the value survived, incl. boundary values) and BYTE isolation (edit one field → only that field's bytes + main checksum 0x3523 change). Covered: money (0/1/typical/999999), coins (0/1/9999), player ID (0/1/0x1234/0xFFFF), starter (0/1/153/255), player name (empty/1/typical/7-char), badges (pattern/all-on/all-off, both the primary 0x2602 and duplicate 0x29D6), pokédex (independent owned/seen patterns across all 151), rival (name+starter), and a bag item (index+amount). Offsets taken from playerbasics.cpp (second oracle) cross-checked with the .bt. Name/ID round-trips set the member directly to avoid the intentional OT-rewrite side effect of fullSet*.
Implemented so far (2026-06-07, phase 1 — needs a build + run on the Qt kit):
- projects/tests/ wired into CMake via add_subdirectory(tests) + enable_testing() (root CMakeLists.txt, guarded by option(PSE_BUILD_TESTS ON); Test added to the Qt6 components). Each test is a qt_add_executable registered with add_test (run via ctest). Fixtures are located through a PSE_ASSETS_DIR compile def → <repo>/assets (read-only; tests copy in memory).
- tests/helpers/savefilefixture.h — read a fixture, load into SaveFile, snapshot raw bytes, diff.
- tests/savefile/tst_roundtrip.cpp — (a) load→flatten→recalc identity on both fixtures (open-then-save changes zero bytes), (b) money single-edit isolation (only the money bytes + main checksum change; value round-trips). Offsets (money 0x25F3, main checksum 0x3523) taken from the .bt oracle; tests print the actual changed offsets on failure so a wrong constant is self-correcting on first run.
- tests/db/tst_db_integrity.cpp — DB boots, ≥151 species, every non-glitch species deep-links its primary type.
Key pipeline fact baked into the tests: SaveFileExpanded::save() (i.e. flattenData()) does not recompute checksums; recalcChecksums() runs in FileManagement::writeSaveData() at file-write time. So a faithful "open→save" test must call flattenData() then toolset->recalcChecksums().
Built & run on the real Qt 6.11 / llvm-mingw kit (2026-06-07, via PowerShell on the dev PC): configure + build clean; ctest results:
- tst_db_integrity — PASS (DB boots, ≥151 species, every non-glitch species deep-links its type).
- loadFlatten_isIdentity(new game) — PASS (a fresh save round-trips byte-perfectly, 0 changes).
- moneyEdit_touchesOnlyMoneyAndChecksum — PASS (confirms money 0x25F3 / checksum 0x3523; only those bytes change; value round-trips).
- loadFlatten_isIdentity(progressed) — PASS after fix. Originally failed: flatten changed 0 bytes (engine byte-perfect), but recalcChecksums() wrote 0x5A4B 00→C5 — the bank-2 box-checksum off-by-one. After correcting it to 0x5A4C/0x1A4C, BaseSAV.sav round-trips byte-perfect. All tests now green (100%).
Two real bugs found on day one and fixed (the whole point): (1) Daycare::~Daycare() null-deref crash on an empty Day Care (daycare.cpp); (2) bank-2 box-checksum off-by-one in recalcChecksums() (savefiletoolset.cpp) — corrected to 0x5A4C/0x1A4C, verified by tst_roundtrip. Both in reference/fix-patterns.md; checksum detail in status.md.
Reproduce: configure with C:/Qt/Tools/CMake_64/bin/cmake.exe -S projects -B <build>, build targets tst_roundtrip tst_db_integrity, run ctest --output-on-failure (PATH needs the Qt + lib-output dirs for DLLs). Tests use QTEST_GUILESS_MAIN and ran fine headless (no GUI dependency in DB load).
This is the master plan for full, professional test coverage of the project. The goal is not "some tests" — it is comprehensive coverage: every function exercised across normal, boundary, and invalid inputs, with both return values and side effects asserted; every error path verified to degrade gracefully; saving/loading proven correct in every context; randomization proven to hold its invariants; and every historical bug locked out by a permanent regression test. Coverage tooling makes "comprehensive" a measured fact, not a hope.
Two project values make automated testing unusually high-value:
The whole class of bug that consumed sessions 13f–13h — QML garbage-collecting parentless C++ QObjects → use-after-free — is exactly what a sanitizer build catches automatically. That alone may justify the harness.
The layered architecture (common → db → savefile → app, see ../systems/overview.md) is a gift for testing: common, db, and savefile are libraries with zero UI dependency. A headless test executable links them directly and exercises the entire byte engine, the game databases, and all save logic without ever creating a window. ~90% of the project's risk lives in those three layers, and 100% of it is reachable headless.
Only app (the Bridge, models, QML) needs the Qt GUI/Quick stack to test — and even that can run on the offscreen platform plugin with no display.
| Type | What it proves | Primary layers | Tooling |
|---|---|---|---|
| Unit | Each function correct across normal/boundary/invalid inputs; return values AND side effects | common, db, savefile, app(C++) | QtTest |
| Boundary / edge case | 0, max, empty, full, first/last index, -1 "unset" sentinels | all | QtTest (data-driven) |
| Negative / error path | Graceful degradation: bad input → defined error, no crash, save untouched | savefile, app, db(asset load) | QtTest |
| Round-trip / golden | load→flatten == identity; single edit changes ONLY intended bytes | savefile | QtTest + committed fixtures |
| Integration | Layers cooperate (db↔savefile expansion, FileManagement↔SaveFile cycle) | db+savefile, app | QtTest |
| End-to-end / system | Full journey: open → edit sequence → save → reopen → assert persisted + valid | savefile (+app) | QtTest |
| Randomization | Every playability invariant holds across many seeds | savefile, db | QtTest + seeded Random |
| Property-based / fuzz | Invariants survive random edits + random/garbage blobs | savefile | QtTest (generative helpers) |
| Regression | Each historical bug stays fixed forever | wherever the bug lived | QtTest (named per bug) |
| Compatibility | Red vs Blue; fresh / mid-game / post-E4 saves | savefile | QtTest + fixture matrix |
| Performance / benchmark | load/expand/flatten/randomize stay fast; no hangs | savefile | QtTest QBENCHMARK |
| Memory / sanitizer | No use-after-free / leak / UB (the QML-GC bug class) | all (whole suite) | ASan + UBSan build, Valgrind |
| Coverage | Proves comprehensiveness; finds untested lines/branches | all | llvm-cov (llvm-mingw) |
| GUI / QML | UI flows behave (commit-on-blur, tab reactivity, popup dismiss) | app/ui | Qt Quick Test (offscreen) — deferred |
common — var8/var16 width and overflow behavior; Random distribution + determinism under a fixed seed; every Utility helper; the QML-ownership guards return what they're handed.
db — asset integrity: 151 Pokémon load, every deep-link resolves (no dangling refs), no required field empty/null; every getter returns expected values vs the JSON; -1 "unset" handled (getWidth() >= 0, not truthiness); search APIs (MapSearch, FontSearch) across hits, misses, multi-match, empty; FontsDB::convertToCode / splice / expandStr on normal, empty, and the variable-tile / lone-variable-tile cases (the s13y infinite-loop bug).
savefile — the heart of the suite:
app (C++) — Bridge exposes the expected object graph; list models (pokedexModel, mapSelectModel, …) map C++ → item-model rows correctly (rowCount, roles, data at index, empty state); FileManagement open/save/recent-prune logic; Router screen set.
Run the randomizer across a large seed sweep and assert every promise from ../context/principles.md → "The Randomization Feature" holds on every run:
Because Random wraps QRandomGenerator, tests seed it deterministically so any failure is reproducible from the logged seed. This doubles as fuzz coverage of the flatten path.
llvm-mingw ships llvm-cov + llvm-profdata. The test build compiles with -fprofile-instr-generate -fcoverage-mapping; after ctest, generate a line+branch report.
Targets (initial, tunable):
| Layer | Line target | Branch target |
|---|---|---|
| common | 100% | ≥ 95% |
| db | ≥ 95% | ≥ 90% |
| savefile | 100% | ≥ 95% |
| app (C++) | ≥ 80% | ≥ 70% |
"Comprehensive" = the report shows no untested function and no untested error branch in common/savefile. Coverage gaps become the to-do list for filling tests.
Mirrors the source tree so "is function X tested?" maps to an obvious file.
Each becomes a permanent named test:
Each phase is independently valuable; the suite is useful from phase 1.
savefile to target coverage. Per-field expand/flatten, encodings, checksum, all verbs. Started 2026-06-07. tst_fields.cpp: trainer basics (money/coins/ID/starter/name), badges, pokédex, rival, one bag item — value round-trips + byte isolation, all green. tst_pokemon.cpp: party AND PC-box Pokémon records (BOX_MON/PARTY_MON) — every stored field round-trips: species, level, status, hp, catch rate, OT id/name, exp, the 5 stat-exp ("EVs"), the 4 DVs, all 4 move slots (id/pp/ppUp), nickname, and the party-only battle stats. (Surfaced that the engine clamps PP to each move's max, faithfully to the game.) Still to do: full item LISTS (add/remove/move/sort, terminator), world/area/world-event fields, playtime, and the verbs reset()/eraseExpansion()/ randomize(). Verbs (tst_verbs.cpp): resetData() (zeros the buffer + blanks the model) and eraseExpansion() (blanks the model, touches zero save bytes) — both green. randomizeExpansion() is QSKIP'd: it crashes on a progressed save (found+fixed a MapSearch::isType() null-deref; still crashes in SpriteData::load() via AreaSprites::randomize()) — deferred to phase 7 (randomizer is WIP).
Flaky-test note (2026-06-08): tst_market_model intermittently segfaulted (~1/3 runs, exit 0xC0000005) via a queued slot invocation (QMetaMethod::invokeImpl) on a torn-down object. Cause: the test created+destroyed a Bridge PER test method; the Bridge has many cross-object signal connections + the static ItemMarketEntry pointers, and the churn left a queued call targeting a freed object. The real app never churns Bridges (one lives for the app's lifetime), so this was test-induced, not a shippable bug. Fix: market test uses ONE Bridge (initTestCase/cleanupTestCase); its assertions are all relative (money up/down vs each test's own baseline) so a shared fixture is order-independent. The other bridge-based model tests (storage/item-storage/bridge) don't churn into a crash (verified 0/20 each), but prefer a single shared Bridge for new app-model tests.
Bugs found by phase-2 tests so far (all real): Daycare empty-destructor crash (fixed), bank-2 checksum off-by-one (fixed), MapSearch::isType() null-deref (fixed), randomizer sprite-path crash (open, phase 7), PlayerPokedex::reset() memset value/count swap (fixed 2026-06-08) — it passed pokemonDexCount as the fill VALUE, so reset() filled every seen/owned byte with 151 (= true), marking the whole dex seen+owned instead of blanking it; randomize() masked it by assigning every entry. Caught by tst_pokedex markAll_and_reset. DB store-accessor bounds (found 2026-06-08 by the tst_db_stores sweep, all fixed): CreditsDB::getStoreAt() had a fully **inverted guard (store.size() >= ind) — it returned nullptr for every valid index (so the credits list could never resolve a row) and ran store.at(ind) for out-of-range indices (crash). Eight more DBs (Events, Fly, Fonts, GameCorner, Items, Maps, Missables, EventPokemon) guarded positive overflow but not negative indices, so getStoreAt(-1) (which QML passes for "nothing selected") did store.at(-1) → crash. All now use the canonical if(ind < 0 || ind >= store.size()) return nullptr; the other ten DBs already had. AbstractRandomString::getStoreAt() (the base for the random name/example sources) had the same missing negative guard — also fixed (found 2026-06-08 by tst_db_random_strings).**
| module | line | func | region |
|---|---|---|---|
| common | 79.1% | 77.3% | 82.1% |
| db | 62.0% | 58.2% | 55.3% |
| savefile | 72.9% | 73.6% | 65.2% |
| app/appcore | ~50-58% | ~50-65% | – | ← noisy; see below
Coverage-gap-fill pass started 2026-06-08 (session: new chat, build/test/coverage loop re-verified end-to-end on the real Qt 6.11 llvm-mingw kit — clean rebuild + 41/41 ctest green independently reproduced). Authoritative savefile real-source line gap list (worst first, by missed lines): pokemonbox.cpp 72% (314), spritedata.cpp 46% (234), filemanagement.cpp 64% (87), areamap.cpp 62% (83), areapokemon.cpp 61% (75), playerbasics.cpp 67% (57), areatileset.cpp 62% (51), signdata.cpp 40% (50), savefileiterator.cpp 57% (47), item.cpp 62% (47), areawarps.cpp 70% (46)… First file done: tst_sign_data.cpp drives the DB-population + randomize paths (setTo/setToAll/randomize/randomizeAll/load-null) that the area-fragment list-ops test never reached — signdata.cpp 39.8% → 100.0% (83/83), 7 cases green; savefile real-source overall 72.9% → 73.8%. Method note: MapDBEntrySign has a protected ctor, so DB-defined signs are sourced from the first MapsDB map with ≥2 signs; randomizeAll is asserted as a position permutation (RNG-independent), which also covers the non-null branch of randomize().
_Second file(s): tst_map_fragments.cpp — the map-edge fragments' DB-population/resolve paths: WarpData::load(MapDBEntryWarpOut*) + ctor + randomize() + toMap(), and MapConnData::loadFromData(MapDBEntryConnect*) + toMap() + load-null. warpdata.cpp 54.5% → 98.5%, mapconndata.cpp 71% → 100%; savefile real-source overall → 74.6% (4408/5905), 6 cases green (43/43 full suite).
Third file: tst_iterator.cpp — exhaustive coverage of SaveFileIterator, the auto-advancing byte cursor (every fragment uses it). Each accessor checked for BOTH value round-trip (through the toolset) AND cursor side-effect: byte/word/BCD/hex/str/range/bitfield advance by size+padding; getBit/ setBit do not advance; navigation (offsetTo/By, inc/dec, skipPadding) and the push/pop offset stack (incl. empty-pop → 0x0000) asserted directly against the public offset field. savefileiterator.cpp 57.3% → 100.0% (110/110); savefile real-source overall → 75.6% (4467/5905), 9 cases green (44/44 full suite). Notes for future iterator-style tests: offset is public (assert it directly) but state is protected (verify push/pop via offset only); and the hex get/set byte convention is the toolset's contract (asserted in tst_toolset) — at the iterator layer assert idempotency, not a canonical hex string.
Fifth file: tst_item.cpp — the Item inventory-slot fragment: all 4 ctors (null-iterator/ index+amount/name+amount/random), the load() overloads (incl. invalid-name guard), randomize() (50-iter legality sweep: non-glitch, non-once, amount 1-5), toItem() resolution, the full buy/sell pricing surface (one + all × money + coins) validated against the resolved ItemsDB entry, the null-item → 0 / canSell=false guards, and setAmount() clamping (1..99). item.cpp 61.8% → 96.7% (119/123); savefile real-source overall → 76.4% (4510/5905), 8 cases green (45/45 full suite).
Sixth file: tst_player_basics.cpp — the PlayerBasics methods past the field round-trips: badge accessors (count/at/set), toStarter() resolution, the individual randomizers (coins/money/id/starter, 40-iter range checks), and the OT-rewrite machinery — getNonTradeMons() (incl. the null-file guard), fixNonTradeMons(), and fullSetPlayerName/ fullSetPlayerId: verified that the value-unchanged path is a true no-op (no owned-mon OT touched — the fidelity + anti-hang guard) AND that an actual change rewrites an owned mon's OT name/id (used getNonTradeMons() itself to grab an owned mon). playerbasics.cpp 67.4% → 98.9% (173/175); savefile real-source overall → 77.4% (4568/5905), 9 cases green (46/46 full suite).
Seventh file: tst_filemanagement.cpp — FileManagement's non-GUI logic: the recent-files list (add/de-dupe/trim/cap, swap, remove, clear, the persisted ';'-blob expandRecentFiles + startup pruneRecentFiles of unopenable entries, driven via the constructor with a seeded isolated QSettings) and the load-error reporting path (cannot-open vs too-short plain-English messages) reached through openFileRecent without a file dialog. filemanagement.cpp 63.6% → 77.0% (184/239) — the remaining ~55 lines are the native open/save file dialogs + the to-disk save path, which can't run headless (the save→reopen cycle is covered by tst_e2e). savefile real-source overall → 77.9% (4600/5905), 9 cases green (47/47 full suite). Off-by-one found AND fixed 2026-06-08 (approved after confirming intent): processRecentFileChanges() capped with append(file); if(size > MAX_RECENT_FILES) break; — appending then breaking retained MAX+1 (6, not 5). worth verifying the extra slot wasn't an intentional sentinel first; it wasn't — mainwindow bounds its menu loop at MAX_RECENT_FILES (and its shortcut array is sized MAX, so a 6th is never read), and RecentFilesModel uses recentFilesCount() directly (its +1 is just the row-0 "Clear Recent Files" header). So the 6th slot only leaked one extra path into the QML list. Fixed to >= MAX_RECENT_FILES → exactly 5. tst_filemanagement::recents_capBoundsTheList is now the regression guard (asserts ==5). See reference/fix-patterns.md.
Eighth file: tst_storage.cpp — the PC (Storage) beyond load/save round-trips: the flattened 0..11 box space (boxCount/boxAt across both 6-box sets), freeSpace()'s room invariant, depositPokemon() success + the all-full→false path, and the randomize verbs (randomize/randomizeItems/randomizePokemon) running clean. storage.cpp 78.6% → 98.0% (96/98); savefile real-source overall → 78.0%, 7 cases green (48/48 full suite).
Ninth file: tst_pokemonbox.cpp — the biggest single gap: the PokemonBox/PokemonMove logic the two existing pokemon suites don't reach (tst_pokemon = field round-trips, tst_pokemon_logic = DV/EV/level/heal/evolve/shiny/stats/nature). Covers PokemonMove PP-Up controls (max/raise/lower/reset
scopes, resetExp, expLevelRangePercent, setNature (incl. the level-range clamp + same-nature no-op), hasTradeStatus, changeOtData/changeTrade (both directions + idempotent no-op + null-basics guard), cleanupMoves/ correctMoves/update(correctMoves), changeMove(ind,…), the manual* UI hooks, reRollEVs/maxPpUps, dexNum/ speciesName, isPokemonReset/isCorrected, isBoxMon, and a dedicated invalid-mon (species 0) safe-default sweep (dexNum=-1, hpStat=1 floor, levelToExp=0, expLevelRange* fall back to raw exp, resetExp no-op). pokemonbox.cpp 72% → 94.4% (1071/1134, 63 missed); savefile real-source overall → 82.5% (4879/5916), 19 cases green (49/49 full suite). Three latent bugs were raised and, with their approval, FIXED in pokemonbox.cpp this pass (the tests now assert the corrected behaviour and stand as regression guards): (1) update() with resetType=false ran its bare else type2 = toType1->ind on every call, overwriting a dual-type mon's type2 with type1 (silently dropping the second type; emitted no type2Changed()) — reachable via maxLevel()/maxEVs()/resetEVs()/reRollEVs()/ manualLevelChanged(). Fixed: the type2 (re)derivation is now wrapped in if(resetType), so a non-reset update leaves type2 untouched. (2) isCorrected() disagreed with update() for a species whose DB toType2==toType1 (update collapses to type2=0xFF; isCorrected demanded type2==toType2->ind). Fixed: isCorrected now treats a record as dual-type only when toType2 genuinely differs from toType1, and accepts either 0xFF or type1 for a single type (faithful to the DB's mixed 0xFF-vs-duplicate storage). (3) isPokemonReset() could only ever be true for a species with 4 initial moves AND wrongly required isMaxPpUps() when a reset mon has 0 PP-Ups. Root cause traced (it was kept, not squashed aside): the empty-slot bug lived in PokemonBox::isMaxPP()/isMaxPpUps() — they looped all four slots and counted an EMPTY slot (moveID 0) as "not maxed", which propagates into isHealed() so any mon with fewer than four moves could never read as healed (a user-facing wrong result on the heal indicator, not just isPokemonReset). Fixed at the source: isMaxPP()/isMaxPpUps() now skip empty slots (mirroring isMaxedOut()'s existing guard), so isHealed() is correct for any move count; isPokemonReset() then simplifies to iterate the real initial moves (empty after), check ppUp==0, and reuse isHealed(). Regression-guarded by box_healedWithFewerThanFourMoves (a <4-move mon at full HP/PP now reads isMaxPP + isHealed). Also confirmed and fixed: isMinEvs() used || (true if ANY single EV is 0) → changed to && (all-zero), matching its "All stat-exp zero?" contract and isMaxEVs(); it had a real UX impact (StatsTab disables "Reset EVs" on isMinEvs, so the action was greyed out whenever one stat-exp was 0). Still open (tracked, next-steps.md): the type2 single-WRITE truth (0xFF vs duplicate) — load-side tolerance is official, the save-side canonical form is a deliberate temporary exception._
**Cumulative savefile progress this pass: 72.9% → 82.5% across signdata(100%), warpdata(98.5%), mapconndata(100%), savefileiterator(100%), item(96.7%), playerbasics(98.9%), filemanagement(77%, headless ceiling), storage(98%), pokemonbox(94.4%) — plus the recent-files cap off-by-one and five pokemonbox bugs fixed (approved: type2-clobber, isCorrected dual-type, isMaxPP/isMaxPpUps/isHealed empty-slot, isPokemonReset, isMinEvs ||→&&), with only the type2 single-write truth tracked for a later decision._
Tenth file: tst_sprite_data.cpp — SpriteData (one on-map sprite/NPC). Crash investigation first (the rule is to identify, not route around, problems): a probe with MapsDB::inst()->deepLink() called showed all 918 map sprites across 249 maps resolve getToSprite() (0 nulls), and setToAll (918) + randomizeAll (1167) run clean over every map. Verdict: there is no SpriteData defect — the documented "sprite-link crash" is entirely the already-tracked deepLink-not-called landmine (live DB::deepLinkAll() omits MapsDB::deepLink(); scoped to enabling the disabled Maps feature). So no "safe paths only" was needed: the whole file is covered by calling deepLink() in initTestCase (the tst_map_fragments precedent). 13 cases: reset/blank-NPC defaults, the std::optional QML accessors (set/get/reset → −1), step vectors, the four enum random() helpers, toSprite() valid/invalid, a full split-table save→load round-trip (data1/2 + NPC), missable save→check, and the DB-population paths (load(MapDBEntrySprite*) all face/move/type branches incl. the "0"-item guard, setTo/setToAll, randomize/randomizeAll incl. boulder-skip) driven over EVERY map. spritedata.cpp 46% → 100.0% (434/434, 0 missed); savefile real-source overall → 86.5% (5115/5916), 13 cases green (50/50 full suite).
Eleventh file: tst_area_logic.cpp — the area sub-trees' DB-population/randomize logic that tst_area.cpp's byte round-trips don't reach (deepLink in initTestCase): AreaTileset (talk-tile accessors/swap, randomize, loadFromData null+from-map+randomType), AreaMap (conn accessors, toCurMap, coordsToPtr, randomize/setTo from a real map incl. MapConnData::loadFromData, null-safe), and AreaPokemon/AreaPokemonWild (wild randomize over the 0-based dex range + operators + explicit ctor, table randomize, setTo from a map, null-safe) and AreaWarps (list accessors swap/new/remove, setTo + randomize from a real map). 20 cases. Found + fixed (approved) 3 bugs along the way: (1) AreaTileset::loadFromData inverted ternary — (map==nullptr) ? map->getToTileset() : nullptr crashed on a null map and discarded the tileset on a real map; fixed to ? nullptr : map->getToTileset() (regression-guarded by the null + from-map tests). (2) PokemonBox:: newPokemon(Random_Pokedex) used rangeExclusive(1,151) and so could never roll Bulbasaur (dex keys are 0-based, confirmed by probe: dex0=Bulbasaur..dex150=Mew, dex151 null) — fixed to rangeExclusive(0,151). (A bounds guard on AreaPokemon::setTo's array writes was considered but declined — gen-1 wild tables are fixed at 10, so trust the data; left unbounded.) areatileset.cpp 62% → 100.0% (0 missed), areamap.cpp 62% → 93.1% (15 missed), areapokemon.cpp 61% → 90.5% (18 missed), areawarps.cpp 70% → 98.7% (2 missed), pokemonbox.cpp → 94.6%; savefile real-source overall → 90.2% (5336/5916), 20 cases green (51/51 full suite). Stale-note correction (probed, tst_area_probe): the long-standing claim that isType("Cave")/isType("Outdoor") "match 0 maps" (blamed on a wrong type string) was wrong — they match 60 / 38 once deepLink() resolves toTileset; the 0-match + the AreaWarps pickRandom()->getInd() "crash" are the deepLink-not-called landmine, so AreaWarps::setTo/randomize run clean over all 249 maps (now covered). status.md's randomizer row updated accordingly.
Cumulative this pass: 72.9% → 90.2%. Next gap targets (worst remaining by missed lines): filemanagement.cpp (~55 — native file dialogs + to-disk save, headless-unreachable), and assorted leaf getters/branches across savefile fragments; plus residual boundary branches in pokemonbox.cpp (~61 — move-randomize do-while reject, correctMove's single-row replacement, isMaxedOut's invalid-mon branch, setNature's ±25 paths), areamap.cpp (15), areapokemon.cpp (18 — the byte save-NPC paths reached only through the full Area flatten).
common + db raised to ≥90% (2026-06-08, "get the other layers to 90%"):
App-layer push toward 100% started 2026-06-08 (goal: 100%, all tiers incl. GUI). Measured with the SHARED appcore (PSE_SHARED_APPCORE=ON) so the number is real: appcore was 80.2%, raised by: tst_select_models (every select model driven exhaustively — all rows × all roles incl. the placeholder-row default branch, + each model's value↔row helper across edge values), and tst_bridge extended for PokedexModel (all 3 sort comparators via dexSortCycle, pageClosing reset/early-return, dex lookup hit/miss, dataChanged). Bug fixed (approved polish): TypesModel::data() crashed on the placeholder row 0 with an undeclared role (fell through to at(-1)) — now guarded (matches PokemonStartersModel). Gated (QML only queries declared roles). Market/storage models were already well-covered (tst_market_model sweeps all 4 modes + checkout; tst_storage_model/tst_item_storage_model/tst_bridge). Remaining app gaps: fine-grained per-entry item-market branches, fontpreviewprovider, router edge paths — diminishing-value.
The full comprehensive-testing program (all tiers in scope; tracked): (1) finish app C++ leaf coverage; (2) GUI/QML behavioural (Qt Quick Test offscreen — the tst_qml_brg Bridge-as-brg harness already exists; remaining = screen-flow tests: name commit-on-blur, editor tab reactivity, popup dismiss — needs the app screens loaded from app.qrc into the test engine, a larger harness); (3) cross-module integration + whole-system E2E + a compatibility fixture matrix (Red/Blue, fresh/mid/post-E4) + fuzz/property expansion + QBENCHMARK perf pins; (4) CI already builds + runs ctest on Linux on push (confirmed) — remaining is to ADD the ASan/UBSan Linux job (sanitizers don't run on the Windows llvm-mingw kit; Linux is where the QML-GC use-after-free class gets caught automatically) + optional coverage gate. This is a multi-session program; pick up from the task list / this block.
Key discovery (worth knowing for all map-DB tests): DB::deepLinkAll() does not call MapsDB::deepLink() — map warps/sprites/connections are left unresolved at boot (it's only needed by the disabled map randomizer), so getToMap() is null everywhere until a test calls MapsDB::inst()->deepLink() itself. That call is stable headless (no crash; the warp deeplink's null-toMap crash is guarded by #ifdef QT_DEBUG, which is active in the Debug test build). The one warpdata line still missed is a minor branch, not chased._
How the app/appcore number is measured (and why the earlier 56% was wrong): unlike common/db/savefile (each a shared .dll, so per-test coverage merges cleanly), appcore is a static lib embedded into every app-test exe. Merging all .profraw and exporting against a single exe double-counts/garbles and gave an inflated ~56%. The reliable method is a per-file max union: measure each app-test exe against ITS OWN profile, then for each app source file take the best line coverage any test achieved and sum — that yields the honest 43.6%. So the app layer is the real laggard and the current focus. (For a one-shot accurate number in future, build appcore SHARED in the coverage config.) Confirmed attribution artifact (2026-06-08): tst_market_model runs all 7 cases with 0 skips and its buy/sell checkout demonstrably change the player's money — yet llvm-cov reports the mvc/itemmarket/itemmarketentry*.cpp files (283-line storeitem etc.) at 0%. The code is provably executed; llvm-cov on this llvm-mingw kit simply fails to attribute coverage for some statically-linked appcore translation units. So 43.6% is an under-count and the app % can't be trusted on this toolchain — judge app coverage by the passing functional suites, not the number. Making appcore a shared lib (at least for the coverage build) would fix the measurement. RESOLVED 2026-06-08: added option(PSE_SHARED_APPCORE) to app/CMakeLists.txt — pass -DPSE_SHARED_APPCORE=ON for the coverage build and appcore builds as a single shared .dll (WINDOWS_EXPORT_ALL_SYMBOLS auto-exports, no per-class macros). All test exes then share one instrumented appcore, so coverage merges cleanly: app/appcore is actually 58.2% line / 64.7% func (the 0%/43.6% figures were the static-lib artifact; the itemmarket entries now correctly show 50-55%). The default build is unchanged (option OFF → STATIC, as shipped). BUT the app number is still noisy even with shared appcore: repeated full-suite measurements of the SAME code have read 50%, 56%, 58% (and the static build gave 44/47%). llvm-cov on this llvm-mingw kit doesn't merge appcore coverage deterministically across the ~14 app-test exes. The three library layers (common/db/savefile) are stable to the decimal across every run, so trust those; treat the app % as a rough ~50-58% band and judge app coverage by the passing functional suites (every model, both list modes, all four market modes, bulk ops, bridge, settings, engine math). Engine providers + QML behavioural now covered too (2026-06-08): tst_engine_providers is a GUI test (QTEST_MAIN, offscreen) that compiles the app qrc and drives TilesetEngine's resource-backed builders, TilesetProvider, and FontPreviewProvider (incl. malformed-id fallbacks); tst_qml_brg boots a real Bridge as the QML brg context property and drives the C++<->QML property chain from QML (the undefined-chain guard). So no app tier is wholly untested now — the only remaining QML work is full screen-flow tests (commit-on-blur etc.), which stay optional / low-ROI per a design decision.
Trend across the effort (line): common 65 → 79 (Random chance helpers), savefile 63 → 67 → 68 → 72 (area, area-fragments, pokedex, items-logic, misc-regions), db 51 → 53 → 56 (db-entries + db-stores), app 0 → 3.5 → 20 → 39 → 56 (bridge, then storage/market models + tileset engine). Method: instrument all libs + appcore, run every test exe with a unique LLVM_PROFILE_FILE, llvm-profdata merge, then llvm-cov export -instr-profile merged.profdata -summary-only <lib.dll|tst_bridge.exe> <src-dir> per module (app measured against tst_bridge.exe, which statically links all of appcore). db then climbed 56 → 63 (db-entry-getters, db-stores, fonts, mapsearch, map-subentries, random-strings). Biggest remaining climbs: app engine/ providers + Router edge paths (app is now the lowest line layer), savefile leaf getters, and the remaining per-entry getters in db._ _**AddressSanitizer: not viable on this Windows/llvm-mingw kit.** The ASan build compiles (projects/build/asan, -fsanitize=address), but every instrumented exe crashes at startup with interception_win: unhandled instruction (0xC0000005) before any test runs — a known ASan-on-Windows limitation in its API-interception engine, NOT a bug in our code (the same tests pass under the normal
CI (ctest + sanitizer + coverage on push). Workflow written 2026-06-07: .github/workflows/tests.yml — a linux-asan job (installs Qt via aqt, builds with ASan+UBSan which work on Linux, runs ctest headless/offscreen) and a windows job (Qt llvm-mingw, matches the local kit). Not yet exercised on GitHub — the Qt module/arch names may need a first-run tweak (noted in the file). This is the right home for ASan, which can't run on the local Windows kit. Cross-platform gotcha (CI first-run shakeout, 2026-06-08): the local kit is Windows (case-INsensitive filesystem); the linux-asan CI is case-SENSITIVE. A few db headers are mixed-case — spriteSet.h, starterPokemon.h, tmHm.h, hiddenItemsdb.h — so an #include <pse-db/spriteset.h> compiles locally but fails on Linux with "No such file or directory". Match header case exactly in includes. (Also surfaced: the Qt module fix, qtdeclarative/qtsvg are base not add-ons; and tests_all aggregate so new suites are always built.) Qt-download flakiness (recurring, 2026-06-13): the linux-asan job intermittently fails at the Install Qt step — aqtinstall can't reach download.qt.io and a fallback mirror (qt-mirror.dannhauer.de) serves a bad SSL cert (CERTIFICATE_VERIFY_FAILED), aborting the install. Mitigation: cache: true + pinned exact 6.8.3 on both install-qt-action steps, so a successful download is reused and almost every run skips the download (a rare cache-miss flake is fixed by a re-run). Note CI runs Qt 6.8.3 while the kit is 6.11, so a GUI/QML test could in principle pass locally yet differ on CI. (CI status is not part of the default loop — checking it needs GitHub auth that isn't wired up; the local Windows kit + ctest is the working gate.)
File: tests/qml/tst_qml_screens.cpp · CTest name: tst_qml_screens · headless (offscreen).
What it guards. The C++ ctest suite never instantiates QML, so a screen that fails to load ("Cannot override FINAL property" → "Component is not ready") or that loads but degrades (binding TypeErrors, missing types, missing image providers, anchor-on-null) passes every C++ test yet is broken in the app. Exactly that reached main on 2026-06-13 (the Credits Page.contentWidth FINAL override + the id: top anchor-line collision — reference/fix-patterns.md). This test is the automated gate for that whole class; the standing workflow now requires it green before FF main.
How it works. A data-driven QTEST_MAIN GUI test, one ctest row per screen, list taken straight from Router::loadScreens() (the authoritative registry — can't drift). Per screen it: builds a QQmlComponent for the qrc url; fails on any component.errors(); instantiates it into a sized parent via beginCreate/completeCreate (so anchors.fill: parent / parent.width resolve against a real non-null parent, as when the app pushes it onto its StackView); spins the event loop briefly so Component.onCompleted + deferred bindings run; then FAILS if any qWarning/qCritical/ qFatal was emitted during the load (captured via an installed qInstallMessageHandler).
Engine wiring mirrors MainWindow::injectIntoQML + setupProviders (keep in sync if that changes): brg context property (a real Bridge over the BaseSAV.sav fixture), the tileset + font image providers, DB::inst()->qmlProtect(engine) (GC guard), and bootQmlLinkage() for the full type registration. It compiles in app.qrc (so qrc:/ui/app/... resolves and the bundled qtquickcontrols2.conf auto-selects the Material style, same as the app) and the exe's own src/boot/bootQmlLinkage.cpp (one source of truth for QML type registration — not duplicated).
Why this approach (C++ QQmlComponent harness + message handler) over Qt Quick Test (qmltest): the goal is "load every screen, fail on ANY warning." That is a C++-level concern — intercepting the Qt message handler and treating QtWarningMsg/QtCriticalMsg as failures across a data-driven sweep — which a C++ harness expresses directly and uniformly. qmltest is built to write per-case QML assertions on properties; making "any warning anywhere = fail" awkward and per-case, and it has no clean hook to enumerate+instantiate the real screen registry with the app's exact engine wiring. We keep qmltest (tst_qml, tst_qml_brg) for the behavioural flow tests below, where asserting QML state from QML is the natural fit. Tradeoff accepted: the harness duplicates a few lines of engine setup that must track MainWindow (documented above).
First-run note (for whoever runs it first on the kit/CI): if a screen surfaces a pre-existing benign warning (e.g. an unavoidable offscreen-platform message — NOT a QML bug), triage it and, only if truly benign, add a narrow justified substring to the isBenign() allowlist in the test (empty by default). Real QML warnings must be fixed, not allowlisted. A screen that binds to a "current selection" only set during navigation (e.g. PokemonDetails) is a good robustness probe — if it warns when loaded cold, the binding wants a null guard.
It is technically doable. Qt Quick Test drives QML components, simulates clicks/keystrokes, and asserts on properties, running headless via the offscreen platform plugin (so it works in CI).
Why it's deferred — honest cost/benefit:
If greenlit, scope would be narrow: a handful of behavioral smoke tests on the highest-value flows (name commit-on-finish, Pokémon editor reactivity, popup open/dismiss), run offscreen, kept deliberately small. Decision pending; revisit once the UI-polish phase settles.
The screen smoke test is the floor (load + zero-warnings). On top of it a comprehensive GUI suite is now built on a shared headless harness, tests/helpers/guiapp.h, which boots the REAL app — qrc:/ui/app/App.qml (+ AppWindow.qml, both StackViews, the live C++ Router) into a QQuickView on the offscreen platform, wired exactly like MainWindow (brg over a fixture/new save, the tileset/font providers, DB::qmlProtect, bootQmlLinkage). It provides navigation (brg.router.changeScreen), item finders (by objectName / type / value), real input synthesis (clickItem/typeInto/pressKey), QML-warning capture (QmlWarningScope), and save/reopen helpers. Per the hybrid approach: flagship journeys use real synthesized input; breadth uses the real bound objects + real navigation. All run offscreen, so they gate both CI jobs (Linux + Windows).
✅ BUILT + RUN + GREEN on the Qt 6.11 kit 2026-06-13 (full ctest 61/61). First-run triage done — fixes: harness keyType() (QtTest keyClicks(QString) is QWidget-only); appBody() matches the "StackView" substring (the inner stack's class is StackView_QMLTYPE_N, not QQuickStackView — was why every non-modal screen read "no current page"); navigate() waits on the StackView busy transition; benign offscreen font warning allowlisted; tst_gui_input now locates the money field by a non-visual objectName and hard-asserts (no QSKIP); the nav sweep accumulates + reports all bad screens at once. A real crash was found + fixed (Pokemart empty-cart at(0) assert — see status.md Open Issues + reference/fix-patterns.md).
Built (this pass):
Still to build (next increments):
1a. Shell + fragment-leaf smoke (extend tst_qml_screens). Also instantiate App.qml / AppWindow.qml and standalone reusable fragments the zero-warning way. 1b. Detail-screen flows. Navigate pokemon → select a mon → open pokemonDetails; maps → select → mapDetails. Exercises the parent→child push the sweep skips (and the editor tab reactivity / Glance pane bindings).
The long-standing gap "ASan isn't viable on the Windows llvm-mingw kit, real sanitizer coverage needs Linux" (see the AddressSanitizer note in phase 7) is now filled locally, not only in CI: a Docker setup under docker/ builds and runs the whole suite on Linux with Qt 6.11 + clang, including the things the Windows kit can't do.
Run it (PowerShell, from repo root):
-Rebuild rebuilds the image (after a Dockerfile edit); -Clean wipes the persistent build volume.
Architecture (why it's fast): the Dockerfile bakes the toolchain ONCE (clang/lld/llvm, CMake+Ninja, Qt 6.11 via aqtinstall, the headless-GUI libs, Xvfb, ccache, and libclang-rt-18-dev for the sanitizer runtime). The repo is not in the image — it's bind-mounted read-only at /host and rsynced into a persistent named volume pse-build (ext4 inside the Docker VM) by run-tests.sh. Building there instead of over the slow Windows/WSL bind mount is the speed win sought, and the build tree + ccache persist across runs so repeats are incremental. Each variant configures its own dir under /build/out/ and runs cmake --build … --target tests_all then ctest. Full how-to + caveats: docker/README.md.
First-run results (2026-06-13, Qt 6.11 + clang 18, all green):
| variant | result |
|---|---|
| standard | 66/66 ctest passed (offscreen). |
| asan | 66/66 passed, zero ASan/UBSan errors across the whole suite incl. the QML-instantiation GUI tests — the s13f–h use-after-free class is now covered automatically on Linux (and validates the recent ItemMarketEntryPlayerItem UAF fix, HEAD 5bcd3e4). |
| xvfb | 66/66 under a real virtual X server. |
| coverage | 66/66; llvm-cov 89.73% line / 86.52% region / 78.24% branch over project source (Qt/tests/system excluded); HTML in /build/out/coverage/html. |
First-run shakeout (fixed in the committed files): (1) CMake find_package(Qt6Gui) needs the OpenGL dev libs (mesa-common-dev libgl-dev libegl-dev libglx-dev libopengl-dev), not just the runtime libs. (2) clang's sanitizer archives (libclang_rt.asan*) aren't in the base clang package — need libclang-rt-18-dev (added as a layer AFTER the Qt download so it stays cached). (3) llvm-cov report takes ONE binary as the first positional + the rest via -object (passing the source dir as the positional → "Is a directory").
Relationship to CI: the GitHub linux-asan job is Qt 6.8.3 + gcc; this container is 6.11 + clang (the kit toolchain) — the closer-to-runtime local reproduction, runnable on demand without GitHub auth. Both run tests_all + ctest offscreen, so a green container ≈ a green CI Linux job (modulo the Qt/compiler delta).
The suite's static-analysis layer — the last missing test type. Three tools, one entry point each locally (scripts/lint.ps1 on the Windows kit, scripts/lint.sh on Linux/CI) and a lint GitHub Actions workflow (.github/workflows/lint.yml, Linux, runs alongside tests).
clang-tidy (GATED). Config: repo-root .clang-tidy. Reads build/compile_commands.json (CMAKE_EXPORT_COMPILE_COMMANDS is forced ON in projects/CMakeLists.txt). The check set is defect-detection, not style: the clang static analyzer + the high-signal bugprone-* / performance-* / misc-unused* checks. Deliberately OFF (each documented in .clang-tidy): the modernize-*/readability-* style families (churn, not defects); and three checks that only flag intentional project idioms — bugprone-too-small-loop-variable (the var8/var16 GameBoy-width loop counters over fixed ranges), bugprone-unchecked-optional-access (DB base-stat/sprite std::optionals are contract-guaranteed populated past their isValid()/null guards), clang-analyzer-optin.cplusplus.VirtualCall (the qmlRegister()/load()-in-ctor boot pattern), and performance-unnecessary-value-param (mostly deliberate Q_INVOKABLE/slot signatures). Gate status: clean — 143 TUs, 0 findings. (The kit ships clang-tidy 17, so ExcludeHeaderFilterRegex — added in 19 — is not used.) The local serial run is slow (~15 min, header re-analysis per TU); for a fast sweep run clang-tidy file-parallel (see the throwaway tidy_par.ps1 pattern) — CI uses run-clang-tidy-style parallelism implicitly via the script.
cppcheck (INFORMATIONAL for now). .cppcheck-suppressions carries the recurring Qt/MOC structural false positives (unusedFunction, missingIncludeSystem, …). Not on the Windows kit by default; the Linux lint job apt-installs it. Surfaced but not gated yet — it's noisy on Qt and hasn't had a validation pass on this codebase (only seen on Linux CI, since it isn't on the kit). Promote to gating (re-add --error-exitcode=2 + the fail increment in scripts/lint.*) once a clean run is confirmed and the suppressions tuned. Inline // cppcheck-suppress <id> for one-offs.
qmllint (INFORMATIONAL, never gates). qmllint cannot resolve the project's C++-registered QML types — a direct consequence of the deliberate no-qt_add_qml_module design — so its type-dependent categories are false positives AND its unused-import detection is unreliable (a used directory import like import "../controls" can be misflagged; removing it would break the app). Changing QML is also a design decision. So qmllint is surfaced for human review but never fails the build. (If we ever want a real QML gate, the path is generating .qmltypes for the C++ types — a sub-project that somewhat fights the current architecture.)
Bugs the gate found + fixed (2026-06-22) — see notes/reference/fix-patterns.md: expLevelRangePercent() integer-division-as-float (+ div-by-zero guard); an unguarded toType1->ind deref in PokemonBox::update(); a dead-store in PokemonMove's ctor; a signed/unsigned char compare; an int-multiply widening; cheap const-ref loop fixes; a missing switch default.
Reviewed with Twilight 2026-06-22:
Fresh authoritative measurement (Docker coverage variant, Qt 6.11 + clang, all 71 ctest): TOTAL ≈ 89.98% line / 90.30% function / 86.81% region / 78.67% branch over project source. (Up slightly from the 89.73% baseline — the new signal/model/visual/acceptance suites + the analyzer bug-fix paths.) common ≈ 100%; savefile/db ≈ 90%; app is the laggard.
"100% on every level" is bounded — the remaining miss is three distinct kinds, only one of which is worth chasing now:
Progress + a 4th, newly-found constraint (2026-06-22). First fill pass landed: tst_db_coverage_fill.cpp covers the Examples / TmHmsDB / MusicDB / TrainersDB accessor one-liners. Building it surfaced an important reachability limit: the db/savefile shared libs only EXPORT their DB_AUTOPORT/SAVEFILE_AUTOPORT-marked classes. Unmarked classes (e.g. Examples, HiddenCoinsDB) can't be called from an external test exe at all — their accessor lines are only reachable via QML's meta-object (a Q_PROPERTY property("x") read invokes the getter inside the dll — that's how the Examples accessors got covered) or internal dll calls. Plain non-Q_PROPERTY methods on an unexported class are simply not reachable headless. Also data-driven dead ends exist: the HiddenItems/Coins entry getters can't be covered because those stores are empty in the test fixtures. So a chunk of the raw "miss" is export-gated/data-gated, not genuinely fillable — treat the per-line report with that lens (prefer Q_PROPERTY meta-reads for unexported classes; don't add export macros just for coverage).
So the honest target is ~100% of reachable, built, exported code, not a literal 100% of every line. The next coverage passes should work the category-1 list above (each a focused test file + the standard build/ctest/commit loop; app-layer targets need a Bridge à la tst_bridge), and re-measure with the Docker coverage variant. Category 2 unlocks as the Maps/HoF/Options screens are built; category 3 stays documented-excluded.