Cleanup pass on save-sync addressing three independent failure modes
that interact in production data: content_hash drift between client
and server, null-slot archival saves leaking into sync flows, and
content-hash dedupe collapsing legitimately-distinct slots.
Bug fixes
- compute_content_hash dispatched on zipfile.is_zipfile(relative_path),
which silently returned False whenever the process's CWD wasn't
ASSETS_BASE_PATH. Every zip save fell through to the raw-MD5 branch,
persisting hashes that disagreed with clients computing the intended
per-entry zip-hash. Resolve to a full path before the dispatch.
- _build_negotiate_plan, sync_push_pull_task, and sync_watcher all
treated null-slot saves as sync-eligible. Null-slot saves represent
web-UI / archival uploads; including them in negotiate plans matched
them against device pushes by filename and overwrote archival data.
Filter null-slot saves at all three call sites.
- get_save_by_content_hash matched on (rom_id, user_id, content_hash)
only, so identical bytes uploaded to different slots collapsed into
one record. Scope the lookup by slot when provided so clone-save-
to-new-slot creates a distinct row per slot.
- get_save_by_filename matched on (rom_id, user_id, file_name) only.
When two uploads to different slots happened in the same wall-clock
second (the datetime tag is per-second), the second upload UPDATED
the first record's slot instead of creating a distinct row. Scope
the filename lookup by slot too.
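A minimal sketch of the path-resolution fix from the first bullet,
assuming a hypothetical base_path argument standing in for
ASSETS_BASE_PATH. The per-entry hashing scheme shown here (entry name
plus bytes, in sorted order) is an assumption for illustration; the
real zip-hash may differ.

```python
import hashlib
import zipfile
from pathlib import Path


def compute_content_hash(relative_path: str, base_path: Path) -> str:
    """Hash a save file, dispatching on whether it is a ZIP archive."""
    # Resolve against the assets base first. is_zipfile() returns False for
    # a path it cannot open, so a bare relative path depends on the CWD.
    full_path = base_path / relative_path

    if zipfile.is_zipfile(full_path):
        # Hypothetical per-entry scheme: hash each entry's name and bytes in
        # sorted order, so archive-level metadata does not affect the result.
        digest = hashlib.md5()
        with zipfile.ZipFile(full_path) as archive:
            for name in sorted(archive.namelist()):
                digest.update(name.encode("utf-8"))
                digest.update(archive.read(name))
        return digest.hexdigest()

    # Non-ZIP saves: plain MD5 over the raw bytes.
    return hashlib.md5(full_path.read_bytes()).hexdigest()
```

Because the dispatch sees an absolute location, the result no longer
varies with the directory the worker happened to start in.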
One-shot recovery
- New recompute_save_content_hashes manual task walks every Save row,
recomputes via the fixed dispatch, and updates rows whose values
differ. Idempotent; safe to re-run.
- Backend startup counts Save rows WHERE content_hash IS NULL and, if
any exist, enqueues the recompute task on the low-priority
RQ queue. The API process moves on; the worker handles the
recompute out-of-band. Subsequent restarts find zero NULL hashes
and skip. Admins can also trigger the task manually.
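The recovery flow above could look roughly like this. It is a sketch
against sqlite3 rather than the project's ORM; the saves table layout
and the compute callback are assumptions made for illustration, and
the RQ enqueue step is left out.

```python
import sqlite3
from typing import Callable


def count_null_hashes(conn: sqlite3.Connection) -> int:
    """Startup check: how many saves still lack a content hash?"""
    row = conn.execute(
        "SELECT COUNT(*) FROM saves WHERE content_hash IS NULL"
    ).fetchone()
    return row[0]


def recompute_save_content_hashes(
    conn: sqlite3.Connection, compute: Callable[[str], str]
) -> int:
    """Recompute every hash, writing only rows whose value differs."""
    updated = 0
    rows = conn.execute(
        "SELECT id, file_path, content_hash FROM saves"
    ).fetchall()
    for save_id, file_path, stored in rows:
        fresh = compute(file_path)
        if fresh != stored:  # also true when stored is NULL (None)
            conn.execute(
                "UPDATE saves SET content_hash = ? WHERE id = ?",
                (fresh, save_id),
            )
            updated += 1
    conn.commit()
    return updated
```

A second run finds nothing to change and returns 0, which is what
makes the task safe to re-run.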
Test infrastructure
- Added tests/_zipfile_shim.reload_zipfile() mirroring the pattern
from utils/zip_cache.py for the same zipfile-inflate64 + CPython
3.13.5 incompatibility. Test fixtures that build ZIPs call it
immediately before opening the archive.
The paginated ROM list eager-loaded sibling_roms via selectinload, which
hydrated full Rom ORM instances (including heavy JSON metadata columns)
for every sibling even though only an existence/count check was needed
on the frontend. On large collections this dominated request latency.
Split sibling handling by response shape:
- SimpleRomSchema (list): siblings is now list[int]; populated per page
by a single SELECT against the sibling_roms view projecting only
(rom_id, sibling_rom_id) — no Rom row hydration.
- DetailedRomSchema (detail): keeps full SiblingRomSchema objects, with
load_only on (id, name, fs_name_no_tags, fs_name_no_ext) so sibling
rows stop dragging in JSON metadata.
Frontend usage already only consumes siblings.length on list views; the
detail-page VersionSwitcher continues to receive the richer schema.
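For the list shape, the per-page grouping might be sketched as below.
The function name and the in-memory pairs input are hypothetical; in
practice the pairs would come from the single SELECT against the
sibling_roms view.

```python
from collections import defaultdict


def group_sibling_ids(
    page_rom_ids: list[int], pairs: list[tuple[int, int]]
) -> dict[int, list[int]]:
    """Turn (rom_id, sibling_rom_id) rows into one id list per ROM."""
    siblings: dict[int, list[int]] = defaultdict(list)
    for rom_id, sibling_rom_id in pairs:
        siblings[rom_id].append(sibling_rom_id)
    # ROMs without siblings still get an entry, so .length works on each.
    return {rom_id: siblings.get(rom_id, []) for rom_id in page_rom_ids}
```

Only integers cross the wire here, so no Rom row is hydrated for a
sibling on list views.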
Drop the migration and the multi_file / top_level_file_count columns on
roms; express both as deferred column_property correlated subqueries
against rom_files instead. The gallery list and detail queries opt in
via undefer, so they get the values computed in the same SELECT via
indexed subqueries (rom_id index already in place); other code paths
that don't read the flags pay nothing.
This keeps the gallery perf win (no rom_files load for cards) without
introducing schema state that has to stay in sync with rom_files at
write time.
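The SQL shape that such a column_property produces can be sketched
with sqlite3. This is raw SQL rather than the ORM mapping itself, and
the stored is_top_level column is an assumption made to keep the
example short.

```python
import sqlite3

# The count is a correlated subquery evaluated in the same SELECT as the
# rom row, so no rom_files rows are loaded into the application.
GALLERY_QUERY = """
SELECT
    r.id,
    r.name,
    (SELECT COUNT(*)
       FROM rom_files f
      WHERE f.rom_id = r.id
        AND f.is_top_level = 1) AS top_level_file_count
FROM roms r
ORDER BY r.id
"""


def fetch_gallery_rows(conn: sqlite3.Connection) -> list[tuple]:
    """Return (id, name, top_level_file_count) for every rom."""
    return conn.execute(GALLERY_QUERY).fetchall()
```

Queries that omit the subquery column never touch rom_files, which
mirrors the deferred behavior for code paths that skip undefer.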
The gallery list endpoint was eager-loading every rom_file row for each
paginated ROM via selectinload, then re-joining each row back to its
parent rom for the is_top_level computation. For platforms with extracted
multi-file ROMs (Xbox 360 ~1394 files/ROM, Switch ~199 files/ROM), this
made /api/roms time out at 120s even with a rom_id index.
Cards never displayed individual files — only the has_simple_single_file
/ has_nested_single_file / has_multiple_files booleans that derive from
the file list. Denormalize the underlying state onto roms as multi_file
(folder-based vs single-file) and top_level_file_count, recompute the
booleans from those columns, drop the selectinload from filter_roms, and
move the files field from SimpleRomSchema to DetailedRomSchema so the
gallery payload no longer ships file rows.
Also drop the redundant joinedload(RomFile.rom) and switch the relation
to lazy="select" so subsequent file.rom accesses resolve from the
session identity map instead of re-JOINing the parent rom per file row.
ShowQRCode.vue's folder-based DS/3DS fallback now fetches the detailed
rom on demand, since SimpleRom no longer carries files.
- Use CollectionSchema instead of ReturnType<typeof collectionsStore.getCollection>
in AddRoms.vue and RemoveRoms.vue (simpler, per gantoine review)
- Wrap bulk INSERT in a savepoint so a concurrent duplicate-key violation
is caught via IntegrityError and ignored rather than aborting the transaction
- Only bump Collection.updated_at in remove_roms_from_collection when rows
were actually deleted (rowcount > 0), matching add_roms_to_collection behavior
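A sketch of the savepoint pattern, using sqlite3 in place of the
project's database layer. The collection_roms table and the function
signature are assumptions; the point is that a duplicate key rolls
back only the savepoint, not the enclosing transaction.

```python
import sqlite3


def add_roms_to_collection(
    conn: sqlite3.Connection, collection_id: int, rom_ids: list[int]
) -> int:
    """Bulk-insert links inside a savepoint; return the rows added."""
    conn.execute("SAVEPOINT bulk_insert")
    try:
        cursor = conn.executemany(
            "INSERT INTO collection_roms (collection_id, rom_id) VALUES (?, ?)",
            [(collection_id, rom_id) for rom_id in rom_ids],
        )
    except sqlite3.IntegrityError:
        # Undo only this batch; earlier work in the transaction survives.
        conn.execute("ROLLBACK TO SAVEPOINT bulk_insert")
        conn.execute("RELEASE SAVEPOINT bulk_insert")
        return 0
    conn.execute("RELEASE SAVEPOINT bulk_insert")
    return cursor.rowcount
```

The returned count lets the caller bump updated_at only when it is
greater than zero.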
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace full rom_ids list replacement with atomic POST/DELETE endpoints
that add or remove individual ROMs from a collection. Under the old
last-write-wins replacement, rapid concurrent clicks overwrote each
other; each atomic request now applies only its own change.
Also fix missing session.flush() in add_rom_user() and add collection
endpoint tests.
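The lost-update problem can be illustrated with plain sets. The three
helpers below are hypothetical stand-ins for the old replace endpoint
and the new add/remove endpoints, not the actual handlers.

```python
def replace_rom_ids(stored: set[int], new_rom_ids: set[int]) -> set[int]:
    """Old behavior: the client's full list replaces whatever is stored."""
    return set(new_rom_ids)


def add_rom(stored: set[int], rom_id: int) -> set[int]:
    """New behavior: add one ROM, leaving every other member untouched."""
    return stored | {rom_id}


def remove_rom(stored: set[int], rom_id: int) -> set[int]:
    """New behavior: remove one ROM, leaving every other member untouched."""
    return stored - {rom_id}
```

If two clients both start from {1} and send full lists {1, 2} and
{1, 3}, the replace path ends at {1, 3} and ROM 2 is lost. Sending
add(2) and add(3) instead ends at {1, 2, 3} in either order.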
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Commit 3991e1b6e removed `@with_details` from `get_roms_by_fs_name` but
left the body using the `query` parameter that decorator was supposed
to inject, so every scan hit `'NoneType' object has no attribute
'filter'` and crashed the platform identification task.
Make the function self-contained: build `select(Rom)` directly and
eager-load only `Rom.platform`, the one relationship the scan loop
actually needs (via `rom.platform_slug` / `rom.platform.fs_slug`).
Keeps the prior commit's intent of avoiding the heavy `with_details`
eager-load on every batch.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Collapse the two parallel id lists and their mirrored chunked-update
loops into a `flips: dict[bool, list[int]]` keyed by desired state, and
drop unused rom assignments in the related tests.
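A rough sketch of the `flips` shape, assuming the state being flipped
is a per-ROM "missing" flag. `plan_flips`, `apply_flips`, and the
`update` callback are hypothetical names for illustration.

```python
from typing import Callable


def plan_flips(
    current: dict[int, bool], present_ids: set[int]
) -> dict[bool, list[int]]:
    """Group ROM ids by the missing-state they need to be flipped to."""
    flips: dict[bool, list[int]] = {True: [], False: []}
    for rom_id, is_missing in current.items():
        desired = rom_id not in present_ids
        if desired != is_missing:
            flips[desired].append(rom_id)
    return flips


def apply_flips(
    flips: dict[bool, list[int]],
    update: Callable[[list[int], bool], None],
    chunk_size: int = 1000,
) -> None:
    """One chunked loop serves both directions of the flip."""
    for desired_state, ids in flips.items():
        for start in range(0, len(ids), chunk_size):
            update(ids[start:start + chunk_size], desired_state)
```

Keying by the desired state means the loop body is written once, and
the state is passed to the update instead of being implied by which
list the id came from.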
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The scan was spending excessive time on large platforms even when all ROMs
were already scanned. Root causes: per-ROM UPDATE queries for skipped ROMs
(10k individual writes), missing composite index on (platform_id, fs_name)
causing full table scans, NOT IN clauses with 10k+ values in
mark_missing_roms(), and redundant filesystem reads.
Changes:
- Add bulk_mark_present() for batch-updating skipped ROMs in one query
- Move skip detection from _identify_rom to the batch loop so skipped ROMs
never enter the async scan pipeline, and report progress for them
- Add composite index idx_roms_platform_id_fs_name via migration 0077
- Rewrite mark_missing_roms() with flip-based approach: mark all missing,
then un-mark present ones in chunks of 1000
- Cache filesystem reads in scan_platforms() to avoid double directory
traversal (precounting + scanning)
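The mark-all-then-un-mark rewrite might look like the sketch below,
written against sqlite3. The missing_from_fs column name and the
function signature are assumptions; only the two-phase shape is taken
from the description above.

```python
import sqlite3


def mark_missing_roms(
    conn: sqlite3.Connection,
    platform_id: int,
    present_fs_names: list[str],
    chunk_size: int = 1000,
) -> None:
    """Mark every ROM on the platform missing, then un-mark present ones."""
    conn.execute(
        "UPDATE roms SET missing_from_fs = 1 WHERE platform_id = ?",
        (platform_id,),
    )
    # Bounded IN lists replace one NOT IN clause with 10k+ values.
    # Note: SQLite builds older than 3.32 cap bound parameters at 999.
    for start in range(0, len(present_fs_names), chunk_size):
        chunk = present_fs_names[start:start + chunk_size]
        placeholders = ", ".join("?" * len(chunk))
        conn.execute(
            "UPDATE roms SET missing_from_fs = 0 "
            f"WHERE platform_id = ? AND fs_name IN ({placeholders})",
            (platform_id, *chunk),
        )
    conn.commit()
```

Each un-mark statement filters on (platform_id, fs_name), which is the
pair the new composite index covers.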
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace per-item add_session with add_sessions using add_all.
No fallback on IntegrityError -- duplicate concurrent submissions
are the client's responsibility.
Backend API for collecting and querying play sessions, modeled after
the Argosy session data format. Clients submit batches per device,
recording both the session window and screen-on time.
- Restore NoResultFound behavior on update_session, complete_session,
and fail_session when the row is missing: scalar() returns None where
the old .one() raised, and the silent None was a semantic regression
- Remove redundant get_session call from _increment_session_counter;
the atomic SQL increment is already a no-op on missing rows
- Log warning when passed session_id is not found in _sync_device
instead of silently creating an orphan session
- Fix broken path construction in FSSyncHandler: build_* methods now
return relative paths; sync_watcher uses paths relative to sync base
instead of CWD (was completely non-functional in production)
- Fix SSH connection leak in push-pull task: conn.close() now in finally
- Add log.warning for disabled SSH host key verification
- Fix race condition in session operation counter: use atomic SQL
increment instead of read-then-write
- Extract _increment_session_counter helper, add exc_info to warnings
- Replace legacy session.query() with select() in sync_sessions_handler
- Fix orphaned session: trigger_push_pull now passes session_id to job
- Fix wasteful SSH download when no matched_save exists
- Fix BaseModel import collision in sync.py (pydantic -> project base)
- Fix ORM mutation in UserSchema.from_orm_with_request: set field on
schema instance instead of mutating live ORM object
- Mask ssh_password and ssh_key_path in DeviceSchema API response
- Fix migration PostgreSQL compatibility: emit the ON UPDATE clause
only on MySQL, and drop the enum in downgrade
- Rename copy-paste artifact rom_user_status_enum
- Spread allPlatforms before sorting to avoid mutating Pinia store
- Move _METADATA_SOURCE_COLUMNS to module level
- Add optional chain on sourceInfo v-img src
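The atomic counter increment from the list above can be sketched with
sqlite3. The sync_sessions table and operations_completed column are
assumed names; the technique is a single UPDATE that adds to the
stored value instead of reading it into Python first.

```python
import sqlite3


def increment_session_counter(conn: sqlite3.Connection, session_id: int) -> bool:
    """Atomically bump the counter; return False if no such session."""
    cursor = conn.execute(
        "UPDATE sync_sessions "
        "SET operations_completed = operations_completed + 1 "
        "WHERE id = ?",
        (session_id,),
    )
    conn.commit()
    # A missing row matches nothing, so the statement is a no-op and no
    # separate existence check is needed beforehand.
    return cursor.rowcount > 0
```

Since the database performs the addition, two workers incrementing at
the same time cannot both write the same stale value.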
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Derive metadata source columns from Rom model instead of hardcoded list
- Replace getOrderedCoverage() function calls with a computed map to avoid
redundant sorting on each render
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Enhances the server stats page with two new per-platform statistics:
- Metadata coverage: shows which sources matched ROMs (ordered by the user's scan priority config)
- Region breakdown: shows ROM counts per region with flag emojis
Backend adds two new efficient queries (single GROUP BY for metadata, Python-side aggregation for regions).
Frontend redesigns platform cards with a tabular detail layout, size bar visualization, and expandable region chips.
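The Python-side region aggregation could be as simple as the sketch
below. The input shape (one (platform_id, regions) tuple per ROM) and
the "Unknown" bucket for ROMs without regions are assumptions, not
details taken from the actual query.

```python
from collections import Counter, defaultdict


def region_breakdown(
    rows: list[tuple[int, list[str]]]
) -> dict[int, dict[str, int]]:
    """Count ROMs per region for each platform."""
    breakdown: dict[int, Counter] = defaultdict(Counter)
    for platform_id, regions in rows:
        # A ROM tagged with several regions counts once under each.
        for region in regions or ["Unknown"]:
            breakdown[platform_id][region] += 1
    return {pid: dict(counts) for pid, counts in breakdown.items()}
```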
> This PR was developed with AI assistance (Claude Code) per CONTRIBUTING.md disclosure requirements.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add migration 0071 to fix sibling_roms view: add guard against empty string matching for fs_name_no_tags
- Fix group_by_meta_id in filter_roms: use func.nullif to treat empty fs_name_no_tags as NULL in grouping key
- Add group_by_meta_id support to get_roms_scalar
- Add tests for sibling matching behavior with empty/non-empty fs_name_no_tags
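The empty-string guard can be illustrated with a self-join in sqlite3.
This sketch covers only the fs_name_no_tags branch of sibling matching
and uses a simplified roms table; the real view definition is not
reproduced here.

```python
import sqlite3

# Without the last condition, every ROM whose fs_name_no_tags is ''
# would match every other such ROM on the same platform.
SIBLING_QUERY = """
SELECT a.id, b.id
  FROM roms a
  JOIN roms b
    ON a.platform_id = b.platform_id
   AND a.id != b.id
   AND a.fs_name_no_tags = b.fs_name_no_tags
   AND a.fs_name_no_tags != ''
 ORDER BY a.id, b.id
"""


def find_siblings(conn: sqlite3.Connection) -> list[tuple[int, int]]:
    """Return (rom_id, sibling_rom_id) pairs matched by name."""
    return conn.execute(SIBLING_QUERY).fetchall()
```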
Co-authored-by: gantoine <3247106+gantoine@users.noreply.github.com>