Commit Graph

1635 Commits

Author SHA1 Message Date
Georges-Antoine Assi
29f90c027f Merge pull request #3448 from tmgast/fix-save-sync-hash-and-archival
Fix save-sync hash drift, archival save leak, and dedupe scoping
2026-05-29 11:47:52 -04:00
copilot-swe-agent[bot]
54dc059e15 Fix 500 error when char_index contains None key from NULL ROM names
Co-authored-by: gantoine <3247106+gantoine@users.noreply.github.com>
2026-05-29 11:23:55 +00:00
nendo
db0f714b4f SaveSync: use pathlib joins for asset content-hash paths
FSAssetsHandler.compute_content_hash and _compute_zip_hash were
building full paths via f"{self.base_path}/{file_path}". self.base_path
is already a pathlib.Path (resolved by FSHandler.__init__), so the
f-string forced it to str, hard-coded the separator, and re-parsed --
fine on Linux but a footgun if a caller ever sneaks a leading slash or
the path needs Path semantics elsewhere.

Switch both spots to self.base_path / file_path, which is what every
other FSHandler subclass in this module already does (e.g.
FSRomsHandler, FSResourcesHandler, FSSyncHandler all join Path objects
directly).
2026-05-29 17:40:56 +09:00
nendo
41c91fdd5b SaveSync: push null-slot exclusion into the SQL query
Three sync callsites (endpoints/sync.py, sync_watcher.py, and both
branches of tasks/sync_push_pull_task.py) ran get_saves(...) and then
discarded archival null-slot rows in a Python list comprehension. On
libraries with many archival/web-UI uploads that's a strict waste:
those rows are pulled from MariaDB, hydrated into Save model instances,
and then immediately filtered out.

Add a slot_not_null bool kwarg to DBSavesHandler.get_saves and apply
the filter in the SQL query. Update all four callsites to use it and
drop the Python-side comprehension. Default stays False so unrelated
callers keep the current behavior.
2026-05-29 17:40:18 +09:00
nendo
5bb10dacd1 SaveSync: paginate recompute task scan by primary key
get_all_saves() materialized every Save row across all users into a
single .all() list. On instances with very large libraries that's a
real RAM ceiling and pins every row for the lifetime of the recompute
run.

Replace it with get_saves_after_id(after_id, limit) and have the
recompute task drive keyset pagination in PAGE_SIZE-row chunks. SQLAlchemy
streaming via .execution_options(yield_per=...) is incompatible with the
per-call session lifetime that @begin_session enforces (the session
exits before the consumer iterates), so keyset paging from the caller is
the cleanest fit.

Behavior is unchanged: same row coverage, same idempotency, same
counters. Memory usage drops from O(all saves) to O(PAGE_SIZE).
2026-05-29 17:38:49 +09:00
nendo
edb5d15420 Fix save-sync hash drift, archival save leak, and dedupe scoping
Cleanup pass on save-sync addressing three independent failure modes
that interact in production data: content_hash drift between client
and server, null-slot archival saves leaking into sync flows, and
content-hash dedupe collapsing legitimately-distinct slots.

Bug fixes
- compute_content_hash dispatched on zipfile.is_zipfile(relative_path),
  which silently returned False whenever the process's CWD wasn't
  ASSETS_BASE_PATH. Every zip save fell through to the raw-MD5 branch,
  persisting hashes that disagreed with clients computing the intended
  per-entry zip-hash. Resolve to a full path before the dispatch.
- _build_negotiate_plan, sync_push_pull_task, and sync_watcher all
  treated null-slot saves as sync-eligible. Null-slot saves represent
  web-UI / archival uploads; including them in negotiate plans matched
  them against device pushes by filename and overwrote archival data.
  Filter null-slot saves at all three call sites.
- get_save_by_content_hash matched on (rom_id, user_id, content_hash)
  only, so identical bytes uploaded to different slots collapsed into
  one record. Scope the lookup by slot when provided so clone-save-
  to-new-slot creates a distinct row per slot.
- get_save_by_filename matched on (rom_id, user_id, file_name) only.
  When two uploads to different slots happened in the same wall-clock
  second (the datetime tag is per-second), the second upload UPDATED
  the first record's slot instead of creating a distinct row. Scope
  the filename lookup by slot too.

One-shot recovery
- New recompute_save_content_hashes manual task walks every Save row,
  recomputes via the fixed dispatch, and updates rows whose values
  differ. Idempotent; safe to re-run.
- Backend startup runs a COUNT(content_hash IS NULL) query and, if
  any rows exist, enqueues the recompute task on the low-priority
  RQ queue. The API process moves on; the worker handles the
  recompute out-of-band. Subsequent restarts find zero NULL hashes
  and skip. Admins can also trigger the task manually.

Test infrastructure
- Added tests/_zipfile_shim.reload_zipfile() mirroring the pattern
  from utils/zip_cache.py for the same zipfile-inflate64 + CPython
  3.13.5 incompatibility. Test fixtures that build ZIPs call it
  immediately before opening the archive.
2026-05-29 17:00:01 +09:00
Georges-Antoine Assi
8f08769670 run fmt 2026-05-28 20:05:24 -04:00
copilot-swe-agent[bot]
d29ed39a6a Add miximage_v2 media type mapping to SS.fr mixrbv2
Co-authored-by: gantoine <3247106+gantoine@users.noreply.github.com>
2026-05-28 20:15:40 +00:00
Georges-Antoine Assi
30451d5651 fix(security): move SSRF defense into the HTTP client path
The previous validator did a preflight `socket.getaddrinfo` before each
httpx request. Two problems:

  * DNS rebinding / TOCTOU: httpx re-resolves at connect time, so a
    hostname can answer with a public IP for the validator and a
    private IP for the real request. The preflight check did not
    constrain the connection.
  * Event-loop blocking: `socket.getaddrinfo` is synchronous, and the
    media-download callers are async. Slow resolvers stalled
    unrelated requests.

Replace it with two layers, both wired automatically onto every httpx
client built by `utils.context`:

  1. A request event hook running `validate_url_for_http_request`
     (syntactic checks only: scheme, reserved hostnames, literal IPs,
     internal TLDs). No DNS, no call-site responsibility.
  2. `SSRFProtectedAsyncBackend` / `SSRFProtectedSyncBackend`, custom
     httpcore network backends that resolve the hostname inside
     `connect_tcp`, reject any address in a forbidden range, then
     connect to that *same* validated address. The async variant uses
     `loop.getaddrinfo` so it doesn't block the loop. httpcore calls
     `start_tls(server_hostname=<URL host>)` after `connect_tcp`, so
     TLS SNI and cert verification still use the original hostname
     even though the TCP layer connects by IP.

Drop the explicit `validate_url_for_http_request(...)` calls from
`resources_handler.py` — the event hook covers them. Consolidate the
URL validator and its tests under `utils/ssrf.py` /
`tests/utils/test_ssrf.py` so the SSRF surface lives in one module.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 17:58:14 -04:00
Georges-Antoine Assi
84f9dd2e2d Merge pull request #3434 from rommapp/copilot/fix-region-specific-release-date
Use region-prioritized release dates from ScreenScraper
2026-05-26 21:14:21 -04:00
Georges-Antoine Assi
29ce936c7d Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-05-26 20:45:41 -04:00
copilot-swe-agent[bot]
511f5e4272 Revert IGDB handler and test changes
Co-authored-by: gantoine <3247106+gantoine@users.noreply.github.com>
2026-05-27 00:06:50 +00:00
Georges-Antoine Assi
f5b1d44313 changes from bot review 2026-05-26 19:52:16 -04:00
Georges-Antoine Assi
04241169d7 fix 2026-05-26 19:36:38 -04:00
Georges-Antoine Assi
09aecc81bf cleanup 2026-05-26 18:22:12 -04:00
Spinnich
3c2f421dbb fix(screenscraper): inject user credentials for cover, manual, and screenshot downloads
Standard media fields (url_cover, url_manual, url_screenshots) were downloaded
using the stored credential-less URLs, causing them to count against the anonymous
IP quota instead of the user's SS account. Apply add_ss_auth_to_url() at each
download call site in the scan and ROM update paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(screenscraper): guard add_ss_auth_to_url against non-SS URLs

Only inject ssid/sspassword into screenscraper.fr URLs to prevent
leaking user credentials to third-party sources (IGDB, LaunchBox, etc.)
when url_cover/url_manual/url_screenshots originate from other providers.

Add tests for the non-SS no-op and empty-string edge cases.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test(screenscraper): verify SS credentials injected for all media download paths

- TestAddSsAuthToUrl: add guards for non-SS URLs (IGDB, LaunchBox) and
  empty string inputs
- test_update_rom: verify ssid/sspassword appear in url_cover and
  url_manual args passed to get_cover/get_manual for screenscraper.fr
  URLs; verify IGDB URLs are NOT decorated with SS credentials
- TestScanCredentialInjection: verify the scan-path ternary pattern
  correctly applies add_ss_auth_to_url to cover and screenshot URLs,
  and that a None cover URL passes through without error

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test(screenscraper): empirical audit — every SS request carries ssid/sspassword

Intercepts both HTTP clients at the transport/session level to verify
that every outgoing screenscraper.fr request is decorated with the user's
ssid and sspassword credentials:

  aiohttp (API calls via auth_middleware):
  - jeuInfos.php, jeuRecherche.php, ssinfraInfos.php, ssuserInfos.php

  httpx (media downloads via FSResourcesHandler):
  - get_cover          → url_cover
  - get_manual         → url_manual
  - get_rom_screenshots → url_screenshots (each URL)
  - store_media_file   → extra media (fanart, bezel, etc.)

Also verifies the domain guard: IGDB URLs passed through add_ss_auth_to_url
are NOT decorated with SS credentials.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 20:50:18 +00:00
copilot-swe-agent[bot]
536f6ac815 fix: use region-aware release dates for SS and IGDB metadata
Co-authored-by: gantoine <3247106+gantoine@users.noreply.github.com>
2026-05-26 13:21:39 +00:00
Georges-Antoine Assi
fb3cc1da87 perf 2026-05-25 12:00:14 -04:00
Georges-Antoine Assi
9e9a282286 fix(roms): dedupe and sort sibling IDs for stable API output
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 11:35:16 -04:00
Georges-Antoine Assi
5e8fac36a7 Merge branch 'master' into claude/awesome-gates-NUpke 2026-05-25 11:10:04 -04:00
Georges-Antoine Assi
b9b82c751b more cleanup 2026-05-25 11:07:03 -04:00
Georges-Antoine Assi
9a87348c22 cleanup 2026-05-25 11:02:35 -04:00
Claude
95b1a99f2a perf(roms): avoid hydrating full Rom rows for siblings on list endpoint
The paginated ROM list eager-loaded sibling_roms via selectinload, which
hydrated full Rom ORM instances (including heavy JSON metadata columns)
for every sibling even though only an existence/count check was needed
on the frontend. On large collections this dominated request latency.

Split sibling handling by response shape:
- SimpleRomSchema (list): siblings is now list[int]; populated per page
  by a single SELECT against the sibling_roms view projecting only
  (rom_id, sibling_rom_id) — no Rom row hydration.
- DetailedRomSchema (detail): keeps full SiblingRomSchema objects, with
  load_only on (id, name, fs_name_no_tags, fs_name_no_ext) so sibling
  rows stop dragging in JSON metadata.

Frontend usage already only consumes siblings.length on list views; the
detail-page VersionSwitcher continues to receive the richer schema.
2026-05-24 23:17:34 +00:00
Georges-Antoine Assi
9b97e32f54 Merge pull request #3425 from rommapp/claude/ecstatic-dirac-dFQQO
Denormalize ROM file stats for efficient gallery rendering
2026-05-24 19:04:12 -04:00
Georges-Antoine Assi
1a560d3660 cleanup 2026-05-24 18:57:38 -04:00
Georges-Antoine Assi
fb13f54f48 cleanup 2026-05-24 17:43:01 -04:00
Claude
8fcc16bad2 refactor(roms): replace denormalized columns with deferred column_property
Drop the migration and the multi_file / top_level_file_count columns on
roms; express both as deferred column_property correlated subqueries
against rom_files instead. The gallery list and detail queries opt in
via undefer, so they get the values computed in the same SELECT via
indexed subqueries (rom_id index already in place); other code paths
that don't read the flags pay nothing.

This keeps the gallery perf win (no rom_files load for cards) without
introducing schema state that has to stay in sync with rom_files at
write time.
2026-05-24 20:41:44 +00:00
Georges-Antoine Assi
63644d0c6f Merge pull request #3426 from rommapp/claude/loving-darwin-pveIr
Defer optional handler initialization with lazy factories
2026-05-24 16:18:24 -04:00
Georges-Antoine Assi
0eec8b0e47 Merge pull request #3424 from rommapp/copilot/fix-csrf-token-issue
Refresh CSRF cookie on OIDC session authentication changes
2026-05-24 15:57:30 -04:00
Georges-Antoine Assi
be476cb7dc Only set CSRF cookie on http.response.start
ASGI spec only allows headers on the http.response.start message;
appending Set-Cookie to body messages is out-of-spec and may break on
some servers. Early-return for non-start messages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 15:46:50 -04:00
Georges-Antoine Assi
8af556ee46 run fmt 2026-05-24 14:07:45 -04:00
Georges-Antoine Assi
acc1e630b7 Apply suggestions from code review
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-05-24 14:05:58 -04:00
Georges-Antoine Assi
4d8c2ef54b Merge branch 'master' into copilot/fix-folder-tag-import-issue 2026-05-24 13:33:36 -04:00
Georges-Antoine Assi
e7bde8a190 simplify 2026-05-24 13:13:11 -04:00
Georges-Antoine Assi
5fdb69d53c fixes 2026-05-24 13:09:07 -04:00
Claude
5adffeca71 perf(roms): skip rom_files load on gallery list endpoint
The gallery list endpoint was eager-loading every rom_file row for each
paginated ROM via selectinload, then re-joining each row back to its
parent rom for the is_top_level computation. For platforms with extracted
multi-file ROMs (Xbox 360 ~1394 files/ROM, Switch ~199 files/ROM), this
made /api/roms time out at 120s even with a rom_id index.

Cards never displayed individual files — only the has_simple_single_file
/ has_nested_single_file / has_multiple_files booleans that derive from
the file list. Denormalize the underlying state onto roms as multi_file
(folder-based vs single-file) and top_level_file_count, recompute the
booleans from those columns, drop the selectinload from filter_roms, and
move the files field from SimpleRomSchema to DetailedRomSchema so the
gallery payload no longer ships file rows.

Also drop the redundant joinedload(RomFile.rom) and switch the relation
to lazy="select" so subsequent file.rom accesses resolve from the
session identity map instead of re-JOINing the parent rom per file row.

ShowQRCode.vue's folder-based DS/3DS fallback now fetches the detailed
rom on demand, since SimpleRom no longer carries files.
2026-05-24 15:02:02 +00:00
Claude
26bdc11e13 refactor(filesystem): lazy-init launchbox + sync handlers, drop tolerate_missing_base
Apply the same lazy-factory pattern to FSLaunchboxHandler and FSSyncHandler
that ssh_sync_handler now uses. With both opt-in features deferred to
first-use, the tolerate_missing_base escape hatch on FSHandler is no longer
needed — every handler now fails loudly on mkdir failure, which is the
right behavior for the always-on core paths (assets, library, resources).

Touched call sites:
  - resources_handler._resolve_local_file_uri (launchbox)
  - sync_watcher.py, endpoints/device.py, tasks/manual/sync_folder_scan.py
    (fs_sync)

Net effect:
  - Default installs never poke /romm/launchbox or /romm/sync at startup.
  - Misconfigured opt-in users get a clear, actionable PermissionError at
    the call site instead of a silent warning followed by mystery failures.
  - tolerate_missing_base, its tests, and one stale log import are gone.
2026-05-24 14:59:03 +00:00
Claude
1890958ff2 refactor(sync): make SSH sync handler lazy-initialized
The module-level SSHSyncHandler() singleton ran filesystem side effects
(mkdir on SYNC_SSH_KEYS_PATH) at import time, which meant even default
installs with push-pull sync disabled would touch /romm/sync — and the
previous tolerate-and-warn fallback could leave users wondering why
sync silently does nothing.

Replace the eager singleton with a functools.cache'd factory. The
handler is now constructed on first use, so:

  - Default-install users (ENABLE_SYNC_PUSH_PULL=false, no manual sync
    triggered) never touch /romm/sync.
  - Users opting in get a clear, actionable RuntimeError pointing at
    the unwritable path and the env var to override, at the call site
    rather than buried in a startup stack trace.

Also document in env.template that enabling either sync mode requires
a writable volume at $ROMM_BASE_PATH/sync.
2026-05-24 14:49:53 +00:00
copilot-swe-agent[bot]
f94206aa53 Refresh CSRF cookie when auth user changes
Co-authored-by: gantoine <3247106+gantoine@users.noreply.github.com>
2026-05-24 14:46:31 +00:00
Claude
358e91d33c fix(sync): tolerate unwritable SSH keys directory at startup
The SSH sync handler is instantiated at module import time, so any
PermissionError on mkdir would crash the entire app rather than just
disabling the push-pull sync feature. This affected users whose /romm
mount didn't include a writable sync subdirectory (common on Unraid
and similar setups that mount specific subpaths).

Mirror the FSHandler pattern: log a warning and continue. Keys are
expected to be pre-mounted per the module docstring, and
_resolve_key_path already handles a missing directory gracefully.

Fixes #3419
2026-05-24 14:36:14 +00:00
copilot-swe-agent[bot]
03d869e859 Fix gamelist parsing with alternativeEmulator fallback
Co-authored-by: gantoine <3247106+gantoine@users.noreply.github.com>
2026-05-24 14:15:24 +00:00
copilot-swe-agent[bot]
9611ebedb4 Support folder tags in gamelist metadata import
Co-authored-by: gantoine <3247106+gantoine@users.noreply.github.com>
2026-05-24 14:11:01 +00:00
Georges-Antoine Assi
22666631e9 Merge branch 'master' into fix/screenscraper-skip-name-search-after-notgame 2026-05-23 20:54:19 -04:00
Georges-Antoine Assi
599ccb5f14 refactor(screenscraper): return notgame flag as tuple from lookup_rom
Instead of smuggling an internal control flag through the SSRom dict,
lookup_rom now returns (SSRom, is_not_game: bool). scan_handler unpacks
the tuple and short-circuits the name-search fallback when either an
ss_id matched or the hash lookup flagged the entry as notgame.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 20:52:43 -04:00
Georges-Antoine Assi
5c980d82cc fix(screenscraper): wire notgame flag through to scan_handler
The previous commit added a notgame skip-name-search check in scan_handler
but used a different key ("notgame") than what lookup_rom returned
("not_game"), so the fallback was never actually skipped. Align both on
SSRom.not_game, pop the internal flag before returning the SSRom to the
rest of the scan pipeline, and rename the helper to _is_not_game for
consistency with the field name.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 20:22:00 -04:00
Georges-Antoine Assi
1e66436ca1 Merge pull request #3414 from Spinnich/fix/screenscraper-add-user-credentials-media-endpoints
fix(screenscraper): re-add user credentials to media file downloads
2026-05-23 19:57:08 -04:00
Georges-Antoine Assi
ca6ecff4a0 cleanup 2026-05-23 19:07:43 -04:00
Georges-Antoine Assi
dce7101b4f Merge pull request #3407 from rommapp/playmatch-metadata-souce
Add playmatch as explicit metadata source
2026-05-23 13:20:56 -04:00
Spinnich
2ae6e1c07c fix(screenscraper): skip name search after notgame hash lookup
When jeuInfos.php returned a notgame entry (BIOS files, ZZZ hacks, etc.),
lookup_rom() had no notgame check, leading to two bugs:
- notgame entries with a real ID were stored as valid SS matches
- notgame entries with a falsy ID fell through to a jeuRecherche.php name
  search that always returned nothing (pointless quota usage)

Adds _is_notgame() and NOTGAME_NAME_PREFIX, returns SSRom(notgame=True)
from lookup_rom() on a notgame hit, and guards the get_rom() fallback in
scan_handler so the name search is skipped entirely. Also adds the missing
notgame filter to _search_rom() so ZZZ entries can't match by name either.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-23 17:01:41 +00:00
Spinnich
aec1c4ed33 fix(screenscraper): re-add user credentials to media file downloads
SS media URLs are stripped of ssid/sspassword before DB storage (correct),
but downloads were issued against the credential-less stored URLs, causing
them to count against the anonymous IP quota instead of the user's account.

Adds restore_sensitive_query_params() as the principled complement to
strip_sensitive_query_params(), and add_ss_auth_to_url() in ss_handler
which re-attaches credentials at download time without storing them.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-23 15:27:43 +00:00