Commit Graph

298 Commits

Author SHA1 Message Date
Georges-Antoine Assi
aea09911ad cleanup 2026-06-18 07:57:09 -04:00
Georges-Antoine Assi
987e351113 refactor: derive file-name columns via @validates
Centralize the *_no_tags / *_no_ext / *_extension columns (derived from a
file name) behind @validates hooks instead of computing them by hand at
every write site:

- Add pure helpers (compute_file_name_parts and friends) to models.base;
  the filesystem base handler now delegates to them.
- Add @validates on Rom (fs_name), BaseAsset (file_name, inherited by all
  asset subclasses), and Firmware (file_name).
- update_rom keeps the fs_name-derived columns in sync on bulk update(),
  which also fixes the rename path never updating fs_extension.
- Drop the now-redundant computations at the scan/rename call sites.

Also fix the migration backfill loop and a pre-existing list[str | None]
type mismatch surfaced in scan_handler. Add tests for the helpers, the
validators, and the update_rom bulk-sync path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 21:14:16 -04:00
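A minimal sketch of the idea in 987e351113: derive the dependent columns whenever the file name is assigned, instead of at every write site. This is a stand-in only. It uses a plain property setter rather than SQLAlchemy's `@validates`, and the `RomRecord` class, the tag regex, and the return shape of `compute_file_name_parts` are assumptions made for illustration, not the project's actual code.

```python
import os
import re

# Assumption: "tags" are parenthesised or bracketed groups such as (Europe) or [!].
TAG_PATTERN = re.compile(r"\s*[\(\[][^\)\]]*[\)\]]")


def compute_file_name_parts(file_name: str) -> tuple[str, str, str]:
    """Return (name_no_tags, name_no_ext, extension) for a file name."""
    name_no_ext, ext = os.path.splitext(file_name)
    name_no_tags = TAG_PATTERN.sub("", name_no_ext).strip()
    return name_no_tags, name_no_ext, ext.lstrip(".")


class RomRecord:
    """Hypothetical stand-in for a model whose fs_name drives derived columns."""

    def __init__(self, fs_name: str) -> None:
        self.fs_name = fs_name  # goes through the setter below

    @property
    def fs_name(self) -> str:
        return self._fs_name

    @fs_name.setter
    def fs_name(self, value: str) -> None:
        # Every assignment recomputes the derived fields, so a rename
        # cannot leave fs_extension stale.
        self._fs_name = value
        (
            self.fs_name_no_tags,
            self.fs_name_no_ext,
            self.fs_extension,
        ) = compute_file_name_parts(value)
```

Because the derivation hangs off the assignment itself, a rename such as `rom.fs_name = "Super Game (USA).smc"` updates the extension without any extra call at the call site.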
Zurdi
b52c5f1ae7 Merge branch 'master' into chore/frontend-v2 2026-06-17 15:43:08 +02:00
zurdi
2a53669459 fix: prevent crash on startup by bootstrapping library structure A when none is detected 2026-06-17 14:20:58 +02:00
zurdi
9f6138d010 Merge branch 'master' into chore/frontend-v2
Adopt master's ROM schema design (sibling_roms + files, batched
get_files_for_roms / get_siblings_for_roms) while preserving the v2-branch
features master lacks: per-user is_main_sibling on siblings and audio_meta
on rom files.

Conflict resolution:
- responses/rom.py: keep master's sibling_roms/files fields; re-graft
  is_main_sibling via SiblingRomSchema.from_rom(rom, is_main_sibling=...);
  restore the eager-relationship fallback in
  SimpleRomSchema.from_orm_with_request (None sentinel) so the v2
  /{id}/simple endpoint still returns siblings/files.
- roms_handler.py: get_siblings_for_roms now left-joins RomUser and returns
  (Rom, is_main_sibling) tuples; keep both branch and master file helpers.
- drop the redundant branch-only sibling_ids field and
  get_sibling_data_for_roms.
- generated types resolved to match (sibling_roms + files; RomFileSchema
  keeps audio_meta and gains archive_members).
- update v2 components and the RelatedGameCard mock to read sibling_roms.
- fix stale exclude={"siblings"} -> "sibling_roms" in scan emit payloads.
- re-chain the audio_meta migration as 0083 (after master's 0082) to keep a
  single Alembic head.
- package.json: union of branch tooling + master dependency bumps; lock
  regenerated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 01:19:55 +02:00
Georges-Antoine Assi
49ef5097e4 fix: split structure detection into has_structure_path_a, fail loudly on bad layouts
Extract has_structure_path_a as its own cached property and have
has_structure_path_b delegate to it, removing duplicated isdir checks.
detect_library_structure and get_platforms_directory now read the named
properties instead of re-implementing the roms-path check inline.

Keep the inconclusive/bad-structure fallback defaulting to Structure A so
a malformed library raises FolderStructureNotMatchException rather than
listing the bare library root as a flat list of platforms.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 14:59:01 -04:00
copilot-swe-agent[bot]
788d454d98 fix: prioritize Structure A over Structure B in library structure detection
When a top-level `roms/` folder exists (Structure A), never detect the
library as Structure B, even if individual `<platform>/roms/` directories
also exist. This prevents existing Structure A libraries from being broken
after upgrading to 4.9.0.

- `has_structure_path_b` in `config_manager.py` now returns `False` early
  when `{LIBRARY_BASE_PATH}/{ROMS_FOLDER_NAME}` is an existing directory
- `detect_library_structure()` in `platforms_handler.py` now explicitly
  checks Structure A (`os.path.exists(roms_path)`) before consulting
  `cnfg.has_structure_path_b`
- Updated test to assert Structure A wins when both layouts coexist

Co-authored-by: gantoine <3247106+gantoine@users.noreply.github.com>
2026-06-12 16:35:01 +00:00
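The precedence rule in 788d454d98 can be sketched as follows. The function name, its signature, and the `None` return for an inconclusive layout are assumptions for illustration; only the ordering (a top-level `roms/` directory wins over any `<platform>/roms/` directory) comes from the commit message.

```python
import os


def detect_structure_sketch(library_base_path: str, roms_folder_name: str = "roms"):
    """Return "A", "B", or None for a library root (hypothetical helper)."""
    # Structure A: <library>/roms/<platform>. Checked first so it always wins.
    if os.path.isdir(os.path.join(library_base_path, roms_folder_name)):
        return "A"

    # Structure B: <library>/<platform>/roms. Only consulted when A is absent.
    with os.scandir(library_base_path) as entries:
        for entry in entries:
            if entry.is_dir() and os.path.isdir(
                os.path.join(entry.path, roms_folder_name)
            ):
                return "B"

    # Neither layout found; the caller decides how to handle this case.
    return None
```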
Spinnich
1d9963ac63 fix(hashing): compute RA hash for archive ROMs on cartridge platforms
The archive branch of get_rom_files (introduced in #3412) was missing
the RAHasherService.calculate_hash call that exists in the non-archive
branch, causing all archive-format ROMs to produce an empty ra_hash
during scanning regardless of platform.

The RA hash call is now made for archive ROMs, mirroring the existing
non-archive behaviour. The RA_BUFFER_HASH_UNSUPPORTED skip logic in
RAHasherService already handles disc-based platforms (PSX, PS2, PSP,
Saturn, Dreamcast, etc.) so those continue to be excluded automatically.

Also improves handling of folder-based multi-file ROMs whose directories
contain compressed files. RAHasher cannot process archives via the /*
glob and fails with "Could not open file". The fix mirrors the existing
CHD folder logic: for cartridge platforms the largest archive in the
folder is passed directly to RAHasher for buffer hashing; for disc
platforms the call is skipped as buffer hashing is unsupported.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 14:55:42 +00:00
Georges-Antoine Assi
10d731d823 cleanup 2026-05-29 11:58:53 -04:00
Georges-Antoine Assi
ae60d14f81 Merge branch 'master' into feat/composite-hashing-archives 2026-05-29 11:50:17 -04:00
nendo
db0f714b4f SaveSync: use pathlib joins for asset content-hash paths
FSAssetsHandler.compute_content_hash and _compute_zip_hash were
building full paths via f"{self.base_path}/{file_path}". self.base_path
is already a pathlib.Path (resolved by FSHandler.__init__), so the
f-string forced it to str, hard-coded the separator, and re-parsed --
fine on Linux but a footgun if a caller ever sneaks a leading slash or
the path needs Path semantics elsewhere.

Switch both spots to self.base_path / file_path, which is what every
other FSHandler subclass in this module already does (e.g.
FSRomsHandler, FSResourcesHandler, FSSyncHandler all join Path objects
directly).
2026-05-29 17:40:56 +09:00
nendo
edb5d15420 Fix save-sync hash drift, archival save leak, and dedupe scoping
Cleanup pass on save-sync addressing three independent failure modes
that interact in production data: content_hash drift between client
and server, null-slot archival saves leaking into sync flows, and
content-hash dedupe collapsing legitimately-distinct slots.

Bug fixes
- compute_content_hash dispatched on zipfile.is_zipfile(relative_path),
  which silently returned False whenever the process's CWD wasn't
  ASSETS_BASE_PATH. Every zip save fell through to the raw-MD5 branch,
  persisting hashes that disagreed with clients computing the intended
  per-entry zip-hash. Resolve to a full path before the dispatch.
- _build_negotiate_plan, sync_push_pull_task, and sync_watcher all
  treated null-slot saves as sync-eligible. Null-slot saves represent
  web-UI / archival uploads; including them in negotiate plans matched
  them against device pushes by filename and overwrote archival data.
  Filter null-slot saves at all three call sites.
- get_save_by_content_hash matched on (rom_id, user_id, content_hash)
  only, so identical bytes uploaded to different slots collapsed into
  one record. Scope the lookup by slot when provided so clone-save-
  to-new-slot creates a distinct row per slot.
- get_save_by_filename matched on (rom_id, user_id, file_name) only.
  When two uploads to different slots happened in the same wall-clock
  second (the datetime tag is per-second), the second upload UPDATED
  the first record's slot instead of creating a distinct row. Scope
  the filename lookup by slot too.

One-shot recovery
- New recompute_save_content_hashes manual task walks every Save row,
  recomputes via the fixed dispatch, and updates rows whose values
  differ. Idempotent; safe to re-run.
- Backend startup runs a COUNT(content_hash IS NULL) query and, if
  any rows exist, enqueues the recompute task on the low-priority
  RQ queue. The API process moves on; the worker handles the
  recompute out-of-band. Subsequent restarts find zero NULL hashes
  and skip. Admins can also trigger the task manually.

Test infrastructure
- Added tests/_zipfile_shim.reload_zipfile() mirroring the pattern
  from utils/zip_cache.py for the same zipfile-inflate64 + CPython
  3.13.5 incompatibility. Test fixtures that build ZIPs call it
  immediately before opening the archive.
2026-05-29 17:00:01 +09:00
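The first bug fix in edb5d15420 comes down to `zipfile.is_zipfile` returning `False` (rather than raising) when a relative path cannot be opened from the current working directory. A minimal sketch of the corrected dispatch is below. The function names are hypothetical, and the per-entry scheme shown (MD5 over each sorted entry name followed by its bytes) is an assumption, since the commit does not spell out the actual zip-hash format.

```python
import hashlib
import zipfile
from pathlib import Path


def _zip_hash_sketch(full_path: Path) -> str:
    # Assumed scheme: hash entry names and contents in sorted order, so the
    # result does not depend on zip metadata such as timestamps.
    digest = hashlib.md5()
    with zipfile.ZipFile(full_path) as zf:
        for name in sorted(zf.namelist()):
            digest.update(name.encode("utf-8"))
            digest.update(zf.read(name))
    return digest.hexdigest()


def compute_content_hash_sketch(base_path: Path, file_path: str) -> str:
    # Resolve against the base path BEFORE asking whether it is a zip;
    # is_zipfile() on a bare relative path silently returns False when the
    # process CWD is somewhere else.
    full_path = base_path / file_path
    if zipfile.is_zipfile(full_path):
        return _zip_hash_sketch(full_path)
    return hashlib.md5(full_path.read_bytes()).hexdigest()
```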
Georges-Antoine Assi
207d0dc4c6 feat(hashing): persist per-member hashes on archive RomFile
Internal members of multi-file archives (zip/tar/7z/rar) are now hashed
individually (crc/md5/sha1) and stored in a new `archive_members` JSON
column on the archive's RomFile, alongside the existing composite hash
used for hash-database matching. Only the archive itself is surfaced as
a RomFile so full_path keeps pointing at a file that exists on disk,
which is the constraint that previously forced us to choose between
composite-only or broken downloads.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 09:41:04 -04:00
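A rough sketch of what 207d0dc4c6 describes, limited to zip archives: each member is streamed once, feeding both its own crc/md5/sha1 and a composite hash for the archive. The function name, the dictionary layout, and the choice to feed the composite in archive order are assumptions for illustration; the real `archive_members` column format is not shown in the commit message.

```python
import hashlib
import zipfile
import zlib

CHUNK_SIZE = 1024 * 1024  # read members in 1 MiB chunks


def hash_archive_members(archive):
    """Return (composite_hashes, per_member_hashes) for a zip archive."""
    composite_crc = 0
    composite_md5 = hashlib.md5()
    composite_sha1 = hashlib.sha1()
    members = []

    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            crc = 0
            md5 = hashlib.md5()
            sha1 = hashlib.sha1()
            with zf.open(info) as fh:
                while chunk := fh.read(CHUNK_SIZE):
                    # Per-member hashes.
                    crc = zlib.crc32(chunk, crc)
                    md5.update(chunk)
                    sha1.update(chunk)
                    # Composite hash over all member bytes.
                    composite_crc = zlib.crc32(chunk, composite_crc)
                    composite_md5.update(chunk)
                    composite_sha1.update(chunk)
            members.append(
                {
                    "name": info.filename,
                    "crc": f"{crc:08x}",
                    "md5": md5.hexdigest(),
                    "sha1": sha1.hexdigest(),
                }
            )

    composite = {
        "crc": f"{composite_crc:08x}",
        "md5": composite_md5.hexdigest(),
        "sha1": composite_sha1.hexdigest(),
    }
    return composite, members
```

Only the archive path itself would be recorded as the file on disk; the member list is metadata attached to it.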
Georges-Antoine Assi
9111f70d0a refactor(filesystem): merge archive_7zip.py into archives.py
Consolidate all archive readers (zip/tar/7z/rar) and 7z-internal helpers
into a single utils/archives.py module to keep the archive surface area
in one place.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 09:10:01 -04:00
Georges-Antoine Assi
a1194dc5e0 changes from bot review 2026-05-28 09:02:26 -04:00
zurdi
6274f83716 Merge remote-tracking branch 'origin/feat/soundtrack-support' into chore/frontend-v2 2026-05-28 09:24:54 +00:00
Zurdi
f8bbd85d23 Merge branch 'master' into feat/soundtrack-support 2026-05-28 11:20:58 +02:00
Georges-Antoine Assi
a170649fe6 fix(hashing): emit single RomFile for multi-file archives
Per-internal-member RomFiles produced full_paths that didn't exist on
disk, breaking downloads and zip-building. Stream entries into the
composite hash only and emit one RomFile pointing at the archive itself.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 21:35:01 -04:00
Georges-Antoine Assi
0bfe369425 run fmt 2026-05-27 21:03:08 -04:00
Georges-Antoine Assi
30451d5651 fix(security): move SSRF defense into the HTTP client path
The previous validator did a preflight `socket.getaddrinfo` before each
httpx request. Two problems:

  * DNS rebinding / TOCTOU: httpx re-resolves at connect time, so a
    hostname can answer with a public IP for the validator and a
    private IP for the real request. The preflight check did not
    constrain the connection.
  * Event-loop blocking: `socket.getaddrinfo` is synchronous, and the
    media-download callers are async. Slow resolvers stalled
    unrelated requests.

Replace it with two layers, both wired automatically onto every httpx
client built by `utils.context`:

  1. A request event hook running `validate_url_for_http_request`
     (syntactic checks only: scheme, reserved hostnames, literal IPs,
     internal TLDs). No DNS, no call-site responsibility.
  2. `SSRFProtectedAsyncBackend` / `SSRFProtectedSyncBackend`, custom
     httpcore network backends that resolve the hostname inside
     `connect_tcp`, reject any address in a forbidden range, then
     connect to that *same* validated address. The async variant uses
     `loop.getaddrinfo` so it doesn't block the loop. httpcore calls
     `start_tls(server_hostname=<URL host>)` after `connect_tcp`, so
     TLS SNI and cert verification still use the original hostname
     even though the TCP layer connects by IP.

Drop the explicit `validate_url_for_http_request(...)` calls from
`resources_handler.py` — the event hook covers them. Consolidate the
URL validator and its tests under `utils/ssrf.py` /
`tests/utils/test_ssrf.py` so the SSRF surface lives in one module.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 17:58:14 -04:00
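The core of the second layer in 30451d5651 is "resolve once, validate every answer, then connect to that same address". A standard-library sketch of that check is below. It does not use httpcore, and `ForbiddenAddressError`, `is_forbidden_address`, and `resolve_validated` are hypothetical names; the exact set of forbidden ranges in the project may differ.

```python
import ipaddress
import socket


class ForbiddenAddressError(ValueError):
    """Raised when a hostname resolves to an address we refuse to contact."""


def is_forbidden_address(address: str) -> bool:
    ip = ipaddress.ip_address(address)
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def resolve_validated(host: str, port: int) -> list[str]:
    """Resolve host and return its addresses, rejecting any forbidden one."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = [info[4][0] for info in infos]
    for address in addresses:
        if is_forbidden_address(address):
            raise ForbiddenAddressError(f"{host} resolves to {address}")
    # The caller should connect to one of these literal addresses rather than
    # re-resolving the hostname, which is what closes the rebinding window.
    return addresses
```

In an async caller the blocking `socket.getaddrinfo` call would be replaced by the event loop's `getaddrinfo`, as the commit message notes.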
Zurdi
7839a01702 Merge branch 'master' into feat/soundtrack-support 2026-05-27 21:33:04 +02:00
Georges-Antoine Assi
acff688f11 refactor(hashing): use _make_file_hash helper at remaining sites
Apply the helper to the three other per-file FileHash constructions
(folder-walk hash, empty-archive fallback, single-file hash). The
all-empty FileHash literals are left alone since the helper would be
strictly more obscure for that case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 09:12:11 -04:00
Georges-Antoine Assi
f255b5a7d9 feat(hashing): add RAR support to multi-file archive composite hashing
Add read_rar_archive_files via the existing 7zz binary (which natively
handles RAR3/RAR5 read), and collapse the per-extension reader dispatch
into an ARCHIVE_READERS dict so future formats are one entry away. Also
extract a small _make_file_hash helper to remove the repeated nested
ternaries in the inner loop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 09:09:37 -04:00
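The dispatch-table idea from f255b5a7d9 might look roughly like this. Only the zip and tar readers are sketched, using the standard library; the 7z and RAR readers that shell out to the 7zz binary are omitted. The reader signatures and the `read_archive_files` wrapper are assumptions.

```python
import tarfile
import zipfile
from pathlib import Path
from typing import Callable, Iterator

Member = tuple[str, bytes]


def read_zip_archive_files(path: Path) -> Iterator[Member]:
    with zipfile.ZipFile(path) as zf:
        for info in zf.infolist():
            if not info.is_dir():
                yield info.filename, zf.read(info)


def read_tar_archive_files(path: Path) -> Iterator[Member]:
    with tarfile.open(path) as tf:
        for member in tf:
            if member.isfile():
                fh = tf.extractfile(member)
                if fh is not None:
                    yield member.name, fh.read()


# Adding a format means adding one entry here.
ARCHIVE_READERS: dict[str, Callable[[Path], Iterator[Member]]] = {
    ".zip": read_zip_archive_files,
    ".tar": read_tar_archive_files,
}


def read_archive_files(path: Path) -> Iterator[Member]:
    reader = ARCHIVE_READERS.get(path.suffix.lower())
    if reader is None:
        raise ValueError(f"unsupported archive type: {path.suffix}")
    return reader(path)
```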
Georges-Antoine Assi
438c03facc refactor(filesystem): extract archive/CHD helpers to utils/archives.py
Pull file/archive readers (zip/tar/gz/bz2/7z), CHD parsing, and the
shared libmagic MIME detector out of roms_handler.py into a new
utils/archives.py. Rename the previously underscore-prefixed
read_zip_archive_files / read_tar_archive_files to match the existing
read_7z_archive_files convention, and consolidate the duplicated
"with lock: detector.from_file()" pattern into a detect_mime_type helper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 08:41:45 -04:00
zurdi
17ea8da23d Merge remote-tracking branch 'origin/master' into chore/frontend-v2 2026-05-25 07:56:38 +00:00
Claude
26bdc11e13 refactor(filesystem): lazy-init launchbox + sync handlers, drop tolerate_missing_base
Apply the same lazy-factory pattern to FSLaunchboxHandler and FSSyncHandler
that ssh_sync_handler now uses. With both opt-in features deferred to
first-use, the tolerate_missing_base escape hatch on FSHandler is no longer
needed — every handler now fails loudly on mkdir failure, which is the
right behavior for the always-on core paths (assets, library, resources).

Touched call sites:
  - resources_handler._resolve_local_file_uri (launchbox)
  - sync_watcher.py, endpoints/device.py, tasks/manual/sync_folder_scan.py
    (fs_sync)

Net effect:
  - Default installs never poke /romm/launchbox or /romm/sync at startup.
  - Misconfigured opt-in users get a clear, actionable PermissionError at
    the call site instead of a silent warning followed by mystery failures.
  - tolerate_missing_base, its tests, and one stale log import are gone.
2026-05-24 14:59:03 +00:00
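One way to picture the lazy-factory pattern from 26bdc11e13 is a cached accessor that builds the handler on first use. `SyncHandlerSketch` and `get_sync_handler` are hypothetical names, and the real handlers do more than create a directory.

```python
from functools import cache
from pathlib import Path


class SyncHandlerSketch:
    """Stand-in for an opt-in filesystem handler."""

    def __init__(self, base_path: str) -> None:
        self.base_path = Path(base_path)
        # No tolerance flag: a mkdir failure (e.g. PermissionError)
        # propagates to whoever asked for the handler.
        self.base_path.mkdir(parents=True, exist_ok=True)


@cache
def get_sync_handler(base_path: str = "/romm/sync") -> SyncHandlerSketch:
    # Nothing touches the filesystem at import time; the directory is only
    # created when a caller first requests the handler.
    return SyncHandlerSketch(base_path)
```

Call sites use `get_sync_handler()` instead of a module-level instance, so an install that never uses the feature never touches the path.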
Spinnich
242dc9e357 fix(hashing): use only default exclusions for archive internal files
User-configured EXCLUDED_MULTI_PARTS_EXT/FILES are intentionally not
applied to archive internal files. Archives are curated ROM sets where
every file is relevant — user custom exclusions (e.g. "bin") could
silently produce incorrect composite hashes. Only the hardcoded
DEFAULT_EXCLUDED_FILES/EXTENSIONS (junk like .DS_Store, gamelist.xml)
are applied.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-23 12:28:49 +00:00
Spinnich
a9f9ea2edc fix(hashing): address trunk lint issues in composite archive hashing
- Use AnyioPath.stat() instead of os.path.getmtime in async context (ASYNC240)
- Add assert to narrow rom_md5_h/rom_sha1_h from HASH|None to HASH (mypy/union-attr)
- Auto-formatted long log.error calls in archive_7zip.py (ruff)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-23 12:14:39 +00:00
Spinnich
c20d48bbf8 feat(hashing): compute both composite hash & individual files hash for multi-file archives
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-23 12:04:04 +00:00
Georges-Antoine Assi
1be2ca2b3c simplify 2026-05-21 17:17:30 -04:00
copilot-swe-agent[bot]
98bc9a9eea Optimize multi-ROM exclusion matching pass
Co-authored-by: gantoine <3247106+gantoine@users.noreply.github.com>
2026-05-21 18:52:55 +00:00
copilot-swe-agent[bot]
5a1e238a5f perf: pre-normalize exclusions once and use set for O(1) lookup in exclude_multi_roms
Co-authored-by: gantoine <3247106+gantoine@users.noreply.github.com>
2026-05-21 18:50:45 +00:00
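The optimisation in 5a1e238a5f amounts to normalising the exclusion list once, outside the loop, and testing exact names against a set. A sketch under assumptions: the function name is hypothetical, matching is taken to be case-insensitive, and wildcard entries are handled with `fnmatch` as a separate, slower path.

```python
from fnmatch import fnmatchcase

WILDCARD_CHARS = "*?["


def filter_excluded_sketch(roms: list[str], exclusions: list[str]) -> list[str]:
    # Normalise once instead of lowercasing the exclusions per ROM.
    normalized = [entry.lower() for entry in exclusions]
    exact = {e for e in normalized if not any(c in e for c in WILDCARD_CHARS)}
    patterns = [e for e in normalized if e not in exact]

    kept = []
    for rom in roms:
        name = rom.lower()
        if name in exact:  # O(1) set lookup
            continue
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            continue
        kept.append(rom)
    return kept
```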
copilot-swe-agent[bot]
9e3f85b085 Fix ES-DE multi-folder exclusion matching
Agent-Logs-Url: https://github.com/rommapp/romm/sessions/2213cb94-9971-48a6-8d17-9efc5c209db4

Co-authored-by: gantoine <3247106+gantoine@users.noreply.github.com>
2026-05-21 11:22:21 +00:00
Georges-Antoine Assi
94d011ee5e tolerate launchbox basepath 2026-05-21 06:56:36 -04:00
Georges-Antoine Assi
405f678514 Merge pull request #3388 from rommapp/hardlink-resources-gamelist
feat(fs): hardlink import/export assets, harden sync init
2026-05-19 09:04:18 -04:00
Georges-Antoine Assi
adb050f164 commit and push 2026-05-19 07:31:25 -04:00
Georges-Antoine Assi
f84796da08 Merge pull request #3385 from Spinnich/pr/chd-raw-hashing
feat(hashing): compute raw CHD hashes and route disc-data SHA1 to Hasheous
2026-05-18 14:52:54 -04:00
Georges-Antoine Assi
591b07ec49 changes from bot review 2026-05-18 14:44:52 -04:00
Georges-Antoine Assi
e6d4ede939 cleanup 2026-05-18 07:40:59 -04:00
Georges-Antoine Assi
757fafae5f feat(fs): hardlink import/export assets when possible, harden sync init
Importer (gamelist/launchbox file:// flows) and exporters (gamelist.xml,
metadata.pegasus.txt local exports) now hardlink media assets when source
and destination share a filesystem, falling back transparently to a copy
on EXDEV / EPERM / EOPNOTSUPP / EMLINK / EACCES (cross-device, FAT32,
exFAT, network mounts, etc.). Saves disk space and is effectively
instantaneous on large files (videos, manuals, miximages).

Covers keep a real copy (allow_link=False) because _store_cover resizes
the small cover in place via PIL.Image.save, which would truncate the
shared inode and corrupt the user's source image.

Also makes FSSyncHandler tolerate a missing/unwritable /romm/sync at
startup: an OSError from mkdir now logs a warning instead of crashing
the whole app at module-import time. Sync calls still fail at use time
if the mount remains broken — the right place to surface the error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 07:38:11 -04:00
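The hardlink-with-fallback behaviour in 757fafae5f can be sketched with the standard library. `link_or_copy` is a hypothetical name; the errno set and the `allow_link` flag mirror the commit message.

```python
import errno
import os
import shutil
from pathlib import Path

# Errors that mean "hardlinking is not possible here", not "something broke".
LINK_FALLBACK_ERRNOS = {
    errno.EXDEV,       # source and destination on different filesystems
    errno.EPERM,       # filesystem does not permit hardlinks (e.g. FAT32)
    errno.EOPNOTSUPP,  # operation not supported (some network mounts)
    errno.EMLINK,      # too many links to the source file
    errno.EACCES,      # permission denied for linking
}


def link_or_copy(src: Path, dst: Path, allow_link: bool = True) -> bool:
    """Place src at dst. Return True if hardlinked, False if copied."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    if allow_link:
        try:
            os.link(src, dst)
            return True
        except OSError as exc:
            if exc.errno not in LINK_FALLBACK_ERRNOS:
                raise
    # Real copy: required when the destination will later be modified in
    # place, since writing through a hardlink would alter the source too.
    shutil.copy2(src, dst)
    return False
```

Covers would be stored with `allow_link=False`, so the in-place resize never touches the user's original image.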
Georges-Antoine Assi
90945685e4 Stuff 2026-05-17 12:43:33 -04:00
Spinnich
01f0b1d2b5 feat(hashing): compute raw CHD hashes and route disc-data SHA1 to Hasheous
CHD files now follow the same hash logic as all other file types — CRC32,
MD5, and SHA1 are computed from raw container bytes. This allows
ScreenScraper to log KO entries for unrecognised CHD files, which it
could not do when only the disc-data SHA1 was being computed.

The CHD header SHA1 (disc-data SHA1) is separately extracted and stored
in a new chd_sha1_hash field on RomFile, with a migration adding the
column to rom_files. Hasheous receives only this disc-data SHA1 (no
CRC/MD5) since it indexes disc-based games by disc-data SHA1, not raw
file hashes.

The RAHasher multi-file path now passes the largest CHD directly instead
of a /* wildcard, which RAHasher cannot expand. Hash computations are
wrapped in asyncio.to_thread to avoid blocking the event loop during
large reads.

Hash-lookup metadata handlers (ScreenScraper, Hasheous, Playmatch) now
fall back to rom.files (stored DB hashes) when fs_rom files are not
rehashed, fixing hash-based matching for UNMATCHED and UPDATE scan types.

The Disc SHA-1 is displayed in the ROM detail view for both single-file
(FileInfo.vue) and multi-file (FileSelectItem.vue) CHD games.
2026-05-17 08:01:05 -04:00
Georges-Antoine Assi
c6a2f56fad Merge pull request #3367 from rommapp/regional-provider-tags
Prefer ROM's own region tag for ScreenScraper and IGDB artwork
2026-05-13 11:19:53 -04:00
Georges-Antoine Assi
dad1250e15 case-insensitive region lookup for provider shortcode mapping
Rom.regions can contain raw filename text like "europe" or "EUROPE"
(filename parsing in roms_handler doesn't normalize casing), so the
direct dict lookup missed those tags and the locale silently fell back
to scan.priority.region. Replace the dict access with a helper that
lowercases both sides.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 09:34:10 -04:00
Georges-Antoine Assi
944514acc0 prefer rom's own region tag for ScreenScraper and IGDB artwork
When a ROM filename carries a region tag (e.g. (Europe)), use that
region first when picking artwork and localized titles, falling back to
the configured scan.priority.region. Previously the configured priority
was the only signal, so a US-first config would force US covers onto
European ROMs even when an EU asset was available.

Adds a shared name->provider-shortcode map and threads the rom through
the IGDB and SS lookup APIs so the rom-aware locale/region selection
can run for both providers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 09:06:11 -04:00
Georges-Antoine Assi
d8ef6f0c05 Merge branch 'master' into local-lb-fix 2026-05-09 13:20:31 -04:00
Georges-Antoine Assi
e3aaa106a2 perf(backend): reuse libmagic instance for image upload validation
magic.Magic(mime=True) loads the magic database from disk on construction;
instantiating it per request was adding pointless overhead to every avatar
and artwork upload. Share a module-level instance guarded by a lock (the
underlying magic_t handle is not thread-safe), and surface MagicException
as a 400 so a sniffing failure fails closed instead of bubbling a 500.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 10:14:38 -04:00
Georges-Antoine Assi
53f14f5710 fix(backend): validate uploaded images with libmagic before storing
Avatar, ROM artwork, and collection artwork uploads now sniff the file
header with libmagic and reject anything that isn't PNG/JPEG/WebP/GIF,
saving the file with an extension derived from the detected MIME rather
than the user-supplied filename. Pairs with the raw asset endpoint,
which decides inline vs attachment from the on-disk extension.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 09:18:02 -04:00
Georges-Antoine Assi
5e3a2707b0 cleanup 2026-05-03 19:39:19 -04:00
copilot-swe-agent[bot]
da005cf81a Optimize fnmatch check and use consistent n64 filename in test
Agent-Logs-Url: https://github.com/rommapp/romm/sessions/8cbbc2ca-a3e3-4c61-9e47-f8544d59231a

Co-authored-by: gantoine <3247106+gantoine@users.noreply.github.com>
2026-05-03 23:36:23 +00:00