Apply the helper to the three other per-file FileHash constructions
(folder-walk hash, empty-archive fallback, single-file hash). The
all-empty FileHash literals are left alone since the helper would be
strictly more obscure for that case.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add read_rar_archive_files via the existing 7zz binary (which natively
reads RAR3/RAR5), and collapse the per-extension reader dispatch
into an ARCHIVE_READERS dict so future formats are one entry away. Also
extract a small _make_file_hash helper to remove the repeated nested
ternaries in the inner loop.
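
A minimal sketch of the dict-based dispatch described above. The reader
signature (a path in, (name, bytes) pairs out) and the get_archive_reader
helper are assumptions for illustration, not the actual code in this change;
only stdlib-backed readers are shown.

```python
import tarfile
import zipfile
from pathlib import Path
from typing import Callable, Iterator, Optional, Tuple

# Assumed reader signature: yields (member name, raw bytes) pairs.
ArchiveReader = Callable[[Path], Iterator[Tuple[str, bytes]]]


def read_zip_archive_files(path: Path) -> Iterator[Tuple[str, bytes]]:
    """Yield every regular file stored in a zip archive."""
    with zipfile.ZipFile(path) as zf:
        for info in zf.infolist():
            if not info.is_dir():
                yield info.filename, zf.read(info)


def read_tar_archive_files(path: Path) -> Iterator[Tuple[str, bytes]]:
    """Yield every regular file stored in a tar archive."""
    with tarfile.open(path) as tf:
        for member in tf.getmembers():
            if member.isfile():
                extracted = tf.extractfile(member)
                if extracted is not None:
                    yield member.name, extracted.read()


# Supporting a new format means adding one entry here.
ARCHIVE_READERS = {
    ".zip": read_zip_archive_files,
    ".tar": read_tar_archive_files,
}


def get_archive_reader(path: Path) -> Optional[ArchiveReader]:
    """Hypothetical lookup helper: pick a reader by lowercased suffix."""
    return ARCHIVE_READERS.get(path.suffix.lower())
```

A caller would look up the reader once and fall back to plain-file hashing
when the lookup returns None.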
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull file/archive readers (zip/tar/gz/bz2/7z), CHD parsing, and the
shared libmagic MIME detector out of roms_handler.py into a new
utils/archives.py. Rename the previously underscore-prefixed
read_zip_archive_files / read_tar_archive_files to match the existing
read_7z_archive_files convention, and consolidate the duplicated
"with lock: detector.from_file()" pattern into a detect_mime_type helper.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User-configured EXCLUDED_MULTI_PARTS_EXT/FILES are intentionally not
applied to archive internal files. Archives are curated ROM sets where
every file is relevant — user custom exclusions (e.g. "bin") could
silently produce incorrect composite hashes. Only the hardcoded
DEFAULT_EXCLUDED_FILES/EXTENSIONS (junk like .DS_Store, gamelist.xml)
are applied.
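
The rule above could be sketched roughly as follows. The set contents are
assumptions apart from .DS_Store and gamelist.xml (the "tmp" entry is
hypothetical), and filter_archive_members is an illustrative name, not a
function from this change.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Only the hardcoded defaults are consulted for archive members.
DEFAULT_EXCLUDED_FILES = frozenset({".DS_Store", "gamelist.xml"})
DEFAULT_EXCLUDED_EXTENSIONS = frozenset({"tmp"})  # hypothetical entry


def filter_archive_members(names: Iterable[str]) -> List[str]:
    """Drop junk members; user-configured exclusions are deliberately ignored."""
    kept = []
    for name in names:
        member = PurePosixPath(name)
        if member.name in DEFAULT_EXCLUDED_FILES:
            continue
        if member.suffix.lstrip(".").lower() in DEFAULT_EXCLUDED_EXTENSIONS:
            continue
        kept.append(name)
    return kept
```

With this shape, a member such as "disc/track01.bin" is always kept, even if
the user has "bin" in EXCLUDED_MULTI_PARTS_EXT.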
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Use AnyioPath.stat() instead of os.path.getmtime in async context (ASYNC240)
- Add assert to narrow rom_md5_h/rom_sha1_h from HASH|None to HASH (mypy/union-attr)
- Auto-format long log.error calls in archive_7zip.py (ruff)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CHD files now follow the same hash logic as all other file types — CRC32,
MD5, and SHA1 are computed from raw container bytes. This allows
ScreenScraper to log KO entries for unrecognised CHD files, which it
could not do when only the disc-data SHA1 was being computed.
The CHD header SHA1 (disc-data SHA1) is separately extracted and stored
in a new chd_sha1_hash field on RomFile, with a migration adding the
column to rom_files. Hasheous receives only this disc-data SHA1 (no
CRC/MD5) since it indexes disc-based games by disc-data SHA1, not raw
file hashes.
The RAHasher multi-file path now passes the largest CHD directly instead
of a /* wildcard, which RAHasher cannot expand. Hash computations are
wrapped in asyncio.to_thread to avoid blocking the event loop during
large reads.
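
For illustration, hashing raw container bytes off the event loop might look
like the sketch below. The function names, chunk size, and returned dict
shape are assumptions, not the code in this change.

```python
import asyncio
import hashlib
import zlib
from pathlib import Path
from typing import Dict

CHUNK_SIZE = 1024 * 1024  # assumed read size: 1 MiB per chunk


def hash_file_sync(path: Path) -> Dict[str, str]:
    """Compute CRC32, MD5, and SHA1 over the raw bytes of a file."""
    crc = 0
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(CHUNK_SIZE)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
            md5.update(chunk)
            sha1.update(chunk)
    return {
        "crc": format(crc, "08x"),
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
    }


async def hash_file(path: Path) -> Dict[str, str]:
    """Run the blocking read in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(hash_file_sync, path)
```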
Hash-lookup metadata handlers (ScreenScraper, Hasheous, Playmatch) now
fall back to rom.files (stored DB hashes) when fs_rom files are not
rehashed, fixing hash-based matching for UNMATCHED and UPDATE scan types.
The disc-data SHA1 is displayed in the ROM detail view for both single-file
(FileInfo.vue) and multi-file (FileSelectItem.vue) CHD games.
Replace the single HIGH_PRIO_STRUCTURE_PATH config attribute with two
glob patterns (STRUCTURE_PATH_A = roms/*, STRUCTURE_PATH_B = */roms) and
update all call sites to detect Structure B via glob.glob, defaulting to
Structure A when no match is found.
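
A rough sketch of the detection, assuming the patterns are resolved relative
to a library base directory. detect_structure and the directory check are
illustrative assumptions; the real call sites may differ.

```python
import glob
import os

STRUCTURE_PATH_A = "roms/*"   # {library}/roms/{platform}
STRUCTURE_PATH_B = "*/roms"   # {library}/{platform}/roms


def detect_structure(library_base: str) -> str:
    """Return "B" if any {platform}/roms directory exists, otherwise "A"."""
    matches = glob.glob(os.path.join(library_base, STRUCTURE_PATH_B))
    # Assumption: only directories count as a Structure B match.
    if any(os.path.isdir(match) for match in matches):
        return "B"
    return "A"
```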
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Callers now pass the full platform dict and rom.fs_extension; the service
normalizes the extension (optional leading dot, case-insensitive) before
checking the compressed-archive skip set, so ROMs stored with bare
extensions like "zip" correctly hit the skip path.
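
The normalization could be sketched as below. The helper names and the exact
contents of the extension set are assumptions for illustration.

```python
# Assumed subset of the shared set; the real list may contain more entries.
COMPRESSED_FILE_EXTENSIONS = frozenset({".zip", ".7z", ".tar", ".rar"})


def normalize_extension(ext: str) -> str:
    """Lowercase the extension and ensure a single leading dot."""
    ext = ext.strip().lower()
    if ext and not ext.startswith("."):
        ext = "." + ext
    return ext


def is_compressed_extension(ext: str) -> bool:
    """True when the normalized extension is in the compressed-archive set."""
    return normalize_extension(ext) in COMPRESSED_FILE_EXTENSIONS
```

Both "zip" and ".ZIP" normalize to ".zip", so either form reaches the skip
path.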
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
RAHasher was being spawned for every hashable ROM regardless of file
type. When the source file is a zip/7z/tar and the RA platform needs
an on-disk disc image (PSX, PS2, PSP, Saturn, Dreamcast, Sega CD,
3DO, PC-FX, Neo Geo CD, TurboGrafx CD, Atari Jaguar CD, Wii), the
subprocess fails with "Unsupported console for buffer hash: {id}"
after paying full process-spawn overhead per ROM — a serious slowdown
when indexing large zipped collections (e.g. Myrient PS2/PSP sets).
calculate_hash now short-circuits those combinations with a debug log
and no subprocess. Raw disc images (.iso, .chd, .cue/.bin) and
archives on cartridge platforms still go through RAHasher as before.
Also centralize COMPRESSED_FILE_EXTENSIONS in utils/filesystem.py so
roms_handler (is_compressed_file / hashing), rahasher (skip logic),
and feeds (PKGi passthrough) share one source of truth. The shared
set adds .rar, which is_compressed_file now recognizes too.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>