mirror of
https://github.com/rommapp/romm.git
synced 2026-06-28 06:46:00 +00:00
SaveSync: paginate recompute task scan by primary key
get_all_saves() materialized every Save row across all users into a single .all() list. On instances with very large libraries, that sets a real RAM ceiling and keeps every row in memory for the lifetime of the recompute run.

Replace it with get_saves_after_id(after_id, limit) and have the recompute task drive keyset pagination in PAGE_SIZE-row chunks. SQLAlchemy streaming via .execution_options(yield_per=...) is incompatible with the per-call session lifetime that @begin_session enforces (the session exits before the consumer iterates), so keyset paging from the caller is the cleanest fit.

Behavior is unchanged: same row coverage, same idempotency, same counters. Memory usage drops from O(all saves) to O(PAGE_SIZE).
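The caller-side loop is not part of this diff, so the following is only a minimal sketch of keyset pagination as the message describes it. It uses the standard-library sqlite3 module in place of SQLAlchemy. The table layout, the `iter_save_pages` helper, the starting cursor of 0 (which assumes positive ids), and the small `PAGE_SIZE` value are all assumptions made for illustration.

```python
import sqlite3
from typing import Iterator

# Illustrative value only; the real task's PAGE_SIZE is not shown in this commit.
PAGE_SIZE = 3

Row = tuple[int, str]


def get_saves_after_id(conn: sqlite3.Connection, after_id: int, limit: int) -> list[Row]:
    """Stand-in for the handler method: up to `limit` rows with id > after_id."""
    return conn.execute(
        "SELECT id, file_name FROM saves WHERE id > ? ORDER BY id ASC LIMIT ?",
        (after_id, limit),
    ).fetchall()


def iter_save_pages(conn: sqlite3.Connection, page_size: int = PAGE_SIZE) -> Iterator[list[Row]]:
    """Hypothetical caller loop: advance the cursor to the last id of each page."""
    after_id = 0  # assumes primary keys are positive integers
    while True:
        page = get_saves_after_id(conn, after_id, page_size)
        if not page:
            return  # an empty page means every row has been visited
        yield page
        after_id = page[-1][0]  # keyset cursor: last id seen so far


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE saves (id INTEGER PRIMARY KEY, file_name TEXT)")
conn.executemany(
    "INSERT INTO saves (id, file_name) VALUES (?, ?)",
    [(i, f"save-{i}.sav") for i in (1, 2, 3, 5, 8, 13, 21)],  # gaps in ids are fine
)

pages = list(iter_save_pages(conn))
seen_ids = [row[0] for page in pages for row in page]
```

Only one page is held at a time when the generator is consumed lazily, and gaps in the id sequence do not matter because each query filters on `id > after_id` instead of using an offset.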
@@ -196,10 +196,18 @@ class DBSavesHandler(DBBaseHandler):
         )

     @begin_session
-    def get_all_saves(
+    def get_saves_after_id(
         self,
+        after_id: int,
+        limit: int,
         session: Session = None,  # type: ignore
     ) -> Sequence[Save]:
-        """Every Save row across all users, ordered by id. Used by the
-        recompute_save_content_hashes maintenance task."""
-        return session.scalars(select(Save).order_by(asc(Save.id))).all()
+        """Page Save rows by primary key. Returns up to ``limit`` rows with
+        ``id > after_id``, ordered by id. Used by the
+        recompute_save_content_hashes maintenance task to walk every row in
+        bounded-memory batches: streaming via ``yield_per`` is incompatible
+        with the per-call session lifetime that ``@begin_session`` enforces,
+        so the caller drives pagination with this method instead."""
+        return session.scalars(
+            select(Save).where(Save.id > after_id).order_by(asc(Save.id)).limit(limit)
+        ).all()
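The docstring's point about session lifetime can be illustrated with a toy model. This sketch does not use SQLAlchemy, and the body of `@begin_session` is not shown in this diff. `FakeSession`, `per_call_session`, `stream_ids`, and `page_ids` are hypothetical stand-ins. They assume only what the commit message states: the session exits before the consumer iterates.

```python
from functools import wraps


class FakeSession:
    """Hypothetical stand-in for a DB session; not SQLAlchemy."""

    def __init__(self):
        self.open = True

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.open = False
        return False

    def stream(self, ids):
        # Lazy: each row is produced only when the consumer asks for it.
        for i in ids:
            if not self.open:
                raise RuntimeError("session closed before row was fetched")
            yield i


def per_call_session(func):
    """Hypothetical decorator: opens a session per call, closes it on return."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        with FakeSession() as session:
            return func(*args, session=session, **kwargs)

    return wrapper


@per_call_session
def stream_ids(ids, session=None):
    # Returns a lazy generator that outlives the session scope.
    return session.stream(ids)


@per_call_session
def page_ids(ids, after_id, limit, session=None):
    # Materializes one bounded page while the session is still open.
    return [i for i in session.stream(ids) if i > after_id][:limit]


try:
    list(stream_ids([1, 2, 3]))
    streaming_error = None
except RuntimeError as exc:
    streaming_error = str(exc)

first_page = page_ids([1, 2, 3], after_id=0, limit=2)
```

In this model the streamed result fails on its first row because iteration starts after the decorator has closed the session, while the paged call returns a fully materialized list before the session exits.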