Scriberr

mirror of https://github.com/rishikanthc/Scriberr.git synced 2026-07-01 08:15:46 +00:00

Author	SHA1	Message	Date
rishikanthc	4dc6810015	Add partial index predicate regression tests	2026-04-23 12:32:36 -07:00
rishikanthc	5993f418d0	Fix partial index WHERE extraction	2026-04-23 12:30:40 -07:00
rishikanthc	42ff560afc	Harden partial index predicate verification	2026-04-23 12:29:02 -07:00
rishikanthc	4960a2d528	Finalize database migration correctness fixes	2026-04-23 12:22:01 -07:00
rishikanthc	a5f88fb638	Fix database migration backfill regressions	2026-04-23 12:10:36 -07:00
rishikanthc	f926509ac9	Harden database schema detection and migration errors	2026-04-23 12:01:25 -07:00
rishikanthc	5d6a60d793	Refactor database migration and persistence layer	2026-04-23 11:38:25 -07:00
rishikanthc	c5f758cddd	repo: make execution creation deterministic and add user-scoped APIs	2026-04-23 11:24:13 -07:00
rishikanthc	fc3e933104	db: enforce single default per user and migration normalization	2026-04-23 11:24:08 -07:00
rishikanthc	5067a40790	test database migration with real sqlite fixtures	2026-04-23 10:35:59 -07:00
rishikanthc	e3a7c48bd7	fix schema compatibility gaps after migration	2026-04-23 10:35:15 -07:00
rishikanthc	0be71a63a0	refactor database schema and legacy migration flow	2026-04-23 10:17:03 -07:00
Booth	bdb8838b8b	fix: pin torchcodec to 0.7.x for compatibility with PyTorch 2.8.x torchcodec>=0.6.0 (upstream default) resolves to 0.10.0+ which requires PyTorch 2.9. Scriberr ships PyTorch 2.8.x, causing a C++ ABI symbol mismatch at load time. Pin to ~=0.7.0, the last release compatible with PyTorch 2.8. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 10:41:40 -07:00
Claude	9fd0943b92	fix: move override-dependencies to correct TOML scope [tool.uv] The override-dependencies key was placed after [tool.uv.sources], causing it to be parsed as tool.uv.sources.override-dependencies instead of tool.uv.override-dependencies. uv would silently ignore it, meaning torchcodec was never actually excluded on Linux aarch64. https://claude.ai/code/session_01YMyUwpk577EradV93tMMqS	2026-04-21 10:38:20 -07:00
Claude	6221843866	fix: restore diarization on Linux ARM64 and add WhisperX model selector - Default sortformer output format to json; RTTM path fails silently on NeMo annotation objects, producing zero diarization segments - Exclude torchcodec on Linux aarch64 via uv platform marker; no wheels exist for any torchcodec version on manylinux aarch64, causing pyannote environment setup to fail entirely on ARM64 Docker - Add diarization model selector to WhisperX config UI; Parakeet and Canary sections already had this but WhisperX was missing it, making it impossible to select nvidia_sortformer as the diarization backend https://claude.ai/code/session_01YMyUwpk577EradV93tMMqS	2026-04-21 10:38:20 -07:00
scnerd	77c3365a4a	Add TODO comment to remember that this fix is really about technical debt and we should just remove the legacy workers argument entirely.	2026-04-21 10:38:09 -07:00
scnerd	d0a1dcbd6c	docs(queue): add comments to test functions for consistency All six test functions now have // TestFoo verifies... comments matching the project's existing convention. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-21 10:38:09 -07:00
scnerd	8b3f25f801	refactor(queue): use testify assertions and t.Setenv in queue tests Switch from raw t.Errorf to testify/assert for consistency with the rest of the codebase. Use t.Setenv() instead of manual os.Setenv/defer os.Unsetenv for automatic cleanup. Simplify table structs where min and max are always equal. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-21 10:38:09 -07:00
scnerd	8d05a7cdd9	fix(queue): allow QUEUE_WORKERS env var to override hardcoded worker count The QUEUE_WORKERS environment variable was defined and read in getOptimalWorkerCount(), but NewTaskQueue() unconditionally overwrote the result with the hardcoded legacyWorkers parameter (always 2). This made QUEUE_WORKERS effectively dead code. Now legacyWorkers is only used as a fallback when QUEUE_WORKERS is not set, preserving the default of 2 workers while allowing users to control concurrency via the environment variable. Set QUEUE_WORKERS=1 to serialize all transcription jobs and prevent system overload during bulk uploads. Fixes: rishikanthc/Scriberr#379 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-21 10:38:09 -07:00
scnerd	1ab08ddc11	test(queue): add unit tests for QUEUE_WORKERS env var behavior Add tests verifying that getOptimalWorkerCount() respects the QUEUE_WORKERS environment variable and that NewTaskQueue() should allow QUEUE_WORKERS to override the hardcoded legacy worker count. Includes a failing test (TestNewTaskQueue_EnvOverridesLegacy) that reproduces the bug where QUEUE_WORKERS is always overridden by the hardcoded legacyWorkers parameter. Ref: rishikanthc/Scriberr#379 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-21 10:38:09 -07:00
Peter Somlo	175832e8e7	fix: voxtral duration comparison and "auto" language handling - use shared LANGUAGES constant in frontend config dialog	2026-02-28 10:59:07 -08:00
Fran Fitzpatrick	4e75295019	feat: add speaker identification toggle to summary templates Add option to include speaker labels in summary prompts when diarization is available. When enabled, transcripts are formatted as: [SPEAKER_NAME] Text here... The prompt also includes a hint to the LLM that speaker labels are present, helping it produce summaries that attribute statements to specific speakers. Changes: - Add IncludeSpeakerInfo field to SummaryTemplate model - Add toggle UI in summary template dialog - Format transcript with speaker labels when generating summary - Update prompt prefix to indicate speaker labels are present Closes #353 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-28 10:58:49 -08:00
Fran Fitzpatrick	850af1fb6e	test: update PyAnnote test to reflect optional HF token The HF token parameter is now optional at validation time since it can be provided via the HF_TOKEN environment variable at runtime. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-07 12:23:27 -08:00
Fran Fitzpatrick	ff12270419	feat: add HF_TOKEN environment variable fallback for diarization Previously, users had to enter their Hugging Face token in the UI for every transcription job that used diarization. Now the token can be set via the HF_TOKEN environment variable, which is especially useful for Docker deployments. Changes: - Add HFToken to backend config (reads from HF_TOKEN env var) - Update PyAnnote adapter to fall back to env var when no UI token - Update WhisperX adapter to fall back to env var when no UI token - Update documentation to clarify both configuration options 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-07 12:23:27 -08:00
Fran Fitzpatrick	f6df31b500	feat: add VAD segmentation thresholds for Pyannote diarization Add configurable voice activity detection thresholds to improve speaker diarization accuracy for noisy or distant audio recordings. - Add --segmentation-onset and --segmentation-offset CLI args to pyannote_diarize.py - Pass segmentation thresholds from Go adapter to Python script - Map existing vad_onset/vad_offset params to Pyannote segmentation - Add VAD Onset/Offset inputs to UI when Pyannote diarization is selected (Whisper, Parakeet, Canary model families) Lower onset values (0.3-0.4) help detect quieter/distant speakers. Lower offset values improve detection of speech endings. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-07 12:20:31 -08:00
Peter Somlo	df5de714c4	fix: make transcription temp and output directories configurable - Add TempDir field to Config struct to read TEMP_DIR env var - Update NewUnifiedTranscriptionService to accept tempDir and outputDir parameters - Remove hardcoded "data/temp" and "data/transcripts" paths from unified service - Update NewUnifiedJobProcessor to pass directory paths from config - Update main.go to use cfg.TempDir and cfg.TranscriptsDir - Update all test files to use new function signatures - Fix database.go to use directory from DATABASE_PATH instead of hardcoded "data/"	2026-01-07 12:18:26 -08:00
rishikanthc	73a82b9f6b	fix auto device detection in voxtral	2025-12-31 15:47:19 -08:00
rishikanthc	ad3053cc9b	fix: add Voxtral model selection and fix dependencies - Add FamilyMistralVoxtral and ModelVoxtral constants - Add case for Voxtral in selectModels switch statement - Add convertToVoxtralParams function for parameter conversion - Add MaxNewTokens field to WhisperXParams model - Map language and max_new_tokens parameters correctly - Fix parameter name in buffered script (output_path -> output_file) - Add mistral-common dependency to pyproject.toml - Check for both VoxtralForConditionalGeneration AND mistral_common On next server restart, the environment will be re-synced automatically to install the missing mistral-common dependency.	2025-12-31 15:47:19 -08:00
rishikanthc	1485b01488	feat: add buffered transcription for Voxtral to handle long audio - Create voxtral_transcribe_buffered.py for audio > 30 minutes - Split audio into 25-minute chunks for processing - Automatically detect long audio and use buffered mode - Concatenate text results from all chunks - No timestamp adjustment needed (text-only model) - Handles unlimited audio length via chunking	2025-12-31 15:47:19 -08:00
rishikanthc	5a947e8739	fix: update Voxtral token limits based on 32k context window - Default: 4096 → 8192 tokens - Maximum: 8192 → 16384 tokens - Minimum: 512 → 1024 tokens - Voxtral has 32k context window, handles 30-40 min audio - Updated UI description to reflect capabilities	2025-12-31 15:47:19 -08:00
rishikanthc	95ecbf6d21	fix: increase Voxtral max_new_tokens to 4096 (max 8192) - Default increased from 500 to 4096 tokens - Maximum increased from 2000 to 8192 tokens - Minimum increased from 100 to 512 tokens - Add max_new_tokens to TypeScript interface - Fix UI to use correct parameter (was using max_line_width)	2025-12-31 15:47:19 -08:00
rishikanthc	1ae7b2bf71	feat: add Voxtral-mini transcription support - Add VoxtralAdapter using transformers library with direct model loading - Add Python transcription script with apply_transcription_request() method - Register Voxtral adapter in main.go with dedicated environment - Add UI configuration in TranscriptionConfigDialog with warning banner - Support multilingual transcription without word-level timestamps - Auto GPU/CPU detection, no device parameter needed - Graceful degradation for missing timestamp features Voxtral provides high-quality text-only transcription but does not support word-level timestamps. UI warns users that synchronized playback and seek features won't be available.	2025-12-31 15:47:19 -08:00
rishikanthc	923b39e415	fix: ensure directories exist before writing adapter scripts - Create env directory in copy script functions before writing - Fixes initialization errors for Parakeet, Canary, and Sortformer adapters - Update Makefile to use web/project-site for website commands - Add build target to Makefile for building Scriberr binary	2025-12-31 15:47:19 -08:00
rishikanthc	2afd6a1ecf	fixes #317	2025-12-29 21:11:47 -08:00
Paul Irish	9975e6fb02	fix duplicated openapi annotations pt 2	2025-12-29 21:10:13 -08:00
Paul Irish	a7aaf06bbb	fix duplicated openapi annotations	2025-12-29 21:10:13 -08:00
Paul Irish	ab912a6b6e	always copy scripts	2025-12-29 21:09:53 -08:00
Paul Irish	7471a2a1b6	Add test suite for python adapter scripts	2025-12-29 21:09:53 -08:00
Paul Irish	50dd4130ff	Extract python adapter scripts to proper files	2025-12-29 21:09:53 -08:00
Fran Fitzpatrick	8f537548d4	feat: add RTX 5090 Blackwell GPU support (sm_120) Add support for NVIDIA RTX 50-series GPUs (Blackwell architecture) which require CUDA 12.8+ and PyTorch cu128 wheels due to the new sm_120 compute capability. Changes: - Add configurable PYTORCH_CUDA_VERSION environment variable to control PyTorch wheel version at runtime (cu126 for legacy, cu128 for Blackwell) - Update all model adapters to use dynamic CUDA version instead of hardcoded cu126 URLs - Update Dockerfile.cuda.12.9 for Blackwell with CUDA 12.9.1 base image, PYTORCH_CUDA_VERSION=cu128, and missing WHISPERX_ENV/yt-dlp - Update Dockerfile.cuda with explicit PYTORCH_CUDA_VERSION=cu126 - Add docker-compose.blackwell.yml for pre-built Blackwell image - Add docker-compose.build.blackwell.yml for local Blackwell builds - Add GPU compatibility documentation to README Fixes: rishikanthc/Scriberr#104	2025-12-24 14:46:44 -08:00
rishikanthc	913063eb49	refactor: Switch yt-dlp to standalone binary & cleanup UV config - Dockerfiles: Install yt-dlp binary from GitHub releases to /usr/local/bin - Go: Execute yt-dlp binary directly, removing uv python wrapper - Config: Remove unused UVPath configuration and findUVPath function - Entrypoint: Remove yt-dlp init logic (still initializes whisperx env if needed)	2025-12-16 19:07:29 -08:00
rishikanthc	f99087b2bd	fix: Resolve mobile audio playback permission issues - Change cookie SameSite policy from Strict to Lax (Strict blocks media subresources on mobile) - Decouple Secure cookie flag from APP_ENV: - Add SECURE_COOKIES config (defaults to true in prod, but can be overridden) - Allows testing production builds over HTTP (home network) - Increase gocyclo threshold to 25 to accommodate complex handlers	2025-12-16 18:21:36 -08:00
rishikanthc	11434b9f1b	feat: Add production security configuration for CORS and cookie handling - Fix refresh token cookie Secure flag bug (was hardcoded to false) - Wire up AllowedOrigins config in CORS middleware (router, handlers, chat, SSE) - Add APP_ENV=production to Dockerfile and Dockerfile.cuda - Update all docker-compose files with APP_ENV and ALLOWED_ORIGINS examples - CORS now validates origins in production, allows all in development - Increase gocyclo threshold from 20 to 25 for complex handlers	2025-12-16 18:21:36 -08:00
rishikanthc	c44de7858b	refactor: Complete repository pattern migration for all remaining files (Phases 5-7) Phase 5: Refactor queue.go (10 DB calls removed) - Added JobRepository to TaskQueue struct and constructor - Added UpdateStatus, UpdateError, FindByStatus, CountByStatus methods to JobRepository - Replaced all database.DB calls with repository methods Phase 6: Refactor chat_handlers.go and summarize_handlers.go (6 DB calls removed) - Added GetMessageCountsBySessionIDs and GetLastMessagesBySessionIDs to ChatRepository - Added UpdateSummary to JobRepository - Replaced batch queries and update calls with repository methods - Removed database import from both files Phase 7: Refactor quick_transcription.go (3 DB calls removed) - Added JobRepository injection to QuickTranscriptionService - Updated constructor and all callers Summary: 46+ database.DB calls replaced with repository methods across 7 phases. All tests pass, build succeeds.	2025-12-16 18:21:36 -08:00
rishikanthc	86add0037d	refactor: Replace direct database.DB calls with repository pattern in handlers, dropzone, and multitrack_processor Phase 1: Define interfaces - Created internal/interfaces/ package with AuthServiceInterface, TaskQueueInterface, JobProcessorInterface Phase 2: Refactor handlers.go (21 DB calls removed) - Replaced all database.DB calls with repository methods - Added RefreshTokenRepository for token management - Added new repository methods: Count, FindActiveTrackJobs, FindLatestCompletedExecution, FindByName Phase 3: Refactor dropzone.go (3 DB calls removed) - Added CountWithAutoTranscription to UserRepository - Injected JobRepository and UserRepository into Service Phase 4: Refactor multitrack_processor.go - Changed constructor to accept *gorm.DB and JobRepository - Updated Handler to inject MultiTrackProcessor Updated all test files with new dependencies and mock implementations.	2025-12-16 18:21:36 -08:00
rishikanthc	7fc7619ee6	fix: tests for upstream changes fix: new tests for chat and user management flows fix: resolve lint errors fix: configured lefthook to check entire project	2025-12-16 18:21:36 -08:00
rishikanthc	658a1a5c49	fix: lint errors in go code	2025-12-16 18:21:36 -08:00
rishikanthc	3bbcbcfd63	fix: responsive design	2025-12-15 13:36:12 -08:00
rishikanthc	7deff5b903	fix: change logo.. fix chat	2025-12-14 19:09:52 -08:00
rishikanthc	201b3b787c	fix: remove timer-based job scanner that caused duplicate transcriptions The jobScanner was running every 10 seconds and re-enqueueing jobs that were already in the queue but hadn't started processing yet. This caused completed files to be re-transcribed when auto-transcribe was enabled. Changes: - Removed jobScanner goroutine (10-second polling loop) - Removed scanPendingJobs function - Added recoverPendingJobs that runs ONCE at startup to recover any pending jobs left from previous server runs - Jobs are now only enqueued when explicitly requested: - Upload with auto-transcribe enabled - Manual transcription start - Server restart recovery (one-time)	2025-12-14 19:09:52 -08:00
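
Commit 1485b01488 above describes the buffered Voxtral mode only in prose: audio longer than 30 minutes is split into 25-minute chunks, and the text from each chunk is concatenated. The following is a minimal sketch of that planning step, not the code in voxtral_transcribe_buffered.py. The names `plan_chunks` and `join_chunk_texts`, the handling of exactly 30 minutes, and joining texts with a single space are assumptions made for illustration.

```python
from typing import List, Tuple

# Values taken from the commit message; the constant names are hypothetical.
BUFFER_THRESHOLD_S = 30 * 60  # audio longer than this uses buffered mode
CHUNK_LENGTH_S = 25 * 60      # length of each chunk in buffered mode


def plan_chunks(
    duration_s: float,
    threshold_s: float = BUFFER_THRESHOLD_S,
    chunk_s: float = CHUNK_LENGTH_S,
) -> List[Tuple[float, float]]:
    """Return (start, end) offsets in seconds covering the whole recording."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    # Short audio is transcribed in one pass (assumption: exactly 30 min is "short").
    if duration_s <= threshold_s:
        return [(0.0, float(duration_s))]
    chunks: List[Tuple[float, float]] = []
    start = 0.0
    while start < duration_s:
        end = float(min(start + chunk_s, duration_s))
        chunks.append((start, end))
        start = end
    return chunks


def join_chunk_texts(texts: List[str]) -> str:
    """Concatenate per-chunk transcripts, skipping empty chunks.

    No timestamp adjustment is performed, matching the commit's note that
    the model is text-only.
    """
    return " ".join(t.strip() for t in texts if t.strip())


if __name__ == "__main__":
    for start, end in plan_chunks(60 * 60):
        print(f"{start:7.1f}s to {end:7.1f}s")
```

Under these assumptions a 60-minute recording yields three chunks, the last one 10 minutes long, while a 10-minute recording stays in a single pass.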

290 Commits