Scriberr

mirror of https://github.com/rishikanthc/Scriberr.git synced 2026-06-29 15:26:02 +00:00

Author	SHA1	Message	Date
rishikanthc	4dc6810015	Add partial index predicate regression tests	2026-04-23 12:32:36 -07:00
rishikanthc	5993f418d0	Fix partial index WHERE extraction	2026-04-23 12:30:40 -07:00
rishikanthc	42ff560afc	Harden partial index predicate verification	2026-04-23 12:29:02 -07:00
rishikanthc	4960a2d528	Finalize database migration correctness fixes	2026-04-23 12:22:01 -07:00
rishikanthc	a5f88fb638	Fix database migration backfill regressions	2026-04-23 12:10:36 -07:00
rishikanthc	f926509ac9	Harden database schema detection and migration errors	2026-04-23 12:01:25 -07:00
rishikanthc	5d6a60d793	Refactor database migration and persistence layer	2026-04-23 11:38:25 -07:00
rishikanthc	c5f758cddd	repo: make execution creation deterministic and add user-scoped APIs	2026-04-23 11:24:13 -07:00
rishikanthc	fc3e933104	db: enforce single default per user and migration normalization	2026-04-23 11:24:08 -07:00
rishikanthc	5067a40790	test database migration with real sqlite fixtures	2026-04-23 10:35:59 -07:00
rishikanthc	e3a7c48bd7	fix schema compatibility gaps after migration	2026-04-23 10:35:15 -07:00
rishikanthc	0be71a63a0	refactor database schema and legacy migration flow	2026-04-23 10:17:03 -07:00
Booth	bdb8838b8b	fix: pin torchcodec to 0.7.x for compatibility with PyTorch 2.8.x torchcodec>=0.6.0 (upstream default) resolves to 0.10.0+ which requires PyTorch 2.9. Scriberr ships PyTorch 2.8.x, causing a C++ ABI symbol mismatch at load time. Pin to ~=0.7.0, the last release compatible with PyTorch 2.8. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-21 10:41:40 -07:00
Claude	9fd0943b92	fix: move override-dependencies to correct TOML scope [tool.uv] The override-dependencies key was placed after [tool.uv.sources], causing it to be parsed as tool.uv.sources.override-dependencies instead of tool.uv.override-dependencies. uv would silently ignore it, meaning torchcodec was never actually excluded on Linux aarch64. https://claude.ai/code/session_01YMyUwpk577EradV93tMMqS	2026-04-21 10:38:20 -07:00
Claude	6221843866	fix: restore diarization on Linux ARM64 and add WhisperX model selector - Default sortformer output format to json; RTTM path fails silently on NeMo annotation objects, producing zero diarization segments - Exclude torchcodec on Linux aarch64 via uv platform marker; no wheels exist for any torchcodec version on manylinux aarch64, causing pyannote environment setup to fail entirely on ARM64 Docker - Add diarization model selector to WhisperX config UI; Parakeet and Canary sections already had this but WhisperX was missing it, making it impossible to select nvidia_sortformer as the diarization backend https://claude.ai/code/session_01YMyUwpk577EradV93tMMqS	2026-04-21 10:38:20 -07:00
Claude	00923e8898	refactor: use shared FormHelpers in SummaryTemplateDialog Remove 4 duplicated CSS class constants and replace manual Select/Switch blocks with SelectField and SwitchField from FormHelpers. 225 → 185 lines (-18%) https://claude.ai/code/session_01YMyUwpk577EradV93tMMqS	2026-04-21 10:38:20 -07:00
Claude	5cdc91cf48	refactor: deduplicate TranscriptionConfigDialog with shared form helpers - Extract SelectField, SwitchField, SliderField, AdvancedAccordion to FormHelpers - Move shared CSS class constants (inputClassName, etc.) to FormHelpers - Extract DiarizationSection to eliminate 3x copy-pasted diarization blocks - Replace 11 inline Select blocks, 5 Switch+label blocks, 2 Slider blocks, 2 Accordion wrappers with single-line helper calls TranscriptionConfigDialog: 1178 → 702 lines (-40%) FormHelpers: 119 → 264 lines (reusable form infrastructure) https://claude.ai/code/session_01YMyUwpk577EradV93tMMqS	2026-04-21 10:38:20 -07:00
scnerd	77c3365a4a	Add TODO comment to remember that this fix is really about technical debt and we should just remove the legacy workers argument entirely.	2026-04-21 10:38:09 -07:00
scnerd	d0a1dcbd6c	docs(queue): add comments to test functions for consistency All six test functions now have // TestFoo verifies... comments matching the project's existing convention. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-21 10:38:09 -07:00
scnerd	8b3f25f801	refactor(queue): use testify assertions and t.Setenv in queue tests Switch from raw t.Errorf to testify/assert for consistency with the rest of the codebase. Use t.Setenv() instead of manual os.Setenv/defer os.Unsetenv for automatic cleanup. Simplify table structs where min and max are always equal. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-21 10:38:09 -07:00
scnerd	8d05a7cdd9	fix(queue): allow QUEUE_WORKERS env var to override hardcoded worker count The QUEUE_WORKERS environment variable was defined and read in getOptimalWorkerCount(), but NewTaskQueue() unconditionally overwrote the result with the hardcoded legacyWorkers parameter (always 2). This made QUEUE_WORKERS effectively dead code. Now legacyWorkers is only used as a fallback when QUEUE_WORKERS is not set, preserving the default of 2 workers while allowing users to control concurrency via the environment variable. Set QUEUE_WORKERS=1 to serialize all transcription jobs and prevent system overload during bulk uploads. Fixes: rishikanthc/Scriberr#379 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-21 10:38:09 -07:00
scnerd	1ab08ddc11	test(queue): add unit tests for QUEUE_WORKERS env var behavior Add tests verifying that getOptimalWorkerCount() respects the QUEUE_WORKERS environment variable and that NewTaskQueue() should allow QUEUE_WORKERS to override the hardcoded legacy worker count. Includes a failing test (TestNewTaskQueue_EnvOverridesLegacy) that reproduces the bug where QUEUE_WORKERS is always overridden by the hardcoded legacyWorkers parameter. Ref: rishikanthc/Scriberr#379 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-21 10:38:09 -07:00
Paul Irish	a378335ebd	refactor(frontend): extract auth logic to helpers and interceptor. Needed because I was adding a new SpeakerSettings component, but the useAuth hook triggered an infinite recursion bug because of the window.fetch wrappings.	2026-03-22 12:10:24 -07:00
Rishikanth Chandrasekaran	bccf81d3e6	Update README with referral request and personal link Added a request for AI/ML engineering referrals and linked personal website.	2026-03-19 10:38:46 -07:00
Rishikanth Chandrasekaran	abd6d62a60	Add project status update and collaboration invitation Added an update on project status, including personal circumstances and a call for community contributions.	2026-03-19 10:07:09 -07:00
Peter Somlo	175832e8e7	fix: voxtral duration comparison and "auto" language handling - use shared LANGUAGES constant in frontend config dialog	2026-02-28 10:59:07 -08:00
Fran Fitzpatrick	5aacc58bad	docs: regenerate swagger docs for speaker identification toggle Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-28 10:58:49 -08:00
Fran Fitzpatrick	4e75295019	feat: add speaker identification toggle to summary templates Add option to include speaker labels in summary prompts when diarization is available. When enabled, transcripts are formatted as: [SPEAKER_NAME] Text here... The prompt also includes a hint to the LLM that speaker labels are present, helping it produce summaries that attribute statements to specific speakers. Changes: - Add IncludeSpeakerInfo field to SummaryTemplate model - Add toggle UI in summary template dialog - Format transcript with speaker labels when generating summary - Update prompt prefix to indicate speaker labels are present Closes #353 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-28 10:58:49 -08:00
Paul Irish	71e7588c75	fix the 'any's	2026-02-28 10:58:17 -08:00
Paul Irish	6e707f363c	fix(auth): prevent infinite fetch recursion and multiple wrapper layers This fixes an issue where the frontend would spam the auth endpoints repeatedly when logged out or when a session expired. 1. Infinite Recursion on 401: The window.fetch wrapper would catch a 401, call tryRefresh(), which then called fetch() again, triggering the wrapper recursively if the refresh also failed. We now use the original fetch for refresh attempts and exclude auth endpoints from auto-refresh logic. 2. Multiple Wrapper Layers: Since useAuth is a hook used by many components, multiple instances were independently wrapping window.fetch. We now store the original fetch globally and ensure wrapping only happens once.	2026-02-28 10:58:17 -08:00
Fran Fitzpatrick	850af1fb6e	test: update PyAnnote test to reflect optional HF token The HF token parameter is now optional at validation time since it can be provided via the HF_TOKEN environment variable at runtime. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-07 12:23:27 -08:00
Fran Fitzpatrick	ff12270419	feat: add HF_TOKEN environment variable fallback for diarization Previously, users had to enter their Hugging Face token in the UI for every transcription job that used diarization. Now the token can be set via the HF_TOKEN environment variable, which is especially useful for Docker deployments. Changes: - Add HFToken to backend config (reads from HF_TOKEN env var) - Update PyAnnote adapter to fall back to env var when no UI token - Update WhisperX adapter to fall back to env var when no UI token - Update documentation to clarify both configuration options 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-07 12:23:27 -08:00
Fran Fitzpatrick	f6df31b500	feat: add VAD segmentation thresholds for Pyannote diarization Add configurable voice activity detection thresholds to improve speaker diarization accuracy for noisy or distant audio recordings. - Add --segmentation-onset and --segmentation-offset CLI args to pyannote_diarize.py - Pass segmentation thresholds from Go adapter to Python script - Map existing vad_onset/vad_offset params to Pyannote segmentation - Add VAD Onset/Offset inputs to UI when Pyannote diarization is selected (Whisper, Parakeet, Canary model families) Lower onset values (0.3-0.4) help detect quieter/distant speakers. Lower offset values improve detection of speech endings. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-07 12:20:31 -08:00
Fran Fitzpatrick	f8c0c6759d	fix: Listen button in selection menu now works in Timeline View The selection menu's "Listen" button wasn't working in Timeline View because the character-to-timestamp mapping was incorrectly counting text from timestamp and speaker name elements. Changes: - Add data-transcript-text attribute to transcript text containers - Update TreeWalker in useSelectionMenu to only count text inside these marked elements This fixes the character index calculation so word timestamps are correctly looked up. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-07 12:20:07 -08:00
Fran Fitzpatrick	8c3f345cee	style: add glass-card styling to sticky title section Match the title/controls section styling to the audio player below with glass-card, rounded corners, border, shadow, and padding. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-07 12:19:23 -08:00
Fran Fitzpatrick	6944e6719c	fix: keep header controls visible during auto-scroll Make title, chat button, and settings dropdown sticky so users can toggle auto-scroll without pausing playback. Wraps both the title section and audio player in a single sticky container. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-07 12:19:23 -08:00
Fran Fitzpatrick	1dedde96a8	feat: fix auto-scroll and add active segment highlighting in Timeline View The "Auto Scroll On" feature was broken because it relied on a word-level ref that was never assigned. This fix implements segment-level auto-scroll for Timeline View. Changes: - Enable autoScrollEnabled prop usage in TranscriptView - Add activeSegmentIndex computation to track current playback position - Add auto-scroll effect that scrolls to active segment on segment change - Add subtle background highlight to indicate the currently playing segment The auto-scroll only triggers when: - Mode is 'expanded' (Timeline View) - Auto-scroll is enabled - Audio is playing - The segment actually changes (debounced to prevent jitter) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-07 12:19:23 -08:00
Fran Fitzpatrick	0db419e5cd	fix: speaker rename now updates in real-time without page reload After renaming speakers in Timeline View, the changes now appear immediately in both the transcript display and downloads (JSON, TXT, SRT). Root cause: The onSpeakerMappingsUpdate callback was a no-op, so the React Query cache wasn't being invalidated after saving speaker mappings. Fix: Invalidate the speakerMappings cache when the dialog saves, triggering an automatic refetch that updates all components using the hook. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-07 12:18:57 -08:00
Peter Somlo	df5de714c4	fix: make transcription temp and output directories configurable - Add TempDir field to Config struct to read TEMP_DIR env var - Update NewUnifiedTranscriptionService to accept tempDir and outputDir parameters - Remove hardcoded "data/temp" and "data/transcripts" paths from unified service - Update NewUnifiedJobProcessor to pass directory paths from config - Update main.go to use cfg.TempDir and cfg.TranscriptsDir - Update all test files to use new function signatures - Fix database.go to use directory from DATABASE_PATH instead of hardcoded "data/"	2026-01-07 12:18:26 -08:00
Peter Somlo	93abf6eb21	feat: expand language support in the UI to 58 languages for Whisper and OpenAI models Expands language selection from 24 to 58 languages for Whisper and OpenAI transcription profiles. Changes: - Expand LANGUAGES array to 58 languages (all with WER <50%) - Add 34 new languages including Afrikaans, Armenian, Czech, Danish, Hungarian, Norwegian, Romanian, Serbian, Slovak, Thai, and many more - Create VOXTRAL_LANGUAGES array with original 24-language subset for Voxtral - Update VoxtralConfig to use VOXTRAL_LANGUAGES instead of LANGUAGES - All languages alphabetically sorted Language array usage: - LANGUAGES (58) → Whisper and OpenAI models - VOXTRAL_LANGUAGES (24) → Voxtral model - CANARY_LANGUAGES (4) → NVIDIA Canary model	2026-01-01 12:47:41 -08:00
rishikanthc	b8fd360ca2	fix: streamline API docs generation to sync both locations Updated make docs to generate swagger.json to both api-docs/ and web/project-site/public/api/ to match CI workflow behavior. This fixes CI failures where the project site swagger.json was out of sync with code changes (max_new_tokens field for Voxtral).	2025-12-31 16:03:33 -08:00
rishikanthc	f9a58baa1e	clean lint	2025-12-31 15:53:35 -08:00
rishikanthc	0248b01cbd	update docs	2025-12-31 15:47:19 -08:00
rishikanthc	73a82b9f6b	fix auto device detection in voxtral	2025-12-31 15:47:19 -08:00
rishikanthc	97eb45ea67	feat: add 'make dev' command to replace dev.sh script - Auto-installs Air if not found (with GOPATH/bin PATH handling) - Creates placeholder files for Go embed directive in dev mode - Starts backend with Air live reload (or falls back to go run) - Starts frontend with Vite HMR - Handles cleanup on Ctrl+C/SIGTERM - Removed dev.sh in favor of unified Makefile command	2025-12-31 15:47:19 -08:00
rishikanthc	efff1a3a7c	fix: use Literata font for all transcripts Changed from font-inter to font-literata to ensure consistent typography across all transcript views regardless of model used.	2025-12-31 15:47:19 -08:00
rishikanthc	f08504eaa3	fix: disable timeline view for transcripts without word-level timestamps - Check for presence of word_segments in transcript data - Show disabled menu item with explanation when timestamps unavailable - Applies to Voxtral and other models without word-level timestamps	2025-12-31 15:47:19 -08:00
rishikanthc	ad3053cc9b	fix: add Voxtral model selection and fix dependencies - Add FamilyMistralVoxtral and ModelVoxtral constants - Add case for Voxtral in selectModels switch statement - Add convertToVoxtralParams function for parameter conversion - Add MaxNewTokens field to WhisperXParams model - Map language and max_new_tokens parameters correctly - Fix parameter name in buffered script (output_path -> output_file) - Add mistral-common dependency to pyproject.toml - Check for both VoxtralForConditionalGeneration AND mistral_common On next server restart, the environment will be re-synced automatically to install the missing mistral-common dependency.	2025-12-31 15:47:19 -08:00
rishikanthc	56c540da36	forgot to commit removal of old project site	2025-12-31 15:47:19 -08:00
rishikanthc	1485b01488	feat: add buffered transcription for Voxtral to handle long audio - Create voxtral_transcribe_buffered.py for audio > 30 minutes - Split audio into 25-minute chunks for processing - Automatically detect long audio and use buffered mode - Concatenate text results from all chunks - No timestamp adjustment needed (text-only model) - Handles unlimited audio length via chunking	2025-12-31 15:47:19 -08:00
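Commit 1485b01488 describes splitting audio longer than 30 minutes into 25-minute chunks and concatenating the text results. A minimal sketch of that planning step follows, assuming durations in seconds; `plan_chunks` and `join_transcripts` are hypothetical names, and the real `voxtral_transcribe_buffered.py` may handle boundaries and joining differently.

```python
THRESHOLD_S = 30 * 60  # audio longer than this uses buffered mode (per the commit message)
CHUNK_S = 25 * 60      # chunk length used in buffered mode (per the commit message)


def plan_chunks(duration_s: float) -> list[tuple[float, float]]:
    """Return (start, end) offsets in seconds covering the whole recording."""
    if duration_s <= THRESHOLD_S:
        # Short audio is processed in a single pass.
        return [(0.0, duration_s)]
    chunks = []
    start = 0.0
    while start < duration_s:
        end = min(start + CHUNK_S, duration_s)
        chunks.append((start, end))
        start = end
    return chunks


def join_transcripts(parts: list[str]) -> str:
    """Concatenate per-chunk text; no timestamp adjustment for a text-only model."""
    # Joining with a single space is an assumption of this sketch.
    return " ".join(p.strip() for p in parts if p.strip())


print(plan_chunks(60 * 60))
print(join_transcripts(["first chunk text ", " second chunk text"]))
```

Because the chunks are contiguous and non-overlapping, a word that straddles a boundary could be cut in two; the commit message does not say whether the actual script mitigates this.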

839 Commits