Scriberr

mirror of https://github.com/rishikanthc/Scriberr.git synced 2026-06-28 14:55:46 +00:00

Author	SHA1	Message	Date
Peter Somlo	93abf6eb21	feat: expand language support in the UI to 58 languages for Whisper and OpenAI models Expands language selection from 24 to 58 languages for Whisper and OpenAI transcription profiles. Changes: - Expand LANGUAGES array to 58 languages (all with WER >50%) - Add 34 new languages including Afrikaans, Armenian, Czech, Danish, Hungarian, Norwegian, Romanian, Serbian, Slovak, Thai, and many more - Create VOXTRAL_LANGUAGES array with original 24-language subset for Voxtral - Update VoxtralConfig to use VOXTRAL_LANGUAGES instead of LANGUAGES - All languages alphabetically sorted Language array usage: - LANGUAGES (58) → Whisper and OpenAI models - VOXTRAL_LANGUAGES (24) → Voxtral model - CANARY_LANGUAGES (4) → NVIDIA Canary model	2026-01-01 12:47:41 -08:00
rishikanthc	b8fd360ca2	fix: streamline API docs generation to sync both locations Updated make docs to generate swagger.json to both api-docs/ and web/project-site/public/api/ to match CI workflow behavior. This fixes CI failures where the project site swagger.json was out of sync with code changes (max_new_tokens field for Voxtral).	2025-12-31 16:03:33 -08:00
rishikanthc	f9a58baa1e	clean lint	2025-12-31 15:53:35 -08:00
rishikanthc	0248b01cbd	update docs	2025-12-31 15:47:19 -08:00
rishikanthc	73a82b9f6b	fix auto device detection in voxtral	2025-12-31 15:47:19 -08:00
rishikanthc	97eb45ea67	feat: add 'make dev' command to replace dev.sh script - Auto-installs Air if not found (with GOPATH/bin PATH handling) - Creates placeholder files for Go embed directive in dev mode - Starts backend with Air live reload (or falls back to go run) - Starts frontend with Vite HMR - Handles cleanup on Ctrl+C/SIGTERM - Removed dev.sh in favor of unified Makefile command	2025-12-31 15:47:19 -08:00
rishikanthc	efff1a3a7c	fix: use Literata font for all transcripts Changed from font-inter to font-literata to ensure consistent typography across all transcript views regardless of model used.	2025-12-31 15:47:19 -08:00
rishikanthc	f08504eaa3	fix: disable timeline view for transcripts without word-level timestamps - Check for presence of word_segments in transcript data - Show disabled menu item with explanation when timestamps unavailable - Applies to Voxtral and other models without word-level timestamps	2025-12-31 15:47:19 -08:00
rishikanthc	ad3053cc9b	fix: add Voxtral model selection and fix dependencies - Add FamilyMistralVoxtral and ModelVoxtral constants - Add case for Voxtral in selectModels switch statement - Add convertToVoxtralParams function for parameter conversion - Add MaxNewTokens field to WhisperXParams model - Map language and max_new_tokens parameters correctly - Fix parameter name in buffered script (output_path -> output_file) - Add mistral-common dependency to pyproject.toml - Check for both VoxtralForConditionalGeneration AND mistral_common On next server restart, the environment will be re-synced automatically to install the missing mistral-common dependency.	2025-12-31 15:47:19 -08:00
rishikanthc	56c540da36	forgot to commit removal of old project site	2025-12-31 15:47:19 -08:00
rishikanthc	1485b01488	feat: add buffered transcription for Voxtral to handle long audio - Create voxtral_transcribe_buffered.py for audio > 30 minutes - Split audio into 25-minute chunks for processing - Automatically detect long audio and use buffered mode - Concatenate text results from all chunks - No timestamp adjustment needed (text-only model) - Handles unlimited audio length via chunking	2025-12-31 15:47:19 -08:00
rishikanthc	5a947e8739	fix: update Voxtral token limits based on 32k context window - Default: 4096 → 8192 tokens - Maximum: 8192 → 16384 tokens - Minimum: 512 → 1024 tokens - Voxtral has 32k context window, handles 30-40 min audio - Updated UI description to reflect capabilities	2025-12-31 15:47:19 -08:00
rishikanthc	95ecbf6d21	fix: increase Voxtral max_new_tokens to 4096 (max 8192) - Default increased from 500 to 4096 tokens - Maximum increased from 2000 to 8192 tokens - Minimum increased from 100 to 512 tokens - Add max_new_tokens to TypeScript interface - Fix UI to use correct parameter (was using max_line_width)	2025-12-31 15:47:19 -08:00
rishikanthc	1ae7b2bf71	feat: add Voxtral-mini transcription support - Add VoxtralAdapter using transformers library with direct model loading - Add Python transcription script with apply_transcription_request() method - Register Voxtral adapter in main.go with dedicated environment - Add UI configuration in TranscriptionConfigDialog with warning banner - Support multilingual transcription without word-level timestamps - Auto GPU/CPU detection, no device parameter needed - Graceful degradation for missing timestamp features Voxtral provides high-quality text-only transcription but does not support word-level timestamps. UI warns users that synchronized playback and seek features won't be available.	2025-12-31 15:47:19 -08:00
rishikanthc	923b39e415	fix: ensure directories exist before writing adapter scripts - Create env directory in copy script functions before writing - Fixes initialization errors for Parakeet, Canary, and Sortformer adapters - Update Makefile to use web/project-site for website commands - Add build target to Makefile for building Scriberr binary	2025-12-31 15:47:19 -08:00
rishikanthc	5e5dc17a13	fix colors and styles	2025-12-29 21:11:47 -08:00
rishikanthc	c433db07b7	feat: add toggle for Automatic Gain Control Allow users to enable/disable AGC before starting recording. AGC automatically adjusts microphone volume for consistent levels.	2025-12-29 21:11:47 -08:00
rishikanthc	ca2ed2fd72	fix: use remote-only echo cancellation for microphone Use Chrome/Edge's 'remote-only' echo cancellation mode to allow microphone input during local system audio playback while still preventing acoustic echo from remote sources in video calls	2025-12-29 21:11:47 -08:00
rishikanthc	76d92e2055	style: apply brand gradient to Upload Recording button	2025-12-29 21:11:47 -08:00
rishikanthc	8ca2b5ba2b	refactor: use consistent design system in SystemAudioRecorder - Replace hardcoded colors with CSS variables - Match button design with transcription settings - Apply brand gradient to Start Recording button	2025-12-29 21:11:47 -08:00
rishikanthc	b13d1b360d	refactor: use default button components in SystemAudioRecorder	2025-12-29 21:11:47 -08:00
rishikanthc	eb6192960f	refactor: restrict system audio to Chromium browsers only Removed Firefox/Safari support as only Chromium browsers (Chrome, Edge, Brave) reliably support tab audio capture via getDisplayMedia API. Changes: - Added Chromium browser detection (Chrome, Edge, Brave, Chromium) - Show compatibility error dialog for non-Chromium browsers - Removed all Firefox-specific code and constraints - Simplified UI instructions (tab selection only) - Cleaner error messages focused on tab audio Tested working on: Chrome, Edge, Brave Not supported: Firefox, Safari, other browsers	2025-12-29 21:11:47 -08:00
rishikanthc	f5379464f6	feat: add system audio recording with microphone mixing Implements Screen Capture API based system audio recording for meeting recordings. Works on Chrome/Edge with tab audio capture. Features: - Client-side audio mixing (system audio + microphone) using Web Audio API - Real-time volume controls via GainNode - Simple timer-based recording (no visualization complexity) - Echo cancellation enabled for microphone to prevent feedback loops - Browser compatibility checks - Graceful error handling for permissions and stream interruptions Technical details: - Uses getDisplayMedia() for system audio capture (requires video=true, immediately stopped) - getUserMedia() for microphone with echo cancellation - MediaRecorder for direct recording without WaveSurfer dependency - Cyan/blue themed UI to differentiate from regular microphone recording Tested and working on Chrome. Firefox support needs investigation (v146.0.1).	2025-12-29 21:11:47 -08:00
rishikanthc	2afd6a1ecf	fixes #317	2025-12-29 21:11:47 -08:00
Paul Irish	0029078b8a	project site	2025-12-29 21:10:13 -08:00
Paul Irish	9975e6fb02	fix duplicated openapi annotations pt 2	2025-12-29 21:10:13 -08:00
Paul Irish	a7aaf06bbb	fix duplicated openapi annotations	2025-12-29 21:10:13 -08:00
Paul Irish	ab912a6b6e	always copy scripts	2025-12-29 21:09:53 -08:00
Paul Irish	7471a2a1b6	Add test suite for python adapter scripts	2025-12-29 21:09:53 -08:00
Paul Irish	50dd4130ff	Extract python adapter scripts to proper files	2025-12-29 21:09:53 -08:00
Paul Irish	edb65339b8	dont blank on vite startup	2025-12-29 21:09:53 -08:00
Paul Irish	d013fe288a	build: adopt gotestsum for go test output formatting	2025-12-26 20:40:52 -08:00
Fran Fitzpatrick	8f537548d4	feat: add RTX 5090 Blackwell GPU support (sm_120) Add support for NVIDIA RTX 50-series GPUs (Blackwell architecture) which require CUDA 12.8+ and PyTorch cu128 wheels due to the new sm_120 compute capability. Changes: - Add configurable PYTORCH_CUDA_VERSION environment variable to control PyTorch wheel version at runtime (cu126 for legacy, cu128 for Blackwell) - Update all model adapters to use dynamic CUDA version instead of hardcoded cu126 URLs - Update Dockerfile.cuda.12.9 for Blackwell with CUDA 12.9.1 base image, PYTORCH_CUDA_VERSION=cu128, and missing WHISPERX_ENV/yt-dlp - Update Dockerfile.cuda with explicit PYTORCH_CUDA_VERSION=cu126 - Add docker-compose.blackwell.yml for pre-built Blackwell image - Add docker-compose.build.blackwell.yml for local Blackwell builds - Add GPU compatibility documentation to README Fixes: rishikanthc/Scriberr#104	2025-12-24 14:46:44 -08:00
Paul Irish	718cb74b70	simpler name of job	2025-12-21 08:40:49 -08:00
Paul Irish	64953f9dde	only on main and PRs	2025-12-21 08:40:49 -08:00
Paul Irish	4f75db3856	cleaner	2025-12-21 08:40:49 -08:00
Paul Irish	57127b6ec6	Revert "fix lint in TOC" This reverts commit `15c919e327`.	2025-12-21 08:40:49 -08:00
Paul Irish	2db72409da	fix lint in TOC	2025-12-21 08:40:49 -08:00
Paul Irish	5d11f318d5	any	2025-12-21 08:40:49 -08:00
Paul Irish	03c8f76a1d	flesh it out	2025-12-21 08:40:49 -08:00
Paul Irish	007b344f60	ci: add basic build and test workflow	2025-12-21 08:40:49 -08:00
Paul Irish	ff41bd7dc6	drop the any	2025-12-21 08:38:42 -08:00
Paul Irish	410e6ea91b	add speaker dialog to download menu	2025-12-21 08:38:42 -08:00
rishikanthc	9328215a2e	m	2025-12-19 09:48:53 -08:00
rishikanthc	3ff2136d19	docs: add LLM disclosure section to README.md	2025-12-19 09:31:14 -08:00
rishikanthc	bab12bfe39	docs: update installation for PUID/PGID and add troubleshooting section - Update Docker Compose files to default PUID/PGID to 1000 - Add note about SECURE_COOKIES for non-SSL access in README and project site - Create dedicated Troubleshooting page in documentation site - Synchronize permissions documentation across all platforms	2025-12-19 09:24:22 -08:00
rishikanthc	eac630e494	fix: add Features page to docs sidebar navigation	2025-12-17 13:24:17 -08:00
rishikanthc	becfd0ad0f	fix compose v1.2.0	2025-12-17 11:41:07 -08:00
rishikanthc	38c8b69f3b	feat: add elegant sponsor segment to homepage	2025-12-17 11:37:20 -08:00
rishikanthc	069cc7e0ce	docs: update site routing and navigation to use Diarization page	2025-12-17 11:31:50 -08:00
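
Commit 1485b01488 above describes buffered transcription for Voxtral: audio longer than 30 minutes is split into 25-minute chunks and the per-chunk text is concatenated, with no timestamp adjustment because the model is text-only. The following is a minimal sketch of that chunk-planning logic, not the contents of voxtral_transcribe_buffered.py. The names `plan_chunks`, `transcribe_buffered`, and the `transcribe_chunk` callback are hypothetical and introduced only for illustration; only the 30-minute threshold and the 25-minute chunk length come from the commit message.

```python
from typing import Callable, List, Tuple

# Values taken from the commit message; everything else here is an assumption.
LONG_AUDIO_THRESHOLD_S = 30 * 60  # audio longer than this uses buffered mode
CHUNK_LENGTH_S = 25 * 60          # length of each chunk in buffered mode


def plan_chunks(duration_s: float) -> List[Tuple[float, float]]:
    """Return (start, end) offsets in seconds that cover the whole recording."""
    if duration_s <= 0:
        return []
    if duration_s <= LONG_AUDIO_THRESHOLD_S:
        # Short audio is processed in a single pass.
        return [(0.0, float(duration_s))]
    chunks: List[Tuple[float, float]] = []
    start = 0.0
    while start < duration_s:
        end = min(start + CHUNK_LENGTH_S, float(duration_s))
        chunks.append((start, end))
        start = end
    return chunks


def transcribe_buffered(
    duration_s: float,
    transcribe_chunk: Callable[[float, float], str],
) -> str:
    """Transcribe each planned chunk and join the text results in order.

    `transcribe_chunk` is a hypothetical stand-in for the real model call.
    No timestamps are merged, matching a text-only model.
    """
    texts = [transcribe_chunk(s, e).strip() for s, e in plan_chunks(duration_s)]
    return " ".join(t for t in texts if t)
```

With these assumptions, a 60-minute recording is planned as two 25-minute chunks plus a 10-minute remainder, while a 20-minute recording stays in one piece. Cutting at fixed offsets can split a word at a chunk boundary; the commit message does not say how the real script handles that.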

800 Commits