Commit Graph

105 Commits

Author SHA1 Message Date
rishikanthc
e3a7c48bd7 fix schema compatibility gaps after migration 2026-04-23 10:35:15 -07:00
rishikanthc
0be71a63a0 refactor database schema and legacy migration flow 2026-04-23 10:17:03 -07:00
Fran Fitzpatrick
4e75295019 feat: add speaker identification toggle to summary templates
Add option to include speaker labels in summary prompts when diarization
is available. When enabled, transcripts are formatted as:
[SPEAKER_NAME] Text here...

The prompt also includes a hint to the LLM that speaker labels are present,
helping it produce summaries that attribute statements to specific speakers.

Changes:
- Add IncludeSpeakerInfo field to SummaryTemplate model
- Add toggle UI in summary template dialog
- Format transcript with speaker labels when generating summary
- Update prompt prefix to indicate speaker labels are present

Closes #353

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-28 10:58:49 -08:00
rishikanthc
2afd6a1ecf fixes #317 2025-12-29 21:11:47 -08:00
Paul Irish
9975e6fb02 fix duplicated openapi annotations pt 2 2025-12-29 21:10:13 -08:00
Paul Irish
a7aaf06bbb fix duplicated openapi annotations 2025-12-29 21:10:13 -08:00
rishikanthc
913063eb49 refactor: Switch yt-dlp to standalone binary & cleanup UV config
- Dockerfiles: Install yt-dlp binary from GitHub releases to /usr/local/bin
- Go: Execute yt-dlp binary directly, removing uv python wrapper
- Config: Remove unused UVPath configuration and findUVPath function
- Entrypoint: Remove yt-dlp init logic (still initializes whisperx env if needed)
2025-12-16 19:07:29 -08:00
rishikanthc
f99087b2bd fix: Resolve mobile audio playback permission issues
- Change cookie SameSite policy from Strict to Lax (Strict blocks media subresources on mobile)
- Decouple Secure cookie flag from APP_ENV:
  - Add SECURE_COOKIES config (defaults to true in prod, but can be overridden)
  - Allows testing production builds over HTTP (home network)
- Increase gocyclo threshold to 25 to accommodate complex handlers
2025-12-16 18:21:36 -08:00
rishikanthc
11434b9f1b feat: Add production security configuration for CORS and cookie handling
- Fix refresh token cookie Secure flag bug (was hardcoded to false)
- Wire up AllowedOrigins config in CORS middleware (router, handlers, chat, SSE)
- Add APP_ENV=production to Dockerfile and Dockerfile.cuda
- Update all docker-compose files with APP_ENV and ALLOWED_ORIGINS examples
- CORS now validates origins in production, allows all in development
- Increase gocyclo threshold from 20 to 25 for complex handlers
2025-12-16 18:21:36 -08:00
rishikanthc
c44de7858b refactor: Complete repository pattern migration for all remaining files (Phases 5-7)
Phase 5: Refactor queue.go (10 DB calls removed)
- Added JobRepository to TaskQueue struct and constructor
- Added UpdateStatus, UpdateError, FindByStatus, CountByStatus methods to JobRepository
- Replaced all database.DB calls with repository methods

Phase 6: Refactor chat_handlers.go and summarize_handlers.go (6 DB calls removed)
- Added GetMessageCountsBySessionIDs and GetLastMessagesBySessionIDs to ChatRepository
- Added UpdateSummary to JobRepository
- Replaced batch queries and update calls with repository methods
- Removed database import from both files

Phase 7: Refactor quick_transcription.go (3 DB calls removed)
- Added JobRepository injection to QuickTranscriptionService
- Updated constructor and all callers

Summary: 46+ database.DB calls replaced with repository methods across 7 phases.
All tests pass, build succeeds.
2025-12-16 18:21:36 -08:00
rishikanthc
86add0037d refactor: Replace direct database.DB calls with repository pattern in handlers, dropzone, and multitrack_processor
Phase 1: Define interfaces
- Created internal/interfaces/ package with AuthServiceInterface, TaskQueueInterface, JobProcessorInterface

Phase 2: Refactor handlers.go (21 DB calls removed)
- Replaced all database.DB calls with repository methods
- Added RefreshTokenRepository for token management
- Added new repository methods: Count, FindActiveTrackJobs, FindLatestCompletedExecution, FindByName

Phase 3: Refactor dropzone.go (3 DB calls removed)
- Added CountWithAutoTranscription to UserRepository
- Injected JobRepository and UserRepository into Service

Phase 4: Refactor multitrack_processor.go
- Changed constructor to accept *gorm.DB and JobRepository
- Updated Handler to inject MultiTrackProcessor

Updated all test files with new dependencies and mock implementations.
2025-12-16 18:21:36 -08:00
rishikanthc
7fc7619ee6 fix: tests for upstream changes
fix: new tests for chat and user management flows

fix: resolve lint errors

fix: configured lefthook to check entire project
2025-12-16 18:21:36 -08:00
rishikanthc
658a1a5c49 fix: lint errors in go code 2025-12-16 18:21:36 -08:00
rishikanthc
3bbcbcfd63 fix: responsive design 2025-12-15 13:36:12 -08:00
rishikanthc
7deff5b903 fix: change logo.. fix chat 2025-12-14 19:09:52 -08:00
rishikanthc
57232d9c06 fix(api): return graceful empty responses instead of 400/404
- GetTranscript returns 200 with available=false when transcript not ready
- GetJobExecutionData returns 200 with available=false when no execution
- GetJobLogs returns JSON with available=false when no logs exist
- Updated frontend hooks to handle new response format with available field
- Added .gitignore entries for prompt.txt and .agent folder
2025-12-14 19:09:52 -08:00
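The "graceful empty response" behavior described in 57232d9c06 might look roughly like the following. This is a sketch under assumptions: the commit names only the `available` field, so the `transcript` field, the struct, and the function name here are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// transcriptResponse is a hypothetical payload shape. Only the
// "available" field is named in the commit message.
type transcriptResponse struct {
	Available  bool    `json:"available"`
	Transcript *string `json:"transcript,omitempty"`
}

// buildTranscriptResponse always answers 200. A missing transcript is
// reported through available=false rather than a 400/404 status.
func buildTranscriptResponse(transcript *string) (int, []byte, error) {
	body, err := json.Marshal(transcriptResponse{
		Available:  transcript != nil,
		Transcript: transcript,
	})
	return http.StatusOK, body, err
}

func main() {
	status, body, _ := buildTranscriptResponse(nil)
	fmt.Println(status, string(body))

	text := "hello"
	status, body, _ = buildTranscriptResponse(&text)
	fmt.Println(status, string(body))
}
```

A client can then branch on `available` instead of treating "not ready yet" as an error path.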
rishikanthc
8cb6c394c8 fix(streaming): add proper headers for real-time chunk delivery
- Add Transfer-Encoding chunked and X-Accel-Buffering headers to chat and summarize handlers
- Start response immediately with c.Status(http.StatusOK)
- Fix SummaryDialog: wider desktop, reading font, no inner border, darker text
- Add generating animation while waiting for first LLM chunk
2025-12-14 19:09:52 -08:00
rishikanthc
9e7fec288e feat: show context length, thinking messages separately, automatic context management 2025-12-14 19:09:52 -08:00
rishikanthc
4f78395091 fix: testing fix for long transcript chat sessions 2025-12-14 19:09:52 -08:00
rishikanthc
e5cbd43d6c fix: testing fix for long transcript chat sessions 2025-12-14 19:09:52 -08:00
rishikanthc
ebdd4eced3 fix: errors on empty 2025-12-14 19:09:52 -08:00
rishikanthc
3a0f4fb9bc feat: implement per-job SSE for real-time status updates
- Implement SSE Broadcaster with job-based subscription support
- Add /api/v1/events endpoint for SSE streaming
- Update transcription service and handlers to broadcast job events
- Implement frontend per-job SSE connection logic
- Remove legacy polling from audio list hooks
- Fix server shutdown deadlock issue
2025-12-14 19:09:52 -08:00
rishikanthc
94bb19e774 feat: audio streaming, visualizer, and secure cookie auth
- Implemented HTTP Range Requests for audio streaming
- Added EmberPlayer and AudioVisualizer components
- Implemented secure HttpOnly cookie authentication
- Added production scaffolding (APP_ENV, Secure flag)
- Fixed layout jitter and normalized UI
- Fixed 400 error on /speakers endpoint
2025-12-14 19:09:52 -08:00
rishikanthc
503a6d714f feat: stream audio in chunks - new ember player and visualizer 2025-12-14 19:09:52 -08:00
rishikanthc
bcb22af50d docs: update swagger documentation for delta sync API 2025-12-07 15:22:16 -08:00
rishikanthc
d752012a76 feat: implement server-side delta sync with soft deletes and updated_after param 2025-12-07 15:22:16 -08:00
ET
91af22bfd8 Configurable OpenAI API Base URL
Fix for enhancement issue #194

Added option to use custom OpenAI API base URL.

If not configured the default OpenAI API base URL (https://api.openai.com/v1) will be used.

Does not change the current behavior of apiKey, i.e., if apiKey is already configured it does not have to be re-entered when modifying the base URL.
2025-12-06 12:32:03 -08:00
ET
bde45ddb6a Use usermappings in chat
If the DB contains usermappings for the session, these are sent to the LLM instead of generic names.
2025-12-05 09:46:34 -08:00
rishikanthc
90ba898e63 adding debug statements to understand transcript injection in chats 2025-12-03 10:34:10 -08:00
Edris
3048046b51 Instead of full raw transcript JSON we send formatted text to LLM
Instead of sending transcript (the raw transcript JSON) we send cleanTranscript (formatted, and uses far fewer tokens) to the LLM.

Before we sent transcript:
"{"text":" Testar lite, ett, två, tre.","language":"sv","segments":[{"start":1.347,"end":4.253,"text":" Testar lite, ett, två, tre.","speaker":"SPEAKER_00"}],"word_segments":[{"start":1.347,"end":2.013,"word":"Testar","score":0.887,"speaker":"SPEAKER_00"},{"start":2.094,"end":2.74,"word":"lite,","score":0.936,"speaker":"SPEAKER_00"},{"start":2.76,"end":3.002,"word":"ett,","score":0.984,"speaker":"SPEAKER_00"},{"start":3.103,"end":3.668,"word":"två,","score":0.859,"speaker":"SPEAKER_00"},{"start":3.688,"end":4.253,"word":"tre.","score":0.88,"speaker":"SPEAKER_00"}],"confidence":0,"processing_tim... ..."

Now we send cleanTranscript:
"[SPEAKER_00] [00:00:01 - 00:00:04] Testar lite, ett, två, tre."
2025-12-03 09:33:34 -08:00
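The conversion described in 3048046b51 can be sketched as follows. This is an illustration under assumptions, not the committed code: the `Segment` struct mirrors only the fields visible in the raw JSON example, and `formatCleanTranscript` and `clock` are hypothetical names. Timestamps are truncated to whole seconds, which is consistent with the example (1.347 becomes 00:00:01 and 4.253 becomes 00:00:04).

```go
package main

import (
	"fmt"
	"strings"
)

// Segment mirrors the per-segment fields shown in the raw transcript JSON.
type Segment struct {
	Start   float64 `json:"start"`
	End     float64 `json:"end"`
	Text    string  `json:"text"`
	Speaker string  `json:"speaker"`
}

// clock renders seconds as HH:MM:SS, dropping the fractional part.
func clock(seconds float64) string {
	total := int(seconds)
	return fmt.Sprintf("%02d:%02d:%02d", total/3600, (total%3600)/60, total%60)
}

// formatCleanTranscript emits one "[SPEAKER] [start - end] text" line
// per segment, which is far shorter than the word-level JSON.
func formatCleanTranscript(segments []Segment) string {
	lines := make([]string, 0, len(segments))
	for _, s := range segments {
		lines = append(lines, fmt.Sprintf("[%s] [%s - %s] %s",
			s.Speaker, clock(s.Start), clock(s.End), strings.TrimSpace(s.Text)))
	}
	return strings.Join(lines, "\n")
}

func main() {
	fmt.Println(formatCleanTranscript([]Segment{
		{Start: 1.347, End: 4.253, Text: " Testar lite, ett, två, tre.", Speaker: "SPEAKER_00"},
	}))
}
```

Dropping `word_segments` and per-word scores is where most of the token savings would come from, since each word otherwise carries its own timing and speaker fields.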
rishikanthc
d72de18a55 feat: add gpt-4o support, fix response formats, and add ui warnings for timestamp limitations 2025-12-01 14:00:33 -08:00
rishikanthc
dd050eae17 fix: add auth header to openai validation and support server-side key fallback 2025-12-01 14:00:33 -08:00
rishikanthc
82b9a68d7d feat: enhance OpenAI integration with logging and API key validation 2025-12-01 14:00:33 -08:00
rishikanthc
312716816e feat: server-side sorting and searching for audio list 2025-11-30 19:29:21 -08:00
rishikanthc
3e86b72987 docs: fix api docs generation, navigation, and content inaccuracies 2025-11-30 13:00:48 -08:00
rishikanthc
009836dd03 feat(cli): add auto-install script, settings tab, and fix macos crash 2025-11-29 10:21:47 -08:00
rishikanthc
cf002fd560 feat(cli): add auto-install script and binaries serving 2025-11-29 10:21:47 -08:00
rishikanthc
5837a99734 feat(backend): ensure file upload endpoint supports CLI ingestion 2025-11-29 10:21:47 -08:00
rishikanthc
5e571f08ac feat(backend): add CLI auth handshake endpoints 2025-11-29 10:21:47 -08:00
rishikanthc
96e86b9b4f fixing backward compatibility 2025-11-27 11:06:07 -08:00
rishikanthc
a65f59167b fix chat context injection 2025-11-27 10:24:04 -08:00
rishikanthc
99031c5054 major refactor 2025-11-26 19:45:31 -08:00
rishikanthc
2de6a3ae0c closes #253 2025-11-24 20:30:58 -08:00
rishikanthc
a9db82be2f closes #253 2025-11-24 20:30:17 -08:00
rishikanthc
9ad7d32bd6 closes #218 and closes #237 - summary generation timeout. 2025-11-24 19:08:34 -08:00
rishikanthc
1ddab6d6cf closes #270 and closes #231 2025-11-24 15:15:29 -08:00
Geoff Tognetti
c2d29fc9c6 Fix YouTube downloads - Add Deno runtime for video cipher decryption
YouTube downloads were failing with "exit status 1" error. Root cause:
YouTube now requires yt-dlp to use a JavaScript runtime for video cipher
decryption.

Changes:
- Install Deno runtime in both Dockerfiles (standard and CUDA)
- Upgrade from yt-dlp to yt-dlp[default] to include all optional dependencies
- Add stderr capture to YouTube download handler for better error diagnostics
- Add performance logging for YouTube downloads (timing and file size)

Fixes #224

See: https://github.com/yt-dlp/yt-dlp/issues/14404
2025-11-15 14:06:57 -08:00
rishikanthc
2b24e08055 improves logging 2025-09-11 10:32:36 -07:00
rishikanthc
0534304c23 better progress status for multi-track audio transcription 2025-09-11 09:19:25 -07:00
rishikanthc
81eb280da4 fixes pyannote diarization with the new unified arch 2025-09-10 21:25:37 -07:00