updates readme

rishikanthc
2025-08-29 11:03:41 -07:00
parent 15aaf4cf07
commit 203dac7c61
2 changed files with 185 additions and 25 deletions

193
README.md

@@ -1,44 +1,187 @@
# Scriberr
<div align="center">
Ever wished you could just record something and get an accurate transcript with speaker identification? That's exactly what Scriberr does.
<img alt="Scriberr" src="assets/scriberr-logo.svg" width="480" />
## What is this?
Self-hosted, offline transcription: transcribe audio, summarize, annotate, and build on a clean REST API.
Scriberr is a web app that takes your audio files and turns them into detailed transcripts. It uses WhisperX under the hood, so you get:
[Website](https://scriberr.app) • [Docs](https://scriberr.app/docs/intro.html) • [API Reference](https://scriberr.app/api.html) • [Changelog](https://scriberr.app/changelog.html)
- **Accurate transcription** - WhisperX is really good at this
- **Speaker diarization** - It figures out who said what
- **Multiple formats** - Get your transcript as text, SRT, VTT, or JSON
- **Chat with your audio** - Ask questions about what was said using AI
- **Take notes** - Annotate important parts as you listen
</div>
## Quick start
---
The easiest way to get started:
## Introduction
Scriberr is a self-hosted, offline transcription app for converting audio into text. Record or upload audio, get it transcribed, and quickly summarize or chat using your preferred LLM provider. Scriberr runs on modern CPUs (no GPU required, though GPUs can accelerate processing) and offers a range of tradeoffs between speed and transcription quality.
- Built with React (frontend) and Go (backend), packaged as a single binary
- Uses WhisperX with open-source Whisper models for accurate transcription
- Clean, distraction-free UI optimized for reading and working with transcripts
<p align="center">
<img alt="Scriberr homepage" src="screenshots/scriberr-homepage.png" width="720" />
</p>
## Features
- Accurate transcription with word-level timing
- Speaker diarization (identify and label speakers)
- Transcript reader with playback follow-along and seek-from-text
- Highlights and lightweight note-taking (jump note → audio/transcript)
- Summarize and chat over transcripts (OpenAI or local models via Ollama)
- Transcription profiles for reusable configurations
- YouTube video transcription (paste a link and transcribe)
- Quick transcribe (ephemeral) and batch upload
- REST API coverage for all major features + API key management
- Download transcripts as JSON/SRT/TXT (and more)
## Screenshots
<details>
<summary>Show screenshots</summary>
<p align="center">
<img alt="Transcript view" src="screenshots/scriberr-transcript page.png" width="720" />
</p>
<p align="center"><em>Minimal transcript reader with playback follow-along and seek-from-text.</em></p>
<p align="center">
<img alt="Summarize transcripts" src="screenshots/scriberr-summarize transcripts.png" width="720" />
</p>
<p align="center"><em>Summarize long recordings and use custom prompts.</em></p>
<p align="center">
<img alt="API key management" src="screenshots/scriberr-api-key-management.png" width="720" />
</p>
<p align="center"><em>Generate and manage API keys for the REST API.</em></p>
<p align="center">
<img alt="YouTube video transcription" src="screenshots/scriberr-youtube-video.png" width="720" />
</p>
<p align="center"><em>Transcribe audio directly from a YouTube link.</em></p>
</details>
## Installation
Visit the website for the full guide: https://scriberr.app/docs/installation.html
### Homebrew (macOS & Linux)
```bash
brew tap rishikanthc/scriberr
brew install scriberr
# Start the server
scriberr
```
Then just run `scriberr` and open http://localhost:8080 in your browser.
Open http://localhost:8080 in your browser.
## What you can do
Optional configuration via .env (sensible defaults provided):
- Upload audio files or record directly in the app
- Get transcripts with timestamps and speaker labels
- Download transcripts in whatever format you need
- Chat with AI about your recordings
- Create summaries of long audio
- Manage everything through a clean web interface
```env
# Server
HOST=localhost
PORT=8080

# Storage
DATABASE_PATH=./data/scriberr.db
UPLOAD_DIR=./data/uploads
WHISPERX_ENV=./data/whisperx-env

# Custom paths (if needed)
UV_PATH=/custom/path/to/uv
```
## Requirements
- Python 3.11+ (for the transcription engine)
- A few GB of disk space for the AI models
That's it. Everything else is handled for you.
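As a rough illustration of how the optional `.env` values above behave, the sketch below resolves the server address from `HOST` and `PORT`, falling back to `localhost` and `8080` when they are unset. It assumes the values shown in the `.env` example are also the built-in defaults; `scriberr_url` is a hypothetical helper written for this README, not something Scriberr ships.

```bash
# Hypothetical helper (not part of Scriberr): print the URL the server
# should be reachable at. Falls back to localhost:8080, which is assumed
# to match the defaults shown in the .env example.
scriberr_url() {
  printf 'http://%s:%s\n' "${HOST:-localhost}" "${PORT:-8080}"
}

# Uses HOST/PORT from the environment if set, otherwise the fallbacks.
scriberr_url

# Override the port for a single call; the subshell keeps the change local.
( PORT=9090; scriberr_url )
```

The `${VAR:-default}` expansion leaves the environment untouched, so the same helper works whether or not a `.env` file has been loaded into the shell.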
### Docker
## Built with
Multiline example:
Go backend, React frontend, WhisperX for transcription, and a lot of coffee.
```bash
docker run -d \
  --name scriberr \
  -p 8080:8080 \
  -v scriberr_data:/app/data \
  --restart unless-stopped \
  ghcr.io/rishikanthc/scriberr:latest
```
Docker Compose:
```yaml
version: '3.9'
services:
  scriberr:
    image: ghcr.io/rishikanthc/scriberr:latest
    container_name: scriberr
    ports:
      - "8080:8080"
    volumes:
      - scriberr_data:/app/data
    restart: unless-stopped

volumes:
  scriberr_data:
```
Then open http://localhost:8080.
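The container may take a moment to start accepting connections. A minimal sketch of a readiness check follows, assuming the web UI answers plain HTTP GET requests on the published port; `wait_for_http` is a hypothetical helper, not part of Scriberr or Docker.

```bash
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# $1 = URL (default: the address published above), $2 = max attempts (default: 30)
wait_for_http() {
  url="${1:-http://localhost:8080}"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP errors (status >= 400)
    if curl --fail --silent --output /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Example (not run here): wait_for_http && echo "Scriberr is up"
```

The helper returns 0 as soon as one request succeeds and 1 if every attempt fails, so it can gate later steps in a script.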
## Diarization (speaker identification)
Scriberr uses the open-source pyannote models for local speaker diarization. Models are hosted on Hugging Face and require an access token, which is used only to download the models; diarization itself runs locally.
1) Create an account on https://huggingface.co
2) Visit and accept the user conditions for these repositories:
- https://huggingface.co/pyannote/speaker-diarization-3.0
- https://huggingface.co/pyannote/speaker-diarization
- https://huggingface.co/pyannote/speaker-diarization-3.1
- https://huggingface.co/pyannote/segmentation-3.0
Verify they appear here: https://huggingface.co/settings/gated-repos
3) Create an access token under Settings → Access Tokens and enable all permissions under “Repositories”. Keep it safe.
4) In Scriberr, when creating a profile or using Transcribe+, open the Diarization tab and paste the token into the “Hugging Face Token” field.
See the full guide: https://scriberr.app/docs/diarization.html
<p align="center">
<img alt="Diarization setup" src="screenshots/scriberr-diarization-setup.png" width="420" />
</p>
## API
Scriberr exposes a clean REST API for most features (transcription, chat, notes, summaries, admin, and more). Authentication supports JWT or API keys, depending on the endpoint.
- API Reference: https://scriberr.app/api.html
- Quick start examples (cURL and JS) on the API page
- Generate or manage API keys in the app
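A minimal sketch of calling the API from a shell with an API key. The header name `X-API-Key`, the `SCRIBERR_URL` and `SCRIBERR_API_KEY` variables, and the `scriberr_get` helper are all assumptions made for illustration; take the real header name and endpoint paths from the API Reference.

```bash
# ASSUMPTIONS: "X-API-Key" is a placeholder header name, and both variables
# below are names chosen for this example. Check the API Reference for the
# actual header and endpoint paths.
SCRIBERR_URL="${SCRIBERR_URL:-http://localhost:8080}"

# Hypothetical helper: GET an API path using the key from SCRIBERR_API_KEY.
# $1 = endpoint path copied from the API Reference
scriberr_get() {
  # ${VAR:?message} aborts with an error if the key is unset or empty,
  # so no unauthenticated request is sent by accident.
  curl --fail --silent --show-error \
    -H "X-API-Key: ${SCRIBERR_API_KEY:?set SCRIBERR_API_KEY first}" \
    "${SCRIBERR_URL}$1"
}

# Example (not run here): scriberr_get "<path from the API Reference>"
```

Reading the key from the environment keeps it out of shell history and out of scripts checked into version control.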
## Contributing
Issues and PRs are welcome. Please open an issue to discuss large changes first and keep PRs focused.
Local dev overview:
```bash
# Backend (dev)
cp -n .env.example .env || true
go run cmd/server/main.go
# Frontend (dev)
cd web/frontend
npm ci
npm run dev
# Full build (embeds UI in Go binary)
./build.sh
./scriberr
```
Coding style: `go fmt ./...`, `go vet ./...`, and `cd web/frontend && npm run lint`.
## License
Licensed under the [MIT License](LICENSE).

17
assets/scriberr-logo.svg Normal file

@@ -0,0 +1,17 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 800 200" role="img" aria-label="Scriberr">
<defs>
<linearGradient id="g" x1="0%" y1="0%" x2="100%" y2="0%">
<stop offset="0%" stop-color="#2563eb"/>
<stop offset="100%" stop-color="#a855f7"/>
</linearGradient>
</defs>
<rect width="100%" height="100%" fill="transparent"/>
<text x="50%" y="50%" dominant-baseline="middle" text-anchor="middle"
font-family="'Poiret One', 'Inter', ui-sans-serif, system-ui, -apple-system, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif"
font-size="88" font-weight="600" fill="url(#g)">Scriberr</text>
<style>
@media (prefers-color-scheme: dark) {
text { filter: drop-shadow(0 1px 0 rgba(255,255,255,0.08)); }
}
</style>
</svg>
