docs: add app startup troubleshooting section

rishikanthc
2025-12-17 11:20:50 -08:00
parent 9925e12d26
commit a066a73c7b


@@ -23,7 +23,7 @@ Open [http://localhost:8080](http://localhost:8080) in your browser.
Scriberr works out of the box. However, for Homebrew or manual installations, you can customize the application behavior using environment variables or a `.env` file placed in the same directory as the binary (or where you run the command from).
> **Docker Users:** You can ignore this section if you are using `docker-compose.yml`, as these values are set in the `environment` section there.
> **Docker Users:** You can ignore this section if you are using `docker-compose.yml`, as these values are already configured with sane defaults.
### Environment Variables
@@ -73,11 +73,18 @@ services:
ports:
- "8080:8080"
volumes:
- scriberr_data:/app/data
- scriberr_data:/app/data # volume for data
- env_data:/app/whisperx-env # volume for models and python envs
environment:
- APP_ENV=production # DO NOT CHANGE THIS
# CORS: comma-separated list of allowed origins for production
# - ALLOWED_ORIGINS=https://your-domain.com
# - SECURE_COOKIES=false # Uncomment this ONLY if you are not using SSL
restart: unless-stopped
volumes:
scriberr_data:
scriberr_data: {}
env_data: {}
```
2. Run the container:
@@ -94,14 +101,14 @@ If you have a compatible NVIDIA GPU, this configuration enables hardware acceler
2. Create a file named `docker-compose.cuda.yml`:
```yaml
version: "3.9"
services:
scriberr:
image: ghcr.io/rishikanthc/scriberr:v1.0.4-cuda
ports:
- "8080:8080"
volumes:
- scriberr_data:/app/data
- scriberr_data:/app/data # volume for data
- env_data:/app/whisperx-env # volume for models and python envs
restart: unless-stopped
deploy:
resources:
@@ -114,9 +121,14 @@ services:
environment:
- NVIDIA_VISIBLE_DEVICES=all
- NVIDIA_DRIVER_CAPABILITIES=compute,utility
- APP_ENV=production # DO NOT CHANGE THIS
# CORS: comma-separated list of allowed origins for production
# - ALLOWED_ORIGINS=https://your-domain.com
# - SECURE_COOKIES=false # Uncomment this ONLY if you are not using SSL
volumes:
scriberr_data: {}
env_data: {}
```
3. Run the container with the CUDA configuration:
@@ -124,3 +136,75 @@ volumes:
```bash
docker compose -f docker-compose.cuda.yml up -d
```
## App Startup
The first time you run Scriberr, startup may take several minutes. In the example log below it takes a little over five minutes (02:50:36 to 02:55:54). This is normal!
On first launch, the application sets up its Python environments (WhisperX, PyAnnote, and the NVIDIA models) and downloads the required machine learning models (NVIDIA Sortformer, NVIDIA Canary, NVIDIA Parakeet).
**Subsequent runs will be much faster** because all models and environments are persisted to the `env_data` volume (or your local mapped folders).
You will know the application is ready when you see the line: `msg="Scriberr is ready" url=http://0.0.0.0:8080`.
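If you want a script to wait for that line rather than watching the log by hand, something like the following sketch could work. `wait_for_ready` is a hypothetical helper, not part of Scriberr; it assumes the log format shown in this section and reads log lines from standard input.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Scriberr).
# Reads log lines on stdin and returns 0 as soon as the readiness
# message appears, or 1 if the stream ends without it.
wait_for_ready() {
  local line
  while IFS= read -r line; do
    case "$line" in
      *'msg="Scriberr is ready"'*) return 0 ;;
    esac
  done
  return 1
}
```

For example, with the Compose service named `scriberr` as in the files above, `docker compose logs -f scriberr | wait_for_ready && echo "Scriberr is up"` prints a message once the line is seen. Note that in follow mode this waits indefinitely if the message never appears, and that logs kept from an earlier start may already contain the line.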
**Example Startup Log:**
```text
scriberr | === Scriberr Container Setup ===
scriberr | Requested UID: 10001, GID: 10001
scriberr | Setting up custom user with UID=10001, GID=10001...
scriberr | Group with GID 10001 already exists, using it
scriberr | usermod: no changes
scriberr | Setting up data directories...
scriberr | === Setup Complete ===
scriberr | Switching to user appuser (UID=10001, GID=10001) and starting application...
scriberr | time=02:50:36 level="INFO " msg="Starting Scriberr" version=dev
scriberr | [+] Loading configuration
scriberr | time=02:50:36 level="INFO " msg="Registering adapters with environment path" whisperx_env=/app/whisperx-env
scriberr | time=02:50:36 level="INFO " msg="Adapter registration complete"
scriberr | [+] Connecting to database
scriberr | [+] Setting up authentication
scriberr | [+] Initializing SSE broadcaster
scriberr | [+] Initializing repositories
scriberr | [+] Initializing services
scriberr | [+] Initializing transcription service
scriberr | [+] Initializing transcription service
scriberr | [+] Preparing Python environment
scriberr | time=02:50:36 level="INFO " msg="Initializing unified transcription service"
scriberr | time=02:50:36 level="INFO " msg="Initializing registered models in parallel..."
scriberr | time=02:50:36 level="INFO " msg="Preparing NVIDIA Sortformer environment" env_path=/app/whisperx-env/parakeet
scriberr | time=02:50:36 level="INFO " msg="transcription model initialized" model_id=openai_whisper
scriberr | time=02:50:36 level="INFO " msg="Preparing NVIDIA Canary environment" env_path=/app/whisperx-env/parakeet
scriberr | time=02:50:36 level="INFO " msg="Preparing PyAnnote environment" env_path=/app/whisperx-env/pyannote
scriberr | time=02:50:36 level="INFO " msg="Preparing NVIDIA Parakeet environment" env_path=/app/whisperx-env/parakeet
scriberr | time=02:50:36 level="INFO " msg="Preparing WhisperX environment" env_path=/app/whisperx-env
scriberr | time=02:50:36 level="INFO " msg="Installing PyAnnote dependencies"
scriberr | time=02:50:36 level="INFO " msg="Parakeet environment not ready, setting up"
scriberr | time=02:50:36 level="INFO " msg="Installing Canary dependencies"
scriberr | time=02:50:36 level="INFO " msg="Installing Parakeet dependencies"
scriberr | time=02:50:36 level="INFO " msg="Downloading Sortformer model" path=/app/whisperx-env/parakeet/diar_streaming_sortformer_4spk-v2.nemo
Downloading diar_streaming_sortformer_4spk-v2.nemo: 100% (449.5 MB / 449.5 MB)
scriberr | time=02:50:53 level="INFO " msg="Successfully downloaded Sortformer model" size=471367680
scriberr | time=02:50:53 level="INFO " msg="Sortformer environment prepared successfully"
scriberr | time=02:50:53 level="INFO " msg="diarization model initialized" model_id=sortformer
scriberr | time=02:53:11 level="INFO " msg="WhisperX environment prepared successfully"
scriberr | time=02:53:11 level="INFO " msg="transcription model initialized" model_id=whisperx
scriberr | time=02:53:14 level="INFO " msg="PyAnnote environment prepared successfully"
scriberr | time=02:53:14 level="INFO " msg="diarization model initialized" model_id=pyannote
scriberr | time=02:53:28 level="INFO " msg="Downloading Canary model" path=/app/whisperx-env/parakeet/canary-1b-v2.nemo
scriberr | time=02:53:28 level="INFO " msg="Downloading Parakeet model" path=/app/whisperx-env/parakeet/parakeet-tdt-0.6b-v3.nemo
Downloading parakeet-tdt-0.6b-v3.nemo: 100% (2.3 GB / 2.3 GB)
scriberr | time=02:54:37 level="INFO " msg="Successfully downloaded Parakeet model" size=2509332480
scriberr | time=02:54:37 level="INFO " msg="Created buffered transcription script" path=/app/whisperx-env/parakeet/transcribe_buffered.py
scriberr | time=02:54:37 level="INFO " msg="Parakeet environment prepared successfully"
scriberr | time=02:54:37 level="INFO " msg="transcription model initialized" model_id=parakeet
Downloading canary-1b-v2.nemo: 100% (5.9 GB / 5.9 GB)
scriberr | time=02:55:54 level="INFO " msg="Successfully downloaded Canary model" size=6358958080
scriberr | time=02:55:54 level="INFO " msg="Canary environment prepared successfully"
scriberr | time=02:55:54 level="INFO " msg="transcription model initialized" model_id=canary
scriberr | time=02:55:54 level="INFO " msg="Model initialization completed"
scriberr | time=02:55:54 level="INFO " msg="Unified transcription service initialized successfully"
scriberr | [+] Initializing quick transcription service
scriberr | [+] Starting background processing
scriberr | time=02:55:54 level="INFO " msg="Scriberr is ready" url=http://0.0.0.0:8080
```
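As a rough guide to how much space the `env_data` volume needs for models, the `size=` fields of the download lines in the log above can be totalled. The sketch below is illustrative only: `sum_model_bytes` is a hypothetical helper that assumes this exact log format, and it counts the model files only, not the Python environments installed alongside them.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Scriberr).
# Reads log lines on stdin and prints the sum, in bytes, of the size=
# fields found on "Successfully downloaded" lines.
sum_model_bytes() {
  awk '/Successfully downloaded/ {
         for (i = 1; i <= NF; i++)
           if ($i ~ /^size=[0-9]+$/)
             total += substr($i, 6)   # drop the leading "size="
       }
       END { printf "%.0f\n", total }'
}
```

For the three downloads in the example log (471367680 + 2509332480 + 6358958080 bytes), the total is 9339658240 bytes, or about 8.7 GiB.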