# Setup Guide

## Prerequisites

| Component | Version | Notes |
|---|---|---|
| Python | 3.11+ | Windows install from python.org |
| NVIDIA GPU driver | Latest | RTX series recommended |
| CUDA toolkit | 12.x | Required by faster-whisper |
| Mosquitto | 2.x | MQTT broker |
| WhisperLiveKit | Latest | `pip install whisperlivekit` |
| PlatformIO | Latest | Via VS Code extension |

---

## 1 — Install Mosquitto (MQTT broker)

Download from mosquitto.org and install with default settings. Start the service:

```
net start mosquitto
```

Verify it's running:

```
mosquitto_sub -h localhost -t "#" -v
```

The command should connect and wait without printing an error. Press Ctrl+C to exit.

> **Network note:** Mosquitto 2.x accepts connections only from the local
> machine unless a listener is configured. For the ESP32 to reach the broker,
> add `listener 1883` and either `allow_anonymous true` or password
> authentication to `mosquitto.conf`, then restart the service.

---

## 2 — Install WhisperLiveKit

```
pip install whisperlivekit
```

Start the server with diarization enabled:

```
wlk --model large-v3 --language en --diarization
```

The first run downloads the model (~3 GB). The WebSocket will be available at `ws://localhost:8000/asr`. Verify by opening `http://localhost:8000` in a browser.

> **Latency note:** If `large-v3` is too slow on your GPU, try
> `--model distil-large-v3` for similar accuracy at lower latency.

---

## 3 — Install the Python bridge

```
cd bridge
pip install -r requirements.txt
```

Run it:

```
python bridge.py
```

A small window opens for assigning friendly names to auto-detected speakers (SPEAKER_00, SPEAKER_01, …). The defaults (Pastor, Reader, Guest, Choir) are applied immediately; edit them if your service has different roles.

---

## 4 — Flash the ESP32

1. Open the `esp32/` folder in VS Code with the PlatformIO extension installed.
2. Edit `src/main.cpp` and fill in your WiFi credentials and the PC's IP address:
   ```cpp
   #define WIFI_SSID     "YourNetwork"
   #define WIFI_PASSWORD "YourPassword"
   #define MQTT_HOST     "192.168.1.100"  // run `ipconfig` on the PC to find this
   ```
3. Select the correct environment in PlatformIO:
   - `esp32dev` for ESP32-WROOM-32
   - `esp32-s3` for ESP32-S3 (recommended for larger RAM)
4. Click **Upload**. Open Serial Monitor at 115200 baud to see boot messages.

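
Before moving on to the end-to-end test, it can help to confirm that both services are actually listening. The following is a minimal sketch using only the Python standard library. It assumes Mosquitto is on its default port 1883 and WhisperLiveKit is on port 8000 as in step 2; the names `port_open`, `check_services`, and `SERVICES` are illustrative and not part of the project.

```python
import socket

# Assumed endpoints: 1883 is Mosquitto's default port, 8000 matches the
# WhisperLiveKit URL in step 2. Adjust if your setup differs.
SERVICES = {
    "Mosquitto": ("localhost", 1883),
    "WhisperLiveKit": ("localhost", 8000),
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False


def check_services(services=SERVICES) -> dict:
    """Map each service name to whether its port accepted a connection."""
    return {name: port_open(host, port) for name, (host, port) in services.items()}


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```

Running the same check with the PC's LAN address (the value used for `MQTT_HOST`) instead of `localhost` gives a rough indication of whether the ESP32 will be able to reach the broker. A successful TCP connection does not prove that the broker will accept an MQTT session.
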

---

## 5 — End-to-end test

Run these checks in order:

1. **Whisper standalone** — speak into the mic, verify text appears at `http://localhost:8000`.
2. **MQTT manually** — with the ESP32 connected, publish a test message:
   ```
   mosquitto_pub -h localhost -t display/text -m "{\"lines\":[\"Line one\",\"Line two\",\"Line three\"]}"
   ```
   The display should refresh within ~2 seconds.
3. **Full pipeline** — start the bridge, speak naturally. Text should appear on the display within 3–5 seconds of speech.
4. **Speaker labels** — if two people speak alternately, `[PASTOR]` / `[READER]` labels should appear as speaker changes are detected.

---

## 6 — Deployment checklist

- [ ] PC set to never sleep during services
- [ ] Mosquitto service set to start automatically (`sc config mosquitto start=auto`)
- [ ] WhisperLiveKit added to Windows startup (Task Scheduler or a `.bat` file)
- [ ] ESP32 powered from a USB wall adapter (not PC USB, to avoid dependency)
- [ ] Static IP assigned to ESP32 in router DHCP settings
- [ ] Audio input confirmed — direct mixer feed preferred over microphone
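
The test message from the manual MQTT check in step 5 can also be generated in Python, which avoids hand-escaping the JSON. The sketch below shows one way to build a `{"lines": [...]}` payload with a speaker label. It is not the bridge's actual code: the speaker-to-name order, the 40-character width, the three-line limit, and the function names are all assumptions made for illustration.

```python
import json
import textwrap

# Assumed mapping from diarization IDs to the default friendly names;
# the real bridge may order or store these differently.
DEFAULT_NAMES = {
    "SPEAKER_00": "Pastor",
    "SPEAKER_01": "Reader",
    "SPEAKER_02": "Guest",
    "SPEAKER_03": "Choir",
}


def label_text(speaker_id: str, text: str, names=DEFAULT_NAMES) -> str:
    """Prefix text with an upper-case label such as [PASTOR]."""
    # Unknown IDs fall back to the raw diarization ID.
    name = names.get(speaker_id, speaker_id)
    return f"[{name.upper()}] {text}"


def build_payload(text: str, width: int = 40, max_lines: int = 3) -> str:
    """Wrap text to the assumed display width and keep the last max_lines lines."""
    lines = textwrap.wrap(text, width=width)[-max_lines:]
    return json.dumps({"lines": lines})


if __name__ == "__main__":
    message = label_text("SPEAKER_00", "Welcome, everyone, to this morning's service.")
    print(build_payload(message))
```

The printed string has the same shape as the `-m` argument in the manual test, so it can be published to `display/text` with `mosquitto_pub` or any MQTT client library.
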