# Setup Guide

## Prerequisites

| Component | Version | Notes |
|---|---|---|
| Python | 3.11+ | Windows install from python.org |
| NVIDIA GPU driver | Latest | RTX series recommended |
| CUDA toolkit | 12.x | Required by faster-whisper |
| Mosquitto | 2.x | MQTT broker |
| WhisperLiveKit | Latest | `pip install whisperlivekit` |
| PlatformIO | Latest | Via VS Code extension |

---

## 1 — Install Mosquitto (MQTT broker)

Download the Windows installer from mosquitto.org and run it with the default settings.
Start the service from an Administrator command prompt:

```
net start mosquitto
```

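Verify it's running by subscribing to every topic. The command blocks and prints
nothing while the broker is idle; a connection error means the service has not
started. Press Ctrl+C to exit: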

```
mosquitto_sub -h localhost -t "#" -v
```
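
If a scripted check is more convenient than an open subscriber, a minimal Python sketch like the one below can test whether anything is listening on the broker port. It assumes Mosquitto's default port 1883 and only confirms that a TCP connection succeeds; it does not perform an MQTT handshake. The function name `broker_reachable` is hypothetical.

```python
import socket


def broker_reachable(host: str = "localhost", port: int = 1883,
                     timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("Mosquitto reachable:", broker_reachable())
```

A `False` result usually means the service is stopped or a firewall is blocking the port.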

---

## 2 — Install WhisperLiveKit

```
pip install whisperlivekit
```

Start the server with diarization enabled:

```
wlk --model large-v3 --language en --diarization
```

The first run downloads the model (~3 GB). The WebSocket will be available at
`ws://localhost:8000/asr`. Verify by opening `http://localhost:8000` in a browser.

> **Latency note:** If `large-v3` is too slow on your GPU, try
> `--model distil-large-v3` for similar accuracy at lower latency.
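
Because the first run has to download the model, the server may take a while before it answers. The sketch below polls the web page until it responds, which can be useful in a startup script. The function name `wait_for_server`, the retry count, and the delay are assumptions, not part of WhisperLiveKit.

```python
import http.client
import time
import urllib.request

# Bypass any system proxy: the server runs on this machine.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_for_server(url: str = "http://localhost:8000",
                    attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll url until it answers with HTTP 200, or give up after `attempts` tries."""
    for attempt in range(attempts):
        try:
            with _OPENER.open(url, timeout=3.0) as response:
                if response.status == 200:
                    return True
        except (OSError, http.client.HTTPException):
            pass  # not listening yet, or the connection was cut short
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


if __name__ == "__main__":
    print("Server ready:", wait_for_server(attempts=3))
```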

---

## 3 — Install the Python bridge

```
cd bridge
pip install -r requirements.txt
```

Run it:

```
python bridge.py
```

A small window opens for assigning friendly names to auto-detected speakers
(SPEAKER_00, SPEAKER_01, …). The defaults (Pastor, Reader, Guest, Choir) are
applied immediately — edit them if your service has different roles.
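
To make the naming step concrete, here is a minimal sketch of how a diarization label could be mapped to a friendly name. It is not taken from `bridge.py`: the function `friendly_name` is hypothetical, and it assumes the defaults are assigned in order (SPEAKER_00 to Pastor, SPEAKER_01 to Reader, and so on).

```python
# Default roles from this guide, assumed to map to speakers in order.
DEFAULT_NAMES = ("Pastor", "Reader", "Guest", "Choir")


def friendly_name(speaker_id: str, names=DEFAULT_NAMES) -> str:
    """Map 'SPEAKER_00' to names[0], 'SPEAKER_01' to names[1], and so on.

    Unknown or out-of-range IDs are returned unchanged.
    """
    try:
        index = int(speaker_id.rsplit("_", 1)[1])
    except (IndexError, ValueError):
        return speaker_id  # no numeric suffix to work with
    if 0 <= index < len(names):
        return names[index]
    return speaker_id


if __name__ == "__main__":
    for sid in ("SPEAKER_00", "SPEAKER_01", "SPEAKER_07"):
        print(sid, "->", friendly_name(sid))
```

Falling back to the raw ID keeps a fifth or sixth detected speaker visible instead of silently mislabeling them.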

---

## 4 — Flash the ESP32

1. Open the `esp32/` folder in VS Code with the PlatformIO extension installed.
2. Edit `src/main.cpp` — fill in your WiFi credentials and the PC's IP address:

   ```cpp
   #define WIFI_SSID     "YourNetwork"
   #define WIFI_PASSWORD "YourPassword"
   #define MQTT_HOST     "192.168.1.100"   // run `ipconfig` on the PC to find this
   ```

3. Select the correct environment in PlatformIO:
   - `esp32dev` for ESP32-WROOM-32
   - `esp32-s3` for ESP32-S3 (recommended for larger RAM)

4. Click **Upload**. Open Serial Monitor at 115200 baud to see boot messages.
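
As an alternative to reading the `ipconfig` output for `MQTT_HOST` in step 2, the following Python sketch asks the operating system which local address it would use for outbound traffic. The function name `lan_ip` is hypothetical. On a PC with several network adapters it reports the one that carries the default route, which may not be the adapter the ESP32 can reach.

```python
import socket


def lan_ip() -> str:
    """Return the local IPv4 address used for outbound traffic.

    Connecting a UDP socket sends no packets; it only makes the OS pick a
    route, so the target address (a reserved documentation address) is never
    contacted.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.connect(("192.0.2.1", 9))
            return sock.getsockname()[0]
        except OSError:
            return "127.0.0.1"  # no usable route; fall back to loopback


if __name__ == "__main__":
    print("Use this for MQTT_HOST:", lan_ip())
```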

---

## 5 — End-to-end test

Run these checks in order:

1. **Whisper standalone** — speak into the mic, verify text appears at
   `http://localhost:8000`.

2. **MQTT manually** — with the ESP32 connected, publish a test message:

   ```
   mosquitto_pub -h localhost -t display/text -m "{\"lines\":[\"Line one\",\"Line two\",\"Line three\"]}"
   ```

   The display should refresh within ~2 seconds.

3. **Full pipeline** — start the bridge, speak naturally. Text should appear on
   the display within 3–5 seconds of speech.

4. **Speaker labels** — if two people speak alternately, `[PASTOR]` / `[READER]`
   labels should appear as speaker changes are detected.

---

## 6 — Deployment checklist

- [ ] PC set to never sleep during services
- [ ] Mosquitto service set to start automatically (`sc config mosquitto start= auto` from an elevated prompt; `sc` expects the space after `start=`)
- [ ] WhisperLiveKit added to Windows startup (Task Scheduler or a `.bat` file)
- [ ] ESP32 powered from a USB wall adapter rather than a PC USB port, so the display stays powered if the PC sleeps or reboots
- [ ] Fixed IP addresses reserved in the router's DHCP settings for the PC (its address is hard-coded as `MQTT_HOST`) and the ESP32
- [ ] Audio input confirmed — direct mixer feed preferred over microphone