CLAUDE.md — AI Development Context
This file provides context for AI-assisted development sessions on the Church Live Transcription Display project.
Project Summary
A live captioning system for deaf/hard-of-hearing church congregants. A Windows PC captures audio, transcribes it locally using Whisper (GPU-accelerated), and sends rolling text over MQTT to an ESP32 driving a large e-ink display. No cloud services. No internet required during operation.
Architecture
[Audio source]
↓ (USB mic or mixer line-in)
[Windows PC]
├── WhisperLiveKit (local Whisper server, WebSocket on port 8000)
├── Mosquitto MQTT broker (port 1883)
└── bridge.py (Python: WS subscriber → sentence buffer → MQTT publisher)
↓ (WiFi / MQTT topic: display/text)
[ESP32-WROOM or S3]
└── Waveshare 7.5" V2 e-ink display (SPI, GxEPD2 library)
PC Environment
- OS: Windows 10/11
- GPU: NVIDIA RTX series (tested with RTX 4070 Super)
- Python: 3.11+
- MQTT broker: Mosquitto (localhost:1883)
- Whisper server: WhisperLiveKit (wlk --model large-v3 --language en)
- Whisper WebSocket: ws://localhost:8000/asr
ESP32 Environment
- Board: ESP32-WROOM-32 or ESP32-S3
- Framework: Arduino (via PlatformIO)
- Display: Waveshare 7.5" V2 (800×480 pixels, black/white)
- Display library: GxEPD2
- MQTT library: PubSubClient
- Build tool: PlatformIO (VSCode)
SPI Wiring (Waveshare 7.5" V2 to ESP32)
| Display Pin | ESP32 Pin |
| --- | --- |
| BUSY | GPIO 4 |
| RST | GPIO 16 |
| DC | GPIO 17 |
| CS | GPIO 5 |
| CLK | GPIO 18 |
| DIN | GPIO 23 |
| GND | GND |
| VCC | 3.3V |
MQTT Topics
| Topic | Direction | Payload |
| --- | --- | --- |
| display/text | PC → ESP32 | JSON: {"lines": ["line1", "line2", "line3"]} |
| display/clear | PC → ESP32 | Empty / any |
| display/status | ESP32 → PC | JSON: {"ready": true} |
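As a minimal sketch of the display/text payload format, the helpers below serialise a list of lines to JSON bytes and parse them back defensively. The names encode_lines and decode_lines are hypothetical and not taken from bridge.py. Only the {"lines": [...]} shape comes from the table above.

```python
import json

TOPIC_TEXT = "display/text"  # topic name from the MQTT Topics table


def encode_lines(lines):
    """Serialise display lines as a display/text payload (UTF-8 JSON bytes)."""
    return json.dumps({"lines": list(lines)}, ensure_ascii=False).encode("utf-8")


def decode_lines(payload):
    """Parse a display/text payload; return [] if it is malformed."""
    try:
        data = json.loads(payload.decode("utf-8"))
    except ValueError:  # covers both bad UTF-8 and bad JSON
        return []
    lines = data.get("lines") if isinstance(data, dict) else None
    if not isinstance(lines, list):
        return []
    return [str(line) for line in lines]
```

The bytes returned by encode_lines can be handed to whichever MQTT client the bridge uses. decode_lines is handy in tests that subscribe to the topic and check what was published.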
Key Files
bridge/bridge.py — Main Python bridge. Connects to Whisper WS, buffers text, publishes to MQTT.
esp32/src/main.cpp — ESP32 firmware. WiFi + MQTT client, renders text to e-ink.
esp32/platformio.ini — Board and library config.
Design Constraints & Decisions
Refresh Strategy
- Full e-ink refresh: ~1.5–2 seconds with flash. Acceptable for sentence-level updates.
- Partial refresh: ~300ms, some ghosting. Use for rapid updates if needed.
- Current approach: buffer until sentence boundary or 4-second silence, then push full screen update.
- Display shows 3 lines of text. New text pushes old text up; oldest line drops off.
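The scrolling behaviour described above (new text pushes old text up, oldest line drops off) can be sketched with a bounded deque. RollingDisplay is a hypothetical name used for illustration only. The 3-line limit is the one stated in this section.

```python
from collections import deque


class RollingDisplay:
    """Keeps only the most recent lines that fit on the screen."""

    def __init__(self, max_lines=3):
        # A deque with maxlen discards the oldest entry automatically.
        self._lines = deque(maxlen=max_lines)

    def push(self, new_lines):
        """Append new lines and return the lines currently visible."""
        self._lines.extend(new_lines)
        return list(self._lines)

    def clear(self):
        """Empty the screen buffer, for example on a display/clear message."""
        self._lines.clear()
```

Each call to push returns the full set of visible lines, which matches the full-screen update approach: the display is redrawn from that list rather than patched line by line.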
Text Formatting
- Target font size: large enough to read at 3–5 metres (approx 36–48px equivalent at 800px wide)
- At ~800px wide with a large font: approximately 35–45 characters per line
- Lines wrap at word boundaries
- All caps optional for readability (configurable)
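The wrapping rules above can be sketched with Python's standard textwrap module. The function name wrap_for_display and the default width of 40 are assumptions chosen to sit inside the 35–45 character range given above. The real limit depends on the font selected on the ESP32.

```python
import textwrap


def wrap_for_display(text, width=40, all_caps=False):
    """Split text into lines of at most `width` characters at word boundaries."""
    if all_caps:
        text = text.upper()
    # textwrap.wrap collapses runs of whitespace and breaks between words.
    # A single word longer than `width` is split so no line exceeds the limit.
    return textwrap.wrap(text, width=width)
```

Keeping all_caps as a parameter reflects the configurable option listed above. An empty or whitespace-only string produces an empty list, so the caller can skip publishing in that case.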
Audio Input
- Preferred: direct feed from church mixing desk (line-in or USB audio interface)
- Fallback: USB condenser microphone near pulpit/lectern
- Whisper performs best with clean, low-noise input
- VAD (Voice Activity Detection) in WhisperLiveKit handles silence automatically
Network
- All on local WiFi (church LAN or dedicated hotspot)
- MQTT broker on Windows PC
- ESP32 connects to same WiFi network
- Static IP recommended for ESP32 to avoid reconnection delays
Bridge Script Logic (bridge.py)
1. Connect to Mosquitto MQTT broker
2. Connect to WhisperLiveKit WebSocket (ws://localhost:8000/asr)
3. Receive partial transcription updates
4. Accumulate words into a sentence buffer
5. On sentence-end signal (or timeout):
   a. Word-wrap text into lines (max ~40 chars each)
   b. Maintain a rolling 3-line buffer
   c. Publish JSON payload to MQTT topic display/text
6. On connection loss: re-establish the WS and MQTT connections
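Steps 3 to 5 can be sketched as a small buffer that flushes on a final update or after a period of silence. This is an illustration under stated assumptions, not the actual bridge.py code: the class name SentenceBuffer is hypothetical, each update is assumed to be a dict whose text field holds new words to append, and the 4-second timeout is the figure given under Refresh Strategy. If WhisperLiveKit sends cumulative text instead, feed would need to replace the buffer rather than extend it.

```python
import time


class SentenceBuffer:
    """Accumulates words until a final update or a silence timeout."""

    def __init__(self, timeout_s=4.0, clock=time.monotonic):
        self._words = []
        self._timeout_s = timeout_s
        self._clock = clock  # injectable so the timeout can be tested
        self._last_update = clock()

    def feed(self, message):
        """Add one update; return the finished sentence, or None."""
        text = str(message.get("text", "")).strip()
        if text:
            # Assumption: updates carry new words, not the cumulative text.
            self._words.extend(text.split())
            self._last_update = self._clock()
        if message.get("is_final"):
            return self._flush()
        return None

    def poll(self):
        """Return buffered text if the silence timeout has passed, else None."""
        if self._words and self._clock() - self._last_update >= self._timeout_s:
            return self._flush()
        return None

    def _flush(self):
        sentence = " ".join(self._words)
        self._words.clear()
        return sentence or None
```

In the bridge loop, feed would be called for every WebSocket message and poll on a short timer. Any non-None result is then wrapped, pushed into the rolling 3-line buffer, and published.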
Known Issues / Open Questions
Development Notes
- WhisperLiveKit WebSocket returns incremental JSON with text and is_final fields
- GxEPD2 supports both full and partial refresh; partial requires setPartialWindow()
- PubSubClient's default buffer is small (128 bytes in older releases, 256 bytes from v2.8) and must hold the MQTT header and topic as well as the payload, so it must be increased to handle JSON payloads (~200 bytes)
- Call client.setBufferSize(512) in PubSubClient setup
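As a rough way to check that a payload fits the configured PubSubClient buffer, the sketch below estimates the size of an MQTT 3.1.1 PUBLISH packet on the PC side. The function names are hypothetical. The layout used (1 fixed header byte, a variable-length "remaining length" field, a 2-byte topic length, the topic, an optional 2-byte packet ID for QoS above 0, then the payload) follows the MQTT specification.

```python
def remaining_length_bytes(n):
    """Bytes used by MQTT's variable-length 'remaining length' field."""
    count = 1
    while n > 127:  # each byte of the field carries 7 bits of the length
        n //= 128
        count += 1
    return count


def publish_packet_size(topic, payload, qos=0):
    """Estimate the on-wire size in bytes of one MQTT PUBLISH packet."""
    variable = 2 + len(topic.encode("utf-8")) + len(payload)
    if qos > 0:
        variable += 2  # packet identifier is present only for QoS 1 and 2
    return 1 + remaining_length_bytes(variable) + variable


def fits_buffer(topic, payload, buffer_size=512, qos=0):
    """True if the whole packet fits in a buffer of `buffer_size` bytes."""
    return publish_packet_size(topic, payload, qos) <= buffer_size
```

For a 200-byte payload on display/text at QoS 0 this gives 217 bytes, which exceeds a 128-byte buffer and fits comfortably in 512. The bridge could log a warning whenever fits_buffer returns False instead of publishing a message the ESP32 would drop.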
Testing Approach
- Test Whisper server standalone: speak into mic, verify text in browser at http://localhost:8000
- Test MQTT: use MQTT Explorer or mosquitto_sub to verify bridge publishes correctly
- Test ESP32 display: send static MQTT messages manually before connecting bridge
- End-to-end: full pipeline test with recorded sermon audio
- In-situ trial: 1–2 Sunday services with a volunteer congregant providing feedback