CLAUDE.md — Audio Recorder Project

This file provides full project context for Claude Code or any AI assistant working on this codebase. Read this before making any changes.


Project Overview

A WiFi-connected audio recorder system consisting of three components:

  1. audio_recorder_server.py — Python/Flask server running on a Windows PC. Handles all audio capture, encoding, file management, and serves a web UI.

  2. src/launcher.py — Windows GUI launcher. Starts the Flask server as a background thread, shows a system tray icon, and opens the browser on startup. Built into a standalone AudioRecorder.exe via PyInstaller.

  3. AudioRecorderController.ino — Arduino sketch for an ESP32-S3 with an LCD Keypad Shield (LCD1602). Sends HTTP commands to the PC server over WiFi, displays status and a live audio level bar on the LCD.


Architecture

[ESP32-S3 + LCD Keypad Shield]
        |  WiFi HTTP GET
        v
[AudioRecorder.exe  (Windows PC)]
  └── launcher.py  (system tray, browser opener)
  └── audio_recorder_server.py  (Flask on port 5000)
        ├── sounddevice  (audio capture)
        ├── soundfile    (WAV/FLAC/OGG encoding)
        ├── lameenc      (MP3 encoding)
        └── /recordings/ (output directory)

Communication is plain HTTP on the local network. The ESP32 calls the PC server directly. The PC server also hosts its own web UI accessible from any browser on the LAN.


File Structure

AudioRecorder_Build/
├── audio_recorder_server.py     # Main server — edit this for recording logic
├── src/
│   ├── launcher.py              # Windows tray launcher — edit for UI/startup
│   └── audio_recorder_server.py # Copy kept in sync with root version
├── AudioRecorder.spec           # PyInstaller build config
├── BUILD.bat                    # One-click Windows build script
├── debug_test.py                # Standalone diagnostic tool
├── build_tools/
│   └── version_info.txt         # Windows EXE metadata
├── CLAUDE.md                    # This file
└── VERSION.md                   # Changelog

Arduino sketch (separate):

AudioRecorderController.ino      # ESP32-S3 firmware

Server: audio_recorder_server.py

Key globals (set by init_from_args() at startup)

Variable       Default       Description
OUTPUT_DIR     ./recordings  Where recordings are saved
SAMPLE_RATE    44100         In Hz; overridden by device resolver
CHANNELS       2             1=mono, 2=stereo; clamped to device max
DEVICE         None          sounddevice input index; None = system default
FILE_FORMAT    WAV           WAV / FLAC / OGG / MP3
PORT           5000          Flask HTTP port
MP3_AVAILABLE  auto          True if lameenc is installed

Critical design rules

  • init_from_args() must be called before app.run() — argparse is deferred so the module can be safely imported by launcher.py without consuming its argv.
  • signal.signal() is inside if __name__ == "__main__": only — signals cannot be registered from a non-main thread; the launcher imports the module from a background thread so signal registration must be guarded.
  • app.run() must use use_reloader=False — Flask's reloader spawns a child watcher process which causes infinite restart loops in frozen PyInstaller EXEs.
  • HTML is returned as a raw string, never via render_template_string() — Jinja2 misinterprets CSS/JS curly braces as template variables and throws 500 errors. Always use: return html, 200, {"Content-Type": "text/html; charset=utf-8"}
  • build_ui_html() uses concatenated string literals — triple-quoted strings containing CSS/JS caused Jinja2 issues in earlier versions. The current implementation builds HTML via Python string concatenation to avoid any template engine involvement.
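
The first two rules can be illustrated with a minimal, standard-library-only sketch. It is not the server's actual code: the flag names (--port, --outdir) and the helper install_signal_handlers() are assumptions made for illustration, and the thread check is a stand-in for the if __name__ == "__main__": guard the server uses.

```python
import argparse
import signal
import threading

# Module-level defaults. Importing this module never reads sys.argv.
PORT = 5000
OUTPUT_DIR = "./recordings"


def init_from_args(argv=None):
    """Parse CLI options on demand and update the module globals.

    argv=None makes argparse read sys.argv[1:]. A launcher can pass an
    explicit list (even an empty one) so its own arguments are ignored.
    """
    global PORT, OUTPUT_DIR
    parser = argparse.ArgumentParser(description="Audio recorder server")
    # Flag names are hypothetical; the real server may use different ones.
    parser.add_argument("--port", type=int, default=PORT)
    parser.add_argument("--outdir", default=OUTPUT_DIR)
    args = parser.parse_args(argv)
    PORT, OUTPUT_DIR = args.port, args.outdir
    return args


def install_signal_handlers(handler):
    """Register SIGINT only when called on the main thread.

    signal.signal() raises ValueError on any other thread, so a server
    started from a launcher thread skips registration instead of crashing.
    Returns True if the handler was installed.
    """
    if threading.current_thread() is not threading.main_thread():
        return False
    signal.signal(signal.SIGINT, handler)
    return True
```

A launcher would call init_from_args([]) (or pass values taken from its config file) before starting the server thread, while a direct command-line run would call init_from_args() with no arguments.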

API endpoints

Method  Route               Description
GET     /                   Web UI (full recorder interface)
GET     /api/start          Start recording
GET     /api/stop           Stop recording (data held in memory)
GET     /api/pause          Pause if recording, resume if paused
GET     /api/resume         Resume explicitly
GET     /api/save           Stop (if needed) + encode + write to disk
GET     /api/status         Full state dict incl. file list + device name
GET     /api/level          Lightweight RMS level poll {level, peak, state}
GET     /api/devices        List input devices with capabilities
POST    /api/setdevice      Set device + channels + samplerate + format
POST    /api/rename         Rename a recording file {old, new}
GET     /recordings/<file>  Download a recording
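
As a rough illustration of how a client might poll the level endpoint, here is a small standard-library sketch. The field names come from the table above; the value types (floats between 0.0 and 1.0, a state string such as "recording") and the example host address are assumptions.

```python
import json
from urllib.request import urlopen


def api_url(host, route, port=5000):
    """Build a URL for one of the server's routes, e.g. /api/level."""
    return f"http://{host}:{port}/{route.lstrip('/')}"


def parse_level(body):
    """Decode an /api/level response body into (level, peak, state).

    Assumes level and peak are numbers and state is a string.
    """
    data = json.loads(body)
    return float(data["level"]), float(data["peak"]), str(data["state"])


def fetch_level(host, port=5000, timeout=2.0):
    """GET /api/level from a running server and parse the reply."""
    with urlopen(api_url(host, "/api/level", port), timeout=timeout) as resp:
        return parse_level(resp.read().decode("utf-8"))
```

With the server running, fetch_level("192.168.1.50") would return one (level, peak, state) tuple per call; the address is a placeholder for the PC's LAN IP.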

AudioRecorder class

  • Audio is captured as float32 PCM blocks in _audio_callback
  • RMS level is computed per-block (log scale, −60 dB floor → 0.0–1.0)
  • Peak hold snaps up instantly, decays at ~0.012 per callback
  • _actual_rate and _actual_ch are stored on the instance after stream opens (may differ from globals if resolve_device_settings() clamped them)
  • save() runs in a daemon thread so it doesn't block the Flask server
  • MP3 encoding uses lameenc at 192 kbps, quality=2. Input must be int16 PCM.
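
The level and peak-hold behaviour described above could look roughly like the following sketch. It uses plain Python lists and the math module so it runs anywhere, whereas the real callback presumably works on numpy float32 arrays. The linear (subtractive) decay and the function names are assumptions.

```python
import math

FLOOR_DB = -60.0     # levels at or below this map to 0.0
PEAK_DECAY = 0.012   # assumed: amount subtracted from the held peak per callback


def block_level(samples):
    """Map one block of float PCM samples (-1.0..1.0) to a 0.0-1.0 level."""
    if not samples:
        return 0.0
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms <= 0.0:
        return 0.0  # digital silence; log10(0) is undefined
    db = 20.0 * math.log10(rms)
    db = max(FLOOR_DB, min(0.0, db))        # clamp to the -60..0 dB window
    return (db - FLOOR_DB) / -FLOOR_DB      # rescale to 0.0..1.0


def update_peak(previous_peak, level):
    """Snap up to a louder level at once; otherwise decay by PEAK_DECAY."""
    return max(level, previous_peak - PEAK_DECAY)
```

On this scale a block at -30 dB RMS lands at 0.5, halfway up the meter.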

Device resolution (resolve_device_settings)

Called before every recording. Queries the device's actual max_input_channels and default_samplerate, clamps channels, then tries sample rates in order: [wanted, device_default, 48000, 44100, 22050, 16000] using sd.check_input_settings() to validate each before committing. This prevents PortAudio error -9996 (invalid device config).
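
The probing order can be sketched as follows. This is an illustration under assumptions, not the actual resolve_device_settings(): the helper names are hypothetical, duplicate candidates are skipped as a convenience, and the validity check is passed in as a callable so the logic can run without audio hardware.

```python
FALLBACK_RATES = [48000, 44100, 22050, 16000]


def clamp_channels(wanted, max_input_channels):
    """Never request more channels than the device offers (minimum 1)."""
    return max(1, min(wanted, max_input_channels))


def pick_sample_rate(wanted, device_default, is_supported):
    """Return the first candidate rate that is_supported accepts, else None.

    Candidates are tried in the documented order:
    wanted, device default, then 48000, 44100, 22050, 16000.
    """
    seen = set()
    for rate in [int(wanted), int(device_default), *FALLBACK_RATES]:
        if rate in seen:
            continue  # do not probe the same rate twice
        seen.add(rate)
        if is_supported(rate):
            return rate
    return None


def make_sounddevice_probe(device, channels):
    """Build an is_supported callable backed by sd.check_input_settings()."""
    import sounddevice as sd  # imported lazily; only needed on a real system

    def is_supported(rate):
        try:
            sd.check_input_settings(device=device, channels=channels, samplerate=rate)
            return True
        except Exception:
            return False

    return is_supported
```

On a real machine the two pieces would combine as pick_sample_rate(SAMPLE_RATE, default_rate, make_sounddevice_probe(DEVICE, channels)), where default_rate is the device's default_samplerate.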

File naming

Default format: YYYYMMDD_DAY_HHMM.ext, where DAY is the three-letter weekday abbreviation.
Example: 20260304_WED_0947.wav
Set in _make_filename(). Extension comes from FILE_FORMAT.
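
A minimal sketch of this naming scheme, assuming the weekday is written as an uppercase English abbreviation and the extension is FILE_FORMAT in lower case. The function name and signature are illustrative and may not match _make_filename().

```python
from datetime import datetime

# Explicit table instead of strftime("%a"), which depends on the locale.
DAY_NAMES = ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]


def make_filename(when, file_format):
    """Return YYYYMMDD_DAY_HHMM.ext for the given datetime and format."""
    day = DAY_NAMES[when.weekday()]  # weekday(): Monday is 0
    return f"{when:%Y%m%d}_{day}_{when:%H%M}.{file_format.lower()}"
```

For instance, make_filename(datetime(2026, 3, 6, 14, 5), "FLAC") gives 20260306_FRI_1405.flac.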


Launcher: src/launcher.py

Critical design rules

  • multiprocessing.freeze_support() must be the first statement executed, before any imports other than multiprocessing itself. On Windows, PyInstaller re-executes the EXE entry point when spawning subprocesses. Without freeze_support(), this creates infinite launcher copies that crash the machine.
  • Server runs as a threading.Thread, never subprocess.Popen — subprocess spawning in a frozen EXE triggers the infinite restart loop described above.
  • Path resolution uses sys._MEIPASS for bundled files — PyInstaller extracts bundled data files to a temp directory at sys._MEIPASS. User files (recordings, config) are saved next to the EXE at Path(sys.executable).parent.
  • Tray uses pystray + PIL — tkinter was removed because it causes _tkinter import errors in frozen PyInstaller builds on Python 3.14.
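
The path rule can be sketched like this. The helper names and the dev_dir fallback (used when running from source rather than from the EXE) are assumptions for illustration.

```python
import sys
from pathlib import Path


def bundled_file(name, dev_dir="."):
    """Locate a file that PyInstaller bundled into the EXE.

    In a frozen build sys._MEIPASS points at the temporary extraction
    directory; when running from source, fall back to dev_dir.
    """
    base = Path(getattr(sys, "_MEIPASS", dev_dir))
    return base / name


def user_file(name, dev_dir="."):
    """Locate a user-facing file (config, log, recordings).

    In a frozen build these live next to the EXE, not in the temp
    directory, so they survive between runs.
    """
    if getattr(sys, "frozen", False):
        return Path(sys.executable).parent / name
    return Path(dev_dir) / name
```

Keeping the two lookups separate matters because the extraction directory is temporary: anything written there would be lost once the EXE exits.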

Config file

recorder_config.json sits next to the EXE. Keys: port, device, outdir, samplerate, channels, format, auto_open_browser
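
One way to read such a file defensively is sketched below. The key names come from the list above and most defaults mirror the server globals; the auto_open_browser default of True and the policy of ignoring unknown keys are assumptions.

```python
import json
from pathlib import Path

DEFAULTS = {
    "port": 5000,
    "device": None,
    "outdir": "./recordings",
    "samplerate": 44100,
    "channels": 2,
    "format": "WAV",
    "auto_open_browser": True,  # assumed default
}


def parse_config(text):
    """Merge JSON text over DEFAULTS; bad JSON yields the defaults."""
    config = dict(DEFAULTS)
    try:
        loaded = json.loads(text)
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return config
    if isinstance(loaded, dict):
        # Ignore keys the launcher does not know about.
        config.update({k: v for k, v in loaded.items() if k in DEFAULTS})
    return config


def load_config(path):
    """Read recorder_config.json; a missing file yields the defaults."""
    try:
        return parse_config(Path(path).read_text(encoding="utf-8"))
    except OSError:
        return dict(DEFAULTS)
```

Falling back to defaults instead of raising keeps the EXE usable on first run, before any config file exists.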

Logging

All startup events are written to recorder_launcher.log next to the EXE. The log is also available via tray menu → "Show Log". It is useful for diagnosing startup failures in the frozen EXE, where there is no console.


Arduino: AudioRecorderController.ino

Hardware

  • Board: ESP32-S3
  • Display: LCD Keypad Shield LCD1602 (16×2 characters)
  • Button input: Resistor ladder on ADC pin GPIO 1 (analog read)

Pin mapping

Shield     ESP32-S3 GPIO  Function
RS         19             LCD Register Select
EN         18             LCD Enable
D4–D7      17, 16, 15, 7  LCD data
A0         1 (ADC)        Button resistor ladder
Backlight  2 (PWM)        LCD backlight

Button mapping (ADC thresholds)

Button  Action               ADC <
SELECT  Start recording      3000
LEFT    Pause / Resume       2100
DOWN    Stop recording       1100
RIGHT   Save last recording  600 (UP) / see code
UP      Show IP address      600

LCD level bar

  • Row 0: * REC 00:00:42 during recording
  • Row 1: pixel-accurate signal bar using 7 custom characters
  • 16 chars × 5 pixel columns = 80 pixel positions total
  • Chars 0–5: empty to full block (partial fill)
  • Char 6: peak hold marker (top + bottom line)
  • Level polled from /api/level every 120ms during recording
  • Smoothing: fast attack (×0.6), slow decay (×0.9)
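
The bar geometry can be modelled in a few lines. This is a Python model of the arithmetic only, not the sketch's C++ code; it assumes custom character n (0 to 5) shows n filled pixel columns, and it leaves out the peak marker and smoothing.

```python
COLUMNS = 16          # characters per LCD row
PIXELS_PER_CHAR = 5   # pixel columns per character


def bar_cells(level):
    """Return the custom-character index (0-5) for each of the 16 cells.

    level is the 0.0-1.0 value from /api/level. It is scaled to the 80
    available pixel columns, then split into full, partial and empty cells.
    """
    level = max(0.0, min(1.0, level))
    lit = round(level * COLUMNS * PIXELS_PER_CHAR)
    return [
        max(0, min(PIXELS_PER_CHAR, lit - i * PIXELS_PER_CHAR))
        for i in range(COLUMNS)
    ]
```

A level of 0.1 lights 8 pixel columns: one full cell, one cell with 3 columns filled, and 14 empty cells.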

Communication

  • ESP32 calls the PC server directly via HTTPClient GET requests
  • The ESP32 also hosts its own lightweight web UI (built as a C++ string), which makes fetch() calls directly to the PC server (URL embedded at render time)
  • WiFi credentials and PC IP stored in NVS via Preferences
  • Config portal: on first boot (or if WiFi fails), starts AP AudioRecorder — connect and browse to 192.168.4.1 to configure

Dependencies

Python (pip)

flask
sounddevice
soundfile
numpy
pillow
pystray
lameenc
pyinstaller    # build only

Arduino Libraries

LiquidCrystal  (built-in)
WiFi           (ESP32 core)
WebServer      (ESP32 core)
HTTPClient     (ESP32 core)
Preferences    (ESP32 core)
ArduinoJson    (Library Manager: Benoit Blanchon)

Building the EXE

BUILD.bat

This installs all Python dependencies, copies audio_recorder_server.py into src/, and runs PyInstaller with AudioRecorder.spec.

Output: dist/AudioRecorder.exe (~80 MB, fully self-contained)

Key PyInstaller settings in the spec:

  • console=False — no black console window
  • upx=False — avoids false-positive AV flags
  • hiddenimports includes all sounddevice, pystray, lameenc, flask internals

After building

Allow port 5000 through Windows Firewall (run once as admin):

netsh advfirewall firewall add rule name="AudioRecorder" dir=in action=allow protocol=TCP localport=5000

Known Issues & History

Bugs fixed during development

Error: Infinite EXE spawn loop
  Cause: subprocess.Popen in frozen app + missing freeze_support()
  Fix: Server runs as thread; freeze_support() is first line

Error: _tkinter missing
  Cause: tkinter doesn't bundle reliably with PyInstaller on Python 3.14
  Fix: Replaced with pystray

Error: audio_recorder_server.py not found
  Cause: sys._MEIPASS path set too late
  Fix: Path resolution at module top before any imports

Error: Internal Server Error (500) on web UI
  Cause: Jinja2 parses CSS/JS {} as template variables
  Fix: Raw string response, no render_template_string()

Error: ValueError: signal only works in main thread
  Cause: signal.signal() at module level, called from launcher thread
  Fix: Moved inside if __name__ == "__main__":

Error: PortAudio error -9996
  Cause: Device doesn't support requested sample rate / channel count
  Fix: resolve_device_settings() probes and clamps before opening stream

Error: Remote visualiser not working
  Cause: getUserMedia blocked on non-localhost HTTP origins
  Fix: Server-side RMS via /api/level; getUserMedia only on localhost

Web UI Notes

The web UI (build_ui_html()) is a single-page app with three tabs:

Recorder tab

  • Status pill, elapsed timer (locally interpolated between polls)
  • 24-bar audio visualiser (localhost: Web Audio API; remote: /api/level poll)
  • Start / Pause / Stop / Save buttons

Files tab

  • List of saved recordings with size and date
  • ✏️ rename button per file (modal dialog)
  • ⬇️ download link per file

Settings tab

  • Input device dropdown (populated from /api/devices)
  • Channels, sample rate, format selectors
  • Format hint showing approximate file size per minute
  • MP3 option disabled with install hint if lameenc not found
  • Apply Settings sends POST /api/setdevice — takes effect immediately
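
The per-minute size hint is simple arithmetic, sketched here in Python. It assumes uncompressed WAV is stored as 16-bit PCM and ignores the file header; FLAC and OGG sizes depend on the audio content and are not estimated.

```python
def pcm_bytes_per_minute(sample_rate, channels, bytes_per_sample=2):
    """Uncompressed PCM size per minute (16-bit samples by assumption)."""
    return sample_rate * channels * bytes_per_sample * 60


def cbr_bytes_per_minute(kbps):
    """Constant-bitrate size per minute, e.g. MP3 at 192 kbps."""
    return kbps * 1000 * 60 // 8
```

At the defaults (44100 Hz, stereo) this gives 10,584,000 bytes per minute for WAV, roughly 10 MB, against 1,440,000 bytes per minute for 192 kbps MP3.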

Diagnostic Tool

debug_test.py — run directly with Python (not as EXE) to diagnose issues:

python debug_test.py

Checks: package imports, port availability, module import, audio devices, Flask startup, local IP. Prints pass/fail for each and pauses on errors.
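
A port availability check of the kind listed above might be implemented as in this sketch. It is not taken from debug_test.py: the function name is hypothetical, and the probe address defaults to 127.0.0.1.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False  # already in use, or binding is not permitted
    return True
```

Calling port_is_free(5000) before starting Flask gives a clearer diagnostic than the bind error that app.run() would otherwise raise.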