Configuration

Environment variables, cache layout, and cloud-vs-on-device brain selection for the bithuman CLI.

Environment variables

| Variable | Purpose |
| --- | --- |
| `BITHUMAN_API_SECRET` | Avatar-runtime auth (metering). Canonical name on the CLI; `BITHUMAN_API_KEY` is accepted as an alias for cross-SDK parity. Get a free key at bithuman.ai → Developer. |
| `OPENAI_API_KEY` | Cloud conversation brain (OpenAI Realtime). Required for `bithuman run` unless `BITHUMAN_LOCAL=1` is set. |
| `BITHUMAN_LOCAL` | `=1` switches the brain to the on-device stack (whisper.cpp + llama.cpp + Supertonic + Silero). Needs the `[local]` extra: `pip install 'bithuman-cli[local]'`. See Local mode. |
| `BITHUMAN_LOCAL_*` | Per-component tuning (whisper model, LLM, voice, language). See Local mode tuning. |
| `BITHUMAN_INSTRUCTIONS` | System-prompt override for the conversation brain. |
| `BITHUMAN_UNMETERED` | `=1` skips the avatar-runtime auth heartbeat. For dev / parity testing only. |
| `RUST_LOG` | Tracing filter. Default `info,bithuman_serve=info`. |
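
As an illustration of combining these variables, the following shell fragment sets a custom system prompt and raises log verbosity for the `bithuman_serve` target only. The prompt text is a placeholder, and the `debug` level assumes `RUST_LOG` follows the usual Rust tracing filter syntax (`target=level`), which the default value suggests but this page does not spell out.

```shell
# Placeholder prompt; replace with your own system prompt.
export BITHUMAN_INSTRUCTIONS="You are a concise, friendly product guide."
# Assumed filter syntax: keep everything at info, raise bithuman_serve to debug.
export RUST_LOG="info,bithuman_serve=debug"
```

Both variables are read from the environment, so they need to be exported in the same shell session that later starts `bithuman run`.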

Cloud vs on-device brain

The avatar runtime is always the same; only the conversation brain changes. Selection is driven entirely by environment.

Cloud (default): set `OPENAI_API_KEY`. The brain is OpenAI Realtime: fast warm-up, lowest first-token latency, hosted reliability.

export BITHUMAN_API_SECRET=your_api_secret
export OPENAI_API_KEY=sk-...
bithuman run avatar.imx

On-device: install the `[local]` extra and set `BITHUMAN_LOCAL=1`. The conversation brain then needs no OpenAI API key and makes no outbound network calls. `BITHUMAN_API_SECRET` is still required for avatar metering (or set `BITHUMAN_UNMETERED=1` for dev parity testing).

pip install 'bithuman-cli[local]'
export BITHUMAN_API_SECRET=your_api_secret
BITHUMAN_LOCAL=1 bithuman run avatar.imx

Both modes use the same `bithuman run` command and produce the same browser URL. See Local mode for the full on-device stack and tuning variables.
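
The selection rules above can be sketched as a small preflight check to run before `bithuman run`. `bithuman_preflight` is a hypothetical helper, not part of the bithuman CLI. It only mirrors the rules as described on this page, and the CLI's own validation may differ.

```shell
# Hypothetical helper (not shipped with bithuman): reports which brain the
# current environment would select, or fails if a required variable is missing.
bithuman_preflight() {
  # Avatar-runtime auth: the secret (or its alias) is needed unless
  # metering is skipped for dev / parity testing.
  if [ -z "${BITHUMAN_API_SECRET:-}" ] && [ -z "${BITHUMAN_API_KEY:-}" ] \
     && [ "${BITHUMAN_UNMETERED:-}" != "1" ]; then
    echo "missing BITHUMAN_API_SECRET" >&2
    return 1
  fi
  # Brain selection: on-device when BITHUMAN_LOCAL=1, otherwise cloud,
  # which requires OPENAI_API_KEY.
  if [ "${BITHUMAN_LOCAL:-}" = "1" ]; then
    echo "local"
  elif [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "cloud"
  else
    echo "missing OPENAI_API_KEY (or set BITHUMAN_LOCAL=1)" >&2
    return 1
  fi
}
```

A wrapper script could call `bithuman_preflight && bithuman run avatar.imx` so that a missing variable is reported before the runtime starts.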

Cache layout

| Path | Contents |
| --- | --- |
| `~/.cache/bithuman/models` | `.imx` avatar models (pool-mode default `--models-root`) |
| `~/.cache/bithuman/avatars` | Imported avatars staged via `POST /launch` |
| `~/.cache/bithuman/showcase` | Downloads from `bithuman pull` |
| `~/.cache/bithuman/brain-venv` | Auto-bootstrapped venv for the bundled conversation brain (only used when not pip-installed) |
| `~/.cache/huggingface/hub` | Local-mode STT + LLM weights (whisper.cpp `.bin`, llama.cpp `.gguf`) |
| `~/.cache/supertonic` | Local-mode TTS ONNX weights |

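`bithuman doctor` shows the current size of each directory. To clear the bithuman cache tree, run `rm -rf ~/.cache/bithuman`; it is regenerated on the next run. The Hugging Face and Supertonic caches live outside this tree and are not affected by that command.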
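
For a quick look at disk usage without the CLI, the directories listed above can be measured with standard tools. This is a minimal sketch using `du`; `cache_sizes` is a hypothetical helper name, and its output format is that of `du -sh`, not of `bithuman doctor`.

```shell
# Hypothetical helper: print the on-disk size of each cache directory
# from the table above, skipping any that do not exist yet.
cache_sizes() {
  for dir in \
    "$HOME/.cache/bithuman/models" \
    "$HOME/.cache/bithuman/avatars" \
    "$HOME/.cache/bithuman/showcase" \
    "$HOME/.cache/bithuman/brain-venv" \
    "$HOME/.cache/huggingface/hub" \
    "$HOME/.cache/supertonic"
  do
    if [ -d "$dir" ]; then
      du -sh "$dir"
    fi
  done
}
```

Running `cache_sizes` before and after clearing the cache shows which directories were removed and which remain.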
