

This page covers the Python SDK path (Essence on CPU, self-hosted). If you're building on a different platform, use the corresponding platform guide instead.

1. Get Credentials

1. Sign up: create an account at www.bithuman.ai.

2. Copy your API Secret: go to Developer → API Keys and copy your API Secret.

3. Download an avatar model: from the Explore page, click the menu on any agent and select Download to get an .imx model file.

2. Install

pip install bithuman --upgrade
The SDK includes opencv-python-headless automatically. Do not install opencv-python (full) separately — it conflicts with PyAV and causes FFmpeg warnings on macOS.
Source, changelog, and issue tracker for the bithuman PyPI package live at bithuman-python-sdk-public. The runtime source is private (signing material is baked in), but the public repo mirrors the README + changelog and is where to file bugs or feature requests.
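To confirm the package landed in the environment you are actually running, a quick standard-library check (no SDK import required) looks like this:

```python
from importlib import metadata

try:
    # queries pip's installed-package metadata for the current interpreter
    print("bithuman", metadata.version("bithuman"))
except metadata.PackageNotFoundError:
    print("bithuman is not installed here; run: pip install bithuman --upgrade")
```

If this prints the not-installed message while another shell can import bithuman, you have two Python environments in play (see Troubleshooting below).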

3. Run Your First Avatar

You need a .wav audio file to drive the avatar. A sample speech.wav is included in each example directory, or generate your own with any TTS service.

Option A — CLI (fastest, no coding)

export BITHUMAN_API_SECRET=your_api_secret
bithuman generate avatar.imx --audio speech.wav --output demo.mp4
Open demo.mp4 to see your avatar talking.
Don’t have a WAV yet? Grab the bundled sample in one line: curl -O https://raw.githubusercontent.com/bithuman-product/bithuman-examples/main/essence-selfhosted/speech.wav

Option B — Python

import asyncio, os
from bithuman import AsyncBithuman
from bithuman.audio import load_audio, float32_to_int16

async def main():
    runtime = await AsyncBithuman.create(
        model_path="avatar.imx",
        api_secret=os.environ["BITHUMAN_API_SECRET"],
    )
    await runtime.start()

    pcm, sr = load_audio("speech.wav")
    pcm = float32_to_int16(pcm)
    chunk = sr // 25                          # one chunk per video frame
    for i in range(0, len(pcm), chunk):
        await runtime.push_audio(pcm[i : i + chunk].tobytes(), sr)
    await runtime.flush()

    try:
        async for frame in runtime.run():
            if frame.has_image:
                image = frame.bgr_image        # numpy (H, W, 3) uint8
            if frame.end_of_speech:
                break
    finally:
        await runtime.stop()

asyncio.run(main())
Want to display the frames live? See Audio Clip for the OpenCV + speaker playback pattern, or the full working example on GitHub.
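The chunk size in the loop above is just the sample rate divided by the 25 FPS frame rate, so each push_audio call carries exactly one video frame's worth of audio. The arithmetic, with an assumed 16 kHz sample rate:

```python
SR = 16000          # assumed sample rate; check your WAV's actual rate
FPS = 25            # the runtime yields video at 25 FPS
chunk = SR // FPS   # samples per video frame

print(chunk)                       # 640 samples per chunk
print(1000 * chunk // SR, "ms")    # 40 ms of audio per video frame

# one second of silent 16-bit samples, chunked like the push_audio loop
pcm = [0] * SR
frames = [pcm[i:i + chunk] for i in range(0, len(pcm), chunk)]
print(len(frames))                 # 25 chunks -> one second of video
```

At 44.1 kHz the chunk would be 1764 samples instead; the division by 25 is what matters, not the absolute size.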

Key Concepts

| Concept | Description |
| --- | --- |
| Runtime | AsyncBithuman instance that processes audio into video |
| push_audio | Feed audio bytes — the avatar lip-syncs in real time |
| flush | Signals end of audio input |
| run() | Async generator that yields frames at 25 FPS |
| Frame | Contains .bgr_image (numpy), .audio_chunk, .end_of_speech |

Troubleshooting

Import fails with "No module named 'bithuman'"

The SDK is not installed. Run:
pip install bithuman --upgrade
Make sure you’re using the correct Python environment (virtualenv, conda, etc.).
Authentication fails

Your API secret is invalid or missing. Check:
  1. You copied the full secret from Developer → API Keys
  2. The api_secret parameter or BITHUMAN_API_SECRET env var is set correctly
  3. Your account is active with available credits
Quick test:
curl -X POST https://api.bithuman.ai/v1/validate \
  -H "api-secret: YOUR_SECRET"
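The same check from Python, using only the standard library. The endpoint and header are taken straight from the curl command above; the send is left commented out so you can run it deliberately:

```python
import os
import urllib.request

# builds the same request as the curl quick test above
secret = os.environ.get("BITHUMAN_API_SECRET", "YOUR_SECRET")
req = urllib.request.Request(
    "https://api.bithuman.ai/v1/validate",
    method="POST",
    headers={"api-secret": secret},
)

# uncomment to send; an invalid or missing secret raises urllib.error.HTTPError
# with urllib.request.urlopen(req) as resp:
#     print("validate returned HTTP", resp.status)
```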
The avatar doesn't animate

The avatar needs audio input to animate:
  1. Ensure you’re calling push_audio() with valid audio data
  2. Call flush() after pushing all audio
  3. Check that the audio is 16-bit PCM format (use float32_to_int16() helper)
  4. Verify audio sample rate matches the file (typically 16000 or 44100)
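For intuition on point 3, here is a hand-rolled sketch of what a float32-to-int16 conversion does: clamp samples to [-1.0, 1.0], then scale into the int16 range. Use the SDK's float32_to_int16 helper in real code; this stand-in is only illustrative:

```python
def float32_to_int16_sketch(samples):
    """Illustrative stand-in for the SDK's float32_to_int16 helper."""
    out = []
    for x in samples:
        x = max(-1.0, min(1.0, x))   # clamp out-of-range floats
        out.append(int(x * 32767))   # scale into int16 range
    return out

print(float32_to_int16_sketch([0.0, 0.5, -1.0, 2.0]))
# [0, 16383, -32767, 32767] — note 2.0 was clamped before scaling
```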
The first session is slow to start

This is normal for the first session — the .imx model takes time to load and initialize. Subsequent sessions in the same process start instantly. To reduce perceived latency, keep the runtime alive between sessions instead of recreating it.
Model file not found

The model file path is wrong. Check:
  1. The .imx file exists at the path you specified
  2. Use an absolute path if running from a different directory
  3. Download a model from the Explore page if you don’t have one

Next Steps

Audio Clip

Play audio file through avatar (5 min)

Live Microphone

Real-time mic input (10 min)

AI Conversation

OpenAI voice chat (15 min)
Or jump straight to the Docker App for a complete end-to-end setup.

Guides

System Requirements

  • Python 3.9 – 3.14
  • Essence (CPU): Linux (x86_64 / ARM64), macOS 13+ (Intel or Apple Silicon), or Windows 10+. 1–2 CPU cores, 4 GB RAM typical.
  • Expression on-device: macOS 14+ on Apple Silicon M3 or later, 16 GB RAM. Elsewhere, use the self-hosted GPU deployment on Linux + NVIDIA.
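A quick way to check your interpreter and platform against the list above (the 3.9–3.14 range is taken from it):

```python
import platform
import sys

print("Python:", sys.version_info[:2])
print("OS/arch:", platform.system(), platform.machine())

# supported range per the requirements above
in_range = (3, 9) <= sys.version_info[:2] <= (3, 14)
print("Python version in supported range:", in_range)
```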