An AvatarSession is how you bring a bitHuman avatar into a LiveKit room. This guide covers every way to do it, with complete working examples.
Documentation Index
Fetch the complete documentation index at: https://docs.bithuman.ai/llms.txt
Use this file to discover all available pages before exploring further.
New to bitHuman? Start with How It Works to understand the core concepts first.
Choose Your Approach
| Approach | Best For | Model Files | GPU Required | Internet Required |
|---|---|---|---|---|
| Cloud Plugin | Getting started, web apps | No | No | Yes |
| Self-Hosted CPU | Privacy, edge devices | Yes (.imx) | No | Only for auth |
| Self-Hosted GPU | Dynamic faces, custom images | No (uses images) | Yes | Only for auth |
| On-Device macOS | Apple Silicon, privacy-first | No (uses images) | No (Apple M3+) | Only for auth |
Prerequisites
All approaches need the same basic prerequisites. You also need a LiveKit server if you don't already have one.
Cloud Plugin
The cloud plugin runs the avatar on bitHuman's servers. You just provide an Agent ID and API secret — no model files, no GPU.
Complete Working Example
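The original code sample did not survive extraction, so here is a minimal sketch of what a cloud-plugin agent might look like. It assumes the LiveKit Agents Python framework with a `bithuman` plugin exposing an `AvatarSession` class; the plugin import paths and the `api_secret` keyword are assumptions to verify against the plugin reference (the `avatar_id` value and the `avatar.start()`/`session.start()` calls match this guide's own text).

```python
import os

from livekit import agents
from livekit.agents import Agent, AgentSession
from livekit.plugins import bithuman, openai  # plugin package names assumed


async def entrypoint(ctx: agents.JobContext):
    await ctx.connect()

    # Voice pipeline -- the avatar animates whatever audio this session produces.
    session = AgentSession(llm=openai.LLM(), tts=openai.TTS())

    # Cloud avatar: an Agent ID plus API secret, no model files, no GPU.
    avatar = bithuman.AvatarSession(
        avatar_id="A78WKV4515",                        # from your Library
        api_secret=os.environ["BITHUMAN_API_SECRET"],  # keyword name assumed
    )

    # Start the avatar BEFORE the agent session so video is wired up first.
    await avatar.start(session, room=ctx.room)
    await session.start(room=ctx.room, agent=Agent(instructions="Be helpful."))


if __name__ == "__main__":
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
```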
Environment Variables
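The variable list was lost in extraction. A plausible set follows; `BITHUMAN_API_SECRET` is confirmed by the troubleshooting section below, the `LIVEKIT_*` names follow LiveKit convention, and `BITHUMAN_AVATAR_ID` is an assumed name:

```shell
# bitHuman credentials (Developer -> API Keys)
export BITHUMAN_API_SECRET="sk_bh_..."
export BITHUMAN_AVATAR_ID="A78WKV4515"   # assumed name; the Agent ID from your Library

# LiveKit server connection
export LIVEKIT_URL="wss://your-project.livekit.cloud"
export LIVEKIT_API_KEY="..."
export LIVEKIT_API_SECRET="..."
```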
Run It
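A typical invocation, assuming the agent lives in `agent.py` and uses the LiveKit Agents CLI (`dev` mode for local testing; the package names are assumptions):

```shell
pip install livekit-agents livekit-plugins-bithuman livekit-plugins-openai  # package names assumed
python agent.py dev
```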
How It Works Behind the Scenes
When `avatar.start()` and `session.start()` run:
- The plugin sends a request to bitHuman’s cloud API
- A cloud avatar worker receives the request
- The worker downloads the avatar model (cached after first time)
- The worker joins your LiveKit room as a participant named `bithuman-avatar-agent`
- As your agent produces TTS audio, the worker generates animated video frames
- Video is published to the room — users see the avatar speaking
Using Expression Model (GPU) with Custom Image
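No code survived here either. As a sketch, the Expression (GPU) model takes a face image instead of an `avatar_id`, so construction might look like the following; the `avatar_image` parameter name and import path are assumptions, not the documented API:

```python
import os

from livekit.plugins import bithuman  # plugin package name assumed

# Expression (GPU) mode generates the avatar from any face image,
# so an image path is passed instead of an avatar_id or .imx model.
avatar = bithuman.AvatarSession(
    avatar_image="/path/to/face.png",              # parameter name assumed
    api_secret=os.environ["BITHUMAN_API_SECRET"],  # keyword name assumed
)
# ...then, as in the cloud example: await avatar.start(session, room=ctx.room)
```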
Self-Hosted CPU
Run the avatar entirely on your own machine using a downloaded `.imx` model file. Great for privacy and offline use.
Complete Working Example
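The stripped example here differs from the cloud one mainly in how the AvatarSession is constructed: point it at a local `.imx` file via `model_path` (the parameter name this guide's comparison table confirms) instead of an `avatar_id`. A sketch, with the import path assumed:

```python
import os

from livekit.plugins import bithuman  # plugin package name assumed

# Self-hosted CPU: inference runs locally against a downloaded .imx model.
# The network is only touched once, to authenticate the API secret.
avatar = bithuman.AvatarSession(
    model_path="/path/to/avatar.imx",
    api_secret=os.environ["BITHUMAN_API_SECRET"],  # keyword name assumed
)
# The rest of the agent (AgentSession, avatar.start, session.start) is unchanged.
```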
Environment Variables
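A plausible reconstruction of the stripped variable list; only `BITHUMAN_API_SECRET` is confirmed elsewhere in this guide, and `BITHUMAN_MODEL_PATH` is an assumed name:

```shell
export BITHUMAN_API_SECRET="sk_bh_..."             # still required, for authentication
export BITHUMAN_MODEL_PATH="/path/to/avatar.imx"   # assumed name; or hard-code model_path
export LIVEKIT_URL="wss://your-project.livekit.cloud"
export LIVEKIT_API_KEY="..."
export LIVEKIT_API_SECRET="..."
```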
How It Differs from Cloud
| Aspect | Cloud | Self-Hosted CPU |
|---|---|---|
| Model location | bitHuman’s servers | Your machine |
| Avatar parameter | avatar_id="A78WKV4515" | model_path="/path/to/avatar.imx" |
| Internet needed | Yes (always) | Only for authentication |
| First frame latency | 2-4 seconds | ~20 seconds (model load) |
| Privacy | Audio sent to cloud | Audio stays local |
System Requirements
- CPU: 1–2 cores sustain 25 FPS on modern chips; 4+ is comfortable for headroom
- RAM: 4 GB minimum, 8 GB recommended
- Disk: ~500 MB per `.imx` model
- OS: Linux (x86_64 / ARM64), macOS 13+ (Intel or Apple Silicon), or Windows 10+
Self-Hosted GPU
Use a GPU container that generates avatars from any face image — no pre-built models needed.
Complete Working Example
Start the GPU Container First
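The launch command was stripped. An illustrative `docker run`, in which the image name and port are placeholders; consult the Self-Hosted GPU Container page for the real values:

```shell
docker run --rm --gpus all \
  -p 8080:8080 \
  -e BITHUMAN_API_SECRET="sk_bh_..." \
  bithuman/expression-gpu:latest   # image name is a placeholder
```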
Environment Variables
For detailed GPU container setup, see Self-Hosted GPU Container.
On-Device Apple Silicon
Run the Expression model entirely on the end-user's Apple Silicon device — no LiveKit, no AvatarSession, no server. The Swift SDK bundles speech recognition, an on-device LLM, TTS, and the lip-sync engine. The only network traffic is a 1-request-per-minute billing heartbeat to `api.bithuman.ai`.
Vanilla integration (audio + avatar)
Hardware floor
| Platform | Minimum |
|---|---|
| macOS | M3+ Apple Silicon, macOS 26 (Tahoe) |
| iPad | iPad Pro M4+, 16 GB unified memory, iPadOS 26 |
| iPhone | iPhone 16 Pro+ (A18 Pro), iOS 26 |
`HardwareCheck.evaluate()` gates this at runtime — under-spec devices see a polite refusal screen.
Try without writing code
What’s Next
Once your avatar session is running, explore these features:
Gestures & Dynamics
Add wave, nod, and laugh animations (Essence only)
Control via REST API
Make avatars speak or inject context from any backend
Python SDK (No LiveKit)
Generate video frames directly without real-time rooms
Docker Examples
Pre-built Docker stacks for every deployment mode
Troubleshooting
Avatar doesn't appear in the room
Cloud mode: Check that your `avatar_id` exists — look it up in your Library. Verify that your API secret is valid.
Self-hosted mode: Check that the `.imx` file path is correct and the file is not corrupted.
Avatar appears but no lip movement
The avatar needs audio input to animate. Ensure:
- Your TTS is producing audio (test with `openai.TTS()` separately)
- `avatar.start(session, room=ctx.room)` is called before `session.start()`
- Check agent logs for audio pipeline errors
'Authentication failed' error
- Verify your API secret is correct (copy-paste from Developer → API Keys)
- Check you have credits remaining in your account
- Ensure the `BITHUMAN_API_SECRET` environment variable is set
High latency / slow first frame
- Cloud: first request downloads the model (~2-4 seconds); subsequent requests hit the cache (~1-2 seconds)
- Self-hosted CPU: first load takes ~20 seconds (model initialization); keep the process running for fast subsequent sessions
- Self-hosted GPU: cold start takes ~30-40 seconds; use long-running containers with preset avatars for ~4-second startup
'No available workers' or 503 errors
All avatar workers are busy. The system retries automatically (up to 5 times with backoff). If it persists:
- Check your usage limits
- Try again in a few seconds
- For self-hosted: increase the number of worker replicas
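The retry behavior described above can also be approximated client-side. A self-contained sketch of retry with exponential backoff, where the 5-attempt cap mirrors the documented default and the `RuntimeError` stands in for a 503-style "no available workers" response:

```python
import time


def with_retries(request, max_attempts=5, base_delay=0.5):
    """Call `request` until it succeeds, backing off exponentially on failures."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RuntimeError:  # stand-in for a 'no available workers' / 503 error
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, 4s


# Example: a request that fails twice, then succeeds on the third attempt.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("503: no available workers")
    return "ok"

print(with_retries(flaky, base_delay=0.01))  # prints "ok"
```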
Session Lifecycle
Understanding how sessions behave helps you build reliable integrations.
| Behavior | Essence (CPU) | Expression (GPU server) | Expression (on-device) |
|---|---|---|---|
| Idle timeout | None — sessions run indefinitely | 10 minutes of inactivity | None — runs while app is open |
| Gestures | Supported | Not supported | Not supported |
| Use case | Kiosks, always-on displays | Interactive conversations | Native consumer apps |
Essence sessions are designed for 24/7 deployments like museum kiosks and lobby displays. They run until the client disconnects — there is no idle timeout. Expression sessions automatically close after 10 minutes of inactivity to free GPU resources.
Billing & Credits
Avatar sessions consume credits based on the deployment mode and session duration.
| Deployment | Model | Credit Cost | Notes |
|---|---|---|---|
| Cloud | Essence | 2 cr/min | CPU rendering on bitHuman servers |
| Cloud | Expression | 4 cr/min | GPU rendering on bitHuman servers |
| Self-Hosted | Essence | 1 cr/min | CPU rendering on your hardware |
| Self-Hosted | Expression | 2 cr/min | GPU rendering on your hardware |
| On-Device | Expression (Swift SDK) | 2 cr/min | Active avatar minutes only — audio-only mode is unmetered |
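As a sanity check on the table, per-session cost is simply the per-minute rate times active minutes. A tiny helper with the rates transcribed from the table above (the function itself is illustrative, not an SDK API):

```python
# Credits per active minute, transcribed from the pricing table.
RATES = {
    ("cloud", "essence"): 2,
    ("cloud", "expression"): 4,
    ("self-hosted", "essence"): 1,
    ("self-hosted", "expression"): 2,
    ("on-device", "expression"): 2,  # active avatar minutes only
}


def session_cost(deployment: str, model: str, active_minutes: float) -> float:
    """Credits consumed by one session. Idle containers cost nothing."""
    return RATES[(deployment, model)] * active_minutes


# A 30-minute cloud Expression session:
print(session_cost("cloud", "expression", 30))  # 120 credits
```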
Check your remaining credits at www.bithuman.ai — your credit balance is shown in the top navigation bar. Credits are consumed only for active sessions — idle containers cost nothing.
Next Steps
Dynamics API
Add gestures and movements
Webhooks
Get notified about session events
Embed Avatars
Put avatars on any website
