Pricing & credits
How bitHuman bills — credits per active minute by model and surface, the free tier, plans, and how to check your balance.
How billing works
bitHuman bills in credits consumed per active minute of avatar runtime. Audio-only mode (the Swift SDK without an attached avatar) is unmetered. Plans top up credits monthly; overage is pay-as-you-go.
Grab a free dev key at bithuman.ai → Developer — it lands in your inbox in seconds with the free tier attached.
Credits per minute by surface and model
| Surface | Model | Cost | Notes |
|---|---|---|---|
| bitHuman Cloud | Essence | 2 cr/min | CPU rendering on bitHuman’s servers |
| bitHuman Cloud | Expression | 4 cr/min | GPU rendering on bitHuman’s servers |
| Self-hosted (Python SDK / CLI) | Essence | 1 cr/min | CPU on your hardware |
| Self-hosted (GPU container) | Expression | 2 cr/min | NVIDIA GPU on your hardware |
| On-device (Swift SDK) | Expression | 2 cr/min | Active avatar minutes only |
| Audio-only (Swift SDK, no avatar) | — | Free | No avatar attached → fully offline, no metering |
BITHUMAN_UNMETERED=1 dev mode | either | Free | Skips auth + heartbeat — dev / parity testing only |
| Agent generation | — | 250 cr (one-time) | Per .imx built from your photo / video |
| Dynamics generation | — | 250 cr (one-time) | Per gesture / movement set generated for an agent |
A “credit minute” is wall-clock time the engine is actively producing frames (on-device, the wall-clock between chat.start() and chat.stop() with an avatar attached). Idle, paused, or disconnected time isn’t billed.
Free tier
- 99 credits / month at signup, no credit card required.
- Generates ~50 minutes of cloud Essence or ~25 minutes of cloud Expression.
- Resets monthly. Unused credits don’t roll over.
Plans
| Plan | Price | Credits / month | Concurrent sessions |
|---|---|---|---|
| Free | $0 | 99 | 2 |
| Creator | $20 | 1,800 | 5 |
| Pro | $99 | 10,000 | 10 |
| Enterprise | Contact | Custom | Custom |
Current pricing and your live balance are in the bitHuman dashboard — the credit balance is on the top navigation bar.
Metered vs unmetered
| Mode | What it means | Auth |
|---|---|---|
| Metered (default) | Your BITHUMAN_API_SECRET exchanges for a runtime token; a heartbeat fires once per minute while frames are flowing. Both cloud and self-hosted run this way. | BITHUMAN_API_SECRET (server) / BITHUMAN_API_KEY (Swift) |
| Unmetered dev mode | BITHUMAN_UNMETERED=1 skips auth + heartbeat entirely. For local dev, CI, and parity work — not licensed for production. | none |
| Audio-only | Swift SDK with no avatar config attached. Fully offline, never reaches the auth endpoint. | none |
The CLI, Python SDK, and Docker container all honour BITHUMAN_UNMETERED=1. The Swift SDK has the same escape hatch via its unmetered initializer.
How metering works
Server-side surfaces (cloud, self-hosted Python, self-hosted GPU)
The Python SDK and Docker container exchange a BITHUMAN_API_SECRET for a short-lived runtime token, then heartbeat back to api.bithuman.ai once per minute while the engine is generating frames. Each heartbeat increments your usage counter.
On-device surface (Swift SDK)
The Swift SDK requests a runtime token once on chat.start() (sync — bad keys fail fast with VoiceChatError.authenticationFailed), then heartbeats once per minute while the avatar is attached. Audio-only mode doesn’t authenticate or heartbeat at all. If the device loses connectivity mid-session, the SDK has a 5-minute offline grace period before it surfaces a billing error and pauses the avatar.
Check your balance
curl https://api.bithuman.ai/v2/credit-summaries \
-H "api-secret: $BITHUMAN_API_SECRET"
Response:
{
"success": true,
"data": {
"user_id": "1f09e2f0-bffd-4201-a3e7-e98a9e432e67",
"balance": 5240,
"plan_credits": 240,
"topup_credits": 5000,
"is_enterprise": false,
"minutes_estimate": {
"voice_chat": 524,
"camera_chat": 174,
"essence_cloud": 2620,
"essence_self_hosted": 5240,
"expression_cloud": 1310,
"expression_self_hosted": 2620
}
}
}
The minutes_estimate keys map directly to the two avatar models: essence_* = Essence, expression_* = Expression (cloud = bitHuman-hosted, self_hosted = your hardware). voice_chat / camera_chat are managed cloud conversational-agent estimates, not avatar models.
What’s NOT billed
- Source code, SDK installs, documentation — free.
- Audio-only Swift SDK use — voice chat with no avatar attached is unmetered and fully offline.
- Idle time — the engine has to be actively producing frames. Stopped or paused sessions don’t accrue.
- Failed auth — bad keys fail fast and don’t burn credits.
- Model weights —
.imxand Expression weight downloads are free; only active runtime minutes count.
FAQ
Does the on-device Swift SDK work without an internet connection?
Audio-only mode is fully offline. Avatar mode authenticates once on chat.start() and heartbeats once per minute — with a 5-minute offline grace window after the last successful heartbeat. After that, the avatar pauses until connectivity returns.
What if I run out of credits mid-session?
The current heartbeat finishes, then subsequent heartbeats fail. The Python / Docker engines stop emitting frames; the Swift SDK surfaces a VoiceChatError.authenticationFailed and the avatar pauses. Top up credits to resume.
Can I have multiple concurrent sessions?
Yes, up to your plan’s concurrent-session limit. Each session bills independently while it’s actively generating frames.
Where to go next
- Models — Essence vs Expression.
- Building avatars — what generation costs and how to do it.
- Deploy via LiveKit — the cloud-metered path.
- API reference — the full credit-summaries schema.