SDKs
Embed a real-time, lip-synced bitHuman avatar natively in your app — on-device or over the cloud — from Python, Apple, Android, or the browser.
One engine, one API, every language
Every bitHuman SDK is a thin, idiomatic binding over libessence — the
portable C++ avatar-rendering engine that also powers the CLI and the
cloud REST API. They all read the same .imx avatar file and
produce identical frames at 25 FPS, so anything you prove out in one language
ports straight to the others.
The contract is the same everywhere:
- Push 16 kHz mono PCM audio as it arrives (mic, TTS, WebRTC).
- Drain lip-synced video frames at 25 FPS.
That push/drain loop is documented once, canonically, under audio streaming — every language page links back to it rather than repeating it.
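As a rough illustration of the arithmetic behind that contract: 16 kHz audio at 25 FPS works out to 640 samples (40 ms) of audio per video frame. The sketch below uses a hypothetical `FakeRuntime` stand-in with made-up `push_audio` / `drain_frames` methods, and it assumes 16-bit samples. It is not the bitHuman API, only a way to see how pushed audio chunks map onto drained frames.

```python
SAMPLE_RATE_HZ = 16_000   # from the contract: 16 kHz mono PCM
FRAME_RATE_FPS = 25       # from the contract: 25 FPS video
SAMPLES_PER_FRAME = SAMPLE_RATE_HZ // FRAME_RATE_FPS  # 640 samples = 40 ms
BYTES_PER_SAMPLE = 2      # assumption: 16-bit signed PCM


class FakeRuntime:
    """Hypothetical stand-in for an SDK runtime; NOT the bitHuman API."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def push_audio(self, pcm: bytes) -> None:
        """Accept a chunk of PCM audio of any size."""
        self._buffer.extend(pcm)

    def drain_frames(self):
        """Yield one placeholder frame per 40 ms of buffered audio."""
        frame_bytes = SAMPLES_PER_FRAME * BYTES_PER_SAMPLE
        while len(self._buffer) >= frame_bytes:
            del self._buffer[:frame_bytes]
            yield "frame"


def silence_pcm(duration_s: float) -> bytes:
    """Return silent mono PCM of the given duration."""
    return bytes(int(duration_s * SAMPLE_RATE_HZ) * BYTES_PER_SAMPLE)


def run(duration_s: float, chunk_ms: int = 100) -> int:
    """Push audio in chunk_ms pieces, draining after each push."""
    runtime = FakeRuntime()
    pcm = silence_pcm(duration_s)
    chunk_bytes = SAMPLE_RATE_HZ * chunk_ms // 1000 * BYTES_PER_SAMPLE
    frames = 0
    for start in range(0, len(pcm), chunk_bytes):
        runtime.push_audio(pcm[start:start + chunk_bytes])
        frames += sum(1 for _ in runtime.drain_frames())
    return frames


if __name__ == "__main__":
    print(run(1.0))  # one second of audio
```

The chunk size pushed does not need to line up with the frame boundary: in this sketch the leftover audio simply stays buffered until the next push completes a 40 ms window.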
Which SDK should I use?
Match your target platform to a binding. All four ship visual, lip-synced avatars; the difference is where they run and how mature each surface is.
| Platform / target | SDK | On-device / cloud | Status |
|---|---|---|---|
| Backend, AI agents, batch render, edge boxes | Python | On-device | GA |
| Native Mac, iPad, iPhone apps | Swift / Apple | On-device | GA |
| Native Android apps | Android (Kotlin) | On-device | Beta |
| Browser & Node web clients | JavaScript / TS | Cloud client | Preview |
| Native app ↔ server avatar over WebRTC, or Python voice agent with a face | LiveKit (Apple + Python) | Cloud / hybrid | Built on the GA bindings |
If you are not sure, start with the Python SDK or the no-code CLI.
On-device vs cloud
bitHuman runs in two topologies. The same .imx and the same API work in both.
- On-device: runs on the user's machine, phone, or edge box. Private, low-latency, offline-capable. Python, Swift, Kotlin.
- Cloud: we host the GPU. Zero ops, ideal for web clients and sharing one avatar. JavaScript/TS, LiveKit.
| | On-device | Cloud |
|---|---|---|
| Where inference runs | Your machine / phone | bitHuman’s GPU pool |
| SDKs | Python, Swift, Kotlin | JavaScript/TS, LiveKit |
| Network | Optional — billing heartbeat only (or fully offline) | Required |
| Hardware | Any modern CPU (Essence); Apple Silicon / NVIDIA GPU (Expression) | None — we host it |
| Cost | 1 credit/min (Essence) · 2 credits/min (Expression) | 2 credits/min (Essence) · 4 credits/min (Expression) |
| Best for | Privacy, kiosks, edge, low latency | Zero-ops, web clients, sharing one avatar |
See models for the Essence vs Expression comparison and pricing for credit details.
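To make the cost row concrete, here is a small estimator built from the per-minute rates in the table. It assumes the two cloud figures map to Essence and Expression in that order, and that usage is charged pro rata by the minute. Neither assumption is confirmed here, and the function name and rate table are illustrative only, so check pricing before relying on the numbers.

```python
# Credits per minute, taken from the comparison table.
# Assumption: the cloud figures are Essence then Expression, in that order.
CREDITS_PER_MINUTE = {
    ("on-device", "essence"): 1,
    ("on-device", "expression"): 2,
    ("cloud", "essence"): 2,
    ("cloud", "expression"): 4,
}


def estimate_credits(topology: str, model: str, minutes: float) -> float:
    """Estimate credits for a session (assumes simple pro-rata billing)."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    key = (topology.lower(), model.lower())
    if key not in CREDITS_PER_MINUTE:
        raise KeyError(f"unknown topology/model combination: {key}")
    return CREDITS_PER_MINUTE[key] * minutes


if __name__ == "__main__":
    # A 30-minute on-device Essence session versus the same session in the cloud.
    print(estimate_credits("on-device", "essence", 30))
    print(estimate_credits("cloud", "essence", 30))
```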
Status matrix
We keep this honest so you can plan around it.
| SDK | Package | Topology | Status |
|---|---|---|---|
| Python | pip install bithuman | On-device | GA |
| Swift / Apple | SwiftPM bitHumanKit | On-device | GA — bitHumanKit is the published full-stack package; the newer libessence Bithuman streaming binding is rolling out |
| Android (Kotlin) | Gradle ai.bithuman:sdk | On-device | Beta |
| JavaScript / TS | npm install @bithuman/sdk | Cloud client | Preview |
| Flutter | reference app only | On-device | Reference app in bithuman-apps, not a published code SDK — see below |
Note: Swift has two surfaces today. The currently published SwiftPM package is bitHumanKit, a full-stack on-device library with a built-in voice agent (STT + LLM + TTS) and a low-level streaming runtime. A newer binding named Bithuman maps directly onto the libessence streaming engine (Fixture / Runtime / pushAudio / pullFrame) and is rolling out. The Swift page covers both.
A note on Flutter
A Flutter integration exists, but it is a reference app in the private
bithuman-apps repo — it is not published to pub.dev and the plugin source
is not in a public repo. Do not add bithuman: ^X.Y.Z to a pubspec.yaml;
flutter pub get will fail with “package not found”. Until it ships, build
Flutter apps on the underlying Swift SDK (Apple) and
Android SDK via platform channels, or ping us on
Discord for early access.
See also
- Audio streaming: the canonical push/drain loop
- Models: Essence vs Expression and the .imx format
- CLI: the no-code path, same engine
- API reference: the cloud REST API