Agent lifecycle
From a face and voice to a live talking avatar — generate, poll, resolve, and drive a session over the REST API and SDKs.
The agent lifecycle
From “I have a face and voice” to “live talking avatar”:
Generate → Store → Resolve → Live session → Speak
1. Generate
Call POST /v1/agent/generate with a prompt plus optional image, video, and audio assets. It returns { agent_id, status: "processing" } immediately — generation runs async (typically 2–5 minutes).
curl -X POST https://api.bithuman.ai/v1/agent/generate \
  -H "api-secret: $BITHUMAN_API_SECRET" \
  -H "content-type: application/json" \
  -d '{
    "prompt": "You are a friendly receptionist.",
    "image": "https://example.com/headshot.jpg",
    "audio": "https://example.com/voice.wav"
  }'
See Building avatars for media specs and the full set of generation inputs.
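The same generate call can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names are hypothetical, and the "video" key is an assumption inferred from the "image" and "audio" keys in the curl example. The request is built separately from sending so the payload can be inspected without touching the network.

```python
import json
import os
import urllib.request

API_BASE = "https://api.bithuman.ai"


def build_generate_request(prompt, image=None, video=None, audio=None, api_secret=None):
    """Build (but do not send) a POST /v1/agent/generate request."""
    payload = {"prompt": prompt}
    # Include only the optional assets that were supplied.
    # The "video" key name is an assumption; check the API reference.
    for key, value in (("image", image), ("video", video), ("audio", audio)):
        if value is not None:
            payload[key] = value
    secret = api_secret or os.environ.get("BITHUMAN_API_SECRET", "")
    return urllib.request.Request(
        f"{API_BASE}/v1/agent/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"api-secret": secret, "content-type": "application/json"},
        method="POST",
    )


def generate_agent(prompt, **kwargs):
    """Send the request and return the parsed JSON body."""
    with urllib.request.urlopen(build_generate_request(prompt, **kwargs)) as resp:
        return json.load(resp)


# Usage (performs a real network call):
# result = generate_agent("You are a friendly receptionist.",
#                         image="https://example.com/headshot.jpg")
# agent_id = result["agent_id"]
```

Keep the returned agent_id; it is the handle used for polling in the next step.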
2. Poll until ready
Poll GET /v1/agent/status/{agent_id} every 5 seconds. Status transitions:
processing → generating → completed → ready (success)
on error → failed
Note
generating and completed are intermediate states, not terminal; keep polling past them until you see ready or failed. Typical wall-clock time is 2–5 minutes.
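A polling loop along these lines would treat only ready and failed as terminal. This is a sketch under stated assumptions: wait_until_ready and fetch_status are hypothetical names, fetch_status stands in for whatever code calls GET /v1/agent/status/{agent_id} and extracts the status string (the response shape is not shown here), and the 10-minute timeout is an arbitrary safety margin over the typical 2–5 minutes.

```python
import time

# Only these two states end the polling loop.
TERMINAL_STATES = {"ready", "failed"}


def wait_until_ready(agent_id, fetch_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll until the agent reaches a terminal state and return that state.

    fetch_status is a hypothetical callable: agent_id -> status string.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(agent_id)
        if status in TERMINAL_STATES:
            return status
        # "processing", "generating", and "completed" all fall through here.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"agent {agent_id} still {status!r} after {timeout}s")
        sleep(interval)
```

Passing sleep as a parameter keeps the loop easy to exercise in tests without waiting for real time to pass.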
3. Resolve and stream
Once ready, the agent has an agent_code (e.g. A78WKV4515). Every SDK can resolve it by code — no need to download the .imx:
import asyncio
import os

from bithuman import AsyncBithuman

async def main():
    rt = await AsyncBithuman.create(
        agent_code="A78WKV4515",  # not model_path; the code resolves to the .imx
        api_secret=os.environ["BITHUMAN_API_SECRET"],
    )  # rt is the runtime you drive frames through

asyncio.run(main())
From here, drive frames with the push/drain loop.
4. Drive the live session
For a hosted LiveKit session, push text into the live room:
curl -X POST https://api.bithuman.ai/v1/agent/A78WKV4515/speak \
  -H "api-secret: $BITHUMAN_API_SECRET" \
  -H "content-type: application/json" \
  -d '{"message": "Hello!"}'
For silent knowledge injection — the avatar doesn’t say it aloud but uses it in future replies:
curl -X POST https://api.bithuman.ai/v1/agent/A78WKV4515/add-context \
  -H "api-secret: $BITHUMAN_API_SECRET" \
  -H "content-type: application/json" \
  -d '{"context": "The customer is on the Pro plan."}'
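Both calls share the same shape: a POST to /v1/agent/{code}/{action} with a one-field JSON body. A minimal Python sketch of that pattern follows, using only the standard library. The function names are hypothetical, and because the response format is not described here, send returns the raw body rather than assuming a schema.

```python
import json
import os
import urllib.request

API_BASE = "https://api.bithuman.ai"


def build_agent_post(agent_code, action, body, api_secret=None):
    """Build a POST /v1/agent/{agent_code}/{action} request with a JSON body."""
    secret = api_secret or os.environ.get("BITHUMAN_API_SECRET", "")
    return urllib.request.Request(
        f"{API_BASE}/v1/agent/{agent_code}/{action}",
        data=json.dumps(body).encode("utf-8"),
        headers={"api-secret": secret, "content-type": "application/json"},
        method="POST",
    )


def speak_request(agent_code, message, api_secret=None):
    """The avatar says this text aloud in the live room."""
    return build_agent_post(agent_code, "speak", {"message": message}, api_secret)


def add_context_request(agent_code, context, api_secret=None):
    """Silent injection: not spoken, but available to future replies."""
    return build_agent_post(agent_code, "add-context", {"context": context}, api_secret)


def send(request):
    """Send a prepared request and return the raw response body as text."""
    with urllib.request.urlopen(request) as resp:
        return resp.read().decode("utf-8")


# Usage (performs real network calls):
# send(add_context_request("A78WKV4515", "The customer is on the Pro plan."))
# send(speak_request("A78WKV4515", "Hello!"))
```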
Updating an agent
POST /v1/agent/{code} updates the system prompt only. For a new face or voice, generate a new agent.
Where to go next
- Audio streaming — the push/drain pattern that powers a live session.
- Building avatars — design likeness, voice, and personality.
- Deploy via LiveKit — the fastest path to a hosted session.
- API reference — full request/response schemas.