Agent lifecycle

From a face and voice to a live talking avatar — generate, poll, resolve, and drive a session over the REST API and SDKs.

The agent lifecycle

From “I have a face and voice” to “live talking avatar”:

Generate → Store → Resolve → Live session → Speak

1. Generate

Call POST /v1/agent/generate with a prompt plus optional image, video, and audio assets. It returns { agent_id, status: "processing" } immediately; generation runs asynchronously and typically takes 2–5 minutes.

curl -X POST https://api.bithuman.ai/v1/agent/generate \
  -H "api-secret: $BITHUMAN_API_SECRET" \
  -H "content-type: application/json" \
  -d '{
    "prompt": "You are a friendly receptionist.",
    "image": "https://example.com/headshot.jpg",
    "audio": "https://example.com/voice.wav"
  }'

See Building avatars for media specs and the full set of generation inputs.
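For callers who prefer Python to curl, the same request can be sketched with the standard library alone. This is a minimal sketch, not an official client: the helper names build_generate_request and generate_agent are hypothetical, and the "video" body key is an assumption (the curl example above only shows "prompt", "image", and "audio").

```python
import json
import urllib.request

API_BASE = "https://api.bithuman.ai"


def build_generate_request(api_secret, prompt, image=None, video=None, audio=None):
    """Build (but do not send) the POST /v1/agent/generate request."""
    body = {"prompt": prompt}
    # Include only the optional assets that were actually provided.
    # Assumption: "video" is the body key for a video asset.
    for key, value in (("image", image), ("video", video), ("audio", audio)):
        if value is not None:
            body[key] = value
    return urllib.request.Request(
        f"{API_BASE}/v1/agent/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"api-secret": api_secret, "content-type": "application/json"},
        method="POST",
    )


def generate_agent(api_secret, prompt, **assets):
    """Send the request and return the agent_id from the JSON response."""
    req = build_generate_request(api_secret, prompt, **assets)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["agent_id"]
```

Splitting the builder from the sender keeps the payload easy to inspect or unit-test without touching the network; generate_agent(os.environ["BITHUMAN_API_SECRET"], "You are a friendly receptionist.", image="https://example.com/headshot.jpg") would perform the actual call.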

2. Poll until ready

Poll GET /v1/agent/status/{agent_id} every 5 seconds. Status transitions:

processing → generating → completed → ready   (success)
                                  \→ failed    (error)

Note that generating and completed are intermediate states, not terminal ones: keep polling past them until you see ready or failed. Generation typically takes 2–5 minutes of wall-clock time.
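The polling rule above can be sketched as a small loop that treats only ready and failed as terminal. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the status response is assumed to carry the state under a "status" key, and the 10-minute timeout is an arbitrary safety margin on top of the typical 2–5 minutes.

```python
import json
import time
import urllib.request

API_BASE = "https://api.bithuman.ai"
TERMINAL = {"ready", "failed"}


def fetch_status(agent_id, api_secret):
    """GET /v1/agent/status/{agent_id} and return the status string."""
    req = urllib.request.Request(
        f"{API_BASE}/v1/agent/status/{agent_id}",
        headers={"api-secret": api_secret},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        # Assumption: the response JSON exposes the state as "status".
        return json.load(resp)["status"]


def wait_until_ready(get_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll get_status() until it returns 'ready' or 'failed'.

    Intermediate states (processing, generating, completed) keep the
    loop going. Raises TimeoutError once the deadline has passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in TERMINAL:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"agent still {status!r} after {timeout} s")
        sleep(interval)
```

A typical call would be wait_until_ready(lambda: fetch_status(agent_id, api_secret)). Passing the status getter and the sleep function as parameters lets the loop be exercised with canned states instead of real requests.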

3. Resolve and stream

Once ready, the agent has an agent_code (e.g. A78WKV4515). Every SDK can resolve it by code — no need to download the .imx:

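import asyncio
import os

from bithuman import AsyncBithuman


async def main():
    rt = await AsyncBithuman.create(
        agent_code="A78WKV4515",   # not model_path: the code resolves to the .imx
        api_secret=os.environ["BITHUMAN_API_SECRET"],
    )
    # rt is the resolved runtime; drive frames from here.

asyncio.run(main())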

From here, drive frames with the push/drain loop.

4. Drive the live session

For a hosted LiveKit session, push text into the live room:

curl -X POST https://api.bithuman.ai/v1/agent/A78WKV4515/speak \
  -H "api-secret: $BITHUMAN_API_SECRET" \
  -H "content-type: application/json" \
  -d '{"message": "Hello!"}'

For silent knowledge injection — the avatar doesn’t say it aloud but uses it in future replies:

curl -X POST https://api.bithuman.ai/v1/agent/A78WKV4515/add-context \
  -H "api-secret: $BITHUMAN_API_SECRET" \
  -H "content-type: application/json" \
  -d '{"context": "The customer is on the Pro plan."}'
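The two calls differ only in path and body key, so a Python caller might wrap them in one small helper. The sketch below uses only the standard library; the function names are hypothetical, and the endpoints, header, and body keys are taken from the curl examples above.

```python
import json
import urllib.request

API_BASE = "https://api.bithuman.ai"


def _post(api_secret, path, body):
    """Build a JSON POST request against the API (not sent yet)."""
    return urllib.request.Request(
        f"{API_BASE}{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={"api-secret": api_secret, "content-type": "application/json"},
        method="POST",
    )


def speak_request(api_secret, agent_code, message):
    """Request that makes the avatar say `message` aloud in the live room."""
    return _post(api_secret, f"/v1/agent/{agent_code}/speak", {"message": message})


def add_context_request(api_secret, agent_code, context):
    """Request that silently adds `context` for use in future replies."""
    return _post(
        api_secret, f"/v1/agent/{agent_code}/add-context", {"context": context}
    )
```

To send either one, pass the returned object to urllib.request.urlopen, for example urllib.request.urlopen(speak_request(secret, "A78WKV4515", "Hello!"), timeout=30).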

Updating an agent

POST /v1/agent/{code} updates the system prompt only. For a new face or voice, generate a new agent.

Where to go next