Swift / iOS — Hello, avatar

Boot a real-time, lip-synced on-device bitHuman voice agent on iPhone or iPad with SwiftUI and bitHumanKit.

Prerequisites

  • A bitHuman API key, exposed to the app as BITHUMAN_API_KEY (Apple convention) — get one at Developer → API Keys; see Authentication.
  • Xcode 26+ on a Mac, plus an Apple Developer account. Add the SwiftPM package:
.package(url: "https://github.com/bithuman-product/bithuman-sdk-public.git", from: "0.8.1")
  • Device floor (real hardware — the Simulator can’t run on-device inference): iPhone 16 Pro or later (A18 Pro+), or iPad Pro M4 or later, on iOS / iPadOS 26+. Earlier devices are refused at launch by HardwareCheck.evaluate().
  • Apple-approved memory entitlements — without them iOS terminates the app mid-conversation. Request both before you start (Apple takes 1–3 business days): com.apple.developer.kernel.increased-memory-limit and com.apple.developer.kernel.extended-virtual-addressing.

Note: Swift / Apple is GA. The published SwiftPM package is bitHumanKit, a self-contained XCFramework with zero transitive dependencies. The iOS example below drives an Expression voice agent; Essence is the lighter option for lower-end devices (see Models).
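For an app that is itself a Swift package, the dependency line from the prerequisites could sit in a manifest roughly like the sketch below. The package and target names (MyAvatarApp, MyAvatarFeature) are hypothetical, and the library product name "bitHumanKit" is an assumption based on the module name imported in the example; check the SDK's own Package.swift before relying on it.

```swift
// swift-tools-version: 6.0
import PackageDescription

let package = Package(
    name: "MyAvatarApp",                        // hypothetical package name
    platforms: [.iOS("26.0")],                  // matches the iOS / iPadOS 26+ floor
    dependencies: [
        // Dependency line taken from the prerequisites above.
        .package(url: "https://github.com/bithuman-product/bithuman-sdk-public.git", from: "0.8.1"),
    ],
    targets: [
        .target(
            name: "MyAvatarFeature",            // hypothetical target name
            dependencies: [
                // Assumption: the library product is named "bitHumanKit".
                .product(name: "bitHumanKit", package: "bithuman-sdk-public"),
            ]
        ),
    ]
)
```

In an Xcode project you would instead add the same URL through File → Add Package Dependencies and link the product to the app target.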

Run it

  1. Clone the repository and open the example package in Xcode (or use File → Open and select the folder containing Package.swift):
git clone https://github.com/bithuman-product/bithuman-sdk-public.git
open bithuman-sdk-public/Examples/swift/ios-avatar/Package.swift
  2. Set the API key in the scheme: Product → Scheme → Edit Scheme → Run → Arguments → Environment Variables, add BITHUMAN_API_KEY. Never hardcode it.

  3. Select a physical device that meets the device floor (iPhone 16 Pro or later, or iPad Pro M4 or later), then Build and Run.
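Because the key arrives through the scheme's environment, a missing or blank value only shows up at run time. A minimal sketch of an early check, using only Foundation, might look like this; loadAPIKey and APIKeyError are hypothetical helpers for illustration and are not part of bitHumanKit.

```swift
import Foundation

// Hypothetical error type for this sketch; not part of bitHumanKit.
enum APIKeyError: Error, Equatable {
    case missing   // the variable is not defined at all
    case empty     // the variable is defined but blank
}

// Hypothetical helper: reads BITHUMAN_API_KEY from an environment dictionary,
// defaulting to the process environment that the Xcode scheme populates.
func loadAPIKey(
    from environment: [String: String] = ProcessInfo.processInfo.environment
) throws -> String {
    guard let raw = environment["BITHUMAN_API_KEY"] else {
        throw APIKeyError.missing
    }
    let key = raw.trimmingCharacters(in: .whitespacesAndNewlines)
    guard !key.isEmpty else {
        throw APIKeyError.empty
    }
    return key
}

do {
    let key = try loadAPIKey()
    // Report only the length so the key itself never reaches the logs.
    print("API key found (\(key.count) characters)")
} catch {
    print("BITHUMAN_API_KEY is not set; add it to the scheme's environment variables")
}
```

Passing the dictionary as a parameter keeps the check testable without touching the real process environment.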

What you’ll see

On first launch the app downloads the Expression weights (~1.6 GB, cached), warms the model, then shows a live circular avatar that says “live — talk to me”. Speak and the avatar answers and lip-syncs the reply at 25 fps, fully on-device with sub-200 ms latency. Under-spec devices instead show an “unsupported device” screen.
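To put the two numbers above side by side: at 25 fps each frame has a 40 ms budget, so a 200 ms latency bound spans at most five frame intervals. The sketch below only does that arithmetic; the function names are hypothetical, and it makes no claim about how the SDK measures latency.

```swift
import Foundation

// Milliseconds available to produce one frame at the given frame rate.
func frameBudgetMs(fps: Double) -> Double {
    return 1000.0 / fps
}

// Number of whole frame intervals that fit inside a latency bound.
func framesWithin(latencyMs: Double, fps: Double) -> Int {
    return Int((latencyMs / frameBudgetMs(fps: fps)).rounded(.down))
}

let budget = frameBudgetMs(fps: 25)                  // 40.0 ms per frame
let frames = framesWithin(latencyMs: 200, fps: 25)   // 5 frame intervals
print("Frame budget: \(budget) ms; frames within 200 ms: \(frames)")
```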

Full code

The minimal shape: a HardwareCheck gate, then a VoiceChat that boots the avatar. The full app (Sources/IOSAvatarApp.swift) adds the render-host wiring and lifecycle phases.

// IOSAvatarApp.swift — iOS voice agent with a lip-synced avatar
import SwiftUI
import UIKit
import bitHumanKit

@main
struct IOSAvatarApp: App {
    var body: some Scene {
        WindowGroup {
            switch HardwareCheck.evaluate() {
            case .supported:                 AvatarRootView()
            case .unsupported(let reason):   UnsupportedDeviceView(reason: reason)
            }
        }
    }
}

@MainActor
final class AvatarLifecycle: ObservableObject {
    @Published var phase: Phase = .idle
    @Published private(set) var renderer: AvatarRendererView?
    private var chat: VoiceChat?

    enum Phase: Equatable { case idle, warming, live, error(String) }

    func start() async {
        do {
            // 1. Download / verify the Expression weights (~1.6 GB, cached).
            let weights = try await ExpressionWeights.ensureAvailable { _ in }
            phase = .warming

            // 2. Configure a voice chat with an avatar.
            let agent = AgentCatalog.defaultAgent
            var config = VoiceChatConfig()
            config.systemPrompt = agent.systemPrompt
            config.avatar = AvatarConfig(modelPath: weights,
                                         portraitPath: AgentCatalog.thumbnailURL(for: agent)!)
            config.apiKey = ProcessInfo.processInfo.environment["BITHUMAN_API_KEY"]

            // 3. Start it and render frames into a view.
            let chat = VoiceChat(config: config)
            try await chat.start()
            self.chat = chat
            self.renderer = AvatarRendererView(frame: .zero,
                                               idleFrame: chat.initialIdleFrame,
                                               clipMode: .circle)
            self.phase = .live
        } catch {
            phase = .error(error.localizedDescription)
        }
    }
}

Full source: GitHub
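The Phase enum in the listing is what a root view would typically switch on to decide what to show. As a sketch of that idea, the snippet below redeclares Phase so it compiles without bitHumanKit; statusLabel and shouldShowAvatar are hypothetical helpers, and the label wording is illustrative rather than the example app's actual copy. Note that in the listing the phase stays .idle while the weights download and only becomes .warming afterwards.

```swift
import Foundation

// Stand-in mirroring the Phase enum from the listing above, redeclared so
// this sketch compiles on its own without bitHumanKit.
enum Phase: Equatable {
    case idle, warming, live
    case error(String)
}

// Hypothetical helper: maps a lifecycle phase to a short status label.
func statusLabel(for phase: Phase) -> String {
    switch phase {
    case .idle:
        return "Preparing weights"        // covers the download / verify step
    case .warming:
        return "Warming up the model"
    case .live:
        return "Live"
    case .error(let message):
        return "Error: \(message)"
    }
}

// Hypothetical helper: mount the renderer only once the chat is live.
func shouldShowAvatar(in phase: Phase) -> Bool {
    return phase == .live
}

print(statusLabel(for: .warming))
print(statusLabel(for: .error("weights download failed")))
```

In a SwiftUI view, the same switch could choose between a progress indicator, the renderer, and an error message, with the lifecycle's start() called from a .task modifier.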

Next steps

  • Swift SDK — full walkthrough: lifecycle, entitlements, device matrix.
  • macos-voice example — offline macOS voice agent: no avatar, no API key.
  • Models — Essence vs Expression, which to ship.