
Run a real-time voice agent with a lip-synced avatar inside a Mac app. All inference runs on the device’s GPU and Neural Engine; the only network use is the first-launch weights download and (for the avatar engine) a 1-request-per-minute billing heartbeat.

Prerequisites

  • macOS 26 (Tahoe) or later
  • Apple Silicon Mac (M3 or newer)
  • Xcode 26+
  • ~3 GB free disk for first-launch model downloads (a free-space check is sketched after this list)
  • A BITHUMAN_API_KEY if you’ll use the avatar engine (audio-only mode is unmetered and doesn’t need a key — see Get an API key)
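If you'd rather fail fast than stall mid-download, you can pre-flight the disk requirement. A minimal sketch; the helper name and 3 GB threshold are ours, not SDK API:
import Foundation

// Hypothetical pre-flight: confirm the ~3 GB first-launch download will fit.
func hasRoomForModels(minimumBytes: Int64 = 3_000_000_000) -> Bool {
    let home = FileManager.default.homeDirectoryForCurrentUser
    let values = try? home.resourceValues(
        forKeys: [.volumeAvailableCapacityForImportantUsageKey])
    return (values?.volumeAvailableCapacityForImportantUsage ?? 0) >= minimumBytes
}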

Add the package

In Xcode: File → Add Package Dependencies →
https://github.com/bithuman-product/bithuman-kit-public.git
Or in Package.swift:
dependencies: [
    .package(url: "https://github.com/bithuman-product/bithuman-kit-public.git",
             from: "0.8.1")
],
targets: [
    .target(
        name: "MyApp",
        dependencies: [
            .product(name: "bitHumanKit", package: "bithuman-kit-public")
        ]
    )
]
The library product is bitHumanKit. import bitHumanKit and you’re in.

Boot a voice agent

import SwiftUI
import bitHumanKit

@main
struct VoiceAgentApp: App {
    @StateObject private var lifecycle = Lifecycle()
    var body: some Scene {
        WindowGroup {
            ContentView(lifecycle: lifecycle)
                .task { await lifecycle.start() }
        }
    }
}

@MainActor
final class Lifecycle: ObservableObject {
    @Published var status = "booting…"
    private var chat: VoiceChat?

    func start() async {
        var config = VoiceChatConfig()
        config.localeIdentifier = "en-US"
        config.systemPrompt = "You are a calm assistant. One sentence per turn."
        config.voice = .preset("Aiden")
        do {
            let chat = VoiceChat(config: config)
            try await chat.start()  // first launch blocks here while weights download
            self.chat = chat        // keep a strong reference for the session lifetime
            status = "live — talk to me"
        } catch {
            status = "error: \(error.localizedDescription)"
        }
    }
}
That’s the whole audio-only integration. The first launch downloads the LLM and TTS weights to ~/.cache/huggingface/hub/; subsequent launches are instant.
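If you want a one-time "downloading models" notice, one heuristic is to check whether that cache is still empty. This is our sketch against the default path above, not an SDK call:
import Foundation

// Heuristic: an empty Hugging Face cache means the first-launch
// download hasn't happened yet, so surface a progress notice.
func modelsAlreadyCached() -> Bool {
    let hub = FileManager.default.homeDirectoryForCurrentUser
        .appendingPathComponent(".cache/huggingface/hub")
    let entries = (try? FileManager.default.contentsOfDirectory(atPath: hub.path)) ?? []
    return !entries.isEmpty
}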

Get an API key

The avatar pipeline is metered (2 credits/min) and requires a bitHuman developer key. Audio-only mode is free and unmetered — this section only matters if you plan to use the avatar.
  1. Sign in at https://www.bithuman.ai → Developer → API Keys.
  2. Either set config.apiKey directly OR export BITHUMAN_API_KEY in the launching environment. The SDK resolves them in that order.
  3. The first heartbeat happens at chat.start() — bad keys throw VoiceChatError.authenticationFailed immediately, before any user-visible work. Once authenticated, you get a 5-minute offline grace period.
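A minimal sketch of the resolution order and the failure path, assuming VoiceChatError is a pattern-matchable enum as the cases above suggest:
// config.apiKey wins over the environment variable.
var config = VoiceChatConfig()
config.apiKey = ProcessInfo.processInfo.environment["BITHUMAN_API_KEY"]

do {
    let chat = VoiceChat(config: config)
    try await chat.start()
} catch VoiceChatError.authenticationFailed {
    // Rejected at the first heartbeat, before any user-visible work.
    print("bitHuman API key was rejected; check Developer → API Keys")
} catch {
    print("start failed: \(error.localizedDescription)")
}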

Add the lip-synced avatar

let weights = try await ExpressionWeights.ensureAvailable()
let portrait = AgentCatalog.thumbnailURL(for: AgentCatalog.defaultAgent)!

config.avatar = AvatarConfig(modelPath: weights, portraitPath: portrait)
config.apiKey = ProcessInfo.processInfo.environment["BITHUMAN_API_KEY"]
let chat = VoiceChat(config: config)
try await chat.start()  // throws .missingAPIKey / .authenticationFailed

let coordinator = AvatarCoordinator(chat: chat)
coordinator.bindToOrchestrator()
coordinator.prewarmPortraitURL = portrait

guard let bh = chat.bithuman else { return }
let renderer = AvatarRendererView(
    frame: .zero, idleFrame: chat.initialIdleFrame, clipMode: .circle)
let pump = FramePump(
    bithuman: bh, chat: chat, window: renderer, coordinator: coordinator)
coordinator.framePump = pump
chat.onBargeIn = { [weak pump] in pump?.buffer.flushSpeech() }
Host the renderer in your SwiftUI tree:
struct AvatarHost: NSViewRepresentable {
    let view: AvatarRendererView
    func makeNSView(context: Context) -> AvatarRendererView { view }
    func updateNSView(_ nsView: AvatarRendererView, context: Context) {}
}
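A minimal usage, assuming you pass in the renderer built above:
struct AvatarPane: View {
    let renderer: AvatarRendererView
    var body: some View {
        AvatarHost(view: renderer)
            .frame(width: 320, height: 320)  // the circular clip comes from clipMode above
    }
}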
The first call to ExpressionWeights.ensureAvailable() downloads the universal weights bundle (~1.6 GB) to ~/.cache/bithuman/expression/. SHA-256 verified; cached.
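If you want progress UI during that download, Troubleshooting below mentions an ensureAvailable(progress:) overload. A sketch, assuming the closure reports a 0–1 Double; verify the exact signature against the SDK headers:
// Inside Lifecycle.start(), before building the AvatarConfig.
let weights = try await ExpressionWeights.ensureAvailable { fraction in
    Task { @MainActor in
        self.status = "downloading avatar weights… \(Int(fraction * 100))%"
    }
}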

Permissions

Mac apps need:
  • Microphone — the system prompts on the first chat.start().
  • Speech Recognition — Same.
Add to your app’s Info.plist:
<key>NSMicrophoneUsageDescription</key>
<string>Talk to your on-device assistant.</string>
<key>NSSpeechRecognitionUsageDescription</key>
<string>Recognise what you say so the assistant can respond.</string>
Sandboxed apps additionally need the audio-input entitlement:
<key>com.apple.security.device.audio-input</key>
<true/>
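chat.start() triggers both prompts on first run; if you'd rather surface the microphone dialog on your own terms, a pre-flight sketch using AVFoundation (the helper is ours, not SDK API):
import AVFoundation

// Ask for microphone access up front; returns false if denied or restricted.
func ensureMicrophoneAccess() async -> Bool {
    switch AVCaptureDevice.authorizationStatus(for: .audio) {
    case .authorized:
        return true
    case .notDetermined:
        return await AVCaptureDevice.requestAccess(for: .audio)
    default:
        return false  // point users at System Settings → Privacy & Security
    }
}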

Distribution

Mac apps consuming bitHumanKit ship via the standard channels:
  • Direct DMG with Sparkle auto-update (the bitHuman Mac reference app uses this; see bithuman-apps/Mac).
  • Mac App Store — same app, archived through Xcode.
  • Homebrew Cask — for CLI / dev tools (see bithuman-cli for the canonical example).

Reference apps

  • bithuman-mac — full Mac app with floating avatar window, agent picker, voice gallery, drag-drop face swap, Sparkle DMG packaging. Annotated source designed to be cloned and adapted.
  • bithuman-cli — same SDK, CLI shell. Three modes (text / voice / video). Available via Homebrew (brew install bithuman-cli).

Troubleshooting

See the dedicated Troubleshooting page. The most common Mac-specific issues:
  • App freezes on first launch — first run downloads ~3 GB of models. Add a progress UI hook via ExpressionWeights.ensureAvailable(progress:).
  • Microphone permission denied silently — your app is sandboxed but doesn’t include the audio-input entitlement, or Info.plist is missing NSMicrophoneUsageDescription. Add the entitlement and the usage description (see Permissions above).
  • unsupportedHardware thrown on M2 — the engine refuses pre-M3 silicon. There’s no override; the silicon doesn’t have the bandwidth.
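For that last case, a hedged fallback sketch (assuming unsupportedHardware is a VoiceChatError case like the others, and that audio-only mode, which skips the avatar engine, is acceptable on older machines):
var config = VoiceChatConfig()        // configured as above, avatar included
var chat = VoiceChat(config: config)
do {
    try await chat.start()
} catch VoiceChatError.unsupportedHardware {
    // Drop the avatar pipeline and retry audio-only.
    config.avatar = nil
    chat = VoiceChat(config: config)
    try await chat.start()
}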

Next