

English · 简体中文


Xybrid

Run LLMs, ASR, and TTS natively in apps and games.
Rust core · iOS · Android · Flutter · Unity
Private, offline, no cloud required.

Documentation · SDKs · Models · Join Discord · Follow on X · Issues


Desktop demo     Android demo

Start Here

| Goal | Path |
|---|---|
| Fastest demo (2 min) | Download CLI → |
| Build a mobile or desktop app | Flutter SDK → |
| Add AI NPCs to your game | Unity SDK → and try the 3D tavern demo |
| Android native | Kotlin SDK → |
| Rust / embedded | Core crate → |

Game demo

SDKs

Xybrid is a Rust-powered runtime with native bindings for every major platform.

| SDK | Platforms | Install | Status | Sample |
|---|---|---|---|---|
| Flutter | iOS, Android, macOS, Linux, Windows | pub.dev | Available | README |
| Unity | macOS, Windows, Linux, iOS, Android | See below | Available | Unity 3D AI tavern |
| Swift | iOS, macOS | Swift Package Manager | Coming Soon | README |
| Kotlin | Android | Maven Central | Available | README |
| CLI | macOS, Linux, Windows | Download binary | Available | |
| Rust | All | xybrid-core / xybrid-sdk | Available | |

Every SDK wraps the same Rust core — identical model support and behavior across all platforms.
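Sharing one core across hosts follows the usual pattern: the Rust runtime exposes a C ABI, and each SDK is a thin wrapper over those exports. A minimal illustrative sketch of that binding surface — the symbol and signature here are hypothetical, not the actual Xybrid FFI:

```rust
use std::ffi::{c_char, CStr};

/// Hypothetical C-ABI export of the kind each SDK binds against.
/// Returns 0 on success, a negative status code on error.
#[no_mangle]
pub extern "C" fn demo_run_model(model_id: *const c_char, input: *const c_char) -> i32 {
    // Null checks keep the ABI safe to call from any host language.
    if model_id.is_null() || input.is_null() {
        return -1;
    }
    let id = unsafe { CStr::from_ptr(model_id) };
    match id.to_str() {
        Ok(s) if !s.is_empty() => 0, // success: a real core would dispatch to the runtime here
        _ => -2,                     // invalid UTF-8 or empty model id
    }
}
```

Because every language binding calls the same exported functions, model support and behavior cannot drift between SDKs.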

Install

Unity — Package Manager → Add from git URL:

https://github.com/xybrid-ai/xybrid.git#upm

The upm branch contains pre-built native libraries for all platforms. To pin a specific version: https://github.com/xybrid-ai/xybrid.git#upm/v0.1.0-beta8

Flutter — add to your pubspec.yaml:

dependencies:
  xybrid_flutter: ^0.1.0

Kotlin (Android) — add to your build.gradle.kts:

dependencies {
    implementation("ai.xybrid:xybrid-kotlin:0.1.0-beta8")
}

Quick Start

See each SDK's README for platform-specific setup: Flutter · Unity · Swift · Kotlin · Rust

Single Model

Run a model with one line from the CLI, or a couple of lines from any SDK:

CLI:

xybrid run kokoro-82m --input "Hello world" -o output.wav

Flutter:

final model = await Xybrid.model('kokoro-82m').load();
final result = await model.run(XybridEnvelope.text('Hello world'));
// result → 24kHz WAV audio

Kotlin:

val model = XybridModelLoader.fromRegistry("kokoro-82m").load()
val result = model.run(Envelope.text("Hello world"))
// result → 24kHz WAV audio

Swift:

let model = try ModelLoader.fromRegistry(modelId: "kokoro-82m").load()
let result = try model.run(envelope: Envelope.text("Hello world"))
// result → 24kHz WAV audio

Unity (C#):

var model = XybridClient.LoadModel("kokoro-82m");
var result = model.Run(Envelope.Text("Hello world"));
// result → 24kHz WAV audio

Rust:

let model = Xybrid::model("kokoro-82m").load()?;
let result = model.run(&Envelope::text("Hello world"))?;
// result → 24kHz WAV audio
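In each snippet above, the result is a standard RIFF/WAV byte buffer at 24 kHz. As a sanity check, the sample rate can be read straight out of the header — a std-only sketch assuming the canonical 44-byte PCM header layout (a robust parser would walk the chunks instead):

```rust
/// Read the sample-rate field (bytes 24..28, little-endian) from a canonical
/// 44-byte PCM WAV header. Returns None if the buffer is too short or not RIFF/WAVE.
fn wav_sample_rate(bytes: &[u8]) -> Option<u32> {
    if bytes.len() < 44 || &bytes[0..4] != b"RIFF" || &bytes[8..12] != b"WAVE" {
        return None;
    }
    Some(u32::from_le_bytes([bytes[24], bytes[25], bytes[26], bytes[27]]))
}
```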

Pipelines

Chain models together — build a voice assistant from three stages of YAML:

# voice-assistant.yaml
name: voice-assistant
stages:
  - model: whisper-tiny    # Speech → text
  - model: qwen2.5-0.5b    # Process with LLM
  - model: kokoro-82m      # Text → speech

CLI:

xybrid run voice-assistant.yaml --input question.wav -o response.wav

Flutter:

final pipeline = Xybrid.pipeline(yaml: yamlString);
final result = await pipeline.run(XybridEnvelope.audio(bytes: audioBytes, sampleRate: 16000));

Kotlin:

// Pipeline support coming soon — use single model loading for now

Swift:

// Pipeline support coming soon — use single model loading for now

Unity (C#):

// Pipeline support coming soon — use single model loading for now

Rust:

let pipeline = Xybrid::pipeline(&yaml_string).load()?;
pipeline.load_models()?;
let result = pipeline.run(&Envelope::audio(audio_bytes))?;
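Conceptually, a pipeline is just composition: each stage consumes the previous stage's envelope and emits a new one, so on SDKs where pipeline support is still coming, the same flow can be chained manually with single-model calls. An illustrative, self-contained sketch of that orchestration — this `Stage` type and `Envelope` enum are stand-ins, not the Xybrid API:

```rust
// Each stage maps an envelope to the next envelope, mirroring the
// whisper-tiny -> qwen2.5-0.5b -> kokoro-82m flow from the YAML above.
enum Envelope {
    Audio(Vec<u8>),
    Text(String),
}

type Stage = Box<dyn Fn(Envelope) -> Envelope>;

/// Run stages left to right, feeding each stage's output into the next.
fn run_pipeline(stages: Vec<Stage>, input: Envelope) -> Envelope {
    stages.into_iter().fold(input, |env, stage| stage(env))
}
```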

Supported Models

All models run entirely on-device. No cloud, no API keys required. Browse the full registry with xybrid models list.

Start with these

| Model | Type | Params | Why start here |
|---|---|---|---|
| SmolLM2 360M | LLM | 360M | Best quality-to-size ratio for any device |
| Kokoro 82M | TTS | 82M | High-quality speech, 24 voices, fast |
| Whisper Tiny | ASR | 39M | Accurate multilingual transcription |

Speech-to-Text

| Model | Params | Format | Description |
|---|---|---|---|
| Whisper Tiny | 39M | SafeTensors | Multilingual transcription (Candle runtime) |
| Wav2Vec2 Base | 95M | ONNX | English ASR with CTC decoding |

Text-to-Speech

| Model | Params | Format | Description |
|---|---|---|---|
| Kokoro 82M | 82M | ONNX | High-quality, 24 natural voices |
| KittenTTS Nano | 15M | ONNX | Ultra-lightweight, 8 voices |

Language Models

| Model | Params | Format | Description |
|---|---|---|---|
| Gemma 3 1B | 1B | GGUF Q4_K_M | Google's mobile-optimized LLM |
| Llama 3.2 1B | 1B | GGUF Q4_K_M | Meta's general purpose, 128K context |
| Qwen 2.5 0.5B | 500M | GGUF Q4_K_M | Compact on-device chat |
| Qwen 3.5 0.8B | 800M | GGUF Q4_K_M | Latest Qwen with reasoning (thinking mode) |
| Qwen 3.5 2B | 2B | GGUF Q4_K_M | Larger Qwen 3.5 with extended reasoning |
| SmolLM2 360M | 360M | GGUF Q4_K_M | Best tiny LLM, excellent quality/size ratio |
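A rough rule of thumb for the download sizes implied by the table: Q4_K_M averages on the order of 4.5–5 bits per weight (the exact figure varies with each model's tensor mix), so a 1B-parameter model lands near 0.6 GB on disk before runtime overhead. A sketch of the estimate, with the bits-per-weight value as an explicit assumption:

```rust
/// Rough on-disk size estimate for a quantized model, in gigabytes.
/// `bits_per_weight` is an approximation (≈4.85 for Q4_K_M), not an exact spec.
fn approx_size_gb(params: f64, bits_per_weight: f64) -> f64 {
    params * bits_per_weight / 8.0 / 1e9
}
```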

Coming Soon

| Model | Type | Params | Priority | Status |
|---|---|---|---|---|
| Phi-4 Mini | LLM | 3.8B | P2 | Spec Ready (first multi-quant: Q4, Q8, FP16) |
| Qwen3 0.6B | LLM | 600M | P2 | Planned |
| Trinity Nano | LLM (MoE) | 6B (1B active) | P2 | Planned |
| LFM2 700M | LLM | 700M | P2 | Planned |
| Nomic Embed Text v1.5 | Embeddings | 137M | P1 | Blocked (needs Tokenize/MeanPool steps) |
| LFM2-VL 450M | Vision | 450M | P2 | Planned |
| Whisper Tiny CoreML | ASR | 39M | P2 | Planned |
| Qwen3-TTS 0.6B | TTS | 600M | P2 | Blocked (needs custom SafeTensors runtime) |
| Chatterbox Turbo | TTS | 350M | P3 | Blocked (needs ModelGraph template) |

Features

| Capability | iOS | Android | macOS | Linux | Windows |
|---|---|---|---|---|---|
| Speech-to-Text | ✅ | ✅ | ✅ | ✅ | ✅ |
| Text-to-Speech | ✅ | ✅ | ✅ | ✅ | ✅ |
| Language Models | ✅ | ✅ | ✅ | ✅ | ✅ |
| Vision Models | 🔜 | 🔜 | 🔜 | 🔜 | 🔜 |
| Embeddings | 🔜 | 🔜 | 🔜 | 🔜 | 🔜 |
| Pipeline Orchestration | ✅ | ✅ | ✅ | ✅ | ✅ |
| Model Download & Caching | ✅ | ✅ | ✅ | ✅ | ✅ |
| Hardware Acceleration | Metal, ANE | CPU | Metal, ANE | CUDA | CUDA |

SDK pipeline support: Flutter ✅ · Rust ✅ · Kotlin 🔜 · Swift 🔜 · Unity 🔜


Why Xybrid?

  • Privacy first — All inference runs on-device. Your data never leaves the device.
  • Offline capable — No internet required after initial model download.
  • Cross-platform — One API across iOS, Android, macOS, Linux, and Windows.
  • Pipeline orchestration — Chain models together (ASR → LLM → TTS) in a single call.
  • Automatic optimization — Hardware acceleration on Apple Neural Engine, Metal, and CUDA.

How it compares

| | Xybrid | Ollama | llama.cpp | ONNX Runtime |
|---|---|---|---|---|
| Mobile (iOS/Android) | ✅ | ❌ | ❌ | ✅ |
| Game engine (Unity) | ✅ | ❌ | ❌ | ❌ |
| Multi-stage pipelines | ✅ | ❌ | ❌ | ❌ |
| ASR + TTS + LLM in one SDK | ✅ | ❌ | ❌ | ❌ |
| Runs in-process (no server) | ✅ | ❌ | ✅ | ✅ |
| No cloud required | ✅ | ✅ | ✅ | ✅ |

Community

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines on setting up your development environment, submitting pull requests, and adding new models.


License

Apache License 2.0 — see LICENSE for details.