Run LLMs, ASR, and TTS natively in apps and games.
Rust core · iOS · Android · Flutter · Unity
Private, offline, no cloud required.
Documentation · SDKs · Models · Join Discord · Follow on X · Issues
| Goal | Path |
|---|---|
| Fastest demo (2 min) | Download CLI → |
| Build a mobile or desktop app | Flutter SDK → |
| Add AI NPCs to your game | Unity SDK → and try the 3D tavern demo |
| Android native | Kotlin SDK → |
| Rust / embedded | Core crate → |
Xybrid is a Rust-powered runtime with native bindings for every major platform.
| SDK | Platforms | Install | Status | Sample |
|---|---|---|---|---|
| Flutter | iOS, Android, macOS, Linux, Windows | pub.dev | Available | README |
| Unity | macOS, Windows, Linux, iOS, Android | See below | Available | Unity 3D AI tavern |
| Swift | iOS, macOS | Swift Package Manager | Coming Soon | README |
| Kotlin | Android | Maven Central | Available | README |
| CLI | macOS, Linux, Windows | Download binary | Available | — |
| Rust | All | xybrid-core / xybrid-sdk | Available | — |
Every SDK wraps the same Rust core — identical model support and behavior across all platforms.
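The cross-SDK envelope is easiest to picture as a small tagged union: one type carries every modality, so each binding only marshals that one shape. A minimal, hypothetical sketch — not the actual `xybrid-core` definition — assuming only the `Envelope::text` / `Envelope::audio` constructors that appear in the examples below:

```rust
// Hypothetical sketch of a cross-SDK envelope type — NOT the actual
// xybrid-core definition. A single tagged union carries every modality,
// so each language binding only has to marshal this one type.
pub enum Envelope {
    Text(String),
    Audio { bytes: Vec<u8>, sample_rate: u32 },
}

impl Envelope {
    pub fn text(s: &str) -> Self {
        Envelope::Text(s.to_string())
    }

    pub fn audio(bytes: Vec<u8>, sample_rate: u32) -> Self {
        Envelope::Audio { bytes, sample_rate }
    }
}
```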
Unity — Package Manager → Add from git URL:

```
https://github.com/xybrid-ai/xybrid.git#upm
```

The `upm` branch contains pre-built native libraries for all platforms. To pin a specific version:

```
https://github.com/xybrid-ai/xybrid.git#upm/v0.1.0-beta8
```
Flutter — add to your pubspec.yaml:

```yaml
dependencies:
  xybrid_flutter: ^0.1.0
```

Kotlin (Android) — add to your build.gradle.kts:

```kotlin
dependencies {
    implementation("ai.xybrid:xybrid-kotlin:0.1.0-beta8")
}
```

See each SDK's README for platform-specific setup: Flutter · Unity · Swift · Kotlin · Rust
Run a model in one line from the CLI, or three lines from any SDK:
CLI:

```sh
xybrid run kokoro-82m --input "Hello world" -o output.wav
```

Flutter:

```dart
final model = await Xybrid.model('kokoro-82m').load();
final result = await model.run(XybridEnvelope.text('Hello world'));
// result → 24kHz WAV audio
```

Kotlin:

```kotlin
val model = XybridModelLoader.fromRegistry("kokoro-82m").load()
val result = model.run(Envelope.text("Hello world"))
// result → 24kHz WAV audio
```

Swift:

```swift
let model = try ModelLoader.fromRegistry(modelId: "kokoro-82m").load()
let result = try model.run(envelope: Envelope.text("Hello world"))
// result → 24kHz WAV audio
```

Unity (C#):

```csharp
var model = XybridClient.LoadModel("kokoro-82m");
var result = model.Run(Envelope.Text("Hello world"));
// result → 24kHz WAV audio
```

Rust:

```rust
let model = Xybrid::model("kokoro-82m").load()?;
let result = model.run(&Envelope::text("Hello world"))?;
// result → 24kHz WAV audio
```

Chain models together — build a voice assistant in 3 lines of YAML:
```yaml
# voice-assistant.yaml
name: voice-assistant
stages:
  - model: whisper-tiny   # Speech → text
  - model: qwen2.5-0.5b   # Process with LLM
  - model: kokoro-82m     # Text → speech
```

CLI:

```sh
xybrid run voice-assistant.yaml --input question.wav -o response.wav
```

Flutter:

```dart
final pipeline = Xybrid.pipeline(yaml: yamlString);
final result = await pipeline.run(XybridEnvelope.audio(bytes: audioBytes, sampleRate: 16000));
```

Kotlin:

```kotlin
// Pipeline support coming soon — use single model loading for now
```

Swift:

```swift
// Pipeline support coming soon — use single model loading for now
```

Unity (C#):

```csharp
// Pipeline support coming soon — use single model loading for now
```

Rust:

```rust
let pipeline = Xybrid::pipeline(&yaml_string).load()?;
pipeline.load_models()?;
let result = pipeline.run(&Envelope::audio(audio_bytes))?;
```

All models run entirely on-device. No cloud, no API keys required. Browse the full registry with `xybrid models list`.
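Conceptually, a pipeline is just a fold: each stage consumes the previous stage's output. A toy sketch of that chaining — stages here are plain string functions for illustration, whereas the real runtime loads models from the YAML spec and passes envelopes:

```rust
// Toy illustration of stage chaining, not the Xybrid scheduler:
// each stage's output becomes the next stage's input.
fn run_pipeline(stages: &[fn(String) -> String], input: String) -> String {
    stages.iter().fold(input, |acc, stage| stage(acc))
}
```

This is why the YAML needs no explicit wiring between stages: order alone determines data flow.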
| Model | Type | Params | Why start here |
|---|---|---|---|
| SmolLM2 360M | LLM | 360M | Best quality-to-size ratio for any device |
| Kokoro 82M | TTS | 82M | High-quality speech, 24 voices, fast |
| Whisper Tiny | ASR | 39M | Accurate multilingual transcription |
| Model | Params | Format | Description |
|---|---|---|---|
| Whisper Tiny | 39M | SafeTensors | Multilingual transcription (Candle runtime) |
| Wav2Vec2 Base | 95M | ONNX | English ASR with CTC decoding |
| Model | Params | Format | Description |
|---|---|---|---|
| Kokoro 82M | 82M | ONNX | High-quality, 24 natural voices |
| KittenTTS Nano | 15M | ONNX | Ultra-lightweight, 8 voices |
| Model | Params | Format | Description |
|---|---|---|---|
| Gemma 3 1B | 1B | GGUF Q4_K_M | Google's mobile-optimized LLM |
| Llama 3.2 1B | 1B | GGUF Q4_K_M | Meta's general purpose, 128K context |
| Qwen 2.5 0.5B | 500M | GGUF Q4_K_M | Compact on-device chat |
| Qwen 3.5 0.8B | 800M | GGUF Q4_K_M | Latest Qwen with reasoning (thinking mode) |
| Qwen 3.5 2B | 2B | GGUF Q4_K_M | Larger Qwen 3.5 with extended reasoning |
| SmolLM2 360M | 360M | GGUF Q4_K_M | Best tiny LLM, excellent quality/size ratio |
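To judge which of these fits a device, a back-of-envelope weight-memory estimate helps: parameter count × bits per weight ÷ 8. Q4_K_M averages roughly 4.5–5 bits per weight (mixed 4/6-bit blocks plus scales); the exact figure varies per model, so treat this as a rough sizing sketch:

```rust
// Rough weight-memory estimate: params * bits_per_weight / 8 bytes.
// Ignores KV cache, activations, and runtime overhead.
fn approx_weight_bytes(params: u64, bits_per_weight: f64) -> u64 {
    (params as f64 * bits_per_weight / 8.0) as u64
}
```

For example, a 1B model at ~4.85 bits/weight lands near 600 MB of weights, which is why 4-bit quants are the default for phones.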
| Model | Type | Params | Priority | Status |
|---|---|---|---|---|
| Phi-4 Mini | LLM | 3.8B | P2 | Spec Ready (first multi-quant: Q4, Q8, FP16) |
| Qwen3 0.6B | LLM | 600M | P2 | Planned |
| Trinity Nano | LLM (MoE) | 6B (1B active) | P2 | Planned |
| LFM2 700M | LLM | 700M | P2 | Planned |
| Nomic Embed Text v1.5 | Embeddings | 137M | P1 | Blocked (needs Tokenize/MeanPool steps) |
| LFM2-VL 450M | Vision | 450M | P2 | Planned |
| Whisper Tiny CoreML | ASR | 39M | P2 | Planned |
| Qwen3-TTS 0.6B | TTS | 600M | P2 | Blocked (needs custom SafeTensors runtime) |
| Chatterbox Turbo | TTS | 350M | P3 | Blocked (needs ModelGraph template) |
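For context on the Nomic Embed blocker above: a MeanPool step averages per-token embeddings into a single sentence vector. A minimal sketch of that operation — the function name and signature are illustrative, not the planned Xybrid step API:

```rust
// Average token embeddings (each a vector of the same dimension) into
// one pooled sentence embedding. Illustrative only — not the planned
// Xybrid pipeline-step API. Assumes a non-empty input.
fn mean_pool(token_embeddings: &[Vec<f32>]) -> Vec<f32> {
    let dim = token_embeddings[0].len();
    let n = token_embeddings.len() as f32;
    let mut pooled = vec![0.0f32; dim];
    for tok in token_embeddings {
        for (p, v) in pooled.iter_mut().zip(tok) {
            *p += v;
        }
    }
    for p in pooled.iter_mut() {
        *p /= n;
    }
    pooled
}
```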
| Capability | iOS | Android | macOS | Linux | Windows |
|---|---|---|---|---|---|
| Speech-to-Text | ✅ | ✅ | ✅ | ✅ | ✅ |
| Text-to-Speech | ✅ | ✅ | ✅ | ✅ | ✅ |
| Language Models | ✅ | ✅ | ✅ | ✅ | ✅ |
| Vision Models | 🔜 | 🔜 | 🔜 | 🔜 | 🔜 |
| Embeddings | 🔜 | 🔜 | 🔜 | 🔜 | 🔜 |
| Pipeline Orchestration | ✅ | ✅ | ✅ | ✅ | ✅ |
| Model Download & Caching | ✅ | ✅ | ✅ | ✅ | ✅ |
| Hardware Acceleration | Metal, ANE | CPU | Metal, ANE | CUDA | CUDA |
SDK pipeline support: Flutter ✅ · Rust ✅ · Kotlin 🔜 · Swift 🔜 · Unity 🔜
- Privacy first — All inference runs on-device. Your data never leaves the device.
- Offline capable — No internet required after initial model download.
- Cross-platform — One API across iOS, Android, macOS, Linux, and Windows.
- Pipeline orchestration — Chain models together (ASR → LLM → TTS) in a single call.
- Automatic optimization — Hardware acceleration on Apple Neural Engine, Metal, and CUDA.
| | Xybrid | Ollama | llama.cpp | ONNX Runtime |
|---|---|---|---|---|
| Mobile (iOS/Android) | ✅ | ❌ | ❌ | ✅ |
| Game engine (Unity) | ✅ | ❌ | ❌ | ❌ |
| Multi-stage pipelines | ✅ | ❌ | ❌ | ❌ |
| ASR + TTS + LLM in one SDK | ✅ | ❌ | ❌ | ❌ |
| Runs in-process (no server) | ✅ | ❌ | ✅ | ✅ |
| No cloud required | ✅ | ✅ | ✅ | ✅ |
We welcome contributions! See CONTRIBUTING.md for guidelines on setting up your development environment, submitting pull requests, and adding new models.
Apache License 2.0 — see LICENSE for details.



