Tiny but Mighty.
Impossibly Fast AI
for Apple Silicon.
A Zig-native GGUF inference engine built specifically for Apple Silicon: a single, highly optimized binary with a CLI focused exclusively on Apple Metal and GGUF files.
Core Architecture
Apple Silicon Native
Built exclusively for Apple Silicon, making it incredibly fast for local AI tasks.
Metal First
Optimized specifically for Apple Metal to deliver the highest possible performance.
GGUF Only
A deliberately narrow focus on GGUF files keeps local inference simple and understandable.
Single Binary
A highly optimized CLI experience. Just drop the binary and run with zero external dependencies.
Quick Start
Getting up and running takes just a few seconds. Ensure you have Zig 0.15.2 or newer installed.
```shell
# 1. Clone
git clone https://github.com/Alex188dot/ziggy-llm.git
cd ziggy-llm

# 2. Build
zig build -Doptimize=ReleaseFast

# 3. Chat
./zig-out/bin/ziggy-llm chat \
  --model path/to/model.gguf \
  --backend metal \
  --temperature 0 \
  --seed 42
```
Blazing Fast Performance
🏎️ Tokens per second (t/s) measured on a MacBook Pro (M3, 18 GB). Higher is better.
| Model | GGUF | llama.cpp | ZINC | ziggy-llm |
|---|---|---|---|---|
| TinyLlama 1.1B | Q4_K_M | 151.4 | — | ~123 |
| Llama 3.2 3B | Q4_K_M | 53.5 | — | ~48 |
| Llama 3.1 8B | Q4_K_M | 23.1 | ~10 | ~22.4 |
| Mistral 7B | Q4_K_M | 28.0 | — | ~20 |
| Ministral 3B | Q4_K_M | 43.7 | — | ~45.5 |
| Gemma 2 2B | Q4_K_M | — | — | ~48 |
| Qwen3 1.7B | Q4_K_M | 92.0 | — | ~65 |
| Qwen3 8B | Q4_K_M | 25.0 | ~8 | ~17.5 |
| Qwen3.5 2B | Q4_K_M | 62.4 | — | ~48.9 |
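Throughput here is simply generated tokens divided by wall-clock generation time. A quick Python sketch of how such a figure is computed (the timing values are hypothetical, not from our benchmark harness):

```python
def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    """Throughput = number of generated tokens / wall-clock seconds."""
    return n_tokens / elapsed_s

# Example: 256 tokens generated in 11.4 s works out to roughly 22.5 t/s,
# in the same ballpark as the 8B Q4_K_M rows above.
print(round(tokens_per_second(256, 11.4), 1))
```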
Join the Community
🤝 ziggy-llm is open source and under active development. Check out our issue tracker for items that need immediate attention.
Support the Project
If you find this project interesting, please consider starring the repo. It genuinely helps us grow and reach more developers in the local AI ecosystem!
Star on GitHub