Architecture families

The library ships six architecture families, each parameterised by config. Once a family is supported, adding a new model that shares it is a weights + config operation; no Zig code change is required.

The six families

Source: src/arch/.

Family	Module	Examples
Llama-style decoder	`tomoul.arch.llama`	Llama 3, Phi-4, Qwen, gpt-oss, InkubaLM
DeltaNet / hybrid	`tomoul.arch.deltanet`	Recent recurrent + attention hybrid models
Transformer encoder	`tomoul.arch.transformer_encoder`	bge-m3, sentence-transformer, XLM-RoBERTa
Transformer decoder	`tomoul.arch.transformer_decoder`	Smaller generative, decoder-only models
Encoder-decoder	`tomoul.arch.encoder_decoder`	Whisper, NLLB-style translation
LSTM	`tomoul.arch.lstm`	Silero VAD, lightweight legacy models
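
As a rough illustration of the table above, the six families can be pictured as a closed set that a family name resolves into. The sketch below is not taken from src/arch/: the `Family` enum and the `familyFromName` and `modulePath` helpers are hypothetical names introduced here, and only the module paths come from the table.

```zig
const std = @import("std");

/// Hypothetical enum mirroring the six families listed in the table.
const Family = enum {
    llama,
    deltanet,
    transformer_encoder,
    transformer_decoder,
    encoder_decoder,
    lstm,
};

/// Resolve a family name to a Family, or null when no existing
/// family matches (the case where a new family would be needed).
fn familyFromName(name: []const u8) ?Family {
    return std.meta.stringToEnum(Family, name);
}

/// Return the module path the table lists for a family.
fn modulePath(family: Family) []const u8 {
    return switch (family) {
        .llama => "tomoul.arch.llama",
        .deltanet => "tomoul.arch.deltanet",
        .transformer_encoder => "tomoul.arch.transformer_encoder",
        .transformer_decoder => "tomoul.arch.transformer_decoder",
        .encoder_decoder => "tomoul.arch.encoder_decoder",
        .lstm => "tomoul.arch.lstm",
    };
}

test "known family resolves to its module" {
    const family = familyFromName("lstm") orelse return error.UnknownFamily;
    try std.testing.expectEqualStrings("tomoul.arch.lstm", modulePath(family));
}

test "unknown family yields null" {
    try std.testing.expect(familyFromName("not_a_family") == null);
}
```

Because the `switch` is exhaustive, adding a seventh variant to the enum would fail to compile until it is given a module path, which matches the idea that a new family is a deliberate code change while a new model is not.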

Loading a config

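// Assumes the package is imported as `tomoul` and `a` is a std.mem.Allocator.
// Parse the model's hyperparameters from its config.json.
const cfg     = try tomoul.format.config_json.load(a, "phi-4/config.json");
// Open the weights file (safetensors format).
const weights = try tomoul.format.safetensors.open(a, "phi-4/model.safetensors");
// Build a Llama-family model from config + weights; `.{}` is an empty options struct.
var model     = try tomoul.arch.llama.LlamaModel.fromConfig(a, cfg, weights, .{});
defer model.deinit();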

Adding a model

If the architecture is already supported, dropping in a new model is a config + weights operation that needs no Zig code. Register it in src/models/ to make it discoverable via tomoul.models.<slug>.
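
To make "config + weights, no Zig code" concrete, here is a minimal sketch of two models that differ only in their config values and go through the same code path. The `DecoderConfig` struct, its field names, and the numbers are all hypothetical; they are not the library's actual config type and are not taken from any real model's config.json.

```zig
const std = @import("std");

/// Hypothetical subset of the hyperparameters a decoder config might carry.
const DecoderConfig = struct {
    hidden_size: usize,
    num_layers: usize,
    num_heads: usize,
    num_kv_heads: usize,

    /// Check the invariants a shared forward pass would rely on.
    fn validate(self: DecoderConfig) !void {
        if (self.num_heads == 0 or self.num_kv_heads == 0) return error.ZeroHeads;
        if (self.hidden_size % self.num_heads != 0) return error.HiddenNotDivisibleByHeads;
        if (self.num_heads % self.num_kv_heads != 0) return error.HeadsNotDivisibleByKvHeads;
    }

    /// Per-head width; only meaningful after validate() succeeds.
    fn headDim(self: DecoderConfig) usize {
        return self.hidden_size / self.num_heads;
    }
};

// Two illustrative "models": same struct, same code, different numbers.
const model_a = DecoderConfig{ .hidden_size = 512, .num_layers = 4, .num_heads = 8, .num_kv_heads = 2 };
const model_b = DecoderConfig{ .hidden_size = 768, .num_layers = 6, .num_heads = 12, .num_kv_heads = 4 };

test "both configs pass through the same code path" {
    try model_a.validate();
    try model_b.validate();
    try std.testing.expectEqual(@as(usize, 64), model_a.headDim());
    try std.testing.expectEqual(@as(usize, 64), model_b.headDim());
}

test "an inconsistent config is rejected before any forward pass" {
    const bad = DecoderConfig{ .hidden_size = 100, .num_layers = 1, .num_heads = 8, .num_kv_heads = 2 };
    try std.testing.expectError(error.HiddenNotDivisibleByHeads, bad.validate());
}
```

Validating the config up front is one plausible way to keep the forward pass free of per-model special cases; whether the library does this at load time is not stated here.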

Adding a new family

New families live under src/arch/<family>/. Each family consists of one Zig file (the forward pass), a config struct, and a validation harness that checks its outputs against a reference implementation. See CONTRIBUTING.md in the repo for the full process.
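
The validation harness is the least self-explanatory of the three pieces, so a minimal sketch may help. It assumes the harness compares forward-pass outputs element-wise against values exported from a reference implementation, within a tolerance. The `forward` function is a toy stand-in (element-wise ReLU), and `maxAbsDiff` and the tolerance are hypothetical; the real process is the one described in CONTRIBUTING.md.

```zig
const std = @import("std");

/// Largest absolute element-wise difference between two slices.
/// NaN is reported as an error so it cannot slip past the comparison.
fn maxAbsDiff(actual: []const f32, expected: []const f32) !f32 {
    if (actual.len != expected.len) return error.LengthMismatch;
    var worst: f32 = 0;
    var i: usize = 0;
    while (i < actual.len) : (i += 1) {
        const d = actual[i] - expected[i];
        if (std.math.isNan(d)) return error.NotANumber;
        const abs_d = if (d < 0) -d else d;
        if (abs_d > worst) worst = abs_d;
    }
    return worst;
}

/// Toy stand-in for a family's forward pass: element-wise ReLU.
fn forward(input: []const f32, output: []f32) void {
    var i: usize = 0;
    while (i < input.len) : (i += 1) {
        output[i] = if (input[i] > 0) input[i] else 0;
    }
}

test "forward pass matches reference within tolerance" {
    const input = [_]f32{ -1.0, 0.5, 2.0 };
    // In a real harness these would be exported from the reference implementation.
    const reference = [_]f32{ 0.0, 0.5, 2.0 };
    var output: [3]f32 = undefined;
    forward(&input, &output);
    const diff = try maxAbsDiff(&output, &reference);
    try std.testing.expect(diff <= 1e-5);
}
```

Reporting the maximum difference rather than a plain pass/fail makes it easier to tell a numerical-precision drift from a genuinely wrong forward pass.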

Roadmap

GGUF support in the format readers, alongside safetensors and .tl.
New families land when a model the catalog needs doesn't fit an existing one. The bar to add a family is high; the bar to add a model under an existing family is just a config + weights.

Last updated 13 May 2026