Architecture families

There are six architecture families, each parameterised by config. Once a family is implemented, adding a new model that shares it takes only weights and a config; no Zig code change is required.

The six families

Source: src/arch/.

Family               Module                           Examples
Llama-style decoder  tomoul.arch.llama                Llama 3, Phi-4, Qwen, gpt-oss, InkubaLM
DeltaNet / hybrid    tomoul.arch.deltanet             Trending recurrent + attention hybrids
Transformer encoder  tomoul.arch.transformer_encoder  bge-m3, sentence-transformer, XLM-RoBERTa
Transformer decoder  tomoul.arch.transformer_decoder  Smaller generative, decoder-only
Encoder-decoder      tomoul.arch.encoder_decoder      Whisper, NLLB-style translation
LSTM                 tomoul.arch.lstm                 Silero VAD, lightweight legacy

Loading a config

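// Parse the model's config.json (`a` is assumed to be the caller's allocator).
const cfg     = try tomoul.format.config_json.load(a, "phi-4/config.json");
// Open the safetensors weights file.
const weights = try tomoul.format.safetensors.open(a, "phi-4/model.safetensors");
// Build the Llama-family model from config and weights; `.{}` passes default options.
var model     = try tomoul.arch.llama.LlamaModel.fromConfig(a, cfg, weights, .{});
// Release the model when it goes out of scope.
defer model.deinit();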

Adding a model

If the architecture is already supported, adding a new model takes only a config and weights, with no Zig code. Register it in src/models/ to make it discoverable via tomoul.models.<slug>.
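
The registration step is not spelled out here, so the following is only a sketch of what a registry entry might look like. Family, ModelEntry, entries, find, and the slug phi_4 are all hypothetical names chosen for illustration; the actual layout of src/models/ may differ. The file paths are the ones used in the loading example above.

```zig
const std = @import("std");

// Hypothetical: the six families from the table above, as an enum.
const Family = enum {
    llama,
    deltanet,
    transformer_encoder,
    transformer_decoder,
    encoder_decoder,
    lstm,
};

// Hypothetical registry entry: a model is a slug, the family it reuses,
// and where its config and weights live. No forward-pass code is involved.
const ModelEntry = struct {
    slug: []const u8,
    family: Family,
    config_path: []const u8,
    weights_path: []const u8,
};

// Illustrative registry with a single entry.
const entries = [_]ModelEntry{
    .{
        .slug = "phi_4",
        .family = .llama,
        .config_path = "phi-4/config.json",
        .weights_path = "phi-4/model.safetensors",
    },
};

// Look up a registered model by slug; returns null if it is not registered.
fn find(slug: []const u8) ?ModelEntry {
    for (entries) |e| {
        if (std.mem.eql(u8, e.slug, slug)) return e;
    }
    return null;
}
```

Under this assumed layout, adding a second Llama-family model would mean appending one more entry to the list; nothing under src/arch/ would change.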

Adding a new family

New families live under src/arch/<family>/. Each family consists of one Zig file (the forward pass), a config struct, and a validation harness that checks outputs against a reference implementation. See CONTRIBUTING.md in the repo for the full process.
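
To make the config struct and validation harness more concrete, here is a minimal sketch of both. ExampleConfig, its fields, maxAbsDiff, and the 1e-3 tolerance are assumptions made for illustration, not tomoul's actual schema or harness; the reference values are stand-ins for outputs that a real harness would load from the reference implementation.

```zig
const std = @import("std");

// Hypothetical config struct for an illustrative family.
const ExampleConfig = struct {
    hidden_size: usize,
    num_layers: usize,
    num_heads: usize,

    // Reject configs the forward pass could not run with.
    fn validate(self: ExampleConfig) error{InvalidConfig}!void {
        if (self.hidden_size == 0 or self.num_layers == 0 or self.num_heads == 0) {
            return error.InvalidConfig;
        }
        // Each attention head needs an equal share of the hidden dimension.
        if (self.hidden_size % self.num_heads != 0) return error.InvalidConfig;
    }
};

// Largest absolute element-wise difference between two equally sized outputs.
fn maxAbsDiff(actual: []const f32, reference: []const f32) error{LengthMismatch}!f32 {
    if (actual.len != reference.len) return error.LengthMismatch;
    var worst: f32 = 0;
    var i: usize = 0;
    while (i < actual.len) : (i += 1) {
        const a = actual[i];
        const r = reference[i];
        const d = if (a > r) a - r else r - a;
        if (d > worst) worst = d;
    }
    return worst;
}

test "outputs match the reference within tolerance" {
    const cfg = ExampleConfig{ .hidden_size = 64, .num_layers = 2, .num_heads = 4 };
    try cfg.validate();

    // Stand-in values; a real harness would compare the forward pass output
    // with values dumped from the reference implementation.
    const reference = [_]f32{ 0.10, -0.20, 0.30 };
    const actual = [_]f32{ 0.10, -0.20, 0.3001 };
    try std.testing.expect((try maxAbsDiff(&actual, &reference)) < 1e-3);
}
```

Running the file with zig test executes the comparison. A maximum-absolute-difference check is only one possible criterion; a harness could equally use a relative tolerance or compare top-k token agreement.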

Roadmap

  • GGUF support in the format readers, alongside safetensors and .tl.
  • New families land only when a model the catalog needs doesn't fit an existing one. The bar to add a family is high; the bar to add a model under an existing family is just a config and weights.
Last updated 13 May 2026