tomoul pull

Pre-fetch model weights so `tomoul serve` starts instantly.

Usage

tomoul pull baai/bge-m3
tomoul pull microsoft/phi-4 --quant int4

Pull is idempotent: re-running it on a cached model is a no-op (nothing is re-downloaded) unless you pass --force.
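Because re-running is a no-op, pull can be called unconditionally from a provisioning or container-build script. The following is a minimal sketch, not part of tomoul itself. The helper name `prefetch_models` is hypothetical, and it assumes `tomoul pull` exits with a non-zero status on failure, which this page does not state.

```shell
# Hypothetical helper: pull every model ID passed as an argument.
# Assumption: `tomoul pull` returns a non-zero exit status when a pull fails.
prefetch_models() {
  for model in "$@"; do
    # Stop at the first failed pull so the caller can react to it.
    tomoul pull "$model" || return 1
  done
}
```

Calling `prefetch_models baai/bge-m3 microsoft/phi-4` would then run one pull per model. Models already in the cache are skipped by pull itself.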

Where weights live

Weights are stored in ~/.cache/tomoul/<provider>/<model>/<quant>/ by default. To change the location, pass --cache, set the TOMOUL_CACHE_DIR environment variable, or set cache_dir in ~/.config/tomoul/config.toml.
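To illustrate the layout, here is a rough sketch of a hypothetical helper, `tomoul_cache_path`, that prints where a given model would be expected to land. It rests on several assumptions this page does not confirm: that a model ID such as baai/bge-m3 maps directly onto <provider>/<model>, and that TOMOUL_CACHE_DIR, when set, replaces ~/.cache/tomoul as the root. It ignores --cache and config.toml.

```shell
# Hypothetical helper: print the expected cache directory for a model.
# Assumptions: the model ID is used verbatim as <provider>/<model>, and
# TOMOUL_CACHE_DIR (if set) replaces ~/.cache/tomoul as the cache root.
# The --cache flag and cache_dir in config.toml are not considered here.
tomoul_cache_path() {
  model_id=$1   # e.g. baai/bge-m3
  quant=$2      # e.g. int4
  root=${TOMOUL_CACHE_DIR:-$HOME/.cache/tomoul}
  printf '%s/%s/%s/\n' "$root" "$model_id" "$quant"
}
```

For example, `ls "$(tomoul_cache_path microsoft/phi-4 int4)"` would list that directory if the assumptions hold.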

Run tomoul models to list what's cached and how much space it uses.

Flags

Flag      Notes
--quant   fp16, int8, int4, q8_k, q4_0
--cache   Override cache dir.
--force   Re-download even if present.
Last updated 13 May 2026