tomoul pull
Pre-fetch model weights so `tomoul serve` starts instantly.
Usage
tomoul pull baai/bge-m3
tomoul pull microsoft/phi-4 --quant int4
Pull is idempotent: re-running it on a cached model is a no-op (no re-download)
unless you pass --force.
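Because a repeated pull is a no-op, a provisioning script can pull the same list of models on every run. The sketch below is one way to do that. `prefetch_models` and the `TOMOUL_BIN` variable are hypothetical names introduced for this example, not part of tomoul, and the sketch assumes `tomoul pull` exits non-zero when a download fails.

```shell
# Hypothetical wrapper, not part of tomoul: pull every model ID given as an
# argument and stop at the first failure.
prefetch_models() {
  for model_id in "$@"; do
    # TOMOUL_BIN is an assumed override for this sketch; it defaults to `tomoul`.
    "${TOMOUL_BIN:-tomoul}" pull "$model_id" || return 1
  done
}

# Usage: prefetch_models baai/bge-m3 microsoft/phi-4
```

Models that are already cached should be skipped by the pull itself, so the wrapper does not need its own "already downloaded" check.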
Where weights live
~/.cache/tomoul/<provider>/<model>/<quant>/ by default. Override with
--cache or the TOMOUL_CACHE_DIR environment variable, or set
cache_dir in ~/.config/tomoul/config.toml.
See tomoul models to list what's cached and how much
space it's using.
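To check by hand where a given pull should land, the documented layout can be sketched as a small shell function. `tomoul_cache_path` is a hypothetical helper, not a tomoul command. It assumes the model ID (for example `microsoft/phi-4`) maps directly onto `<provider>/<model>`, it only honors `TOMOUL_CACHE_DIR`, and it ignores `--cache` and `config.toml`.

```shell
# Hypothetical helper, not part of tomoul: print the directory a pulled model
# would occupy under the documented layout <cache root>/<provider>/<model>/<quant>/.
tomoul_cache_path() {
  model_id="$1"   # provider/model, e.g. microsoft/phi-4
  quant="$2"      # e.g. int4; this sketch does not assume a default quant
  # Use TOMOUL_CACHE_DIR when it is set and non-empty, else the default root.
  root="${TOMOUL_CACHE_DIR:-$HOME/.cache/tomoul}"
  # Strip one trailing slash from the root so the output has no "//".
  printf '%s/%s/%s/\n' "${root%/}" "$model_id" "$quant"
}

tomoul_cache_path microsoft/phi-4 int4
```

With `TOMOUL_CACHE_DIR` unset, the last line prints a path under `~/.cache/tomoul/`.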
Flags
| Flag | Notes |
|---|---|
| --quant | fp16, int8, int4, q8_k, q4_0 |
| --cache | Override cache dir. |
| --force | Re-download even if present. |
Last updated 13 May 2026