Short answer: you can’t, in place.

The embedding dimension is machine-enforced as immutable for the life of a deployment. The embedding model is immutable by contract only: nothing in the runtime enforces it, so the operator owns it. The supported way to change either is:
  1. Stand up a new deployment at the desired configuration.
  2. Replay or re-embed your data into it out-of-band.
  3. Cut traffic over to the new deployment.
The rest of this page explains why, and what the safety boundaries actually are.

Why dimension is enforced and model is not

On boot, both the API (src/main.py lifespan) and the deriver (src/deriver/__main__.py) run the validator in src/startup/embedding_validator.py. It does a schema-qualified pg_attribute lookup against documents.embedding and message_embeddings.embedding, decodes the declared atttypmod, and compares it to EMBEDDING_VECTOR_DIMENSIONS. A mismatch crashes the process with an actionable error before any HTTP route is served or any queue task is processed.

There is no equivalent check for the model. The pgvector column does not record which model produced the vectors inside it, and this design intentionally avoids adding new persistent metadata fields. As a result, the runtime has no way to detect that you swapped text-embedding-3-small for a different model that emits the same dimension. That last point is a real footgun:

Changing EMBEDDING_MODEL_CONFIG__MODEL to a different model at the same dimension (for example text-embedding-3-small@1536 → text-embedding-3-large truncated to 1536) will silently succeed. New writes will use the new model; existing rows still hold vectors from the old model, which are not comparable with the new ones; recall quality will degrade with no startup or runtime warning.

Treat model identity as a contract you own. If you need to change it, follow the destroy + rebuild path below.
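The dimension check described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/startup/embedding_validator.py: the query text, the function name check_declared_dimensions, and the exception class are hypothetical. It also assumes that pgvector stores the declared dimension directly in atttypmod, with -1 meaning no dimension was declared.

```python
# Illustrative catalog query; not Honcho's actual SQL.
# Assumption: for pgvector columns, atttypmod holds the declared dimension
# directly, and -1 means the column was declared without one.
DECLARED_DIM_SQL = """
SELECT a.atttypmod
FROM pg_attribute a
JOIN pg_class c ON c.oid = a.attrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = %(schema)s
  AND c.relname = %(table)s
  AND a.attname = %(column)s
  AND NOT a.attisdropped
"""


class EmbeddingDimensionMismatch(RuntimeError):
    """Hypothetical error raised at startup on a schema/config mismatch."""


def check_declared_dimensions(declared: dict[str, int], expected: int) -> None:
    """Compare each column's declared dimension with the configured one.

    `declared` maps "table.column" to the atttypmod value read from the catalog.
    Raises EmbeddingDimensionMismatch listing every offending column.
    """
    problems = []
    for column, typmod in sorted(declared.items()):
        if typmod == -1:
            problems.append(f"{column} has no declared dimension")
        elif typmod != expected:
            problems.append(
                f"{column} is vector({typmod}), expected vector({expected})"
            )
    if problems:
        raise EmbeddingDimensionMismatch(
            "EMBEDDING_VECTOR_DIMENSIONS does not match the schema: "
            + "; ".join(problems)
        )
```

Note what the sketch cannot check: nothing in the catalog identifies the model, which is why a same-dimension model swap passes a check like this one.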

Recipe: changing dim or model

Concretely, for either a dim change or a model change:
  1. Provision the new deployment with the target environment.
    # On the new deployment:
    export EMBEDDING_VECTOR_DIMENSIONS=768
    export EMBEDDING_MODEL_CONFIG__TRANSPORT=openai
    export EMBEDDING_MODEL_CONFIG__MODEL=nomic-embed-text
    export EMBEDDING_MODEL_CONFIG__OVERRIDES__BASE_URL=http://your-ollama:11434/v1
    alembic upgrade head
    uv run python scripts/configure_embeddings.py --dry-run
    uv run python scripts/configure_embeddings.py --yes
    
  2. Replay your source data (messages, documents, ingested content) into the new deployment via your normal application path. Honcho’s existing message-creation API will re-derive embeddings using the new configuration. There is no in-place re-embedding tool — that would be a separate spec covering atomicity, cost-per-token, and dialectic-during-migration semantics.
  3. Cut over at your application layer (DNS, load balancer, feature flag — whatever you use). The old deployment can stay running until you are confident in the new one; this design does not require an atomic switch.
The startup validator on the new deployment will refuse to start if step 1’s configure_embeddings.py did not run, so a misconfiguration cannot quietly write wrong-dim vectors into the new schema.
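Step 2 is application-specific, but a replay loop often has the shape below. This is only a sketch under assumptions: write_batch is a hypothetical callable standing in for whatever code in your application calls the new deployment's message-creation API, and the batch size of 100 is arbitrary rather than a documented limit.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator


def batched(items: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield successive lists of at most `size` items from `items`."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def replay(
    source: Iterable[dict],
    write_batch: Callable[[list[dict]], None],
    batch_size: int = 100,
) -> int:
    """Send every record in `source` to the new deployment in batches.

    `write_batch` is hypothetical: it should wrap your normal write path so
    the new deployment derives embeddings with its own configuration.
    Returns the number of records sent.
    """
    sent = 0
    for batch in batched(source, batch_size):
        write_batch(batch)
        sent += len(batch)
    return sent
```

Because the old deployment keeps running, a loop like this can be rerun or resumed before cutover; tracking which records were already sent is left to the caller.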

Edge case: truncation at the default dimension

If you are using text-embedding-3-large but truncating to 1536 (the default), be aware that EMBEDDING_MODEL_CONFIG__DIMENSIONS_MODE=auto will not forward dimensions= to the API, because auto interprets the default as “operator did not opt into a non-default dim.” The provider will then return its native 3072 dimensions, the response-dim validator will reject the vector, and the request will fail. For this case, either set EMBEDDING_VECTOR_DIMENSIONS=1536 explicitly (so auto knows the operator opted in), or set EMBEDDING_MODEL_CONFIG__DIMENSIONS_MODE=always.
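The failure mode above can be modeled in a few lines. This is one reading of the paragraph, not Honcho's implementation: the function names are hypothetical, fake_provider merely imitates a provider whose native dimension is 3072, and only the auto and always modes named above are modeled.

```python
from typing import Optional

NATIVE_DIM = 3072  # native size of text-embedding-3-large, per the text above


def should_forward_dimensions(mode: str, dim_explicitly_set: bool) -> bool:
    """Decide whether to send `dimensions=` to the provider (assumed logic)."""
    if mode == "always":
        return True
    if mode == "auto":
        # auto forwards only when the operator set the dimension explicitly.
        return dim_explicitly_set
    raise ValueError(f"mode not modeled in this sketch: {mode!r}")


def fake_provider(dimensions: Optional[int]) -> list[float]:
    """Stand-in for the embedding API: native size unless told otherwise."""
    return [0.0] * (dimensions if dimensions is not None else NATIVE_DIM)


def embed(mode: str, configured_dim: int, dim_explicitly_set: bool) -> list[float]:
    """Request one embedding and apply a response-dimension check."""
    forward = should_forward_dimensions(mode, dim_explicitly_set)
    vector = fake_provider(configured_dim if forward else None)
    if len(vector) != configured_dim:
        raise ValueError(
            f"provider returned {len(vector)} dims, expected {configured_dim}"
        )
    return vector
```

In this model, embed("auto", 1536, dim_explicitly_set=False) raises because the provider answers with 3072 values, while either fix from the paragraph above (setting the dimension explicitly, or using always) returns a 1536-value vector.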

Backend swap (turbopuffer ↔ lancedb ↔ pgvector) is a different operation

Switching the storage backend at constant dim/model — for example moving from pgvector to Turbopuffer — is supported via src/reconciler/sync_vectors.py and VECTOR_STORE_MIGRATED. That flow is unchanged by the embedding-pipeline work and is documented separately. It is not the destroy + rebuild path described above.