
feat(byok): add direct BYOK support for Ollama Cloud #2819

Open
kilo-code-bot[bot] wants to merge 15 commits into main from byok-ollama-cloud

kilo-code-bot Bot commented Apr 25, 2026

Summary

  • Adds Ollama Cloud as a new direct BYOK provider, reusing the existing OpenAI-compatible direct BYOK plumbing.
  • Models are synced dynamically from models.dev via the shared syncDirectByokModels flow; the catalog is fetched once per sync run and shared with the existing zai-coding fetcher.
  • Wires the new provider id (ollama-cloud) into DirectUserByokInferenceProviderIdSchema, the BYOK test-model map, and the direct-byok definitions registry. The BYOK UI picks it up automatically via DIRECT_BYOK_PROVIDERS.
  • Exposes none/low/medium/high reasoning variants via the new REASONING_VARIANTS_NONE_LOW_MEDIUM_HIGH preset; transformRequest mirrors chutes-byok and promotes reasoning.effort onto the OpenAI-compatible reasoning_effort field.
  • Renames the display-name helper to modelIdToDisplayName and strips any :tag suffix (e.g. kimi-k2.6:cloud -> kimi-k2.6) in addition to vendor prefixes.
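
For reviewers, the display-name behaviour in the last bullet can be sketched roughly as below. This is a minimal illustration, not the helper from the PR: the body, the use of the last `/` as the vendor boundary, and the `example-vendor/example-model:latest` id are all assumptions made for the example.

```typescript
// Hypothetical sketch of modelIdToDisplayName; the real helper may differ.
function modelIdToDisplayName(modelId: string): string {
  // Assumption: a vendor prefix is everything up to and including the last "/".
  const withoutVendor = modelId.slice(modelId.lastIndexOf('/') + 1);
  // Drop a trailing ":tag" such as ":cloud" or ":latest", if present.
  const colon = withoutVendor.lastIndexOf(':');
  return colon === -1 ? withoutVendor : withoutVendor.slice(0, colon);
}

console.log(modelIdToDisplayName('kimi-k2.6:cloud')); // kimi-k2.6
console.log(modelIdToDisplayName('example-vendor/example-model:latest')); // example-model
```

Only the display name is affected in this sketch; the full id, tag included, is still what would be sent upstream.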

The transport uses https://ollama.com/v1 (OpenAI-compat docs).

Verification

  • Set the ollama-cloud BYOK key in the BYOK Keys Manager and verified it appears in the provider list, populated from the models.dev catalog.
  • Test BYOK key flow uses UserByokTestModels['ollama-cloud'] = 'kimi-k2.6:cloud'.

Visual Changes

N/A: the provider is surfaced in the existing BYOK UI via the auto-populated `DIRECT_BYOK_PROVIDERS_LIST`.

Reviewer Notes

  • Model ids include colons (e.g. gpt-oss:120b, kimi-k2.6:cloud); the direct-byok model lookup is a plain string compare against formatDirectByokModelId, so colons pass through without needing URL encoding.
  • models.dev catalog is memoized inside a single syncDirectByokModels run via a lazy getModelsDevCatalog() on a shared SyncContext, so adding more models.dev-backed providers does not multiply HTTP calls.
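
To illustrate the first note (colons in model ids), here is a minimal sketch of a plain string-compare lookup. The `<providerId>/<modelId>` format, the signature of `formatDirectByokModelId`, and the `findDirectByokModel` helper are assumptions made for this example; the real functions may look different.

```typescript
// Assumed id format "<providerId>/<modelId>"; the real formatDirectByokModelId may differ.
function formatDirectByokModelId(providerId: string, modelId: string): string {
  return `${providerId}/${modelId}`;
}

// Hypothetical lookup: compares the formatted id to the requested id verbatim.
function findDirectByokModel(
  providerId: string,
  modelIds: string[],
  requested: string,
): string | undefined {
  for (const id of modelIds) {
    if (formatDirectByokModelId(providerId, id) === requested) return id;
  }
  return undefined;
}

const ids = ['gpt-oss:120b', 'kimi-k2.6:cloud'];
// The colon is just another character in the comparison.
console.log(findDirectByokModel('ollama-cloud', ids, 'ollama-cloud/kimi-k2.6:cloud')); // kimi-k2.6:cloud
```

Because nothing in this path parses or encodes the id, a percent-encoded variant such as `kimi-k2.6%3Acloud` would simply not match.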

Adds Ollama Cloud as a new direct BYOK provider using its OpenAI-compatible
endpoint at https://ollama.com/v1. Model list and metadata are curated from
https://models.dev/api.json with descriptions sourced from the Kilo gateway
model catalog.

kilo-code-bot Bot commented Apr 25, 2026

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Files Reviewed (6 files)
  • apps/web/src/lib/ai-gateway/providers/direct-byok/direct-byok-definitions.ts
  • apps/web/src/lib/ai-gateway/providers/direct-byok/direct-byok-meta.ts
  • apps/web/src/lib/ai-gateway/providers/direct-byok/ollama-cloud.ts
  • apps/web/src/lib/ai-gateway/providers/direct-byok/sync-direct-byok.ts
  • apps/web/src/lib/ai-gateway/providers/model-settings.ts
  • apps/web/src/lib/ai-gateway/providers/openrouter/inference-provider-id.ts

Reviewed by gpt-5.5-20260423 · 838,815 tokens

kilo-code-bot Bot added 14 commits April 29, 2026 14:52
Reworks the Ollama Cloud BYOK provider to pull its model catalog
dynamically via sync-direct-byok (mirroring zai-coding) instead of
hardcoding 37 models. Keeps a single recommended model (gpt-oss:120b)
inline and adds the display name to direct-byok-meta.
The dynamic sync leaves Redis empty until the first cron run, so only
the hardcoded recommended model is guaranteed to be present in the
model list. Use it as the test model to avoid a broken test-key flow
pre-sync.
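
The reasoning in this commit message can be sketched as follows. The `ByokModel` shape, `RECOMMENDED_MODELS`, and `listModels` are hypothetical names for illustration; only the `gpt-oss:120b` id and the "Redis is empty before the first cron run" behaviour come from the commit message.

```typescript
interface ByokModel {
  id: string;
}

// Hypothetical inline entry; only the id comes from the commit message.
const RECOMMENDED_MODELS: ByokModel[] = [{ id: 'gpt-oss:120b' }];

// `synced` stands in for whatever the cron-driven sync wrote to Redis;
// null models the pre-sync state where nothing has been written yet.
function listModels(synced: ByokModel[] | null): ByokModel[] {
  const inlineIds = RECOMMENDED_MODELS.map((model) => model.id);
  const extras = (synced ?? []).filter((model) => inlineIds.indexOf(model.id) === -1);
  // Inline entries come first, so they are present even with an empty cache.
  return RECOMMENDED_MODELS.concat(extras);
}

console.log(listModels(null).map((model) => model.id)); // [ 'gpt-oss:120b' ]
```

Under these assumptions a test model that only exists in the synced catalog would be missing until the first sync, which is why the inline model is the safer choice for the test-key flow.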
…ma-cloud types

- Cache the models.dev catalog within a single syncDirectByokModels run so
  providers sourced from it share one HTTP fetch.
- Rename stripVendorPrefix to modelIdToDisplayName and strip trailing :cloud
  tags (e.g. kimi-k2.6:cloud -> kimi-k2.6) in addition to vendor prefixes.
- Drop variants: null from the ollama-cloud recommended model; the schema on
  main is now .optional(), so the explicit null was rejected by tsgo.
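
The per-run catalog cache from the first bullet can be sketched like this. It is an illustrative shape only: `createSyncContext`, the `ModelsDevCatalog` type, and the injected `fetchCatalog` function are assumptions, and the stub below replaces the real HTTP call to https://models.dev/api.json.

```typescript
type ModelsDevCatalog = Record<string, unknown>;

interface SyncContext {
  getModelsDevCatalog(): Promise<ModelsDevCatalog>;
}

// Hypothetical factory: one context per sync run, so the cache never outlives the run.
function createSyncContext(fetchCatalog: () => Promise<ModelsDevCatalog>): SyncContext {
  let cached: Promise<ModelsDevCatalog> | undefined;
  return {
    getModelsDevCatalog() {
      // Lazy: nothing is fetched unless some provider asks for the catalog.
      if (cached === undefined) cached = fetchCatalog();
      return cached;
    },
  };
}

// Counting stub instead of a real HTTP request.
let fetchCount = 0;
const context = createSyncContext(() => {
  fetchCount += 1;
  return Promise.resolve({});
});
const first = context.getModelsDevCatalog();
const second = context.getModelsDevCatalog();
console.log(first === second, fetchCount); // true 1
```

Caching the promise rather than the resolved value means concurrent fetchers share one in-flight request; it also means a failed fetch stays failed for the rest of that run in this sketch.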
Broaden modelIdToDisplayName so tags other than ':cloud' (e.g. ':latest') are
stripped from the user-visible name as well.
Mirror the chutes-byok transformRequest so reasoning variants (none/low/
medium/high) flow through to Ollama Cloud's OpenAI-compatible endpoint as
the reasoning_effort field.
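
A rough sketch of the reasoning promotion described above, for illustration only. The request types are simplified assumptions, and dropping the nested `reasoning` object after promotion is a choice made for this example rather than something the PR states.

```typescript
type ReasoningEffort = 'none' | 'low' | 'medium' | 'high';

interface ChatMessage {
  role: string;
  content: string;
}

// Simplified, assumed request shapes; the real gateway types carry more fields.
interface GatewayRequest {
  model: string;
  messages: ChatMessage[];
  reasoning?: { effort?: ReasoningEffort };
}

interface OpenAICompatRequest {
  model: string;
  messages: ChatMessage[];
  reasoning_effort?: ReasoningEffort;
}

// Hypothetical transformRequest: moves reasoning.effort to reasoning_effort.
function transformRequest(request: GatewayRequest): OpenAICompatRequest {
  const { reasoning, ...rest } = request;
  const effort = reasoning?.effort;
  // Leave the body untouched when no effort was requested.
  return effort === undefined ? rest : { ...rest, reasoning_effort: effort };
}

const body = transformRequest({
  model: 'gpt-oss:120b',
  messages: [{ role: 'user', content: 'hello' }],
  reasoning: { effort: 'high' },
});
console.log(body.reasoning_effort); // high
```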