ModuleX-managed models

modulexai is the ModuleX-managed model provider — the default option for every surface that calls a language model. You do not supply or manage any provider key: ModuleX provisions the upstream keys for you and meters usage in credits against your plan allowance and wallet. Managed models are available out of the box on the Chat model selector, the Assistant, the LLM node, the Agent node, and the AI Composer. The alternative is BYOK (bring your own key): you connect a first-party provider account and that vendor bills you directly, with no ModuleX credit charge. This page covers managed models end to end — the auth model, the curated catalog, exactly what each call costs in credits, and how to switch a selection to BYOK. For how managed-versus-BYOK selection works across all providers, see Connect LLM providers.

🎬 MEDIA PLACEHOLDER · MX-MEDIA-4100 · [SCREENSHOT] [SCREENSHOT_DESCRIPTION]: The ModuleX-managed (modulexai) entry in the model selector. [SCREENSHOT_DETAILS]: Capture the model picker in chat (or Assistant settings) showing the ModuleX-managed models grouped together — for example Claude Sonnet 4.6, GPT 5.5, Claude Opus 4.6 — with no API-key/connect prompt, signalling they are ready to use. Light theme, 16:9, crop to the dropdown.

How managed models work

A managed model is one whose provider wire name is modulexai. When you pick such a model, the request runs through ModuleX-provisioned upstream keys and the call is costed in credits:

You select a model id

Choose a modulexai model id (for example claude-sonnet-4.6 or gpt-5.5) anywhere a model can be picked. No credential is required — modulexai is exempt from credential validation because it uses ModuleX’s own model-level keys.

ModuleX resolves and routes the model

ModuleX maps the model id to its upstream wire model and request settings. Several managed models are routed through an aggregator with a Bedrock-then-Anthropic fallback order; you select by ModuleX model_id and the routing is handled for you. Deprecated ids are resolved to their replacement automatically — see Deprecated models route automatically.

The billing gate admits and meters the call

Every managed turn passes through the usage gate: it reserves credit before any work and charges on success. It denies the call with a DenialEnvelope on 402 when your allowance and wallet are exhausted, on 403 when a plan quota is exceeded, and on 429 when a rate limit is exceeded. See Credit cost.

🎬 MEDIA PLACEHOLDER · MX-MEDIA-4101 · [IMAGE] [IMAGE_DESCRIPTION]: Request path for a managed (modulexai) model call versus a BYOK call. [IMAGE_DETAILS]: Show two lanes from a single “model call” box. Managed lane: ModuleX-provisioned key, routed through the aggregator, passing the credit usage gate (reserve to charge to settle) and metered in credits. BYOK lane: your own credential, routed straight to the upstream vendor, billed by the vendor, not credited. Label which lane consumes credits. 16:9, light and dark variants, ModuleX brand palette, no UI chrome.

No key required: the `modulex_key` auth schema

Managed access uses a single auth schema, modulex_key. Unlike BYOK providers, it has no fields for you to fill in — its setup_environment_variables array is empty. ModuleX supplies the upstream key automatically and tracks usage against your subscription plan’s monthly credit allowance.

auth_type

string

modulex_key — the managed-key scheme. ModuleX provides the API keys automatically; no key configuration is needed.

display_name

string

ModuleX Managed Key.

setup_environment_variables

array

Empty ([]). There is nothing to enter to use managed models.

Because modulexai needs no credential, you do not call POST /credentials for it. Managed models are usable as soon as your organization is on any plan (Free included). You still need to be an owner or admin to manage the integration catalog or change defaults; the member role is retired.
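As a rough illustration of what an empty setup_environment_variables array means for a client, the sketch below checks whether an auth schema asks you to enter anything. The needs_credential helper and the byok_like record (including its api_key auth type and API_KEY field) are hypothetical and exist only to show the contrast; the managed record mirrors the modulex_key fields documented above.

```python
def needs_credential(auth_schema: dict) -> bool:
    """Return True when the schema lists setup fields the user must fill in."""
    return len(auth_schema.get("setup_environment_variables", [])) > 0


# Mirrors the modulex_key schema described on this page.
managed = {
    "auth_type": "modulex_key",
    "display_name": "ModuleX Managed Key",
    "setup_environment_variables": [],
}

# Hypothetical BYOK-style schema, invented for contrast only.
byok_like = {
    "auth_type": "api_key",
    "setup_environment_variables": [{"name": "API_KEY"}],
}

print(needs_credential(managed))    # False: nothing to enter
print(needs_credential(byok_like))  # True: a key must be supplied
```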

Available models

ModuleX serves the managed model list from the catalog at GET /integrations/llm-providers/modulexai. The catalog is read-only metadata used by the model selectors and the builder — it is not the execution path, and it never returns or stores a secret key. The catalog read is cached for up to 10 minutes. The active models below are the ones ModuleX currently advertises for modulexai. There are six deprecated bedrock-* ids in addition; see Deprecated models.

Chat and reasoning models

Model id	Display name	Upstream	Max input	Max output	Vision	Input `$/1M`	Output `$/1M`
`claude-sonnet-4.6`	Claude Sonnet 4.6	Anthropic	200,000	64,000	Yes	`$3.00`	`$15.00`
`claude-opus-4.6`	Claude Opus 4.6	Anthropic	200,000	128,000	Yes	`$5.00`	`$25.00`
`claude-haiku-4.5`	Claude Haiku 4.5	Anthropic	200,000	64,000	Yes	`$0.80`	`$4.00`
`gpt-5.5`	GPT 5.5	OpenAI	1,050,000	128,000	Yes	`$5.00`	`$30.00`
`gpt-5.4`	GPT 5.4	OpenAI	1,050,000	128,000	Yes	`$2.50`	`$15.00`
`gpt-5.4-mini`	GPT 5.4 mini	OpenAI	400,000	128,000	Yes	`$0.75`	`$4.50`
`gpt-5.4-nano`	GPT 5.4 nano	OpenAI	400,000	128,000	Yes	`$0.20`	`$1.25`
`gpt-5-chat-latest`	GPT 5 chat latest	OpenAI	128,000	16,384	Yes	`$1.25`	`$10.00`
`gpt-5.3-codex`	GPT 5.3 codex	OpenAI	400,000	128,000	Yes	`$1.75`	`$14.00`
`o3`	o3	OpenAI	200,000	100,000	Yes	`$2.00`	`$8.00`
`gemma-3-27b`	Gemma 3 27B	Google	128,000	8,192	Yes	`$0.10`	`$0.10`
`gemma-3-12b`	Gemma 3 12B	Google	128,000	8,192	Yes	`$0.05`	`$0.05`
`gemma-3-4b`	Gemma 3 4B	Google	128,000	8,192	No	`$0.02`	`$0.02`

Embedding models

Model id	Display name	Upstream	Max input	Dimensions	Input `$/1M`	Output `$/1M`
`text-embedding-3-large`	Text Embedding 3 Large	OpenAI	8,191	3,072	`$0.13`	`$0.00`
`text-embedding-3-small`	Text Embedding 3 Small	OpenAI	8,191	1,536	`$0.02`	`$0.00`

The per-million-token prices above are the upstream provider’s USD rates recorded in the ModuleX catalog. For managed (modulexai) usage they are inputs to the credit-cost formula, not amounts you pay directly — your actual charge is in credits and includes a 5% system margin. See Credit cost. Embedding models report a zero output price because embedding forces completion_tokens = 0.

The model list is served from the catalog and changes as ModuleX adds, routes, or deprecates models. Read GET /integrations/llm-providers/modulexai for the current set rather than hard-coding ids; catalog reads are cached for up to 10 minutes.

The model record

Each entry in the provider’s models[] array carries the fields below.

id

string

The ModuleX model_id you select (for example claude-sonnet-4.6).

display_name

string

Human-readable name shown in selectors.

provider

string

Display vendor — Anthropic, OpenAI, or Google on active managed models.

provider_id

string

Routing provider slug. Managed chat/reasoning models route through openrouter; the OpenAI embedding models route through openai.

max_input_tokens

integer

Maximum context window in tokens.

max_output_tokens

integer

Maximum tokens the model can generate. 0 for embedding models.

data_freshness

string

Knowledge-cutoff date (for example 2026-01-01). This field, not a knowledge_cutoff key, carries the cutoff.

intelligence

integer

Quality score, 1–5.

speed

integer

Latency score, 1–5 (higher is faster).

supports_vision

boolean

Whether the model accepts image input.

input_usd_per_1m_tokens

number

Upstream input price per 1M tokens, as a flat number (not a nested pricing object).

output_usd_per_1m_tokens

number

Upstream output price per 1M tokens. 0.0 on embedding models.

is_embedding

boolean

true on the two embedding models. Embedding models also report embedding_dimension and a zero output price.

embedding_dimension

integer

Vector dimension for embedding models (3072 for text-embedding-3-large, 1536 for text-embedding-3-small).

status

string

Lifecycle state. Present and set to deprecated on the retired bedrock-* ids; active models omit it (treated as healthy).

replacement_id

string

On a deprecated model, the successor model_id ModuleX routes the call to instead.

Example model record (claude-sonnet-4.6)

{
  "id": "claude-sonnet-4.6",
  "display_name": "Claude Sonnet 4.6",
  "provider": "Anthropic",
  "provider_id": "openrouter",
  "max_input_tokens": 200000,
  "max_output_tokens": 64000,
  "data_freshness": "2026-01-01",
  "intelligence": 4,
  "speed": 3,
  "supports_vision": true,
  "input_usd_per_1m_tokens": 3.0,
  "output_usd_per_1m_tokens": 15.0
}
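If you build your own selector from models[], a filter along these lines may help. It is a minimal sketch that assumes the field semantics described above: is_embedding marks embedding models, and a status of deprecated or maintenance marks ids that should not be offered for new work. The function name and the sample list are illustrative, not part of a ModuleX SDK.

```python
NOT_SERVING = {"deprecated", "maintenance"}


def selectable_chat_models(models: list[dict]) -> list[str]:
    """Return ids of non-embedding models that carry no retired status."""
    ids = []
    for model in models:
        if model.get("is_embedding", False):
            continue  # embedding models are not chat or reasoning choices
        if model.get("status") in NOT_SERVING:
            continue  # active models omit status entirely
        ids.append(model["id"])
    return ids


# Trimmed sample records shaped like entries in models[].
sample = [
    {"id": "claude-sonnet-4.6", "supports_vision": True},
    {"id": "text-embedding-3-small", "is_embedding": True, "embedding_dimension": 1536},
    {
        "id": "bedrock-claude-sonnet-4.6",
        "status": "deprecated",
        "replacement_id": "claude-sonnet-4.6",
    },
]

print(selectable_chat_models(sample))  # ['claude-sonnet-4.6']
```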

Read the model catalog

curl https://api.modulex.dev/integrations/llm-providers/modulexai \
  -H "Authorization: Bearer mx_live_xxxxxxxxxxxxxxxxxxxxxxxx" \
  -H "X-Organization-ID: 6f1c2d3e-4a5b-6c7d-8e9f-0a1b2c3d4e5f"

The response is an integration-detail object whose models array contains the entries above plus an auth_schemas array describing the single modulex_key scheme. A truncated example:

Catalog detail (truncated)

{
  "name": "modulexai",
  "display_name": "ModulexAI",
  "description": "ModuleX's managed LLM provider with curated models and usage tracking",
  "integration_type": "llm_provider",
  "version": "1.0.0",
  "categories": ["AI & LLM Providers", "ai", "chat", "managed"],
  "auth_schemas": [
    {
      "auth_type": "modulex_key",
      "display_name": "ModuleX Managed Key",
      "description": "Use ModuleX's managed API keys with usage tracked against your monthly credit limit",
      "setup_environment_variables": []
    }
  ],
  "models": [
    {
      "id": "claude-sonnet-4.6",
      "display_name": "Claude Sonnet 4.6",
      "max_input_tokens": 200000,
      "max_output_tokens": 64000,
      "supports_vision": true,
      "input_usd_per_1m_tokens": 3.0,
      "output_usd_per_1m_tokens": 15.0
    }
  ]
}

The catalog detail endpoints are owner/admin-gated and org-scoped: every request needs Authorization: Bearer mx_live_… plus X-Organization-ID, and the caller must be an owner or admin. See Authentication.
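The same read can be issued from Python's standard library. This sketch assumes the https://api.modulex.dev host used in the curl example, which this page lists as unconfirmed; catalog_request and fetch_catalog are hypothetical helper names, not part of any ModuleX SDK.

```python
import json
import urllib.request

# Assumption: host taken from the curl example; confirm it for your environment.
BASE_URL = "https://api.modulex.dev"


def catalog_request(
    api_key: str, organization_id: str, provider: str = "modulexai"
) -> urllib.request.Request:
    """Build the GET request for one provider's catalog detail."""
    url = f"{BASE_URL}/integrations/llm-providers/{provider}"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "X-Organization-ID": organization_id,
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_catalog(api_key: str, organization_id: str) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    request = catalog_request(api_key, organization_id)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Because catalog reads are cached for up to 10 minutes, a client that polls more often than that may see the same response repeatedly.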

Credit cost

Managed (modulexai) usage is the only LLM-provider usage that consumes credits. A credit is the managed-usage billing unit: **1000 credits = $1.00** (1 credit = `$0.001`). BYOK calls are never credited. A managed call is metered in two parts.

Per-turn run charge

Each logical run or chat turn is charged a flat RUN_CREDIT of 1 credit, recorded once and deduplicated by an idempotency key so a resumed turn is not double-charged.
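One way to picture the deduplication is a ledger keyed by the idempotency key. The following is a toy in-memory model under that assumption, not ModuleX's implementation; RunLedger and the key format are invented for illustration.

```python
RUN_CREDIT = 1  # flat per-turn charge, in credits


class RunLedger:
    """Records at most one run charge per idempotency key."""

    def __init__(self) -> None:
        self._charged: dict[str, int] = {}

    def charge_run(self, idempotency_key: str) -> int:
        """Return the credits newly charged: RUN_CREDIT once, then 0 on replays."""
        if idempotency_key in self._charged:
            return 0  # resumed turn: already recorded, no second charge
        self._charged[idempotency_key] = RUN_CREDIT
        return RUN_CREDIT

    @property
    def total(self) -> int:
        return sum(self._charged.values())


ledger = RunLedger()
ledger.charge_run("turn-1")   # first attempt charges 1 credit
ledger.charge_run("turn-1")   # resume of the same turn charges 0
print(ledger.total)           # 1
```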

Token metering

The model’s tokens are converted to credits using the upstream per-token rates plus a system margin:

Token-to-credit formula

cost = (prompt_tokens · input_rate / 1e6 + completion_tokens · output_rate / 1e6) · MARGIN · SCALE

where:

Symbol	Value	Meaning
`input_rate` / `output_rate`	the model’s `input_usd_per_1m_tokens` / `output_usd_per_1m_tokens`	upstream USD rates from the model record
`MARGIN`	`1.05`	a 5% system margin applied to managed usage
`SCALE`	`1000`	`1000 credits = $1.00`, so 1 credit = `$0.001`

Embedding calls force completion_tokens = 0, so only input tokens are metered. If a model id is unknown to the pricing table, it meters to 0 rather than a silent fallback rate.
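The formula can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the billing code itself: the PRICING table is a hand-copied subset of the catalog rates on this page, the function name is hypothetical, and the rounding mode is a guess (the page states only that charges are quantized to eight decimal places).

```python
from decimal import Decimal, ROUND_HALF_UP

MARGIN = Decimal("1.05")   # 5% system margin
SCALE = Decimal("1000")    # 1000 credits = $1.00
PER_MILLION = Decimal("1000000")

# (input, output) USD per 1M tokens, copied from the tables on this page.
PRICING = {
    "claude-sonnet-4.6": (Decimal("3.00"), Decimal("15.00")),
    "gpt-5.4-mini": (Decimal("0.75"), Decimal("4.50")),
    "text-embedding-3-small": (Decimal("0.02"), Decimal("0.00")),
}


def token_credits(
    model_id: str,
    prompt_tokens: int,
    completion_tokens: int,
    is_embedding: bool = False,
) -> Decimal:
    """Convert token counts to credits using the token-to-credit formula."""
    rates = PRICING.get(model_id)
    if rates is None:
        return Decimal("0")  # unknown ids meter to 0, with no fallback rate
    if is_embedding:
        completion_tokens = 0  # embeddings meter input tokens only
    input_rate, output_rate = rates
    usd = (
        Decimal(prompt_tokens) * input_rate
        + Decimal(completion_tokens) * output_rate
    ) / PER_MILLION
    # Assumption: half-up rounding; only the eight-decimal precision is documented.
    return (usd * MARGIN * SCALE).quantize(
        Decimal("0.00000001"), rounding=ROUND_HALF_UP
    )


print(token_credits("gpt-5.4-mini", 2000, 500))  # 3.93750000
```

Decimal is used here instead of float so that the eight-decimal quantization is exact; the flat RUN_CREDIT is charged separately and is not part of this function.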

Other managed operations carry their own flat base charges, also billed in credits: a managed knowledge retrieval reserves RETRIEVAL_BASE = 1 credit, a managed document ingest reserves FILE_INGEST_BASE = 1 credit, and an integration tool call has a TOOL_BASE of 10 credits ($0.01). These apply to the corresponding managed surfaces, not to a plain LLM call. See Credits & metering.

Worked example

A single claude-sonnet-4.6 turn that reads 10,000 prompt tokens and writes 1,000 completion tokens:

Token cost = (10000 · 3.0 / 1e6 + 1000 · 15.0 / 1e6) · 1.05 · 1000 = (0.03 + 0.015) · 1.05 · 1000 = 47.25 credits (≈ $0.047).
Plus the flat per-turn RUN_CREDIT of 1 credit.

So the turn costs 48.25 credits in total (≈ $0.048). The exact charge is quantized to eight decimal places when recorded.

When a managed call is denied

The billing admission gate denies a managed call when your plan allowance and wallet are exhausted, when a plan quota is exceeded, or when a rate class is exceeded. The denial is a flat DenialEnvelope with the fields {code, layer, key, current, limit, reason}, returned on:

Status	`layer`	When
`402`	`credit` / `wallet`	Plan credits exhausted, overage disabled, or wallet balance insufficient.
`403`	`quota`	A plan quota is exceeded.
`429`	`rate`	A per-plan rate class (for example `sync_exec`) is exceeded; includes `Retry-After` and `X-RateLimit-*` headers.

These envelopes appear only on managed-usage surfaces — the run, Composer, Assistant, and managed-knowledge calls that actually meter credits. Plain credential and catalog CRUD return the FastAPI {detail} HTTPException shape instead. See Errors & status codes and Usage gating & limits.
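A client usually needs to tell the two error shapes apart before deciding what to do. The sketch below is one possible approach, assuming the response body has already been parsed into a dict and the headers into a dict with the exact Retry-After spelling; parse_denial, the Denial dataclass, and the sample code and reason values are hypothetical. It reads Retry-After only in its delta-seconds form.

```python
from dataclasses import dataclass
from typing import Optional

DENIAL_KEYS = {"code", "layer", "key", "current", "limit", "reason"}


@dataclass
class Denial:
    status: int
    layer: str
    reason: str
    retry_after_seconds: Optional[float] = None


def parse_denial(status: int, body: dict, headers: dict) -> Optional[Denial]:
    """Return a Denial for a billing-gate response, else None."""
    if status not in (402, 403, 429):
        return None
    if not DENIAL_KEYS.issubset(body):
        return None  # {detail} shape from catalog or credential endpoints
    retry_after = None
    if status == 429:
        raw = headers.get("Retry-After")
        if raw is not None and raw.strip().isdigit():
            retry_after = float(raw.strip())
    return Denial(status, body["layer"], body["reason"], retry_after)


# Sample values are invented; only the field names come from this page.
rate_body = {
    "code": "rate_limited",
    "layer": "rate",
    "key": "sync_exec",
    "current": 61,
    "limit": 60,
    "reason": "rate class exceeded",
}
denial = parse_denial(429, rate_body, {"Retry-After": "12"})
print(denial.layer, denial.retry_after_seconds)  # rate 12.0
```

A 403 that carries only {detail} is the owner/admin permission error from the catalog endpoints, so it falls through as None here instead of being treated as a quota denial.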

Where you select a managed model

A managed model id is selectable anywhere a model is chosen — no connection step:

Chat

Pick a managed model in the chat composer.

Assistant

Set a managed model for the Assistant to reason and act with.

LLM node

Set model_id on the node to a managed id.

Agent node

Run an autonomous step on a managed model.

Deprecated models route automatically

When you select a model whose status is deprecated (or maintenance), ModuleX resolves it to its replacement_id before the call; for example, bedrock-claude-sonnet-4.6 routes to claude-sonnet-4.6. Routing is cycle- and depth-guarded, so resolution always terminates, normally at a live model. For a managed integration, if a requested id is unknown or its chain cannot resolve, ModuleX falls back to the integration’s first serving (healthy, non-deprecated) model rather than failing. You do not need to update saved selections immediately, but prefer a live id for clarity.

Deprecated models

These six ids exist only for backward compatibility and route to their replacements. Do not select them for new work.

Deprecated id	Routes to
`bedrock-claude-sonnet-4.6`	`claude-sonnet-4.6`
`bedrock-claude-opus-4.6`	`claude-opus-4.6`
`bedrock-claude-haiku-4.5`	`claude-haiku-4.5`
`bedrock-gemma-3-27b`	`gemma-3-27b`
`bedrock-gemma-3-12b`	`gemma-3-12b`
`bedrock-gemma-3-4b`	`gemma-3-4b`
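The replacement routing can be pictured as a guarded walk over replacement_id links. This is a sketch of the idea only, assuming the record fields documented on this page; resolve_model, the MAX_DEPTH value, and the loop-a and loop-b ids are hypothetical and do not reflect ModuleX's actual limits or data.

```python
MAX_DEPTH = 8  # hypothetical depth guard
NOT_SERVING = {"deprecated", "maintenance"}


def resolve_model(model_id: str, models: list[dict]) -> str:
    """Follow replacement_id links to a serving model, with cycle and depth guards."""
    by_id = {m["id"]: m for m in models}
    seen: set[str] = set()
    current = model_id
    while current in by_id and current not in seen and len(seen) < MAX_DEPTH:
        record = by_id[current]
        if record.get("status") not in NOT_SERVING:
            return current  # reached a live model
        seen.add(current)
        current = record.get("replacement_id")
    # Unknown id or unresolvable chain: fall back to the first serving model.
    for m in models:
        if m.get("status") not in NOT_SERVING:
            return m["id"]
    raise LookupError("catalog has no serving model")


catalog = [
    {"id": "claude-sonnet-4.6"},
    {
        "id": "bedrock-claude-sonnet-4.6",
        "status": "deprecated",
        "replacement_id": "claude-sonnet-4.6",
    },
    # Invented ids that form a cycle, to exercise the guard.
    {"id": "loop-a", "status": "deprecated", "replacement_id": "loop-b"},
    {"id": "loop-b", "status": "deprecated", "replacement_id": "loop-a"},
]

print(resolve_model("bedrock-claude-sonnet-4.6", catalog))  # claude-sonnet-4.6
print(resolve_model("loop-a", catalog))                     # claude-sonnet-4.6
```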

Switching to BYOK

Managed and BYOK are a per-selection choice — you switch by changing which model a surface uses, not by a global toggle. To move a chat, Assistant, or node off managed credits and onto a vendor-direct bill, select a model from a connected BYOK provider instead of a modulexai model.

Connect a BYOK provider

Create a credential for a first-party provider (openai, anthropic, gemini, or xai) with POST /credentials, supplying your own provider key. You must be an owner or admin. See Connect LLM providers and the per-provider pages, for example Anthropic and OpenAI.

Select a BYOK model where you want vendor-direct billing

In the chat model selector, Assistant settings, or a node, choose a model from the connected provider in place of the modulexai model.

Confirm the billing path

BYOK token usage is billed by the upstream vendor with no ModuleX markup and is recorded for analytics only — it does not consume credits. Non-model ModuleX usage (such as the flat per-run charge, or managed knowledge/tool calls) can still apply.

Switching a model moves cost between billing systems. A modulexai selection consumes ModuleX credits and is subject to the billing gate; a BYOK selection is billed by your provider and is not credited. Switching back to a modulexai model returns that cost to credits. Confirm which model is selected before relying on either path.

Connect LLM providers

All providers, and how managed vs BYOK selection works.

Credits & metering

What a credit is and exactly what consumes credits.

Errors

Catalog reads for modulexai return ModuleX’s standard error shapes. Managed model calls can additionally return the billing DenialEnvelope described in When a managed call is denied.

Status	When it happens	Shape
`400`	Asking a typed endpoint for the wrong integration type (for example `modulexai` is not a tool).	`{detail: string}`
`401`	Missing or invalid `Authorization`, or missing `X-Organization-ID`.	`{detail: string}`
`403`	Caller is not an owner or admin in the organization.	`{detail: string}`
`404`	Unknown provider, for example `GET /integrations/llm-providers/<unknown>` returns `LLM provider not found`.	`{detail: string}`
`422`	Query/parameter validation error.	`{detail: [ ... ]}`
`402` / `403` / `429`	A managed model call hits the billing gate (credit, quota, or rate).	`DenialEnvelope` — `{code, layer, key, current, limit, reason}`
`500`	Unhandled server error.	`{detail: string}`

See Errors & status codes for all three envelope shapes and which surface emits each.

Related

Connect LLM providers

Managed vs BYOK, and the full provider list.

Credits & metering

The credit unit and what consumes credits.

Usage gating & limits

The billing admission gate and its 402/403/429 responses.

Anthropic (BYOK)

The bring-your-own-key alternative for Claude models.

Open questions (TBD)

Upstream routing detail. Several managed models route through an aggregator (openrouter) with a Bedrock-then-Anthropic fallback order, and the OpenAI embedding models route through openai. The exact upstream selected for a given call is handled internally and is not exposed in any catalog API response; treat the routing as an implementation detail subject to change.
Public catalog base URL. The catalog and SDK examples use https://api.modulex.dev per the repo-wide convention; the exact public host for these endpoints is not pinned from source and should be confirmed against the deployed environment.
Per-model max-output overrides. ModuleX applies scenario-specific maximum-output-token limits to managed calls (for example a smaller cap for simple chat than for an agent step). These limits are backend-only and are never returned in a catalog response, so the catalog’s max_output_tokens is the model ceiling, not necessarily the per-scenario limit applied at run time.