modulexai is the ModuleX-managed model provider — the default option for every surface that calls a language model. You do not supply or manage any provider key: ModuleX provisions the upstream keys for you and meters usage in credits against your plan allowance and wallet. Managed models are available out of the box on the Chat model selector, the Assistant, the LLM node, the Agent node, and the AI Composer. The alternative is BYOK (bring your own key): you connect a first-party provider account and that vendor bills you directly, with no ModuleX credit charge. This page covers managed models end to end — the auth model, the curated catalog, exactly what each call costs in credits, and how to switch a selection to BYOK. For how managed-versus-BYOK selection works across all providers, see Connect LLM providers.
🎬 MEDIA PLACEHOLDER · MX-MEDIA-4100 · [SCREENSHOT] [SCREENSHOT_DESCRIPTION]: The ModuleX-managed (modulexai) entry in the model selector. [SCREENSHOT_DETAILS]: Capture the model picker in chat (or Assistant settings) showing the ModuleX-managed models grouped together — for example Claude Sonnet 4.6, GPT 5.5, Claude Opus 4.6 — with no API-key/connect prompt, signalling they are ready to use. Light theme, 16:9, crop to the dropdown.

How managed models work

A managed model is one whose provider wire name is modulexai. When you pick such a model, the request runs through ModuleX-provisioned upstream keys and the call is costed in credits:
1

You select a model id

Choose a modulexai model id (for example claude-sonnet-4.6 or gpt-5.5) anywhere a model can be picked. No credential is required — modulexai is exempt from credential validation because it uses ModuleX’s own model-level keys.
2

ModuleX resolves and routes the model

ModuleX maps the model id to its upstream wire model and request settings. Several managed models are routed through an aggregator with a Bedrock-then-Anthropic fallback order; you select by ModuleX model_id and the routing is handled for you. Deprecated ids are resolved to their replacement automatically — see Deprecated models route automatically.
3

The billing gate admits and meters the call

Every managed turn passes through the usage gate: it reserves credit before any work and charges on success. If the call is blocked, the gate returns a DenialEnvelope with status 402 (plan credits or wallet exhausted), 403 (a plan quota exceeded), or 429 (a rate limit exceeded). See Credit cost.
🎬 MEDIA PLACEHOLDER · MX-MEDIA-4101 · [IMAGE] [IMAGE_DESCRIPTION]: Request path for a managed (modulexai) model call versus a BYOK call. [IMAGE_DETAILS]: Show two lanes from a single “model call” box. Managed lane: ModuleX-provisioned key, routed through the aggregator, passing the credit usage gate (reserve to charge to settle) and metered in credits. BYOK lane: your own credential, routed straight to the upstream vendor, billed by the vendor, not credited. Label which lane consumes credits. 16:9, light and dark variants, ModuleX brand palette, no UI chrome.

No key required: the modulex_key auth schema

Managed access uses a single auth schema, modulex_key. Unlike BYOK providers, it has no fields for you to fill in — its setup_environment_variables array is empty. ModuleX supplies the upstream key automatically and tracks usage against your subscription plan’s monthly credit allowance.
auth_type
string
modulex_key — the managed-key scheme. ModuleX provides the API keys automatically; no key configuration is needed.
display_name
string
ModuleX Managed Key.
setup_environment_variables
array
Empty ([]). There is nothing to enter to use managed models.
Because modulexai needs no credential, you do not call POST /credentials for it. Managed models are usable as soon as your organization is on any plan (Free included). You still need to be an owner or admin to manage the integration catalog or change defaults; the member role is retired.

Available models

ModuleX serves the managed model list from the catalog at GET /integrations/llm-providers/modulexai. The catalog is read-only metadata used by the model selectors and the builder — it is not the execution path, and it never returns or stores a secret key. The catalog read is cached for up to 10 minutes. The active models below are the ones ModuleX currently advertises for modulexai. There are six deprecated bedrock-* ids in addition; see Deprecated models.

Chat and reasoning models

| Model id | Display name | Upstream | Max input | Max output | Vision | Input $/1M | Output $/1M |
| --- | --- | --- | --- | --- | --- | --- | --- |
| claude-sonnet-4.6 | Claude Sonnet 4.6 | Anthropic | 200,000 | 64,000 | Yes | $3.00 | $15.00 |
| claude-opus-4.6 | Claude Opus 4.6 | Anthropic | 200,000 | 128,000 | Yes | $5.00 | $25.00 |
| claude-haiku-4.5 | Claude Haiku 4.5 | Anthropic | 200,000 | 64,000 | Yes | $0.80 | $4.00 |
| gpt-5.5 | GPT 5.5 | OpenAI | 1,050,000 | 128,000 | Yes | $5.00 | $30.00 |
| gpt-5.4 | GPT 5.4 | OpenAI | 1,050,000 | 128,000 | Yes | $2.50 | $15.00 |
| gpt-5.4-mini | GPT 5.4 mini | OpenAI | 400,000 | 128,000 | Yes | $0.75 | $4.50 |
| gpt-5.4-nano | GPT 5 nano | OpenAI | 400,000 | 128,000 | Yes | $0.20 | $1.25 |
| gpt-5-chat-latest | GPT 5 chat latest | OpenAI | 128,000 | 16,384 | Yes | $1.25 | $10.00 |
| gpt-5.3-codex | GPT 5.3 codex | OpenAI | 400,000 | 128,000 | Yes | $1.75 | $14.00 |
| o3 | o3 | OpenAI | 200,000 | 100,000 | Yes | $2.00 | $8.00 |
| gemma-3-27b | Gemma 3 27B | Google | 128,000 | 8,192 | Yes | $0.10 | $0.10 |
| gemma-3-12b | Gemma 3 12B | Google | 128,000 | 8,192 | Yes | $0.05 | $0.05 |
| gemma-3-4b | Gemma 3 4B | Google | 128,000 | 8,192 | No | $0.02 | $0.02 |

Embedding models

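| Model id | Display name | Upstream | Max input | Dimensions | Input $/1M | Output $/1M |
| --- | --- | --- | --- | --- | --- | --- |
| text-embedding-3-large | Text Embedding 3 Large | OpenAI | 8,191 | 3,072 | $0.13 | $0.00 |
| text-embedding-3-small | Text Embedding 3 Small | OpenAI | 8,191 | 1,536 | $0.02 | $0.00 |
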
The per-million-token prices above are the upstream provider’s USD rates recorded in the ModuleX catalog. For managed (modulexai) usage they are inputs to the credit-cost formula, not amounts you pay directly — your actual charge is in credits and includes a 5% system margin. See Credit cost. Embedding models report a zero output price because embedding forces completion_tokens = 0.
The model list is served from the catalog and changes as ModuleX adds, routes, or deprecates models. Read GET /integrations/llm-providers/modulexai for the current set rather than hard-coding ids; catalog reads are cached for up to 10 minutes.

The model record

Each entry in the provider’s models[] array carries the fields below.
id
string
The ModuleX model_id you select (for example claude-sonnet-4.6).
display_name
string
Human-readable name shown in selectors.
provider
string
Display vendor — Anthropic, OpenAI, or Google on active managed models.
provider_id
string
Routing provider slug. Managed chat/reasoning models route through openrouter; the OpenAI embedding models route through openai.
max_input_tokens
integer
Maximum context window in tokens.
max_output_tokens
integer
Maximum tokens the model can generate. 0 for embedding models.
data_freshness
string
Knowledge-cutoff date (for example 2026-01-01). This field, not a knowledge_cutoff key, carries the cutoff.
intelligence
integer
Quality score, 1–5.
speed
integer
Latency score, 1–5 (higher is faster).
supports_vision
boolean
Whether the model accepts image input.
input_usd_per_1m_tokens
number
Upstream input price per 1M tokens, as a flat number (not a nested pricing object).
output_usd_per_1m_tokens
number
Upstream output price per 1M tokens. 0.0 on embedding models.
is_embedding
boolean
true on the two embedding models. Embedding models also report embedding_dimension and a zero output price.
embedding_dimension
integer
Vector dimension for embedding models (3072 for text-embedding-3-large, 1536 for text-embedding-3-small).
status
string
Lifecycle state. Present and set to deprecated on the retired bedrock-* ids; active models omit it (treated as healthy).
replacement_id
string
On a deprecated model, the successor model_id ModuleX routes the call to instead.
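
The fields above are enough to build a model picker without hard-coding ids. The following is a minimal sketch, not ModuleX's own selector logic: the sample records are trimmed copies of entries described on this page, the function name `selectable_chat_models` is hypothetical, and it assumes chat models may omit `is_embedding` entirely (the truncated catalog example does).

```python
from typing import Any

# Sample records shaped like entries in models[]; trimmed to the fields the
# filter reads. A real list would come from the catalog response.
SAMPLE_MODELS: list[dict[str, Any]] = [
    {"id": "claude-sonnet-4.6", "supports_vision": True},
    {"id": "gemma-3-4b", "supports_vision": False},
    {"id": "text-embedding-3-small", "is_embedding": True,
     "embedding_dimension": 1536},
    {"id": "bedrock-claude-sonnet-4.6", "status": "deprecated",
     "replacement_id": "claude-sonnet-4.6"},
]

# Statuses this page describes as rerouted; active models omit `status`.
NON_SERVING = {"deprecated", "maintenance"}


def selectable_chat_models(models: list[dict[str, Any]],
                           need_vision: bool = False) -> list[str]:
    """Return ids of live, non-embedding models, optionally vision-only."""
    ids: list[str] = []
    for record in models:
        if record.get("status") in NON_SERVING:
            continue  # retired id: prefer its replacement_id instead
        if record.get("is_embedding", False):
            continue  # embedding models are not chat models
        if need_vision and not record.get("supports_vision", False):
            continue
        ids.append(record["id"])
    return ids


if __name__ == "__main__":
    print(selectable_chat_models(SAMPLE_MODELS))
    print(selectable_chat_models(SAMPLE_MODELS, need_vision=True))
```

Filtering on `status` rather than on the `bedrock-` prefix keeps the picker correct if other ids are deprecated later.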

Read the model catalog

curl https://api.modulex.dev/integrations/llm-providers/modulexai \
  -H "Authorization: Bearer mx_live_xxxxxxxxxxxxxxxxxxxxxxxx" \
  -H "X-Organization-ID: 6f1c2d3e-4a5b-6c7d-8e9f-0a1b2c3d4e5f"
The response is an integration-detail object whose models array contains the entries above plus an auth_schemas array describing the single modulex_key scheme. A truncated example:
Catalog detail (truncated)
{
  "name": "modulexai",
  "display_name": "ModulexAI",
  "description": "ModuleX's managed LLM provider with curated models and usage tracking",
  "integration_type": "llm_provider",
  "version": "1.0.0",
  "categories": ["AI & LLM Providers", "ai", "chat", "managed"],
  "auth_schemas": [
    {
      "auth_type": "modulex_key",
      "display_name": "ModuleX Managed Key",
      "description": "Use ModuleX's managed API keys with usage tracked against your monthly credit limit",
      "setup_environment_variables": []
    }
  ],
  "models": [
    {
      "id": "claude-sonnet-4.6",
      "display_name": "Claude Sonnet 4.6",
      "max_input_tokens": 200000,
      "max_output_tokens": 64000,
      "supports_vision": true,
      "input_usd_per_1m_tokens": 3.0,
      "output_usd_per_1m_tokens": 15.0
    }
  ]
}
The catalog detail endpoints are owner/admin-gated and org-scoped: every request needs Authorization: Bearer mx_live_… plus X-Organization-ID, and the caller must be an owner or admin. See Authentication.
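
For callers that prefer Python to curl, the same request can be assembled with the standard library. This is a sketch under assumptions: `build_catalog_request` is a hypothetical helper, the key and organization id are placeholders, and the base URL follows the repo-wide convention rather than a confirmed public host. The request is only built here, not sent.

```python
import urllib.request


def build_catalog_request(
    api_key: str,
    org_id: str,
    base_url: str = "https://api.modulex.dev",  # assumed host; confirm for your deployment
) -> urllib.request.Request:
    """Build (but do not send) the authenticated catalog read for modulexai."""
    url = f"{base_url.rstrip('/')}/integrations/llm-providers/modulexai"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "X-Organization-ID": org_id,
    }
    return urllib.request.Request(url, headers=headers, method="GET")


if __name__ == "__main__":
    req = build_catalog_request(
        "mx_live_xxxxxxxxxxxxxxxxxxxxxxxx",
        "6f1c2d3e-4a5b-6c7d-8e9f-0a1b2c3d4e5f",
    )
    print(req.get_method(), req.full_url)
```

To execute it, pass the request to `urllib.request.urlopen` and decode the body with `json.load`; a caller that is not an owner or admin should expect a 403 with the `{detail}` shape.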

Credit cost

Managed (modulexai) usage is the only LLM-provider usage that consumes credits. A credit is the managed-usage billing unit: 1000 credits = $1.00 (1 credit = $0.001). BYOK calls are never credited. A managed call is metered in two parts.

Per-turn run charge

Each logical run or chat turn is charged a flat RUN_CREDIT of 1 credit, recorded once and deduplicated by an idempotency key so a resumed turn is not double-charged.

Token metering

The model’s tokens are converted to credits using the upstream per-token rates plus a system margin:
Token-to-credit formula
cost = (prompt_tokens · input_rate / 1e6 + completion_tokens · output_rate / 1e6) · MARGIN · SCALE
where:
| Symbol | Value | Meaning |
| --- | --- | --- |
| input_rate / output_rate | the model’s input_usd_per_1m_tokens / output_usd_per_1m_tokens | upstream USD rates from the model record |
| MARGIN | 1.05 | a 5% system margin applied to managed usage |
| SCALE | 1000 | 1000 credits = $1.00, so 1 credit = $0.001 |

Embedding calls force completion_tokens = 0, so only input tokens are metered. If a model id is unknown to the pricing table, it meters to 0 rather than a silent fallback rate.
Other managed operations carry their own flat base charges, also billed in credits: a managed knowledge retrieval reserves RETRIEVAL_BASE = 1 credit, a managed document ingest reserves FILE_INGEST_BASE = 1 credit, and an integration tool call has a TOOL_BASE of 10 credits ($0.01). These apply to the corresponding managed surfaces, not to a plain LLM call. See Credits & metering.

Worked example

A single claude-sonnet-4.6 turn that reads 10,000 prompt tokens and writes 1,000 completion tokens:
  • Token cost = (10000 · 3.0 / 1e6 + 1000 · 15.0 / 1e6) · 1.05 · 1000 = (0.03 + 0.015) · 1.05 · 1000 = 47.25 credits (≈ $0.047).
  • Plus the flat per-turn RUN_CREDIT of 1 credit.
So the turn costs 47.25 + 1 = 48.25 credits (about $0.048). The exact charge is quantized to eight decimal places when recorded.
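
The formula and worked example above can be reproduced in a few lines. This is an illustrative sketch, not the billing service's code: the constant names mirror the symbols on this page, the `RATES` table holds only two entries copied from the catalog tables, `token_credits` and `turn_credits` are hypothetical names, and the rounding mode used for the eight-decimal quantization is an assumption (the page does not state one).

```python
from decimal import Decimal, ROUND_HALF_UP

MARGIN = Decimal("1.05")        # 5% system margin
SCALE = Decimal("1000")         # 1000 credits = $1.00
RUN_CREDIT = Decimal("1")       # flat per-turn charge
QUANTUM = Decimal("0.00000001")  # eight decimal places

# (input, output) USD per 1M tokens, copied from the catalog tables above.
RATES = {
    "claude-sonnet-4.6": (Decimal("3.00"), Decimal("15.00")),
    "text-embedding-3-small": (Decimal("0.02"), Decimal("0.00")),
}
EMBEDDING_IDS = {"text-embedding-3-small"}


def token_credits(model_id: str, prompt_tokens: int,
                  completion_tokens: int) -> Decimal:
    """Credits for the token portion of one managed call."""
    rates = RATES.get(model_id)
    if rates is None:
        return Decimal("0")  # unknown id meters to 0, no fallback rate
    if model_id in EMBEDDING_IDS:
        completion_tokens = 0  # embeddings meter input tokens only
    input_rate, output_rate = rates
    usd = (Decimal(prompt_tokens) * input_rate
           + Decimal(completion_tokens) * output_rate) / Decimal(1_000_000)
    # Rounding mode is an assumption; only the precision is documented.
    return (usd * MARGIN * SCALE).quantize(QUANTUM, rounding=ROUND_HALF_UP)


def turn_credits(model_id: str, prompt_tokens: int,
                 completion_tokens: int) -> Decimal:
    """Token credits plus the flat RUN_CREDIT for one turn."""
    return token_credits(model_id, prompt_tokens, completion_tokens) + RUN_CREDIT


if __name__ == "__main__":
    print(token_credits("claude-sonnet-4.6", 10_000, 1_000))
    print(turn_credits("claude-sonnet-4.6", 10_000, 1_000))
```

`Decimal` is used instead of floats so that the margin and scale multiply exactly before quantization.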

When a managed call is denied

When a credit, quota, or rate limit blocks a managed call, the billing admission gate denies it with a flat DenialEnvelope{code, layer, key, current, limit, reason}. The status code and layer identify the cause:
| Status | layer | When |
| --- | --- | --- |
| 402 | credit / wallet | Plan credits exhausted, overage disabled, or wallet balance insufficient. |
| 403 | quota | A plan quota is exceeded. |
| 429 | rate | A per-plan rate class (for example sync_exec) is exceeded; includes Retry-After and X-RateLimit-* headers. |

These envelopes appear only on managed-usage surfaces — the run, Composer, Assistant, and managed-knowledge calls that actually meter credits. Plain credential and catalog CRUD return the FastAPI {detail} HTTPException shape instead. See Errors & status codes and Usage gating & limits.
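
Because a 403 can carry either shape, a client needs to inspect the body and not just the status code. The sketch below shows one way to tell the two apart. It is an assumption-laden illustration: the `Denial` dataclass and `parse_denial` function are hypothetical, the envelope field names come from this page, the sample values in the usage block are invented, and only a numeric `Retry-After` (seconds) is handled.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

# Field names of the flat DenialEnvelope as listed on this page.
DENIAL_FIELDS = ("code", "layer", "key", "current", "limit", "reason")


@dataclass
class Denial:
    status: int
    layer: str
    reason: str
    retry_after_seconds: Optional[float]


def parse_denial(status: int, body: Mapping[str, Any],
                 headers: Mapping[str, str]) -> Optional[Denial]:
    """Return a Denial if the response is a billing-gate denial, else None."""
    if status not in (402, 403, 429):
        return None
    if not all(name in body for name in DENIAL_FIELDS):
        return None  # for example the {detail} shape from a role check
    retry_after: Optional[float] = None
    if status == 429:
        # Header names are case-insensitive, so normalize before lookup.
        lowered = {k.lower(): v for k, v in headers.items()}
        raw = lowered.get("retry-after", "").strip()
        if raw.isdigit():
            retry_after = float(raw)
    return Denial(status, str(body["layer"]), str(body["reason"]), retry_after)


if __name__ == "__main__":
    # Invented sample values; only the field names are from the docs.
    envelope = {"code": "rate_limited", "layer": "rate", "key": "sync_exec",
                "current": 61, "limit": 60, "reason": "rate class exceeded"}
    print(parse_denial(429, envelope, {"Retry-After": "12"}))
    print(parse_denial(403, {"detail": "Forbidden"}, {}))
```

A caller would typically back off for `retry_after_seconds` on a 429 and surface `reason` to the user on a 402 or 403 instead of retrying.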

Where you select a managed model

A managed model id is selectable anywhere a model is chosen — no connection step:

Chat

Pick a managed model in the chat composer.

Assistant

Set a managed model for the Assistant to reason and act with.

LLM node

Set model_id on the node to a managed id.

Agent node

Run an autonomous step on a managed model.

Deprecated models route automatically

When you select a model whose status is deprecated (or maintenance), ModuleX resolves it to its replacement_id before the call — for example a bedrock-claude-sonnet-4.6 id routes to claude-sonnet-4.6. Routing is cycle- and depth-guarded, so a chain of replacements terminates at a live model. For a managed integration, if a requested id is unknown or its chain cannot resolve, ModuleX softens to the integration’s first serving (healthy, non-deprecated) model rather than failing. You do not need to update saved selections immediately, but prefer a live id for clarity.
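
The resolution rule described above can be sketched as a guarded walk along `replacement_id` links. This is an illustration of the described behavior, not ModuleX's resolver: `resolve_model` and `is_serving` are hypothetical names, `MAX_DEPTH` is an assumed limit (the page says routing is depth-guarded but gives no number), and the `loop-a` / `loop-b` records are invented to exercise the cycle guard.

```python
from typing import Any, Optional

MAX_DEPTH = 8  # assumed guard; the real depth limit is not documented


def is_serving(record: dict[str, Any]) -> bool:
    """Active models omit `status`; deprecated/maintenance ones are rerouted."""
    return record.get("status") not in ("deprecated", "maintenance")


def resolve_model(model_id: str, models: list[dict[str, Any]]) -> Optional[str]:
    """Follow replacement_id links to a live model, else fall back."""
    by_id = {m["id"]: m for m in models}
    seen: set[str] = set()
    current: Optional[str] = model_id
    for _ in range(MAX_DEPTH):  # depth guard
        record = by_id.get(current) if current is not None else None
        if record is None or current in seen:  # unknown id or cycle
            break
        if is_serving(record):
            return current
        seen.add(current)
        current = record.get("replacement_id")
    # Soft fallback: first serving model of the integration, if any.
    for record in models:
        if is_serving(record):
            return record["id"]
    return None


if __name__ == "__main__":
    catalog = [
        {"id": "claude-sonnet-4.6"},
        {"id": "bedrock-claude-sonnet-4.6", "status": "deprecated",
         "replacement_id": "claude-sonnet-4.6"},
        {"id": "loop-a", "status": "deprecated", "replacement_id": "loop-b"},
        {"id": "loop-b", "status": "deprecated", "replacement_id": "loop-a"},
    ]
    print(resolve_model("bedrock-claude-sonnet-4.6", catalog))
    print(resolve_model("loop-a", catalog))
```

The `seen` set stops a replacement cycle and the loop bound stops an overlong chain; both cases end in the same fallback as an unknown id.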

Deprecated models

These six ids exist only for backward compatibility and route to their replacements. Do not select them for new work.
| Deprecated id | Routes to |
| --- | --- |
| bedrock-claude-sonnet-4.6 | claude-sonnet-4.6 |
| bedrock-claude-opus-4.6 | claude-opus-4.6 |
| bedrock-claude-haiku-4.5 | claude-haiku-4.5 |
| bedrock-gemma-3-27b | gemma-3-27b |
| bedrock-gemma-3-12b | gemma-3-12b |
| bedrock-gemma-3-4b | gemma-3-4b |

Switching to BYOK

Managed and BYOK are a per-selection choice — you switch by changing which model a surface uses, not by a global toggle. To move a chat, Assistant, or node off managed credits and onto a vendor-direct bill, select a model from a connected BYOK provider instead of a modulexai model.
1

Connect a BYOK provider

Create a credential for a first-party provider (openai, anthropic, gemini, or xai) with POST /credentials, supplying your own provider key. You must be an owner or admin. See Connect LLM providers and the per-provider pages, for example Anthropic and OpenAI.
2

Select a BYOK model where you want vendor-direct billing

In the chat model selector, Assistant settings, or a node, choose a model from the connected provider in place of the modulexai model.
3

Confirm the billing path

BYOK token usage is billed by the upstream vendor with no ModuleX markup and is recorded for analytics only — it does not consume credits. Non-model ModuleX usage (such as the flat per-run charge, or managed knowledge/tool calls) can still apply.
Switching a model moves cost between billing systems. A modulexai selection consumes ModuleX credits and is subject to the billing gate; a BYOK selection is billed by your provider and is not credited. Switching back to a modulexai model returns that cost to credits. Confirm which model is selected before relying on either path.

Connect LLM providers

All providers, and how managed vs BYOK selection works.

Credits & metering

What a credit is and exactly what consumes credits.

Errors

Catalog reads for modulexai return ModuleX’s standard error shapes. Managed model calls can additionally return the billing DenialEnvelope described in When a managed call is denied.
| Status | When it happens | Shape |
| --- | --- | --- |
| 400 | Asking a typed endpoint for the wrong integration type (for example modulexai is not a tool). | {detail: string} |
| 401 | Missing or invalid Authorization, or missing X-Organization-ID. | {detail: string} |
| 403 | Caller is not an owner or admin in the organization. | {detail: string} |
| 404 | Unknown provider, for example GET /integrations/llm-providers/<unknown> returns LLM provider not found. | {detail: string} |
| 422 | Query/parameter validation error. | {detail: [ ... ]} |
| 402 / 403 / 429 | A managed model call hits the billing gate (credit, quota, or rate). | DenialEnvelope{code, layer, key, current, limit, reason} |
| 500 | Unhandled server error. | {detail: string} |

See Errors & status codes for all three envelope shapes and which surface emits each.

Connect LLM providers

Managed vs BYOK, and the full provider list.

Credits & metering

The credit unit and what consumes credits.

Usage gating & limits

The billing admission gate and its 402/403/429 responses.

Anthropic (BYOK)

The bring-your-own-key alternative for Claude models.

Open questions (TBD)

  • Upstream routing detail. Several managed models route through an aggregator (openrouter) with a Bedrock-then-Anthropic fallback order, and the OpenAI embedding models route through openai. The exact upstream selected for a given call is handled internally and is not exposed in any catalog API response; treat the routing as an implementation detail subject to change.
  • Public catalog base URL. The catalog and SDK examples use https://api.modulex.dev per the repo-wide convention; the exact public host for these endpoints is not pinned from source and should be confirmed against the deployed environment.
  • Per-model max-output overrides. ModuleX applies scenario-specific maximum-output-token limits to managed calls (for example a smaller cap for simple chat than for an agent step). These limits are backend-only and are never returned in a catalog response, so the catalog’s max_output_tokens is the model ceiling, not necessarily the per-scenario limit applied at run time.