modulexdb), where ModuleX hosts your embeddings and chunks and bills retrieval
in credits, and external (bring-your-own) stores — Qdrant, Pinecone, MongoDB
Atlas, and Weaviate — that you connect with your own credentials and query
without ModuleX markup. This page is the exhaustive reference for both: how each
is identified, the exact fields you supply to connect, the knowledge node
contract that consumes them, and the retrieval billing model.
For the conceptual model of retrieval-augmented generation in ModuleX, see
Knowledge & RAG. To create and manage a managed
knowledge base in the app, see Knowledge overview
and Managed knowledge (modulexdb). To wire a
provider into a graph, see the Knowledge node.
Knowledge bases and their provider credentials are organization-scoped. Every
knowledge endpoint and every credential operation requires the owner or admin
role (organization_admin_required); the member role is retired. The organization
is selected by the X-Organization-ID header on every request; see
Org context & X-Organization-ID.

The two provider families
A provider is identified on the wire by a provider_type value. The backend
enumerates exactly five (KnowledgeProviderType): one managed and four external.
| provider_type | Family | Display name | How you connect | Retrieval billing |
|---|---|---|---|---|
| modulexdb | Managed | modulexdb (managed) | Auto-created when you create a knowledge base | Billed in credits |
| qdrant | External (BYOK) | Qdrant | custom credential (URL + API key) | Uncosted |
| pinecone | External (BYOK) | Pinecone | custom credential (API key + environment) | Uncosted |
| mongodb_atlas | External (BYOK) | MongoDB Atlas Vector Search | custom credential (connection string) | Uncosted |
| weaviate | External (BYOK) | Weaviate | custom credential (URL + API key) | Uncosted |
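
The table above can be mirrored client-side to reject unknown provider_type values before a request is sent. The following is a minimal sketch, not the backend's code: the class reuses the name KnowledgeProviderType for readability, while the is_managed and retrieval_billing properties and the parse_provider_type helper are hypothetical names introduced here for illustration.

```python
from enum import Enum


class KnowledgeProviderType(str, Enum):
    """Illustrative client-side mirror of the five provider_type wire values."""

    MODULEXDB = "modulexdb"
    QDRANT = "qdrant"
    PINECONE = "pinecone"
    MONGODB_ATLAS = "mongodb_atlas"
    WEAVIATE = "weaviate"

    @property
    def is_managed(self) -> bool:
        # Only modulexdb is hosted by ModuleX; the other four are external (BYOK).
        return self is KnowledgeProviderType.MODULEXDB

    @property
    def retrieval_billing(self) -> str:
        # Matches the "Retrieval billing" column of the table.
        return "credits" if self.is_managed else "uncosted"


def parse_provider_type(value: str) -> KnowledgeProviderType:
    """Convert a wire string to the enum, rejecting anything undocumented."""
    try:
        return KnowledgeProviderType(value)
    except ValueError:
        allowed = ", ".join(p.value for p in KnowledgeProviderType)
        raise ValueError(
            f"unknown provider_type {value!r}; expected one of: {allowed}"
        ) from None
```

Keeping the family lookup next to the wire values means billing and validation logic can branch on is_managed instead of comparing strings in several places.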
Per-provider connection guides
This page is the cross-provider reference. For the field-by-field connect steps,
open the page for your store: modulexdb (managed),
Qdrant,
Pinecone,
MongoDB Atlas, and
Weaviate.
Managed: modulexdb
The managed store is the default. When you create a knowledge base without
choosing an external provider, ModuleX provisions a modulexdb knowledge base: it
parses, chunks, and embeds your documents, stores the chunk vectors in its own
pgvector-backed store, and serves vector, hybrid, and RAG-context search over
them. You do not connect anything — the store is created for you, and an internal
credential is linked to the knowledge base automatically (the “native knowledge
base = credential” pattern).
A knowledge base counts as managed when its embedding_config.integration_name
is modulexai. Managed retrieval and managed ingest are metered in credits (see
Retrieval billing below). Choose modulexdb when you want
ModuleX to own ingestion and storage end to end and you do not already run a vector
database.
External: bring your own vector store
An external provider connects ModuleX to a vector database you already operate.
ModuleX does not ingest or store documents for an external provider: you populate
and maintain your own collection/index, and ModuleX only queries it at run time.
Because ModuleX provides no storage and no embeddings for these, external
retrieval is uncosted. Your vector-database vendor and your embedding provider
bill you directly; this is bring-your-own-key usage (see BYOK in the glossary).

All four external providers connect with a custom auth credential. Each declares
its own connection fields and exposes catalog actions (query, plus a list/describe
pair) so the catalog UI can show what the provider can do. Choose an external
provider when you already have vectors in Qdrant, Pinecone, MongoDB Atlas, or
Weaviate and want to retrieve over them without re-ingesting into ModuleX.
Important capability boundary. External providers query an existing
collection; they have no ModuleX-side ingestion pipeline. The document
upload, chunking, embedding, and processing flows described in
Managing documents apply to managed knowledge
bases only. You are responsible for keeping an external collection populated and
its embedding model consistent with the one you configure on the ModuleX side.
How a provider is connected
The connection mechanism differs by family.

Managed (modulexdb): create a knowledge base
Create a knowledge base — in the app under Knowledge, or with
POST /knowledge-bases.
ModuleX provisions the managed store, auto-discovers an embedding-capable
credential if you do not supply one, and links an internal modulexdb credential
to the new knowledge base. There is nothing else to connect. See
Build a RAG knowledge base.

External: create a credential for the provider
Create a custom credential for the provider with its connection fields (URL,
API key, connection string — see the per-provider tables below), in the app under
Settings → Credentials, or with POST /credentials. ModuleX encrypts the secret
at rest and tests the connection before saving where a test endpoint is declared.
See Managing credentials.

Reference the provider from a knowledge node
In a workflow, add a Knowledge node, set its
provider_type, and point credential_id at the credential (or, for managed, the
auto-linked knowledge-base credential). For external providers, also set
collection_name and an embedding_config. The node configuration is documented
in The knowledge node contract below.

Creating an external-provider credential
External providers use the custom auth schema. Create the credential with the
provider’s name and an auth_data object holding its fields. Auth in every
request follows the standard model: Authorization: Bearer mx_live_… plus
X-Organization-ID.
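
As an illustration of that request shape, the sketch below builds (without sending) a POST /credentials request using only the Python standard library. Several names are assumptions that do not come from this page: the API_BASE host is a placeholder, the integration_name and auth_type body keys are guesses at how the provider's name and the custom schema are expressed, and the url and api_key keys inside auth_data stand in for whatever fields the provider declares. Only the path, the auth_data object, and the two headers are taken from the text.

```python
import json
import urllib.request

# Hypothetical placeholder: the real API base URL is not given on this page.
API_BASE = "https://api.example.invalid"


def build_create_credential_request(api_key, organization_id, provider, auth_data):
    """Build, but do not send, a POST /credentials request for a custom credential."""
    body = {
        "integration_name": provider,  # assumed key for the provider's name
        "auth_type": "custom",         # assumed key selecting the custom schema
        "auth_data": auth_data,        # documented: object holding the connection fields
    }
    return urllib.request.Request(
        url=f"{API_BASE}/credentials",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "X-Organization-ID": organization_id,
            "Content-Type": "application/json",
        },
    )


request = build_create_credential_request(
    api_key="mx_live_REPLACE_ME",
    organization_id="org_REPLACE_ME",
    provider="qdrant",
    # Assumed auth_data key names; use the fields the provider actually declares.
    auth_data={
        "url": "https://xyz-abc.aws.cloud.qdrant.io:6333",
        "api_key": "REPLACE_ME",
    },
)
# To send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes the payload easy to inspect or log (with secrets redacted) before anything reaches the API.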
The exact SDK method surface for credential creation may vary between the
JavaScript and Python SDKs; the credential resource is not guaranteed to be at
parity across both. Confirm against the SDK ⇄ API parity matrix
and fall back to the REST call above when a method is absent. The cURL request is
the authoritative shape.
Connection fields by provider
Each external provider declares its connection fields in its manifest under a
custom auth schema. The fields below are the exact auth_data keys you supply.
- Qdrant
- Pinecone
- MongoDB Atlas
- Weaviate

Qdrant
- URL of your Qdrant instance, including the port. Not treated as a secret.
  Example: https://xyz-abc.aws.cloud.qdrant.io:6333
- API key for Qdrant Cloud. Optional for local/unauthenticated instances. Stored
  encrypted. Example format: qdrant-api-key-...
- Connection test: GET {url}/collections, expecting a result field in the response.
- Catalog actions: query, list_collections, get_collection_info.

Browsing providers in the catalog
The catalog exposes knowledge providers as read-only metadata for discovery; it
is not the execution surface. List providers with GET /integrations/knowledge-providers
or fetch one with GET /integrations/knowledge-providers/{provider_name}. Both
require the owner or admin role and X-Organization-ID. The detail endpoint
returns the provider’s actions and its auth_schemas (with OAuth flags enriched
where applicable — not relevant to these custom-auth providers).
- Unique provider id, e.g. qdrant, pinecone, mongodb_atlas, weaviate.
- Human-readable name, e.g. MongoDB Atlas Vector Search.
- Short summary of the provider.
- Always knowledge_provider for these entries.
- Tags such as Vector Database, semantic-search.
- actions: The provider’s catalog actions (returned by the detail endpoint), each
  with name, description, parameters, and a derived output_schema.
- auth_schemas: The auth variants the provider supports. For all four external
  providers this is a single custom schema whose connection fields live under
  fields (see the warning above).

The catalog SDK method names and exact filter arguments may differ between SDKs and
are not guaranteed to be at parity. Treat the cURL request as authoritative; verify
SDK signatures against the SDK ⇄ API parity matrix.
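
For reference, the two catalog paths can be assembled as follows. This is a small sketch under stated assumptions: API_BASE is a placeholder host, and catalog_url and catalog_headers are hypothetical helper names. Only the two paths and the header names come from this page.

```python
from urllib.parse import quote

# Hypothetical placeholder: the real API base URL is not given on this page.
API_BASE = "https://api.example.invalid"


def catalog_url(provider_name=None):
    """Return the list URL, or the detail URL when a provider name is given."""
    base = f"{API_BASE}/integrations/knowledge-providers"
    if provider_name is None:
        return base
    # Percent-encode the name so it is always a single path segment.
    return f"{base}/{quote(provider_name, safe='')}"


def catalog_headers(api_key, organization_id):
    """Headers for a catalog request; the caller must hold the owner or admin role."""
    return {
        "Authorization": f"Bearer {api_key}",
        "X-Organization-ID": organization_id,
    }
```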
The knowledge node contract
A Knowledge node is how a provider is consumed inside a workflow. The node configuration (KnowledgeNodeConfig) is shared across
families; a handful of fields apply only to external providers.
- credential_id: Credential for the knowledge base. For managed, this is the
  internal credential auto-linked to the knowledge base; for external, the custom
  credential you created for the provider.
- provider_type: One of modulexdb, qdrant, pinecone, weaviate, mongodb_atlas.
- query: The search query. Supports {{nodeId.path}} references to upstream node
  outputs for dynamic queries.
- When true, use the workflow input as the query instead of the query field.
- collection_name: Collection/index/class name on the external store. Required
  for external providers; ignored by modulexdb.
- Namespace for Pinecone (or similar). External only.
- Number of results to retrieve. Range 1–50.
- Minimum similarity-score threshold, 0.0–1.0. Score scales are normalized per
  provider, so the effective floor differs by vector store.
- Maximum tokens in the formatted context string (for the context output format).
  Range 100–10000.
- Provider-specific filter conditions applied to the search.
- Restrict retrieval to specific document ids. Managed knowledge bases only.
- One of chunks (individual chunks with metadata), context (a single formatted
  RAG context string), or both.
- Include chunk metadata in results.
- Include source-document info in the formatted context.
- embedding_config: Embedding settings for providers that do not embed
  internally. Required for external vector databases (Qdrant, Pinecone, MongoDB
  Atlas, and Weaviate when not using a text vectorizer), because ModuleX must
  turn the query text into a vector before searching your store.
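
The constraints listed above can be checked before a workflow is saved. The sketch below is one way to do that, not the engine's own validation. The keys provider_type, credential_id, collection_name, and embedding_config are named on this page; top_k, score_threshold, max_context_tokens, and output_format are hypothetical key names standing in for the fields whose names are not shown here, and only their documented ranges and allowed values are taken from the text.

```python
EXTERNAL_PROVIDERS = {"qdrant", "pinecone", "weaviate", "mongodb_atlas"}
ALL_PROVIDERS = EXTERNAL_PROVIDERS | {"modulexdb"}
OUTPUT_FORMATS = {"chunks", "context", "both"}

# (hypothetical field name, documented minimum, documented maximum)
RANGES = [
    ("top_k", 1, 50),
    ("score_threshold", 0.0, 1.0),
    ("max_context_tokens", 100, 10000),
]


def validate_knowledge_node_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    provider = config.get("provider_type")
    if provider not in ALL_PROVIDERS:
        problems.append("provider_type is not one of the five documented values")
    if not config.get("credential_id"):
        problems.append("credential_id is required")
    if provider in EXTERNAL_PROVIDERS:
        if not config.get("collection_name"):
            problems.append("collection_name is required for external providers")
        # Weaviate with a text vectorizer embeds internally, so this sketch
        # only insists on embedding_config for the other three stores.
        if provider != "weaviate" and not config.get("embedding_config"):
            problems.append("embedding_config is required for this provider")
    for name, low, high in RANGES:
        value = config.get(name)
        if value is not None and not low <= value <= high:
            problems.append(f"{name} must be between {low} and {high}")
    output_format = config.get("output_format")
    if output_format is not None and output_format not in OUTPUT_FORMATS:
        problems.append("output_format must be chunks, context, or both")
    return problems
```

Returning every problem at once, rather than raising on the first, lets a form or CI check report all misconfigured fields in one pass.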
Known limitation. External-provider retrieval is implemented in the workflow
engine’s knowledge node, which embeds the query (per
embedding_config) and calls
the provider adapter. A separate, secondary executor helper path does not yet
support external providers and returns “External provider not yet supported” if it
is reached. If an external-provider knowledge node returns that error, treat it as a
known gap; see Known limitations. Managed
(modulexdb) retrieval is supported on every path.

Retrieval billing
Whether a provider is metered depends entirely on the family.

| Operation | Managed (modulexdb) | External (BYOK) |
|---|---|---|
| Document ingest (parse → chunk → embed) | FILE_INGEST_BASE = 1 credit, plus per-chunk embedding token cost | Not applicable — you ingest into your own store |
| Retrieval (search / hybrid / retrieve-context) | RETRIEVAL_BASE = 1 credit, plus query-embedding token cost | Uncosted by ModuleX |

A knowledge base is billed as managed when
embedding_config.integration_name == "modulexai". One credit equals $0.001.
External (BYOK) retrieval is not credited; your vector-database vendor and your
embedding provider bill you directly. For the full credit model see
Credits & metering.
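
The arithmetic behind the retrieval row can be sketched as follows. The constant RETRIEVAL_BASE, the modulexai check, and the $0.001 credit value come from this page. The function names are hypothetical, and the embedding token cost is passed in by the caller because per-token rates are not given here.

```python
CREDIT_USD = 0.001    # one credit equals $0.001
RETRIEVAL_BASE = 1    # base credits per managed retrieval


def is_managed(embedding_config: dict) -> bool:
    """A knowledge base is metered as managed when its integration is modulexai."""
    return embedding_config.get("integration_name") == "modulexai"


def retrieval_credits(embedding_config: dict, embedding_token_credits: float) -> float:
    """Credits for one retrieval.

    embedding_token_credits is the query-embedding token cost, already
    expressed in credits and supplied by the caller.
    """
    if not is_managed(embedding_config):
        return 0.0  # external (BYOK) retrieval is uncosted by ModuleX
    return RETRIEVAL_BASE + embedding_token_credits


def credits_to_usd(credits: float) -> float:
    """Convert a credit amount to US dollars."""
    return credits * CREDIT_USD
```

For example, a managed retrieval whose query embedding costs 0.5 credits totals 1.5 credits, while the same call against an external store totals 0.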
On the interactive knowledge API, a managed-store credit or rate denial returns the
flat billing DenialEnvelope as 402 / 403 / 429 (reject-before-write: no
search runs). Inside a workflow run, the managed-retrieval gate is best-effort so a
billing hiccup never crashes an in-flight run. The error envelope shapes are
documented on Errors & status codes; the gate itself on
Usage gating & limits.

[Image: Decision diagram contrasting the managed modulexdb path with the external BYOK path.]

[Screenshot: The credential-add dialog connecting an external knowledge provider.]

Next steps
Managed knowledge (modulexdb)
Set up ModuleX-hosted storage and retrieval, billed in credits.
Connect an external store
Bring your own Qdrant, Pinecone, MongoDB Atlas, or Weaviate.
Knowledge node
Wire a provider into a workflow and shape its output.
Build a RAG knowledge base
End to end: create a knowledge base, ingest documents, and query it.