A knowledge provider is the vector store ModuleX searches when a workflow or chat needs retrieval. ModuleX supports two families: the managed store (modulexdb), where ModuleX hosts your embeddings and chunks and bills retrieval in credits, and external (bring-your-own) stores — Qdrant, Pinecone, MongoDB Atlas, and Weaviate — that you connect with your own credentials and query without ModuleX markup. This page is the exhaustive reference for both: how each is identified, the exact fields you supply to connect, the knowledge node contract that consumes them, and the retrieval billing model. For the conceptual model of retrieval-augmented generation in ModuleX, see Knowledge & RAG. To create and manage a managed knowledge base in the app, see Knowledge overview and Managed knowledge (modulexdb). To wire a provider into a graph, see the Knowledge node.
Knowledge bases and their provider credentials are organization-scoped. Every knowledge endpoint and every credential operation requires the owner or admin role (organization_admin_required); the member role is retired. The organization is selected by the X-Organization-ID header on every request — see Org context & X-Organization-ID.

The two provider families

A provider is identified on the wire by a provider_type value. The backend enumerates exactly five (KnowledgeProviderType): one managed and four external.
| provider_type | Family | Display name | How you connect | Retrieval billing |
| --- | --- | --- | --- | --- |
| modulexdb | Managed | modulexdb (managed) | Auto-created when you create a knowledge base | Billed in credits |
| qdrant | External (BYOK) | Qdrant | custom credential (URL + API key) | Uncosted |
| pinecone | External (BYOK) | Pinecone | custom credential (API key + environment) | Uncosted |
| mongodb_atlas | External (BYOK) | MongoDB Atlas Vector Search | custom credential (connection string) | Uncosted |
| weaviate | External (BYOK) | Weaviate | custom credential (URL + API key) | Uncosted |
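
The five wire values and their two families can be mirrored client-side so that a typo in provider_type fails early. The sketch below is illustrative only: KnowledgeProviderType is the backend enum name given above, but this Python rendering and the `family` helper are assumptions, not part of any ModuleX SDK.

```python
from enum import Enum


class KnowledgeProviderType(str, Enum):
    """Client-side mirror of the five provider_type wire values."""

    MODULEXDB = "modulexdb"
    QDRANT = "qdrant"
    PINECONE = "pinecone"
    MONGODB_ATLAS = "mongodb_atlas"
    WEAVIATE = "weaviate"


def family(provider_type: str) -> str:
    """Return "managed" or "external"; raise ValueError for unknown values."""
    provider = KnowledgeProviderType(provider_type)
    if provider is KnowledgeProviderType.MODULEXDB:
        return "managed"
    return "external"
```

Calling `family("qdrant")` yields "external", and an unlisted value such as "chroma" raises ValueError instead of reaching the API.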

Per-provider connection guides

This page is the cross-provider reference. For the field-by-field connect steps, open the page for your store: modulexdb (managed), Qdrant, Pinecone, MongoDB Atlas, and Weaviate.

Managed: modulexdb

The managed store is the default. When you create a knowledge base without choosing an external provider, ModuleX provisions a modulexdb knowledge base: it parses, chunks, and embeds your documents, stores the chunk vectors in its own pgvector-backed store, and serves vector, hybrid, and RAG-context search over them. You do not connect anything — the store is created for you, and an internal credential is linked to the knowledge base automatically (the “native knowledge base = credential” pattern). A knowledge base counts as managed when its embedding_config.integration_name is modulexai. Managed retrieval and managed ingest are metered in credits (see Retrieval billing below). Choose modulexdb when you want ModuleX to own ingestion and storage end to end and you do not already run a vector database.

External: bring your own vector store

An external provider connects ModuleX to a vector database you already operate. ModuleX does not ingest or store documents for an external provider — you populate and maintain your own collection/index, and ModuleX only queries it at run time. Because ModuleX provides no storage and no embeddings for these, external retrieval is uncosted (your vector-database vendor and your embedding provider bill you directly — this is bring-your-own-key usage; see BYOK in the glossary). All four external providers connect with a custom auth credential. Each declares its own connection fields and exposes catalog actions (query, plus a list/describe pair) so the catalog UI can show what the provider can do. Choose an external provider when you already have vectors in Qdrant, Pinecone, MongoDB Atlas, or Weaviate and want to retrieve over them without re-ingesting into ModuleX.
Important capability boundary. External providers query an existing collection; they have no ModuleX-side ingestion pipeline. The document upload, chunking, embedding, and processing flows described in Managing documents apply to managed knowledge bases only. You are responsible for keeping an external collection populated and its embedding model consistent with the one you configure on the ModuleX side.

How a provider is connected

The connection mechanism differs by family.
1. Managed (modulexdb): create a knowledge base

Create a knowledge base — in the app under Knowledge, or with POST /knowledge-bases. ModuleX provisions the managed store, auto-discovers an embedding-capable credential if you do not supply one, and links an internal modulexdb credential to the new knowledge base. There is nothing else to connect. See Build a RAG knowledge base.
2. External: create a credential for the provider

Create a custom credential for the provider with its connection fields (URL, API key, connection string — see the per-provider tables below), in the app under Settings → Credentials, or with POST /credentials. ModuleX encrypts the secret at rest and tests the connection before saving where a test endpoint is declared. See Managing credentials.
3. Reference the provider from a knowledge node

In a workflow, add a Knowledge node, set its provider_type, and point credential_id at the credential (or, for managed, the auto-linked knowledge-base credential). For external providers, also set collection_name and an embedding_config. The node configuration is documented in The knowledge node contract below.

Creating an external-provider credential

External providers use the custom auth schema. Create the credential with the provider’s name and an auth_data object holding its fields. Auth in every request follows the standard model: Authorization: Bearer mx_live_… plus X-Organization-ID.
curl -X POST https://api.modulex.dev/credentials \
  -H "Authorization: Bearer mx_live_xxx" \
  -H "X-Organization-ID: org_123" \
  -H "Content-Type: application/json" \
  -d '{
    "integration_name": "qdrant",
    "auth_type": "custom",
    "display_name": "Prod Qdrant",
    "make_default": true,
    "auth_data": {
      "url": "https://xyz-abc.aws.cloud.qdrant.io:6333",
      "api_key": "qdrant-api-key-..."
    }
  }'
The exact SDK method surface for credential creation may vary between the JavaScript and Python SDKs; the credential resource is not guaranteed to be at parity across both. Confirm against the SDK ⇄ API parity matrix and fall back to the REST call above when a method is absent. The cURL request is the authoritative shape.
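
When no SDK method is available, the REST call can be issued from any HTTP client. The following is a minimal sketch using only the Python standard library; it reuses the host, headers, and body from the cURL example, while the helper name `build_credential_request` is hypothetical. It builds the request without sending it, so the shape can be inspected first.

```python
import json
import urllib.request

API_BASE = "https://api.modulex.dev"  # host taken from the cURL example


def build_credential_request(api_key: str, org_id: str, integration_name: str,
                             display_name: str, auth_data: dict,
                             make_default: bool = False) -> urllib.request.Request:
    """Build (but do not send) the POST /credentials request for a custom credential."""
    body = {
        "integration_name": integration_name,
        "auth_type": "custom",
        "display_name": display_name,
        "make_default": make_default,
        "auth_data": auth_data,
    }
    return urllib.request.Request(
        f"{API_BASE}/credentials",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "X-Organization-ID": org_id,
            "Content-Type": "application/json",
        },
    )


request = build_credential_request(
    api_key="mx_live_xxx",
    org_id="org_123",
    integration_name="qdrant",
    display_name="Prod Qdrant",
    auth_data={
        "url": "https://xyz-abc.aws.cloud.qdrant.io:6333",
        "api_key": "qdrant-api-key-...",
    },
    make_default=True,
)
# To send it: urllib.request.urlopen(request), then parse the JSON response.
```

Keeping construction separate from sending makes it easy to log or unit-test the payload without touching a live organization.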

Connection fields by provider

Each external provider declares its connection fields in its manifest under a custom auth schema. The fields below are the exact auth_data keys you supply.
Qdrant
url (string, required): URL of your Qdrant instance, including the port. Not treated as a secret. Example: https://xyz-abc.aws.cloud.qdrant.io:6333.
api_key (string): API key for Qdrant Cloud. Optional for local/unauthenticated instances. Stored encrypted. Example format: qdrant-api-key-....
ModuleX validates the connection by listing collections (GET {url}/collections, expecting a result field in the response). Catalog actions: query, list_collections, get_collection_info.
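
The same check can be run locally before you create the credential, which helps separate a bad URL or key from a ModuleX-side problem. This is a sketch under assumptions: the function names are hypothetical, and it relies on Qdrant's REST convention of passing the key in an `api-key` header. It is not the code ModuleX itself runs.

```python
import json
import urllib.request
from typing import Optional


def collections_url(base_url: str) -> str:
    """Join the instance URL with /collections, tolerating a trailing slash."""
    return base_url.rstrip("/") + "/collections"


def looks_connected(payload: object) -> bool:
    """Mirror the documented check: the response must contain a `result` field."""
    return isinstance(payload, dict) and "result" in payload


def check_qdrant(base_url: str, api_key: Optional[str] = None,
                 timeout: float = 5.0) -> bool:
    """Call GET {url}/collections and apply the check above."""
    headers = {"api-key": api_key} if api_key else {}
    request = urllib.request.Request(collections_url(base_url), headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return looks_connected(json.load(response))
```

`check_qdrant` raises `urllib.error.URLError` (or its subclass `HTTPError`) on network or authentication failures, so wrap the call if you need a plain boolean.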
The connection fields above arrive in each manifest under a fields array, not the setup_environment_variables array used by tool and LLM-provider auth schemas. If you read the raw manifest from the catalog API, expect fields for the four knowledge providers. The manifest’s provider_type value (external) is dropped during catalog sync and never appears in API responses — do not rely on it.
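
If you parse raw manifests yourself, a small accessor keeps the two array names from leaking into the rest of your code. The sketch below assumes each auth schema is a JSON object; only the keys `fields` and `setup_environment_variables` come from this page. The per-field keys (`name`, `required`) and the `auth_type` values in the sample data are illustrative assumptions.

```python
def connection_fields(auth_schema: dict) -> list:
    """Return the connection-field entries of one auth schema.

    Knowledge providers declare them under `fields`; tool and LLM-provider
    schemas use `setup_environment_variables`. `fields` is checked first.
    """
    if "fields" in auth_schema:
        return list(auth_schema["fields"])
    return list(auth_schema.get("setup_environment_variables", []))


# Hypothetical schema fragments: the inner keys are assumptions,
# not the documented manifest shape.
qdrant_schema = {
    "auth_type": "custom",
    "fields": [
        {"name": "url", "required": True},
        {"name": "api_key", "required": False},
    ],
}
tool_schema = {
    "auth_type": "api_key",
    "setup_environment_variables": [{"name": "API_KEY", "required": True}],
}

names = [field["name"] for field in connection_fields(qdrant_schema)]
```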

Browsing providers in the catalog

The catalog exposes knowledge providers as read-only metadata for discovery; it is not the execution surface. List providers with GET /integrations/knowledge-providers or fetch one with GET /integrations/knowledge-providers/{provider_name}. Both require the owner or admin role and X-Organization-ID. The detail endpoint returns the provider’s actions and its auth_schemas (with OAuth flags enriched where applicable — not relevant to these custom-auth providers).
curl 'https://api.modulex.dev/integrations/knowledge-providers' \
  -H 'Authorization: Bearer mx_live_xxx' \
  -H 'X-Organization-ID: org_123'
name (string): Unique provider id, e.g. qdrant, pinecone, mongodb_atlas, weaviate.
display_name (string): Human-readable name, e.g. MongoDB Atlas Vector Search.
description (string): Short summary of the provider.
integration_type (string): Always knowledge_provider for these entries.
categories (string[]): Tags such as Vector Database, semantic-search.
actions (object[]): The provider’s catalog actions (returned by the detail endpoint), each with name, description, parameters, and a derived output_schema.
auth_schemas (object[]): The auth variants the provider supports. For all four external providers this is a single custom schema whose connection fields live under fields (see the warning above).
The catalog SDK method names and exact filter arguments may differ between SDKs and are not guaranteed to be at parity. Treat the cURL request as authoritative; verify SDK signatures against the SDK ⇄ API parity matrix.

The knowledge node contract

A Knowledge node is how a provider is consumed inside a workflow. The node configuration (KnowledgeNodeConfig) is shared across families; a handful of fields apply only to external providers.
credential_id (string, required): Credential for the knowledge base. For managed, this is the internal credential auto-linked to the knowledge base; for external, the custom credential you created for the provider.
provider_type (string, default: "modulexdb"): One of modulexdb, qdrant, pinecone, weaviate, mongodb_atlas.
query (string, required): The search query. Supports {{nodeId.path}} references to upstream node outputs for dynamic queries.
query_from_input (boolean, default: false): When true, use the workflow input as the query instead of the query field.
collection_name (string): Collection/index/class name on the external store. Required for external providers; ignored by modulexdb.
namespace (string): Namespace for Pinecone (or similar). External only.
top_k (integer, default: 5): Number of results to retrieve. Range 1–50.
min_score (number, default: 0.3): Minimum similarity-score threshold, 0.0–1.0. Score scales are normalized per provider, so the effective floor differs by vector store.
max_tokens (integer, default: 2000): Maximum tokens in the formatted context string (for the context output format). Range 100–10000.
filters (object): Provider-specific filter conditions applied to the search.
document_ids (string[]): Restrict retrieval to specific document ids. Managed knowledge bases only.
output_format (string, default: "context"): One of chunks (individual chunks with metadata), context (a single formatted RAG context string), or both.
include_metadata (boolean, default: true): Include chunk metadata in results.
include_source (boolean, default: true): Include source-document info in the formatted context.
embedding_config (object): Embedding settings for providers that do not embed internally. Required for external vector databases (Qdrant, Pinecone, MongoDB Atlas, and Weaviate when not using a text vectorizer), because ModuleX must turn the query text into a vector before searching your store.
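
The field contract above can be checked client-side before a graph is saved. The following is a sketch, not the backend's validator: the defaults and ranges come from the field list, while the function name, the error wording, the sample ids, and the keys inside embedding_config are assumptions. It also assumes query may be omitted when query_from_input is true, which this page does not state explicitly.

```python
EXTERNAL_PROVIDERS = {"qdrant", "pinecone", "weaviate", "mongodb_atlas"}
PROVIDERS = EXTERNAL_PROVIDERS | {"modulexdb"}

DEFAULTS = {
    "provider_type": "modulexdb",
    "query_from_input": False,
    "top_k": 5,
    "min_score": 0.3,
    "max_tokens": 2000,
    "output_format": "context",
    "include_metadata": True,
    "include_source": True,
}


def normalize_node_config(config: dict) -> dict:
    """Apply the documented defaults and raise ValueError on contract violations."""
    merged = {**DEFAULTS, **config}
    provider = merged["provider_type"]
    errors = []
    if not merged.get("credential_id"):
        errors.append("credential_id is required")
    if provider not in PROVIDERS:
        errors.append(f"unknown provider_type: {provider}")
    # Assumption: query may be omitted when the workflow input supplies it.
    if not merged["query_from_input"] and not merged.get("query"):
        errors.append("query is required unless query_from_input is true")
    if not 1 <= merged["top_k"] <= 50:
        errors.append("top_k must be in 1..50")
    if not 0.0 <= merged["min_score"] <= 1.0:
        errors.append("min_score must be in 0.0..1.0")
    if not 100 <= merged["max_tokens"] <= 10000:
        errors.append("max_tokens must be in 100..10000")
    if merged["output_format"] not in {"chunks", "context", "both"}:
        errors.append("output_format must be chunks, context, or both")
    if provider in EXTERNAL_PROVIDERS:
        if not merged.get("collection_name"):
            errors.append("collection_name is required for external providers")
        # Weaviate may use its own text vectorizer, so only the other three
        # are treated here as always needing embedding_config.
        if provider != "weaviate" and not merged.get("embedding_config"):
            errors.append("embedding_config is required for this provider")
        if merged.get("document_ids"):
            errors.append("document_ids applies to managed knowledge bases only")
    if errors:
        raise ValueError("; ".join(errors))
    return merged


external = normalize_node_config({
    "credential_id": "cred_qdrant",          # hypothetical credential id
    "provider_type": "qdrant",
    "query": "{{input.question}}",           # hypothetical upstream reference
    "collection_name": "support_docs",       # hypothetical collection
    "embedding_config": {"model": "example-embedding-model"},  # keys assumed
})
managed = normalize_node_config({"credential_id": "cred_kb", "query": "refund policy"})
```

Collecting every violation before raising gives a single, complete error message rather than one failure per attempt.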
Match the embedding model in embedding_config to the model your external collection was indexed with. ModuleX embeds the query with the model you configure; if it differs from the one used to build the collection (or the dimension does not match), similarity scores are meaningless and retrieval returns poor or empty results. For managed (modulexdb) knowledge bases this is handled for you — ingest and query use the same configured embedding model.
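
A dimension mismatch is the one part of this problem that can be detected mechanically. The sketch below assumes you can obtain the collection's vector dimension from your own store and a sample query embedding from your embedding provider; the function name is hypothetical. It cannot detect a different model that happens to share the same dimension, so it complements the model check and does not replace it.

```python
def assert_same_dimension(query_vector: list, collection_dimension: int) -> None:
    """Raise ValueError when the query embedding cannot be compared with the collection."""
    if len(query_vector) != collection_dimension:
        raise ValueError(
            f"query embedding has {len(query_vector)} dimensions, "
            f"collection expects {collection_dimension}"
        )
```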
Known limitation. External-provider retrieval is implemented in the workflow engine’s knowledge node, which embeds the query (per embedding_config) and calls the provider adapter. A separate, secondary executor helper path does not yet support external providers and returns “External provider not yet supported” if it is reached. If an external-provider knowledge node returns that error, treat it as a known gap; see Known limitations. Managed (modulexdb) retrieval is supported on every path.

Retrieval billing

Whether a provider is metered depends entirely on the family.
| Operation | Managed (modulexdb) | External (BYOK) |
| --- | --- | --- |
| Document ingest (parse → chunk → embed) | FILE_INGEST_BASE = 1 credit, plus per-chunk embedding token cost | Not applicable (you ingest into your own store) |
| Retrieval (search / hybrid / retrieve-context) | RETRIEVAL_BASE = 1 credit, plus query-embedding token cost | Uncosted by ModuleX |
A knowledge base is billed only when it is managed — that is, when embedding_config.integration_name == "modulexai". One credit equals $0.001. External (BYOK) retrieval is not credited; your vector-database vendor and your embedding provider bill you directly. For the full credit model see Credits & metering.
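
The billing rule can be expressed as a short estimate. This is a sketch for budgeting, not the metering code: the base constant, the managed test, and the credit price come from this section, while the function names are hypothetical and the embedding token cost is passed in as a parameter because this page does not give its rate.

```python
from decimal import Decimal

USD_PER_CREDIT = Decimal("0.001")  # one credit equals $0.001
RETRIEVAL_BASE = Decimal("1")      # base credits per managed retrieval


def is_managed(knowledge_base: dict) -> bool:
    """A knowledge base is managed when its embedding integration is modulexai."""
    config = knowledge_base.get("embedding_config") or {}
    return config.get("integration_name") == "modulexai"


def retrieval_credits(knowledge_base: dict,
                      embedding_token_credits: Decimal = Decimal("0")) -> Decimal:
    """Credits for one retrieval; external (BYOK) retrieval is uncosted."""
    if not is_managed(knowledge_base):
        return Decimal("0")
    return RETRIEVAL_BASE + embedding_token_credits


def credits_to_usd(credits: Decimal) -> Decimal:
    """Convert credits to US dollars at the documented rate."""
    return credits * USD_PER_CREDIT
```

For example, 1,000 managed retrievals with no embedding token cost come to 1,000 credits, which `credits_to_usd` converts to 1 US dollar.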
On the interactive knowledge API, a managed-store credit or rate denial returns the flat billing DenialEnvelope as 402 / 403 / 429 (reject-before-write — no search runs). Inside a workflow run, the managed-retrieval gate is best-effort so a billing hiccup never crashes an in-flight run. The error envelope shapes are documented on Errors & status codes; the gate itself on Usage gating & limits.
[Image: Decision diagram contrasting the managed modulexdb path with the external BYOK path.]
[Screenshot: The credential-add dialog connecting an external knowledge provider.]

Next steps

Managed knowledge (modulexdb): Set up ModuleX-hosted storage and retrieval, billed in credits.
Connect an external store: Bring your own Qdrant, Pinecone, MongoDB Atlas, or Weaviate.
Knowledge node: Wire a provider into a workflow and shape its output.
Build a RAG knowledge base: End to end: create a knowledge base, ingest documents, and query it.