MongoDB Atlas Vector Search is a bring-your-own-key (BYOK) knowledge provider for ModuleX. You connect your own Atlas cluster with a connection string, and ModuleX runs semantic retrieval against an existing collection that already holds vector embeddings; ModuleX does not host or ingest the vectors. The provider name on the wire is mongodb_atlas. Unlike the managed store modulexdb, where ModuleX hosts the vectors, runs the embeddings, and meters the work in credits, MongoDB Atlas is BYOK and uncosted: retrieval runs against your cluster and never consumes ModuleX credits. Your upstream Atlas and embedding-provider usage is billed by those providers directly. The other BYOK stores are Qdrant, Pinecone, and Weaviate.
This page is the provider-catalog entry for mongodb_atlas: how to connect your cluster, how to configure a workflow knowledge node to retrieve from it, and the exact request and result shapes. For the broader managed-vs-BYOK model and the RAG concept, see Knowledge & RAG and External knowledge providers.

What Atlas does in ModuleX

MongoDB Atlas is a retrieval-only provider in ModuleX. ModuleX queries your cluster’s existing vector index; it does not upload documents, create indexes, or chunk and embed content for you. You are responsible for populating the collection with documents that contain an embedding field and for creating an Atlas Vector Search index over that field. Because Atlas requires a precomputed query vector for similarity search, ModuleX embeds the search text first (using an embedding credential you configure) and then runs Atlas’s $vectorSearch aggregation. Two surfaces use the provider:

Workflow knowledge node

A knowledge node with provider_type set to mongodb_atlas retrieves matching documents inside a running workflow and feeds them to a downstream LLM or agent node. This is the primary BYOK retrieval path.

Integration catalog entry

Atlas appears in the integration catalog as a knowledge_provider with query, list_databases, and list_collections actions and a connection-string credential. The catalog is the browse and credential-setup surface.
🎬 MEDIA PLACEHOLDER · MX-MEDIA-4150 · [IMAGE] [IMAGE_DESCRIPTION]: BYOK retrieval flow from a ModuleX knowledge node to MongoDB Atlas Vector Search. [IMAGE_DETAILS]: Left-to-right flow. Box 1 “Knowledge node (provider_type: mongodb_atlas)”. Arrow to Box 2 “ModuleX embeds the query text via embedding_config (e.g. OpenAI)”. Arrow to Box 3 “Atlas $vectorSearch on database.collection”. Arrow back to Box 4 “matches: id, score, content, metadata”. A small lock icon labeled “connection string credential” sits over the arrow into Atlas, and a side note reads “BYOK — not credited by ModuleX”. Light theme, 16:9, brand palette.

Connect MongoDB Atlas

You connect Atlas by storing a credential that holds your cluster connection string. The credential is created against the mongodb_atlas integration with auth_type set to custom. You can create it in the app or over the API.

Credential fields

The Atlas auth schema has a single field.
connection_string
string
required
Your MongoDB Atlas connection string, in mongodb+srv://... form (for example mongodb+srv://user:password@cluster.mongodb.net/). This value is sensitive — it is encrypted at rest and never returned in full by the API. ModuleX uses it to open an async client (motor) against your cluster for every retrieval. The connecting database user needs read access to the database and collection you intend to query.
The connection string embeds your cluster credentials. Scope the Atlas database user to the data ModuleX needs to read, and rotate the credential if the string is exposed. ModuleX encrypts stored credential data; see Data security & encryption.

Connect in the app

Connecting a knowledge provider is an org-level write, so it requires the owner or admin role (see Roles & permissions; the member role is retired). In the org settings, open the integration catalog, select MongoDB Atlas Vector Search, choose Add credential, paste the connection string, and save. ModuleX validates the credential by connecting and listing databases before it is stored.
🎬 MEDIA PLACEHOLDER · MX-MEDIA-4151 · [SCREENSHOT] [SCREENSHOT_DESCRIPTION]: The Add-credential dialog for MongoDB Atlas Vector Search. [SCREENSHOT_DETAILS]: ModuleX org settings, integration catalog, MongoDB Atlas detail open with the “Add credential” dialog showing the single “Connection String” field (masked sample mongodb+srv://user:password@cluster.mongodb.net/), a display-name field, a “Set as default” toggle, and a “Test” button. Light theme, desktop width, no real secrets visible.

Connect over the API

Every request authenticates with Authorization: Bearer mx_live_… plus the X-Organization-ID header (the backend also accepts X-API-KEY); see Authentication. Creating a credential requires the owner/admin role. Send auth_type: "custom" and put the connection string in auth_data keyed by the field name connection_string. A successful create returns 201 with a CredentialResponse; keep the returned credential_id — the knowledge node references it.
curl -X POST https://api.modulex.dev/credentials \
  -H "Authorization: Bearer mx_live_xxx" \
  -H "X-Organization-ID: org_123" \
  -H "Content-Type: application/json" \
  -d '{
    "integration_name": "mongodb_atlas",
    "auth_type": "custom",
    "auth_data": {
      "connection_string": "mongodb+srv://user:password@cluster.mongodb.net/"
    },
    "display_name": "Atlas production",
    "make_default": true
  }'
The POST /credentials request body and the custom auth type are verified against the backend. The SDK method names and parameter casing for credential creation are not confirmed on this page; check the JavaScript SDK and Python SDK references before relying on SDK snippet shapes.
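For callers that are not using curl, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not an SDK call: the helper names build_create_credential_request and create_credential are hypothetical, and the base URL, headers, and body fields are copied from the curl example above.

```python
import json
import urllib.request

API_BASE = "https://api.modulex.dev"  # base URL taken from the curl example


def build_create_credential_request(api_key, org_id, connection_string,
                                    display_name="Atlas production",
                                    make_default=True):
    """Build (but do not send) the POST /credentials request for mongodb_atlas."""
    body = {
        "integration_name": "mongodb_atlas",
        "auth_type": "custom",
        # The auth_data key is the credential field name, connection_string.
        "auth_data": {"connection_string": connection_string},
        "display_name": display_name,
        "make_default": make_default,
    }
    return urllib.request.Request(
        f"{API_BASE}/credentials",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "X-Organization-ID": org_id,
            "Content-Type": "application/json",
        },
    )


def create_credential(request):
    """Send the request and return credential_id from the CredentialResponse."""
    # urlopen raises urllib.error.HTTPError on a non-2xx status.
    with urllib.request.urlopen(request) as response:
        return json.load(response)["credential_id"]
```

Keeping request construction separate from sending makes it easy to inspect the body before any secret leaves the process. The returned credential_id is the value the knowledge node references.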

Credential response

POST /credentials returns a CredentialResponse. The connection string is never echoed back.
credential_id
string (UUID)
The credential identifier. Pass this as the knowledge node’s credential_id.
integration_name
string
mongodb_atlas.
integration_type
string
knowledge_provider.
display_name
string
The display name you supplied, or a generated default.
auth_type
string
custom.
is_default
boolean
Whether this is the default credential for mongodb_atlas in the organization.
created_at
string (ISO-8601) | null
Creation timestamp.
updated_at
string (ISO-8601) | null
Last-update timestamp.
last_used_at
string (ISO-8601) | null
When the credential was last used for a retrieval. null until first use.
expires_at
string (ISO-8601) | null
Expiry, if any. Connection-string credentials do not expire on their own.
credentials_metadata
object | null
Any metadata you attached at create time.

Test the connection before saving

To validate a connection string without persisting it, call POST /credentials/test-temporary with the same integration_name, auth_type, and auth_data. Atlas has no HTTP test endpoint, so the test connects to the cluster, pings the server, and lists databases.
curl -X POST https://api.modulex.dev/credentials/test-temporary \
  -H "Authorization: Bearer mx_live_xxx" \
  -H "X-Organization-ID: org_123" \
  -H "Content-Type: application/json" \
  -d '{
    "integration_name": "mongodb_atlas",
    "auth_type": "custom",
    "auth_data": {
      "connection_string": "mongodb+srv://user:password@cluster.mongodb.net/"
    }
  }'
The response is a TestTemporaryCredentialResponse with is_valid, a human-readable message, tested_at, test_method, integration_name, and auth_type. A failed connection reports is_valid: false with the reason in message.
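A caller will usually want to stop before saving when the test fails. The sketch below shows one way to act on the response, assuming it has already been parsed into a dict with the field names listed above; the helper name require_valid_connection and the sample values are hypothetical.

```python
def require_valid_connection(test_response):
    """Raise if a TestTemporaryCredentialResponse reports a failed connection.

    Returns the tested_at value on success so the caller can log it.
    """
    if not test_response.get("is_valid", False):
        reason = test_response.get("message", "no message returned")
        raise ValueError(f"Atlas connection test failed: {reason}")
    return test_response.get("tested_at")


# Hypothetical parsed response, shaped after the fields named above.
sample = {
    "is_valid": False,
    "message": "MongoDB authentication failed: bad auth",
    "integration_name": "mongodb_atlas",
    "auth_type": "custom",
}

try:
    require_valid_connection(sample)
except ValueError as error:
    print(error)
```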

Catalog actions

In the integration catalog, mongodb_atlas is a knowledge_provider that advertises three actions: query, list_databases, and list_collections. These describe the provider’s capability in the catalog; retrieval inside a workflow is driven by the knowledge node config in the next section, which maps onto the query action.

Configure BYOK retrieval in a knowledge node

A knowledge node retrieves from a provider inside a running workflow. To retrieve from Atlas, set the node’s provider_type to mongodb_atlas, point credential_id at your Atlas credential, set collection_name to a database.collection value, and supply an embedding_config so ModuleX can embed the query.

How retrieval runs

When the node executes, ModuleX:
1. Resolves and decrypts the credential. It loads the org credential by credential_id, decrypts the stored connection string, and opens an async Atlas client.

2. Embeds the query. Atlas requires a precomputed query vector, so ModuleX embeds the resolved query text using embedding_config. If embedding_config is missing, the node fails with a validation error.

3. Runs $vectorSearch. It splits collection_name on the first . into database and collection, then runs an aggregation: a $vectorSearch stage followed by $addFields that exposes the vectorSearchScore as score. The vector field is excluded from results unless vectors are requested.

4. Filters and formats. Matches scoring below min_score are dropped. Each remaining document becomes a chunk with content, score, metadata, and id, then the node formats them per output_format.
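The aggregation and score filter in the last two steps can be sketched as plain Python data. This is an illustration of the shape, not the backend's code: the function names are hypothetical, the index name vector_index and field embedding are the fixed defaults documented under Atlas index defaults, and the $project stage is an assumption about how the vector field might be excluded.

```python
VECTOR_INDEX = "vector_index"  # fixed index name documented for this path
VECTOR_FIELD = "embedding"     # fixed vector field documented for this path


def split_collection_name(collection_name):
    """Split 'database.collection' on the first '.' only."""
    database, separator, collection = collection_name.partition(".")
    if not separator:
        raise ValueError("Collection name must be in 'database.collection' format.")
    return database, collection


def build_pipeline(query_vector, top_k=5, filters=None, include_vectors=False):
    """Build a $vectorSearch stage followed by $addFields exposing the score."""
    vector_search = {
        "index": VECTOR_INDEX,
        "path": VECTOR_FIELD,
        "queryVector": query_vector,
        "numCandidates": top_k * 10,  # candidate pool scales with top_k
        "limit": top_k,
    }
    if filters:
        vector_search["filter"] = filters  # MQL pre-filter, passed through
    pipeline = [
        {"$vectorSearch": vector_search},
        {"$addFields": {"score": {"$meta": "vectorSearchScore"}}},
    ]
    if not include_vectors:
        # Assumption: an exclusion $project is one way to drop the vector field.
        pipeline.append({"$project": {VECTOR_FIELD: 0}})
    return pipeline


def drop_low_scores(documents, min_score=0.3):
    """Keep only matches scoring at or above min_score."""
    return [doc for doc in documents if doc["score"] >= min_score]
```

For example, split_collection_name("support.kb_chunks") yields the database support and the collection kb_chunks, and build_pipeline(vector, top_k=5) asks Atlas for 50 candidates and returns at most 5 results.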
BYOK retrieval is uncosted. Unlike modulexdb, an Atlas knowledge node consumes no ModuleX credits — there is no billing gate on the retrieval. Your Atlas query cost and your embedding-provider token cost are billed by those providers directly. See Credits & metering.

Knowledge-node fields

These are the KnowledgeNodeConfig fields that apply when provider_type is mongodb_atlas.
provider_type
string
default:"modulexdb"
Set to mongodb_atlas to retrieve from Atlas. One of modulexdb, qdrant, pinecone, weaviate, mongodb_atlas.
credential_id
string
required
The Atlas credential ID from the credential you created. ModuleX decrypts its connection string to reach your cluster.
collection_name
string
required
The target collection in database.collection format (for example support.kb_chunks). ModuleX splits the value on the first . separator, so a value with no . fails with a database.collection format error. Required for Atlas: a node with no collection_name fails before it queries.
query
string
required
The search text. Supports {{nodeId.path}} references so the query can come from an upstream node’s output. ModuleX embeds this text to build the Atlas query vector.
query_from_input
boolean
default:"false"
When true, the workflow input is used as the query instead of the query field.
embedding_config
object
required
Required for Atlas. The embedding model ModuleX uses to embed the query text before the vector search. The embedding model and dimension you configure here must match the model and dimension used to build the vectors stored in your Atlas collection, or scores will be meaningless. See the sub-fields below.
top_k
integer
default:"5"
Number of results to retrieve. Range 1–50.
min_score
number
default:"0.3"
Minimum similarity score threshold. Range 0.0–1.0. Matches from Atlas scoring below this value are dropped.
max_tokens
integer
default:"2000"
Maximum tokens in the formatted context string, when output_format returns a context. Range 100–10000.
filters
object
Provider-specific pre-filter passed through to Atlas as the $vectorSearch filter (MQL form). null by default.
namespace
string
Not used for Atlas (it applies to Pinecone-style providers). Leave unset.
document_ids
array
Filter to specific document IDs. Native (modulexdb) knowledge bases only — has no effect on Atlas. Use filters for Atlas pre-filtering.
output_format
string
default:"context"
How the node returns results. One of chunks (individual matches with metadata), context (a single formatted RAG string), or both.
include_metadata
boolean
default:"true"
Include each document’s metadata (every field except _id, score, and the vector field) in the results.
include_source
boolean
default:"true"
Include source-document info in the formatted context string.
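A config can be checked locally before it is saved to a workflow. The validator below is a hypothetical client-side sketch, not the backend's validation: it covers only a subset of the fields above, and it reads the numeric ranges as 1 to 50 for top_k, 0.0 to 1.0 for min_score, and 100 to 10000 for max_tokens.

```python
def validate_atlas_node_config(config):
    """Return a list of problems found in a mongodb_atlas knowledge-node config."""
    problems = []
    if config.get("provider_type") != "mongodb_atlas":
        problems.append("provider_type must be mongodb_atlas")
    if not config.get("credential_id"):
        problems.append("credential_id is required")

    collection_name = config.get("collection_name") or ""
    if not collection_name:
        problems.append("collection_name is required for external provider")
    elif "." not in collection_name:
        problems.append("Collection name must be in 'database.collection' format")

    if not config.get("embedding_config"):
        problems.append("embedding_config is required for mongodb_atlas")

    # (low, high, default) per field, as read from the field reference.
    ranges = {
        "top_k": (1, 50, 5),
        "min_score": (0.0, 1.0, 0.3),
        "max_tokens": (100, 10000, 2000),
    }
    for field, (low, high, default) in ranges.items():
        value = config.get(field, default)
        if not low <= value <= high:
            problems.append(f"{field} must be between {low} and {high}")
    return problems
```

An empty list means the checked fields look consistent; it does not guarantee the node will run, since the credential, the Atlas index, and the embedding model are only verified at execution time.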

Atlas index defaults

ModuleX builds the $vectorSearch stage with these defaults. Set up your Atlas Vector Search index and stored documents to match them.
| Setting | Default ModuleX uses | Notes |
| --- | --- | --- |
| Vector Search index name | vector_index | The index name passed to $vectorSearch. Create your Atlas index with this name. |
| Vector field path | embedding | The document field that holds the stored vectors. |
| numCandidates | top_k × 10 | Candidate pool size; scales with top_k. |
| limit | top_k | Results returned by the aggregation. |
ModuleX queries Atlas with a fixed index name of vector_index and a fixed vector field of embedding for the knowledge-node retrieval path. Your collection must have an Atlas Vector Search index named vector_index over a field named embedding.
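Since you create the index yourself, a matching definition might look like the sketch below. The definition follows the Atlas Vector Search index format; the cosine similarity choice is an assumption (pick the function your embedding model expects), and the helper names are hypothetical. create_vector_index relies on PyMongo 4.7 or later and is shown only to illustrate where the definition goes.

```python
def vector_index_definition(num_dimensions, similarity="cosine"):
    """Atlas Vector Search index definition over the 'embedding' field."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": "embedding",              # must match the fixed vector field
                "numDimensions": num_dimensions,  # must match your embedding model
                "similarity": similarity,         # assumption: cosine
            }
        ]
    }


def create_vector_index(collection, num_dimensions):
    """Create the index on a PyMongo Collection; returns the index name."""
    # Imported lazily because pymongo is a third-party dependency.
    from pymongo.operations import SearchIndexModel

    model = SearchIndexModel(
        definition=vector_index_definition(num_dimensions),
        name="vector_index",  # must match the fixed index name
        type="vectorSearch",
    )
    return collection.create_search_index(model=model)


# text-embedding-3-small produces 1536-dimensional vectors by default.
definition = vector_index_definition(1536)
```

The numDimensions value has to equal the dimension of the model set in embedding_config, otherwise Atlas rejects the query vector or returns meaningless scores.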

Example node config

{
  "provider_type": "mongodb_atlas",
  "credential_id": "939f74dc-7b2f-473c-b87d-b95c30c32fd3",
  "collection_name": "support.kb_chunks",
  "query": "{{trigger.question}}",
  "top_k": 5,
  "min_score": 0.3,
  "output_format": "context",
  "embedding_config": {
    "integration_name": "openai",
    "provider_id": "openai",
    "model_id": "text-embedding-3-small"
  }
}
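When configs are generated programmatically, the same shape can be built in Python and serialized to JSON, optionally with a pre-filter. This is a sketch under stated assumptions: atlas_node_config is a hypothetical helper, and the filter fields product and year are invented for illustration.

```python
import json


def atlas_node_config(credential_id, collection_name, query, filters=None):
    """Build a knowledge-node config dict for provider_type mongodb_atlas."""
    config = {
        "provider_type": "mongodb_atlas",
        "credential_id": credential_id,
        "collection_name": collection_name,
        "query": query,
        "top_k": 5,
        "min_score": 0.3,
        "output_format": "context",
        "embedding_config": {
            "integration_name": "openai",
            "provider_id": "openai",
            "model_id": "text-embedding-3-small",
        },
    }
    if filters is not None:
        # MQL pre-filter, passed through as the $vectorSearch filter.
        config["filters"] = filters
    return config


config = atlas_node_config(
    "939f74dc-7b2f-473c-b87d-b95c30c32fd3",
    "support.kb_chunks",
    "{{trigger.question}}",
    filters={"product": {"$eq": "billing"}, "year": {"$gte": 2024}},
)
print(json.dumps(config, indent=2))
```

Atlas only accepts pre-filters on fields that are declared with type filter in the Vector Search index definition, so any field used here must also be added to the index.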

Retrieval results

Each Atlas match is normalized into a chunk before the node formats it.
id
string
The matched document’s _id, stringified.
score
number
The Atlas vectorSearchScore for the match. Matches below min_score are excluded.
content
string | null
The text content extracted from the document. ModuleX looks for a text value in common fields — content, text, chunk_text, page_content, data, body, summary, document — and returns the first match, or null if none is present.
metadata
object | null
Every document field except _id, score, and the vector field. Present only when include_metadata is true.
The node then returns chunks, a formatted context string, or both, per output_format. The downstream LLM or agent node consumes this as RAG context.
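The normalization described above can be sketched as follows. This is an illustrative reading of the documented behavior, not the backend's code: to_chunk is a hypothetical name, treating only string values as text is an assumption, and metadata is set to None when include_metadata is false.

```python
# Content fields in the lookup order documented above.
CONTENT_FIELDS = (
    "content", "text", "chunk_text", "page_content",
    "data", "body", "summary", "document",
)


def to_chunk(document, include_metadata=True, vector_field="embedding"):
    """Normalize one Atlas result document into an id/score/content/metadata chunk."""
    # First common field holding a string wins; None if no field matches.
    content = next(
        (document[name] for name in CONTENT_FIELDS
         if isinstance(document.get(name), str)),
        None,
    )
    metadata = None
    if include_metadata:
        excluded = {"_id", "score", vector_field}
        metadata = {key: value for key, value in document.items()
                    if key not in excluded}
    return {
        "id": str(document["_id"]),
        "score": document["score"],
        "content": content,
        "metadata": metadata,
    }
```

Note that the content field itself is not excluded from metadata, since the documented exclusions are only _id, score, and the vector field.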

Errors

Atlas retrieval runs inside the workflow executor, so failures surface as the node failing the run rather than as an HTTP envelope. Because BYOK retrieval is uncosted, there is no DenialEnvelope ({code, layer, …}) on this path — that envelope only appears on managed (modulexdb) knowledge operations. The credential-management endpoints return the standard {detail} HTTPException shape. See Errors & status codes.
| Condition | What you see |
| --- | --- |
| collection_name missing | The node fails with collection_name is required for external provider. |
| collection_name has no . | Collection name must be in 'database.collection' format. |
| embedding_config missing | embedding_config required for providers that don't handle embeddings. |
| No query vector reached Atlas | MongoDB Atlas requires query_vector for search. |
| Bad or unauthorized connection string | MongoDB authentication failed: … (an authentication error). |
| Cluster unreachable / motor not installed | Failed to connect to MongoDB: … (a connection error). |
| Aggregation failure (for example a missing or misnamed Atlas index) | MongoDB query failed: … (a query error). |
| Credential not found in the org | Credential not found: <credential_id>. |
The motor driver must be available in the runtime for Atlas connections. If it is missing, connections fail with an instruction to install it. This is an environment dependency of the ModuleX backend, not something you configure per credential.
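If you inspect failed runs programmatically, the messages above can be bucketed by their stable prefixes. The sketch below is a hypothetical client-side helper; how the failure message is exposed on a failed run is not specified on this page, so the input is assumed to be the raw message string.

```python
# (message fragment, category) pairs taken from the error messages listed above.
ERROR_FRAGMENTS = (
    ("MongoDB authentication failed", "authentication"),
    ("Failed to connect to MongoDB", "connection"),
    ("MongoDB query failed", "query"),
    ("Credential not found", "credential"),
    ("collection_name is required", "config"),
    ("Collection name must be in", "config"),
    ("embedding_config required", "config"),
    ("requires query_vector", "config"),
)


def classify_atlas_failure(message):
    """Map a node failure message to a coarse category, or 'unknown'."""
    for fragment, category in ERROR_FRAGMENTS:
        if fragment in message:
            return category
    return "unknown"
```

A connection category covers both an unreachable cluster and a missing motor driver, so the full message is still needed to tell those apart.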

Knowledge providers

Compare MongoDB Atlas with modulexdb and the other BYOK vector stores.

modulexdb (managed)

The managed alternative: ModuleX hosts the vectors and meters usage in credits.

Knowledge node

Retrieve from Atlas inside a workflow.

External knowledge providers

The product overview of bringing your own vector store.

Authentication & credentials

How ModuleX stores and resolves integration credentials.

Knowledge & RAG

The retrieval-augmented-generation model behind knowledge nodes.