Weaviate is an open-source vector database with a GraphQL query API. In ModuleX, Weaviate is a bring-your-own-key (BYOK) knowledge provider: you store the connection details for your own cluster as a credential, and a knowledge node in a workflow runs vector similarity search against one of your Weaviate classes. ModuleX never hosts your vectors and does not charge credits for retrieval against a Weaviate cluster you own — see Billing. If you want ModuleX to host the vectors, ingest, embed, and search for you, use modulexdb (managed) instead. For the full list of supported stores, see Knowledge providers.
Weaviate retrieval in ModuleX runs inside a workflow knowledge node, not through the managed /knowledge-bases search API. The managed search, ingest, and document endpoints under /knowledge-bases operate only on modulexdb-backed knowledge bases. With Weaviate you bring vectors that already exist in your own cluster.

How Weaviate fits into ModuleX

ModuleX treats Weaviate as an external provider behind a single adapter. At run time, a knowledge node:
1. Resolves the credential. The node looks up your stored Weaviate credential by credential_id, scoped to the organization in X-Organization-ID, and decrypts the connection details (url and optional api_key).
2. Embeds the query. Weaviate retrieval in ModuleX requires a query vector. ModuleX generates the embedding from the node’s embedding_config (for example an OpenAI text-embedding-3-small model) before calling Weaviate. See embedding configuration.
3. Runs a nearVector GraphQL search. The adapter issues a Get GraphQL query with a nearVector argument against your class, applies a certainty floor derived from min_score, and limits results to top_k.
4. Returns chunks or context. Matches are normalized into ModuleX’s standard chunk shape and returned as chunks, a single context string, or both, depending on output_format.
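The search step above ends in a single GraphQL request. As a rough sketch of what that request body might look like, the following builds a Weaviate Get query with a nearVector argument. The helper name build_near_vector_query and its parameters are hypothetical, and ModuleX's adapter may assemble the query differently; only the GraphQL shape (Get, nearVector, limit, _additional) is standard Weaviate.

```python
import json


def build_near_vector_query(class_name, vector, top_k=5, certainty=0.0, properties=()):
    """Build a Weaviate GraphQL Get query that uses nearVector.

    Hypothetical helper for illustration; not ModuleX's actual adapter code.
    """
    near = {"vector": list(vector)}
    if certainty > 0:
        # Only add a certainty floor when one was requested.
        near["certainty"] = certainty
    # GraphQL input objects use unquoted keys, so render them by hand.
    near_args = ", ".join(f"{key}: {json.dumps(value)}" for key, value in near.items())
    fields = " ".join(properties)
    query = (
        "{ Get { %s(nearVector: {%s}, limit: %d) { %s _additional { id certainty distance } } } }"
        % (class_name, near_args, top_k, fields)
    )
    # Collapse repeated whitespace (for example when no properties are requested).
    return {"query": " ".join(query.split())}


payload = build_near_vector_query(
    "ProductDocs", [0.1, 0.2, 0.3], top_k=5, certainty=0.7, properties=["content"]
)
print(payload["query"])
```

A body like this is what Weaviate accepts at POST {url}/v1/graphql. In practice the vector has as many dimensions as the embedding model produces, not three.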
🎬 MEDIA PLACEHOLDER · MX-MEDIA-4160 · [IMAGE] [IMAGE_DESCRIPTION]: Data-flow diagram of BYOK Weaviate retrieval inside a knowledge node. [IMAGE_DETAILS]: Show: workflow knowledge node -> resolve Weaviate credential (url + optional api_key) -> embed query via embedding_config -> nearVector GraphQL POST to the customer’s Weaviate cluster -> normalized chunks back into run state. Mark the cluster boundary as customer-owned (BYOK, uncosted). Match the docs accent palette; light and dark variants; 16:9.

Connect Weaviate

A Weaviate connection is stored as a credential with the custom auth type and the integration name weaviate. Creating a credential requires the owner or admin role on the organization (organization_admin_required); the member role is retired and cannot create credentials. See Roles & permissions. Authenticate every request to the ModuleX API with Authorization: Bearer mx_live_… and the X-Organization-ID header, as described in Authentication.

Connection fields

These are the two fields the Weaviate auth schema exposes. They are stored, encrypted at rest, inside the credential’s auth_data.
url
string
required
The base URL of your Weaviate instance, with no trailing path. Example: https://your-cluster.weaviate.cloud. If omitted at query time the adapter falls back to http://localhost:8080, but you should always set it explicitly. Not treated as a secret (sensitive: false).
api_key
string
The Weaviate API key. Required for Weaviate Cloud; optional for an unauthenticated self-hosted cluster. When present, ModuleX sends it to Weaviate as Authorization: Bearer <api_key>. Stored as a secret (sensitive: true).
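To make the two fields concrete, here is a minimal sketch of how decrypted auth_data could be turned into a base URL and request headers. The function name build_connection is hypothetical, and stripping a trailing slash is an assumption added for tidiness; the localhost fallback and the Bearer header follow the field descriptions above.

```python
DEFAULT_URL = "http://localhost:8080"  # fallback named in the url field description


def build_connection(auth_data):
    """Return (base_url, headers) for a Weaviate request.

    Hypothetical helper; auth_data is the decrypted credential payload.
    """
    # Assumption: drop a trailing slash so paths can be appended cleanly.
    base_url = (auth_data.get("url") or DEFAULT_URL).rstrip("/")
    headers = {"Content-Type": "application/json"}
    api_key = auth_data.get("api_key")
    if api_key:
        # Sent only when a key is stored; unauthenticated clusters get no header.
        headers["Authorization"] = f"Bearer {api_key}"
    return base_url, headers


cloud_url, cloud_headers = build_connection(
    {"url": "https://your-cluster.weaviate.cloud/", "api_key": "weaviate-api-key"}
)
local_url, local_headers = build_connection({})
print(cloud_url, sorted(cloud_headers))
print(local_url, sorted(local_headers))
```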

Create the credential

Send the connection fields under auth_data with auth_type set to custom. The response returns the new credential_id, which you then reference from the knowledge node.
curl -X POST https://api.modulex.dev/credentials \
  -H "Authorization: Bearer mx_live_xxx" \
  -H "X-Organization-ID: org_123" \
  -H "Content-Type: application/json" \
  -d '{
    "integration_name": "weaviate",
    "auth_type": "custom",
    "display_name": "Production Weaviate",
    "auth_data": {
      "url": "https://your-cluster.weaviate.cloud",
      "api_key": "weaviate-api-key-..."
    }
  }'
Send auth_type: "custom" explicitly. Without it, the backend infers the credential type from auth_data: a payload containing api_key is treated as a standard API-key credential, which is the wrong shape for Weaviate (it would drop the url). The explicit custom type keeps both fields together in one credential.
For the broader credential lifecycle — listing, testing, setting a default, and rotation — see Managing credentials.

Test the connection

Testing a Weaviate credential validates connectivity rather than running a search. ModuleX calls your cluster’s readiness probe at GET {url}/v1/.well-known/ready and, on success, reads GET {url}/v1/schema to count your classes.
Outcome | Meaning
status: "connected" plus classes_count | The cluster is reachable and the API key (if any) is valid.
401 from Weaviate | The api_key is missing or invalid for a cluster that requires authentication.
Connection refused / not ready | The url is wrong, the cluster is down, or it is not reachable from ModuleX.
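You can reproduce the same two-step check against your own cluster before storing the credential. The sketch below is an approximation, not ModuleX's implementation: check_connection, http_get, and the status labels other than "connected" are hypothetical names, while the two endpoint paths are the ones described above. The transport is injectable so the logic can be exercised without a live cluster.

```python
import json
import urllib.error
import urllib.request


def http_get(url, headers):
    """Default transport: returns (status_code, body_text)."""
    request = urllib.request.Request(url, headers=headers, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # Non-2xx responses arrive as exceptions; keep only the status code.
        return err.code, ""


def check_connection(base_url, headers, fetch=http_get):
    """Probe readiness, then count classes from the schema."""
    try:
        status, _ = fetch(f"{base_url}/v1/.well-known/ready", headers)
        if status == 200:
            status, body = fetch(f"{base_url}/v1/schema", headers)
    except OSError as err:
        # Covers refused connections, DNS failures, and timeouts.
        return {"status": "unreachable", "detail": str(err)}
    if status == 401:
        return {"status": "unauthorized"}
    if status != 200:
        return {"status": "not_ready", "http_status": status}
    classes = json.loads(body).get("classes") or []
    return {"status": "connected", "classes_count": len(classes)}


def stub_fetch(url, headers):
    """Stand-in transport that imitates a healthy cluster with one class."""
    if url.endswith("/v1/schema"):
        return 200, '{"classes": [{"class": "ProductDocs"}]}'
    return 200, ""


result = check_connection("https://your-cluster.weaviate.cloud", {}, fetch=stub_fetch)
print(result)
```

Dropping the fetch argument makes the function call the real cluster through http_get.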
🎬 MEDIA PLACEHOLDER · MX-MEDIA-4161 · [SCREENSHOT] [SCREENSHOT_DESCRIPTION]: The Weaviate connection form in the ModuleX credentials UI. [SCREENSHOT_DETAILS]: Capture the “Add credential” panel for Weaviate showing the URL field (required) and API Key field (optional, masked), with the Test connection result showing “connected” and a class count. Settings > Credentials in the live app; dark theme; crop to the form.

Configure the knowledge node

Once the credential exists, add a knowledge node to a workflow and point it at Weaviate. The node config (KnowledgeNodeConfig) is shared across all providers; the fields below are the ones that matter for Weaviate. Every node value supports {{nodeId.path}} references so a query can come from a previous step — see Variables & references.

Connection and provider

provider_type
string
default:"modulexdb"
Set to weaviate to route the node through the Weaviate adapter. The other accepted values are modulexdb, qdrant, pinecone, and mongodb_atlas.
credential_id
string
required
The credential_id of the Weaviate credential you created above.
collection_name
string
required
The Weaviate class to search (Weaviate’s term for a collection). Required for every external provider; the node raises a validation error if it is missing for a non-modulexdb provider.
namespace
string
Accepted on the node config for Pinecone-style providers. The Weaviate adapter ignores it — Weaviate has no namespace concept, so leave it unset.

Query

query
string
required
The search text. Supports {{nodeId.path}} references for dynamic queries. ModuleX embeds this text into a vector before calling Weaviate (see embedding configuration).
query_from_input
boolean
default:"false"
When true, the node uses the workflow input as the query instead of the query field, reading the first of query, question, input, user_input, or message found in run state.
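The lookup order for query_from_input can be sketched as below. This is an illustration only: resolve_query is a hypothetical name, the input keys are assumed to sit at the top level of run state, and treating empty values as "not found" is an assumption.

```python
INPUT_KEYS = ("query", "question", "input", "user_input", "message")


def resolve_query(config, run_state):
    """Pick the search text for a knowledge node (hypothetical helper)."""
    if config.get("query_from_input", False):
        # Take the first key, in the documented order, that has a value.
        for key in INPUT_KEYS:
            value = run_state.get(key)
            if value:
                return value
        return None
    return config.get("query")


from_input = resolve_query(
    {"query_from_input": True, "query": "ignored"},
    {"message": "fallback text", "question": "How do refunds work?"},
)
from_field = resolve_query({"query": "pricing tiers"}, {"question": "unused"})
print(from_input)
print(from_field)
```

Here question wins over message because it comes earlier in the documented order, and the second call ignores run state because query_from_input defaults to false.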

Retrieval settings

top_k
integer
default:"5"
Number of results to return. Range 1–50. Maps to the GraphQL limit argument.
min_score
number
default:"0.3"
Minimum similarity threshold, 0.0–1.0. For Weaviate the adapter converts this to a certainty floor as certainty = 1.0 - min_score (only when min_score > 0; otherwise certainty is 0.0 and no floor is applied). With the default min_score of 0.3, the floor is a certainty of 0.7. The returned score for each match is Weaviate’s certainty value.
max_tokens
integer
default:"2000"
Token budget for the assembled context string. Range 100–10000. Applies when output_format is context or both.
filters
object
Provider-specific filter conditions passed through to the adapter. For Weaviate these correspond to GraphQL where conditions.
document_ids
array
Restrict results to specific document IDs. Native (modulexdb) knowledge bases only — this filter is not applied to Weaviate.
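The mapping from these node settings to Weaviate arguments can be sketched as follows. This is a minimal illustration, assuming top_k is bounded to 1 through 50 and min_score to 0.0 through 1.0; the function name to_weaviate_args is hypothetical, and the real adapter may validate differently.

```python
def to_weaviate_args(top_k=5, min_score=0.3):
    """Translate node retrieval settings into GraphQL arguments (hypothetical helper)."""
    if not 1 <= top_k <= 50:
        raise ValueError("top_k must be between 1 and 50")
    if not 0.0 <= min_score <= 1.0:
        raise ValueError("min_score must be between 0.0 and 1.0")
    # Conversion described for min_score: no floor when min_score is 0.
    certainty = 1.0 - min_score if min_score > 0 else 0.0
    return {"limit": top_k, "certainty": certainty}


defaults = to_weaviate_args()
no_floor = to_weaviate_args(top_k=10, min_score=0.0)
print(defaults)
print(no_floor)
```

Because the floor is computed as 1.0 - min_score, raising min_score lowers the certainty floor sent to Weaviate, so check the resulting value when tuning the threshold.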

Output

output_format
string
default:"context"
How retrieved knowledge is returned: chunks (individual matches with metadata), context (a single formatted RAG string), or both.
include_metadata
boolean
default:"true"
Include each match’s properties (returned by Weaviate as object properties) in the chunk metadata.
include_source
boolean
default:"true"
Include source document headers in the formatted context string.

Embedding configuration

Weaviate retrieval in ModuleX always requires a query vector: the adapter reports requires_query_embedding() = true and raises the error "Weaviate requires query_vector for search" if none is supplied. ModuleX builds that vector from the node’s embedding_config before calling your cluster, so this object is required for Weaviate.
Use the same embedding model that produced the vectors stored in your Weaviate class. Mismatched models yield dimension errors or meaningless similarity scores. ModuleX does not read your class’s vectorizer config to infer the model; you declare it in embedding_config.
embedding_config
object
required
Embedding settings for generating the query vector.
Weaviate’s text2vec-style server-side text search (nearText) is not wired through this adapter. Although the catalog metadata for Weaviate lists a query_text parameter and mentions nearText, the live retrieval path uses nearVector only and rejects a search with no query_vector. Plan to provide vectors via embedding_config.

BYOK retrieval

A complete Weaviate knowledge node configuration, ready to drop into a workflow’s node definition config:
Knowledge node config (Weaviate)
{
  "provider_type": "weaviate",
  "credential_id": "a1b2c3d4-0000-0000-0000-000000000000",
  "collection_name": "ProductDocs",
  "query": "{{trigger.question}}",
  "top_k": 5,
  "min_score": 0.3,
  "output_format": "context",
  "include_metadata": true,
  "include_source": true,
  "embedding_config": {
    "integration_name": "openai",
    "provider_id": "openai",
    "model_id": "text-embedding-3-small",
    "credential_id": null
  }
}
When this node runs, ModuleX embeds {{trigger.question}}, issues the nearVector GraphQL query against the ProductDocs class, and writes the result to run state under the node’s own id. A downstream LLM node or Agent node can then reference the retrieved context with {{<knowledge_node_id>.context}}. To run the workflow itself over the API or an SDK, see Run via API and the Run a workflow guide. The run streams over SSE — see SSE run streaming.

What the adapter returns

Each match is normalized into a chunk. The score is Weaviate’s certainty; metadata carries the object’s returned properties; content is the adapter’s best-effort text extraction from those properties.
id
string
The Weaviate object id (_additional.id).
score
number
The match’s certainty from Weaviate (0–1, higher is better).
content
string
Text extracted from the object’s properties. The adapter checks common field names in order: content, text, chunk_text, page_content, data, body, summary, document. If none is present, content may be empty — name a text property accordingly in your class.
metadata
object
The object’s returned properties (everything except _additional). Present only when include_metadata is true.
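The normalization described by these fields can be approximated as follows. The helper normalize_match is hypothetical and only mirrors the documented behavior (field-name order, certainty as score, properties as metadata); requiring the content value to be a non-empty string is an assumption.

```python
CONTENT_FIELDS = (
    "content", "text", "chunk_text", "page_content",
    "data", "body", "summary", "document",
)


def normalize_match(obj, include_metadata=True):
    """Turn one Weaviate result object into a chunk dict (hypothetical helper)."""
    additional = obj.get("_additional", {})
    # Everything except _additional counts as the object's returned properties.
    properties = {key: value for key, value in obj.items() if key != "_additional"}
    content = ""
    for name in CONTENT_FIELDS:
        value = properties.get(name)
        if isinstance(value, str) and value:
            content = value
            break
    chunk = {
        "id": additional.get("id"),
        "score": additional.get("certainty"),
        "content": content,
    }
    if include_metadata:
        chunk["metadata"] = properties
    return chunk


match = {
    "title": "Refund policy",
    "chunk_text": "Refunds are issued within 14 days.",
    "summary": "Refund window",
    "_additional": {"id": "obj-1", "certainty": 0.82, "distance": 0.36},
}
chunk = normalize_match(match)
print(chunk["content"])
```

In this sample, chunk_text is picked over summary because it appears earlier in the field-name order. An object with none of the listed names yields an empty content string.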

Returning specific properties

The Weaviate adapter requests result properties for you. If your class stores its text under a non-default property name, set properties in the node filters/config path appropriate to your build, or ensure one of the recognized field names above exists so content is populated. The catalog query action documents these tunables:
class_name
string
required
The Weaviate class to search.
query_vector
array
The query embedding (nearVector). Supplied by ModuleX from embedding_config.
limit
integer
default:"5"
Maximum number of results.
certainty
number
default:"0.7"
Minimum certainty threshold (0–1). At run time this is derived from the node’s min_score.
properties
array
Properties to return in results. Defaults to an empty list, in which case the adapter returns the object id, certainty, and distance.
where
object
GraphQL where filter conditions.

Billing: BYOK retrieval is not metered

Searching a Weaviate cluster you own does not consume ModuleX credits. ModuleX meters retrieval only for managed (modulexdb) knowledge — a managed search reserves one retrieval credit before the embed and records it on success. BYOK providers, including Weaviate, are uncosted for retrieval. See Credits & metering and Knowledge & RAG.
The Weaviate search is uncosted, but the workflow run that contains the knowledge node is still subject to the run-admission billing gate. A run on the run / composer / assistant / knowledge-managed surfaces can be denied with a flat DenialEnvelope ({code, layer, key, current, limit, reason}) returned as 402, 403, or 429 when your plan allowance or wallet is exhausted. See Usage gating & limits and Errors & status codes. Embedding the query (via a managed embedding model) can itself consume credits even when the vector store is BYOK.
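A client that starts runs may want to tell a billing denial apart from other failures. The sketch below parses a response into the flat envelope shape named above. It is an assumption-laden illustration: parse_denial and the DenialEnvelope dataclass are hypothetical client-side names, the field types are guesses, and the sample values are invented for the example.

```python
import json
from dataclasses import dataclass

DENIAL_STATUSES = {402, 403, 429}
DENIAL_FIELDS = ("code", "layer", "key", "current", "limit", "reason")


@dataclass
class DenialEnvelope:
    # Field names come from the envelope description; the types are assumptions.
    code: str
    layer: str
    key: str
    current: float
    limit: float
    reason: str


def parse_denial(status_code, body_text):
    """Return a DenialEnvelope if the response looks like one, else None."""
    if status_code not in DENIAL_STATUSES:
        return None
    try:
        data = json.loads(body_text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not all(name in data for name in DENIAL_FIELDS):
        # Some other error body on the same status, such as a plain 403.
        return None
    return DenialEnvelope(**{name: data[name] for name in DENIAL_FIELDS})


# Invented sample values, for illustration only.
sample = json.dumps({
    "code": "example_code", "layer": "example_layer", "key": "example_key",
    "current": 100, "limit": 100, "reason": "Example reason text.",
})
denial = parse_denial(402, sample)
print(denial.reason if denial else "not a denial")
```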

Errors and edge cases

The adapter and node surface errors in two ways. Adapter exceptions become the node’s error field: the node returns empty chunks/context with an error rather than failing the whole run. Credential and config problems raise before the search runs.

Knowledge providers

Compare every vector store ModuleX can use for retrieval.

Knowledge node

The node that runs Weaviate retrieval inside a workflow.

External knowledge providers

How BYOK vector stores plug into ModuleX.

Knowledge & RAG

The retrieval model behind chats and workflows.