Pinecone vector store (BYOK)

Pinecone is a managed vector database you can bring to ModuleX as an external knowledge provider. When you connect Pinecone, ModuleX queries your Pinecone index at retrieval time using a stored credential — your vectors stay in your Pinecone project, and Pinecone bills you directly. This is the bring-your-own-key (BYOK) model: unlike managed knowledge (modulexdb), BYOK retrieval through Pinecone is not metered in ModuleX credits. Use this page to store a Pinecone credential, configure a knowledge node to point at your index, and understand exactly how ModuleX issues the query.

Pinecone is a retrieval-only provider in ModuleX. ModuleX queries an index you already populated; it does not upload, chunk, embed, or ingest documents into Pinecone for you. Document ingest (parse, chunk, embed) only applies to managed knowledge bases. For Pinecone you own the indexing pipeline.

How BYOK retrieval works

At a high level, a Pinecone-backed retrieval inside a workflow runs in five steps:

Resolve the credential

The knowledge node reads its credential_id, loads the matching org credential, and decrypts the stored auth_data (your api_key and environment).

Embed the query

Pinecone stores vectors but does not embed text. ModuleX generates a query embedding first, using the node’s embedding_config (for example OpenAI text-embedding-3-small). If embedding_config is missing, the node fails before any call to Pinecone.

Call your index

ModuleX POSTs the query vector to your Pinecone index endpoint with your Api-Key header, requesting topK matches in the configured namespace, with any metadata filter applied.

Normalize the matches

Each Pinecone match is mapped into a standard result with id, score, content (pulled from common metadata text fields), and metadata. Matches below min_score are dropped client-side.

Format the output

Results are returned to the workflow as chunks, a formatted context string, or both, depending on the node’s output_format.

🎬 MEDIA PLACEHOLDER · MX-MEDIA-4140 · [IMAGE] [IMAGE_DESCRIPTION]: BYOK retrieval flow diagram for Pinecone. [IMAGE_DETAILS]: Horizontal flow: workflow knowledge node to “embed query (embedding_config)” to ModuleX backend to your Pinecone index, then matches flowing back and being normalized to chunks/context. Label the embedding step “billed only if the embedding provider is managed” and the Pinecone call “BYOK, not metered in ModuleX credits”. Light theme, 16:9, annotated.

Before you start

You will need the following from your own Pinecone account:

Pinecone prerequisites

A Pinecone API key. Pinecone keys look like pcsk_....
The environment for your project (for example us-east-1-aws or gcp-starter).
An existing, populated index whose vector dimension matches the embedding model you plan to use in ModuleX. A 1536-dimension index pairs with OpenAI text-embedding-3-small; mismatched dimensions cause Pinecone to reject the query.
The index namespace you want to search, if you use namespaces.

You also need the owner or admin role in your ModuleX organization. Credential and knowledge routes require one of these roles, and the retired member role cannot manage them. See roles & permissions.

Connect Pinecone

Connecting Pinecone means storing an encrypted credential in your organization. ModuleX persists it as a custom credential (auth_type is custom) under the pinecone integration, which the catalog classifies as a knowledge_provider.

Credential fields

Pinecone declares a single custom auth schema with two required fields. auth_data may also carry an optional index_host override:

api_key

string

required

Your Pinecone API key. Stored encrypted and never returned in plaintext. Sample format pcsk_.... This is the Api-Key header value ModuleX sends on every request to Pinecone.

environment

string

required

Your Pinecone environment, for example us-east-1-aws or gcp-starter. ModuleX uses it to build the Pinecone host URL (see Index URL resolution).

index_host

string

Optional. Not exposed in the connect UI. If present in auth_data, this exact URL overrides the environment-derived index host. Set it when your index host does not match the legacy environment-based pattern, for example for serverless indexes. See Index URL resolution for why this matters.

Connect in the app

Open knowledge providers

In the ModuleX app, go to the integrations area and select Pinecone under knowledge providers.

Enter your credentials

Paste your Pinecone API key and environment, give the credential a display name, and optionally mark it as the default for Pinecone.

Test the connection

ModuleX validates the credential by listing your indexes (GET https://api.pinecone.io/indexes). A 200 with an indexes field means the key works. This test is free.

🎬 MEDIA PLACEHOLDER · MX-MEDIA-4141 · [SCREENSHOT] [SCREENSHOT_DESCRIPTION]: The Pinecone connect form in the ModuleX app. [SCREENSHOT_DETAILS]: Show the Pinecone knowledge-provider credential form with the API key field (masked, placeholder pcsk_...), the environment field (placeholder us-east-1-aws), a display-name field, and a “test connection” action. Capture the success state after testing. Light theme.

Connect via the API

Create the credential with POST /credentials. The request body is read raw and the credential type is inferred from integration_name, auth_type, and auth_data. For Pinecone, send auth_type: custom with both fields nested under auth_data. ModuleX resolves integration_type to knowledge_provider from the catalog automatically. All requests authenticate with Authorization: Bearer mx_live_… plus the X-Organization-ID header. See authentication.

curl -X POST https://api.modulex.dev/credentials \
  -H "Authorization: Bearer mx_live_xxx" \
  -H "X-Organization-ID: org_123" \
  -H "Content-Type: application/json" \
  -d '{
    "integration_name": "pinecone",
    "auth_type": "custom",
    "display_name": "Production Pinecone",
    "make_default": true,
    "auth_data": {
      "api_key": "pcsk_xxx",
      "environment": "us-east-1-aws"
    }
  }'

Creating, listing, and querying Pinecone credentials are plain CRUD operations, so they return the standard {detail} error envelope on failure (for example a 400 "Invalid auth_data or auth_type…" if the body is malformed). The flat billing DenialEnvelope does not apply here. See errors & status codes for the full envelope reference. Pinecone credentials are stored with auth_type custom, so they appear in your credential list (only internal credentials are hidden).
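For callers who prefer a script to curl, the same request can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the helper names build_create_credential_request and error_detail are hypothetical, and the token, organization ID, and key values are the placeholders from the curl example.

```python
import json
import urllib.request

MODULEX_API = "https://api.modulex.dev"  # base URL taken from the curl example


def build_create_credential_request(token, org_id, api_key, environment,
                                    display_name=None, make_default=False):
    """Build (but do not send) the POST /credentials request for Pinecone."""
    body = {
        "integration_name": "pinecone",
        "auth_type": "custom",
        "make_default": make_default,
        "auth_data": {"api_key": api_key, "environment": environment},
    }
    if display_name is not None:
        body["display_name"] = display_name
    return urllib.request.Request(
        MODULEX_API + "/credentials",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": "Bearer " + token,
            "X-Organization-ID": org_id,
            "Content-Type": "application/json",
        },
    )


def error_detail(raw_body):
    """Return the message from a {detail} error envelope, or the raw text."""
    text = raw_body.decode("utf-8", errors="replace")
    try:
        payload = json.loads(text)
    except ValueError:
        return text  # not JSON: hand back whatever the server sent
    if isinstance(payload, dict) and "detail" in payload:
        return str(payload["detail"])
    return text
```

To send the request, pass it to urllib.request.urlopen. On a 4xx response urlopen raises urllib.error.HTTPError, whose read() method returns the body you can hand to error_detail.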

Create-credential request fields

integration_name

string

required

Must be pinecone. ModuleX looks this up in the integration catalog to resolve integration_type to knowledge_provider. An unknown name falls back to tool.

auth_type

string

required

Must be custom for Pinecone. This routes the request to the custom-credential path, which stores arbitrary auth_data fields.

auth_data

object

required

The Pinecone credential fields. Must not be empty. Carries api_key and environment (and optionally index_host).

display_name

string

A human-readable label for the credential. Defaults to Pinecone Custom Credential if omitted.

make_default

boolean

default:"false"

When true, sets this credential as the default for the pinecone integration, unsetting any prior default.

Test the saved credential

Validate a stored credential against Pinecone with POST /credentials/{credential_id}/test. ModuleX calls the integration’s test endpoint — GET https://api.pinecone.io/indexes — and reports whether the key is valid.

curl -X POST https://api.modulex.dev/credentials/cred_abc/test \
  -H "Authorization: Bearer mx_live_xxx" \
  -H "X-Organization-ID: org_123"

A successful test response has the shape {credential_id, is_valid, message, tested_at}.
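If you consume this response in a script, a small typed wrapper makes the four fields explicit. The sketch below is illustrative only: CredentialTestResult and parse_test_result are hypothetical names, and it assumes tested_at arrives as a string timestamp, which this page does not specify.

```python
from dataclasses import dataclass
from typing import Any, Dict

EXPECTED_FIELDS = ("credential_id", "is_valid", "message", "tested_at")


@dataclass
class CredentialTestResult:
    credential_id: str
    is_valid: bool
    message: str
    tested_at: str  # assumption: a string timestamp; the exact format is not documented here


def parse_test_result(payload: Dict[str, Any]) -> CredentialTestResult:
    """Validate the documented response shape and wrap it in a dataclass."""
    missing = [name for name in EXPECTED_FIELDS if name not in payload]
    if missing:
        raise ValueError("test response is missing fields: " + ", ".join(missing))
    return CredentialTestResult(
        credential_id=str(payload["credential_id"]),
        is_valid=bool(payload["is_valid"]),
        message=str(payload["message"]),
        tested_at=str(payload["tested_at"]),
    )
```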

Configure a knowledge node

Once the credential exists, point a knowledge node at your Pinecone index by setting provider_type to pinecone and supplying the index name, an embedding configuration, and the credential. Because Pinecone stores only vectors, the knowledge node requires an embedding_config, which ModuleX uses to embed the query text before calling Pinecone. If embedding_config is omitted, the node raises a validation error and never reaches Pinecone.

Knowledge node fields for Pinecone

credential_id

string

required

The credential_id of your stored Pinecone credential.

provider_type

string

default:"modulexdb"

required

Set to pinecone. Valid knowledge provider types are modulexdb, qdrant, pinecone, weaviate, and mongodb_atlas.

collection_name

string

required

The Pinecone index name to search. ModuleX calls this field collection_name; it corresponds to Pinecone’s index_name. Required for all external providers; the node fails with "collection_name is required for external provider: pinecone" if it is missing.

embedding_config

object

required

Embedding settings used to turn the query text into a vector. Required for Pinecone because Pinecone does not embed text. See Embedding configuration for fields. Use a model whose output dimension matches your Pinecone index.

query

string

required

The search query. Supports {{nodeId.path}} references so the query can come from an earlier node’s output.

query_from_input

boolean

default:"false"

When true, the workflow input is used as the query instead of the query field.

namespace

string

The Pinecone namespace to search within the index. Omit to search the default namespace.

top_k

integer

default:"5"

Number of matches to return. Range 1–50. Sent to Pinecone as topK.

min_score

number

default:"0.3"

Minimum similarity score, 0.0–1.0. Pinecone has no native score threshold, so ModuleX drops matches below this value after Pinecone returns them. The score is Pinecone’s raw match score for your index’s metric; ModuleX does not re-normalize it.

filters

object

A Pinecone metadata filter, passed through to Pinecone’s query filter field verbatim. Use Pinecone filter syntax, for example {"category": {"$eq": "docs"}}.

max_tokens

integer

default:"2000"

Token budget for the formatted context string. Range 100–10000. Only applies when output_format is context or both.

output_format

string

default:"context"

How results are returned: chunks (individual matches with metadata), context (a single RAG-ready string), or both.

include_metadata

boolean

default:"true"

Include each match’s metadata in the result. ModuleX requests includeMetadata from Pinecone accordingly.

include_source

boolean

default:"true"

Include source document headers in the formatted context output.

document_ids

array

Filter to specific document IDs. Native (modulexdb) KB only — ignored for Pinecone. Use filters for Pinecone-side metadata filtering instead.

Embedding configuration

The embedding_config object tells ModuleX which model to use when embedding the query. The model’s output dimension must match your Pinecone index dimension.

integration_name

string

required

The embedding provider integration, for example openai or cohere.

provider_id

string

required

The provider identifier, for example openai.

model_id

string

required

The embedding model id, for example text-embedding-3-small (1536 dimensions) or text-embedding-3-large (3072 dimensions).

credential_id

string

Credential for the embedding provider. If null, ModuleX uses your organization’s default credential for that integration.

Dimension mismatch is the most common Pinecone failure. If your index was created at 1536 dimensions, the embedding_config.model_id must produce 1536-dimension vectors. A mismatch produces a Pinecone query error surfaced as a 500 with the message "Pinecone query failed: …".
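A pre-flight check can catch a mismatch before a workflow runs. The sketch below is a hypothetical helper, not part of ModuleX: it compares the two model dimensions listed on this page against an index dimension you supply, for example the dimension value returned by the describe_index action. Models outside the table are treated as unknown rather than guessed.

```python
from typing import Optional

# Output dimensions for the two models named on this page.
MODEL_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def dimension_mismatch(model_id: str, index_dimension: int) -> Optional[str]:
    """Return a description of the mismatch, or None if none can be shown."""
    expected = MODEL_DIMENSIONS.get(model_id)
    if expected is None:
        return None  # unknown model: nothing to compare against
    if expected != index_dimension:
        return (
            f"{model_id} produces {expected}-dimension vectors, "
            f"but the index expects {index_dimension}"
        )
    return None
```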

Example knowledge node configuration

A workflow knowledge node configured for Pinecone looks like this:

Pinecone knowledge node config

{
  "credential_id": "cred_abc",
  "provider_type": "pinecone",
  "collection_name": "product-docs",
  "namespace": "v1",
  "query": "{{trigger.question}}",
  "top_k": 5,
  "min_score": 0.3,
  "filters": { "category": { "$eq": "docs" } },
  "output_format": "context",
  "embedding_config": {
    "integration_name": "openai",
    "provider_id": "openai",
    "model_id": "text-embedding-3-small",
    "credential_id": null
  }
}
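The field rules above can be checked locally before you save a node. The following is a rough sketch under stated assumptions, not ModuleX's own validator: validate_pinecone_node is a hypothetical name, the ranges and defaults are copied from the field reference on this page, and treating query as optional when query_from_input is true is an assumption.

```python
from typing import Any, Dict, List


def validate_pinecone_node(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a Pinecone knowledge node config."""
    errors: List[str] = []
    if config.get("provider_type") != "pinecone":
        errors.append("provider_type must be pinecone")
    if not config.get("credential_id"):
        errors.append("credential_id is required")
    if not config.get("collection_name"):
        errors.append("collection_name is required for external provider: pinecone")
    embedding = config.get("embedding_config")
    if not embedding:
        errors.append("embedding_config is required for pinecone")
    else:
        for key in ("integration_name", "provider_id", "model_id"):
            if not embedding.get(key):
                errors.append("embedding_config." + key + " is required")
    # Assumption: query may be omitted when the workflow input supplies it.
    if not config.get("query") and not config.get("query_from_input", False):
        errors.append("query is required unless query_from_input is true")
    if not 1 <= config.get("top_k", 5) <= 50:
        errors.append("top_k must be between 1 and 50")
    if not 0.0 <= config.get("min_score", 0.3) <= 1.0:
        errors.append("min_score must be between 0.0 and 1.0")
    if not 100 <= config.get("max_tokens", 2000) <= 10000:
        errors.append("max_tokens must be between 100 and 10000")
    if config.get("output_format", "context") not in ("chunks", "context", "both"):
        errors.append("output_format must be chunks, context, or both")
    return errors
```

Run against the example configuration above, this sketch returns an empty list.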

🎬 MEDIA PLACEHOLDER · MX-MEDIA-4142 · [SCREENSHOT] [SCREENSHOT_DESCRIPTION]: Knowledge node inspector configured for Pinecone. [SCREENSHOT_DETAILS]: Capture the knowledge node inspector with provider type set to Pinecone, the index (collection) name, namespace, top_k, min_score, and the embedding model selector populated. Highlight that an embedding model is required. Light theme.

How ModuleX queries Pinecone

When the knowledge node runs, ModuleX builds and sends the Pinecone query directly. Understanding the exact request helps when debugging connectivity or 404s.

Request to Pinecone

ModuleX POSTs to {index_url}/query with these headers:

Headers sent to Pinecone

Api-Key: <your api_key>
Content-Type: application/json

and this body:

Pinecone query body

{
  "vector": [0.0123, -0.0456, "..."],
  "topK": 5,
  "includeMetadata": true,
  "includeValues": false,
  "namespace": "v1",
  "filter": { "category": { "$eq": "docs" } }
}

namespace and filter are only included when set. The request times out after 30 seconds.
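The body construction described above can be sketched as follows. This is an illustration of the documented shape, not ModuleX source code; build_query_body is a hypothetical name, and it mirrors the rule that namespace and filter are left out when unset.

```python
from typing import Any, Dict, Optional, Sequence


def build_query_body(vector: Sequence[float], top_k: int = 5,
                     include_metadata: bool = True,
                     namespace: Optional[str] = None,
                     filters: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Assemble the JSON body for POST {index_url}/query."""
    body: Dict[str, Any] = {
        "vector": list(vector),
        "topK": top_k,                       # node field top_k
        "includeMetadata": include_metadata,  # node field include_metadata
        "includeValues": False,
    }
    if namespace:
        body["namespace"] = namespace  # omitted when searching the default namespace
    if filters:
        body["filter"] = filters       # passed through in Pinecone filter syntax
    return body
```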

Index URL resolution

ModuleX builds the index host in this order:

Index host resolution rules

If your credential auth_data contains index_host, ModuleX uses that exact URL.
Otherwise, ModuleX derives the host from the environment as https://{index_name}-{environment}.svc.pinecone.io.

The connection test and list_indexes instead call the controller URL https://controller.{environment}.pinecone.io/indexes, while the saved-credential test uses https://api.pinecone.io/indexes.

The environment-derived host (https://{index_name}-{environment}.svc.pinecone.io) and the controller host (https://controller.{environment}.pinecone.io) follow Pinecone’s legacy pod-based URL conventions. Pinecone serverless indexes use a different host. If your queries return connection errors or 404 despite a valid key, set index_host in auth_data to the exact host shown in your Pinecone console.
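The resolution order above is small enough to express directly. The sketch below is a hypothetical reimplementation for debugging, not ModuleX source code; it assumes index_host is stored as a full URL including the scheme, since this page describes it as an exact URL.

```python
from typing import Dict


def resolve_index_url(auth_data: Dict[str, str], index_name: str) -> str:
    """Pick the index host: an explicit index_host wins, else the legacy pattern."""
    index_host = auth_data.get("index_host")
    if index_host:
        return index_host  # used exactly as stored
    environment = auth_data["environment"]
    return f"https://{index_name}-{environment}.svc.pinecone.io"


def query_url(auth_data: Dict[str, str], index_name: str) -> str:
    """The endpoint the query is POSTed to."""
    return resolve_index_url(auth_data, index_name) + "/query"
```

Printing query_url for your own auth_data and index name, then comparing it with the host shown in the Pinecone console, is a quick way to tell whether you need to set index_host.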

Normalized result

ModuleX maps each Pinecone match into a standard result object before formatting:

id

string

The Pinecone match id (the vector id).

score

number

Pinecone’s raw similarity score for the match. Matches below min_score are dropped before this point.

content

string

Text extracted from the match metadata. ModuleX checks common metadata fields in order: content, text, chunk_text, page_content, data, body, summary, document. If none are present, content is empty — store your chunk text under one of these keys in Pinecone metadata so retrieved chunks carry usable text.

metadata

object

The match metadata, included when include_metadata is true.

vector

array

The stored vector values, included only when vector values are requested.

These results are then shaped into chunks, context, or both per the node’s output_format.
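To see how the content lookup and the min_score cut interact, the mapping can be approximated in a few lines. This is a sketch of the documented behavior, not ModuleX source code: normalize_matches is a hypothetical name, and skipping metadata keys that hold empty values is an assumption.

```python
from typing import Any, Dict, List

# Metadata keys checked for chunk text, in the documented order.
CONTENT_KEYS = ("content", "text", "chunk_text", "page_content",
                "data", "body", "summary", "document")


def normalize_matches(matches: List[Dict[str, Any]], min_score: float = 0.3,
                      include_metadata: bool = True) -> List[Dict[str, Any]]:
    """Map raw Pinecone matches to {id, score, content, metadata} results."""
    results = []
    for match in matches:
        score = match.get("score", 0.0)
        if score < min_score:
            continue  # dropped client-side; Pinecone has no native threshold
        metadata = match.get("metadata") or {}
        content = ""
        for key in CONTENT_KEYS:
            value = metadata.get(key)
            if value:  # assumption: empty values are skipped
                content = str(value)
                break
        result: Dict[str, Any] = {"id": match.get("id"), "score": score, "content": content}
        if include_metadata:
            result["metadata"] = metadata
        if match.get("values"):
            result["vector"] = match["values"]  # only present when values were requested
        results.append(result)
    return results
```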

Billing

BYOK retrieval through Pinecone is not metered in ModuleX credits. Only knowledge bases whose embedding provider is managed (modulexdb / modulexai) reserve retrieval credits. Two cost notes still apply:

Pinecone bills you directly for queries and storage on your own account.
The query embedding may be metered. If your embedding_config uses a managed embedding model, generating the query vector is billed in ModuleX credits like any other managed model call. Using a BYOK embedding credential avoids that. See credits & metering.

Errors and edge cases

Invalid Pinecone API key

A bad or revoked key returns 401 from Pinecone. ModuleX surfaces this as an authentication error ("Invalid Pinecone API key"). Re-test the credential with POST /credentials/{credential_id}/test and rotate the key in Pinecone if needed.

Index not found (404)

If Pinecone returns 404 for the query, ModuleX raises "Index not found: <index_name>". Verify the collection_name matches an index in your project, and check the Index URL resolution rules — a serverless index on the legacy host pattern is the usual cause. Set index_host to fix it.

Missing embedding_config

Pinecone requires a pre-computed query vector. Without embedding_config, the node raises "embedding_config required for providers that don't handle embeddings…" and never calls Pinecone. Add an embedding_config whose dimension matches your index.

Missing collection_name

The node raises "collection_name is required for external provider: pinecone". Set collection_name to your index name.

Dimension mismatch or other query failure

Any other Pinecone error (for example a vector-dimension mismatch) surfaces as a 500 with "Pinecone query failed: <pinecone message>". Confirm your embedding model dimension equals the index dimension.

Empty content in results

Pinecone returns vector ids, scores, and metadata — not separate text. If retrieved chunks have empty content, store the chunk text in metadata under one of the recognized keys (content, text, chunk_text, page_content, data, body, summary, document).
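For quick triage, the status-to-message mapping described in this section can be summarized as a lookup. The sketch below is illustrative only: describe_pinecone_error is a hypothetical helper, and the hint strings are paraphrased from the guidance above rather than produced by ModuleX.

```python
from typing import Tuple


def describe_pinecone_error(status: int, index_name: str,
                            pinecone_message: str = "") -> Tuple[str, str]:
    """Return (surfaced message, suggested next step) for a failed query."""
    if status == 401:
        return (
            "Invalid Pinecone API key",
            "Re-test the credential and rotate the key in Pinecone if needed.",
        )
    if status == 404:
        return (
            f"Index not found: {index_name}",
            "Check collection_name, then set index_host if the index is serverless.",
        )
    # Anything else, including a vector-dimension mismatch.
    return (
        f"Pinecone query failed: {pinecone_message}",
        "Confirm the embedding model dimension equals the index dimension.",
    )
```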

Provider actions reference

Beyond workflow retrieval, the Pinecone integration declares three catalog actions, surfaced through the knowledge-provider catalog and the tool node. Authentication is the same custom API-key schema described above.

query — vector similarity search

Performs a vector similarity search on a Pinecone index.

Parameters

index_name

string

required

Name of the Pinecone index to search.

namespace

string

Namespace within the index.

query_vector

array

required

Query embedding vector.

top_k

integer

default:"5"

Number of results to return.

filter

object

Metadata filter conditions.

include_metadata

boolean

default:"true"

Include metadata in results.

include_values

boolean

default:"false"

Include vector values in results.

Output

matches

array

Array of matches, each with id, score, metadata, and values.

namespace

string

The namespace searched.

list_indexes — list project indexes

Lists all indexes in your Pinecone project. Takes no parameters.

Output: an array of indexes, each with name, dimension, metric, and status.

describe_index — index statistics

Returns statistics about one index.

Parameters

index_name

string

required

Name of the index.

Output

dimension

integer

The index vector dimension.

total_vector_count

integer

Total vectors in the index.

namespaces

object

Per-namespace vector counts.

Related pages

Knowledge providers overview

Compare managed and BYOK vector stores and see all supported providers.

Knowledge node

Configure retrieval inside a workflow.

Managing credentials

Create, rotate, and scope credentials in the app and via API.

Knowledge & RAG

How ModuleX retrieves company knowledge end to end.
