KnowledgeBase

A KnowledgeBase is a vector database (e.g., Qdrant, Milvus, OpenSearch) that stores and indexes document embeddings for efficient real-time retrieval. It supports context-aware AI workflows by running similarity search to fetch the information most relevant to a given query. As a key component of Retrieval-Augmented Generation (RAG) pipelines, the KnowledgeBase supplies the retrieved context used for query augmentation, combining streaming data with embedding-based retrieval so that responses are more accurate and contextual.

Sample YAML Configuration

apiVersion: streams.nstream.ai/v1
kind: KnowledgeBase
metadata:
  name: "KNOWLEDGEBASE_NAME"
  namespace: "NAMESPACE"
spec:
  connectorRef: "SOURCE_CONNECTOR_DB"
  embeddingModelRef: "HF_EMBEDDING_MODEL"
  provider: Qdrant
  collectionTemplate:
    hnswConfig:
      edgesPerNode: 16
      efContruct: 200
      fullScanThreshold: 1000
      maxIndexingThreads: 4
      onDisk: true
      payloadM: 32

| Key | Description | Example |
| --- | --- | --- |
| apiVersion | Defines the API version for KnowledgeBase configuration | streams.nstream.ai/v1 |
| kind | Specifies the type of resource being configured | KnowledgeBase |
| name | The unique name of the KnowledgeBase instance | my-knowledgebase |
| namespace | Kubernetes namespace where the KnowledgeBase is deployed | default |
| connectorRef | Reference to the source connector database | postgres-connector |
| embeddingModelRef | Reference to the Hugging Face embedding model used for document embeddings | sentence-transformers/all-MiniLM-L6-v2 |
| provider | The vector database provider for the KnowledgeBase | Qdrant |
| collectionTemplate | Configuration settings for the document collection in the vector database | See detailed structure below |
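
To make the table concrete, here is a sketch of the sample configuration with its placeholders replaced by the values from the Example column. It is illustrative only: the values are examples rather than recommended settings, the hnswConfig numbers are copied unchanged from the sample above, and the nesting of those fields under hnswConfig is an assumption based on the sample's layout.

```yaml
# Illustrative manifest: placeholder values replaced with the
# entries from the Example column of the table above.
apiVersion: streams.nstream.ai/v1
kind: KnowledgeBase
metadata:
  name: "my-knowledgebase"
  namespace: "default"
spec:
  connectorRef: "postgres-connector"
  embeddingModelRef: "sentence-transformers/all-MiniLM-L6-v2"
  provider: Qdrant
  collectionTemplate:
    # Assumption: all index settings nest under hnswConfig,
    # as in the sample configuration. Field names and values
    # are copied from that sample without change.
    hnswConfig:
      edgesPerNode: 16
      efContruct: 200
      fullScanThreshold: 1000
      maxIndexingThreads: 4
      onDisk: true
      payloadM: 32
```

The connectorRef and embeddingModelRef values are references, so a connector and an embedding model with these names would presumably need to exist already for the configuration to resolve.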