# KnowledgeBase
A KnowledgeBase is a vector database (e.g., Qdrant, Milvus, OpenSearch) that stores and indexes document embeddings for efficient real-time retrieval. It enables context-aware AI workflows through similarity search, which fetches the information most relevant to a given query. As a core component of Retrieval-Augmented Generation (RAG) pipelines, the KnowledgeBase augments queries with retrieved context, combining streaming data with AI-driven retrieval to produce accurate, contextual responses.
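Conceptually, the similarity search a KnowledgeBase performs reduces to ranking stored embeddings by their similarity to the query embedding. The following is a minimal pure-Python sketch of that idea using cosine similarity; the document IDs and three-dimensional vectors are toy values for illustration, not real model outputs (a production system would store high-dimensional embeddings in a vector database and use an approximate index such as HNSW):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy "index": document id -> embedding vector (placeholder values).
index = {
    "doc-payments": [0.9, 0.1, 0.0],
    "doc-shipping": [0.1, 0.9, 0.1],
    "doc-returns":  [0.2, 0.1, 0.9],
}

def search(query_embedding, top_k=2):
    # Rank all stored documents by similarity to the query and return the top k.
    scored = sorted(index.items(),
                    key=lambda kv: cosine_similarity(query_embedding, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:top_k]]

# A query embedding close to "doc-payments" retrieves it first.
print(search([0.85, 0.15, 0.05]))
```

A real deployment replaces the brute-force scan with an approximate-nearest-neighbor index (the `hnswConfig` settings in the sample below tune exactly such an index), but the retrieval contract is the same: embed the query, rank stored vectors, return the closest documents.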
## Sample YAML Configuration
```yaml
apiVersion: streams.nstream.ai/v1
kind: KnowledgeBase
metadata:
  name: "KNOWLEDGEBASE_NAME"
  namespace: "NAMESPACE"
spec:
  connectorRef: "SOURCE_CONNECTOR_DB"
  embeddingModelRef: "HF_EMBEDDING_MODEL"
  provider: Qdrant
  collectionTemplate:
    hnswConfig:
      edgesPerNode: 16
      efConstruct: 200
      fullScanThreshold: 1000
      maxIndexingThreads: 4
      onDisk: true
      payloadM: 32
```
| Key | Description | Example |
|---|---|---|
| `apiVersion` | Defines the API version for the KnowledgeBase configuration | `streams.nstream.ai/v1` |
| `kind` | Specifies the type of resource being configured | `KnowledgeBase` |
| `name` | The unique name of the KnowledgeBase instance | `my-knowledgebase` |
| `namespace` | Kubernetes namespace where the KnowledgeBase is deployed | `default` |
| `connectorRef` | Reference to the source connector database | `postgres-connector` |
| `embeddingModelRef` | Reference to the Hugging Face embedding model used for document embeddings | `sentence-transformers/all-MiniLM-L6-v2` |
| `provider` | The vector database provider for the KnowledgeBase | `Qdrant` |
| `collectionTemplate` | Configuration settings for the document collection in the vector database | See detailed structure below |
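Putting the example values from the table together, a filled-in manifest might look like the following. The connector name (`postgres-connector`) and embedding model are the illustrative values from the table above, not required names; substitute references that exist in your own namespace:

```yaml
apiVersion: streams.nstream.ai/v1
kind: KnowledgeBase
metadata:
  name: "my-knowledgebase"
  namespace: "default"
spec:
  connectorRef: "postgres-connector"
  embeddingModelRef: "sentence-transformers/all-MiniLM-L6-v2"
  provider: Qdrant
  collectionTemplate:
    hnswConfig:
      edgesPerNode: 16
      efConstruct: 200
      fullScanThreshold: 1000
      maxIndexingThreads: 4
      onDisk: true
      payloadM: 32
```

Assuming the KnowledgeBase custom resource definition is installed in your cluster, the manifest can be applied with `kubectl apply -f knowledgebase.yaml` like any other Kubernetes resource.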