# KnowledgeBase
A KnowledgeBase is a vector database (e.g., Qdrant, Milvus, OpenSearch) that stores and indexes document embeddings for efficient real-time retrieval. It enables context-aware AI workflows through similarity search, which fetches the information most relevant to a given query. As a core component of Retrieval-Augmented Generation (RAG) pipelines, the KnowledgeBase augments queries with retrieved context, combining streaming data with AI-driven retrieval to produce accurate, contextual responses.
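Conceptually, the similarity search a KnowledgeBase performs reduces to ranking stored embeddings by their similarity to the query embedding. The following is a minimal pure-Python sketch of that idea using cosine similarity; the document IDs and three-dimensional vectors are toy values for illustration, not real model outputs (a production system would store high-dimensional embeddings in a vector database and use an approximate index such as HNSW):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy "index": document id -> embedding vector (placeholder values).
index = {
    "doc-payments": [0.9, 0.1, 0.0],
    "doc-shipping": [0.1, 0.9, 0.1],
    "doc-returns":  [0.2, 0.1, 0.9],
}

def search(query_embedding, top_k=2):
    # Rank all stored documents by similarity to the query and return the top k.
    scored = sorted(index.items(),
                    key=lambda kv: cosine_similarity(query_embedding, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:top_k]]

# A query embedding close to "doc-payments" retrieves it first.
print(search([0.85, 0.15, 0.05]))
```

A real deployment replaces the brute-force scan with an approximate-nearest-neighbor index (the `hnswConfig` settings in the sample below tune exactly such an index), but the retrieval contract is the same: embed the query, rank stored vectors, return the closest documents.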
## Sample YAML Configuration
```yaml
apiVersion: streams.nstream.ai/v1
kind: KnowledgeBase
metadata:
  name: "KNOWLEDGEBASE_NAME"
  namespace: "NAMESPACE"
spec:
  connectorRef: "SOURCE_CONNECTOR_DB"
  embeddingModelRef: "HF_EMBEDDING_MODEL"
  provider: Qdrant
  collectionTemplate:
    hnswConfig:
      edgesPerNode: 16
      efConstruct: 200
      fullScanThreshold: 1000
      maxIndexingThreads: 4
      onDisk: true
      payloadM: 32
```
| Key | Description | Example |
|---|---|---|
| `apiVersion` | Defines the API version for the KnowledgeBase configuration | `streams.nstream.ai/v1` |
| `kind` | Specifies the type of resource being configured | `KnowledgeBase` |
| `name` | The unique name of the KnowledgeBase instance | `my-knowledgebase` |
| `namespace` | Kubernetes namespace where the KnowledgeBase is deployed | `default` |
| `connectorRef` | Reference to the source connector database | `postgres-connector` |
| `embeddingModelRef` | Reference to the Hugging Face embedding model used for document embeddings | `sentence-transformers/all-MiniLM-L6-v2` |
| `provider` | The vector database provider for the KnowledgeBase | `Qdrant` |
| `collectionTemplate` | Configuration settings for the document collection in the vector database | See detailed structure below |
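Putting the example values from the table together, a filled-in manifest might look like the following. The connector name (`postgres-connector`) and embedding model are the illustrative values from the table above, not required names; substitute references that exist in your own namespace:

```yaml
apiVersion: streams.nstream.ai/v1
kind: KnowledgeBase
metadata:
  name: "my-knowledgebase"
  namespace: "default"
spec:
  connectorRef: "postgres-connector"
  embeddingModelRef: "sentence-transformers/all-MiniLM-L6-v2"
  provider: Qdrant
  collectionTemplate:
    hnswConfig:
      edgesPerNode: 16
      efConstruct: 200
      fullScanThreshold: 1000
      maxIndexingThreads: 4
      onDisk: true
      payloadM: 32
```

Assuming the KnowledgeBase custom resource definition is installed in your cluster, the manifest can be applied with `kubectl apply -f knowledgebase.yaml` like any other Kubernetes resource.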