xVault-RAG: Encrypted Enterprise RAG
Split-storage retrieval-augmented generation. Vector embeddings and source documents are XorIDA-split across independent nodes so no single node holds a searchable knowledge base. Information-theoretic protection for AI context retrieval.
Executive Summary
xVault-RAG brings information-theoretic security to retrieval-augmented generation. Every vector embedding and source document is XorIDA-split into K-of-N shares distributed across independent storage nodes — no single node holds enough data to reconstruct any document or its semantic representation.
Five modules handle the complete RAG pipeline:
- redact.ts strips PII before ingestion.
- embed.ts generates vector embeddings on-device (no data leaves the boundary).
- vault.ts performs XorIDA threshold sharing across all embeddings and documents.
- rag.ts orchestrates chunking, embedding, similarity search, and context assembly.
- store.ts provides a SQLite-backed PersistentKnowledgeBase with WAL mode for concurrent reads during background indexing.
When you query the knowledge base, the search step retrieves K shares per chunk, verifies them, and reconstructs the embeddings to compute cosine similarity rankings. It then returns the top-k chunks and assembles the context. Stored data stays split throughout: an attacker who compromises any single storage node, or any set of fewer than K shares, learns nothing about the corpus. This does not rest on computational hardness; recovering the data from too few shares is mathematically impossible.
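To make the "a single share reveals nothing" claim concrete, here is a minimal sketch of XOR-based secret splitting. It is not XorIDA, whose K-of-N construction this document does not specify; it is the simpler n-of-n case in which every share is required. The names `splitXor` and `combineXor` are illustrative assumptions.

```typescript
import { randomBytes } from "node:crypto";

// Split `secret` into n shares; ALL n are needed to recover it (n-of-n).
// Illustrative only: this is not the XorIDA K-of-N algorithm.
function splitXor(secret: Uint8Array, n: number): Uint8Array[] {
  if (n < 2) throw new Error("need at least 2 shares");
  const shares: Uint8Array[] = [];
  const last = Uint8Array.from(secret); // running XOR, starts as a copy of the secret
  for (let i = 0; i < n - 1; i++) {
    const pad = new Uint8Array(randomBytes(secret.length)); // uniformly random share
    for (let j = 0; j < last.length; j++) last[j] ^= pad[j];
    shares.push(pad);
  }
  shares.push(last); // the secret XORed with every random pad
  return shares;
}

// XOR all shares together to recover the secret.
function combineXor(shares: Uint8Array[]): Uint8Array {
  const out = new Uint8Array(shares[0].length);
  for (const s of shares) {
    for (let j = 0; j < out.length; j++) out[j] ^= s[j];
  }
  return out;
}
```

Each of the first n-1 shares is pure random data, and the last is the secret masked by all of them, so any proper subset of shares is statistically independent of the secret.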
101 tests passing (75 core + 26 persistence). Production-ready SQLite storage with Write-Ahead Logging enables background indexing without blocking search queries. RAG latency overhead: 10-55ms for the split-search-reconstruct pipeline, with typical deployments adding 15-25ms.
Developer Experience
xVault-RAG provides progress tracking and 30+ structured error codes to help developers build reliable, debuggable RAG systems with split-storage security.
Progress Callbacks
Both ingest() and search() operations support onProgress callbacks for tracking long-running operations, especially useful for large document corpora.
```typescript
const kb = new PersistentKnowledgeBase({
  dbPath: './knowledge.db',
  threshold: { k: 2, n: 3 }
});

// Ingest with progress tracking
await kb.ingest(documents, {
  onProgress: async (event) => {
    switch (event.stage) {
      case 'redacting': console.log('Redacting PII...'); break;
      case 'embedding': console.log(`Embedding chunk ${event.current}/${event.total}...`); break;
      case 'splitting': console.log('XorIDA splitting...'); break;
      case 'storing': console.log('Writing to SQLite...'); break;
      case 'complete': console.log('Ingestion complete'); break;
    }
  }
});

// Search with progress tracking
const results = await kb.search(query, {
  topK: 5,
  onProgress: async (event) => {
    if (event.stage === 'reconstructing') {
      console.log(`Reconstructing ${event.current} chunks...`);
    }
  }
});
```
Structured Error Handling
xVault-RAG uses a Result<T, E> pattern with detailed error structures. Every error includes a machine-readable code, human-readable message, actionable hint, and documentation URL.
```typescript
interface ErrorDetail {
  code: string;      // e.g., 'INVALID_THRESHOLD'
  message: string;   // Human-readable description
  hint?: string;     // Actionable fix suggestion
  field?: string;    // Field that caused the error
  docsUrl?: string;  // Link to documentation
}

// Example: handle ingestion errors
const result = await kb.ingest(documents);
if (!result.ok) {
  const { code, message, hint } = result.error.details;
  console.error(`${code}: ${message}`);
  if (hint) console.log(`Hint: ${hint}`);
}
```
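The Result<T, E> type itself is not defined in this document. A minimal sketch of one common shape, assuming the `ok` and `error` fields used in the handling example and a hypothetical `value` field for the success case, could look like this:

```typescript
// Hypothetical shape: `ok` and `error` appear in the example above;
// `value`, `ok()`, `err()` and `unwrapOr()` are assumptions for illustration.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

const ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const err = <E>(error: E): Result<never, E> => ({ ok: false, error });

// Unwrap a Result, or fall back to a default without throwing.
function unwrapOr<T, E>(result: Result<T, E>, fallback: T): T {
  return result.ok ? result.value : fallback;
}
```

Because the union is discriminated on `ok`, TypeScript narrows to the success or failure branch after a single `if (!result.ok)` check, which is what makes the handling pattern above type-safe.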
Common Error Codes
| Code | Trigger | Fix |
|---|---|---|
| INVALID_THRESHOLD | k > n or k < 2 | Set k ≤ n, k ≥ 2 |
| EMPTY_DOCUMENT | Document text is empty | Provide non-empty text field |
| RECONSTRUCTION_FAILED | HMAC verification failed | Check share integrity, may indicate tampering |
| EMBEDDING_FAILED | Model initialization error | Verify embedding model availability |
| DB_ERROR | SQLite operation failed | Check database permissions and disk space |
The Problem
Enterprise RAG systems store sensitive corporate documents as vector embeddings in centralized databases, creating a single point of data exfiltration.
Vector databases contain semantic representations of every document in the corpus. A breach exposes not just the documents but their relationships, making corporate knowledge extraction trivial.
Encrypting the vector database at rest doesn't help — the database must be decrypted to perform similarity search. There is no standard way to search encrypted embeddings without decrypting them first.
The Old Way
Use Cases
- Build RAG pipelines over sensitive corporate documents without centralized exposure. No single node holds searchable embeddings. (Zero Single-Point)
- HIPAA-compliant RAG over patient records and clinical guidelines. PII redaction + split-storage embeddings. (HIPAA)
- Attorney-client privileged document search without exposure to cloud providers. Information-theoretic protection. (Privilege)
- RAG over proprietary trading research without centralized storage. Each share is mathematically meaningless alone. (SEC 17a-4)
Architecture
xVault-RAG provides a drop-in replacement for standard vector stores with XorIDA split-storage and threshold-reconstructed similarity search.
The New Way
Core Modules
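The implementation consists of five main modules (redact.ts, embed.ts, vault.ts, rag.ts, and store.ts), each providing a critical piece of the RAG pipeline. A sixth file, errors.ts, holds the shared error hierarchy.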
Persistence Layer
xVault-RAG uses SQLite with Write-Ahead Logging (WAL mode) for durable split-storage:
| Feature | Implementation | Benefit |
|---|---|---|
| Storage backend | SQLite 3.x with WAL mode | ACID guarantees, concurrent readers during writes |
| Share storage | Indexed by (docId, chunkIndex, shareIndex) | Fast retrieval, no full table scans |
| HMAC tags | Stored with each share | Per-share integrity verification before reconstruction |
| Atomic operations | Transaction-wrapped ingest/delete | No partial writes, clean rollback on failure |
| Concurrent access | WAL allows readers during writes | Background indexing doesn't block search |
| Database size | ~3x source corpus (K-of-N shares + embeddings) | Acceptable for enterprise RAG (GB-scale corpora) |
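The per-share HMAC check in the table can be sketched with Node's built-in crypto module. This is an illustration under assumptions: the document does not say where the HMAC key comes from or how tags are encoded, and `tagShare` and `verifyShare` are hypothetical names.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute an HMAC-SHA256 tag over one share.
// Key management is an assumption; the document does not describe it.
function tagShare(key: Uint8Array, share: Uint8Array): Uint8Array {
  return new Uint8Array(createHmac("sha256", key).update(share).digest());
}

// Verify the stored tag in constant time before the share is used.
function verifyShare(key: Uint8Array, share: Uint8Array, tag: Uint8Array): boolean {
  const expected = tagShare(key, share);
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === tag.length && timingSafeEqual(expected, tag);
}
```

A share whose tag fails verification would be rejected before reconstruction, which corresponds to the HMAC_MISMATCH and RECONSTRUCTION_FAILED cases in the error tables.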
RAG Pipeline
Complete Flow
End-to-end RAG pipeline from document ingestion to context retrieval.
Ingestion Flow
- Redaction — PII patterns (SSN, email, phone, credit card) stripped from document text
- Chunking — Document split into 512-token chunks with 128-token overlap for context continuity
- Embedding — Each chunk embedded on-device using local model (384-768 dimensions)
- XorIDA Split — Embedding vector + chunk text split into K-of-N shares with HMAC-SHA256 tags
- Storage — Shares written to SQLite with atomic transaction (all-or-nothing)
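The chunking step above can be sketched as a sliding window over a pre-tokenized array. This is a minimal illustration, not the code in rag.ts; `chunkTokens` is a hypothetical name and tokenization is left out of scope.

```typescript
// Split a token sequence into fixed-size windows that overlap by `overlap` tokens.
// `tokens` is any pre-tokenized array; the tokenizer itself is not shown.
function chunkTokens<T>(tokens: T[], chunkSize = 512, overlap = 128): T[][] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("require 0 <= overlap < chunkSize");
  }
  const stride = chunkSize - overlap; // 384 with the defaults
  const chunks: T[][] = [];
  for (let start = 0; start < tokens.length; start += stride) {
    chunks.push(tokens.slice(start, start + chunkSize));
    if (start + chunkSize >= tokens.length) break; // this window reached the end
  }
  return chunks;
}
```

With the defaults, a 1,000-token document yields three chunks starting at tokens 0, 384, and 768; the last one is shorter than 512 tokens.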
Search Flow
- Query Embedding — User query embedded on-device using same model as corpus
- Share Retrieval — Fetch K shares for each chunk from SQLite (concurrent reads via WAL)
- HMAC Verification — Verify integrity of each share before reconstruction
- Reconstruction — XOR K shares to recover original embedding vector
- Similarity Ranking — Compute cosine similarity between query and reconstructed embeddings
- Top-K Selection — Return highest-ranked chunks
- Context Assembly — Concatenate chunk texts into final context for LLM
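The ranking steps in the search flow reduce to cosine similarity plus a sort. The sketch below assumes the embeddings have already been reconstructed and verified; `cosineSimilarity`, `topK`, and `ScoredChunk` are illustrative names, not the library's API.

```typescript
interface ScoredChunk {
  chunkIndex: number;
  score: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Rank reconstructed embeddings against the query and keep the best `k`.
function topK(query: number[], embeddings: number[][], k = 5): ScoredChunk[] {
  return embeddings
    .map((e, chunkIndex) => ({ chunkIndex, score: cosineSimilarity(query, e) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

This is a brute-force scan over every chunk; the document does not say whether the real implementation uses an index.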
Integration
```typescript
import { PersistentKnowledgeBase } from '@private.me/vaultrag';

// Create persistent knowledge base with SQLite + WAL mode
const kb = new PersistentKnowledgeBase({
  dbPath: './knowledge.db',
  threshold: { k: 2, n: 3 }
});

// Ingest documents (automatic chunking, embedding, XorIDA split)
await kb.ingest([
  { id: 'doc1', text: 'Corporate policy...' },
  { id: 'doc2', text: 'Technical manual...' }
]);

// Search with threshold reconstruction
const results = await kb.search('What is the security policy?', { topK: 5 });

// Close database connection
await kb.close();
```
Key API Methods
Security Properties
| Property | Mechanism | Guarantee |
|---|---|---|
| Embedding storage | XorIDA K-of-N split | ✓ Information-theoretic |
| Document storage | XorIDA K-of-N split | ✓ Information-theoretic |
| PII protection | Regex-based redaction | ✓ Pre-ingestion stripping |
| Integrity | HMAC-SHA256 per-share | ✓ Tamper detection |
| On-device embedding | Local model execution | ✓ No data exfiltration |
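Regex-based redaction, as listed in the table, can be sketched as an ordered list of patterns. The patterns below are illustrative assumptions (US-centric and far from exhaustive), not the rules in redact.ts, and replacing matches with a bracketed label rather than deleting them is also a choice made for this sketch.

```typescript
// Illustrative patterns only; the real redact.ts rules are not shown in this document.
// Order matters: the longest digit pattern runs first.
const PII_PATTERNS: Array<[string, RegExp]> = [
  ["CREDIT_CARD", /\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b/g],
  ["SSN", /\b\d{3}-\d{2}-\d{4}\b/g],
  ["PHONE", /\b\d{3}[-. ]\d{3}[-. ]\d{4}\b/g],
  ["EMAIL", /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g],
];

// Replace every match of every pattern with its label, e.g. "[SSN]".
function redactPii(text: string): string {
  return PII_PATTERNS.reduce(
    (acc, [label, pattern]) => acc.replace(pattern, `[${label}]`),
    text,
  );
}
```

Pattern matching of this kind catches well-formatted identifiers only; names, addresses, and free-text identifiers would pass through unchanged.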
Benchmarks
Performance characteristics measured on Node.js 22, Apple M2. xVault-RAG adds 10-55ms to standard RAG pipelines while achieving information-theoretic protection of all retrieved context.
Test Coverage
xVault-RAG has 101 tests passing across the full implementation:
- 75 core tests — redaction, embedding, RAG pipeline, vault splitting, threshold reconstruction
- 26 persistence tests — SQLite storage, WAL mode, concurrent reads, atomic operations, database integrity
All tests verify both correctness (HMAC verification, reconstruction accuracy) and performance (sub-millisecond splitting, minimal search overhead).
| Operation | Time | Notes |
|---|---|---|
| Document chunk + XorIDA split | <1ms | Per chunk (typical 512 tokens) |
| Embedding generation (on-device) | ~10-50ms | Model-dependent (MiniLM, BGE, etc.) |
| SQLite insert (WAL mode) | ~2ms | HMAC-tagged share insertion |
| Vector similarity search | ~5ms | Top-k retrieval from split embeddings |
| Chunk reconstruct (per result) | ~35µs | HMAC verify + XOR reconstruction |
| HMAC verification per chunk | <0.1ms | Integrity check before reconstruction |
| Context assembly (5 chunks) | <1ms | Concatenate reconstructed chunks |
| Full RAG pipeline | ~20-60ms | Query → embed → search → reconstruct → assemble |
| Database close + WAL flush | ~10ms | Clean shutdown with checkpoint |
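As a rough cross-check, the per-operation figures in the table can be summed into a search-path estimate. The sketch below copies the numbers from the table, treats "<1ms" as a 0-1 ms range, and assumes the steps run sequentially; `searchLatency` is a hypothetical helper.

```typescript
interface Range {
  min: number; // milliseconds
  max: number; // milliseconds
}

// Figures copied from the benchmark table; "<1ms" is modelled as 0-1 ms.
const queryEmbedding: Range = { min: 10, max: 50 };
const similaritySearch: Range = { min: 5, max: 5 };
const reconstructPerChunkMs = 0.035; // ~35 µs per result
const contextAssembly: Range = { min: 0, max: 1 };

// Sum the sequential steps of one search for `topK` returned chunks.
function searchLatency(topK: number): Range {
  const reconstruct = topK * reconstructPerChunkMs;
  return {
    min: queryEmbedding.min + similaritySearch.min + reconstruct + contextAssembly.min,
    max: queryEmbedding.max + similaritySearch.max + reconstruct + contextAssembly.max,
  };
}
```

For five results this gives roughly 15-56 ms, in line with the table's ~20-60 ms full-pipeline row; query embedding dominates, and reconstruction contributes well under a millisecond.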
RAG Architecture Comparison
| Property | Plaintext RAG | Encrypted-at-rest RAG | xVault-RAG |
|---|---|---|---|
| Data exposure during retrieval | Full plaintext in memory | Decrypted for search | Shares at rest; embeddings and chunks reconstructed in memory only during a query |
| Breach impact | Complete corpus leak | Full data if key stolen | Individual share is meaningless |
| Key management | None needed | Required (KMS) | No keys — split IS security |
| Quantum resistance | No | Computational only (AES-256 is weakened, not broken, by Grover's algorithm) | Information-theoretic |
| Pipeline overhead | Baseline | +5-10ms (decrypt) | +10-55ms (split + search + reconstruct) |
Honest Limitations
Five known limitations documented transparently. xVault-RAG provides practical encrypted retrieval with explicit trade-offs.
| Limitation | Impact | Mitigation |
|---|---|---|
| RAG latency overhead ~10-55ms | The split-search-reconstruct pipeline adds 10-55ms compared to plaintext RAG. Latency-critical chatbots (<100ms target) may notice this. | 55ms is the worst case (large embedding model + many chunks). Typical deployments add 15-25ms — well within interactive response budgets. The LLM generation step (500ms-5s) dominates total latency. |
| On-device embedding quality | Privacy-preserving embeddings must be generated on-device using smaller models (MiniLM, BGE-small). These produce lower-quality vectors than cloud models (OpenAI ada-002, Cohere). | On-device models achieve 85-95% of cloud model quality for domain-specific retrieval. Fine-tuning on domain data closes the gap further. The privacy guarantee (embeddings never leave the device) outweighs the quality trade-off. |
| No cross-document optimization | Each document is split independently. Cross-document deduplication, shared embeddings, or corpus-level optimization is not supported. | Per-document splitting ensures isolation — revoking one document does not affect others. Corpus-level optimization would require shared state that undermines the security model. |
| Chunk size trade-off | Smaller chunks improve retrieval precision but increase the number of split operations. Larger chunks reduce operations but dilute retrieval quality. | Default 512-token chunks balance precision and performance. Configurable chunk size allows tuning per use case. A 128-token overlap (a 384-token stride at the default chunk size) maintains context continuity. |
| SQLite database size overhead | Persistent storage requires ~3x source corpus size (K-of-N shares + embeddings + HMAC tags + indexes). A 1 GB corpus becomes ~3 GB on disk. | WAL checkpointing keeps the write-ahead log from growing unbounded. Stock SQLite does not compress data, so further reduction depends on a compression extension or a compressed filesystem. For very large corpora (>100 GB), consider distributed storage backends (Postgres, S3). The 3x overhead is acceptable for typical enterprise knowledge bases (1-50 GB). |
Ship Proofs, Not Source
xVault-RAG generates cryptographic proofs of correct execution without exposing proprietary algorithms. Verify integrity using zero-knowledge proofs — no source code required.
- Tier 1 HMAC (~0.7KB)
- Tier 2 Commit-Reveal (~0.5KB)
- Tier 3 IT-MAC (~0.3KB)
- Tier 4 KKW ZK (~0.4KB)
Read the xProve white paper →
Developer Reference
Deep technical documentation for integration and debugging.
Error Hierarchy
xVault-RAG uses a structured error taxonomy with actionable hints.
```typescript
class VaultError extends Error {
  public readonly details: ErrorDetail;

  constructor(message: string, code: string, hint?: string, field?: string) {
    super(message);
    this.details = {
      code,
      message,
      hint,
      field,
      docsUrl: `https://private.me/docs/xvault-rag#${code}`
    };
  }
}
```
Error Categories
- Configuration Errors — INVALID_THRESHOLD, INVALID_CHUNK_SIZE, MISSING_DB_PATH
- Ingestion Errors — EMPTY_DOCUMENT, REDACTION_FAILED, EMBEDDING_FAILED, SPLIT_FAILED
- Storage Errors — DB_ERROR, WRITE_FAILED, TRANSACTION_FAILED
- Search Errors — RECONSTRUCTION_FAILED, HMAC_MISMATCH, QUERY_EMBEDDING_FAILED
- Validation Errors — MISSING_DOCUMENT_ID, INVALID_SEARCH_PARAMS, INVALID_TOP_K
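A caller that wants to react differently per category (for example, retry storage errors but surface configuration errors immediately) could keep a lookup like the one below. The codes and groupings come from the list above; the `categoryOf` helper and the lowercase category names are assumptions for illustration.

```typescript
type ErrorCategory = "configuration" | "ingestion" | "storage" | "search" | "validation";

// Codes grouped exactly as in the category list above.
const ERROR_CATEGORIES: Record<ErrorCategory, readonly string[]> = {
  configuration: ["INVALID_THRESHOLD", "INVALID_CHUNK_SIZE", "MISSING_DB_PATH"],
  ingestion: ["EMPTY_DOCUMENT", "REDACTION_FAILED", "EMBEDDING_FAILED", "SPLIT_FAILED"],
  storage: ["DB_ERROR", "WRITE_FAILED", "TRANSACTION_FAILED"],
  search: ["RECONSTRUCTION_FAILED", "HMAC_MISMATCH", "QUERY_EMBEDDING_FAILED"],
  validation: ["MISSING_DOCUMENT_ID", "INVALID_SEARCH_PARAMS", "INVALID_TOP_K"],
};

// Return the category for an error code, or undefined if the code is unknown.
function categoryOf(code: string): ErrorCategory | undefined {
  for (const [category, codes] of Object.entries(ERROR_CATEGORIES)) {
    if (codes.includes(code)) return category as ErrorCategory;
  }
  return undefined;
}
```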
Full API Surface
Complete TypeScript interface for PersistentKnowledgeBase.
```typescript
interface PersistentKnowledgeBase {
  /** Ingest documents with automatic PII redaction, chunking, embedding, and XorIDA split-storage */
  ingest(documents: Document[], opts?: IngestOptions): Promise<Result<void, VaultError>>;

  /** Search for semantically similar chunks with threshold reconstruction */
  search(query: string, opts?: SearchOptions): Promise<Result<SearchResult[], VaultError>>;

  /** Delete all chunks and shares for a document (atomic) */
  delete(docId: string): Promise<Result<void, VaultError>>;

  /** Close database connection (flush WAL, release locks) */
  close(): Promise<void>;
}

interface Document {
  id: string;                      // Unique document identifier
  text: string;                    // Document content (will be redacted + chunked)
  metadata?: Record<string, any>;  // Optional metadata (not embedded)
}

interface IngestOptions {
  chunkSize?: number;              // Token count per chunk (default: 512)
  overlap?: number;                // Token overlap between chunks (default: 128)
  onProgress?: ProgressCallback;
}

interface SearchOptions {
  topK?: number;                   // Number of results to return (default: 5)
  onProgress?: ProgressCallback;
}

interface SearchResult {
  docId: string;                   // Source document ID
  chunkIndex: number;              // Chunk position in document
  text: string;                    // Reconstructed chunk text
  score: number;                   // Cosine similarity score (0-1)
  metadata?: Record<string, any>;  // Document metadata
}
```
Error Taxonomy
Complete list of error codes with triggers and fixes.
| Code | Trigger | Fix |
|---|---|---|
| INVALID_THRESHOLD | k > n or k < 2 | Set k ≤ n, k ≥ 2 |
| INVALID_CHUNK_SIZE | chunkSize < 64 or > 2048 | Use 64-2048 token chunks |
| MISSING_DB_PATH | dbPath not provided | Provide a dbPath for the SQLite database file (e.g. './knowledge.db') |
| EMPTY_DOCUMENT | Document text is empty | Provide non-empty text field |
| MISSING_DOCUMENT_ID | Document id is missing | Provide unique id for each document |
| REDACTION_FAILED | PII redaction threw error | Check document encoding |
| EMBEDDING_FAILED | Model initialization error | Verify embedding model availability |
| SPLIT_FAILED | XorIDA split threw error | Check threshold configuration |
| DB_ERROR | SQLite operation failed | Check database permissions and disk space |
| WRITE_FAILED | Share write to SQLite failed | Check transaction state |
| TRANSACTION_FAILED | Atomic transaction aborted | Check WAL mode enabled |
| RECONSTRUCTION_FAILED | HMAC verification failed | Check share integrity, may indicate tampering |
| HMAC_MISMATCH | Share HMAC tag mismatch | Re-ingest document if corruption suspected |
| QUERY_EMBEDDING_FAILED | Query embedding generation error | Check query string format |
| INVALID_SEARCH_PARAMS | SearchOptions validation failed | Check topK range (1-100) |
| INVALID_TOP_K | topK < 1 or > 100 | Use topK in range 1-100 |
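The range checks in the table are simple enough to run before any database or model work starts. The sketch below mirrors the documented triggers for three of the codes; the `validateConfig` function, its input shape, and the `field` values are hypothetical.

```typescript
interface ConfigIssue {
  code: string;
  field: string;
}

// Checks mirror the triggers in the table above; the function itself is hypothetical.
function validateConfig(cfg: {
  k: number;
  n: number;
  chunkSize?: number;
  topK?: number;
}): ConfigIssue[] {
  const issues: ConfigIssue[] = [];
  if (cfg.k > cfg.n || cfg.k < 2) {
    issues.push({ code: "INVALID_THRESHOLD", field: "threshold" });
  }
  if (cfg.chunkSize !== undefined && (cfg.chunkSize < 64 || cfg.chunkSize > 2048)) {
    issues.push({ code: "INVALID_CHUNK_SIZE", field: "chunkSize" });
  }
  if (cfg.topK !== undefined && (cfg.topK < 1 || cfg.topK > 100)) {
    issues.push({ code: "INVALID_TOP_K", field: "topK" });
  }
  return issues;
}
```

Returning every issue at once, rather than stopping at the first, lets a caller report all configuration problems in a single pass.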
Codebase Stats
xVault-RAG is production-ready TypeScript with comprehensive test coverage.
Module Breakdown
| Module | Purpose | Tests |
|---|---|---|
| redact.ts | PII redaction (SSN, email, phone, credit card) | 15 |
| embed.ts | On-device vector embedding generation | 12 |
| rag.ts | RAG pipeline orchestration (chunk, embed, search, assemble) | 18 |
| vault.ts | XorIDA split-storage and threshold reconstruction | 20 |
| store.ts | SQLite persistence with WAL mode + atomic operations | 26 |
| errors.ts | Structured error hierarchy with actionable hints | 10 |
Deployment Options
SaaS (Recommended)
Fully managed infrastructure. Call our REST API, we handle scaling, updates, and operations.
- Zero infrastructure setup
- Automatic updates
- 99.9% uptime SLA
- Enterprise SLA available
SDK Integration
Embed directly in your application. Runs in your codebase with full programmatic control.
npm install @private.me/xvault-rag
- TypeScript/JavaScript SDK
- Full source access
- Enterprise support available
On-Premise (Upon Request)
Enterprise CLI for compliance, air-gap, or data residency requirements.
- Complete data sovereignty
- Air-gap capable deployment
- Custom SLA + dedicated support
- Professional services included
Enterprise On-Premise Deployment
While xVault-RAG is primarily delivered as SaaS or SDK, we build dedicated on-premise infrastructure for customers with:
- Regulatory mandates — HIPAA, SOX, FedRAMP, CMMC requiring self-hosted processing
- Air-gapped environments — SCIF, classified networks, offline operations
- Data residency requirements — EU GDPR, China data laws, government mandates
- Custom integration needs — Embed in proprietary platforms, specialized workflows
Includes: Enterprise CLI, Docker/Kubernetes orchestration, RBAC, audit logging, and dedicated support.