PRIVATE.ME · Technical White Paper

xVault-RAG: Encrypted Enterprise RAG

Split-storage retrieval-augmented generation. Vector embeddings and source documents are XorIDA-split across independent nodes so no single node holds a searchable knowledge base. Information-theoretic protection for AI context retrieval.

v0.2.0 · 101 tests passing · 5 modules · SQLite + WAL · <1ms split · $9.86B TAM
Section 01

Executive Summary

xVault-RAG brings information-theoretic security to retrieval-augmented generation. Every vector embedding and source document is XorIDA-split into K-of-N shares distributed across independent storage nodes — no single node holds enough data to reconstruct any document or its semantic representation.

Five modules handle the complete RAG pipeline: redact.ts strips PII before ingestion. embed.ts generates vector embeddings on-device (no data leaves the boundary). vault.ts performs XorIDA threshold sharing across all embeddings and documents. rag.ts orchestrates chunking, embedding, similarity search, and context assembly. store.ts provides a SQLite-backed PersistentKnowledgeBase with WAL mode for concurrent reads during background indexing.

When you query the knowledge base, similarity search reconstructs embeddings from K of the N shares in memory, ranks chunks by cosine similarity, returns the top-k chunks, and assembles the context. Data at rest stays split throughout. An attacker who compromises any single storage node learns nothing about the corpus. This is not a matter of computational hardness: a single share carries no information about the original data.
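
To make the single-share claim concrete, here is a minimal sketch of plain n-of-n XOR secret sharing. It is not XorIDA (this paper does not specify the K-of-N construction), and `xorSplit` and `xorCombine` are hypothetical names. It only illustrates why one share on its own is indistinguishable from random bytes.

```typescript
import { randomBytes } from "node:crypto";

// Illustration only: plain n-of-n XOR secret sharing, not the XorIDA K-of-N scheme.
// All n shares are required here; xorSplit and xorCombine are hypothetical names.
function xorSplit(secret: Uint8Array, n: number): Uint8Array[] {
  if (!Number.isInteger(n) || n < 2) {
    throw new RangeError("n must be an integer >= 2");
  }
  const shares: Uint8Array[] = [];
  const last = Uint8Array.from(secret); // running XOR of the secret and every pad
  for (let i = 0; i < n - 1; i++) {
    const pad = new Uint8Array(randomBytes(secret.length)); // uniformly random share
    for (let j = 0; j < last.length; j++) {
      last[j] ^= pad[j];
    }
    shares.push(pad);
  }
  shares.push(last); // secret XOR pad_1 XOR ... XOR pad_(n-1)
  return shares;
}

// XOR every share together to recover the original bytes.
function xorCombine(shares: Uint8Array[]): Uint8Array {
  if (shares.length === 0) {
    throw new RangeError("at least one share is required");
  }
  const out = new Uint8Array(shares[0].length);
  for (const share of shares) {
    for (let j = 0; j < out.length; j++) {
      out[j] ^= share[j];
    }
  }
  return out;
}
```

Each pad is uniformly random, and the final share is the secret masked by all of the pads, so any proper subset of shares is statistically independent of the secret. A threshold scheme such as the one described here additionally lets any K of the N shares reconstruct.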

101 tests passing (75 core + 26 persistence). Production-ready SQLite storage with Write-Ahead Logging enables background indexing without blocking search queries. Typical RAG latency overhead: 10-55ms for split-search-reconstruct pipeline.

Section 02

Developer Experience

xVault-RAG provides progress tracking and 15+ structured error codes to help developers build reliable, debuggable RAG systems with split-storage security.

Progress Callbacks

Both ingest() and search() operations support onProgress callbacks for tracking long-running operations, especially useful for large document corpora.

Progress tracking example
const kb = new PersistentKnowledgeBase({
  dbPath: './knowledge.db',
  threshold: { k: 2, n: 3 }
});

// Ingest with progress tracking
await kb.ingest(documents, {
  onProgress: async (event) => {
    switch (event.stage) {
      case 'redacting':
        console.log('Redacting PII...');
        break;
      case 'embedding':
        console.log(`Embedding chunk ${event.current}/${event.total}...`);
        break;
      case 'splitting':
        console.log('XorIDA splitting...');
        break;
      case 'storing':
        console.log('Writing to SQLite...');
        break;
      case 'complete':
        console.log('Ingestion complete');
        break;
    }
  }
});

// Search with progress tracking
const results = await kb.search(query, {
  topK: 5,
  onProgress: async (event) => {
    if (event.stage === 'reconstructing') {
      console.log(`Reconstructing ${event.current} chunks...`);
    }
  }
});

Structured Error Handling

xVault-RAG uses a Result<T, E> pattern with detailed error structures. Every error includes a machine-readable code, human-readable message, actionable hint, and documentation URL.

Error detail structure
interface ErrorDetail {
  code: string;         // e.g., 'INVALID_THRESHOLD'
  message: string;      // Human-readable description
  hint?: string;        // Actionable fix suggestion
  field?: string;       // Field that caused the error
  docsUrl?: string;     // Link to documentation
}

// Example: handle ingestion errors
const result = await kb.ingest(documents);
if (!result.ok) {
  const { code, message, hint } = result.error.details;
  console.error(`${code}: ${message}`);
  if (hint) console.log(`Hint: ${hint}`);
}

Common Error Codes

| Code | Trigger | Fix |
| --- | --- | --- |
| INVALID_THRESHOLD | k > n or k < 2 | Set k ≤ n, k ≥ 2 |
| EMPTY_DOCUMENT | Document text is empty | Provide non-empty text field |
| RECONSTRUCTION_FAILED | HMAC verification failed | Check share integrity, may indicate tampering |
| EMBEDDING_FAILED | Model initialization error | Verify embedding model availability |
| DB_ERROR | SQLite operation failed | Check database permissions and disk space |
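
As an illustration of how a code such as INVALID_THRESHOLD might be produced, the sketch below validates a K-of-N configuration and returns a Result-style value. The `Result` shape (in particular the `value` field), the `Threshold` type, and `validateThreshold` are assumptions made for this example, not the library's published API.

```typescript
// Hypothetical shapes modeled on the ErrorDetail and Result<T, E> usage shown above.
interface ErrorDetail {
  code: string;
  message: string;
  hint?: string;
  field?: string;
}

type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

interface Threshold {
  k: number; // shares required to reconstruct
  n: number; // total shares produced
}

// Accepts only integer thresholds with 2 <= k <= n, matching the table above.
function validateThreshold(t: Threshold): Result<Threshold, ErrorDetail> {
  const valid =
    Number.isInteger(t.k) && Number.isInteger(t.n) && t.k >= 2 && t.k <= t.n;
  if (valid) {
    return { ok: true, value: t };
  }
  return {
    ok: false,
    error: {
      code: "INVALID_THRESHOLD",
      message: `Invalid threshold: k=${t.k}, n=${t.n}`,
      hint: "Set k ≤ n, k ≥ 2",
      field: "threshold",
    },
  };
}
```

Returning the error as a value rather than throwing keeps the failure path visible in the type signature, which is the property the Result pattern is meant to provide.
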
Section 03

The Problem

Enterprise RAG systems store sensitive corporate documents as vector embeddings in centralized databases, creating a single point of data exfiltration.

Vector databases contain semantic representations of every document in the corpus. A breach exposes not just the documents but their relationships, making corporate knowledge extraction trivial.

Encrypting the vector database at rest doesn't help — the database must be decrypted to perform similarity search. There is no standard way to search encrypted embeddings without decrypting them first.

The Old Way

[Diagram: an AI model or agent queries an unprotected vector database inside a single trust boundary. The database has full corpus access and is a single point of failure, so a breach exposes 100% of the data.]
Section 04

Use Cases

🔍
Enterprise AI
Secure Corporate RAG

Build RAG pipelines over sensitive corporate documents without centralized exposure. No single node holds searchable embeddings.

Zero Single-Point
🏥
Healthcare
Medical Knowledge Base

HIPAA-compliant RAG over patient records and clinical guidelines. PII redaction + split-storage embeddings.

HIPAA
⚖️
Legal
Legal Research RAG

Attorney-client privileged document search without exposure to cloud providers. Information-theoretic protection.

Privilege
🏢
Finance
Trading Intelligence

RAG over proprietary trading research without centralized storage. Each share is mathematically meaningless alone.

SEC 17a-4
Section 05

Architecture

xVault-RAG provides a drop-in replacement for standard vector stores with XorIDA split-storage and threshold-reconstructed similarity search.

The New Way

[Diagram: an input document is XorIDA-split into K-of-N shares stored on Node A (share 1), Node B (share 2), through Node N (share N). Reconstruction requires threshold K.]

Core Modules

The implementation consists of five main modules, each providing a critical piece of the RAG pipeline:

redact.ts — PII Redaction
Strips sensitive information (SSNs, emails, phone numbers, credit cards) before document ingestion. Ensures vector embeddings contain no raw PII.
embed.ts — On-Device Embeddings
Generates vector embeddings using local models (MiniLM, BGE). No data leaves the device. Produces 384-768 dimensional vectors for semantic search.
rag.ts — Retrieval Pipeline
Orchestrates the full RAG flow: document chunking, embedding generation, similarity search, context assembly. Supports top-k retrieval with configurable chunk size.
vault.ts — Split Storage
XorIDA-splits embeddings and documents across K-of-N storage nodes. HMAC-SHA256 integrity verification per share. Threshold reconstruction on retrieval.
store.ts — Persistent Knowledge Base
SQLite-backed storage with WAL mode for concurrent reads. Stores XorIDA-split embeddings and documents. Supports atomic ingestion, search, and deletion.
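
The redaction step can be pictured as a small set of pattern substitutions applied before chunking. The sketch below is an assumption-laden illustration: the regular expressions, the placeholder tokens, and the `redactPII` name are hypothetical and cover only simple US-style formats, not the patterns redact.ts actually uses.

```typescript
// Hypothetical patterns for illustration; real-world PII detection needs broader coverage.
const PII_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: "[EMAIL]", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  // 16-digit card numbers, optionally grouped in fours by spaces or dashes
  { label: "[CARD]", pattern: /\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b/g },
  // US Social Security numbers in 3-2-4 form
  { label: "[SSN]", pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
  // US phone numbers in 3-3-4 form
  { label: "[PHONE]", pattern: /\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b/g },
];

// Apply each pattern in order, replacing matches with a placeholder token.
function redactPII(text: string): string {
  let redacted = text;
  for (const { label, pattern } of PII_PATTERNS) {
    redacted = redacted.replace(pattern, label);
  }
  return redacted;
}
```

Card numbers are replaced before the shorter SSN and phone patterns so that a long digit run is never partially matched by a narrower rule.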

Persistence Layer

xVault-RAG uses SQLite with Write-Ahead Logging (WAL mode) for durable split-storage:

| Feature | Implementation | Benefit |
| --- | --- | --- |
| Storage backend | SQLite 3.x with WAL mode | ACID guarantees, concurrent readers during writes |
| Share storage | Indexed by (docId, chunkIndex, shareIndex) | Fast retrieval, no full table scans |
| HMAC tags | Stored with each share | Per-share integrity verification before reconstruction |
| Atomic operations | Transaction-wrapped ingest/delete | No partial writes, clean rollback on failure |
| Concurrent access | WAL allows readers during writes | Background indexing doesn't block search |
| Database size | ~3x source corpus (K-of-N shares + embeddings) | Acceptable for enterprise RAG (GB-scale corpora) |

RAG Pipeline

[Diagram: Ingest → Redact PII → Embed (on-device) → XorIDA split (K-of-N) → Store (SQLite + WAL) → Search → Reconstruct]
Key Security Properties
No single node holds a searchable embedding or complete document. Similarity search requires threshold reconstruction. HMAC-SHA256 integrity verification per share prevents tampering.
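
Per-share integrity checking with HMAC-SHA256 can be sketched with Node's built-in crypto module. How xVault-RAG derives or stores the HMAC key is not described in this paper, so the key handling here is an assumption, and `tagShare` and `verifyShare` are hypothetical helper names.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute an HMAC-SHA256 tag over the raw bytes of one share.
// The key source is an assumption; the paper does not say where it comes from.
function tagShare(key: Uint8Array, share: Uint8Array): Buffer {
  return createHmac("sha256", key).update(share).digest();
}

// Recompute the tag and compare in constant time before using the share.
function verifyShare(key: Uint8Array, share: Uint8Array, tag: Uint8Array): boolean {
  const expected = tagShare(key, share);
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return tag.length === expected.length && timingSafeEqual(expected, tag);
}
```

A share whose tag fails verification would be rejected before reconstruction, which is the behavior the RECONSTRUCTION_FAILED and HMAC_MISMATCH codes describe.
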
Section 06

Complete Flow

End-to-end RAG pipeline from document ingestion to context retrieval.

Ingestion Flow

  1. Redaction — PII patterns (SSN, email, phone, credit card) stripped from document text
  2. Chunking — Document split into 512-token chunks with 128-token overlap for context continuity
  3. Embedding — Each chunk embedded on-device using local model (384-768 dimensions)
  4. XorIDA Split — Embedding vector + chunk text split into K-of-N shares with HMAC-SHA256 tags
  5. Storage — Shares written to SQLite with atomic transaction (all-or-nothing)
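
The chunking step above (512-token chunks with a 128-token overlap) implies a stride of 384 tokens between chunk starts. A minimal sketch, assuming tokens are already available as an array of strings; the `chunkTokens` name and the pre-tokenized input are assumptions, since the paper does not describe the tokenizer.

```typescript
// Split a token sequence into fixed-size windows that overlap by `overlap` tokens.
// chunkTokens is a hypothetical helper; tokenization itself is out of scope here.
function chunkTokens(tokens: string[], chunkSize = 512, overlap = 128): string[][] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("require chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const stride = chunkSize - overlap; // 384 with the defaults
  const chunks: string[][] = [];
  for (let start = 0; start < tokens.length; start += stride) {
    chunks.push(tokens.slice(start, start + chunkSize));
    // Stop once a window has reached the end, to avoid a trailing duplicate chunk.
    if (start + chunkSize >= tokens.length) break;
  }
  return chunks;
}
```

With the defaults, a 1,000-token document yields three chunks starting at tokens 0, 384, and 768, and each chunk shares its first 128 tokens with the end of the previous one.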

Search Flow

  1. Query Embedding — User query embedded on-device using same model as corpus
  2. Share Retrieval — Fetch K shares for each chunk from SQLite (concurrent reads via WAL)
  3. HMAC Verification — Verify integrity of each share before reconstruction
  4. Reconstruction — XOR K shares to recover original embedding vector
  5. Similarity Ranking — Compute cosine similarity between query and reconstructed embeddings
  6. Top-K Selection — Return highest-ranked chunks
  7. Context Assembly — Concatenate chunk texts into final context for LLM
THRESHOLD RECONSTRUCTION
Reconstruction happens in-memory only. No plaintext embeddings or documents ever persist to disk. SQLite stores only XorIDA shares.
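
Steps 5 and 6 of the search flow (similarity ranking and top-k selection) can be sketched as follows, operating on embeddings that have already been reconstructed in memory. The `EmbeddedChunk` and `ScoredChunk` types and the function names are assumptions for illustration.

```typescript
// Hypothetical types: a reconstructed embedding and its ranking score.
interface EmbeddedChunk {
  chunkId: string;
  embedding: number[];
}

interface ScoredChunk {
  chunkId: string;
  score: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|), defined as 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new RangeError("embedding dimensions must match");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every chunk against the query and keep the k highest-scoring ones.
function topK(query: number[], chunks: EmbeddedChunk[], k: number): ScoredChunk[] {
  return chunks
    .map((c) => ({ chunkId: c.chunkId, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

This is a brute-force scan over all reconstructed embeddings; whether the library uses an index or another search strategy is not stated in this paper.
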
Section 07

Integration

Quick Start
import { PersistentKnowledgeBase } from '@private.me/vaultrag';

// Create persistent knowledge base with SQLite + WAL mode
const kb = new PersistentKnowledgeBase({
  dbPath: './knowledge.db',
  threshold: { k: 2, n: 3 }
});

// Ingest documents (automatic chunking, embedding, XorIDA split)
await kb.ingest([
  { id: 'doc1', text: 'Corporate policy...' },
  { id: 'doc2', text: 'Technical manual...' }
]);

// Search with threshold reconstruction
const results = await kb.search('What is the security policy?', { topK: 5 });

// Close database connection
await kb.close();
PersistentKnowledgeBase
SQLite-backed knowledge base with XorIDA split-storage. WAL mode enables concurrent reads during writes. All embeddings and documents are split into K-of-N shares with HMAC integrity verification.

Key API Methods

ingest(documents, opts?): Promise<Result<void, VaultError>>
Ingests documents with automatic PII redaction, chunking, embedding generation, and XorIDA split-storage. Stores shares with HMAC tags in SQLite.
search(query, opts): Promise<Result<SearchResult[], VaultError>>
Searches for semantically similar chunks. Reconstructs embeddings from K-of-N shares, performs cosine similarity ranking, returns top-k results with source metadata.
delete(docId): Promise<Result<void, VaultError>>
Deletes all chunks and shares for a document. Atomic operation ensures no partial deletes.
close(): Promise<void>
Closes the SQLite database connection. Call before process exit to flush WAL and release locks.
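
Once search() returns its results, the remaining step is context assembly. The sketch below concatenates chunk texts with a source label and a character budget. The `assembleContext` helper, the label format, and the `maxChars` parameter are assumptions; the result fields follow the SearchResult shape listed in the Developer Reference.

```typescript
// Subset of the SearchResult fields used for assembly.
interface SearchResult {
  docId: string;
  chunkIndex: number;
  text: string;
  score: number;
}

// Hypothetical helper: join chunk texts (in ranked order) into one context string,
// stopping before the character budget would be exceeded.
function assembleContext(results: SearchResult[], maxChars = 4000): string {
  const separator = "\n\n";
  const parts: string[] = [];
  let used = 0;
  for (const r of results) {
    const block = `[${r.docId}#${r.chunkIndex}] ${r.text}`;
    const cost = block.length + (parts.length > 0 ? separator.length : 0);
    if (used + cost > maxChars) break;
    parts.push(block);
    used += cost;
  }
  return parts.join(separator);
}
```

Labeling each chunk with its document ID and chunk index lets the downstream LLM prompt cite sources without exposing anything beyond the chunks that were actually retrieved.
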
Section 08

Security Properties

| Property | Mechanism | Guarantee |
| --- | --- | --- |
| Embedding storage | XorIDA K-of-N split | Information-theoretic |
| Document storage | XorIDA K-of-N split | Information-theoretic |
| PII protection | Regex-based redaction | Pre-ingestion stripping |
| Integrity | HMAC-SHA256 per-share | Tamper detection |
| On-device embedding | Local model execution | No data exfiltration |

  • 101 tests passing
  • $9.86B RAG market by 2030
  • K-of-N split storage
  • 0 single-node exposure
Section 09

Benchmarks

Performance characteristics measured on Node.js 22, Apple M2. xVault-RAG adds 10-55ms to standard RAG pipelines while achieving information-theoretic protection of all retrieved context.

  • 101 tests passing
  • <1ms split per chunk
  • ~5ms search
  • ~35µs reconstruct
  • 0 bits per-share leakage

Test Coverage

xVault-RAG has 101 tests passing across the full implementation:

  • 75 core tests — redaction, embedding, RAG pipeline, vault splitting, threshold reconstruction
  • 26 persistence tests — SQLite storage, WAL mode, concurrent reads, atomic operations, database integrity

The suite checks both correctness (HMAC verification, reconstruction accuracy) and performance (sub-millisecond splitting, minimal search overhead).

| Operation | Time | Notes |
| --- | --- | --- |
| Document chunk + XorIDA split | <1ms | Per chunk (typical 512 tokens) |
| Embedding generation (on-device) | ~10-50ms | Model-dependent (MiniLM, BGE, etc.) |
| SQLite insert (WAL mode) | ~2ms | HMAC-tagged share insertion |
| Vector similarity search | ~5ms | Top-k retrieval from split embeddings |
| Chunk reconstruct (per result) | ~35µs | HMAC verify + XOR reconstruction |
| HMAC verification per chunk | <0.1ms | Integrity check before reconstruction |
| Context assembly (5 chunks) | <1ms | Concatenate reconstructed chunks |
| Full RAG pipeline | ~20-60ms | Query → embed → search → reconstruct → assemble |
| Database close + WAL flush | ~10ms | Clean shutdown with checkpoint |
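
As a rough cross-check of the full-pipeline figure, the per-stage numbers for the query path can be summed. The sketch below uses only values from the table; treating "<1ms" as a 0 to 1ms range and assuming five retrieved chunks are assumptions, and the type and function names are hypothetical.

```typescript
// One stage of the query path with its lower and upper latency bounds in milliseconds.
interface StageRange {
  stage: string;
  minMs: number;
  maxMs: number;
}

const queryPath: StageRange[] = [
  { stage: "query embedding", minMs: 10, maxMs: 50 },
  { stage: "vector similarity search", minMs: 5, maxMs: 5 },
  { stage: "reconstruct 5 chunks at ~35µs each", minMs: 5 * 0.035, maxMs: 5 * 0.035 },
  { stage: "context assembly", minMs: 0, maxMs: 1 }, // "<1ms" treated as 0 to 1ms
];

// Sum the lower and upper bounds across all stages.
function totalRange(stages: StageRange[]): { minMs: number; maxMs: number } {
  return stages.reduce(
    (acc, s) => ({ minMs: acc.minMs + s.minMs, maxMs: acc.maxMs + s.maxMs }),
    { minMs: 0, maxMs: 0 }
  );
}

const total = totalRange(queryPath);
console.log(`query path: ${total.minMs.toFixed(1)}-${total.maxMs.toFixed(1)}ms`);
```

The itemized stages sum to about 15 to 56ms, in the same range as the ~20-60ms full-pipeline row. The table does not itemize the remaining overhead, and embedding generation accounts for most of the total.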

RAG Architecture Comparison

| Property | Plaintext RAG | Encrypted-at-rest RAG | xVault-RAG |
| --- | --- | --- | --- |
| Data exposure during retrieval | Full plaintext in memory | Decrypted for search | Shares at rest; reconstructed in memory only |
| Breach impact | Complete corpus leak | Full data if key stolen | Individual share is meaningless |
| Key management | None needed | Required (KMS) | No encryption keys: the split is the security |
| Quantum resistance | No | Computational only (AES) | Information-theoretic |
| Pipeline overhead | Baseline | +5-10ms (decrypt) | +10-55ms (split + search + reconstruct) |
XorIDA at embedding scale
At typical chunk sizes (256-1024 bytes after tokenization), XorIDA splitting is 2-11x faster than AES-256-GCM. The crossover is ~1-2 KB — RAG chunks almost always fall below this threshold. Embedding generation dominates pipeline latency, not cryptographic operations.
Section 10

Honest Limitations

Five known limitations documented transparently. xVault-RAG provides practical encrypted retrieval with explicit trade-offs.

RAG latency overhead (~10-55ms)
Impact: The split-search-reconstruct pipeline adds 10-55ms compared to plaintext RAG. Latency-critical chatbots (<100ms target) may notice this.
Mitigation: 55ms is the worst case (large embedding model + many chunks). Typical deployments add 15-25ms, well within interactive response budgets. The LLM generation step (500ms-5s) dominates total latency.

On-device embedding quality
Impact: Privacy-preserving embeddings must be generated on-device using smaller models (MiniLM, BGE-small). These produce lower-quality vectors than cloud models (OpenAI ada-002, Cohere).
Mitigation: On-device models achieve 85-95% of cloud model quality for domain-specific retrieval. Fine-tuning on domain data closes the gap further. The privacy guarantee (embeddings never leave the device) outweighs the quality trade-off.

No cross-document optimization
Impact: Each document is split independently. Cross-document deduplication, shared embeddings, or corpus-level optimization is not supported.
Mitigation: Per-document splitting ensures isolation: revoking one document does not affect others. Corpus-level optimization would require shared state that undermines the security model.

Chunk size trade-off
Impact: Smaller chunks improve retrieval precision but increase the number of split operations. Larger chunks reduce operations but dilute retrieval quality.
Mitigation: Default 512-token chunks balance precision and performance. Configurable chunk size allows tuning per use case. A 128-token overlap between consecutive chunks maintains context continuity.

SQLite database size overhead
Impact: Persistent storage requires ~3x source corpus size (K-of-N shares + embeddings + HMAC tags + indexes). A 1 GB corpus becomes ~3 GB on disk.
Mitigation: WAL checkpointing and periodic VACUUM limit further on-disk growth. For very large corpora (>100 GB), consider distributed storage backends (Postgres, S3). The 3x overhead is acceptable for typical enterprise knowledge bases (1-50 GB).
VERIFIABLE WITHOUT CODE EXPOSURE

Ship Proofs, Not Source

xVault-RAG generates cryptographic proofs of correct execution without exposing proprietary algorithms. Verify integrity using zero-knowledge proofs — no source code required.


Use Cases

🏛️
REGULATORY
FDA / SEC Submissions
Prove algorithm correctness for distributed systems without exposing trade secrets or IP.
Zero IP Exposure
🏦
FINANCIAL
Audit Without Access
External auditors verify secure operations without accessing source code or production systems.
FINRA / SOX Compliant
🛡️
DEFENSE
Classified Verification
Cleared reviewers can verify distributed-system correctness without needing access to the source code.
CMMC / NIST Ready
🏢
ENTERPRISE
Procurement Due Diligence
Prove security + correctness during RFP evaluation without NDA or code escrow.
No NDA Required
XPROVE AUDIT TRAIL
Every XorIDA split generates HMAC-SHA256 integrity tags. xProve chains these into a tamper-evident audit trail that proves data was handled correctly at every step. Upgrade to zero-knowledge proofs when regulators or counterparties need public verification.

Read the xProve white paper →
ADVANCED TOPICS

Developer Reference

Deep technical documentation for integration and debugging.

Error Hierarchy

xVault-RAG uses a structured error taxonomy with actionable hints.

VaultError base class
class VaultError extends Error {
  public readonly details: ErrorDetail;
  constructor(message: string, code: string, hint?: string, field?: string) {
    super(message);
    this.details = { code, message, hint, field, docsUrl: `https://private.me/docs/xvault-rag#${code}` };
  }
}

Error Categories

  • Configuration Errors — INVALID_THRESHOLD, INVALID_CHUNK_SIZE, MISSING_DB_PATH
  • Ingestion Errors — EMPTY_DOCUMENT, REDACTION_FAILED, EMBEDDING_FAILED, SPLIT_FAILED
  • Storage Errors — DB_ERROR, WRITE_FAILED, TRANSACTION_FAILED
  • Search Errors — RECONSTRUCTION_FAILED, HMAC_MISMATCH, QUERY_EMBEDDING_FAILED
  • Validation Errors — MISSING_DOCUMENT_ID, INVALID_SEARCH_PARAMS, INVALID_TOP_K

Full API Surface

Complete TypeScript interface for PersistentKnowledgeBase.

PersistentKnowledgeBase interface
interface PersistentKnowledgeBase {
  /** Ingest documents with automatic PII redaction, chunking, embedding, and XorIDA split-storage */
  ingest(documents: Document[], opts?: IngestOptions): Promise<Result<void, VaultError>>;

  /** Search for semantically similar chunks with threshold reconstruction */
  search(query: string, opts?: SearchOptions): Promise<Result<SearchResult[], VaultError>>;

  /** Delete all chunks and shares for a document (atomic) */
  delete(docId: string): Promise<Result<void, VaultError>>;

  /** Close database connection (flush WAL, release locks) */
  close(): Promise<void>;
}

interface Document {
  id: string;            // Unique document identifier
  text: string;          // Document content (will be redacted + chunked)
  metadata?: Record<string, any>; // Optional metadata (not embedded)
}

interface IngestOptions {
  chunkSize?: number;     // Token count per chunk (default: 512)
  overlap?: number;       // Token overlap between chunks (default: 128)
  onProgress?: ProgressCallback;
}

interface SearchOptions {
  topK?: number;          // Number of results to return (default: 5)
  onProgress?: ProgressCallback;
}

interface SearchResult {
  docId: string;          // Source document ID
  chunkIndex: number;     // Chunk position in document
  text: string;           // Reconstructed chunk text
  score: number;          // Cosine similarity score (0-1)
  metadata?: Record<string, any>; // Document metadata
}
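
The interfaces above reference a ProgressCallback type without defining it. The following is one plausible shape, inferred from the stage names and the `current` / `total` fields used in the progress-tracking example. It is an assumption, not the library's published definition, and `formatProgress` is a hypothetical helper.

```typescript
// Stage names taken from the progress-tracking example; the union itself is an assumption.
type ProgressStage =
  | "redacting"
  | "embedding"
  | "splitting"
  | "storing"
  | "reconstructing"
  | "complete";

interface ProgressEvent {
  stage: ProgressStage;
  current?: number; // items processed so far, when the stage reports counts
  total?: number;   // total items expected, when known
}

type ProgressCallback = (event: ProgressEvent) => void | Promise<void>;

// Hypothetical helper: render an event as a one-line status string.
function formatProgress(event: ProgressEvent): string {
  if (event.current !== undefined && event.total !== undefined && event.total > 0) {
    const pct = Math.round((event.current / event.total) * 100);
    return `${event.stage}: ${event.current}/${event.total} (${pct}%)`;
  }
  return event.stage;
}

// Example callback that could be passed as onProgress.
const logProgress: ProgressCallback = (event) => {
  console.log(formatProgress(event));
};
```

Modeling the stage as a string union lets the compiler flag misspelled stage names in a `switch` statement like the one in the earlier example.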

Error Taxonomy

Complete list of error codes with triggers and fixes.

| Code | Trigger | Fix |
| --- | --- | --- |
| INVALID_THRESHOLD | k > n or k < 2 | Set k ≤ n, k ≥ 2 |
| INVALID_CHUNK_SIZE | chunkSize < 64 or > 2048 | Use 64-2048 token chunks |
| MISSING_DB_PATH | dbPath not provided | Provide absolute path to SQLite database |
| EMPTY_DOCUMENT | Document text is empty | Provide non-empty text field |
| MISSING_DOCUMENT_ID | Document id is missing | Provide unique id for each document |
| REDACTION_FAILED | PII redaction threw error | Check document encoding |
| EMBEDDING_FAILED | Model initialization error | Verify embedding model availability |
| SPLIT_FAILED | XorIDA split threw error | Check threshold configuration |
| DB_ERROR | SQLite operation failed | Check database permissions and disk space |
| WRITE_FAILED | Share write to SQLite failed | Check transaction state |
| TRANSACTION_FAILED | Atomic transaction aborted | Check WAL mode enabled |
| RECONSTRUCTION_FAILED | HMAC verification failed | Check share integrity, may indicate tampering |
| HMAC_MISMATCH | Share HMAC tag mismatch | Re-ingest document if corruption suspected |
| QUERY_EMBEDDING_FAILED | Query embedding generation error | Check query string format |
| INVALID_SEARCH_PARAMS | SearchOptions validation failed | Check topK range (1-100) |
| INVALID_TOP_K | topK < 1 or > 100 | Use topK in range 1-100 |

Codebase Stats

xVault-RAG is production-ready TypeScript with comprehensive test coverage.

  • 5 core modules
  • 101 tests passing
  • 15+ error codes
  • SQLite persistence layer

Module Breakdown

| Module | Purpose | Tests |
| --- | --- | --- |
| redact.ts | PII redaction (SSN, email, phone, credit card) | 15 |
| embed.ts | On-device vector embedding generation | 12 |
| rag.ts | RAG pipeline orchestration (chunk, embed, search, assemble) | 18 |
| vault.ts | XorIDA split-storage and threshold reconstruction | 20 |
| store.ts | SQLite persistence with WAL mode + atomic operations | 26 |
| errors.ts | Structured error hierarchy with actionable hints | 10 |
PRODUCTION READY
All 101 tests pass. SQLite WAL mode enables concurrent reads. HMAC integrity verification on every share. Atomic transactions prevent partial writes. Information-theoretic security guaranteed.
Deployment Options

📦

SDK Integration

Embed directly in your application. Runs in your codebase with full programmatic control.

  • npm install @private.me/xvault-rag
  • TypeScript/JavaScript SDK
  • Full source access
  • Enterprise support available
🏢

On-Premise Upon Request

Enterprise CLI for compliance, air-gap, or data residency requirements.

  • Complete data sovereignty
  • Air-gap capable deployment
  • Custom SLA + dedicated support
  • Professional services included

Enterprise On-Premise Deployment

While xVault-RAG is primarily delivered as SaaS or SDK, we build dedicated on-premise infrastructure for customers with:

  • Regulatory mandates — HIPAA, SOX, FedRAMP, CMMC requiring self-hosted processing
  • Air-gapped environments — SCIF, classified networks, offline operations
  • Data residency requirements — EU GDPR, China data laws, government mandates
  • Custom integration needs — Embed in proprietary platforms, specialized workflows

Includes: Enterprise CLI, Docker/Kubernetes orchestration, RBAC, audit logging, and dedicated support.

Contact sales for assessment and pricing →