PRIVATE.ME PLATFORM

xBenchmark: Privacy-Preserving AI Model Evaluation

Evaluate and red-team AI models without exposing test datasets or model weights. xCompute enables computation on XorIDA shares so neither party sees the other’s data.

AI/ML | COMING SOON | XorIDA Powered

Section 01

The Problem

AI model evaluation requires access to sensitive test datasets and model internals, but sharing either creates intellectual property and privacy risks.

Red-teaming and benchmarking require running sensitive prompts against models, but test datasets contain proprietary evaluation criteria, adversarial examples, and competitive intelligence. Sharing them with model providers defeats the purpose.

Model providers resist sharing weights or internal metrics for evaluation, creating a trust deadlock where neither party can verify the other’s claims.

The Old Way

Section 02

The PRIVATE.ME Solution

xBenchmark uses xCompute to evaluate models on split data. Test datasets and model responses are XorIDA-split so neither the evaluator nor the model provider sees the other’s complete data.

Evaluation metrics are computed directly on XorIDA shares using xCompute’s Boolean circuit engine. XOR gates are free (zero communication); AND gates use Beaver triples. The result is a score that both parties can verify without either seeing the raw data.
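To make the gate costs concrete, here is a minimal sketch of Boolean evaluation on shares. It is not xCompute's implementation: it assumes plain two-party XOR secret sharing as a stand-in for XorIDA, a trusted dealer for the Beaver triples, and hypothetical helper names (share, open, xorGate, andGate, dealTriple) chosen for illustration only.

```typescript
import { randomInt } from 'node:crypto';

type Bit = 0 | 1;
// A secret bit held as two XOR shares: [party 0's share, party 1's share].
// Assumption: simple 2-of-2 XOR sharing, used here in place of XorIDA.
type Shared = [Bit, Bit];

const randomBit = (): Bit => randomInt(2) as Bit;

// Split a bit so that each share on its own is uniformly random.
function share(x: Bit): Shared {
  const r = randomBit();
  return [r, (x ^ r) as Bit];
}

// Recombine both shares (used only for masked values and the final result).
function open(s: Shared): Bit {
  return (s[0] ^ s[1]) as Bit;
}

// XOR gate: each party XORs its own shares locally; nothing is exchanged.
function xorGate(x: Shared, y: Shared): Shared {
  return [(x[0] ^ y[0]) as Bit, (x[1] ^ y[1]) as Bit];
}

// Beaver triple: shares of random bits a, b and of c = a AND b, dealt in advance.
interface Triple { a: Shared; b: Shared; c: Shared }

function dealTriple(): Triple {
  const a = randomBit();
  const b = randomBit();
  return { a: share(a), b: share(b), c: share((a & b) as Bit) };
}

// AND gate: the parties open the masked bits d = x XOR a and e = y XOR b
// (one round of communication), then each derives its share of x AND y locally.
function andGate(x: Shared, y: Shared, t: Triple): Shared {
  const d = open(xorGate(x, t.a));
  const e = open(xorGate(y, t.b));
  const local = (i: 0 | 1): Bit =>
    (t.c[i] ^ (d & t.b[i]) ^ (e & t.a[i]) ^ (i === 0 ? d & e : 0)) as Bit;
  return [local(0), local(1)];
}
```

In this sketch, open(andGate(share(1), share(1), dealTriple())) yields 1, while the opened values d and e reveal nothing about x or y because they are masked by the random bits a and b.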

All evaluation runs are recorded in an HMAC-chained audit trail with DID-signed attestations. Results are reproducible and tamper-evident.
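One way to picture an HMAC-chained audit trail is the sketch below. It is an illustration under stated assumptions, not the xBenchmark log format: the entry shape, the all-zero genesis tag, and the function names are hypothetical, and DID signing of attestations is left out entirely.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Hypothetical entry shape; the real audit record format is not documented here.
interface AuditEntry {
  payload: string; // e.g. a serialized description of one evaluation run
  tag: string;     // hex HMAC-SHA256 over (previous tag || payload)
}

// Placeholder "previous tag" for the first entry (64 hex chars, same width as a real tag).
const GENESIS = '0'.repeat(64);

function computeTag(key: Buffer, prevTag: string, payload: string): string {
  return createHmac('sha256', key).update(prevTag).update(payload).digest('hex');
}

// Append an entry whose tag commits to the payload and to the previous tag.
function appendEntry(log: AuditEntry[], key: Buffer, payload: string): AuditEntry[] {
  const prevTag = log.length > 0 ? log[log.length - 1].tag : GENESIS;
  return [...log, { payload, tag: computeTag(key, prevTag, payload) }];
}

// Recompute every tag in order; any edited, removed, or reordered entry breaks the chain.
function verifyChain(log: AuditEntry[], key: Buffer): boolean {
  let prevTag = GENESIS;
  for (const entry of log) {
    const expected = Buffer.from(computeTag(key, prevTag, entry.payload), 'hex');
    const actual = Buffer.from(entry.tag, 'hex');
    if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
      return false;
    }
    prevTag = entry.tag;
  }
  return true;
}
```

Because each tag covers the previous tag, changing one recorded run invalidates every later entry, which is what makes the trail tamper-evident to anyone holding the verification key.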

The New Way

Section 03

How It Works

xBenchmark orchestrates multi-party evaluation where test data and model outputs are XorIDA-split and scored via xCompute without reconstruction.

Key Security Properties

Neither party sees the other’s data. Evaluation happens on XorIDA shares via xCompute. Results are HMAC-chained and DID-signed for tamper-evident reproducibility.

Section 04

Use Cases

AI Safety

Blind Red-Teaming

Red-team models without exposing adversarial test datasets to the model provider.

Safety

Benchmarking

Private Benchmarks

Run competitive benchmarks in which no model provider sees the test set.

Benchmark

Enterprise

Vendor Evaluation

Evaluate AI vendors against proprietary criteria without sharing your evaluation framework.

Procurement

Regulation

EU AI Act Audits

Third-party audits of high-risk AI systems without exposing model internals.

Compliance

Section 05

Integration

Quick Start

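import { EvalSession } from '@private.me/xbenchmark';

// evaluatorDid, providerDid, and testSuite are supplied by the caller.
const session = await EvalSession.create({
  evaluator: evaluatorDid,       // DID of the evaluating party
  modelProvider: providerDid,    // DID of the model provider
  metrics: ['accuracy', 'toxicity', 'bias'],
  threshold: { k: 2, n: 3 }      // 2-of-3 threshold
});
// Per the signature below, create() resolves to a Result; check it for an
// error before calling evaluate() in production code.
const result = await session.evaluate(testSuite);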

EvalSession.create(opts): Promise<Result<EvalSession, EvalError>>

Creates a privacy-preserving evaluation session between an evaluator and model provider. All computation happens on XorIDA shares via xCompute.

Section 06

Security Properties

Property	Mechanism	Guarantee
Test data privacy	XorIDA split datasets	Information-theoretic
Model privacy	Split model outputs	No weight exposure
Result integrity	HMAC-chained audit	Tamper-evident
Computation	xCompute MPC	No reconstruction

$3.8B

AI eval TAM

MPC

On shares

Data exposure

VERIFIED BY XPROVE

Verifiable Data Protection

Every operation in this ACI produces a verifiable audit trail via xProve. HMAC-chained integrity proofs let auditors confirm that data was split, stored, and reconstructed correctly — without accessing the data itself.

XPROVE AUDIT TRAIL

Every XorIDA split generates HMAC-SHA256 integrity tags. xProve chains these into a tamper-evident audit trail that proves data was handled correctly at every step. Upgrade to zero-knowledge proofs when regulators or counterparties need public verification.
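As a rough picture of per-share integrity tags, the sketch below splits a buffer and attaches an HMAC-SHA256 tag to each share. It assumes a simple two-share XOR split as a stand-in for XorIDA, and the TaggedShare shape and function names are hypothetical; the actual tag layout used by xProve is not specified here.

```typescript
import { createHmac, randomBytes, timingSafeEqual } from 'node:crypto';

// Hypothetical record for one share and its integrity tag.
interface TaggedShare { index: number; data: Buffer; tag: Buffer }

// Stand-in 2-of-2 XOR split (not XorIDA): share 0 is random, share 1 = data XOR share 0.
function xorSplit(data: Buffer): [Buffer, Buffer] {
  const share0 = randomBytes(data.length);
  const share1 = Buffer.from(data.map((byte, i) => byte ^ share0[i]));
  return [share0, share1];
}

// Tag covers a one-byte share index plus the share bytes, so shares cannot be swapped.
function tagShare(key: Buffer, index: number, data: Buffer): TaggedShare {
  const tag = createHmac('sha256', key)
    .update(Buffer.from([index]))
    .update(data)
    .digest();
  return { index, data, tag };
}

// Recompute the tag and compare in constant time.
function verifyShare(key: Buffer, share: TaggedShare): boolean {
  const expected = tagShare(key, share.index, share.data).tag;
  return expected.length === share.tag.length && timingSafeEqual(expected, share.tag);
}
```

An auditor holding the key can run verifyShare on each stored share to confirm it is unmodified without ever combining the shares into the original data.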

Read the xProve white paper →

GET STARTED

Ready to deploy xBenchmark?

Talk to Ren, our AI sales engineer, or book a live demo with our team.

Book a Demo

VERIFIABLE WITHOUT CODE EXPOSURE

Ship Proofs, Not Source

xBenchmark generates cryptographic proofs of correct execution without exposing proprietary algorithms. Verify integrity using zero-knowledge proofs — no source code required.

XPROVE CRYPTOGRAPHIC PROOF

Download proofs:

Tier 1 HMAC (~0.7KB)
Tier 2 Commit-Reveal (~0.5KB)
Tier 3 IT-MAC (~0.3KB)
Tier 4 KKW ZK (~0.4KB)

Verify proofs online →

Use Cases

REGULATORY

FDA / SEC Submissions

Prove algorithm correctness for distributed systems without exposing trade secrets or IP.

Zero IP Exposure

FINANCIAL

Audit Without Access

External auditors verify secure operations without accessing source code or production systems.

FINRA / SOX Compliant

DEFENSE

Classified Verification

Cleared personnel verify the correctness of distributed systems without needing clearance for the source code.

CMMC / NIST Ready

ENTERPRISE

Procurement Due Diligence

Prove security and correctness during RFP evaluation without an NDA or code escrow.

No NDA Required

Deployment Options

SaaS (Recommended)

Fully managed infrastructure. Call our REST API, we handle scaling, updates, and operations.

Zero infrastructure setup
Automatic updates
99.9% uptime SLA
Enterprise SLA available

View Pricing →

SDK Integration

Embed directly in your application. Runs in your codebase with full programmatic control.

npm install @private.me/xbenchmark
TypeScript/JavaScript SDK
Full source access
Enterprise support available

Get Started →

On-Premise (Upon Request)

Enterprise CLI for compliance, air-gap, or data residency requirements.

Complete data sovereignty
Air-gap capable deployment
Custom SLA + dedicated support
Professional services included

Request Quote →

Enterprise On-Premise Deployment

While xBenchmark is primarily delivered as SaaS or SDK, we build dedicated on-premise infrastructure for customers with:

Regulatory mandates — HIPAA, SOX, FedRAMP, CMMC requiring self-hosted processing
Air-gapped environments — SCIF, classified networks, offline operations
Data residency requirements — EU GDPR, China data laws, government mandates
Custom integration needs — Embed in proprietary platforms, specialized workflows

Includes: Enterprise CLI, Docker/Kubernetes orchestration, RBAC, audit logging, and dedicated support.

Contact sales for assessment and pricing →