PRIVATE.ME PLATFORM

xOrigin: Training Data Provenance

Establish immutable provenance chains for AI training data. Every dataset operation is recorded with SHA-256 hashes, and data is split across independent custodians via XorIDA to prevent unauthorized access.

AI / ML | COMING SOON | XorIDA Powered

Section 01

The Problem

AI training data origins are untraceable. Poisoned datasets can compromise model behavior with no audit trail. Organizations training on third-party data have no way to verify what was included, when it was modified, or whether it was tampered with.

Data poisoning attacks are cheap and effective. An adversary who injects a small percentage of crafted samples into a training corpus can plant backdoors, bias outputs, or degrade accuracy -- all without detection. Current pipelines offer no chain of custody for training data.

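Regulatory pressure compounds the problem: the EU AI Act requires providers of high-risk and general-purpose AI systems to document their training data. Organizations that cannot demonstrate a clear audit trail risk penalties, which under the Act reach up to 7% of global annual turnover for the most serious violations.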

Section 02

The PRIVATE.ME Solution

xOrigin records every dataset operation in an immutable provenance chain. Each entry is SHA-256 hashed and linked to its predecessor. The dataset itself is split across custodians via XorIDA, so no single custodian can access or tamper with the training data.

Every transformation -- ingestion, cleaning, augmentation, sampling, merging -- is logged with operator identity, timestamp, input hash, and output hash. The chain is cryptographically linked: tampering with any historical entry breaks all subsequent hashes.
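
The linking described above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the xOrigin implementation: the ProvenanceEntry shape, the all-zero genesis value, and the function names appendEntry and verifyChain are assumptions made for the example.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape; field names are illustrative, not the xOrigin schema.
interface ProvenanceEntry {
  operation: string;  // e.g. "ingest", "clean", "augment"
  operator: string;   // operator identity
  timestamp: string;  // ISO 8601
  inputHash: string;  // SHA-256 of the data before the operation
  outputHash: string; // SHA-256 of the data after the operation
  prevHash: string;   // hash of the preceding entry (GENESIS for the first)
  hash: string;       // SHA-256 over the fields above
}

type EntryBody = Omit<ProvenanceEntry, "prevHash" | "hash">;

// Assumed placeholder used as the predecessor of the first entry.
const GENESIS = "0".repeat(64);

function hashEntry(body: EntryBody, prevHash: string): string {
  // A fixed field order keeps the serialization deterministic.
  const payload = JSON.stringify([
    body.operation, body.operator, body.timestamp,
    body.inputHash, body.outputHash, prevHash,
  ]);
  return createHash("sha256").update(payload).digest("hex");
}

function appendEntry(chain: ProvenanceEntry[], body: EntryBody): ProvenanceEntry[] {
  const prevHash = chain.length > 0 ? chain[chain.length - 1].hash : GENESIS;
  return [...chain, { ...body, prevHash, hash: hashEntry(body, prevHash) }];
}

// Returns the index of the first invalid entry, or -1 if the chain is intact.
function verifyChain(chain: ProvenanceEntry[]): number {
  let prevHash = GENESIS;
  for (let i = 0; i < chain.length; i++) {
    const e = chain[i];
    if (e.prevHash !== prevHash || e.hash !== hashEntry(e, prevHash)) return i;
    prevHash = e.hash;
  }
  return -1;
}
```

Editing a historical entry invalidates that entry's own hash. If an attacker recomputes it, the next entry's prevHash no longer matches, so every later entry would have to be rewritten as well.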

Data custody is distributed: the training data is split into N shares via XorIDA, and each custodian holds one share. Reconstruction requires any K of them, so as long as K is at least 2, no single custodian can read, modify, or leak the dataset on its own.
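
To make the idea of splitting concrete, here is a deliberately simplified sketch. It is not XorIDA and not a K-of-N threshold scheme: it is plain n-of-n XOR sharing, where every share is needed to reconstruct. The function names xorSplit and xorCombine are hypothetical. It only illustrates why a single share on its own reveals nothing about the data.

```typescript
import { randomBytes } from "node:crypto";

// n-of-n XOR sharing (illustrative stand-in, NOT XorIDA):
// n - 1 shares are random, the last is the data XORed with all of them.
function xorSplit(data: Buffer, n: number): Buffer[] {
  if (n < 2) throw new RangeError("need at least 2 shares");
  const shares: Buffer[] = [];
  const last = Buffer.from(data); // copy, so the input is not mutated
  for (let i = 0; i < n - 1; i++) {
    const r = randomBytes(data.length);
    shares.push(r);
    for (let j = 0; j < last.length; j++) last[j] ^= r[j];
  }
  shares.push(last);
  return shares;
}

// XORing all shares together cancels the random pads and yields the data.
function xorCombine(shares: Buffer[]): Buffer {
  if (shares.length === 0) throw new RangeError("no shares given");
  const out = Buffer.alloc(shares[0].length);
  for (const s of shares) {
    for (let j = 0; j < out.length; j++) out[j] ^= s[j];
  }
  return out;
}
```

Each share is either uniformly random or masked by uniformly random bytes, so one custodian's share is statistically independent of the dataset. A threshold scheme such as the one the text describes additionally lets any K of the N shares reconstruct, which this sketch does not attempt.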

Section 02B

Fast Onboarding: 3 Acceleration Levels

Traditional supply chain provenance requires manual certificate setup, custodian coordination, and share distribution infrastructure. xOrigin reduces this to about 15 seconds with zero-click accept, 90 seconds with the one-line CLI, and 10 minutes with deploy buttons.

Level 1: Zero-Click Accept

15 seconds — Auto-accept invite from env var. No manual DID setup, no custodian coordination.

Node.js/Deno/Bun

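// .env file:
//   XORIGIN_INVITE_CODE=XOR-abc123

// Auto-accept on first use
import { createProvenanceManager } from '@private.me/xorigin';

const manager = createProvenanceManager();

// `certificate` is an origin certificate object (see the full example below).
// 2 and 3 are the reconstruction threshold and the total share count (2-of-3).
const result = await manager.storeCertificate(certificate, 2, 3);
// Invite auto-accepted, ready to track provenance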

Level 2: One-Line CLI

90 seconds — Generates custodian DID, saves to .env, creates first certificate.

CLI

# Install and initialize
npx @private.me/xorigin init

# Output:
# Custodian DID generated
# Saved to .env
# Share storage configured
# Ready to create origin certificates

# Create your first certificate
npx @private.me/xorigin create \
  --product "Organic Coffee Beans" \
  --batch "BATCH-2024-03-15-001" \
  --origin "CO:Huila:Pitalito"

Level 3: Deploy Button

10 minutes — One-click deploys share storage + verification API to Vercel/Netlify/Railway.

INCLUDED

Share storage (AES-256-GCM)
Verification API (cryptographic proofs)
Custody transfer dashboard
Counterfeit detection endpoint

Example: Zero-Click Accept

Set the invite code in the environment, then create an origin certificate on first use. No manual setup is required.

Zero-Click Accept Example

// 1. Set environment variable
// .env file:
//   XORIGIN_INVITE_CODE=https://xorigin.private.me/invite/XOR-abc123

// 2. Create origin certificate (auto-accepts invite)
import { createProvenanceManager } from '@private.me/xorigin';

const manager = createProvenanceManager();

const certificate = {
  id: 'cert-001',
  productId: 'SKU-12345',
  productName: 'Organic Coffee Beans',
  manufacturer: 'did:key:z6Mk...',
  origin: {
    country: 'CO',
    region: 'Huila',
    city: 'Pitalito',
  },
  manufacturedAt: new Date('2024-03-15'),
  batchNumber: 'BATCH-2024-03-15-001',
  metadata: {
    category: 'agricultural',
    certifications: ['USDA Organic', 'Fair Trade'],
  },
};

const result = await manager.storeCertificate(certificate, 2, 3, {
  custodians: [
    'did:key:manufacturer',
    'did:key:distributor',
    'did:key:retailer',
  ],
  onProgress: (status, percent) => console.log(`${status} (${percent}%)`)
});

if (result.ok) {
  console.log('Certificate protected');
  console.log('Shares distributed to custodians');
  console.log('Custody chain initialized');
}

// What happened:
// 1. Invite auto-accepted from XORIGIN_INVITE_CODE env var
// 2. Custodian DID generated and saved to .env
// 3. Certificate split via XorIDA (2-of-3)
// 4. Shares distributed to custodians
// 5. Custody chain initialized
// Total time: ~15 seconds

Example: CLI Setup

The CLI generates a custodian DID, saves credentials, and creates your first origin certificate.

CLI Example

# Step 1: Install CLI globally
npm install -g @private.me/xorigin

# Step 2: Initialize (generates custodian DID, saves to .env)
xorigin init

# Output:
# Generating custodian DID...
# Custodian DID: did:key:z6Mk...
# Saved to .env
# Share storage configured: https://xorigin.private.me
# Ready to create origin certificates

# Step 3: Create your first certificate
xorigin create \
  --product "Organic Coffee Beans" \
  --batch "BATCH-2024-03-15-001" \
  --origin "CO:Huila:Pitalito" \
  --threshold 2 --total 3

# Output:
# Certificate created: cert-001
# Split via XorIDA (2-of-3)
# Shares distributed to custodians
# Custody chain initialized
# Ready for custody transfers

Example: Deploy Button

One-click deployment provisions complete infrastructure for supply chain provenance tracking.

Deploy Button Flow

# 1. Click "Deploy to Vercel" button
# 2. Authenticate with Vercel/Netlify/Railway
# 3. Configure environment variables:
#    - XORIGIN_ADMIN_DID (auto-generated)
#    - SHARE_STORAGE_BACKEND (S3/R2/GCS)
#    - VERIFICATION_API_KEY (auto-generated)

# 4. Deploy completes (~10 minutes)
# 5. Infrastructure ready:
#    - Share storage (AES-256-GCM encrypted at rest)
#    - Verification API (cryptographic proof generation)
#    - Custody transfer dashboard
#    - Counterfeit detection endpoint

# 6. Create first certificate via dashboard or API
curl -X POST https://your-deployment.vercel.app/api/certificates \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "product": "Organic Coffee Beans",
    "batch": "BATCH-2024-03-15-001",
    "origin": "CO:Huila:Pitalito",
    "threshold": 2,
    "total": 3
  }'

Why This Matters

Traditional supply chain provenance takes days to set up: coordinating custodians, distributing shares, configuring verification infrastructure. Fast Onboarding collapses this to seconds, removing friction from adoption. The faster you can start tracking provenance, the sooner you protect your supply chain from counterfeits and compliance violations.

Section 03

How It Works

xOrigin wraps every dataset operation in a provenance record: input hash, output hash, operator identity, timestamp, and operation type. Records form a linked chain where each entry references its predecessor's hash.

Key Security Properties

Immutable provenance chain: each record links to its predecessor via SHA-256 hash. Tampering with any historical entry is detectable. Data custody is distributed via XorIDA -- no single custodian can access the training data independently. HMAC integrity tags on every share prevent silent modification.
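
As a rough illustration of the per-share integrity tags mentioned above, the sketch below computes and checks an HMAC-SHA256 tag over a share. The key handling, the choice to bind the share index into the tag, and the names tagShare and verifyShare are assumptions for the example, not details confirmed by xOrigin.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Tag a share with HMAC-SHA256. Binding the share index into the MAC
// (an assumption of this sketch) stops one share being swapped for another.
function tagShare(key: Buffer, shareIndex: number, share: Buffer): Buffer {
  const idx = Buffer.alloc(4);
  idx.writeUInt32BE(shareIndex);
  return createHmac("sha256", key).update(idx).update(share).digest();
}

// Recompute the tag and compare in constant time.
function verifyShare(key: Buffer, shareIndex: number, share: Buffer, tag: Buffer): boolean {
  const expected = tagShare(key, shareIndex, share);
  // timingSafeEqual throws on length mismatch, so check the length first.
  return tag.length === expected.length && timingSafeEqual(tag, expected);
}
```

A custodian who alters a share without the key cannot produce a matching tag, so the modification is caught before reconstruction rather than silently corrupting the dataset.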

Section 04

Use Cases

Regulatory

EU AI Act Compliance

Provide regulators with a complete, tamper-evident record of every dataset used in model training. Demonstrate data provenance from ingestion through final training run.

audit-ready

Intellectual Property

Dataset Licensing Verification

Prove that only licensed datasets were used in training. Provenance chain records every data source with timestamps and license references.

license-chain

Security

Training Pipeline Auditing

Detect unauthorized modifications to training pipelines. Every transformation is recorded with operator identity and input/output hashes.

tamper-evident

AI Safety

Data Poisoning Detection

Identify when training data was modified post-ingestion. Hash chain breaks indicate unauthorized changes, enabling rapid incident response.

chain-integrity
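
One way such a check could look, assuming each recorded step carries an input hash and an output hash: confirm that every step consumed exactly what the previous step produced, and that the dataset on disk still matches the last recorded output. The StepHashes shape and the function names are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Hypothetical minimal view of a provenance record.
interface StepHashes {
  operation: string;
  inputHash: string;
  outputHash: string;
}

const sha256Hex = (data: Buffer): string =>
  createHash("sha256").update(data).digest("hex");

// Index of the first step whose input differs from the previous step's
// output (data changed between recorded operations), or -1 if none.
function findCustodyGap(steps: StepHashes[]): number {
  for (let i = 1; i < steps.length; i++) {
    if (steps[i].inputHash !== steps[i - 1].outputHash) return i;
  }
  return -1;
}

// True if the dataset as it exists now matches the last recorded output.
function datasetMatchesChain(current: Buffer, steps: StepHashes[]): boolean {
  return steps.length > 0 &&
    sha256Hex(current) === steps[steps.length - 1].outputHash;
}
```

A gap at index i narrows the incident window to the interval between step i - 1 and step i, which is where an investigation would start.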

Section 05

Integration

Quick Start

import { trackProvenance, auditChain } from '@private.me/xorigin';

// datasetBuffer: a Buffer holding the dataset bytes, supplied by the caller
// Record a dataset operation with provenance metadata
const entry = await trackProvenance(datasetBuffer, {
  operation: 'ingest',
  source: 'licensed-corpus-v3',
  operator: 'pipeline@org.com',
  license: 'CC-BY-4.0',
});
if (entry.ok) {
  // entry.value: the new chain entry, including its hash
}

// chainId: identifier of the chain to audit, supplied by the caller
// Audit the entire provenance chain for integrity
const audit = await auditChain(chainId);
if (audit.ok) {
  // audit.value.entries: verified chain, all hashes intact
  // audit.value.datasetHash: SHA-256 of current dataset
}

trackProvenance(dataset: Buffer, metadata: ProvenanceRecord): Promise<Result<ChainEntry>>

Records a dataset operation in the provenance chain. Computes SHA-256 hash of the dataset, links to the previous chain entry, and optionally splits the data across custodians via XorIDA. Returns the new chain entry with its hash.

auditChain(chainId: string): Promise<Result<AuditResult>>

Verifies the integrity of a complete provenance chain. Checks that every SHA-256 link is valid, every HMAC tag verifies, and no entries have been modified or deleted. Returns detailed audit results with any detected anomalies.
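
Both functions return a Result, which the Quick Start reads through .ok and .value. The sketch below shows one common way such a type is modeled in TypeScript and how a caller might branch on it. The exact shape is an assumption: in particular, the error field and the AuditSummary fields are hypothetical and may differ from what the package exports.

```typescript
// Assumed discriminated-union shape; only `ok` and `value` appear in the
// Quick Start, the `error` branch is a guess made for illustration.
type Result<T, E = Error> =
  | { ok: true; value: T }
  | { ok: false; error: E };

// Hypothetical summary of an audit, not the package's AuditResult type.
interface AuditSummary {
  entries: number;
  datasetHash: string;
}

function describeAudit(audit: Result<AuditSummary>): string {
  if (audit.ok) {
    // Inside this branch TypeScript narrows `audit` to the success case.
    return `chain intact: ${audit.value.entries} entries, dataset ${audit.value.datasetHash}`;
  }
  return `audit failed: ${audit.error.message}`;
}
```

Checking .ok before touching .value forces the failure path to be handled explicitly, which matters for an audit call whose whole purpose is to surface anomalies.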

Section 06

Security Properties

Property	Mechanism	Guarantee
Confidentiality	XorIDA threshold sharing	Information-theoretic
Integrity	HMAC-SHA256 per share	Tamper-evident
Availability	K-of-N reconstruction	Fault tolerant
Provenance	SHA-256 hash chain	Immutable audit trail
Non-repudiation	Operator identity binding	Attributable operations

Coverage: 91%

VERIFIED BY XPROVE

Verifiable Data Protection

Every operation in this ACI produces a verifiable audit trail via xProve. HMAC-chained integrity proofs let auditors confirm that data was split, stored, and reconstructed correctly — without accessing the data itself.

XPROVE AUDIT TRAIL

Every XorIDA split generates HMAC-SHA256 integrity tags. xProve chains these into a tamper-evident audit trail that proves data was handled correctly at every step. Upgrade to zero-knowledge proofs when regulators or counterparties need public verification.

Read the xProve white paper →

GET STARTED

Ready to deploy xOrigin?

Talk to Ren, our AI sales engineer, or book a live demo with our team.

Book a Demo

VERIFIABLE WITHOUT CODE EXPOSURE

Ship Proofs, Not Source

xOrigin generates cryptographic proofs of correct execution without exposing proprietary algorithms. Verify integrity using zero-knowledge proofs — no source code required.

XPROVE CRYPTOGRAPHIC PROOF

Download proofs:

Tier 1 HMAC (~0.7KB)
Tier 2 Commit-Reveal (~0.5KB)
Tier 3 IT-MAC (~0.3KB)
Tier 4 KKW ZK (~0.4KB)

Verify proofs online →

Use Cases

REGULATORY

FDA / SEC Submissions

Prove algorithm correctness for distributed systems without exposing trade secrets or IP.

Zero IP Exposure

FINANCIAL

Audit Without Access

External auditors verify secure operations without accessing source code or production systems.

FINRA / SOX Compliant

DEFENSE

Classified Verification

Cleared personnel verify the correctness of distributed systems without needing access to the source code.

CMMC / NIST Ready

ENTERPRISE

Procurement Due Diligence

Prove security and correctness during RFP evaluation without an NDA or code escrow.

No NDA Required

Deployment Options

SaaS (Recommended)

Fully managed infrastructure. Call our REST API, we handle scaling, updates, and operations.

Zero infrastructure setup
Automatic updates
99.9% uptime SLA
Enterprise SLA available

View Pricing →

SDK Integration

Embed directly in your application. Runs in your codebase with full programmatic control.

npm install @private.me/xorigin
TypeScript/JavaScript SDK
Full source access
Enterprise support available

Get Started →

On-Premise (Upon Request)

Enterprise CLI for compliance, air-gap, or data residency requirements.

Complete data sovereignty
Air-gap capable deployment
Custom SLA + dedicated support
Professional services included

Request Quote →

Enterprise On-Premise Deployment

While xOrigin is primarily delivered as SaaS or SDK, we build dedicated on-premise infrastructure for customers with:

Regulatory mandates — HIPAA, SOX, FedRAMP, CMMC requiring self-hosted processing
Air-gapped environments — SCIF, classified networks, offline operations
Data residency requirements — EU GDPR, China data laws, government mandates
Custom integration needs — Embed in proprietary platforms, specialized workflows

Includes: Enterprise CLI, Docker/Kubernetes orchestration, RBAC, audit logging, and dedicated support.

Contact sales for assessment and pricing →