PRIVATE.ME · Technical White Paper

Fedlearn: Gradient Privacy via XorIDA

Federated learning with information-theoretic gradient privacy. Client gradients split via XorIDA threshold sharing across multiple aggregator nodes. No single aggregator sees complete gradient updates — model inversion and membership inference attacks become mathematically impossible, not computationally hard. Zero npm dependencies.

v0.1.0 · 76 tests passing · 5 modules · 0 npm deps · <10ms split · Dual ESM/CJS
Section 01

Executive Summary

Federated learning allows distributed model training without sharing raw data. But the gradient updates themselves leak information: model inversion attacks can reconstruct training samples, and membership inference attacks can detect whether specific data was used for training.

Fedlearn splits gradient updates via XorIDA (threshold sharing over GF(2)) across multiple independent aggregator nodes. A 2-of-3 configuration means any single compromised aggregator learns zero information about the gradient — not "computationally hard to break," but mathematically impossible.

Two core functions cover the entire workflow: splitGradient() takes a client's gradient update (Float32Array serialized as Uint8Array), generates an HMAC-SHA256 integrity tag, pads to the next odd prime, and splits into N shares with K-of-N reconstruction threshold. aggregateGradients() collects threshold shares from multiple clients, reconstructs each client's gradient (HMAC verification before reconstruction, fail closed), and computes a sample-weighted average for the training round.

Zero configuration out of the box. Zero npm runtime dependencies. Runs anywhere the Web Crypto API is available — Node.js, Deno, Bun, Cloudflare Workers, browsers. Dual ESM and CJS builds ship in a single package.

Section 02

Developer Experience

Fedlearn provides structured error codes and comprehensive validation to help developers build reliable federated learning systems.

Structured Error Handling

Fedlearn uses a Result<T, E> pattern with detailed error structures. Every error includes a machine-readable code and human-readable message.

Error types
type FedLearnError =
  | { code: 'INVALID_CONFIG'; message: string }
  | { code: 'SPLIT_FAILED'; message: string }
  | { code: 'HMAC_FAILED'; message: string }
  | { code: 'RECONSTRUCT_FAILED'; message: string }
  | { code: 'INSUFFICIENT_SHARES'; message: string }
  | { code: 'ROUND_MISMATCH'; message: string };
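A hedged sketch of consuming this union through the Result pattern. The error union is copied from above; the `Result` shape and the `describe` helper are illustrative assumptions, not package API:

```typescript
// Sketch: exhaustive handling of the documented error union.
type FedLearnError =
  | { code: 'INVALID_CONFIG'; message: string }
  | { code: 'SPLIT_FAILED'; message: string }
  | { code: 'HMAC_FAILED'; message: string }
  | { code: 'RECONSTRUCT_FAILED'; message: string }
  | { code: 'INSUFFICIENT_SHARES'; message: string }
  | { code: 'ROUND_MISMATCH'; message: string };

type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Hypothetical helper: map each machine-readable code to its category.
function describe(err: FedLearnError): string {
  switch (err.code) {
    case 'INVALID_CONFIG':
    case 'ROUND_MISMATCH':
      return `config: ${err.message}`;
    case 'SPLIT_FAILED':
    case 'HMAC_FAILED':
      return `integrity: ${err.message}`;
    case 'RECONSTRUCT_FAILED':
    case 'INSUFFICIENT_SHARES':
      return `reconstruction: ${err.message}`;
  }
}

const r: Result<number, FedLearnError> = {
  ok: false,
  error: { code: 'HMAC_FAILED', message: 'tag mismatch' },
};
if (!r.ok) console.log(describe(r.error)); // "integrity: tag mismatch"
```

Because the switch covers every member of the union, adding a new error code to the type surfaces a compile-time error at each handler.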

Error Categories

Fedlearn organizes 6 error codes across 3 categories:

| Category | Example Codes | When |
| --- | --- | --- |
| Configuration | INVALID_CONFIG, ROUND_MISMATCH | Config validation, round consistency |
| Integrity | SPLIT_FAILED, HMAC_FAILED | XorIDA split failures, HMAC verification |
| Reconstruction | RECONSTRUCT_FAILED, INSUFFICIENT_SHARES | Share reconstruction, threshold enforcement |
NAMED ERROR CLASSES
Fedlearn exports FedLearnError, FedLearnConfigError, FedLearnIntegrityError, and FedLearnReconstructError for try/catch consumers. Use toFedLearnError(code) to convert string codes to class instances.
Section 03

The Problem

Federated learning keeps raw training data on-device, but gradient updates themselves leak sensitive information about the training corpus.

Model inversion attacks. An attacker with access to gradient updates can reconstruct representative samples from the training data. In healthcare federated learning, this means patient records can be partially recovered from model updates.

Membership inference attacks. An adversary can determine whether a specific data point was used in training by analyzing gradient behavior. This violates privacy guarantees even when raw data never leaves the device.

Central aggregator is a single point of failure. Traditional federated learning routes all gradients through a central aggregation server. If that server is compromised, the attacker sees every gradient from every client — full visibility into the training corpus across all participants.

Differential privacy adds noise. DP-SGD injects Gaussian noise into gradients to provide statistical privacy. But this degrades model accuracy, requires careful hyperparameter tuning, and still relies on computational assumptions about the attacker's capabilities.

The Old Way

(Diagram.) Clients A and B each send their full plaintext gradient to a central aggregator, a single point of failure that sees all gradients. A compromised aggregator gains full visibility, enabling model inversion (reconstructing training samples), membership inference (detecting specific training data), and gradient leakage via updates.

The New Way

(Diagram.) Client A calls splitGradient() to produce 2-of-3 shares; share 1, share 2, and share 3 go to Aggregator 1, Aggregator 2, and Aggregator 3. aggregateGradients() then reconstructs each gradient and computes the weighted average. Properties: information-theoretic privacy (a single share yields zero knowledge), no model inversion (impossible, not merely hard), and HMAC integrity verified before reconstruction. Compromise any 1 node: learn nothing. K-of-N shares are needed for reconstruction.
Section 04

Real-World Use Cases

Six scenarios where Fedlearn provides information-theoretic gradient privacy for federated learning deployments.

🏥
Healthcare
Cross-Hospital Model Training

Train diagnostic models across multiple hospitals without sharing patient data. Gradient shares routed to 3 aggregator nodes — any single node learns zero information about patient records.

HIPAA compliant, 2-of-3 threshold
💹
Financial
Fraud Detection

Federated fraud model training across multiple financial institutions. Transaction patterns stay private — gradient splits prevent reconstruction of customer behavior.

PCI DSS, sample-weighted aggregation
📱
Mobile
On-Device Keyboard

Train next-word prediction models across millions of devices. User typing data never reconstructible from gradients — information-theoretic privacy guarantee.

splitGradient() on-device, 2-of-3
🏛
Government
Multi-Agency Intelligence

Federated model training across intelligence agencies with classification boundaries. 3-of-5 threshold across classified and unclassified aggregators.

3-of-5, classified networks
🤖
AI / ML
Research Collaboration

Multi-institution research projects with competitive data. Each lab contributes gradients without revealing proprietary training corpus.

Academic collaboration, IP protection
📊
IoT
Edge ML Training

Distributed training across IoT sensor networks. Gradient shares routed to edge aggregators — sensor data patterns unrecoverable from individual shares.

Low bandwidth, threshold aggregation
Section 05

Solution Architecture

Two core operations: gradient splitting on training clients and threshold aggregation on aggregator nodes.

Aggregation
Server-side
  • Threshold reconstruction (K-of-N)
  • HMAC verification before reconstruction
  • Sample-weighted averaging
  • Float32Array output

Gradient Splitting

The training client computes a local gradient update (Float32Array), serializes it as Uint8Array, generates an HMAC-SHA256 tag, pads to the next odd prime (PKCS7), and splits via XorIDA into N shares. Each share includes metadata (clientId, round, modelId, index, threshold) and is base64-encoded for transport.

Client gradient splitting
import { splitGradient } from '@private.me/fedlearn';

const config = {
  aggregatorNodes: 3,
  threshold: 2,
  round: 0,
  modelId: 'fraud-v2',
};

const update = {
  clientId: 'hospital-A',
  round: 0,
  modelId: 'fraud-v2',
  gradients: new Uint8Array(new Float32Array([0.1, -0.3, 0.5]).buffer),
  sampleCount: 1000,
};

const result = await splitGradient(update, config);
if (!result.ok) throw new Error(result.error.message);

// result.value.shares = [share0, share1, share2]
// Send share[i] to aggregator[i]

Aggregation

The aggregator node collects threshold shares from all training clients for a given round. For each client, it verifies HMAC consistency across shares (all shares must have the same HMAC tag), reconstructs the padded gradient via XorIDA, verifies the HMAC on the reconstructed data (fail closed), unpads, and deserializes to Float32Array. Finally, it computes a sample-weighted average across all clients.

Server aggregation
import { aggregateGradients } from '@private.me/fedlearn';

// Collect K shares per client (here K = 2: one share from each of two aggregators)
const clientAShares = [clientA_share_from_agg0, clientA_share_from_agg1];
const clientBShares = [clientB_share_from_agg0, clientB_share_from_agg1];

const result = await aggregateGradients(
  [clientAShares, clientBShares],
  config
);

if (!result.ok) throw new Error(result.error.message);

// result.value.gradients = sample-weighted average gradient
// result.value.totalSamples = sum of all client sample counts
// result.value.clientIds = ['hospital-A', 'hospital-B']
Section 06

Integration

Fedlearn integrates with existing federated learning frameworks by replacing the gradient transmission step with XorIDA split-channel delivery.

Installation

Package installation
pnpm add @private.me/fedlearn @private.me/crypto @private.me/shared

Complete Training Round

Full federated learning round
import { splitGradient, aggregateGradients } from '@private.me/fedlearn';

// ────────────────────────────────────────────
// CLIENT SIDE: Compute and split gradient
// ────────────────────────────────────────────
async function clientTrainingStep(model, localData, config) {
  // 1. Compute local gradient update
  const gradientArray = computeGradient(model, localData);

  // 2. Serialize Float32Array → Uint8Array
  const gradientBytes = new Uint8Array(gradientArray.buffer);

  // 3. Split gradient via XorIDA
  const update = {
    clientId: 'client-123',
    round: config.round,
    modelId: config.modelId,
    gradients: gradientBytes,
    sampleCount: localData.length,
  };

  const splitResult = await splitGradient(update, config);
  if (!splitResult.ok) throw new Error(splitResult.error.message);

  // 4. Send share[i] to aggregator[i]
  for (let i = 0; i < config.aggregatorNodes; i++) {
    await sendToAggregator(i, splitResult.value.shares[i]);
  }
}

// ────────────────────────────────────────────
// SERVER SIDE: Aggregate gradients
// ────────────────────────────────────────────
async function serverAggregationStep(model, config) {
  // 1. Collect threshold shares from all clients
  const allClientShares = await collectSharesFromClients(config.round);

  // 2. Reconstruct and aggregate
  const aggResult = await aggregateGradients(allClientShares, config);
  if (!aggResult.ok) throw new Error(aggResult.error.message);

  // 3. Apply weighted average to global model
  // (use byteOffset/byteLength in case `gradients` is a view into a larger buffer)
  const g = aggResult.value.gradients;
  const avgGradient = new Float32Array(g.buffer, g.byteOffset, g.byteLength / 4);

  applyGradientToModel(model, avgGradient);
  return model;
}

Configuration Options

| Parameter | Type | Description |
| --- | --- | --- |
| aggregatorNodes | number | Total number of aggregator nodes (N). Must be ≥ 2. |
| threshold | number | Minimum shares for reconstruction (K). Must be ≥ 2 and ≤ N. |
| round | number | Training round number. Must match across all updates. |
| modelId | string | Model identifier. Ensures shares from different models don't mix. |
Section 07

Security

Fedlearn provides information-theoretic gradient privacy via XorIDA threshold sharing with HMAC-SHA256 integrity verification.

Information-Theoretic Security

XorIDA splits over GF(2) provide unconditional security — an attacker with access to K-1 shares (where K is the reconstruction threshold) learns zero information about the original gradient. This is not "computationally hard to break" — it is mathematically impossible, regardless of computational resources.

In a 2-of-3 configuration, compromising any single aggregator node reveals nothing. The attacker must compromise at least 2 nodes to reconstruct any gradient.

HMAC Integrity

Every gradient split generates an HMAC-SHA256 tag over the padded data. During aggregation:

  1. Share consistency check: All shares for a client must have identical HMAC tags (fails if shares are from different splits).
  2. Post-reconstruction verification: After XorIDA reconstruction, the HMAC is verified on the padded data. If verification fails, reconstruction is rejected (fail closed).

Sample-Weighted Aggregation

Aggregation computes a weighted average based on each client's sampleCount. A client that trained on 10,000 samples contributes proportionally more than a client with 100 samples. This preserves statistical validity and prevents small-sample clients from biasing the global model.
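A simplified sketch of the weighting described above. The package applies this after reconstructing each client's Float32Array; `weightedAverage` here is illustrative, not package API:

```typescript
// Sketch: average gradients weighted by each client's sample count.
function weightedAverage(
  clients: { gradients: Float32Array; sampleCount: number }[]
): Float32Array {
  const totalSamples = clients.reduce((s, c) => s + c.sampleCount, 0);
  const avg = new Float32Array(clients[0].gradients.length);
  for (const { gradients, sampleCount } of clients) {
    const w = sampleCount / totalSamples; // proportional contribution
    for (let i = 0; i < avg.length; i++) avg[i] += w * gradients[i];
  }
  return avg;
}

// A 10,000-sample client outweighs a 100-sample client ~99:1:
const avg = weightedAverage([
  { gradients: new Float32Array([1, 0]), sampleCount: 10_000 },
  { gradients: new Float32Array([0, 1]), sampleCount: 100 },
]);
// avg[0] ≈ 0.990, avg[1] ≈ 0.0099
```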

Randomness

All randomness via crypto.getRandomValues(). No Math.random() anywhere in the codebase.

SECURITY GUARANTEE
Fedlearn provides information-theoretic privacy for gradients: K-1 shares reveal zero bits of information about the original gradient. This holds unconditionally, independent of computational assumptions or adversary capabilities.
Section 08

Benchmarks

Performance measurements for gradient splitting and aggregation operations.

  • Split (10K params): <10ms
  • Aggregate (10K params): <15ms
  • Overhead vs plaintext: ~1.5x
  • npm deps: 0

Splitting Latency

Gradient splitting latency scales linearly with gradient size. A 10,000-parameter gradient (40KB as Float32Array) splits in ~8-10ms on a modern CPU. Most of the time is spent in HMAC-SHA256 generation and XorIDA splitting.

Aggregation Latency

Aggregation latency depends on the number of clients and gradient size. For 10 clients with 10K parameters each, aggregation completes in ~150ms (15ms per client). HMAC verification and XorIDA reconstruction dominate.

Bandwidth Overhead

Share size is approximately (gradient_size / threshold) + metadata. For a 40KB gradient with 2-of-3 threshold, each share is ~20KB + ~200 bytes metadata. Total upload per client: ~60KB (3 shares × 20KB). Bandwidth overhead vs. plaintext: ~1.5x.
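The arithmetic as a quick sketch. The 200-byte metadata figure is the approximation quoted above, not an exact constant:

```typescript
// Back-of-envelope share sizing: (gradient_size / threshold) + metadata.
function shareBytes(gradientBytes: number, threshold: number, metadata = 200): number {
  return Math.ceil(gradientBytes / threshold) + metadata;
}

// Upload overhead vs. sending the plaintext gradient once.
function uploadOverhead(gradientBytes: number, n: number, k: number): number {
  return (n * shareBytes(gradientBytes, k)) / gradientBytes;
}

const grad = 40_000;        // 10K params × 4 bytes (Float32)
shareBytes(grad, 2);        // 20,200 bytes per share for a 2-of-3 split
uploadOverhead(grad, 3, 2); // ≈ 1.5x (3 half-size shares plus metadata)
```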

PERFORMANCE NOTE
All benchmarks measured on Node.js 20.x with Web Crypto API on a modern x64 CPU. WASM acceleration for XorIDA is a future optimization. Current TypeScript implementation is production-ready for gradients up to 1M parameters.
Section 09

Honest Limitations

Fedlearn solves gradient privacy but does not address every federated learning challenge.

1. Does Not Prevent Model Poisoning

Fedlearn protects gradient privacy but does not validate gradient quality. A malicious client can submit poisoned gradients designed to degrade model performance or introduce backdoors. Defense requires additional techniques (robust aggregation, Byzantine fault tolerance, anomaly detection).

2. Requires Honest-Majority Aggregators

If K or more aggregators collude (where K is the reconstruction threshold), they can reconstruct gradients. A 2-of-3 configuration fails if any 2 aggregators are compromised. Choose N and K based on your threat model.

3. No Defense Against Sybil Attacks

Fedlearn does not authenticate clients or prevent a single adversary from registering multiple fake clients (Sybil attack). If 80% of "clients" are controlled by one adversary, gradient privacy is irrelevant — the adversary already controls the training corpus. Sybil resistance requires identity verification outside this package.

4. Bandwidth Overhead

Splitting a gradient into N shares of roughly 1/K its size increases upload bandwidth by a factor of about N/K. For a 2-of-3 configuration, each client uploads 3 half-size shares instead of 1 gradient, roughly 1.5x the plaintext upload. Low-bandwidth environments (mobile, IoT) may still find this prohibitive.

5. No Compression

Fedlearn does not compress gradients before splitting. Gradient compression (top-k sparsification, quantization) can reduce bandwidth but must happen before splitGradient(). Compressing after splitting destroys the XorIDA shares.
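A sketch of top-k sparsification applied before splitting, per the ordering requirement above. The index/value output format is an illustrative choice, not a package requirement:

```typescript
// Sketch: keep only the k largest-magnitude gradient entries, then
// serialize the sparse form and pass THOSE bytes to splitGradient().
// Compressing after splitting would destroy the XorIDA shares.
function topK(
  gradient: Float32Array,
  k: number
): { indices: Uint32Array; values: Float32Array } {
  const order = Array.from(gradient.keys())
    .sort((a, b) => Math.abs(gradient[b]) - Math.abs(gradient[a])) // by magnitude
    .slice(0, k)
    .sort((a, b) => a - b); // restore index order for compact encoding
  return {
    indices: Uint32Array.from(order),
    values: Float32Array.from(order, (i) => gradient[i]),
  };
}

const g = new Float32Array([0.01, -0.9, 0.05, 0.7]);
const sparse = topK(g, 2); // keeps indices 1 and 3 (largest magnitudes)
```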

6. Synchronous Aggregation Only

Aggregation waits for threshold shares from all clients before proceeding. Stragglers delay the entire round. Asynchronous federated learning (allow partial aggregation) is not supported.

USE CASE BOUNDARIES
Fedlearn is a gradient privacy layer, not a complete federated learning framework. You still need: client selection, model distribution, learning rate scheduling, convergence detection, and secure aggregator infrastructure. This package handles gradient splitting and aggregation only.
Section 10

Threat Model

Fedlearn defends against gradient leakage attacks under an honest-but-curious aggregator model.

Assumptions

| Assumption | Description |
| --- | --- |
| Honest clients | Clients execute splitGradient() correctly. Malicious clients can poison gradients (not addressed here). |
| Honest-but-curious aggregators | Aggregators follow protocol but may collude to reconstruct gradients. K-1 collusion reveals zero information. |
| Secure channels | Share transmission over TLS 1.3+. Network adversaries cannot intercept shares in transit. |
| No timing attacks | Aggregators cannot infer gradient content from timing side channels. |

Attacks Defended

  • Model inversion: Attacker with K-1 shares cannot reconstruct training samples (information-theoretic guarantee).
  • Membership inference: Single aggregator cannot determine if specific data was in training set.
  • Gradient tampering: HMAC verification detects modified shares before reconstruction.
  • Share replay: Round number prevents cross-round share reuse.
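The round-binding check in the last bullet reduces to a metadata comparison. Field names follow the share metadata listed in Section 05; `acceptShare` is a hypothetical helper, not package API:

```typescript
// Sketch: reject shares whose round or model does not match the current
// aggregation config, which blocks cross-round (and cross-model) replay.
interface ShareMeta { clientId: string; round: number; modelId: string; index: number }

function acceptShare(
  meta: ShareMeta,
  config: { round: number; modelId: string }
): boolean {
  return meta.round === config.round && meta.modelId === config.modelId;
}

// A share captured in round 3 cannot be replayed into round 4:
acceptShare({ clientId: 'a', round: 3, modelId: 'm', index: 0 }, { round: 4, modelId: 'm' }); // false
```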

Attacks NOT Defended

  • K-of-N aggregator collusion: If threshold aggregators collude, they can reconstruct gradients.
  • Model poisoning: Malicious clients can submit adversarial gradients.
  • Sybil attacks: Single adversary registering multiple fake clients.
  • Client compromise: If client device is compromised, gradient is leaked before splitting.
Advanced Topics

Implementation Details

Low-level details for advanced integrators: error hierarchy, configuration validation, full API surface, and codebase statistics.

Appendix A1

Error Hierarchy

Fedlearn exports 4 error classes for try/catch consumers.

Error class hierarchy
class FedLearnError extends Error {
  readonly code: string;
  readonly subCode?: string;
  readonly docUrl?: string;
}

class FedLearnConfigError extends FedLearnError {}
class FedLearnIntegrityError extends FedLearnError {}
class FedLearnReconstructError extends FedLearnError {}
| Code | Class | Description |
| --- | --- | --- |
| INVALID_CONFIG | FedLearnConfigError | aggregatorNodes < 2, threshold < 2, threshold > aggregatorNodes, round < 0, or missing modelId |
| SPLIT_FAILED | FedLearnIntegrityError | Empty gradient data or XorIDA split failure |
| HMAC_FAILED | FedLearnIntegrityError | HMAC inconsistency across shares or verification failure after reconstruction |
| RECONSTRUCT_FAILED | FedLearnReconstructError | XorIDA reconstruction or unpadding failure |
| INSUFFICIENT_SHARES | FedLearnReconstructError | Fewer shares than threshold for a client group |
| ROUND_MISMATCH | FedLearnConfigError | Gradient update round does not match config round |
Appendix A2

Configuration

Configuration validation rules and common patterns.

Validation Rules

validateConfig() checks
// All must pass:
aggregatorNodes >= 2
threshold >= 2
threshold <= aggregatorNodes
round >= 0
modelId !== ''

Common Configurations

| Pattern | N (nodes) | K (threshold) | Use Case |
| --- | --- | --- | --- |
| Minimal | 2 | 2 | Development, testing (no fault tolerance) |
| Standard | 3 | 2 | Production (1 node failure tolerance) |
| High Security | 5 | 3 | Sensitive data (2 node collusion required) |
| Government | 5 | 4 | Classified networks (minimal redundancy) |
Appendix A3

Full API Surface

Complete public API exported by @private.me/fedlearn.

splitGradient(update: GradientUpdate, config: FedLearnConfig): Promise<Result<GradientSplitResult, FedLearnError>>
Split a gradient update via XorIDA for distribution to aggregator nodes. HMAC generated before splitting for integrity verification.
aggregateGradients(shares: GradientShare[][], config: FedLearnConfig): Promise<Result<AggregatedGradient, FedLearnError>>
Reconstruct gradients from threshold shares and compute weighted average. HMAC verification before reconstruction (fail closed).
validateConfig(config: FedLearnConfig): Result<true, FedLearnError>
Validate federated learning configuration. Checks aggregatorNodes, threshold, round, modelId constraints.
packHmac(key: Uint8Array, signature: Uint8Array): string
Pack HMAC key + signature into a single base64 string for share transmission.
unpackHmac(hmacB64: string): { key: Uint8Array; signature: Uint8Array }
Unpack combined HMAC string into key and signature bytes for verification.
toFedLearnError(code: string): FedLearnError
Convert a string error code to a typed FedLearnError instance. Handles colon-separated sub-codes.
isFedLearnError(value: unknown): value is FedLearnError
Type guard for FedLearnError instances.
Appendix A4

Codebase Statistics

Package metrics and test coverage.

  • Lines of code: 678
  • Source modules: 5
  • Test cases: 76
  • Test files: 2

Module Breakdown

| Module | Purpose | Lines |
| --- | --- | --- |
| gradient-splitter.ts | XorIDA split, HMAC generation, PKCS7 padding | ~180 |
| gradient-aggregator.ts | Threshold reconstruction, HMAC verification, weighted averaging | ~200 |
| types.ts | TypeScript interfaces and error unions | ~75 |
| errors.ts | Error class hierarchy, toFedLearnError(), isFedLearnError() | ~90 |
| index.ts | Public API exports (barrel file) | ~35 |

Dependencies

| Package | Purpose |
| --- | --- |
| @private.me/crypto | XorIDA threshold sharing, HMAC, padding primitives |
| @private.me/shared | Result type, error utilities |

Deployment Options

📦

SDK Integration

Embed directly in your application. Runs in your codebase with full programmatic control.

  • npm install @private.me/fedlearn
  • TypeScript/JavaScript SDK
  • Full source access
  • Enterprise support available
🏢

On-Premise Upon Request

Enterprise CLI for compliance, air-gap, or data residency requirements.

  • Complete data sovereignty
  • Air-gap capable deployment
  • Custom SLA + dedicated support
  • Professional services included

Enterprise On-Premise Deployment

While Fedlearn is primarily delivered as SaaS or SDK, we build dedicated on-premise infrastructure for customers with:

  • Regulatory mandates — HIPAA, SOX, FedRAMP, CMMC requiring self-hosted processing
  • Air-gapped environments — SCIF, classified networks, offline operations
  • Data residency requirements — EU GDPR, China data laws, government mandates
  • Custom integration needs — Embed in proprietary platforms, specialized workflows

Includes: Enterprise CLI, Docker/Kubernetes orchestration, RBAC, audit logging, and dedicated support.

Contact sales for assessment and pricing →