AI: On-Device AI Inference
Unified interface for on-device AI via Ollama, WebLLM, or platform-native models. Summarization, reply suggestions, intent classification, and embeddings. All inference runs locally. Zero data leaves the device.
The Problem
Cloud AI services see all user data. On-device AI has no unified interface across providers. Privacy-preserving AI inference does not exist for email.
Every cloud AI service (OpenAI, Anthropic, Google) requires sending full message content to remote servers. For email containing sensitive legal, medical, or financial data, this is unacceptable. The AI provider becomes another attack surface.
On-device alternatives exist (Ollama, WebLLM, Apple Foundation Models) but each has a different API, different model formats, and different capabilities. Building a product that works across all of them requires a unified abstraction layer.
The PRIVATE.ME Solution
A unified AI provider abstraction that runs entirely on-device. Supports Ollama (managed sidecar for desktop), WebLLM (browser), and platform-native models. Zero data transmitted to external services.
Provider abstraction normalizes the API across backends. Application code calls summarize() or suggestReply() without knowing which model is running underneath. Switching from Ollama to WebLLM requires changing one configuration line.
Managed sidecar for desktop: Ollama is installed and managed automatically. Model downloads happen in the background. The AI is always available without user configuration.
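The abstraction can be pictured as one small interface that every backend implements. The sketch below is illustrative only: the interface shape, the method names beyond summarize() and suggestReply(), and the StubProvider are assumptions, not the shipped API.

```typescript
// Sketch of the provider abstraction (names and shapes are assumptions).
interface AIProvider {
  summarize(text: string, opts?: { maxLength?: number }): Promise<string>;
  suggestReply(thread: string, opts?: { tone?: string }): Promise<string>;
  classifyIntent(text: string): Promise<string>;
  embed(text: string): Promise<number[]>;
}

// A stub backend standing in for Ollama/WebLLM/native; real backends would
// call the local model runtime instead of returning canned output.
class StubProvider implements AIProvider {
  async summarize(text: string, opts?: { maxLength?: number }): Promise<string> {
    const limit = opts?.maxLength ?? 100;
    return text.slice(0, limit);
  }
  async suggestReply(_thread: string, opts?: { tone?: string }): Promise<string> {
    return `[${opts?.tone ?? 'neutral'}] Thanks, I will follow up shortly.`;
  }
  async classifyIntent(_text: string): Promise<string> {
    return 'fyi';
  }
  async embed(text: string): Promise<number[]> {
    return [text.length]; // placeholder vector
  }
}

// Switching backends is a single configuration change: application code only
// ever sees AIProvider.
function createProvider(backend: 'ollama' | 'webllm' | 'native'): AIProvider {
  switch (backend) {
    case 'ollama':
    case 'webllm':
    case 'native':
      // Each case would construct its real backend; the stub stands in here.
      return new StubProvider();
  }
}
```

Because the application depends only on the interface, swapping Ollama for WebLLM changes which branch of the factory runs, not any call sites.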
How It Works
Four AI capabilities exposed through a single provider interface: summarization, reply suggestion, intent classification, and embedding generation.
Provider agnostic: Switch between Ollama, WebLLM, or native models without code changes.
Managed sidecar: Ollama installed and managed automatically on desktop.
No training: User data never used for model training. Models are read-only.
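The managed-sidecar behavior amounts to probe, spawn, then wait for readiness. A minimal sketch, with the probe and spawn steps injected so the control flow stands alone; the retry loop and timings are assumptions, not the shipped logic:

```typescript
// Sketch of managed-sidecar startup: check whether the local runtime is up,
// start it if not, then poll until it answers. In a real desktop app, `probe`
// would hit Ollama's local HTTP endpoint and `spawn` would launch the
// `ollama serve` process in the background (both assumptions here).
type Probe = () => Promise<boolean>;
type Spawn = () => void;

async function ensureSidecar(probe: Probe, spawn: Spawn, retries = 5): Promise<boolean> {
  if (await probe()) return true; // already running, nothing to do
  spawn();                        // start the sidecar process
  for (let i = 0; i < retries; i++) {
    await new Promise((r) => setTimeout(r, 10)); // back off briefly
    if (await probe()) return true;              // endpoint came up
  }
  return false; // sidecar failed to start; caller can degrade gracefully
}
```

This is why "the AI is always available without user configuration": the application owns the sidecar lifecycle instead of asking the user to install and start anything.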
Use Cases
- Reply suggestions (Ollama): Generate contextual reply drafts from message content. All processing on-device. No cloud AI sees your email.
- Thread summarization (Local LLM): Summarize long email threads into concise digests. Runs locally with configurable model size and quality tradeoffs.
- Intent classification (On-Device): Classify messages by intent (action required, FYI, scheduling, urgent) to power smart inbox prioritization.
- Semantic embeddings (Vectors): Generate vector embeddings for semantic search without sending content to external embedding APIs.
Integration
```typescript
import { createProvider, summarize } from '@private.me/ai';

// Create on-device AI provider
const ai = createProvider('ollama');

// Summarize an email thread — runs entirely on device
const summary = await summarize(ai, emailThread, {
  maxLength: 100,
  style: 'executive',
});

// Generate reply suggestion
const reply = await ai.suggestReply(emailThread, { tone: 'professional' });
```
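To illustrate the embeddings use case, semantic search can rank messages by cosine similarity over locally generated vectors. This self-contained sketch uses toy hand-written vectors in place of real model output:

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank documents by similarity to a query vector, most similar first.
// In the product, the vectors would come from the on-device embedding model.
function rank(query: number[], docs: { id: string; vec: number[] }[]) {
  return docs
    .map((d) => ({ id: d.id, score: cosine(query, d.vec) }))
    .sort((x, y) => y.score - x.score);
}
```

Because both the embedding step and the similarity ranking run locally, the search index never needs an external embedding API.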
Security Properties
| Property | Mechanism | Guarantee |
|---|---|---|
| Data locality | On-device inference | Zero data transmitted |
| Provider isolation | Managed sidecar process | Process separation |
| No training | Read-only model files | No data retention |
| API uniformity | Provider abstraction | Backend agnostic |
| Graceful fallback | Capability detection | Feature degradation |
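The last row, graceful fallback via capability detection, can be sketched as follows. The Capability type, the extractive fallback, and all names are assumptions rather than the shipped implementation:

```typescript
// Sketch of capability detection with graceful degradation (names assumed).
type Capability = 'summarize' | 'suggestReply' | 'classifyIntent' | 'embed';

interface DetectedProvider {
  capabilities: Set<Capability>;
  summarize?(text: string): Promise<string>;
}

// Use the model when the backend advertises the capability; otherwise fall
// back to a cheap extractive "summary" (first sentence) so the feature
// degrades instead of breaking.
async function summarizeWithFallback(p: DetectedProvider, text: string): Promise<string> {
  if (p.capabilities.has('summarize') && p.summarize) {
    return p.summarize(text);
  }
  return text.split(/(?<=[.!?])\s/)[0];
}
```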
Ready to deploy AI?
Talk to Sol, our AI platform engineer, or book a live demo with our team.
Ship Proofs, Not Source
The AI module generates cryptographic proofs of correct execution without exposing proprietary algorithms. Verify integrity using zero-knowledge proofs — no source code required.
- Tier 1 HMAC (~0.7KB)
- Tier 2 Commit-Reveal (~0.5KB)
- Tier 3 IT-MAC (~0.3KB)
- Tier 4 KKW ZK (~0.4KB)