Privacy, Sovereign AI & HIPAA

By default and by design, email content never leaves your network.

Sovereign AI in Email Triage

Three concentric privacy guarantees.

Sovereign AI is implemented at three levels of the stack — each strictly stronger than the last.

1. Local-First Classifier

Default Ollama backend on your own GPU host. Qwen / Llama / Mistral families supported. Email content never traverses a third-party API. No SOC 2 boundary to argue about. No data-processing addendum to negotiate. The classifier is yours.

2. BAA-Gated Cloud Backend

Operators with BAA coverage (OpenAI, Gemini, OpenAI-compatible) can configure cloud classification. The BAA gate is enforced in code, not policy. HIPAA-flagged messages skip cloud routing until the operator records BAA acknowledgment in the audit log. The compliance officer's question — "what stops PHI from going to OpenAI?" — has a code-level answer.
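A code-level gate of this kind might look like the minimal sketch below. The names `AuditLog`, `has_baa_ack`, and `choose_backend` are hypothetical, and falling back to the local classifier when the gate is closed is an assumption; the text only says that HIPAA-flagged messages skip cloud routing.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class AuditLog:
    """Hypothetical in-memory stand-in for the real audit log."""
    events: List[Tuple[str, str]] = field(default_factory=list)

    def record(self, event: str, backend: str) -> None:
        self.events.append((event, backend))

    def has_baa_ack(self, backend: str) -> bool:
        return ("baa_acknowledged", backend) in self.events


def choose_backend(hipaa_flagged: bool, cloud_backend: Optional[str], audit: AuditLog) -> str:
    """Return the backend a message may be classified on."""
    if cloud_backend is None:
        return "local"
    # The gate: HIPAA-flagged mail stays local until a BAA acknowledgment
    # for this specific backend is on record (assumed fallback behaviour).
    if hipaa_flagged and not audit.has_baa_ack(cloud_backend):
        return "local"
    return cloud_backend
```

Because the check reads the audit log rather than a configuration flag, the same record that opens the gate is also the evidence that the acknowledgment happened.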

3. Hard-Locked Embedding Allowlist

The RAG path (sent-mail context for drafted replies) is hard-coded to local-only — ollama, in-process sentence-transformers, and a fallback composite. A static privacy-invariant test fails the build if anyone tries to add a non-local backend here. Drafted-reply context is higher-volume PHI exposure than per-message classification; the allowlist forecloses cloud embedding entirely.
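As an illustration only, an allowlist plus a build-failing invariant test could be shaped like this. The backend names come from the paragraph above; the constant names, the registry, and the test function are assumptions, not the project's actual code.

```python
# Backends named in the text; the identifiers themselves are illustrative.
LOCAL_EMBEDDING_BACKENDS = frozenset({"ollama", "sentence-transformers", "composite"})

# Hypothetical registry of backends wired into the RAG path.
REGISTERED_EMBEDDING_BACKENDS = {"ollama", "sentence-transformers", "composite"}


def resolve_embedding_backend(name: str) -> str:
    """Refuse any embedding backend that is not on the local-only allowlist."""
    if name not in LOCAL_EMBEDDING_BACKENDS:
        raise ValueError(f"embedding backend {name!r} is not on the local-only allowlist")
    return name


def test_embedding_backends_are_local_only() -> None:
    """Static invariant: fails the build if a non-local backend is registered."""
    extras = REGISTERED_EMBEDDING_BACKENDS - LOCAL_EMBEDDING_BACKENDS
    assert not extras, f"non-local embedding backends registered: {sorted(extras)}"
```

The runtime check and the test cover different failure modes: the first stops a misconfigured deployment, the second stops the code change that would make such a configuration possible.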

HIPAA Mode

A single toggle. Cascading enforcement across the codebase.

PHI-Scrubbed Logs

Sender, subject, body, and classification reasoning are redacted from system logs. The [redacted] token replaces these fields in every log line.
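One way to get this behaviour with Python's standard `logging` module is a filter that overwrites sensitive attributes on each record. This is a sketch under assumptions: the attribute names and the idea that fields arrive as structured `extra` values are illustrative, not confirmed by the source.

```python
import logging

# Assumed attribute names for the four field groups listed above.
SENSITIVE_FIELDS = ("sender", "subject", "body", "reasoning")


class PhiRedactingFilter(logging.Filter):
    """Replace sensitive record attributes with the [redacted] token."""

    def filter(self, record: logging.LogRecord) -> bool:
        for name in SENSITIVE_FIELDS:
            if hasattr(record, name):
                setattr(record, name, "[redacted]")
        return True  # keep the record, minus the sensitive values


logger = logging.getLogger("triage.example")
logger.addFilter(PhiRedactingFilter())
# A call such as logger.info("classified", extra={"sender": "a@example.org"})
# now carries sender="[redacted]" by the time any handler formats it.
```

A filter like this only covers structured fields; values interpolated directly into the message string need a separate guard, which is where a static scan of log calls helps.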

Static Privacy Scan

A test suite that runs on every build greps the production source for forbidden field references in log calls. Any new code that would log a sensitive field fails the test before merge.
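A scan like this can be approximated with a line-based regular expression. The sketch below is a heuristic, not the project's suite: the forbidden field names, the logger naming pattern, and the function name `find_violations` are all assumptions.

```python
import re
from typing import List

# Assumed sensitive field names; the real list is not given in the text.
FORBIDDEN_FIELDS = ("sender", "subject", "body", "reasoning")

# Matches the argument text of calls such as logger.info(...) or log.error(...).
LOG_CALL = re.compile(
    r"\b(?:log|logger|logging)\.(?:debug|info|warning|error|exception|critical)\((.*)"
)
FIELD = re.compile(r"\b(?:" + "|".join(FORBIDDEN_FIELDS) + r")\b")


def find_violations(source: str) -> List[int]:
    """Return 1-based line numbers where a log call mentions a forbidden field."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        call = LOG_CALL.search(line)
        if call and FIELD.search(call.group(1)):
            hits.append(lineno)
    return hits
```

A test would run `find_violations` over every production source file and assert that the result is empty. Being line-based, the check misses log calls split across lines and flags harmless string literals that contain a field name, so an AST-based pass would be the stricter variant.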

Redacted Notifications

SMS / push notifications include category and timestamp only. Never the sender, subject, or body content.

Recipient Verification

The daily-digest feature refuses to send if the configured to-address doesn't match the account owner. Protects against accidentally cc'ing PHI to the wrong stakeholder.
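The refusal can be pictured as a single comparison made before sending. In this sketch the function name is hypothetical, and comparing addresses case-insensitively is an assumption about how "match" is defined.

```python
from email.utils import parseaddr


def _normalize(address: str) -> str:
    """Strip any display name and compare case-insensitively (an assumption)."""
    return parseaddr(address)[1].strip().casefold()


def digest_recipient_allowed(configured_to: str, owner_email: str) -> bool:
    """True only when the digest would go to the account owner."""
    to_addr = _normalize(configured_to)
    return bool(to_addr) and to_addr == _normalize(owner_email)
```

An empty or unparseable to-address fails closed: the digest is not sent.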

PHI-Aware Caching

The classification cache is disabled by default for HIPAA-flagged accounts. The cost (re-classifying repeats) buys the audit posture (no PHI in a side-cache).

Tamper-Evident Audit Log

Every login, account view, and credential use is recorded in an append-only access log with a SHA-256 hash chain. Any alteration of a past row breaks the chain — detectable in seconds via the email-triage audit verify CLI command.
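The hash-chain idea can be shown in a few lines. This is a generic sketch of the technique, assuming each row stores the previous row's hash and a SHA-256 over that hash plus a canonical JSON encoding of the event; the actual row layout and the internals of the CLI verifier are not described in the source.

```python
import hashlib
import json
from typing import List, Optional

GENESIS = "0" * 64  # assumed starting value for the first row


def _row_hash(prev_hash: str, event: dict) -> str:
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_row(log: List[dict], event: dict) -> None:
    """Append an event, chaining it to the hash of the previous row."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": _row_hash(prev, event)})


def first_broken_row(log: List[dict]) -> Optional[int]:
    """Return the index of the first row that fails verification, or None."""
    prev = GENESIS
    for index, row in enumerate(log):
        if row["prev"] != prev or row["hash"] != _row_hash(prev, row["event"]):
            return index
        prev = row["hash"]
    return None
```

Editing any past event changes its recomputed hash, so verification stops at that row. Rewriting the stored hash to match would break the link from the next row instead.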

Compliance dashboard

/compliance — HIPAA mode, audit-chain status, BAA acknowledgments, TLS certificate lifecycle

Supply Chain Security

You verify the image before you trust it.

Sovereign AI ends at the model boundary if the runtime itself isn't trustworthy. Email Triage publishes a verifiable supply chain so compliance officers can answer "where did this binary come from?" with a cryptographic chain instead of a vendor email.

Cosign-Signed Images

Every published container image is signed with cosign using keyless OIDC against the GitHub Actions identity. No long-lived signing key to lose or rotate. Operators verify with cosign verify before pull.

SLSA-3 Build Provenance

Each image carries a SLSA-3 build provenance attestation: who built it, when, from which source revision, in which builder. Proves the image came out of the public CI pipeline running against the public source — not a one-off developer laptop build.

Operator Approval Attestation

A separate human-validated review event signed against the same digest (predicate operator-approval/v1). Distinguishes "the CI passed" from "an operator reviewed and approved this release."

HIPAA Install Gate

HIPAA-flagged installs verify both attestations on the same image digest before allowing pull. A poisoned CI run with no operator attestation can't reach a HIPAA host. Verification recipe in the public repo's docs/install.md.
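The decision logic of such a gate is small once signature verification has been done elsewhere. The sketch below assumes attestations have already been cryptographically verified and reduced to a digest and a predicate label. `operator-approval/v1` comes from the text; the label `slsa-provenance` and all type and function names are placeholders.

```python
from dataclasses import dataclass
from typing import Iterable

# "operator-approval/v1" is from the text; "slsa-provenance" is a placeholder label.
REQUIRED_PREDICATES = frozenset({"slsa-provenance", "operator-approval/v1"})


@dataclass(frozen=True)
class VerifiedAttestation:
    """An attestation whose signature has already been verified upstream."""
    digest: str
    predicate: str


def pull_allowed(digest: str, attestations: Iterable[VerifiedAttestation]) -> bool:
    """Allow the pull only if both predicates are attested for this exact digest."""
    present = {a.predicate for a in attestations if a.digest == digest}
    return REQUIRED_PREDICATES <= present
```

Matching on the digest rather than the tag is what keeps an approval for one build from being reused for another.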

Air-Gap Verifiable

Air-gap installs use scripts/download-embedding-bits.sh on a connected machine to produce a hash-pinned tarball + SHA-256 sidecar. Sideload through the admin UI runs the same hash verification as the auto-download path — operator-staged bytes are not trusted.
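Checking a tarball against its SHA-256 sidecar could be sketched as follows. The sidecar layout is assumed to be the `sha256sum` style (hex digest first, then optional filename), and the function names are illustrative; the real script's output format is not specified in the source.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large tarballs do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_against_sidecar(tarball: Path, sidecar: Path) -> bool:
    """Compare the tarball's hash with the first field of the sidecar file."""
    fields = sidecar.read_text().split()
    if not fields:
        return False  # empty sidecar: fail closed
    expected = fields[0].lower()
    return hmac.compare_digest(sha256_file(tarball).encode(), expected.encode())
```

Running the same function on both the auto-download and the sideload path is what makes operator-staged bytes no more trusted than downloaded ones.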

Pinned Tags, Floating Aliases

Customers pin to immutable vX.Y.Z tags (the same digest is published under both the vX.Y.Z and X.Y.Z forms). For development-friendly upgrades, float on X.Y, X, or :latest. The :edge tag tracks every push to main.
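To make the tag scheme concrete, the sketch below derives the tag set for one release. It assumes the release is the newest in its line, so the floating aliases move to it; the function name and the returned structure are illustrative.

```python
import re
from typing import Dict, List

SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def tags_for_release(version: str) -> Dict[str, List[str]]:
    """Split a release version into immutable pins and floating aliases."""
    match = SEMVER.match(version)
    if not match:
        raise ValueError(f"not an X.Y.Z version: {version!r}")
    major, minor, patch = match.groups()
    full = f"{major}.{minor}.{patch}"
    return {
        "pinned": [f"v{full}", full],  # both forms point at the same digest
        "floating": [f"{major}.{minor}", major, "latest"],
    }
```

For example, `tags_for_release("v1.4.2")` yields the pins `v1.4.2` and `1.4.2` and the floating aliases `1.4`, `1`, and `latest`.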

Source: github.com/Unlimited-Data-Works-LLC/Email-Triage · Image: ghcr.io/unlimited-data-works-llc/email-triage · License: Apache 2.0

Encryption at Rest

Fernet-encrypted secrets. The operator chooses where the master key lives.

Provider passwords, OAuth refresh tokens, and other secrets are encrypted with Fernet (AES-128-CBC + HMAC-SHA256) using a master key held outside the SQLite database. Storage options for the master key:

Backing up the database without also backing up the master key leaves the secrets unrecoverable — a safety property for off-site backup storage.

Transmission Integrity (TLS)

  • ACME / Let's Encrypt automatic renewal
  • Self-signed for internal-only deployments
  • Operator-supplied certificates
  • Tailscale-issued certificates for funnel deployments

Authentication

  • Email OTP (default)
  • WebAuthn / Passkey (YubiKey, Touch ID, Windows Hello)
  • CSRF protection with session-bound tokens
  • Per-session audit row (auth method, IP, user agent)
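A session-bound CSRF token can be built by keying an HMAC on the session identifier. The sketch below shows that general technique and is not necessarily the scheme used here; the function names, the token layout, and the server-side key are assumptions.

```python
import hashlib
import hmac
import secrets


def _mac(server_key: bytes, session_id: str, nonce: str) -> str:
    message = f"{session_id}:{nonce}".encode("utf-8")
    return hmac.new(server_key, message, hashlib.sha256).hexdigest()


def issue_csrf_token(server_key: bytes, session_id: str) -> str:
    """Create a token that is only valid for the given session."""
    nonce = secrets.token_hex(16)
    return f"{nonce}.{_mac(server_key, session_id, nonce)}"


def check_csrf_token(server_key: bytes, session_id: str, token: str) -> bool:
    """Recompute the MAC for this session and compare in constant time."""
    nonce, sep, mac = token.partition(".")
    if not sep:
        return False
    expected = _mac(server_key, session_id, nonce)
    return hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8"))
```

A token lifted from one session fails the check in any other session, because the session identifier is part of the MAC input.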

Multi-Tenancy & Delegation

Three-role audit model.

The user model supports three roles with distinct audit semantics:

Delegate actions are stamped with both the actor and the account owner in the audit log. The HIPAA §164.312(b) audit gate distinguishes owner self-access from delegate access: the former falls under the §164.502(a) self-disclosure carve-out and isn't audited as PHI access, while the latter is and writes an audit row every time.
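The owner-versus-delegate distinction reduces to one comparison at the point where an audit row would be written. This is a minimal sketch with hypothetical names (`AuditRow`, `phi_access_row`); it covers only the two cases described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AuditRow:
    actor: str           # who performed the action
    account_owner: str   # whose mailbox was touched
    action: str


def phi_access_row(actor: str, account_owner: str, action: str) -> Optional[AuditRow]:
    """Return a PHI-access row for delegate access, or None for owner self-access."""
    if actor == account_owner:
        return None  # self-access: not recorded as PHI access
    return AuditRow(actor=actor, account_owner=account_owner, action=action)
```

Stamping both identities on the row is what lets a later review answer "who saw whose mail" without joining against session data.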

Standards Honored

RFCs and regulations the system aligns to.

Standard              Application
RFC 5322              Internet Message Format. Parsing inbound mail, writing outbound headers with proper In-Reply-To threading.
RFC 6154              IMAP SPECIAL-USE flag. Drafts / Sent folder auto-discovery via \Drafts / \Sent markers.
RFC 5545              iCalendar. Parsing .ics payloads on incoming meeting invites.
RFC 5546              iTIP. Invite acceptance / decline / tentative responses drafted as METHOD=REPLY with proper threading.
HIPAA §164.312(b)     Technical Safeguards, Audit Controls. Hash-chained audit log; CLI verifier.
HIPAA §164.312(e)(1)  Transmission Security. TLS posture, certificate lifecycle.
HIPAA §164.502(a)     Self-Disclosure carve-out. Audit gate avoids spurious "PHI access" rows on owner self-access.
NERC CIP-007-R4       Logging and Monitoring. Audit-log shape informed by CIP-007.

Beyond Email

Need sovereign AI for your organization, not just for email?

Email Triage is one application of sovereign AI patterns. The same approach — local-by-default LLM, code-enforced compliance gates, audit-ready architecture — applies to any AI initiative in a regulated environment. I help organizations design and implement sovereign AI strategy across their stack.