# ArcaKey Private AI — Cryptographic Architecture Whitepaper

**Version:** 0.1 (draft — pre-independent-review)
**Date:** 2026-04-18
**Status:** Public draft for review. Subject to revision following independent cryptographic review (planned Q3 2026).

---

## 1. Purpose and audience

This document specifies the cryptographic architecture of the ArcaKey Private AI platform. Its purpose is to let a security-literate reader — a customer's CISO, outside counsel, compliance officer, or independent cryptographer — verify that the claims made in ArcaKey's marketing and contractual materials map to real, named, and inspectable primitives.

Where a primitive is not yet independently reviewed, this document says so explicitly. Where a capability is aspirational or gated on infrastructure not yet live in production, this document labels it as such. We would rather tell you a deployment is scheduled for Q3 than claim something today we cannot prove.

This is version 0.1. It exists so customers evaluating ArcaKey before the formal independent cryptographic review have a named, versioned document to reference — not a marketing page. Version 1.0 will be published after the independent review engagement completes.

## 2. Threat model reference

The threat model for this system is specified in a companion document, [arcakey-threat-model-v0.1.md](./arcakey-threat-model-v0.1.md). Readers should treat the two documents as paired. In summary: the system is designed to protect user content against (a) honest-but-curious ArcaKey operators, (b) network adversaries including those capable of store-and-decrypt-later attacks against today's transit, (c) compromise of the application server itself, and (d) subpoena-compelled disclosure of operational logs that would reveal user content. It is explicitly not designed to protect against a targeted compromise of the user's own client device; if the adversary controls the user's endpoint, the system provides no protection.

## 3. Cryptographic primitives

The platform uses the following primitives. Each primitive is a named, published specification with a public reference implementation.

**Symmetric encryption.** AES-256 in Galois/Counter Mode (AES-256-GCM) for all at-rest and in-session content encryption. Key derivation from password material uses Argon2id with memory cost ≥ 64 MiB, time cost ≥ 3, and parallelism 4, per OWASP guidance, implemented via the `argon2` Node library (a binding to the Argon2 reference implementation).
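
The derive-then-encrypt flow can be sketched as follows. This is illustrative only: production uses Argon2id via the `argon2` package with the parameters above; `scrypt` from `node:crypto` stands in here so the sketch has no native dependency.

```typescript
import { scryptSync, randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

// Derive a 256-bit key from passphrase material. Stand-in: scrypt instead
// of Argon2id, so the sketch runs on node:crypto alone.
function deriveKey(passphrase: string, salt: Buffer): Buffer {
  return scryptSync(passphrase, salt, 32, { N: 16384, r: 8, p: 1 });
}

// AES-256-GCM: the 12-byte nonce must never repeat under the same key;
// the 16-byte auth tag is stored alongside the ciphertext.
function encrypt(key: Buffer, plaintext: Buffer) {
  const nonce = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, nonce);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { nonce, ciphertext, tag: cipher.getAuthTag() };
}

function decrypt(key: Buffer, nonce: Buffer, ciphertext: Buffer, tag: Buffer): Buffer {
  const decipher = createDecipheriv("aes-256-gcm", key, nonce);
  decipher.setAuthTag(tag); // final() throws if the tag does not verify
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}
```

The authenticated tag means any ciphertext tampering is detected at decryption time rather than producing silently corrupted plaintext.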

**Post-quantum key encapsulation.** ML-KEM-768, the NIST FIPS 203 standardization of CRYSTALS-Kyber, implemented via the audited `@noble/post-quantum` library. ML-KEM-768 is used both for session establishment between client and server, and — in Phase 2 — between client and TEE enclave directly.

**Post-quantum digital signatures.** ML-DSA-65, the NIST FIPS 204 standardization of CRYSTALS-Dilithium, implemented via the same `@noble/post-quantum` library. ML-DSA is used to sign server-emitted chat chunks so the client can verify the response stream was produced by the attested server and not by an on-path adversary.

**Hybrid classical-plus-post-quantum.** All key establishment combines a classical primitive (X25519, as negotiated in TLS 1.3) with ML-KEM-768. If one primitive is broken in isolation — whether by future cryptanalysis of X25519 or by a flaw discovered in ML-KEM — the combined scheme remains secure as long as the other holds.
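
One common way to realize this combination is to concatenate both shared secrets and feed them through a KDF, so the output is unpredictable unless *both* inputs are compromised. A minimal sketch, using a real X25519 exchange from `node:crypto` and a random placeholder for the ML-KEM-768 shared secret (production would use `ml_kem768` from `@noble/post-quantum`); the HKDF `info` label is illustrative, not a wire-format commitment:

```typescript
import { generateKeyPairSync, diffieHellman, hkdfSync, randomBytes } from "node:crypto";

// Classical half: genuine X25519 key agreement.
const clientX = generateKeyPairSync("x25519");
const serverX = generateKeyPairSync("x25519");
const classicalSecret = diffieHellman({
  privateKey: clientX.privateKey,
  publicKey: serverX.publicKey,
});

// Post-quantum half: placeholder bytes standing in for the ML-KEM-768
// shared secret both sides would obtain from encapsulation/decapsulation.
const pqSecret = randomBytes(32);

// HKDF-SHA-256 over the concatenation: recovering the session key requires
// breaking BOTH X25519 and ML-KEM. The info label is illustrative.
const salt = randomBytes(32);
const sessionKey = Buffer.from(
  hkdfSync("sha256", Buffer.concat([classicalSecret, pqSecret]), salt, "hybrid-kdf-sketch", 32)
);
```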

**Transport security.** TLS 1.3 only. Every TLS 1.3 cipher suite provides AEAD (authenticated encryption with associated data) by construction, and the server refuses to negotiate earlier protocol versions, whose suites may lack it. TLS is terminated at the application edge; for the Sovereign tier, TLS terminates *inside* the TEE (see §6).

**Hardware isolation.** NVIDIA H100 Confidential Computing mode with the NVIDIA Remote Attestation Service (NRAS) providing a signed evidence chain rooted in an NVIDIA-held root of trust. The platform integrates NRAS via NVIDIA's `nvtrust` SDK. Evidence is verified both server-side (before inference is allowed to proceed) and — in Phase 2 — by the client directly.

## 4. Session architecture

A chat session has three phases: establishment, operation, and teardown.

**Establishment.** The client authenticates via Clerk (classical TLS + platform authentication). It then calls `/api/vault/session/init`, which returns (a) an ML-KEM-768 encapsulation public key freshly generated for this session, and (b) the ML-DSA-65 verification public key the server will use to sign response chunks. The client runs ML-KEM encapsulation against the server's public key, which outputs a ciphertext and a shared secret; it sends the ciphertext to the server, the server decapsulates it, and both sides derive the symmetric session key from the shared secret. An adversary without access to the server's ML-KEM private key cannot recover it.
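
The essential point is that a KEM's encapsulation *outputs* the shared secret rather than encrypting a caller-chosen key. The sketch below illustrates those semantics with an X25519-based DHKEM standing in for ML-KEM-768 (production uses `ml_kem768` from `@noble/post-quantum`, whose `keygen`/`encapsulate`/`decapsulate` calls follow the same shape):

```typescript
import { generateKeyPairSync, diffieHellman } from "node:crypto";
import type { KeyObject } from "node:crypto";

// Stand-in KEM built from X25519: "encapsulation" generates an ephemeral
// keypair, derives the shared secret, and sends the ephemeral public key
// as the ciphertext. ML-KEM-768 has the same interface shape.
function encapsulate(serverPub: KeyObject) {
  const eph = generateKeyPairSync("x25519");
  const sharedSecret = diffieHellman({ privateKey: eph.privateKey, publicKey: serverPub });
  return { ciphertext: eph.publicKey, sharedSecret };
}

function decapsulate(serverPriv: KeyObject, ciphertext: KeyObject): Buffer {
  return diffieHellman({ privateKey: serverPriv, publicKey: ciphertext });
}

// Session init: the server publishes a fresh encapsulation key per session.
const server = generateKeyPairSync("x25519");
const client = encapsulate(server.publicKey);
const serverSecret = decapsulate(server.privateKey, client.ciphertext);
// client.sharedSecret and serverSecret now agree: the session key material.
```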

**Operation.** Every client-to-server message is encrypted under the session key with AES-256-GCM. The server decrypts in memory only, passes plaintext to the inference backend (which is inside a TEE for Executive and higher tiers — see §5), encrypts the streamed response chunk by chunk, and signs each ciphertext chunk with ML-DSA-65. The client verifies each chunk's signature before rendering. This binds the response stream cryptographically to the server that holds the ML-DSA signing key.
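
The encrypt-then-sign step per chunk can be sketched as below. Ed25519 from `node:crypto` stands in for ML-DSA-65 so the sketch is dependency-free; binding the chunk's sequence number as AAD and into the signed message is one plausible design choice (shown here to prevent reordering), not a statement of the production wire format.

```typescript
import { generateKeyPairSync, createCipheriv, randomBytes, sign, verify } from "node:crypto";

// Stand-in signing keypair (production: ML-DSA-65). The verification key
// is the one handed to the client at session init.
const signingKeys = generateKeyPairSync("ed25519");
const sessionKey = randomBytes(32);

function emitChunk(seq: number, plaintext: Buffer) {
  const nonce = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", sessionKey, nonce);
  cipher.setAAD(Buffer.from(String(seq))); // bind stream position
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  const tag = cipher.getAuthTag();
  const msg = Buffer.concat([Buffer.from(String(seq)), nonce, ciphertext, tag]);
  return { seq, nonce, ciphertext, tag, signature: sign(null, msg, signingKeys.privateKey) };
}

// Client side: verify the signature over the ciphertext chunk BEFORE
// decrypting or rendering it.
function verifyChunk(chunk: ReturnType<typeof emitChunk>): boolean {
  const msg = Buffer.concat([Buffer.from(String(chunk.seq)), chunk.nonce, chunk.ciphertext, chunk.tag]);
  return verify(null, msg, signingKeys.publicKey, chunk.signature);
}
```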

**Teardown.** The session key is held in server RAM for the lifetime of the session only, and is zeroized on session close. Ghost Mode makes this explicit in the product: when Ghost Mode is on for a session, no portion of the conversation is written to any storage layer — not to the database, not to logs, not to metrics — beyond minimal billing metadata (user, tier, tokens consumed, timestamp).

## 5. TEE-isolated inference

For Executive, Sovereign, Teams Executive, and Enterprise tiers, inference runs inside a Google Cloud A3 Confidential VM with the NVIDIA H100 GPU in Confidential Computing mode. Specifically, this means:

1. The VM's memory is encrypted via AMD SEV-SNP with keys held in the CPU's security processor. The hypervisor cannot read VM memory.
2. The H100 GPU runs in CC-mode, meaning GPU memory and the CPU-GPU bounce buffers are encrypted. The hypervisor cannot read GPU memory.
3. Inference code and model weights are loaded into this protected region. Plaintext prompts enter the TEE only after the TEE has decrypted them with a session key it established directly with the client (Phase 2) or received via the application server acting as a trusted relay (Phase 1).
4. Before any inference is permitted to a client, the server fetches an NRAS attestation report and verifies it against NVIDIA's root of trust. The report binds the running GPU's identity, driver version, confidential-computing mode, and the hash of the inference runtime measured at boot.

The Professional tier uses RunPod Serverless (non-TEE) because confidential-computing hardware is not economical at that price point today. This is disclosed explicitly on the pricing page and in the feature matrix.

## 6. Phase 2 — end-to-end to enclave (Sovereign)

Phase 1 (the default architecture) has a single remaining trust assumption: the application server briefly holds the session key and the plaintext at the moment it hands them to the TEE. A sufficiently privileged ArcaKey operator with control of the application server could, in principle, read that plaintext in transit.

Phase 2 closes this by removing the application server from the key exchange entirely.

1. The client fetches an NRAS attestation report for the target TEE enclave and verifies it.
2. The attestation report includes the enclave's ephemeral TLS public key fingerprint, bound cryptographically to the attested GPU identity.
3. The client establishes a hybrid TLS session (X25519 + ML-KEM-768) *directly* with the TEE enclave. The application server proxies encrypted traffic only.
4. Plaintext never exists anywhere outside the TEE. ArcaKey operators — including those with full application-server access — are cryptographically excluded from the plaintext path.
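Steps 2 and 3 hinge on one check: the public key the enclave's TLS endpoint presents must match the fingerprint carried inside the already-verified attestation report. A minimal sketch of that comparison — the report field names here are illustrative, not the NRAS schema:

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of the relevant attestation fields; the real NRAS
// report format differs and carries a full signed evidence chain.
interface AttestationReport {
  gpuIdentity: string;
  tlsKeyFingerprint: string; // hex SHA-256 of the enclave's TLS public key
}

// Run only AFTER the report's signature chain has been verified against
// NVIDIA's root of trust; a fingerprint match in an unverified report
// proves nothing.
function enclaveKeyMatchesReport(report: AttestationReport, presentedTlsPublicKey: Buffer): boolean {
  const fingerprint = createHash("sha256").update(presentedTlsPublicKey).digest("hex");
  return fingerprint === report.tlsKeyFingerprint;
}
```

If the comparison fails, the client aborts the handshake: it never sends key material to an endpoint the attestation evidence does not vouch for.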

Phase 2 is the Sovereign tier activation gate. Sovereign prospects may apply without Phase 2 live; they cannot be activated as billed Sovereign customers until Phase 2 ships. Status is tracked on the public /security page changelog.

## 7. Key management

**Server-managed keys (Locked Memory / Executive / Teams).** A platform key, held in an HSM-backed key vault, wraps per-user data encryption keys (DEKs). The platform key is never exported; DEKs are unwrapped in process memory for the lifetime of a single request and zeroized on return.
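The wrap/unwrap cycle is standard envelope encryption. A sketch under the obvious simplification that the KEK is a local buffer rather than a non-exportable HSM key:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Stand-in for the platform key: in production this lives in the HSM and
// wrap/unwrap are HSM operations, so the KEK bytes never leave it.
const kek = randomBytes(32);

function wrapDek(dek: Buffer) {
  const nonce = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", kek, nonce);
  const wrapped = Buffer.concat([cipher.update(dek), cipher.final()]);
  return { nonce, wrapped, tag: cipher.getAuthTag() };
}

function unwrapDek(blob: { nonce: Buffer; wrapped: Buffer; tag: Buffer }): Buffer {
  const decipher = createDecipheriv("aes-256-gcm", kek, blob.nonce);
  decipher.setAuthTag(blob.tag);
  const dek = Buffer.concat([decipher.update(blob.wrapped), decipher.final()]);
  return dek; // caller zeroizes with dek.fill(0) when the request completes
}
```

Only wrapped DEKs touch durable storage; the unwrapped form exists in process memory for a single request, matching the lifetime rule above.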

**User-held keys (Sovereign and Permanent Vault).** The user's passphrase (Argon2id-derived) or FIDO2 token is the sole unwrapping path for their DEK. The unwrapped DEK exists only on the user's client or, under Phase 2, inside the TEE enclave. Losing the passphrase or hardware token means losing the data. This is the price of the guarantee; it is disclosed at signup and in the user-held-keys onboarding copy.

**Rotation.** All keys rotate on a documented schedule: the platform key annually, per-user DEKs on passphrase change or at user request. Rotation events are recorded in the exportable signed audit log.

## 8. Signed audit log

Every operation on a user's vault — unlock, read, write, purge, key rotate, memory deletion — is recorded in a per-user audit log. Each entry is signed with Ed25519 and chained (each entry includes the hash of the previous entry, producing a tamper-evident chain). Users on Executive and above can export the signed audit log at any time and verify it offline against a published Ed25519 verification key.
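A minimal sketch of the chain-and-sign scheme, using Ed25519 from `node:crypto` (the primitive §8 actually names); the entry serialization here is illustrative, not the production log format:

```typescript
import { createHash, generateKeyPairSync, sign, verify } from "node:crypto";

const logKeys = generateKeyPairSync("ed25519");

interface Entry {
  prevHash: string;   // SHA-256 of the previous entry; genesis marker for the first
  payload: string;    // metadata only: operation type, timestamp, etc.
  signature: Buffer;  // Ed25519 over (prevHash || payload)
}

function append(chain: Entry[], payload: string): void {
  const prevHash = chain.length
    ? createHash("sha256").update(JSON.stringify(chain[chain.length - 1])).digest("hex")
    : "0".repeat(64);
  const signature = sign(null, Buffer.from(prevHash + payload), logKeys.privateKey);
  chain.push({ prevHash, payload, signature });
}

// Offline verification against the published public key: recompute every
// link and check every signature. Editing any earlier entry breaks all
// later links, which is what makes the chain tamper-evident.
function verifyChain(chain: Entry[]): boolean {
  let expected = "0".repeat(64);
  for (const e of chain) {
    if (e.prevHash !== expected) return false;
    if (!verify(null, Buffer.from(e.prevHash + e.payload), logKeys.publicKey, e.signature)) return false;
    expected = createHash("sha256").update(JSON.stringify(e)).digest("hex");
  }
  return true;
}
```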

The audit log records metadata only — user, timestamp, operation type, model used, attestation evidence reference — and never the content of the operation itself.

## 9. What this document does not yet prove

In the interest of not overclaiming, the following items are deliberately labeled as unproven or pending as of this draft:

- **Independent cryptographic review.** Not yet complete. Engagement planned Q3 2026. Until that review is published, this document should be read as a first-party specification, not an independently verified one.
- **Independent penetration test.** Scheduled; report not yet published. An under-NDA summary will be provided to Sovereign and Enterprise prospects on request.
- **Phase 2 live deployment.** Architecture specified, scaffold in repository, not yet activated in production. Sovereign activation gated on Phase 2.
- **SOC 2 Type II.** Readiness engagement in 2026–2027; certification not yet held.
- **FIPS 140-3 validated modules.** We use FIPS 203 / 204 *standardized* primitives but do not yet run them inside a FIPS 140-3 validated cryptographic module. This is a roadmap item for Sovereign/Enterprise customers whose procurement requires validated modules.

We will update this document, with a new version number, as each of these items ships.

---

**Contact for cryptographic questions:** `security@arcakey.ai`
**Disclosure policy:** https://arcakey.ai/security#disclosure-policy
**Next revision:** Version 0.2 following completion of independent review engagement scoping.
