Architecture Walkthrough: Annota Encryption
This note is a walkthrough of how the end-to-end encryption (E2EE) and crypto system works in Annota, tracing the flow from a plain-text note (HTML) or raw file on disk to the final encrypted payload sent to the cloud.
- For convenience, I added links to specific functions in the GitHub repository; most of them live in crypto.ts.
Step 1: Key Generation and Storage
- Master Seed Mnemonic: When a user signs up or sets up encryption, a 12-word BIP39 mnemonic phrase is generated via generateMasterKey using a secure random number generator (customRng).
- Device Keychain Storage: The mnemonic phrase is encrypted at rest using encryptAtRest (AES-256-GCM with a device-specific derived key) and saved in the device's secure store (keychain) via platform adapters.
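To make the 12-word figure concrete, the sketch below shows how BIP39 maps 128 bits of entropy to 12 word indices: 128 entropy bits plus a 4-bit SHA-256 checksum give 132 bits, which split into twelve 11-bit groups. It is an illustration only. It uses Node's crypto module rather than Annota's customRng, and entropyToWordIndices is a hypothetical helper, not a function from the repository. A real implementation would look each index up in the 2048-word BIP39 wordlist.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Render one byte as exactly 8 binary digits (e.g. 5 -> "00000101").
function toBits8(byte: number): string {
  return (byte | 0x100).toString(2).slice(1);
}

// Hypothetical helper: maps 16 bytes of entropy to the 12 BIP39 word indices.
function entropyToWordIndices(entropy: Uint8Array): number[] {
  if (entropy.length !== 16) {
    throw new Error("a 12-word mnemonic needs exactly 128 bits of entropy");
  }
  // Checksum length is ENT / 32 bits, i.e. 4 bits for 128-bit entropy.
  const checksumBits = (entropy.length * 8) / 32;
  const hash = createHash("sha256").update(entropy).digest();

  // Build the 132-bit string: entropy bits followed by the checksum bits.
  let bits = "";
  for (let i = 0; i < entropy.length; i++) {
    bits += toBits8(entropy[i]);
  }
  bits += toBits8(hash[0]).slice(0, checksumBits);

  // Every 11 bits select one of the 2048 words in the wordlist.
  const indices: number[] = [];
  for (let i = 0; i < bits.length; i += 11) {
    indices.push(parseInt(bits.slice(i, i + 11), 2));
  }
  return indices;
}

const wordIndices = entropyToWordIndices(randomBytes(16));
console.log(wordIndices.length); // 12
```

Because the checksum is derived from the entropy itself, a mistyped word is usually caught when the phrase is re-entered on a new device.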
Step 2: Key Derivation (BIP39 + Argon2id + HKDF)
To encrypt or decrypt data, the system needs cryptographic keys derived from the user's mnemonic. This process runs in deriveKeysFromMnemonic:
- BIP39 Seed: The 12-word mnemonic is converted to a 64-byte binary seed via mnemonicToSeedSync.
- Argon2id (Memory-Hard Key Derivation): The 64-byte seed is stretched using Argon2id (via deriveKeyFromMnemonic) with a device-specific salt (decoded from the user's saltHex string).
  - Argon2 Settings:
    - memory: 65,536 KiB (64 MiB)
    - passes: 2
    - parallelism: 1
    - tagLength: 32 bytes (256-bit key)
  This yields a 256-bit symmetric Master Key.
- HKDF Subkey Extraction: To ensure key isolation (so that a compromised notes key cannot decrypt files, and vice versa), the 256-bit Master Key is passed to HKDF-SHA256 (via deriveSubkeys) to derive two distinct subkeys:
  - notesKey: Used for note titles, folder structure, metadata, and HTML contents.
    - Derived using info parameter = "notes"
  - filesKey: Used for encrypting raw file attachments (images, PDFs, etc.).
    - Derived using info parameter = "files"
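The first and last stages of this chain can be sketched with Node's built-in crypto module. This is a minimal illustration under stated assumptions, not Annota's code: bip39Seed reproduces the seed derivation defined by the BIP39 standard (PBKDF2-HMAC-SHA512, 2048 iterations), which is what a library function like mnemonicToSeedSync computes; the empty HKDF salt is an assumption, since only the info values are documented above; and the Argon2id stage is replaced by random bytes because it needs a separate library. The mnemonic is the public BIP39 test vector, never a real secret.

```typescript
import { hkdfSync, pbkdf2Sync, randomBytes } from "node:crypto";

// BIP39 seed derivation as defined by the standard: PBKDF2-HMAC-SHA512,
// 2048 iterations, salt "mnemonic" + optional passphrase, 64-byte output.
function bip39Seed(mnemonic: string, passphrase: string = ""): Buffer {
  return pbkdf2Sync(
    mnemonic.normalize("NFKD"),
    ("mnemonic" + passphrase).normalize("NFKD"),
    2048,
    64,
    "sha512"
  );
}

// HKDF-SHA256 subkey derivation. The empty salt is an assumption; the text
// above only states that the info parameter is "notes" or "files".
function deriveSubkeysSketch(masterKey: Uint8Array): { notesKey: Buffer; filesKey: Buffer } {
  const salt = new Uint8Array(0);
  const derive = (info: string): Buffer =>
    Buffer.from(hkdfSync("sha256", masterKey, salt, info, 32));
  return { notesKey: derive("notes"), filesKey: derive("files") };
}

const testMnemonic =
  "abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about";
const seed = bip39Seed(testMnemonic);

// Stand-in for the Argon2id output (the 256-bit Master Key). In the real flow
// this would be Argon2id(seed, salt) with the settings listed above.
const standInMasterKey = randomBytes(32);
const subkeys = deriveSubkeysSketch(standInMasterKey);

console.log(seed.length, subkeys.notesKey.length, subkeys.filesKey.length); // 64 32 32
```

Because the two subkeys differ only in the info string, either one can be handed to a component without revealing the other or the Master Key.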
Step 3: Note Encryption (Plain HTML to Encrypted JSON Payload)
During synchronization in sync-service.ts, notes are encrypted using the derived notesKey before being pushed to the cloud:
[Plain HTML Content] + [Metadata]
│
▼ (Object Assembly)
{ id, title, content, ... }
│
▼ (JSON Serialization)
"JSON String"
│
▼ (UTF-8 Encoding)
[Plaintext Bytes]
│
▼ (AES-256-GCM + 12-byte Nonce + notesKey)
[Ciphertext Bytes] + [16-byte Auth Tag]
│
▼ (Base64 Encoding & Concatenation)
"Base64 Ciphertext" + "Base64 Auth Tag" (Last 24 Chars)
- Serialization: The HTML content is combined with its metadata (such as ID, title, folder ID) into a single JavaScript object:
const dataToEncrypt = { ...metadata, content }; // content contains the HTML
const jsonPayload = JSON.stringify(dataToEncrypt);
- UTF-8 Conversion: The JSON string is encoded into bytes (Uint8Array) using UTF-8.
- Initialization Vector (Nonce): A cryptographically secure random 12-byte nonce is generated. A unique nonce per encryption is critical for AES-GCM: reusing a nonce under the same key leaks the XOR of the two plaintexts and can let an attacker forge authentication tags.
- AES-256-GCM Encryption: The plaintext bytes are encrypted using AES-256-GCM with the notesKey and the generated nonce. This generates:
  - ciphertext: The encrypted bytes.
  - authTag: A 16-byte authentication tag ensuring the integrity and authenticity of the encrypted data.
- Payload Formatting: In encryptPayload:
  - The ciphertext is encoded to Base64.
  - The authTag is encoded to Base64 (always exactly 24 characters, because 16 bytes encode to 24 padded Base64 characters).
  - They are concatenated together: encryptedData = base64(ciphertext) + base64(authTag).
  - The nonce is encoded as a Hex string (nonceHex).
- Upload: The final packet containing { id, encrypted_data: encryptedData, nonce: nonceHex } is uploaded to the Supabase/PostgreSQL database. The server has no access to the keys and sees only opaque Base64 strings.
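The note-encryption steps above can be sketched in a few lines. This is a minimal illustration using Node's crypto module, not the actual encryptPayload from crypto.ts: the function name encryptPayloadSketch, its signature, and the demo key are assumptions made for the example.

```typescript
import { createCipheriv, randomBytes } from "node:crypto";

interface EncryptedPayloadSketch {
  encryptedData: string; // base64(ciphertext) + base64(authTag)
  nonceHex: string; // 12-byte nonce as 24 hex characters
}

// Hypothetical stand-in for encryptPayload; the real signature may differ.
function encryptPayloadSketch(data: object, notesKey: Uint8Array): EncryptedPayloadSketch {
  // Serialize to JSON and encode as UTF-8 bytes.
  const plaintext = Buffer.from(JSON.stringify(data), "utf8");
  // Fresh random nonce for every encryption.
  const nonce = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", notesKey, nonce);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  const authTag = cipher.getAuthTag(); // 16 bytes by default
  return {
    encryptedData: ciphertext.toString("base64") + authTag.toString("base64"),
    nonceHex: nonce.toString("hex"),
  };
}

const demoNotesKey = randomBytes(32); // stand-in for the HKDF-derived notesKey
const packet = encryptPayloadSketch(
  { id: "n1", title: "Hello", content: "<p>Hi</p>" },
  demoNotesKey
);
console.log(packet.encryptedData.slice(-24).length, packet.nonceHex.length); // 24 24
```

Keeping the tag at a fixed 24-character offset from the end means the payload needs no separator or length field to be split again on decryption.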
Step 4: File Encryption (Raw Bytes to Encrypted Binary)
For media attachments like images or PDFs, the encryption process (in file-sync.service.ts) is slightly different.
- Since files can be large, we avoid Base64 encoding, which inflates data size by roughly 33%. That overhead is trivial for a small JSON note, but for large images or PDFs it adds noticeably to memory use and payload size.
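The 33% figure follows from Base64 turning every 3 input bytes into 4 output characters. A small sketch of the arithmetic, with a hypothetical helper name and file size:

```typescript
// Length of padded Base64 output for a given number of input bytes:
// every group of up to 3 bytes becomes 4 characters.
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

const imageSize = 3000000; // hypothetical 3 MB image
console.log(base64Length(imageSize)); // 4000000
console.log(base64Length(16)); // 24, the length of the Base64 auth tag in Step 3
```

For the 3 MB example, Base64 would add a full extra megabyte before the data even leaves the device.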
- Read Disk: The file is read directly from the local device filesystem as a raw byte array (Uint8Array).
- AES-256-GCM Encryption: In encryptFileBytes:
  - A 12-byte secure random nonce is generated.
  - The file bytes are encrypted using AES-256-GCM with the filesKey and the nonce, returning the ciphertext and the 16-byte authTag.
- Binary Concatenation: The system creates a single new binary payload by placing the authTag directly at the end of the ciphertext:
const encryptedFinal = new Uint8Array(ciphertext.length + authTag.length);
encryptedFinal.set(ciphertext, 0);
encryptedFinal.set(authTag, ciphertext.length); // last 16 bytes
- Upload: The concatenated raw binary buffer is uploaded to the Supabase storage bucket (e2e_attachments). The nonceHex and other metadata (like size and mimeType) are stored separately in the database.
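Putting the file steps together, the whole path from raw bytes to the uploaded blob might look like the following. This is a sketch built on Node's crypto module, not the repository's encryptFileBytes: the function name, return shape, and the random stand-ins for the file and key are assumptions.

```typescript
import { createCipheriv, randomBytes } from "node:crypto";

// Hypothetical stand-in for encryptFileBytes; the real signature may differ.
function encryptFileBytesSketch(
  fileBytes: Uint8Array,
  filesKey: Uint8Array
): { encryptedFinal: Uint8Array; nonceHex: string } {
  const nonce = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", filesKey, nonce);
  const ciphertext = Buffer.concat([cipher.update(fileBytes), cipher.final()]);
  const authTag = cipher.getAuthTag(); // 16 bytes by default

  // Same layout as described above: ciphertext followed by the tag, no Base64.
  const encryptedFinal = new Uint8Array(ciphertext.length + authTag.length);
  encryptedFinal.set(ciphertext, 0);
  encryptedFinal.set(authTag, ciphertext.length);
  return { encryptedFinal, nonceHex: nonce.toString("hex") };
}

const fakeFile = randomBytes(1024); // stand-in for bytes read from disk
const demoFilesKey = randomBytes(32); // stand-in for the HKDF-derived filesKey
const upload = encryptFileBytesSketch(fakeFile, demoFilesKey);
console.log(upload.encryptedFinal.length); // 1040 = 1024 + 16
```

The encrypted blob is only 16 bytes larger than the original file, regardless of file size.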
Decryption Flow (Reverse Process)
- Notes: The app slices the last 24 characters from the Base64 string to extract authTagB64, decodes it and the remaining ciphertext, decrypts with the notesKey and the nonce (decoded from nonceHex), decodes the resulting UTF-8 bytes to a JSON string, and parses it to retrieve the HTML content.
- Files: The app downloads the raw binary, splits off the last 16 bytes as the binary authTag, decrypts the remaining bytes using the filesKey and the nonce (decoded from nonceHex), and writes the decrypted raw bytes to the local filesystem.
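Both decryption paths can be sketched as follows, again with Node's crypto module and hypothetical function names rather than the repository's own. The round trip at the end encrypts a throwaway value inline, purely so the example runs on its own. Note that AES-GCM verifies the tag during final(), so a modified ciphertext or tag raises an error instead of returning corrupted data.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Shared AES-256-GCM open step; final() throws if the tag does not match.
function openGcm(
  ciphertext: Uint8Array,
  authTag: Uint8Array,
  key: Uint8Array,
  nonceHex: string
): Buffer {
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(nonceHex, "hex"));
  decipher.setAuthTag(authTag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}

// Notes: the last 24 Base64 characters are the tag, the rest is the ciphertext.
function decryptNoteSketch(encryptedData: string, nonceHex: string, notesKey: Uint8Array): unknown {
  const authTag = Buffer.from(encryptedData.slice(-24), "base64");
  const ciphertext = Buffer.from(encryptedData.slice(0, -24), "base64");
  return JSON.parse(openGcm(ciphertext, authTag, notesKey, nonceHex).toString("utf8"));
}

// Files: the last 16 raw bytes are the tag, everything before it is ciphertext.
function decryptFileSketch(blob: Uint8Array, nonceHex: string, filesKey: Uint8Array): Buffer {
  const split = blob.length - 16;
  return openGcm(blob.subarray(0, split), blob.subarray(split), filesKey, nonceHex);
}

// Round trip with throwaway values (one key reused for both paths for brevity).
const tripKey = randomBytes(32);
const tripNonce = randomBytes(12);
const tripJson = JSON.stringify({ content: "<p>Hi</p>" });
const tripCipher = createCipheriv("aes-256-gcm", tripKey, tripNonce);
const tripCt = Buffer.concat([tripCipher.update(tripJson, "utf8"), tripCipher.final()]);
const tripTag = tripCipher.getAuthTag();

const noteBack = decryptNoteSketch(
  tripCt.toString("base64") + tripTag.toString("base64"),
  tripNonce.toString("hex"),
  tripKey
);
const fileBack = decryptFileSketch(Buffer.concat([tripCt, tripTag]), tripNonce.toString("hex"), tripKey);
console.log(noteBack, fileBack.length === tripCt.length);
```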