How RedaktPDF Encrypts Your PDFs: A Deep Dive Into Our E2EE Architecture

Jury D'Ambros · 8 min read

Most online PDF editors process your files on their servers. They receive your document in plaintext, manipulate it, and send it back. The moment your file leaves your browser, it is readable by that company's infrastructure — their engineers, their logs, and anyone who gains unauthorized access to their systems.

RedaktPDF works differently. Every PDF you upload as a registered user is encrypted in your browser before a single byte reaches our servers. The server stores ciphertext it cannot read. This post explains exactly how that works — the key hierarchy, the cryptographic primitives, and the design decisions behind each choice.

For a non-technical overview, see Why End-to-End Encryption Matters for PDF Editing.

The Zero-Knowledge Model

The term "zero-knowledge" is overloaded in cryptography, but in this context it means something specific: the server holds data that is cryptographically useless without information only the user possesses.

Here is what RedaktPDF's server knows about your encrypted documents:

  • The wrapped (encrypted) KEK — useless without your master key
  • The wrapped (encrypted) file keys — useless without the KEK
  • Encrypted blobs in S3 — unreadable without the file keys
  • The PBKDF2 salt — not secret; required for key derivation
  • Your email address and account metadata

Here is what the server never sees:

  • Your password (only a bcrypt hash is stored for authentication)
  • The master key — derived in your browser, never transmitted
  • The KEK in plaintext — unwrapped only in your browser
  • File keys in plaintext — unwrapped only in your browser
  • PDF content, page images, and extracted text — encrypted before upload

Why does this matter in practice? Consider a worst-case breach: an attacker exfiltrates the entire database and every S3 object. They have wrapped keys and encrypted blobs. Without your password, they cannot derive the master key. Without the master key, the wrapped KEK cannot be decrypted. Without the KEK, the per-file keys remain sealed. The breach yields nothing actionable.

This is not a contractual commitment — it is a cryptographic property of the architecture.

The Key Hierarchy

RedaktPDF uses a three-level key hierarchy. Each level serves a distinct purpose.

User Password
    |
    v PBKDF2(password, salt, 600,000 iterations, SHA-256)
Master Key (MK) ---- stored nowhere (derived on-demand)
    |
    v AES-GCM wrapKey
Key Encryption Key (KEK) ---- wrapped copy stored in DB
    |
    v AES-GCM wrapKey (per document)
File Key (FK) ---- wrapped copy stored in DB per document
    |
    v AES-256-GCM encrypt
Encrypted PDF, page images, text metadata ---- stored in S3

Master Key (MK): Derived from your password using PBKDF2. It is never stored anywhere — not on the server, not in the browser beyond the current session. Every time you log in to an encrypted session, the MK is re-derived from your password and exists only in memory. If you close the tab, it is gone.

Key Encryption Key (KEK): A randomly generated 256-bit AES-GCM key. It is generated once when you enable encryption and then immediately wrapped (encrypted) with the MK. The wrapped form is stored in the database. The plaintext form exists in browser memory only during an active session.

The KEK exists for one operational reason: password changes. When you change your password, a new MK is derived. Without a KEK layer, every single document's file key would need to be re-encrypted with the new MK. With the KEK layer, only the KEK wrapper needs updating — a single database write regardless of how many documents you have. All existing file keys remain valid.
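The password-change path described above can be sketched roughly as follows. This is an illustration, not the production code: the function name rewrapKek is hypothetical, and the stored format is assumed to be the IV-prefixed layout that wrapKek produces later in this post (shown here as raw bytes rather than base64).

```typescript
const IV_BYTES = 12; // 96-bit AES-GCM IV, as used elsewhere in this post

// Hypothetical helper: move the KEK from the old master key to the new one.
// `wrappedKek` is assumed to be [12-byte IV][wrapped key + 16-byte tag].
async function rewrapKek(
  wrappedKek: Uint8Array,
  oldMasterKey: CryptoKey,
  newMasterKey: CryptoKey,
): Promise<Uint8Array> {
  const subtle = globalThis.crypto.subtle;

  // Unwrap under the old master key. The KEK must be extractable here,
  // otherwise it could not be wrapped again below.
  const kek = await subtle.unwrapKey(
    'raw',
    wrappedKek.slice(IV_BYTES),
    oldMasterKey,
    { name: 'AES-GCM', iv: wrappedKek.slice(0, IV_BYTES) },
    { name: 'AES-GCM' },
    true,
    ['wrapKey', 'unwrapKey'],
  );

  // Wrap under the new master key with a fresh random IV.
  const iv = globalThis.crypto.getRandomValues(new Uint8Array(IV_BYTES));
  const rewrapped = new Uint8Array(
    await subtle.wrapKey('raw', kek, newMasterKey, { name: 'AES-GCM', iv }),
  );

  const out = new Uint8Array(IV_BYTES + rewrapped.length);
  out.set(iv, 0);
  out.set(rewrapped, IV_BYTES);
  return out;
}
```

Only the returned bytes need to be written back to the database; the wrapped file keys are untouched because the KEK itself has not changed.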

File Key (FK): A randomly generated 256-bit AES-GCM key, unique per document. It encrypts the actual PDF bytes, page images, thumbnails, and extracted text. The wrapped form is stored in the database alongside document metadata. Isolating keys per document means a hypothetical key compromise affects exactly one document, not your entire file library.
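For illustration, the two random keys could be generated along these lines. The helper names generateKek and generateFileKey and the exact usage lists are assumptions, not the production API. One detail follows from how Web Crypto works: a key can only be wrapped with wrapKey if it was created as extractable.

```typescript
// Hypothetical helper: a random 256-bit KEK that can wrap and unwrap file keys.
async function generateKek(): Promise<CryptoKey> {
  return globalThis.crypto.subtle.generateKey(
    { name: 'AES-GCM', length: 256 },
    true, // extractable, so it can itself be wrapped under the master key
    ['wrapKey', 'unwrapKey'],
  );
}

// Hypothetical helper: a random 256-bit key for one document's blobs.
async function generateFileKey(): Promise<CryptoKey> {
  return globalThis.crypto.subtle.generateKey(
    { name: 'AES-GCM', length: 256 },
    true, // extractable, so it can be wrapped under the KEK
    ['encrypt', 'decrypt'],
  );
}
```

Restricting each key to the usages it actually needs keeps the roles in the hierarchy separate: in this sketch the KEK cannot encrypt document data, and a file key cannot wrap other keys.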

Key Derivation: From Password to Master Key

The master key derivation uses the Web Crypto API's PBKDF2 implementation. Here is the production code from packages/crypto/src/index.ts:

const PBKDF2_ITERATIONS = 600_000;

export async function deriveMasterKey(
  password: string,
  salt: Uint8Array,
): Promise<CryptoKey> {
  const keyMaterial = await globalThis.crypto.subtle.importKey(
    'raw',
    new TextEncoder().encode(password),
    'PBKDF2',
    false,
    ['deriveKey'],
  );

  return globalThis.crypto.subtle.deriveKey(
    {
      name: 'PBKDF2',
      salt: salt as unknown as ArrayBuffer,
      iterations: PBKDF2_ITERATIONS,
      hash: 'SHA-256',
    },
    keyMaterial,
    { name: 'AES-GCM', length: 256 },
    false, // not extractable
    ['encrypt', 'decrypt', 'wrapKey', 'unwrapKey'],
  );
}

Walking through the code:

  1. importKey('raw', ...): imports the UTF-8 encoded password as raw key material. This is not a usable key yet; it is just bytes fed into the derivation function.

  2. deriveKey(...): applies PBKDF2 with SHA-256 and 600,000 iterations to produce the master key.

  3. { name: 'AES-GCM', length: 256 }: specifies that the output is a 256-bit AES-GCM key.

  4. false (extractable): this is critical. The master key is non-extractable, so even JavaScript running in the same browser context cannot export or read the raw key bytes. The key can only be used for cryptographic operations through the Web Crypto API. This stops a malicious script from copying the key out of the page, although a script running in the same origin could still ask the API to use the key while the session is open.

  5. ['encrypt', 'decrypt', 'wrapKey', 'unwrapKey']: the key usages. The master key can wrap and unwrap the KEK, and can encrypt/decrypt directly if needed.

Why 600,000 iterations? PBKDF2 is intentionally slow: the cost of each password guess grows linearly with the iteration count. OWASP's 2023 guidance recommends 600,000 iterations for PBKDF2-HMAC-SHA256. At that setting, an attacker running a dictionary attack against a stolen wrapped KEK must pay the full derivation cost for every candidate password. Weak passwords remain guessable, but brute-forcing a strong password is impractical even on a GPU cluster.
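The linear relationship between iteration count and attack cost can be made concrete with a little arithmetic. The sketch below is only a back-of-the-envelope model: expectedCrackSeconds is a hypothetical helper, and the password-space size and attacker throughput are made-up inputs chosen to show the calculation, not measurements.

```typescript
// Rough expected time for a brute-force search, in seconds.
// On average half of the candidate space is searched before a hit, and every
// guess costs `iterations` PBKDF2 iterations.
function expectedCrackSeconds(
  candidates: number, // size of the password space being searched
  iterations: number, // PBKDF2 iteration count (600,000 in this post)
  iterationsPerSecond: number, // attacker throughput; purely an assumption
): number {
  return ((candidates / 2) * iterations) / iterationsPerSecond;
}

// Hypothetical inputs, chosen only to show the arithmetic.
const seconds = expectedCrackSeconds(1e12, 600_000, 1e9);
console.log(`${(seconds / 86_400 / 365.25).toFixed(1)} years`);
```

With these invented inputs the estimate is 3 × 10^8 seconds, roughly 9.5 years. Doubling the iteration count doubles the estimate, while a weaker password shrinks the candidate space and the estimate with it.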

The 16-byte random salt ensures that identical passwords produce different master keys, defeating rainbow table attacks.
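A minimal sketch of how a salt might be generated, plus a demo of its effect. Both helpers are hypothetical. The demo uses deriveBits rather than deriveKey so the outputs can be compared as hex; the production code above derives a non-extractable CryptoKey, whose bytes cannot be inspected this way.

```typescript
const SALT_BYTES = 16; // salt length stated in this post

// Hypothetical helper: a fresh random salt, created once per account.
function generateSalt() {
  return globalThis.crypto.getRandomValues(new Uint8Array(SALT_BYTES));
}

// Demo only: derive 256 raw bits and hex-encode them for comparison.
async function derivePbkdf2Hex(
  password: string,
  salt: BufferSource,
  iterations = 600_000,
): Promise<string> {
  const subtle = globalThis.crypto.subtle;
  const keyMaterial = await subtle.importKey(
    'raw',
    new TextEncoder().encode(password),
    'PBKDF2',
    false,
    ['deriveBits'],
  );
  const bits = await subtle.deriveBits(
    { name: 'PBKDF2', salt, iterations, hash: 'SHA-256' },
    keyMaterial,
    256,
  );
  return Array.from(new Uint8Array(bits), (b) => b.toString(16).padStart(2, '0')).join('');
}
```

Calling derivePbkdf2Hex with the same password and two different salts yields two unrelated outputs, which is why a precomputed table built for one salt is useless against another.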

Encrypting a PDF

Once a file key is available, all data encryption uses the same function. Here is the production encryptBlob from the crypto package:

export async function encryptBlob(
  data: Uint8Array,
  key: CryptoKey,
): Promise<Uint8Array> {
  const iv = globalThis.crypto.getRandomValues(new Uint8Array(IV_BYTES));
  const ciphertext = await globalThis.crypto.subtle.encrypt(
    { name: 'AES-GCM', iv },
    key,
    data as unknown as ArrayBuffer,
  );
  return concat(iv, new Uint8Array(ciphertext));
}

Three things to note:

Random IV per encryption. IV_BYTES is 12, the 96-bit IV length NIST recommends for AES-GCM. A fresh IV is generated for every call to encryptBlob, so encrypting the same PDF twice produces different ciphertext. If an IV were ever reused with the same key, AES-GCM's security would collapse: two ciphertexts under the same key and IV leak the XOR of the plaintexts and let an attacker forge authentication tags. With 96 random bits per IV and a separate key per document, the chance of a repeat under any one key is negligible.

Authenticated encryption. AES-GCM is an AEAD (Authenticated Encryption with Associated Data) cipher. It provides both confidentiality (the content cannot be read) and integrity (the content cannot be silently modified). The GCM authentication tag appended to the ciphertext detects any tampering. If a stored encrypted blob is modified in S3, decryption will throw an error rather than returning corrupted plaintext.

Self-describing format. The IV is prepended to the ciphertext (concat(iv, new Uint8Array(ciphertext))). The output is a single byte array: [12 bytes IV][remaining bytes: ciphertext + 16-byte auth tag]. Decryption reads the first 12 bytes as the IV, the rest as the authenticated ciphertext. No separate IV storage is needed.
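The matching decryption step is not shown in this post, but given the format just described it could look roughly like this. The name decryptBlob and the length check are assumptions; the layout ([12-byte IV][ciphertext + 16-byte tag]) comes from encryptBlob above.

```typescript
const IV_BYTES = 12; // AES-GCM IV length used by encryptBlob
const TAG_BYTES = 16; // AES-GCM authentication tag length

// Hypothetical inverse of encryptBlob: expects [IV][ciphertext + tag].
async function decryptBlob(blob: Uint8Array, key: CryptoKey): Promise<Uint8Array> {
  if (blob.length < IV_BYTES + TAG_BYTES) {
    throw new Error('Encrypted blob is too short to contain an IV and auth tag');
  }
  const iv = blob.slice(0, IV_BYTES);
  const ciphertext = blob.slice(IV_BYTES);
  // The promise rejects if the tag does not verify, for example after tampering
  // or when the wrong key is supplied.
  const plaintext = await globalThis.crypto.subtle.decrypt(
    { name: 'AES-GCM', iv },
    key,
    ciphertext,
  );
  return new Uint8Array(plaintext);
}
```

Because verification and decryption happen in one call, a caller never receives partially trusted plaintext: it gets either the original bytes or an error.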

This same function encrypts the PDF file bytes, each rendered page image, thumbnail images, and the extracted text JSON — everything that could expose document content.

When you use the secure PDF editor to apply edits, those operations are assembled and exported entirely in your browser. The edited document is never re-uploaded in decrypted form.

The Upload Flow

Here is the sequence for uploading an encrypted PDF:

  1. Your browser reads the PDF file as a Uint8Array.
  2. PDF.js renders each page locally at 150 DPI (full resolution) and 72 DPI (thumbnail). No rendering happens server-side.
  3. PDF.js extracts text positions locally for search and selection support.
  4. The browser generates a fresh random File Key (FK).
  5. The FK encrypts: the PDF bytes, every rendered page PNG, every thumbnail PNG, and the text position JSON — all via encryptBlob.
  6. The FK is wrapped with the user's KEK using AES-GCM wrapKey.
  7. The browser uploads the encrypted PDF to S3 directly via a presigned URL — the server issues the URL but never receives the file contents.
  8. The browser sends the encrypted page blobs and wrapped FK to /api/documents/confirm-encrypted.
  9. The server stores encrypted blobs in S3 and the wrapped FK in the database. At no point does it see the plaintext PDF, page images, or file key.
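Steps 4 through 6 of the sequence above can be sketched as a single client-side function. This is a simplified illustration under stated assumptions: prepareEncryptedUpload and seal are hypothetical names, seal stands in for the encryptBlob function shown earlier, and the network calls in steps 7 and 8 are left out entirely.

```typescript
const IV_BYTES = 12;

// Stand-in for this post's encryptBlob: returns [IV][ciphertext + tag].
async function seal(data: BufferSource, key: CryptoKey) {
  const iv = globalThis.crypto.getRandomValues(new Uint8Array(IV_BYTES));
  const ct = new Uint8Array(
    await globalThis.crypto.subtle.encrypt({ name: 'AES-GCM', iv }, key, data),
  );
  const out = new Uint8Array(iv.length + ct.length);
  out.set(iv, 0);
  out.set(ct, iv.length);
  return out;
}

// Hypothetical helper covering steps 4 to 6: new file key, encrypt, wrap.
async function prepareEncryptedUpload(
  pdfBytes: BufferSource,
  pagePngs: BufferSource[],
  kek: CryptoKey,
) {
  const subtle = globalThis.crypto.subtle;

  // Step 4: fresh per-document key; extractable so it can be wrapped.
  const fileKey = await subtle.generateKey(
    { name: 'AES-GCM', length: 256 },
    true,
    ['encrypt', 'decrypt'],
  );

  // Step 5: everything that reveals content is encrypted under the file key.
  const encryptedPdf = await seal(pdfBytes, fileKey);
  const encryptedPages = await Promise.all(pagePngs.map((png) => seal(png, fileKey)));

  // Step 6: wrap the file key under the KEK, IV-prefixed like the blobs.
  const iv = globalThis.crypto.getRandomValues(new Uint8Array(IV_BYTES));
  const wrapped = new Uint8Array(
    await subtle.wrapKey('raw', fileKey, kek, { name: 'AES-GCM', iv }),
  );
  const wrappedFileKey = new Uint8Array(iv.length + wrapped.length);
  wrappedFileKey.set(iv, 0);
  wrappedFileKey.set(wrapped, iv.length);

  return { encryptedPdf, encryptedPages, wrappedFileKey };
}
```

Everything in the returned object is ciphertext, so it is safe to hand to the upload code; the plaintext file key never leaves the function.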

Even redaction operations follow this model: redactions are applied client-side to the in-browser PDF representation before the document is encrypted and uploaded.

Why AES-GCM for Key Wrapping

A note on the key wrapping implementation: it differs from the documentation table in docs/SECURITY.md.

The SECURITY.md table lists "AES-KW (Key Wrap)" for the key wrapping primitive — a reference to RFC 3394 AES Key Wrap. The actual implementation uses wrapKey with AES-GCM. Here is the wrapKek function:

export async function wrapKek(
  kek: CryptoKey,
  masterKey: CryptoKey,
): Promise<string> {
  const iv = globalThis.crypto.getRandomValues(new Uint8Array(IV_BYTES));
  const wrapped = await globalThis.crypto.subtle.wrapKey(
    'raw',
    kek,
    masterKey,
    {
      name: 'AES-GCM',
      iv,
    },
  );
  return toBase64(concat(iv, new Uint8Array(wrapped)));
}

The distinction matters: AES-KW (RFC 3394) takes no per-operation IV. It is deterministic and relies on a fixed 64-bit initial value that doubles as its integrity check. AES-GCM with a random IV is a full AEAD construction: it provides confidentiality and integrity with a 128-bit authentication tag, and wrapping the same key twice yields different output. Any modification to a wrapped key is detected at unwrap time.

Using AES-GCM throughout the stack (for both data encryption and key wrapping) has another practical benefit: consistency. The implementation uses a single primitive with well-understood properties and excellent browser support. There is no cognitive overhead of reasoning about two different cryptographic modes.

The SECURITY.md table is a documentation gap; the actual implementation is more robust. This post documents what the code actually does.
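For completeness, the reverse of wrapKek might look something like the sketch below. It is not taken from the codebase: unwrapKek and fromBase64 are hypothetical names, and the input is assumed to be exactly what wrapKek returns, that is, base64 of [12-byte IV][wrapped key + tag].

```typescript
const IV_BYTES = 12;

// Hypothetical inverse of toBase64; atob exists in browsers and modern Node.
function fromBase64(b64: string) {
  return Uint8Array.from(atob(b64), (ch) => ch.charCodeAt(0));
}

// Hypothetical counterpart to wrapKek.
async function unwrapKek(wrappedB64: string, masterKey: CryptoKey): Promise<CryptoKey> {
  const bytes = fromBase64(wrappedB64);
  return globalThis.crypto.subtle.unwrapKey(
    'raw',
    bytes.slice(IV_BYTES), // wrapped key material plus GCM tag
    masterKey,
    { name: 'AES-GCM', iv: bytes.slice(0, IV_BYTES) },
    { name: 'AES-GCM' },
    true, // extractable so the KEK can be re-wrapped after a password change
    ['wrapKey', 'unwrapKey'],
  );
}
```

If the stored value has been modified, or the master key was derived from the wrong password, the GCM tag check fails and the promise rejects instead of returning a bogus key.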

What This Means in Practice

To tie this together, consider a concrete scenario. Suppose RedaktPDF's database and all S3 objects were exfiltrated tomorrow. The attacker obtains:

  • Your email address
  • A bcrypt hash of your password (cost factor 12 — not usable directly)
  • The PBKDF2 salt
  • The wrapped KEK
  • Wrapped FKs for each of your documents
  • Encrypted blobs for every document

To decrypt a single document, the attacker must:

  1. Recover your password, either by cracking the bcrypt hash or by testing guesses against the wrapped KEK (the salt is public by design and does not make guessing meaningfully faster)
  2. Use the cracked password + salt to run PBKDF2 at 600,000 iterations to derive the master key
  3. Use the master key to unwrap the KEK
  4. Use the KEK to unwrap the FK for the target document
  5. Use the FK to decrypt the encrypted blobs

Each step requires the successful completion of the previous step. The PBKDF2 cost is designed to make step 2 take meaningful compute time per guess, making brute-force attacks against strong passwords impractical.
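Steps 2 through 5 of the list above chain together as in the following sketch, which is also what a legitimate client does once it has the correct password. It is a simplified illustration: decryptDocument and split are hypothetical names, and all stored values are assumed to be raw bytes in the IV-prefixed layout described earlier.

```typescript
const IV_BYTES = 12;

// Hypothetical helper: split a stored value into its IV and AES-GCM payload.
function split(blob: Uint8Array) {
  return { iv: blob.slice(0, IV_BYTES), body: blob.slice(IV_BYTES) };
}

// Steps 2 to 5, given a candidate password and the stored (exfiltrated) data.
async function decryptDocument(
  password: string,
  salt: BufferSource,
  wrappedKek: Uint8Array,
  wrappedFileKey: Uint8Array,
  encryptedBlob: Uint8Array,
  iterations = 600_000,
): Promise<Uint8Array> {
  const subtle = globalThis.crypto.subtle;

  // Step 2: password + salt -> master key (the expensive part).
  const material = await subtle.importKey(
    'raw', new TextEncoder().encode(password), 'PBKDF2', false, ['deriveKey']);
  const masterKey = await subtle.deriveKey(
    { name: 'PBKDF2', salt, iterations, hash: 'SHA-256' },
    material,
    { name: 'AES-GCM', length: 256 },
    false,
    ['unwrapKey'],
  );

  // Step 3: master key -> KEK. Rejects here if the password was wrong.
  const k = split(wrappedKek);
  const kek = await subtle.unwrapKey(
    'raw', k.body, masterKey, { name: 'AES-GCM', iv: k.iv },
    { name: 'AES-GCM' }, false, ['unwrapKey']);

  // Step 4: KEK -> file key for the target document.
  const f = split(wrappedFileKey);
  const fileKey = await subtle.unwrapKey(
    'raw', f.body, kek, { name: 'AES-GCM', iv: f.iv },
    { name: 'AES-GCM' }, false, ['decrypt']);

  // Step 5: file key -> plaintext.
  const b = split(encryptedBlob);
  return new Uint8Array(
    await subtle.decrypt({ name: 'AES-GCM', iv: b.iv }, fileKey, b.body),
  );
}
```

A wrong guess fails at step 3, but only after the full PBKDF2 derivation in step 2 has been paid for, which is where the per-guess cost comes from.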

A full breach exposes metadata but not content. That is the practical guarantee of the zero-knowledge architecture. For a broader look at privacy practices in PDF editing, see our PDF privacy guide.
