Security overview
Last updated: 2026-06-15
collaib is a collaborative document-review platform for engineering teams, with AI agents as first-class, attributed reviewers. This document describes how customer data is protected: where it lives, how tenants are isolated, how humans and agents authenticate, and how to reach us about security issues.
Architecture and data flow
collaib is built on managed infrastructure with a deliberately small operational surface:
- Browser client — React application served from Vercel's CDN. No tenant data is embedded in the static bundle.
- Convex — backend platform holding all structured data: document metadata, review marks and comment threads, approvals, team and organization records, the audit log, and hashed agent credentials. It is also the single durable store for document content (encrypted Yjs blobs in file storage). All business logic runs as Convex functions over TLS.
- Collab server (Fly.io) — self-hosted Hocuspocus v4 server providing real-time document collaboration. It holds ephemeral in-memory Yjs document state while a document is being edited; canonical document content is persisted encrypted in Convex file storage. No tenant data is stored at rest on the Fly machine.
- Better Auth on Convex — authentication is served from the Convex HTTP origin under /api/auth/*. OAuth account and session metadata is stored in Convex; collaib does not store OAuth provider passwords.
| Data class | Where it lives |
|---|---|
| Document body content | Convex file storage (AES-256-GCM encrypted Yjs blob per document) |
| Document metadata, marks/comments, approvals | Convex |
| Organization, team, and membership records | Convex |
| Audit log and agent usage events | Convex |
| Agent API credentials | Convex (SHA-256 hash only) |
| User identities, OAuth accounts, sessions | Convex (Better Auth tables) |
Tenant isolation
collaib is multi-tenant: organizations are the isolation boundary, with teams and documents nested inside them.
- Every tenant-scoped record carries its organization ID, and that ID is always derived from the stored record server-side. It is never trusted from client input.
- Tenant data paths enforce authentication, organization membership, and restricted-team access through a single set of shared authorization helpers; there are no hand-rolled, per-endpoint membership checks to drift.
- Real-time document rooms are created with no default access and can only be joined with a short-lived, room-scoped HS256 JWT issued by our backend after the same organization-membership check that protects the data plane.
- Agent API and MCP requests are checked for cross-tenant access on every call. Organization keys are organization-scoped; personal keys and MCP OAuth actors are also filtered by the signed-in user's restricted-team access.
- Lookups of identifiers the caller cannot access return the same response as identifiers that do not exist, preventing existence probing across tenant boundaries.
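As an illustration of the last two ideas (organization ID derived from the stored record, and identical responses for inaccessible and nonexistent identifiers), here is a minimal sketch. The `Doc` and `Caller` types, the in-memory `docs` map, and `getDocForCaller` are hypothetical stand-ins, not collaib's actual helpers.

```typescript
// Hypothetical in-memory stand-ins for stored records; all names are illustrative.
type Doc = { id: string; orgId: string; title: string };
type Caller = { userId: string; orgIds: Set<string> };

const docs = new Map<string, Doc>([
  ["doc_1", { id: "doc_1", orgId: "org_a", title: "Design review" }],
  ["doc_2", { id: "doc_2", orgId: "org_b", title: "Incident notes" }],
]);

class NotFoundError extends Error {
  constructor() {
    super("Not found");
    this.name = "NotFoundError";
  }
}

// The organization ID is read from the stored record, never from caller input.
// A missing document and a document in another organization fail identically,
// so the response does not reveal whether the identifier exists.
function getDocForCaller(caller: Caller, docId: string): Doc {
  const doc = docs.get(docId);
  if (doc === undefined || !caller.orgIds.has(doc.orgId)) {
    throw new NotFoundError();
  }
  return doc;
}
```

Routing every lookup through one helper of this shape is what keeps the two failure cases indistinguishable; a per-endpoint check could easily return "forbidden" in one place and "not found" in another.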
Authentication and credentials
Human sign-in
Handled by Better Auth, self-hosted on Convex. GitHub OAuth is supported today; Google OAuth support is configured with a corporate-domain login policy. Enterprise SSO is a future plan-tier feature, not an enabled production control today.
Agent API keys
- Generated from 32 bytes of cryptographically secure randomness.
- Stored as a SHA-256 hash only; the plaintext is shown exactly once at creation and cannot be retrieved afterwards.
- Expire after 90 days by default and can be revoked at any time.
- Scoped to read-only or read-write, and bound server-side to one organization and to the identity recorded at creation. Agent identity is never taken from request headers.
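The key lifecycle above can be sketched roughly as follows, assuming a Node-style `node:crypto` module. The `StoredKey` shape, the hex encoding of the plaintext, and the function names are assumptions made for illustration; only the 32 random bytes, the SHA-256 hash, the 90-day default, and revocation come from the description above.

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

const NINETY_DAYS_MS = 90 * 24 * 60 * 60 * 1000;

// Hypothetical stored record: note that there is no plaintext field.
type StoredKey = {
  hash: string; // hex-encoded SHA-256 of the plaintext key
  orgId: string;
  scope: "read" | "read-write";
  expiresAt: number; // epoch milliseconds
  revoked: boolean;
};

function hashKey(plaintext: string): string {
  return createHash("sha256").update(plaintext, "utf8").digest("hex");
}

// Returns the plaintext once; only the record (hash) would be persisted.
function createAgentKey(
  orgId: string,
  scope: "read" | "read-write",
  now: number,
): { plaintext: string; record: StoredKey } {
  const plaintext = randomBytes(32).toString("hex"); // 32 bytes of CSPRNG output
  const record: StoredKey = {
    hash: hashKey(plaintext),
    orgId,
    scope,
    expiresAt: now + NINETY_DAYS_MS,
    revoked: false,
  };
  return { plaintext, record };
}

// Rejects revoked or expired keys, then compares hashes in constant time.
function verifyAgentKey(presented: string, record: StoredKey, now: number): boolean {
  if (record.revoked || now >= record.expiresAt) return false;
  const a = Buffer.from(hashKey(presented), "hex");
  const b = Buffer.from(record.hash, "hex");
  return a.length === b.length && timingSafeEqual(a, b);
}
```

Because the organization and scope live on the stored record, a verified key carries its own authority; nothing about the caller's identity needs to be read from request headers.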
MCP over OAuth
The remote MCP endpoint accepts OAuth sign-in backed by Better Auth, so agents can act with the permissions of the signed-in user. All agent traffic is rate-limited per organization (API keys) or per user (OAuth).
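The per-organization versus per-user keying can be illustrated with a small fixed-window counter. The algorithm, the limits, and every name below are assumptions; this document does not specify which rate-limiting scheme collaib uses.

```typescript
// Hypothetical actor shapes for the two agent authentication paths.
type Actor =
  | { kind: "apiKey"; orgId: string }
  | { kind: "oauth"; userId: string };

// API keys are limited per organization, OAuth actors per user.
function rateLimitKey(actor: Actor): string {
  return actor.kind === "apiKey" ? `org:${actor.orgId}` : `user:${actor.userId}`;
}

// A fixed-window counter, chosen here only because it is short to read.
class FixedWindowLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed at time `now` (epoch milliseconds).
  allow(key: string, now: number): boolean {
    const current = this.windows.get(key);
    if (current === undefined || now - current.start >= this.windowMs) {
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}
```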
Real-time room access
Collab server rooms are never reachable without a valid token.
Access requires a per-room HS256 JWT (10-minute TTL) issued by
our /collab-auth backend endpoint, which validates the
caller's Better Auth identity, organization membership, and
restricted-team access before granting access to exactly one room.
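A minimal sketch of a room-scoped HS256 token with a 10-minute TTL, built directly on `node:crypto` so the mechanics are visible. The claim names (`sub`, `room`), the function names, and the hand-rolled encoding are assumptions; a production deployment would more likely rely on a maintained JWT library, and the membership checks performed before issuance are omitted here.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const ROOM_TOKEN_TTL_SECONDS = 10 * 60;

// Hypothetical claim set: one user, exactly one room.
type RoomClaims = { sub: string; room: string; iat: number; exp: number };

function encodeSegment(value: object): string {
  return Buffer.from(JSON.stringify(value), "utf8").toString("base64url");
}

function sign(signingInput: string, secret: string): Buffer {
  return createHmac("sha256", secret).update(signingInput).digest();
}

// Called only after identity, organization, and team checks have passed.
function issueRoomToken(userId: string, room: string, secret: string, nowSeconds: number): string {
  const header = encodeSegment({ alg: "HS256", typ: "JWT" });
  const claims: RoomClaims = {
    sub: userId,
    room,
    iat: nowSeconds,
    exp: nowSeconds + ROOM_TOKEN_TTL_SECONDS,
  };
  const signingInput = `${header}.${encodeSegment(claims)}`;
  return `${signingInput}.${sign(signingInput, secret).toString("base64url")}`;
}

// Returns the claims only if the signature, algorithm, room, and expiry all check out.
function verifyRoomToken(token: string, room: string, secret: string, nowSeconds: number): RoomClaims | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;
  const expected = sign(`${header}.${payload}`, secret);
  const presented = Buffer.from(signature, "base64url");
  if (presented.length !== expected.length || !timingSafeEqual(presented, expected)) return null;
  const parsedHeader = JSON.parse(Buffer.from(header, "base64url").toString("utf8"));
  if (parsedHeader.alg !== "HS256") return null;
  const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8")) as RoomClaims;
  if (claims.room !== room || nowSeconds >= claims.exp) return null;
  return claims;
}
```

The collab server in this sketch checks the room claim against the room being joined, so a token issued for one document cannot be replayed against another.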
Encryption
In transit
All traffic between the browser, our backend, and our subprocessors is encrypted with TLS. WebSocket connections to the collab server use WSS in production.
At rest (vendor layer)
Convex encrypts all stored data with AES-256, as described in Convex's own security overview.
At rest (application layer)
collaib adds AES-256-GCM encryption on top of vendor storage for
sensitive content. The following fields are encrypted before being
written to Convex, using a 32-byte key (CONTENT_ENCRYPTION_KEY) held only in the Convex environment:
- Document body content (the full Yjs CRDT blob stored in Convex file storage)
- marks.body (review mark content)
- commentReplies.body (comment reply content)
- documentApprovals.note (approval notes)
- documents.pendingInitialMarkdown (buffered initial content before first Yjs sync)
What application-layer encryption defends against: if a Convex backup, export, or storage snapshot is accessed without the encryption key, the ciphertext reveals no plaintext content.
What it does not defend against: a privileged operator who holds CONTENT_ENCRYPTION_KEY can decrypt all content. This is not end-to-end encryption; the server must be able to read document content to serve AI reviewers. The single shared key is not a per-tenant boundary — tenant isolation relies on access control, not key separation. End-to-end encryption remains explicitly out of scope.
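For readers unfamiliar with AES-256-GCM field encryption, the following is a minimal sketch of the general technique using `node:crypto`. The blob layout (12-byte IV, 16-byte tag, then ciphertext), the function names, and the choice of crypto API are assumptions; collaib's actual storage format is not described here.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const IV_BYTES = 12; // recommended nonce size for GCM
const TAG_BYTES = 16;

// Encrypts one field value. Assumed layout: IV || auth tag || ciphertext.
function encryptField(plaintext: string, key: Buffer): Buffer {
  if (key.length !== 32) throw new Error("AES-256 requires a 32-byte key");
  const iv = randomBytes(IV_BYTES); // a fresh IV for every encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
}

// Decrypts a blob produced by encryptField. Throws if the key is wrong
// or the blob was modified, because the GCM tag no longer matches.
function decryptField(blob: Buffer, key: Buffer): string {
  const iv = blob.subarray(0, IV_BYTES);
  const tag = blob.subarray(IV_BYTES, IV_BYTES + TAG_BYTES);
  const ciphertext = blob.subarray(IV_BYTES + TAG_BYTES);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

The sketch also makes the stated limitation concrete: anyone holding the 32-byte key can call the decrypt path, so the key protects exported or snapshotted storage rather than separating tenants.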
Audit logging
All mutations of tenant data write an event to an append-only audit log in the same transaction. The log has insert-only code paths, with no update or delete surface. It currently covers 38 distinct action types across organization, membership, team, document, mark/comment, approval, tag, and agent-credential lifecycles. Separately, every agent API request, allowed or denied, writes a usage event recording the credential, endpoint, and outcome.
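The "insert-only code paths" property can be pictured as a log type that simply exposes no way to change or remove an entry. The sketch below is an in-memory illustration under that assumption; the `AuditEvent` fields and the `document.delete` action name used in the usage note are hypothetical, and the real log lives in Convex rather than in an array.

```typescript
// Hypothetical event shape; entries are immutable once written.
type AuditEvent = Readonly<{
  seq: number;
  action: string;
  orgId: string;
  actorId: string;
  at: number; // epoch milliseconds
}>;

class AuditLog {
  private readonly events: AuditEvent[] = [];

  // The only write path: append a new, frozen event.
  append(entry: Omit<AuditEvent, "seq">): AuditEvent {
    const event: AuditEvent = Object.freeze({ seq: this.events.length + 1, ...entry });
    this.events.push(event);
    return event;
  }

  // Read path returns a copy, so callers cannot reach the backing array.
  list(orgId: string): readonly AuditEvent[] {
    return this.events.filter((event) => event.orgId === orgId);
  }
}
```

A mutation would call something like `log.append({ action: "document.delete", orgId, actorId, at })` alongside its data change, so that both succeed or fail together within the same transaction.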
Super-admin access
A small number of platform operators can access organizations they are not members of, for support purposes:
- The operator list is defined by a server-side allowlist environment variable, not by any in-app role.
- Multi-factor authentication is mandatory for these accounts, and the list is reviewed quarterly.
- Any write access by a super admin to an organization they are not a member of is recorded in the audit log as a dedicated superadmin.org_access event.
- Super-admin accounts are hidden from customer-facing member lists and cannot be granted via the application.
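A rough sketch of how an environment-variable allowlist and the dedicated audit event could fit together. Only the `superadmin.org_access` action name comes from the text above; the comma-separated format, the function names, and the `AuditEntry` shape are assumptions, and the MFA requirement is not modeled.

```typescript
type AuditEntry = { action: string; actorId: string; orgId: string };

// Parses a comma-separated allowlist such as "op_1, op_2" (format assumed).
function parseAllowlist(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((id) => id.trim())
      .filter((id) => id.length > 0),
  );
}

// Members write normally. Non-members are allowed only if allowlisted,
// and every such cross-organization write is recorded in the audit log.
function authorizeWrite(
  actorId: string,
  memberOrgIds: Set<string>,
  targetOrgId: string,
  allowlist: Set<string>,
  audit: AuditEntry[],
): boolean {
  if (memberOrgIds.has(targetOrgId)) return true;
  if (!allowlist.has(actorId)) return false;
  audit.push({ action: "superadmin.org_access", actorId, orgId: targetOrgId });
  return true;
}
```

Because the allowlist is read from server configuration rather than from a stored role, nothing a customer can do inside the application changes who is on it.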
Secrets handling
Server secrets (collab JWT and service keys, content encryption
key, Better Auth secret, OAuth client secrets, notification
secrets, integration encryption keys) live in Convex's encrypted
environment-variable store, with the collab JWT and service keys
mirrored to Fly's encrypted secrets store. They are never embedded
in the client bundle — only variables explicitly prefixed
VITE_ are exposed by the build, and no secret carries that prefix.
Local environment files are excluded from version control.
Policy: any secret that ever appears in version-control history is
rotated, regardless of how quickly it was removed.
Subprocessors
| Subprocessor | Purpose | Compliance posture |
|---|---|---|
| Convex | Backend platform, database, encrypted content storage | SOC 2; HIPAA support available (plan-tier dependent) |
| Fly.io | Collab server compute (transient; no at-rest tenant data) | Serves ephemeral in-memory doc state only; no customer data persisted on the machine |
| Vercel | Frontend hosting and CDN | Serves static assets and edge routing; no tenant database content stored |
| GitHub | OAuth provider, source hosting, CI, optional integration context | Receives OAuth identity data for GitHub sign-in and source repository data when customers connect GitHub integrations |
| Google | OAuth provider | Receives OAuth identity data when Google sign-in is enabled |
| Sentry | Error monitoring (conditional) | Used only when VITE_SENTRY_DSN is configured; event payloads should not include document bodies |
HIPAA-aligned configuration at Convex, and Convex's SIEM-integrated audit export, are enterprise/business-tier vendor offerings and are not enabled on standard plans by default.
Data retention and deletion
- Document deletion is available in-app and is a hard delete for Convex records: the document record and its marks, comment threads, approvals, reviewer assignments, and tags are permanently removed at the time of deletion, and the deletion itself is audit-logged.
- Organization offboarding (full account deletion) is handled via a support request.
- Document deletion removes the document's encrypted Yjs blob from Convex file storage alongside its structured records; canonical content lives only in Convex, so no separate subprocessor room cleanup is required.
- Subprocessor backup and recovery behavior follows each vendor's documented practices.
Vulnerability disclosure
We welcome good-faith security research. Please report vulnerabilities privately by emailing our security contact. We acknowledge reports within 3 business days and target a fix or mitigation within 30 days.