Security architecture
Union.ai’s security architecture is founded on the principle of strict separation between orchestration (control plane) and execution (compute plane). This architectural decision ensures that customer data remains within the customer’s own cloud infrastructure at all times.
Control plane / compute plane separation
The control plane and compute plane serve fundamentally different purposes and handle different types of data:
Control plane (Union.ai hosted)
The control plane is responsible for workflow orchestration, user management, and providing the web interface. It runs within Union.ai’s AWS account and stores only orchestration metadata in a managed PostgreSQL database. This metadata includes task definitions (image references, resource requirements, typed interfaces), run and action metadata (identifiers, phase, timestamps, error information), user identity and RBAC records, cluster configuration and health records, and trigger/schedule definitions. The control plane never stores customer data payloads. It stores only references (URIs) to data in the customer’s object store, no data. When data must be surfaced to a client, the control plane either proxies a signing request to generate a presigned URL or relays a data stream from the compute plane without persisting it.
See comprehensive list of control plane roles and permissions in Appendix C.
Compute plane (customer hosted)
The compute plane runs inside the customer’s own cloud account on their own Kubernetes cluster. All customer data resides here, including:
| Data Type | Storage Technology | Access Pattern |
|---|---|---|
| Task inputs/outputs | Object Store | Read/write by task pods via IAM roles |
| Code bundles (TGZ) | Object Store (fast-registration bucket) | Write via presigned URL; read by task pods and presigned URL by the browser |
| Container images | Container Registry | Built on-cluster; pulled by K8s |
| Task logs | Cloud Log Aggregator + live K8s API | Streamed via tunnel (never stored in CP) |
| Secrets | K8s Secrets, Vault, or Cloud Secrets Manager | Injected into pods at runtime |
| Observability metrics | Prometheus (in-cluster / customer managed) | Proxied queries via DataProxy |
| Reports (HTML) | Object Store (S3/GCS/Azure Blob) | Accessed by the browser via presigned URL |
| Cluster events | K8s API (ephemeral) | Live from K8s API |
See comprehensive list of compute plane roles and permissions in Appendix D.
Network architecture
Network security is enforced through multiple layers:
In BYOC deployments, Union.ai additionally maintains a private management connection to the customer’s K8s cluster. See BYOC deployment differences: Network architecture for details.
Cloudflare tunnel (outbound-only)
The compute plane connects to the control plane via a Cloudflare Tunnel—an outbound-only encrypted connection initiated from the customer’s cluster. This architecture provides several critical security benefits:
- No inbound firewall rules are required on the customer’s network
- All traffic through the tunnel uses mutual TLS (mTLS) encryption
- The Tunnel Service performs periodic health checks and state reconciliation
- Connection is initiated outward to Cloudflare’s edge network, from the compute plane, which then connects to the control plane
Control plane tunnel (outbound only)
The compute plane reaches out to the control plane to establish a bidirectional, encrypted and authenticated, outbound-only tunnel. Union.ai operates regional control plane endpoints:
| Area | Region | Endpoint |
|---|---|---|
| US | us-east-2 | hosted.unionai.cloud |
| US | us-west-2 | us-west-2.unionai.cloud |
| Europe | eu-west-1 | eu-west-1.unionai.cloud |
| Europe | eu-west-2 | eu-west-2.unionai.cloud |
| Europe | eu-central-1 | eu-central-1.unionai.cloud |
In locked-down environments, networking teams can limit egress access to published Cloudflare CIDR blocks, and further restrict to specific regions in coordination with the Union networking team.
Communication paths
| Communication Path | Protocol | Encryption |
|---|---|---|
| Client → Control Plane | ConnectRPC (gRPC-Web) over HTTPS | TLS 1.2+ |
| Control Plane ↔ Compute Plane | Cloudflare Tunnel (outbound-initiated) | mTLS |
| Client → Object Store (presigned URL) | HTTPS | TLS 1.2+ (cloud provider enforced) |
| Fluent Bit → Log Aggregator | Cloud provider SDK | TLS (cloud-native) |
| Task Pods → Object Store | Cloud provider SDK | TLS (cloud-native) |
BYOC deployments add a PrivateLink/PSC management path between Union.ai and the customer’s K8s API. See BYOC deployment differences: Network architecture.
Data flow architecture
Union.ai implements two primary data access patterns, both designed to keep customer data out of the control plane:
Presigned URL pattern
For task inputs, outputs, code bundles, and reports, the control plane proxies signing requests to the compute plane, which generates time-limited presigned URLs using customer-managed credentials. The client fetches data directly from the customer’s object store—the data never transits the control plane. Presigned URLs generated on the compute plane are single-object scope, operation-specific (GET or PUT), time-limited (default 1 hour maximum), and transport-encrypted at every hop.
Union.ai applies several controls:
- TTL enforcement — URLs expire after a configurable window (default 1 hour, configurable shorter)
- Single-object scope — each URL grants access to exactly one object, not a bucket or prefix
- Operation specificity — each URL is locked to a single operation (GET or PUT)
- Transport encryption — URLs are transmitted only over TLS-encrypted channels
- No URL logging — presigned URLs are not persisted in control plane logs or databases
Organizations with stricter requirements can configure shorter TTLs. The presigned URL model was chosen because it eliminates the need for the control plane to hold persistent cloud IAM credentials, which would represent a larger and more persistent attack surface than time-limited bearer URLs.
Streaming relay pattern
For logs and observability metrics, the control plane acts as a stateless relay—streaming data from the compute plane through the Cloudflare tunnel to the client in real time. The data passes through the control plane’s memory as a TLS encrypted stream with a termination point in the cloud. It is never written to disk, cached, or stored.