Virtual IP and Load Balancing
A virtual IP address is assigned to a Fortanix Data Security Manager (DSM) cluster and can be held by any healthy node in the cluster at any time. This is handled by the Keepalived service, which runs on every node in the cluster. This floating IP address should be given a name that the organization’s DNS server can resolve. All nodes in a Fortanix DSM cluster use the Kubernetes load balancer, which forwards incoming requests to the individual Fortanix DSM cluster nodes in round-robin order.
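The round-robin forwarding described above can be illustrated with a minimal sketch. This is not the Kubernetes load balancer implementation; the node names and the `is_healthy` callback are hypothetical and exist only for illustration.

```python
from itertools import cycle
from typing import Callable, Iterator, Sequence


def round_robin(nodes: Sequence[str], is_healthy: Callable[[str], bool]) -> Iterator[str]:
    """Yield healthy nodes in rotation, skipping nodes that are unhealthy."""
    rotation = cycle(nodes)
    while True:
        # Try each node at most once per pick before giving up.
        for _ in range(len(nodes)):
            node = next(rotation)
            if is_healthy(node):
                yield node
                break
        else:
            raise RuntimeError("no healthy nodes available")


# Hypothetical three-node cluster in which node-2 is currently down.
down = {"node-2"}
picker = round_robin(["node-1", "node-2", "node-3"], lambda node: node not in down)
print([next(picker) for _ in range(4)])  # ['node-1', 'node-3', 'node-1', 'node-3']
```

Requests keep rotating over the remaining healthy nodes, which is the behavior a health-check-aware load balancer is expected to provide.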
Fortanix DSM can be deployed with the following types of load balancer:
Built-in Load Balancer
This is the default option, in which a single master node receives and forwards all traffic. It supports automatic master failover. All nodes in this setup must be on the same subnet.
Figure 1: Built-in load balancer
L3 Load Balancer (ECMP Routing)
In this setup, the load balancer is integrated into the network infrastructure and requires stateful routers. It supports automatic failover between multiple sites (different subnets).
Figure 2: L3 (ECMP Routing)
L4 Load Balancer
This setup requires dedicated load balancer equipment. It integrates with Fortanix DSM node health checks. Depending on the capabilities of the load balancer, it supports automatic failover and multi-site deployments.
Figure 3: L4 Load Balancer
L4 Load Balancer Setup
The suggested external load balancer setup is as follows:
- Only one DNS name needs to be configured for Fortanix DSM related services.
- Ports 443, 4445, and 5696 need to be open on the load balancer for the same DNS name. Refer to Fortanix Data Security Manager Port Requirements for details about these ports.
The following health checks are suggested:
- Port 443: HTTPS health check on https://<DNS>/sys/v1/health, expecting HTTP status code 204.
- Port 4445: HTTPS health check on https://<DNS>:4445/health, expecting HTTP status code 204.
- Port 5696: TCP half-open check on port 5696 of the same DNS name. Example command for the health check (netcat takes a host name, not a URL):
$ netcat -v sdkms.onprem-dns.your-company.com 5696
Connection to sdkms.onprem-dns.your-company.com 5696 port [tcp/*] succeeded
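The suggested health checks can also be scripted. The following is a minimal sketch using only the Python standard library; the function names and timeout values are assumptions made for illustration, and the TCP check performs a full connect rather than a true half-open (SYN-only) probe, which usually requires raw sockets or the load balancer's own tooling.

```python
import socket
import urllib.error
import urllib.request


def http_health_ok(url: str, expected_status: int = 204, timeout: float = 3.0) -> bool:
    """Return True if a GET request to `url` answers with the expected status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == expected_status
    except (urllib.error.URLError, OSError):
        # DNS failures, refused connections, timeouts, TLS errors, HTTP error statuses.
        return False


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_dsm_endpoint(dns_name: str) -> dict:
    """Run the three suggested checks against one DNS name (hypothetical helper)."""
    return {
        443: http_health_ok(f"https://{dns_name}/sys/v1/health"),
        4445: http_health_ok(f"https://{dns_name}:4445/health"),
        5696: tcp_port_open(dns_name, 5696),
    }
```

For example, `check_dsm_endpoint("sdkms.onprem-dns.your-company.com")` returns a dictionary that maps each port to `True` or `False`, which can be fed into whatever alerting is already in place.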
For high availability, a round-robin method is used.
Each Fortanix DSM node includes the following components:
- Database: Fortanix DSM uses the Cassandra database. This is a distributed database with an instance running on every node in the cluster. The database uses a consistency protocol that requires a quorum of nodes to be available: floor(n/2)+1 for a cluster of size n. This defines a minimum cluster size of n=3 nodes, the smallest cluster that can lose one node and still keep a quorum.
- Kubernetes: The Fortanix DSM application runs inside a container, which is executed and orchestrated using Kubernetes. Part of the initial setup of nodes involves setting up a Kubernetes cluster and making the nodes part of that cluster. Provisioning of Fortanix DSM and software upgrades are done using Kubernetes commands thereafter.
- Intel SGX: The entire Fortanix DSM software runs inside an Intel® SGX enclave. This requires the Intel® SGX driver to be installed on the host, and the Intel® SGX platform software (PSW) to be installed in the Fortanix DSM container.
- Fortanix DSM Container: The Fortanix DSM application runs inside a Docker container. This includes several critical components of the application, such as TLS termination, the web server that parses REST API requests, the crypto library, the key management logic, and the database driver.
- Fortanix DSM Web UI: The Fortanix DSM UI is served through an Nginx web server running inside the Fortanix DSM UI container.
- Monitoring Software: Every Fortanix DSM node runs a monitoring software agent. A system administrator can configure alerts in the monitoring server for metrics such as CPU load, temperature, and network traffic.
Figure 4 shows the architecture of a single Fortanix Data Security Manager KMS node:
Figure 4: Architecture of a single Fortanix Data Security Manager node
Fortanix DSM is typically installed in a high availability architecture. Because of the data consistency scheme it uses, Fortanix DSM remains fully operational only while a majority of the nodes in the cluster are available.
This means that for a:
- 3 node cluster, at least 2 nodes should be available.
- 5 node cluster, at least 3 nodes should be available.
The formula for the minimum number of available nodes is
floor(n/2)+1, where n is the number of nodes in the cluster.
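The quorum arithmetic can be checked with a short sketch. The function names are illustrative only; the integer division implements the rounding down that the formula implies for odd cluster sizes.

```python
def quorum_size(n: int) -> int:
    """Minimum number of available nodes for a cluster of n nodes."""
    if n < 1:
        raise ValueError("cluster size must be at least 1")
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Number of nodes that can be unavailable while a quorum still exists."""
    return n - quorum_size(n)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# 3 2 1
# 4 3 1
# 5 3 2
```

Note that a 4 node cluster tolerates no more failures than a 3 node cluster, which is why odd cluster sizes are the usual choice for quorum-based systems.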
For the same reason, for a multi-data-center deployment, Fortanix suggests using at least three data centers, so that the service remains available if a single data center fails.
Figure 5: Three equal sites
Figure 6: Two equal sites + 1
A Fortanix DSM cluster is fully operational when there is a global quorum, which means that a majority of the nodes in the cluster (across all data centers) are available.
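The effect of site layout on the global quorum can be sketched as follows. The per-site node counts are hypothetical, since the figures do not fix them; the check simply removes one whole site at a time and tests whether a majority of all nodes remains.

```python
from typing import Mapping


def has_global_quorum(available: int, total: int) -> bool:
    """True if `available` nodes form a majority of `total` nodes."""
    return available >= total // 2 + 1


def survives_single_site_failure(sites: Mapping[str, int]) -> bool:
    """True if the cluster keeps a global quorum when any one site is lost."""
    total = sum(sites.values())
    return all(has_global_quorum(total - nodes, total) for nodes in sites.values())


# Hypothetical layouts: node counts per data center are assumptions.
three_equal_sites = {"dc1": 1, "dc2": 1, "dc3": 1}
two_equal_sites_plus_one = {"dc1": 2, "dc2": 2, "dc3": 1}
two_sites_only = {"dc1": 2, "dc2": 1}

print(survives_single_site_failure(three_equal_sites))         # True
print(survives_single_site_failure(two_equal_sites_plus_one))  # True
print(survives_single_site_failure(two_sites_only))            # False
```

With only two sites, losing the larger site always removes the majority, which is consistent with the suggestion to use at least three data centers.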