Fortanix Data Security Manager High Availability Concepts

Virtual IP and Load Balancing

A virtual IP address is assigned to a Fortanix Data Security Manager (DSM) cluster and can be held by any healthy node in the cluster at any time. This is done using the Keepalived service running on all nodes in the cluster. The organization's DNS server should resolve a well-known name to this floating IP address. All nodes in a Fortanix DSM cluster use the Kubernetes load balancer, which forwards incoming requests to the individual Fortanix DSM cluster nodes in a round-robin manner.
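As a rough sketch, a minimal Keepalived VRRP configuration for such a floating IP might look like the following. The interface name, router ID, priority, and IP address are placeholders for illustration, not Fortanix-supplied values:

```
vrrp_instance DSM_VIP {
    state BACKUP            # all nodes start as BACKUP; VRRP elects a master
    interface eth0          # NIC on the shared subnet (placeholder)
    virtual_router_id 51    # must match on all cluster nodes (placeholder)
    priority 100            # higher priority wins the master election
    advert_int 1            # VRRP advertisement interval, in seconds
    virtual_ipaddress {
        10.0.0.100/24       # the cluster's floating virtual IP (placeholder)
    }
}
```

If the node holding the virtual IP fails, VRRP elects a new master among the remaining healthy nodes and the IP moves with it.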

The Fortanix DSM built-in load balancer can be of the following types:

Built-in Load Balancer

This is the default option: a single master node receives and forwards all traffic. It supports automatic master failover. All nodes in this setup must share the same subnet.

BuiltIn_Load_Balancer.png
Figure 1: Built-in load balancer

L3 Load Balancer (ECMP Routing)

In this setup, load balancing is integrated into the network infrastructure and requires stateful routers. It supports automatic failover between multiple sites (different subnets).

ECMP_routing.png
Figure 2: L3 (ECMP Routing)
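As a rough sketch of the idea (not a Fortanix-documented procedure), an upstream Linux router could spread traffic for the service address across several DSM nodes with an equal-cost multipath route; all addresses below are placeholders:

```
# Route the DSM service address via three equal-cost next hops (placeholder IPs).
# The router hashes each flow onto one of the next hops.
ip route add 203.0.113.10/32 \
    nexthop via 10.0.1.11 weight 1 \
    nexthop via 10.0.2.11 weight 1 \
    nexthop via 10.0.3.11 weight 1
```

In a real deployment the routes would typically be advertised dynamically (for example via BGP) so that a failed node's next hop is withdrawn automatically.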

L4 Load Balancer

This setup requires dedicated load balancer equipment. It integrates with Fortanix DSM node health checks. Depending on the load balancer's capabilities, it supports automatic failover and multi-site deployments.

L4LoadBalancer.png
Figure 3: L4 Load Balancer

L4 Load Balancer Setup

The suggested external load balancer setup is as follows:

  1. Only one DNS name needs to be set for Fortanix DSM related services.
  2. Ports 443, 4445, and 5696 need to be open on the load balancer for that same DNS name. Refer to Fortanix Data Security Manager Port Requirements for details about these ports.

The following health checks are suggested:

  1. Port 443: HTTPS health check on https://<DNS>/sys/v1/health with an expected 204 HTTP status code.
  2. Port 4445: HTTPS health check on https://<DNS>:4445/health with an expected 204 HTTP status code.
  3. Port 5696: TCP half-open check on the DNS name, port 5696. Example command for the health check:
    $ netcat -vz sdkms.onprem-dns.your-company.com 5696

    Connection to sdkms.onprem-dns.your-company.com 5696 port [tcp/*] succeeded!
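To make the setup above concrete, the following is a minimal sketch of how it might be expressed in an HAProxy configuration. The node addresses are placeholders, and HAProxy is just one example of suitable L4 load balancer software, not a Fortanix requirement:

```
# Port 443: round-robin to the nodes; health check expects 204 from /sys/v1/health.
# TCP mode so the DSM nodes terminate TLS themselves; check-ssl makes the
# health probe itself use TLS.
listen dsm_api
    bind *:443
    mode tcp
    balance roundrobin
    option httpchk GET /sys/v1/health
    http-check expect status 204
    server node1 10.0.0.11:443 check check-ssl verify none
    server node2 10.0.0.12:443 check check-ssl verify none
    server node3 10.0.0.13:443 check check-ssl verify none

# Port 4445: same pattern, health check expects 204 from /health.
listen dsm_4445
    bind *:4445
    mode tcp
    balance roundrobin
    option httpchk GET /health
    http-check expect status 204
    server node1 10.0.0.11:4445 check check-ssl verify none
    server node2 10.0.0.12:4445 check check-ssl verify none
    server node3 10.0.0.13:4445 check check-ssl verify none

# Port 5696 (KMIP): a plain TCP connect serves as the half-open check.
listen dsm_kmip
    bind *:5696
    mode tcp
    balance roundrobin
    option tcp-check
    server node1 10.0.0.11:5696 check
    server node2 10.0.0.12:5696 check
    server node3 10.0.0.13:5696 check
```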

For high availability, a round-robin method is used.

NOTE
For F5 LTM load balancers, make sure the “OneConnect” feature is disabled.
Fortanix DSM Node Components

  1. Database: Fortanix DSM uses the Cassandra database. This is a distributed database with an instance running on every node in the cluster. The database uses a consistency protocol that requires a quorum of nodes to be available: (n/2)+1 (integer division) for a cluster of size n. This defines a minimum cluster size of n=3 nodes.
  2. Kubernetes: The Fortanix DSM application runs inside a container, which is executed and orchestrated using Kubernetes. Part of the initial setup of nodes involves setting up a Kubernetes cluster and making the nodes part of that cluster. Provisioning of Fortanix DSM and software upgrades are done using Kubernetes commands thereafter.
  3. Intel SGX: The entire Fortanix DSM software runs inside an Intel® SGX enclave. This requires the Intel® SGX driver to be installed on the host, and the Intel® SGX platform software (PSW) to be installed in the Fortanix DSM container.
  4. Fortanix DSM Container: The Fortanix DSM application runs inside a Docker container. This includes several critical components of the application, such as TLS termination, the webserver that parses REST APIs, the crypto library, key management logic, and the database driver.
  5. Fortanix DSM Web UI: The Fortanix DSM UI is served through an Nginx web server running inside the Fortanix DSM UI container.
  6. Monitoring Software: Every Fortanix DSM node runs a monitoring agent. In the monitoring server, a system administrator can set alerts for metrics such as CPU load, temperature, and network traffic.

Figure 4 shows the architecture of a single Fortanix Data Security Manager node:

DSM_BlockDiagram.png
Figure 4: Architecture of a single Fortanix Data Security Manager node

High Availability

Fortanix DSM is typically installed in a high availability architecture. Due to the data consistency scheme used, Fortanix DSM requires a majority of the nodes in the cluster to be available in order to remain fully operational.

This means that for a:

  • 3-node cluster, at least 2 nodes must be available.
  • 5-node cluster, at least 3 nodes must be available.

The formula for the above is (n/2)+1 (integer division), where n is the number of nodes in the cluster.
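The quorum rule can be expressed as a small sketch (the function name is illustrative, not part of any Fortanix API):

```python
def quorum(n: int) -> int:
    """Minimum number of available nodes for a cluster of size n.

    Integer division implements the (n/2)+1 majority rule.
    """
    return n // 2 + 1

# A cluster tolerates n - quorum(n) node failures while staying operational.
for n in (3, 5, 7):
    print(f"{n}-node cluster: quorum {quorum(n)}, tolerates {n - quorum(n)} failure(s)")
```

This also shows why even cluster sizes add little: a 4-node cluster still tolerates only one failure, the same as a 3-node cluster.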

For the same reason, in a multi-datacenter deployment, Fortanix suggests using at least three data centers to ensure service availability in case of a failure of a single data center.

Suggested architectures:  

HA_Sites.png
Figure 5: Three equal sites

HA_Sites1.png
Figure 6: Two equal sites + 1

A Fortanix DSM cluster is fully operational when there is a global quorum, that is, when a majority of the nodes in the cluster (across all data centers) are available.
