Fortanix Data Security Manager Data Center Labeling

1.0 Introduction

1.1 Purpose

Welcome to the Fortanix Data Security Manager (DSM) Administration guide. This guide describes how to label Fortanix DSM nodes by data center (DC) when the nodes are spread across multiple locations, using either a manual method or an automated method (a script).

1.2 Overview

DC labeling is required when the Fortanix DSM cluster nodes are spread across multiple locations or DCs. DC labeling increases read resiliency by enabling a local quorum of nodes to serve read requests. It also supports the "Read-Only" mode of operation when the global quorum is lost but a local quorum is still available.
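
As a worked illustration of the difference between a global and a local quorum, the sketch below assumes the usual majority rule (more than half of the replicas) and a hypothetical layout of three DCs holding 2, 2, and 1 replicas. The function name quorum and the replica counts are illustrative assumptions and are not taken from Fortanix DSM.

```shell
#!/usr/bin/env bash
# Majority (quorum) size for a given number of replicas: floor(n / 2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Hypothetical replica counts per data center (assumption, for illustration only).
dc1=2; dc2=2; dc3=1

echo "global quorum: $(quorum $(( dc1 + dc2 + dc3 )))"   # majority of all 5 replicas
echo "local quorum in DC2: $(quorum "$dc2")"              # majority of the 2 replicas in DC2
```

Under these assumptions, if DC1 and DC3 become unreachable, only 2 of the 5 replicas remain, which is below the global quorum of 3, while DC2 still holds its local quorum of 2.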

2.0 DC Labeling - Manual Method

DC labeling on the Fortanix DSM cluster can be performed manually or with a script. The following steps describe the manual method.

2.1 Deployment

  1. Log into any of the nodes and set the KUBECONFIG value using the following command:

    export KUBECONFIG=/etc/kubernetes/admin.conf

  2. Set the data center label using the following command for all the nodes:

    sudo -E kubectl label node <node-name> datacenter=<dcname>

    NOTE

    • <node-name> is usually the hostname of the node. Verify the names of all nodes in the cluster using the command:

      sudo -E kubectl get nodes
    • <dcname> is the label that identifies the data center. This can be any label of your choice.

  3. Next, apply the data center change by updating the Cassandra StatefulSet. To do this, update the "env" section of the Kubernetes cassandra.yaml file by executing the following command:

    sudo -E kubectl edit statefulset cassandra

    The previous command opens the cassandra.yaml file in the Vim editor. Find the "env" section and add the following entry:

    - name: JVM_EXTRA_OPTS
      value: -Dcassandra.ignore_dc=true

    NOTE

    Do not use tabs in the cassandra.yaml file, because YAML does not allow tabs for indentation. Use spaces instead.

  4. Save the cassandra.yaml file and exit the Vim editor. This restarts all the Cassandra pods. After Cassandra rolls out with this change, log into any Cassandra pod and execute the command `nodetool status` to verify that the data center information is available.

    NOTE

    Wait until all Cassandra pods are in the Running state and `nodetool status` reports every Cassandra instance in the correct data center. All Cassandra instances must be in the UN (Up/Normal) state before you proceed.

  5. Run the following command to get the Cassandra pod names:

    sudo -E kubectl get pods -l app=cassandra

    NOTE

    Typically, Cassandra pod names show up in the format: cassandra-0, cassandra-1, cassandra-2, and so on (depending on the number of nodes in the Fortanix DSM cluster).

  6. Log into any of the Cassandra pods (for example, cassandra-0) using the following command:

    sudo -E kubectl exec -it cassandra-0 -- bash

  7. Start the cqlsh prompt by entering the command: cqlsh.

  8. To update the replication strategy in Cassandra, run the following command from the cqlsh prompt:

    ALTER KEYSPACE public WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'DC1': <number of nodes>, 'DC2': <number of nodes>, 'DC3': <number of nodes>};

    The command above tells Cassandra that three data centers are available under the names supplied (DC1, DC2, and DC3 in this example). Replace each <number of nodes> with the number of nodes in that data center.

    NOTE

    DC names are case-sensitive, so enter them exactly as they were used when labeling the nodes. We recommend performing DC labeling with help from Fortanix Support.

  9. Exit the cqlsh prompt by entering the command: EXIT. Also, exit from the cassandra-0 pod by executing the command: exit.

  10. To verify, run `nodetool status` from any Cassandra pod and confirm that all Cassandra instances are in the UN state before proceeding further.

  11. Ensure that all the Fortanix DSM backend pods are healthy using the following command:

    sudo -E kubectl get pods -l app=sdkms
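
The label commands from step 2 and the ALTER KEYSPACE statement from step 8 must agree on the DC names and on the number of nodes per DC. The following is a minimal sketch of a dry-run helper that prints both for review and executes nothing. The function name print_dc_label_plan and the node and DC names in the example call are hypothetical.

```shell
#!/usr/bin/env bash
# Dry run: prints the commands from steps 2 and 8 for review and runs nothing.
# Arguments are NODE=DC pairs. The names in the example call are hypothetical.
print_dc_label_plan() {
  local pair
  [ "$#" -gt 0 ] || { echo "usage: print_dc_label_plan NODE=DC..." >&2; return 1; }
  for pair in "$@"; do
    case $pair in
      ?*=?*) ;;
      *) echo "not a NODE=DC pair: $pair" >&2; return 1 ;;
    esac
    # Step 2: one label command per node.
    echo "sudo -E kubectl label node ${pair%%=*} datacenter=${pair#*=}"
  done
  # Step 8: count the nodes per data center, keeping first-seen order.
  printf '%s\n' "$@" | awk -F= '
    !($2 in count) { order[++n] = $2 }
    { count[$2]++ }
    END {
      printf "ALTER KEYSPACE public WITH REPLICATION = {\047class\047: \047NetworkTopologyStrategy\047"
      for (i = 1; i <= n; i++) printf ", \047%s\047: %d", order[i], count[order[i]]
      print "};"
    }'
}

print_dc_label_plan node1=DC1 node2=DC1 node3=DC2
```

For the example call, the helper prints three label commands followed by an ALTER KEYSPACE statement with 'DC1': 2 and 'DC2': 1. Review the printed commands before running them by hand as described in the steps above.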

3.0 DC Labeling - Automated Method (Using Script)

Overview

The automated method for DC labeling uses a script (cassandra_dc_label.sh), which is available at the following location:
/opt/fortanix/sdkms/bin/cassandra_dc_label.sh

Method

The script provides a menu interface from which you can choose one of the following options. You can also run the same tasks using command-line parameters, as explained in the section Options Using Parameters.

Options Using Menu Interface

  1. CONFIGURE DC LABEL: You are prompted to provide a label for each node and then to confirm the labels. After you confirm, the script labels the nodes, executes an ALTER statement to update Cassandra's replication strategy, and then adds an environment variable to the Cassandra StatefulSet.

  2. INCREMENT REPLICATION FACTOR: This option increases the replication factor by 1 for a given DC label, which is useful when joining nodes. If the cluster has a global quorum, increase the replication factor and then join the node, so that Cassandra streams the data to the joining node. When the join completes, the newly added node is fully replicated.

    NOTE

    If the cluster has no global quorum (especially in a one-node cluster), the script does not increment the replication factor, because the DSM pod enters a crash-loop state. For new clusters, it is recommended to join all nodes using the simple strategy and then configure DC labeling in one step (option 1).

  3. DECREMENT REPLICATION FACTOR: This option decreases the replication factor by 1 for a given DC label, which is useful when removing nodes. After removing a node, run this option to decrease the replication factor so that queries from the DSM backend that require consistency succeed.

  4. RESET TO SIMPLE STRATEGY: This option resets the replication strategy to the simple strategy. It is unlikely to be useful in a production setup and is not recommended.

  5. VIEW DC LABEL INFO: This option displays the existing DC labeling information, all node labels, and the `nodetool status` output. It is useful for troubleshooting cluster issues.
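
To illustrate what a change of one replica means for options 2 and 3, the sketch below rebuilds the ALTER KEYSPACE statement from a current replication map with the count for one DC raised or lowered by 1. How cassandra_dc_label.sh does this internally is not described in this guide, so the function name adjust_replication, its arguments, and the rule of never going below one replica are all assumptions made for this example.

```shell
#!/usr/bin/env bash
# Sketch only: prints the ALTER statement for a change of one replica.
# Usage: adjust_replication <increment|decrement> <DC> <DC=COUNT>...
adjust_replication() {
  local op=$1 target=$2 delta pair dc count found=0
  shift 2
  case $op in
    increment) delta=1 ;;
    decrement) delta=-1 ;;
    *) echo "unknown operation: $op" >&2; return 1 ;;
  esac
  local stmt="ALTER KEYSPACE public WITH REPLICATION = {'class': 'NetworkTopologyStrategy'"
  for pair in "$@"; do
    dc=${pair%%=*}
    count=${pair#*=}
    if [ "$dc" = "$target" ]; then
      found=1
      count=$(( count + delta ))
      # Assumption: never reduce a data center below one replica.
      [ "$count" -ge 1 ] || { echo "refusing to reduce $dc below 1" >&2; return 1; }
    fi
    stmt="$stmt, '$dc': $count"
  done
  [ "$found" -eq 1 ] || { echo "unknown data center: $target" >&2; return 1; }
  echo "$stmt};"
}

# Hypothetical current map: DC1 has 2 replicas, DC2 has 1.
adjust_replication increment DC2 DC1=2 DC2=1
```

The example call prints a statement in which 'DC2' goes from 1 to 2 while 'DC1' stays at 2. Only the named DC changes, which matches the description of options 2 and 3 as acting on a given DC label.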

Options Using Parameters

You can also run the options above using parameters instead of the menu interface. In the commands below, ./script stands for the cassandra_dc_label.sh script.

  1. CONFIGURE DC LABEL: To configure the DC label, use the command: ./script configure <file>.

  2. INCREMENT REPLICATION FACTOR: To increase the replication factor by 1, use the command: ./script increment-replica-count <DC>.

  3. DECREMENT REPLICATION FACTOR: To decrease the replication factor by 1, use the command: ./script decrement-replica-count <DC>.

  4. RESET TO SIMPLE STRATEGY: To reset the strategy to the simple strategy without user confirmation, use the command: ./script reset.

  5. VIEW DC LABEL INFO: To display the DC label information, use the command: ./script view.

NOTE

To configure the DC label using a file, the file must contain one NODENAME=LABEL entry per line.
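
Before passing a label file to the configure option, the NODENAME=LABEL format can be checked with a small helper. This is a minimal sketch, assuming that node names and labels contain no whitespace and no extra "=" characters and that blank lines are acceptable. Whether the script itself tolerates blank lines is not stated in this guide. The function name validate_label_file and the sample entries are hypothetical.

```shell
#!/usr/bin/env bash
# Returns 0 when every non-blank line of the file looks like NODENAME=LABEL.
# Offending lines are printed to stderr, prefixed with their line numbers.
validate_label_file() {
  [ -r "$1" ] || { echo "cannot read file: $1" >&2; return 1; }
  # grep -v selects the lines that are neither blank nor NODENAME=LABEL.
  if grep -nEv '^([^=[:space:]]+=[^=[:space:]]+)?$' "$1" >&2; then
    return 1
  fi
}

# Example with a temporary file. The node and DC names are hypothetical.
tmp=$(mktemp)
printf '%s\n' 'node1=DC1' 'node2=DC2' > "$tmp"
validate_label_file "$tmp" && echo "label file format OK"
rm -f "$tmp"
```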

Execution

./cassandra_dc_label.sh
This Utility modifies Node Labels and Cassandra Strategy, Please use it cautiously!!
Please select the required operation, Please Key in the number to select.

1) CONFIGURE DC LABEL            3) DECREMENT REPLICATION FACTOR  5) VIEW DC LABEL INFO
2) INCREMENT REPLICATION FACTOR  4) RESET TO SIMPLE STRATEGY
? 1

Sun 26 Feb 2023 06:31:54 PM PST :  Executing check_node_readiness function

Sun 26 Feb 2023 06:31:54 PM PST :  All nodes are in Ready state!!

Sun 26 Feb 2023 06:31:54 PM PST :  Completed check_node_readiness function

Sun 26 Feb 2023 06:31:54 PM PST :  Executing check_cas_pod_readiness function

Sun 26 Feb 2023 06:31:54 PM PST :  All cassandra pods are in healthy state!!

Sun 26 Feb 2023 06:31:54 PM PST :  Completed check_cas_pod_readiness function

Sun 26 Feb 2023 06:31:54 PM PST :  Executing check_node_label function
Please Provide the label for srv1-sitlab-dc: FR6

Sun 26 Feb 2023 06:32:10 PM PST :  Completed check_node_label function

Provided node labels list:
-----------------------------------------------

srv1-sitlab-dc=FR6

-----------------------------------------------
Please confirm the provided labels [Y/N]?
Y
Thanks for the confirmation
Successfully labeled srv1-sitlab-dc with label FR6
alter KEYSPACE public WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'FR6': 1}
Cassandra replication strategy was altered successfully

 keyspace_name | durable_writes | replication
---------------+----------------+-------------------------------------------------------------------------------
        public |           True | {'FR6': '1', 'class': 'org.apache.cassandra.locator.NetworkTopologyStrategy'}

(1 rows)
Environment variable is already present, rolling restart cassandra pods!!
statefulset.apps/cassandra patched
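
The run above ends with a rolling restart of the Cassandra pods. As in the manual method, it is reasonable to confirm afterwards that every Cassandra instance reports the UN state in `nodetool status`. The following is a minimal sketch of such a check, assuming the standard `nodetool status` layout in which each node line begins with a two-letter status/state code. The function name all_nodes_un is hypothetical and is not part of the Fortanix script.

```shell
#!/usr/bin/env bash
# Reads `nodetool status` output on stdin. Succeeds only if at least one node
# line is present and every node line starts with UN (Up/Normal).
all_nodes_un() {
  awk '
    # Node lines start with a status letter (U/D) and a state letter (N/L/J/M).
    /^[UD][NLJM][ \t]/ { total++; if ($1 != "UN") bad++ }
    END { exit ((total == 0 || bad > 0) ? 1 : 0) }
  '
}

# Possible usage on a cluster node (not executed here):
#   sudo -E kubectl exec cassandra-0 -- nodetool status | all_nodes_un && echo "all UN"
```

The check treats an empty or unparseable output as a failure, so a pod that is not yet ready to answer `nodetool status` is not mistaken for a healthy cluster.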