Using Data Security Manager with Snowflake

1.0 Introduction

This document describes the Snowflake setup of External Functions to invoke the Fortanix-Data-Security-Manager (DSM) Plugin for tokenization and detokenization using an API Gateway proxy service.

At the time of creation, Snowflake External Functions must specify which Fortanix DSM keys will be used for tokenizing or detokenizing different columns. Thereby, the API Gateway remains stateless and passes through the request from Snowflake to the Fortanix DSM plugin by only requiring mapping certain headers.

We will use the "headers” property of CREATE EXTERNAL FUNCTION to pass key name information. The header name and value should be as follows:

  • Header name: This will be named “key-names”

  • Header value: This will be a string of comma-separated key names. The first entry corresponds to the first column in the data and so on.

Key with the key names specified in the header must already exist. Fortanix DSM plugin will not create the key. It will check and verify if the key exists. Fortanix DSM plugin will also verify that the number of key names passed in the header matches the number of columns in the data sent by Snowflake. If there are fewer keys than the columns, then the operation will fail.

2.0 Prerequisites

  • Fortanix DSM version 3.27 or later is installed and operational.

3.0 Setting up the Fortanix DSM

  1. Sign up at https://smartkey.io/. This opens DSM SaaS for the AMER region. DSM SaaS supports multiple regions, as listed here.

  2. Log in to the Fortanix DSM user interface (UI).

  3. Click the Integrations tab in the left panel.

  4. On the Integrations page, click ADD INSTANCE on the Snowflake wizard. 

    Snowflake_Integration_Tab.png

    Figure1: Add Snowflake Instance

  5. Enter the details as shown in the screenshot below:

    1. Add Instance: This is the name to identify the instance created.

    2. In the Create tokenization keys section:

      1. Select the Data type applicable to you. Depending on the type of data the users want to protect, they can create security objects belonging to any of the four tokenizer data type groups.

      2. Multiple token objects can be created as per your needs. 

        Save_Instance.png

        Figure 2: Save Instance

        NOTE

        Each Security-object will need to have a masking policy enabled during creation time, so that the Fortanix DSM App can perform a partial reveal or dynamic masking during de-tokenization. Masking policy cannot be enabled on the object afterward and hence it needs to be premeditated for the creation. This may be subject to change in the future if the key can be copied (replicated) with a masking policy enabled during the replication.

  6. Click SAVE INSTANCE.

  7. Note down the API Key from the instance details. You will need the API Key when setting your API gateway. To copy the API Key, click COPY API KEY.  

    Instance_Details.png

    Figure 3: Instance Details

  8. You can view all the instances by clicking the View All on the integration wizard.  

    VIEW_ALL.png

    Figure 4: View All

    List_of_Instances.png

    Figure 5: List of All Instances

4.0 Setting up AWS API Gateway Proxy Service

4.1 Create and Test AWS API Gateway

Create and test AWS API Gateway with the following configuration:

  1. REST API Endpoints that may be public or private:

  2. Two resources for /tokenize and /detokenize.

    NOTE

    POST method is required for both resources.

    For each resource method, configure Integration Request with the following configuration:

    • Integration type: HTTP

    • Endpoint URL: Point it to your Plugin URL

      https://<Your_DSM_Service_URL>/sys/v1/plugins/<<PLUGIN_UUID>>

      NOTE

      Change the Fortanix DSM fully qualified domain name (FQDN) and the Plugin UUID.

      • The Fortanix DSM FQDN (Your_DSM_Service_URL) should be the same as selected in Section 3.0: Setting up the Fortanix Data Security Manager, Step 1.

      • The PLUGIN_UUID refers to the UUID of the Snowflake plugin, which was created on the Fortanix DSM Plugins page. Navigate to the plugin and the UUID can be found at the top of the page.

    • Content handling: Passthrough

    • HTTP Headers – Add the “Authorization” header and leave the value empty as it’ll be sent by Snowflake through an External Function custom header.

    AWSAPIGateway1.png

    Figure 6: AWS API Gateway

  3. Click “Add mapping template” and do the following:

    1. Select Content-Type as “application/json”.

    2. Select Request body passthrough as “Never”.

    3. Set the value of the template as follows:

      set($apikey = "Basic $input.params('sf-custom-api-key')")
      set($context.requestOverride.header.Authorization = $apikey)
      set($context.requestOverride.header.sf-custom-api-key = "")
      set($context.requestOverride.header.sf-custom-key-names = "")
      set($inputRoot = $input.path('$'))
      {
      "op": "encrypt OR decrypt",
      "keys":"$input.params('sf-custom-key-names')",
      "data": $input.json('$.data')
      }
      

      NOTE

      • Snowflake External Function will send the FORTANIX_DSM_API_KEY in its custom header, which will need to be mapped to a standard Basic Authentication header. Similarly, the Fortanix DSM tokenization key names will be mapped from another Snowflake custom header to the integration request body.

      • Change the “op” to “encrypt” for /tokenize, and “decrypt” for /detokenize resources.

  4. Test AWS API Gateway using the following input:

    1. Resource: /tokenize

    2. Query String: none or leave blank.

    3. Headers:

      Accept:application/json
      sf-custom-api-key:<<FORTANIX_DSM_API_KEY>>
      sf-custom-key-names:<<KEY_NAME_1,KEY_NAME_2,KEY_NAME_3,KEY_NAME_X,,,>>
      

      NOTE

      • Where, KEY_NAME_1, KEY_NAME_2, KEY_NAME_3..and so on are the tokenization keys created in DSM using the easy wizard

      • FORTANIX_DSM_API_KEY needs to correspond to the “Data Protection” or “Data Analysis” app so that it can /tokenize or /detokenize through the Plugin accordingly.

    4. Request Body:

      {"data": [
           [0, "SUSAN"],
           [1, "SEAN"],
           [2, "SAMUEL"],
           [3, "HOLEE"]
      ]}

    NOTE

    Adapt the data based on the data types or columns being tested. Also, test the /detokenize resource with the same Headers, but different Request Body.

    TestAPI.png

    Figure 7: Test AWS API Gateway

4.2 Finalize the API Gateway Integration

Finalize the API Gateway integration by:

  1. Creating an AWS IAM Role that Snowflake will assume for execution.

  2. Deploying the proxy service on a demo stage and note the public or private URI:

    https://<<XYZ.execute-api.us-east-2>>.amazonaws.com/<<DEMO_STAGE>>/

Where, DEMO_STAGE is an environment in which you can deploy your API. For example, you can create development or production stages. You can configure different settings for each stage of your API. For changes to take effect, you must first deploy your API.

Stage-snowflake.png

Figure 8: Stages

5.0 Setup the Snowflake API Integration and External Functions

When creating External Functions, first determine the number of parameters/columns that need to be processed, and the precise data types of such columns.

Fortanix recommends creating separate External Functions for tokenization and detokenization based on the number and type of columns. External Functions can call API Gateway integration endpoints based on tokenization and detokenization as explained previously and as shown below.

  1. Configure the AWS Gateway API Integration deployment stage and IAM Role in Snowflake.

    CREATE OR REPLACE API INTEGRATION fortanix_plugin
       API_PROVIDER = aws_api_gateway
       API_AWS_ROLE_ARN = 'arn:aws:iam::<<AWS_ACCOUNT_ID>>:role/service-role/<<SNOWFLAKE-ROLE-ID>>'
       API_ALLOWED_PREFIXES = ('https://<<XYZ.execute-api.us-east-2>>.amazonaws.com/<<DEMO_STAGE>>/')
       enabled=true;
    
    DESCRIBE INTEGRATION fortanix_plugin;
    

    NOTE

    Substitute with the correct identifiers above. Where,

  2. Create External Functions for single columns corresponding to SSN and Credit Card number for instance.

    -- Single column tokenization: SSN
    DROP FUNCTION dsm_tokenize_ssn(varchar);
    
    CREATE SECURE EXTERNAL FUNCTION dsm_tokenize_ssn(fname varchar)
      RETURNS variant
      IMMUTABLE
      API_INTEGRATION = fortanix_plugin
      HEADERS = (
          'api-key'='<<FORTANIX_DSM_DATA_PROTECTION_API_KEY>>',
          'key-names'='<<SSN_Key_Name>>'
      )
      AS 'https://<<XYZ.execute-api.us-east2>>.amazonaws.com/<<DEMO_STAGE>>/tokenize';
    
    select dsm_tokenize_ssn('123-45-6789');
    
    -- Single column tokenization: Credit Card Number
    DROP FUNCTION dsm_tokenize_ccn(varchar);
    
    CREATE SECURE EXTERNAL FUNCTION dsm_tokenize_ccn(fname varchar)
      RETURNS variant
      IMMUTABLE
      API_INTEGRATION = fortanix_plugin
      HEADERS = (
          'api-key'='<<FORTANIX_DSM_DATA_PROTECTION_API_KEY>>',
          'key-names'='<<CCN_Key_Name>>'
    )
    AS 'https://<<XYZ.execute-api.us-east2>>.amazonaws.com/<<DEMO_STAGE>>/tokenize';
    
    select dsm_tokenize_ssn('4123456789012340');
  3. Setup a test table and insert data using tokenization External Functions:

    CREATE or REPLACE TABLE test_table (
         id number autoincrement start 1 increment 1,
         fname varchar,
         ssn varchar,
         addr varchar,
         ccn varchar
    );
    
    insert into test_table (id, fname, ssn, addr, ccn) select 1, 'Franky Hou', dsm_tokenize_ssn('001-02-0001')[0]::text, '1 Infinity Loop', dsm_tokenize_ccn('1234123412341234')[0]::text;
    insert into test_table (id, fname, ssn, addr, ccn) select 2, 'Joan Lucas', dsm_tokenize_ssn('001-02-0002')[0]::text, '918 Batman Drive', dsm_tokenize_ccn('9876987698769876')[0]::text;
    insert into test_table (id, fname, ssn, addr, ccn) select 3, 'James Woods', dsm_tokenize_ssn('001-02-0003')[0]::text, '482 Woody Ave', dsm_tokenize_ccn('1849372849384723')[0]::text;
    insert into test_table (id, fname, ssn, addr, ccn) select 4, 'John Wick', dsm_tokenize_ssn('001-02-0004')[0]::text, '711 Nulla St', dsm_tokenize_ccn('3789020817673718')[0]::text;
  4. Create an External Function for de-tokenization of a single column.

    -- single column de-tokenization
    DROP FUNCTION dsm_detokenize_ssn(varchar);
    
    
    CREATE EXTERNAL SECURE FUNCTION dsm_detokenize_ssn(ssn varchar)
      RETURNS variant
      IMMUTABLE
      API_INTEGRATION = fortanix_plugin
      HEADERS = (
         'api-key'='<<FORTANIX_DSM_DATA_PROTECTION_API_KEY>>',
         'key-names'='<<SSN_Key>>'
    )
    
    AS 'https://<<XYZ.execute-api.us-east-2>>.amazonaws.com/<<DEMO_STAGE>>/detokenize';
    
    select dsm_detokenize_ssn('806-30-1382');
    select ssn, dsm_detokenize_ssn(ssn)[0]::text from test_table;
  5. Create External Functions for multi-columns, say corresponding to First Name, SSN, and Credit Card number.

    -- multi-column tokenization
    DROP FUNCTION dsm_tokenize_multi(varchar, varchar, varchar);
    
    CREATE EXTERNAL SECURE FUNCTION dsm_tokenize_multi(fname varchar, ssn varchar, ccn varchar)
      RETURNS variant
      IMMUTABLE
      API_INTEGRATION = fortanix_plugin
      HEADERS = (
          'api-key'='<<FORTANIX_DSM_DATA_PROTECTION_API_KEY>>',
          'key-names'='<<NAME_key,SSN_Key,CCN_Key>>'
    )
    AS 'https://<<XYZ.execute-api.us-east-2>>.amazonaws.com/<<DEMO_STAGE>>/tokenize';
    
    select dsm_tokenize_multi('JOHN','123-01-1234','3935252016295727');
    
    insert into test_table (id, fname, ssn, addr, ccn) select 4, 
    dsm_tokenize_multi('Maggie','029-30-2901','5015254012895431')[0]::text, 
    dsm_tokenize_multi('Maggie','029-30-2901','5015254012895431')[1]::text, 
    dsm_tokenize_multi('Maggie','029-30-2901','5015254012895431')[2]::text
    
    -- copy from one table to another and tokenize in place
    CREATE or REPLACE TABLE stage_table (
        id number autoincrement start 1 increment 1,
        fname varchar,
        ssn varchar,
        addr varchar,
        ccn varchar
    );
    
    insert into stage_table (id, fname, ssn, addr, ccn) select src.id,
    dsm_tokenize(src.fname, src.ssn, src.ccn)[0]::text, 
    dsm_tokenize(src.fname, src.ssn, src.ccn)[1]::text,
    src.addr,
    dsm_tokenize(src.fname, src.ssn, src.ccn)[2]::text from test_table as src;
    
    -- multi-column de-tokenization
    DROP FUNCTION dsm_detokenize_multi(varchar, varchar, varchar);
    
    CREATE EXTERNAL SECURE FUNCTION dsm_detokenize_ multi(fname varchar, ssn varchar, ccn varchar)
      RETURNS variant
      IMMUTABLE
      API_INTEGRATION = fortanix_plugin
      HEADERS = (
          'api-key'='<<FORTANIX_DSM_DATA_PROTECTION_API_KEY>>',
          'key-names'='<<NAME_Key,SSN_Key,CCN_Key>>'
    )
    AS 'https://<<XYZ.execute-api.us-east-2>>.amazonaws.com/<<DEMO_STAGE>>/detokenize';
    
    select dsm_detokenize_ multi('A7G2','806-30-1382','6100938584236546');
    
    select fname, ssn, ccn, 
         dsm_detokenize_ multi(fname,ssn,ccn)[0]::text, 
         dsm_detokenize_ multi(fname,ssn,ccn)[1]::text, 
         dsm_detokenize_ multi(fname,ssn,ccn)[2]::text
    from test_table;

    NOTE

    External Function output data is accessed using column index [0], [1], and [2] to get the first, second, and last element of the array, which corresponds to First Name, SSN and Credit Card Number.