Using Data Security Manager with Snowflake

Introduction

This document describes the Snowflake setup of External Functions to invoke the Fortanix Data Security Manager (DSM) Plugin for tokenization and detokenization using an API Gateway proxy service.

Snowflake External Functions will need to specify at creation time which Fortanix DSM keys are to be used for tokenization or detokenization of various columns. Thereby, the API Gateway remains stateless and simply passes through the request from Snowflake to Fortanix DSM plugin by only requiring mapping certain headers.

We will use the "headers” property of CREATE EXTERNAL FUNCTION to pass key names information. Header name and value should be as follows:

  • Header name : This will be named “key-names”
  • Header value : This will be a string of comma separated key names. First entry corresponds to the first column in the data and so on.

Key with the key names specified in the header must already exist. Fortanix DSM plugin will not create the key. It will check and verify if the key exists. Fortanix DSM plugin will also verify that the number of key names passed in the header match the number of columns in the data sent by Snowflake. If there are less keys than the columns, then the operation will fail.

Prerequisites

  • Fortanix DSM version 3.27 or later is installed and operational.
  • Fortanix DSM Tokenization plugin for Snowflake.

Setting up Fortanix Data Security Manager

Configure Plugins in Fortanix DSM

  1. Sign up at https://smartkey.io/.
  2. Log in to the Fortanix DSM UI.
  3. Create the Fortanix DSM Plugin. Refer to the Appendix for the Fortanix DSM Plugin and copy paste the code. Fortanix DSM Plugin accept input and respond with the output as per the Snowflake doc –https://docs.snowflake.com/en/sql-reference/external-functions-data-format.html. snowflake_createplugin.pngFigure 1: Create plugin

Configure App in Fortanix Data Security Manager

  1. Sign up at https://smartkey.io/.
  2. Log in to the Fortanix DSM UI.
  3. Create Fortanix DSM App with API KEYs 
    NOTE
    It is recommended to have the Data Protection App only have “ENCRYPT” permission in the DSM Group, and the Data Analysis App(s) to have full reveal that is, “DECRYPT” or partial reveal that is, “MASKED-DECRYPT” based on the organizational requirements.
  4. To create an app, click the Apps tab in the Fortanix DSM UI. On the Apps page click the create new app icon Apps.png to create a new app. snowflake_createapp.pngFigure 2: Create new app
  5. Enter the following information:
    1. App name: This is the name to identify the app.
    2. Authentication method: This can be left at the default API Key. You will need the API Key when setting your API gateway.
    3. Group: This is a logical construct that will contain the keys created.
  6. Click Save to complete creating the application.
  7. Note down the application’s API Key from the App table view. To copy the App UUID:
    1. Go to the table view of an app and click the COPY API KEY link in the Credentials column as shown below. AppTable.pngFigure 2: Copy API key
    For more info, Refer to our Quickstart and Getting Started Guide for instructions on creating groups, apps, and so on.
  8. In the same group, now create a Security Object of type “tokenization” and ensure the “Export” permission is enabled. To create a “tokenization” security object:
    1. Click the Security Objects tab in the Fortanix DSM UI. On the Apps page click the create new Security Object icon Apps.png to create a new key. snowflake_createSO.pngFigure 4: Create new SO
    2. In the Add New Security Objects form, enter the security object name and assign the security object to a group.
    3. In the Choose a type section, select Tokenization.
    4. Select the Data type applicable to you. snowflake_createSO1.pngFigure 5: Tokenization SO
    5. Make sure the “Export” permission is enabled. snowflake_createSO2.pngFigure 6: Enable export permission
  9. You can create multiple token objects as per your needs.
    NOTE
    Each security object will need to have a masking policy enabled during creation time, so that the Fortanix DSM App can perform a partial reveal or dynamic masking during de-tokenization. Masking policy cannot be enabled on the object afterward and hence it needs to be premeditated for the creation. This may be subject to change in the future if the key can be copied (replicated) with a masking policy enabled during the replication.
    snowflake_SOtable.pngFigure 7: SO created

Setting up AWS API Gateway Proxy Service

Create and Test AWS API Gateway

Create and test AWS API Gateway with the following configuration:

  1. REST API Endpoints that may be public or private:
  2. Two resources for /tokenize and /detokenize.
    NOTE
    POST method is required for both resources.
    For each resource method, configure Integration Request with the following configuration:
    • Integration type: HTTP
    • Endpoint URL: Point it to your Plugin URL
      https://sdkms.fortanix.com/sys/v1/plugins/<<PLUGIN_UUID>>
      NOTE
      change the Fortanix DSM FQDN and the Plugin UUID
    • Content handling: Passthrough
    • HTTP Headers – Add the “Authorization” header and leave the value empty as it’ll be sent by Snowflake through an External Function custom header.
    AWSAPIGateway.pngFigure 8: AWS API Gateway
  3. Click “Add mapping template” and do the following:
    1. Select Content-Type as “application/json”.
    2. Select Request body passthrough as “Never”.
    3. Set the value of the template as follows:
      set($apikey = "Basic $input.params('sf-custom-api-key')")
      set($context.requestOverride.header.Authorization = $apikey)
      set($context.requestOverride.header.sf-custom-api-key = "")
      set($context.requestOverride.header.sf-custom-key-names = "")
      set($inputRoot = $input.path('$'))
      {
      "op": "encrypt OR decrypt",
      "keys":"$input.params('sf-custom-key-names')",
      "data": $input.json('$.data')
      }

       

      NOTE
      • Snowflake External Function will send the FORTANIX_DSM_API_KEY in its own custom header, which will need to be mapped to a standard Basic Authentication header. Similarly, the Fortanix DSM tokenization key names will be mapped from another Snowflake custom header to the integration request body.
      • Change the “op” to “encrypt” for /tokenize, and “decrypt” for /detokenize resources.
  4. Test AWS API Gateway using the following input:
    1. Resource: /tokenize
    2. Query String: none or leave blank.
    3. Headers:
      Accept:application/json
      sf-custom-api-key:<<FORTANIX_DSM_API_KEY>>
      sf-custom-key-names:<<KEY_NAME_1,KEY_NAME_2,KEY_NAME_3,KEY_NAME_X,,,>>
      NOTE
      FORTANIX_DSM_API_KEY needs to correspond to the “Data Protection” or “Data Analysis” app so that it can /tokenize or /detokenize through the Plugin accordingly.
    4. Request Body:
      {"data": [
      [0, "SUSAN"],
      [1, "SEAN"],
      [2, "SAMUEL"],
      [3, "HOLEE"]
      ]}
    NOTE

    Adapt the data based on the data types or columns being tested. Also, test the /detokenize resource with same Headers, but different Request Body.

    TestAPI.pngFigure 9: Test AWS API Gateway

Finalize the API Gateway Integration

Finalize the API Gateway integration by:

  1. Creating an AWS IAM Role that Snowflake will assume for execution.
  2. Deploying the proxy service on a demo stage and note the public or private URI:
    https://<<XYZ.execute-api.us-east-2>>.amazonaws.com/<<DEMO_STAGE>>/

Setup the Snowflake API Integration and External Functions

When creating External Functions, first determine the number of parameters/columns that need to be processed, and the precise data types of such columns.

Fortanix recommends creating separate External Functions for tokenization and detokenization based on the number and type of columns. External Functions can call API Gateway integration endpoints based on tokenization and detokenization as explained previously and as shown below.

  1. Configure the AWS Gateway API Integration deployment stage and IAM Role in Snowflake.
    CREATE OR REPLACE API INTEGRATION fortanix_plugin
    API_PROVIDER = aws_api_gateway
    API_AWS_ROLE_ARN = 'arn:aws:iam::<<ACCOUNT_ID>>:role/service-role/<<SNOWFLAKE-ROLE-ID>>'
    API_ALLOWED_PREFIXES = ('https://<<XYZ.execute-api.us-east-2>>.amazonaws.com/<<DEMO_STAGE>>/')
    enabled=true;

    DESCRIBE INTEGRATION fortanix_plugin;
    NOTE
    Substitute with the correct identifiers above.
  2. Create External Functions for single columns, say corresponding to SSN and Credit Card number.
    -- Single column tokenization: SSN
    DROP FUNCTION dsm_tokenize_ssn(varchar);

    CREATE SECURE EXTERNAL FUNCTION dsm_tokenize_ssn(fname varchar)
    RETURNS variant
    IMMUTABLE
    API_INTEGRATION = fortanix_plugin
    HEADERS = (
    'api-key'='<<FORTANIX_DSM_DATA_PROTECTION_API_KEY>>',
    'key-names'='<<SSN_Key_Name>>'
    )
    AS 'https://<<XYZ.execute-api.us-east2>>.amazonaws.com/<<DEMO_STAGE>>/tokenize';

    select dsm_tokenize_ssn('123-45-6789');

    -- Single column tokenization: Credit Card Number
    DROP FUNCTION dsm_tokenize_ccn(varchar);

    CREATE SECURE EXTERNAL FUNCTION dsm_tokenize_ccn(fname varchar)
    RETURNS variant
    IMMUTABLE
    API_INTEGRATION = fortanix_plugin
    HEADERS = (
    'api-key'='<<FORTANIX_DSM_DATA_PROTECTION_API_KEY>>',
    'key-names'='<<CCN_Key_Name>>'
    )
    AS 'https://<<XYZ.execute-api.us-east2>>.amazonaws.com/<<DEMO_STAGE>>/tokenize';

    select dsm_tokenize_ssn('4123456789012340');
  3. Setup a test table and insert data using tokenization External Functions:
    CREATE or REPLACE TABLE test_table (
    id number autoincrement start 1 increment 1,
    fname varchar,
    ssn varchar,
    addr varchar,
    ccn varchar
    );

    insert into test_table (id, fname, ssn, addr, ccn) select 1, 'Franky Hou', dsm_tokenize_ssn('001-02-0001')[0]::text, '1 Infinity Loop', dsm_tokenize_ccn('1234123412341234')[0]::text;
    insert into test_table (id, fname, ssn, addr, ccn) select 2, 'Joan Lucas', dsm_tokenize_ssn('001-02-0002')[0]::text, '918 Batman Drive', dsm_tokenize_ccn('9876987698769876')[0]::text;
    insert into test_table (id, fname, ssn, addr, ccn) select 3, 'James Woods', dsm_tokenize_ssn('001-02-0003')[0]::text, '482 Woody Ave', dsm_tokenize_ccn('1849372849384723')[0]::text;
    insert into test_table (id, fname, ssn, addr, ccn) select 4, 'John Wick', dsm_tokenize_ssn('001-02-0004')[0]::text, '711 Nulla St', dsm_tokenize_ccn('3789020817673718')[0]::text;
  4. Create an External Function for de-tokenization of a single column.
    -- single column de-tokenization
    DROP FUNCTION dsm_detokenize_ssn(varchar);


    CREATE EXTERNAL SECURE FUNCTION dsm_detokenize_ssn(ssn varchar)
    RETURNS variant
    IMMUTABLE
    API_INTEGRATION = fortanix_plugin
    HEADERS = (
    'api-key'='<<FORTANIX_DSM_DATA_PROTECTION_API_KEY>>',
    'key-names'='<<SSN_Key>>'
    )

    AS 'https://<<XYZ.execute-api.us-east-2>>.amazonaws.com/<<DEMO_STAGE>>/detokenize';

    select dsm_detokenize_ssn('806-30-1382');
    select ssn, dsm_detokenize_ssn(ssn)[0]::text from test_table;
  5. Create External Functions for multi-columns, say corresponding to First Name, SSN, and Credit Card number.
    -- multi-column tokenization
    DROP FUNCTION dsm_tokenize_multi(varchar, varchar, varchar);

    CREATE EXTERNAL SECURE FUNCTION dsm_tokenize_multi(fname varchar, ssn varchar, ccn varchar)
    RETURNS variant
    IMMUTABLE
    API_INTEGRATION = fortanix_plugin
    HEADERS = (
    'api-key'='<<FORTANIX_DSM_DATA_PROTECTION_API_KEY>>',
    'key-names'='<<NAME_key,SSN_Key,CCN_Key>>'
    )
    AS 'https://<<XYZ.execute-api.us-east-2>>.amazonaws.com/<<DEMO_STAGE>>/tokenize';

    select dsm_tokenize_multi('JOHN','123-01-1234','3935252016295727');

    insert into test_table (id, fname, ssn, addr, ccn) select 4,
    dsm_tokenize_multi('Maggie','029-30-2901','5015254012895431')[0]::text,
    dsm_tokenize_multi('Maggie','029-30-2901','5015254012895431')[1]::text,
    dsm_tokenize_multi('Maggie','029-30-2901','5015254012895431')[2]::text

    -- copy from one table to another and tokenize in place
    CREATE or REPLACE TABLE stage_table (
    id number autoincrement start 1 increment 1,
    fname varchar,
    ssn varchar,
    addr varchar,
    ccn varchar
    );

    insert into stage_table (id, fname, ssn, addr, ccn) select src.id,
    dsm_tokenize(src.fname, src.ssn, src.ccn)[0]::text,
    dsm_tokenize(src.fname, src.ssn, src.ccn)[1]::text,
    src.addr,
    dsm_tokenize(src.fname, src.ssn, src.ccn)[2]::text from test_table as src;

    -- multi-column de-tokenization
    DROP FUNCTION dsm_detokenize_multi(varchar, varchar, varchar);

    CREATE EXTERNAL SECURE FUNCTION dsm_detokenize_ multi(fname varchar, ssn varchar, ccn varchar)
    RETURNS variant
    IMMUTABLE
    API_INTEGRATION = fortanix_plugin
    HEADERS = (
    'api-key'='<<FORTANIX_DSM_DATA_PROTECTION_API_KEY>>',
    'key-names'='<<NAME_Key,SSN_Key,CCN_Key>>'
    )
    AS 'https://<<XYZ.execute-api.us-east-2>>.amazonaws.com/<<DEMO_STAGE>>/detokenize';

    select dsm_detokenize_ multi('A7G2','806-30-1382','6100938584236546');

    select fname, ssn, ccn,
    dsm_detokenize_ multi(fname,ssn,ccn)[0]::text,
    dsm_detokenize_ multi(fname,ssn,ccn)[1]::text,
    dsm_detokenize_ multi(fname,ssn,ccn)[2]::text
    from test_table;
    NOTE
    External Function output data is accessed using column index [0], [1], and [2] to get the first, second, and last element of the array, which corresponds to First Name, SSN and Credit Card Number.

Appendix

Fortanix DSM Plugin Version 3.27 

--[[

-- v1 07132021

-- Invoking the plugin
$ curl -H "Authorization: Basic $FORTANIX_API_KEY" \
$FORTANIX_API_ENDPOINT/sys/v1/plugins/$FORTANIX_PLUGIN_UUID \
-d @input_examples.json # see below

{
"keys": "SSN_tok_key",
"data": [
[0, "001-11-0010"],
[1, "616-21-0666"],
[2, "300-31-1930"],
[3, "021-31-0930"]
]
}

{
"keys": "SSN_tok_key,Name_tok_key",
"data": [
[0, "001-11-0010", "Faisal"],
[1, "616-21-0666", "Joe"],
[2, "300-31-1930", "Jane"],
[3, "021-31-0930", "Bob"]
]
}

-- input example 1
{
"keys": "AlphaNum,ssn-sizer,cc-sizer",
"data": [
[0, "SUSAN", "001-11-0010", "4539943689232943"],
[1, "SEAN", "616-21-0666", "349915113632714"],
[2, "SAMUEL", "300-31-1930", "5929662954927004"],
[3, "HOLEE", "021-31-0930", "4929662954927007"]
]
}

-- input example 2
{
"op": "encrypt",
"debug": false,
"keys": "Name_X,SSN_X,Address_X,Numeric,email-sizer,CCN_X",
"data": [
[
0,
"Karna Ugarte",
"171-73-7920",
"#57 Lester Road",
"28510",
"ugarte_57@siriusxm.com",
4716407329495455
],[
1,
"Matty Grand Jr.",
"544-10-0225",
"14 N-Williams Street",
"26259",
"grand_15@datawire.net",
5414779952385357
]
]
}

]]--
local cached_keys = {}


function logevent(msg, level)
local msg_severity = { 'INFO', 'WARNING', 'ERROR', 'CRITICAL' }
local msg_sf_udf = this_plugin().name .. ": " .. msg
AuditLog.log { message = msg_sf_udf, severity = msg_severity[level] }
return msg_sf_udf
end

function check(input)
if not input.data or type(input.data) ~= "table" then
return nil, Error.new { status = 400, message = logevent("Missing data input", 4) }
end
if not input.keys or type(input.keys) ~= "string" or input.keys == "" then
return nil, Error.new { status = 400, message = logevent("Missing keys input", 4) }
end
for key_name in string.gmatch(input.keys, '([^,]+)') do
local key = Sobject { name = key_name }
if key == nil then
return nil, Error.new { status = 400, message = logevent("Missing key: " .. key_name, 4) }
end
table.insert(cached_keys, key)
end
end

function run(input)
local output = {}
if input.debug then logevent("Parsing input data", 1) end
local row_quartiles = {}
if input.debug then
row_quartiles[math.floor(#input.data/4)] = true
row_quartiles[math.floor(#input.data/2)] = true
row_quartiles[math.floor(#input.data*3/4)] = true
end

for row, values in pairs(input.data) do
local tokens = {}
if input.debug and row_quartiles[row] then logevent("Processing row: " .. row, 1) end
for col, pii in ipairs(values) do
if col == 1 then goto skip_row_index end

local key = cached_keys[col-1]
if key == nil then
return nil, Error.new { status = 400, message = logevent("Check keys count", 3) }
end

local pii_str = tostring(pii)
if math.type(pii) == "float" then
pii_str = string.gsub(string.format("%.9f", pii), "[0]+$", "")
if tonumber(pii_str) ~= pii then
return nil, Error.new { status = 501, message =
logevent("Known issue - " .. pii_str, 2) }
end
end
if input.op == "decrypt" then
op, err = key:decrypt { cipher = Blob.from_bytes(pii_str), mode = 'FPE' }
else
op, err = key:encrypt { plain = Blob.from_bytes(pii_str), mode = 'FPE' }
end
if err then
return nil, Error.new { status = 400, message = logevent("Check PII: " .. pii_str .. " | " .. tostring(err), 3) }
end

local dtype = type(pii)
if op and op.cipher then
if dtype == "number" then
table.insert(tokens, tonumber(op.cipher:bytes()))
else
table.insert(tokens, op.cipher:bytes())
end
elseif op and op.plain then
if dtype == "number" then
table.insert(tokens, tonumber(op.plain:bytes()))
else
table.insert(tokens, op.plain:bytes())
end
else
return nil, Error.new { status = 400, message = "Check: " .. tostring(pii) .. " | " ..json.encode(op) }
end
::skip_row_index::
end
table.insert(output, { row - 1, tokens })
end
if input.debug then logevent("Complete", 1) end
return { data = output }
end

Comments

Please sign in to leave a comment.

Was this article helpful?
0 out of 0 found this helpful