1.0 Introduction
This article describes the best-practices scenario for performing API tokenization.
For detailed information about FpeConstraints, refer to the Fortanix DSM REST API page.
2.0 API
FpeOptions::Advanced allows specifying a char_set, which is used for input decoding and output encoding. The requirement is to use one char_set for input decoding and another char_set for output encoding. The cardinality (alphabet size) of the two character sets must be the same for every part.
2.1 Description
An AES security object is required for tokenization. The tokenization options are specified in the FPE field of SobjectRequest.
A new field, named cipher_char_set, is added to FpeEncryptedPart to support specifying two character sets. The token values use the cipher_char_set alphabets and default to char_set if not specified.
The cipher_char_set field is used for output encoding during tokenization and input decoding during de-tokenization.
The specified constraints apply to both the char_set and cipher_char_set alphabets. Any constraints specified must be validated to ensure that they can be applied to both of these alphabets. Otherwise, the API should respond with an appropriate error message.
2.2 Response Body
The response Sobject struct will have new fields as described above. There is no other change.
pub struct Sobject {
...
}
3.0 JSON Examples
3.1 Applying Constraints to Subparts of an OR Part
For OR parts, the applies_to field of the constraints field (if specified) cannot be used to select subparts of the OR to which the constraints apply. Instead, those constraints should be specified directly in the subparts themselves. Thus, the following example is incorrect:
{
"or": [
{"literal": ["12344"]},
{"literal": ["abcdefg"]},
{
"char_set": [["0", "9"]],
"min_length": 10,
"max_length": 10
}
],
"constraints": {
"luhn_check": true,
"applies_to": {
"0": "all",
"2": "all"
}
}
}
The following example is correct:
{
"or": [
{"literal": ["12344"]},
{"literal": ["abcdefg"]},
{
"char_set": [["0", "9"]],
"min_length": 10,
"max_length": 10,
"constraints": {
"luhn_check": true
}
}
]
}
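The rule for OR parts can be checked mechanically. The sketch below is illustrative only: the function name or_parts_with_applies_to is hypothetical, and it handles only the "or" and "concat" compound parts that appear in this article's examples.

```python
def or_parts_with_applies_to(part, path=()):
    """Return the paths of "or" parts whose constraints carry an applies_to field."""
    found = []
    if "or" in part and "applies_to" in part.get("constraints", {}):
        found.append(path)
    # Recurse into subparts (only "or" and "concat" are handled in this sketch).
    for key in ("or", "concat"):
        for index, child in enumerate(part.get(key, [])):
            found.extend(or_parts_with_applies_to(child, path + (index,)))
    return found


digits = {"char_set": [["0", "9"]], "min_length": 10, "max_length": 10}

# applies_to on the OR part itself: flagged.
rejected = {
    "or": [{"literal": ["12344"]}, digits],
    "constraints": {"luhn_check": True, "applies_to": {"1": "all"}},
}
# Constraint moved into the subpart: not flagged.
accepted = {
    "or": [{"literal": ["12344"]}, {**digits, "constraints": {"luhn_check": True}}],
}

flagged = or_parts_with_applies_to(rejected)  # [()] : the root OR part
clean = or_parts_with_applies_to(accepted)    # []
```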
3.2 Bad Preserve or Mask-related Fields
3.2.1 Indices out-of-range
In the example below, the preserve field on the encrypted part is incorrect because the index -4 is out of range. (The encrypted part has a minimum length of 3, and if the input is exactly three digits long, -4 would be out of range. The fact that the maximum length is greater than 4 is irrelevant.)
{
"char_set": [["0", "9"]],
"min_length": 3,
"max_length": 7,
"preserve": [-4, -3]
}
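The range rule can be expressed as a short check against the minimum length. This is a sketch under the assumption that indices behave like Python sequence indices, with negative values counting from the end; the function name out_of_range_indices is hypothetical.

```python
def out_of_range_indices(part, field="preserve"):
    """Return the indices in `field` that are invalid for the shortest allowed input."""
    indices = part.get(field, [])
    if not isinstance(indices, list):
        return []  # values such as "all" carry no explicit indices
    shortest = part["min_length"]
    # Every index must be valid even when the input has only min_length characters.
    return [i for i in indices if not -shortest <= i < shortest]


part = {
    "char_set": [["0", "9"]],
    "min_length": 3,
    "max_length": 7,
    "preserve": [-4, -3],
}
bad = out_of_range_indices(part)  # [-4]
```

Only min_length matters here: an index that is valid for the shortest input is also valid for every longer one.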
3.2.2 Specifying Fields at Two Different “Levels”
If a preserve or mask field is specified on a compound part, then none of its subparts (or their descendants) may specify that field. The example below violates that rule. (While the mask fields do not conflict here, this is still disallowed.)
{
"concat": [
{"literal": ["hello"]},
{
"char_set": [["0", "9"]],
"min_length": 6,
"max_length": 6,
"mask": []
}
],
"mask": false
}
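A tree walk is enough to detect this situation. The following sketch assumes compound parts expose their subparts under "concat" or "or", as in this article's examples; the names subparts and nested_field_conflicts are hypothetical.

```python
def subparts(part):
    """Return the children of a compound part, or an empty list for other parts."""
    return part.get("concat") or part.get("or") or []


def nested_field_conflicts(part, field, inherited=False, path=()):
    """Return paths of parts that specify `field` below a compound part that also does."""
    conflicts = []
    if inherited and field in part:
        conflicts.append(path)
    children = subparts(part)
    # Presence of the key is what counts, even when its value is false or empty.
    inherited = inherited or (bool(children) and field in part)
    for index, child in enumerate(children):
        conflicts.extend(
            nested_field_conflicts(child, field, inherited, path + (index,))
        )
    return conflicts


policy = {
    "concat": [
        {"literal": ["hello"]},
        {"char_set": [["0", "9"]], "min_length": 6, "max_length": 6, "mask": []},
    ],
    "mask": False,
}
conflicts = nested_field_conflicts(policy, "mask")  # [(1,)]
```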
The string "all" is used for encrypted parts, and true is used for compound parts. The example below mixes up the two usages.
{
"concat": [
{
"char_set": [["0", "9"]],
"min_length": 10,
"max_length": 10,
"preserve": true
},
{
"concat": [
{
"char_set": [["A", "Z"]],
"min_length": 10,
"max_length": 10
},
{
"char_set": [["a", "z"]],
"min_length": 10,
"max_length": 10
}
],
"preserve": "all"
}
]
}
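The distinction between the two value styles can be checked per part. This sketch assumes that encrypted parts accept either "all" or a list of indices, and that compound parts accept a boolean, which is what the examples in this article suggest; the function name misused_values is hypothetical.

```python
def misused_values(part, path=()):
    """Return (path, field, value) for preserve/mask values that do not fit the part type."""
    problems = []
    children = part.get("concat") or part.get("or") or []
    for field in ("preserve", "mask"):
        if field not in part:
            continue
        value = part[field]
        if children:
            ok = isinstance(value, bool)                   # compound part
        else:
            ok = value == "all" or isinstance(value, list)  # encrypted part
        if not ok:
            problems.append((path, field, value))
    for index, child in enumerate(children):
        problems.extend(misused_values(child, path + (index,)))
    return problems


policy = {
    "concat": [
        {"char_set": [["0", "9"]], "min_length": 10, "max_length": 10,
         "preserve": True},
        {
            "concat": [
                {"char_set": [["A", "Z"]], "min_length": 10, "max_length": 10},
                {"char_set": [["a", "z"]], "min_length": 10, "max_length": 10},
            ],
            "preserve": "all",
        },
    ]
}
problems = misused_values(policy)
# [((0,), 'preserve', True), ((1,), 'preserve', 'all')]
```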
3.2.3 Preserving Only the Month Part of a Date
Dates have rules as to which parts can be preserved and which cannot. This example is invalid due to its attempt to preserve only the Month part.
{
"concat": [
{
"char_set": [["0", "9"]],
"min_length": 2,
"max_length": 2,
"preserve": "all",
"constraints": {
"date": "month"
}
},
{"literal": ["/"]},
{
"char_set": [["0", "9"]],
"min_length": 2,
"max_length": 2,
"constraints": {
"date": "year"
}
}
],
"constraints": {
"date": {
"month_year_date": {}
},
"applies_to": {
"0": "all",
"2": "all"
}
}
}
3.2.4 Luhn Check Constraints and Other Constraints
Any encrypted part under a Luhn check constraint cannot specify any other constraints unless it is guaranteed to be fully preserved. The first and second encrypted parts in the example below do not satisfy this requirement. (The third one does, because even if the part is at its maximum length of 5, the indices 0, 1, -3, -2, and -1 still manage to cover every single index.)
{
"concat": [
{
"char_set": [["0", "9"]],
"min_length": 5,
"max_length": 5,
"preserve": [2],
"constraints": {
"num_ne": [90, 292]
}
},
{
"char_set": [["0", "9"]],
"min_length": 5,
"max_length": 5,
"constraints": {
"num_lt": 99021
}
},
{
"char_set": [["0", "9"]],
"min_length": 3,
"max_length": 5,
"preserve": [0, 1, -3, -2, -1],
"constraints": {
"num_ne": [902]
}
}
],
"constraints": {
"luhn_check": true
}
}
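Whether a part is "guaranteed to be fully preserved" can be decided by testing every allowed length. The sketch below assumes negative indices count from the end of the input; the function name is_fully_preserved is hypothetical.

```python
def is_fully_preserved(part):
    """Return True if the preserve field covers every position at every allowed length."""
    preserve = part.get("preserve", [])
    if preserve == "all":
        return True
    if not isinstance(preserve, list):
        return False
    for length in range(part["min_length"], part["max_length"] + 1):
        covered = set()
        for index in preserve:
            position = index if index >= 0 else length + index
            if 0 <= position < length:
                covered.add(position)
        if covered != set(range(length)):
            return False
    return True


digits = [["0", "9"]]
first = {"char_set": digits, "min_length": 5, "max_length": 5, "preserve": [2]}
second = {"char_set": digits, "min_length": 5, "max_length": 5}
third = {"char_set": digits, "min_length": 3, "max_length": 5,
         "preserve": [0, 1, -3, -2, -1]}

results = [is_fully_preserved(p) for p in (first, second, third)]
# [False, False, True]
```

For the third part, lengths 3, 4, and 5 are all covered: the two leading indices and the three trailing indices together reach every position.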
3.2.5 Nested Luhn Checks
Luhn check constraints prohibit sharing encrypted parts. The following example is invalid due to the first encrypted part falling under two Luhn check constraints: one specified on the outer concatenation and one specified on the inner concatenation.
{
"concat": [
{
"concat": [
{
"char_set": [["0", "9"]],
"min_length": 5,
"max_length": 5
},
{"literal": ["-"]},
{
"char_set": [["0", "9"]],
"min_length": 5,
"max_length": 5
}
],
"constraints": {
"luhn_check": true,
"applies_to": {
"0": "all",
"2": "all"
}
}
},
{
"char_set": [["0", "9"]],
"min_length": 6,
"max_length": 6
}
],
"constraints": {
"luhn_check": true,
"applies_to": {
"0": {
"0": "all"
},
"1": "all"
}
}
}
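Overlapping Luhn checks can be found by resolving each applies_to field to the encrypted parts it selects and counting how often each part appears. This is a sketch under several assumptions that the article does not confirm: a missing applies_to is treated as "all", "all" on a compound part selects every encrypted descendant, and applies_to keys are child indices. All function names are hypothetical.

```python
from collections import Counter


def children_of(part):
    return part.get("concat") or part.get("or") or []


def encrypted_paths(part, path=()):
    """Paths of all encrypted (char_set) parts at or below `part`."""
    if "char_set" in part:
        return [path]
    paths = []
    for index, child in enumerate(children_of(part)):
        paths.extend(encrypted_paths(child, path + (index,)))
    return paths


def selected_paths(part, applies_to, path=()):
    """Resolve an applies_to value to the encrypted parts it selects."""
    if applies_to == "all":
        return encrypted_paths(part, path)
    paths = []
    for key, sub in applies_to.items():
        index = int(key)
        paths.extend(selected_paths(children_of(part)[index], sub, path + (index,)))
    return paths


def luhn_overlaps(root):
    """Return paths of encrypted parts covered by more than one luhn_check."""
    counts = Counter()

    def visit(node, path):
        constraints = node.get("constraints", {})
        if constraints.get("luhn_check"):
            counts.update(selected_paths(node, constraints.get("applies_to", "all"), path))
        for index, child in enumerate(children_of(node)):
            visit(child, path + (index,))

    visit(root, ())
    return sorted(p for p, n in counts.items() if n > 1)


def digits(n):
    return {"char_set": [["0", "9"]], "min_length": n, "max_length": n}


inner = {
    "concat": [digits(5), {"literal": ["-"]}, digits(5)],
    "constraints": {"luhn_check": True, "applies_to": {"0": "all", "2": "all"}},
}
outer = {
    "concat": [inner, digits(6)],
    "constraints": {"luhn_check": True,
                    "applies_to": {"0": {"0": "all"}, "1": "all"}},
}
shared = luhn_overlaps(outer)  # [(0, 0)]
```

The path (0, 0) is the first encrypted part of the inner concatenation, which both Luhn checks select.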
3.3 Date Parts Underneath an OR Part
The encrypted parts that comprise a date cannot be underneath an or or multiple part (with the exception that the entire date part can be underneath an or or multiple part). Thus, this example is invalid, as the applies_to field for the date constraint is being applied to an or part (and hence the year subparts underneath the or).
{
"concat": [
{
"char_set": [["0", "9"]],
"min_length": 2,
"max_length": 2,
"constraints": {
"date": "month"
}
},
{
"or": [
{
"char_set": [["0", "9"]],
"min_length": 2,
"max_length": 2,
"constraints": {
"date": "year"
}
},
{
"char_set": [["0", "9"]],
"min_length": 4,
"max_length": 4,
"constraints": {
"date": "year"
}
}
]
}
],
"constraints": {
"date": {
"month_year_date": {}
}
}
}
Unlike with ordinary FF1 encryption, when date-related parts are handled the output token may differ in length from the input token.
NOTE
A duplicate name field is added while serializing FpeOptions values of the Advanced variant to accommodate the users with legacy code that expects the availability of the name field.
However, if the user only wants to tokenize or detokenize an input token, then the exact order of the characters is unimportant. The only exception is that "numeric" character sets must consist of exactly the digits from '0' to '9', in order. This is relevant for constraints, as all the constraints currently available are only applicable to numeric parts.
NOTE
A Unicode codepoint does not necessarily correspond to the general idea of a "character"; for example, "é" can be represented by U+00E9, or by the sequence U+0065, U+0301 (with the two representations being canonically equivalent). This API treats all three codepoints as distinct, and hence performs no Unicode normalization of any kind.
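The difference between the two spellings of "é" can be seen with Python's standard unicodedata module. This is only an illustration of the Unicode behavior; applying NFC before calling the API is a possible client-side choice, not something the API does or requires.

```python
import unicodedata

precomposed = "\u00e9"   # "é" as the single code point U+00E9
decomposed = "e\u0301"   # "e" (U+0065) followed by COMBINING ACUTE ACCENT (U+0301)

# Without normalization the two strings differ, code point by code point.
same_code_points = precomposed == decomposed           # False
lengths = (len(precomposed), len(decomposed))          # (1, 2)

# NFC composes the sequence into the single code point.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed  # True
```

Because the two forms differ in length and content, a part that accepts one form may reject or tokenize the other differently, so callers who care would need to agree on one form beforehand.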