1.0 Introduction
This article describes how to implement security object tokenization using the Fortanix-Data-Security-Manager (DSM) REST API with the Format_V2 (format_v2) schema.
NOTE
Fortanix recommends using format_v2 for all new tokenization operations.
2.0 Using API Schema
2.1 Format Definition (Format_V2)
Fortanix DSM supports a format_v2 field, which provides an updated format-based definition for describing the structure of a token. The format_v2 model is the successor to the original format definition and should be used for new tokenization use cases.
When creating a security object for format-preserving encryption (FPE), the request includes an fpe field. The value of this field defines how the token is structured and how tokenization and detokenization are performed.
The following code block illustrates the value for the fpe field in a JSON format using the format_v2 schema:
"fpe": {
  "format_v2": <JSON object containing datatype format goes here!>,
  "description": "An optional string describing this data format"
}
The format_v2 field specifies the token format as a collection of one or more format variants, each describing a valid structural representation of the data. Each variant defines how the input is structured and processed, including which portions of the data are encrypted. Together, these variants describe the complete set of acceptable input formats for tokenization.
At the top level, the format_v2 field may also include the following optional properties:
mode: Controls how the generated token preserves characteristics of the original input value.
  preserve_length: Preserves the length and variant of the input value.
  preserve_variant: Preserves the variant of the input; the length may change.
  preserve_format: Preserves neither length nor variant.
input_processing: Controls how characters that are not explicitly defined in the format are handled during tokenization.
  passthrough_unknown: Characters that do not match any defined format are retained in the output unchanged.
  strip_unknown: Characters that do not match any defined format are removed before processing.
  passthrough_specific: Allows specific unmatched characters to pass through unchanged while controlling how all other unmatched characters are handled.
The passthrough_specific configuration contains the following properties:
  passthrough: A list of characters that should be preserved in their original positions.
  other: Specifies how unmatched characters that are not included in the passthrough list are handled.
Currently supported values for other include:
  reject: Rejects the tokenization request if any unmatched character other than the explicitly permitted passthrough characters is encountered.
  strip: Removes any unmatched character other than the explicitly permitted passthrough characters before processing the tokenization request.
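The input_processing options described above can be modeled locally to see how they differ. The following Python sketch is only an illustration of the documented behavior, not the Fortanix DSM implementation; the function name preprocess and its parameters are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def preprocess(
    value: str,
    known: Iterable[str],
    input_processing: str = "passthrough_unknown",
    passthrough: Iterable[str] = (),
    other: str = "reject",
) -> Tuple[str, Dict[int, str]]:
    """Separate characters covered by the format from unmatched characters.

    Returns (to_tokenize, kept), where `kept` maps the original index of each
    unmatched character that passes through unchanged to that character.
    """
    known_set = set(known)
    allowed = set(passthrough)
    to_tokenize = []
    kept: Dict[int, str] = {}
    for index, ch in enumerate(value):
        if ch in known_set:
            to_tokenize.append(ch)            # covered by the format
        elif input_processing == "passthrough_unknown":
            kept[index] = ch                  # retained unchanged
        elif input_processing == "strip_unknown":
            continue                          # removed before processing
        elif input_processing == "passthrough_specific":
            if ch in allowed:
                kept[index] = ch              # explicitly permitted character
            elif other == "reject":
                raise ValueError(f"unmatched character {ch!r} at index {index}")
            elif other == "strip":
                continue
            else:
                raise ValueError(f"unsupported value for other: {other!r}")
        else:
            raise ValueError(f"unsupported input_processing: {input_processing!r}")
    return "".join(to_tokenize), kept


digits = "0123456789"
print(preprocess("123 45-6", digits))
print(preprocess("123 45-6", digits, "strip_unknown"))
print(preprocess("123 45-6", digits, "passthrough_specific", passthrough="-", other="strip"))
```

With passthrough_specific and other set to reject, the same input raises an error because the space is neither part of the format nor in the passthrough list.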
The format_v2 JSON tree is composed of the following types of components:
Variants
A variant represents a complete and valid format for a token.
Variants are represented by JSON objects contained within the variants array.
Each variant defines an ordered list of Sections.
A token is valid if it matches one complete variant.
If a variant has a variable length, prefix and suffix constraints may be applied.
Sections
A section represents a logical portion of the token.
Sections are evaluated sequentially and may contain one or more Groups.
A section is represented by a JSON object contained within the sections array.
A section may optionally define a checksum property.
The checksum property specifies a validation algorithm that is applied during tokenization and detokenization to ensure generated values remain valid.
Currently supported values include:
luhn: Validates identifiers using the Luhn checksum algorithm.
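For reference, the standard Luhn algorithm can be sketched in Python as follows. This shows only the general checksum calculation; how Fortanix DSM applies it internally during tokenization is not described here, and the function names are hypothetical.

```python
def luhn_is_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn check."""
    digits = [int(c) for c in number if c.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def luhn_check_digit(payload: str) -> int:
    """Compute the check digit to append to a digits-only `payload`."""
    total = 0
    # Once the check digit is appended, the payload's rightmost digit
    # sits in a doubled position, so doubling starts at position 0.
    for position, char in enumerate(reversed(payload)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10


print(luhn_is_valid("79927398713"))    # True
print(luhn_check_digit("7992739871"))  # 3
```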
Groups
A group represents a concatenation of one or more Codes that together form a structural unit within a section.
Groups are evaluated in sequence within a section.
A group is represented by a JSON object containing a codes array.
A group may optionally define a preserve property to specify portions of the matched value that remain unchanged during tokenization.
The preserve property is represented as:
"preserve": [prefix_length, suffix_length]
Where:
prefix_length specifies the number of leading characters to preserve.
suffix_length specifies the number of trailing characters to preserve.
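The effect of preserve can be pictured as splitting the matched value into three parts. The following Python sketch is a local illustration under that reading; the function name split_preserved is hypothetical, and the sample value is arbitrary.

```python
from typing import Tuple


def split_preserved(value: str, preserve: Tuple[int, int]) -> Tuple[str, str, str]:
    """Split `value` into (kept prefix, part to tokenize, kept suffix)."""
    prefix_length, suffix_length = preserve
    if prefix_length < 0 or suffix_length < 0:
        raise ValueError("preserve lengths must not be negative")
    if prefix_length + suffix_length > len(value):
        raise ValueError("preserve lengths exceed the length of the value")
    end = len(value) - suffix_length
    return value[:prefix_length], value[prefix_length:end], value[end:]


# Keep the first six and last four characters; only the middle is tokenized.
print(split_preserved("1234567890123456", (6, 4)))
```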
Codes
A code represents the smallest unit of token generation and validation.
Codes are evaluated sequentially within a group.
Each code defines how a portion of the token is produced and may represent either an encrypted value or a literal value.
A code is represented by a JSON object containing a words definition.
A code may optionally define:
min_repetitions: The minimum number of consecutive occurrences of the code.
max_repetitions: The maximum number of consecutive occurrences of the code.
These properties are commonly used to define variable-length identifiers.
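As a small illustration, a code whose words are single characters and that repeats between min_repetitions and max_repetitions times accepts any string of those characters within that length range. The Python sketch below models this locally; the function name matches_repeated_code and the alphabet parameter are hypothetical.

```python
def matches_repeated_code(
    value: str, alphabet: str, min_repetitions: int, max_repetitions: int
) -> bool:
    """Check `value` against a single-character code repeated a bounded number of times."""
    if not min_repetitions <= len(value) <= max_repetitions:
        return False
    # Each repetition of the code consumes exactly one character.
    return all(ch in alphabet for ch in value)


digits = "0123456789"
print(matches_repeated_code("4" * 16, digits, 12, 19))  # True
print(matches_repeated_code("4" * 11, digits, 12, 19))  # False
```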
Words
words specify a set of fixed-length strings.
The following word types are supported:
Alphabet
Represents single characters.
Integer ranges
Represents a union of non-negative integer ranges.
integer_ranges includes a ranges field that specifies one or more inclusive numeric ranges.
integer_ranges may also include padding_required.
When padding_required is set to true, values are represented using the full width of the defined range and leading zeros are preserved where necessary.
Custom
Represents a custom set of fixed-length strings.
custom includes a list field that specifies the complete set of valid values.
The leaf nodes of the Format V2 structure are code definitions that contain words.
The format_v2 field consists of a hierarchical structure where a data type is divided into variants, sections, groups, and codes. These components together define the complete structure of the token.
In addition to specifying the general format of a token section, Format V2 allows constraints to be applied by defining appropriate codes.
For example:
Numeric values restricted to specific ranges
Fixed-width numeric values using padding
Exclusion of invalid values by omitting them from defined ranges
Validation of structured numeric identifiers such as Social Security Numbers
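To see how ranges and padding_required combine into such constraints, the following Python sketch expands an integer_ranges definition into the set of strings it allows. It is a local model, not the DSM implementation: the function name allowed_words is hypothetical, and the sketch assumes that the padded width equals the number of digits in the largest upper bound.

```python
from typing import Iterable, Sequence, Set


def allowed_words(ranges: Iterable[Sequence[int]], padding_required: bool) -> Set[str]:
    """Expand inclusive integer ranges into the strings they permit."""
    ranges = [tuple(r) for r in ranges]
    # Assumption: the fixed width is the digit count of the largest upper bound.
    width = len(str(max(high for _, high in ranges)))
    words = set()
    for low, high in ranges:
        for number in range(low, high + 1):
            text = str(number)
            words.add(text.zfill(width) if padding_required else text)
    return words


area = allowed_words([[1, 665], [667, 899]], padding_required=True)
print("001" in area)  # True
print("666" in area)  # False, omitted from the ranges
print("000" in area)  # False, below the first range
```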
The following code block illustrates the structure of the fpe field using an example:
"fpe": {
  "format_v2": {
    "mode": "preserve_length",
    "input_processing": "passthrough_unknown",
    "variants": [
      {
        "sections": [
          {
            "groups": [
              {
                "codes": [
                  {
                    "words": {
                      "integer_ranges": {
                        "ranges": [[0, 9]],
                        "padding_required": true
                      }
                    },
                    "min_repetitions": 12,
                    "max_repetitions": 19
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  }
}
American Social Security Number:
The SSN is a nine-digit number assigned to American citizens, permanent residents, and eligible non-immigrant workers. This number is used to report wages to the government, track Social Security benefits, and for other identification purposes.
The SSN is subject to the following constraints:
The format of this number is XXX-XX-XXXX.
The first section (XXX) must be between 001–665 or 667–899 (000 and 666 are not allowed) and is always zero-padded to three digits.
The second section (XX) must be between 01–99 (00 is not allowed) and is zero-padded to two digits.
The third section (XXXX) must be between 0001–9999 (0000 is not allowed) and is zero-padded to four digits.
Hyphens remain fixed between sections and are preserved exactly during tokenization and detokenization.
In format_v2, this hyphen-separated nine-digit number is expressed as follows:
Each group of digits is represented as an encrypted code defined using numeric ranges.
The hyphen (-) separators are represented as literal codes.
Fixed-length formatting is achieved through padding_required.
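The representation described above can also be assembled programmatically. The following Python sketch builds the SSN format_v2 object from small helper functions and prints it as JSON; the helper names int_code, literal_code, and ssn_format_v2 are hypothetical and exist only for illustration.

```python
import json


def int_code(*ranges):
    """An encrypted code made of zero-padded, inclusive integer ranges."""
    return {
        "words": {
            "integer_ranges": {
                "ranges": [list(r) for r in ranges],
                "padding_required": True,
            }
        }
    }


def literal_code(text):
    """A literal code that permits exactly one fixed string."""
    return {"words": {"custom": {"list": [text]}}}


def ssn_format_v2():
    codes = [
        int_code((1, 665), (667, 899)),  # XXX: 000 and 666 are excluded
        literal_code("-"),
        int_code((1, 99)),               # XX: 00 is excluded
        literal_code("-"),
        int_code((1, 9999)),             # XXXX: 0000 is excluded
    ]
    return {"format_v2": {"variants": [{"sections": [{"groups": [{"codes": codes}]}]}]}}


print(json.dumps(ssn_format_v2(), indent=2))
```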
The following format_v2 schema defines a structured numeric pattern similar to an SSN format, with specific allowed numeric ranges for each section and a fixed separator (-):
{
  "format_v2": {
    "variants": [
      {
        "sections": [
          {
            "groups": [
              {
                "codes": [
                  {
                    "words": {
                      "integer_ranges": {
                        "ranges": [[1, 665], [667, 899]],
                        "padding_required": true
                      }
                    }
                  },
                  {
                    "words": {
                      "custom": {
                        "list": ["-"]
                      }
                    }
                  },
                  {
                    "words": {
                      "integer_ranges": {
                        "ranges": [[1, 99]],
                        "padding_required": true
                      }
                    }
                  },
                  {
                    "words": {
                      "custom": {
                        "list": ["-"]
                      }
                    }
                  },
                  {
                    "words": {
                      "integer_ranges": {
                        "ranges": [[1, 9999]],
                        "padding_required": true
                      }
                    }
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  }
}
2.2 Create AES Security Object with FPE Field
Perform the following steps to create a tokenization key:
Send a request to the Fortanix DSM endpoint with the fpe field in the security object request. The fpe field contains a JSON object that defines the token format and the behavior used for tokenization and detokenization.
NOTE
In the following example, the token format is defined using a hierarchical structure composed of one or more format variants. Each variant describes a valid structural representation of the token, including which portions are encrypted.
The following is a sample JSON request for creating an AES key to tokenize a Social Security Number (SSN) using the format_v2 schema.
{
  "name": "<sobject name>",
  "description": "<sobject description>",
  "obj_type": "AES",
  "key_ops": [
    "ENCRYPT",
    "DECRYPT",
    "APPMANAGEABLE"
  ],
  "key_size": 256,
  "fpe": {
    "format_v2": {
      "variants": [
        {
          "sections": [
            {
              "groups": [
                {
                  "codes": [
                    {
                      "words": {
                        "integer_ranges": {
                          "ranges": [[1, 665], [667, 899]],
                          "padding_required": true
                        }
                      }
                    },
                    {
                      "words": {
                        "custom": {
                          "list": ["-"]
                        }
                      }
                    },
                    {
                      "words": {
                        "integer_ranges": {
                          "ranges": [[1, 99]],
                          "padding_required": true
                        }
                      }
                    },
                    {
                      "words": {
                        "custom": {
                          "list": ["-"]
                        }
                      }
                    },
                    {
                      "words": {
                        "integer_ranges": {
                          "ranges": [[1, 9999]],
                          "padding_required": true
                        }
                      }
                    }
                  ]
                }
              ]
            }
          ]
        }
      ]
    },
    "description": "<format description>"
  },
  "expirationDate": null,
  "enabled": true,
  "group_id": "<sobject group id>"
}
NOTE
For an Irreversible Tokenization key type, the key_ops list does not include the DECRYPT operation.
The security object must be an AES security object. FPE cannot be used with any other security object type.
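The request described above could be prepared in Python roughly as follows. This is a sketch under stated assumptions, not a confirmed client: the path /crypto/v1/keys and the Bearer authorization header are not given in this article and are assumed here, and the function name build_create_key_request is hypothetical. The function only constructs the request; sending it is left to the caller.

```python
import json
import urllib.request


def build_create_key_request(base_url, api_token, name, group_id, fpe, reversible=True):
    """Build (without sending) a request that creates an AES tokenization key."""
    key_ops = ["ENCRYPT", "APPMANAGEABLE"]
    if reversible:
        # Irreversible tokenization keys omit the DECRYPT operation.
        key_ops.insert(1, "DECRYPT")
    body = {
        "name": name,
        "obj_type": "AES",  # FPE requires an AES security object
        "key_size": 256,
        "key_ops": key_ops,
        "fpe": fpe,
        "enabled": True,
        "group_id": group_id,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/crypto/v1/keys",  # assumed endpoint path
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_token,    # assumed auth scheme
        },
    )


request = build_create_key_request(
    "https://dsm.example.com", "<token>", "<sobject name>", "<sobject group id>",
    fpe={"format_v2": {"variants": []}, "description": "<format description>"},
)
print(request.get_method(), request.full_url)
# To send: urllib.request.urlopen(request), then read "kid" from the JSON response.
```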
After the security object is created using the format_v2 schema, perform the following steps to view its JSON definition:
Navigate to the Security Objects page in the Fortanix DSM user interface (UI).
Open the detailed view of the security object.
Click Show JSON to view the JSON schema associated with the data type.
NOTE
The Show JSON option in the Fortanix DSM UI is available only when the security object is created using the format_v2 schema. This allows you to clearly differentiate V2 FPE objects from V1.

Figure 1: Access Show JSON
2.3 Tokenize by Calling Encrypt API
When the security object is created, the response contains the "kid" field. This field is required for calling the encrypt or decrypt APIs. The following example illustrates how to tokenize the number 123-12-1234.
The API to tokenize is /crypto/v1/encrypt.
JSON Request:
{
  "alg": "AES",
  "plain": "MTIzLTEyLTEyMzQ=",
  "mode": "FPE",
  "key": {
    "name": "17-May-Key"
  }
}
Where the value of the plain field is the base64-encoded SSN. Decoding the "plain" field yields "123-12-1234", the original data provided.
Response:
{
  "kid": "<tokenization sobject kid>",
  "cipher": "NDU2LTU2LTQ1Njc="
}
The "cipher" field is the base64-encoded value of the token. Decoding the "cipher" field yields "456-56-4567".
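The base64 handling in this exchange can be checked with a few lines of Python. The sketch below builds the request body shown above and decodes the sample cipher value; the helper names are hypothetical, and no request is sent.

```python
import base64


def to_base64(text: str) -> str:
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def from_base64(data: str) -> str:
    return base64.b64decode(data).decode("utf-8")


def build_encrypt_body(plaintext: str, key_name: str) -> dict:
    """Request body for /crypto/v1/encrypt, following the sample above."""
    return {
        "alg": "AES",
        "plain": to_base64(plaintext),
        "mode": "FPE",
        "key": {"name": key_name},
    }


body = build_encrypt_body("123-12-1234", "17-May-Key")
print(body["plain"])                    # MTIzLTEyLTEyMzQ=
print(from_base64("NDU2LTU2LTQ1Njc="))  # 456-56-4567
```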
2.4 Detokenize by Calling Decrypt API
The API to detokenize is /crypto/v1/keys/{kid}/decrypt.
JSON Request:
{
  "alg": "AES",
  "cipher": "NDU2LTU2LTQ1Njc=",
  "mode": "FPE"
}
Response:
{
  "kid": "<tokenization sobject kid>",
  "plain": "MTIzLTEyLTEyMzQ="
}
The "plain" field is the base64-encoded value of the original data. Decoding the "plain" field yields "123-12-1234", the original data provided.
NOTE
The /crypto/v1/keys/{kid}/decrypt API is not supported for the Irreversible Tokenization key type.
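A detokenization call can be prepared in the same way. The following Python sketch fills the kid into the decrypt path, builds the request body, and decodes the plain field of a response; the function names are hypothetical, and sending the request and authenticating are outside the scope of the sketch.

```python
import base64
from urllib.parse import quote


def build_decrypt_call(kid: str, cipher_b64: str):
    """Return the (path, body) pair for a detokenization request."""
    path = "/crypto/v1/keys/{}/decrypt".format(quote(kid, safe=""))
    body = {"alg": "AES", "cipher": cipher_b64, "mode": "FPE"}
    return path, body


def read_plain(response: dict) -> str:
    """Decode the base64 "plain" field of a decrypt response."""
    return base64.b64decode(response["plain"]).decode("utf-8")


path, body = build_decrypt_call("<tokenization sobject kid>", "NDU2LTU2LTQ1Njc=")
print(path)
print(read_plain({"kid": "<tokenization sobject kid>", "plain": "MTIzLTEyLTEyMzQ="}))
```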