1.0 Introduction
This article describes how to create and run a Nitro Workflow on an Amazon Web Services (AWS) node.
2.0 Prerequisites
Ensure that you have enrolled a compute node using AWS Nitro on Amazon Linux. For more information, refer to User's Guide: Enroll a Compute Node Using AWS Nitro on Amazon Linux.
3.0 Create Input and Output Datasets
Datasets are definitions that contain the location and access credentials of data, allowing the enclave OS in the Workflow to download and upload that data. In this example, we use AWS S3 to:
- Store an encrypted file that will be downloaded and decrypted by the enclave OS.
- Upload an encrypted file using the credentials provided in the dataset.
- Create an S3 bucket with a directory that is accessible using a URL.
3.1 Input User (Data Provider)
Consider that the Data Owner user has access to sensitive information and wants to allow an Application Owner to process this information.
This sensitive data is stored in a conditions.csv file.
In this example, the file is encrypted, uploaded to a storage solution (AWS S3), and a dataset is configured with credentials and an encryption key for enclave access or processing:
- If you haven't already, download and untar the workflow-scripts.tar.gz tar ball:
tar xvfz workflow-scripts.tar.gz
- Obtain a copy of the CSV sample data (also available here: https://synthea.mitre.org/downloads):
wget https://synthetichealth.github.io/synthea-sample-data/downloads/synthea_sample_data_csv_apr2020.zip
unzip synthea_sample_data_csv_apr2020.zip
You will only be using the file csv/conditions.csv. To encrypt this file, generate a key locally, encrypt the file, and store the key securely in a KMS:
- Run the following command to generate an encryption key:
xxd -u -l 32 -p /dev/random | tr -d '\n' > ./key.hex
- Use the aes_256_gcm.py script included in the tar file that you downloaded. Run the following command to encrypt the file:
./workflow_scripts/aes_256_gcm.py enc -K ./key.hex -in ./csv/conditions.csv -out ./conditions.csv.enc
- Run the following command to upload the encrypted file to a secure storage location such as AWS S3:
aws s3 --profile upload cp ./conditions.csv.enc <s3-directory>
- Generate a pre-signed URL to access the file, to avoid embedding the whole AWS SDK in the example. Use the presign.py script included in the tar file and run the following command (see the boto3 sketch after these steps for an alternative):
./presign.py default download <s3-directory> conditions.csv.enc 86400
where <s3-directory> is the path of your S3 bucket directory.
As an example, the output from the above command will be as shown below:
https://fortanix-pocs-data.s3.amazonaws.com/conditions.csv.enc?AWSAccessKeyId=<key>&Signature=PcpH99nszG2%2Fv85z4IbgwgVDywc%3D&Expires=1613817035
The output consists of two parts:
- The location: https://fortanix-pocs-data.s3.amazonaws.com/conditions.csv.enc
- Query parameters:
AWSAccessKeyId=<key>&Signature=PcpH99nszG2%2Fv85z4IbgwgVDywc%3D&Expires=1613817035
The query parameters must be base64 encoded for the dataset definition using the following command:
echo -n 'AWSAccessKeyId=<key>&Signature=PcpH99nszG2%2Fv85z4IbgwgVDywc%3D&Expires=1613817035' | base64
- The output will be as follows:
QVdTQWNjZXNzS2V5SWQ9QUtJQVhPNU42R0dOQ05WMzUzV1MmU2lnbmF0dXJlPVBjcEg5OW5zekcyJTJGdjg1ejRJYmd3Z1ZEeXdjJTNEJkV4cGlyZXM9MTYxMzgxNzAzNQ==
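As an alternative to presign.py, the same two values (the Location and the base64-encoded query string) can be produced with boto3. This is a minimal sketch, not the bundled script; the bucket and object names are placeholders from this example, and it assumes boto3 is installed and your AWS credentials are configured:
#!/usr/bin/env python3
# Sketch: produce the dataset Location and base64-encoded query string
# with boto3 instead of presign.py. Bucket/key names are placeholders.
import base64
import boto3

s3 = boto3.client("s3")  # uses your configured AWS credentials

def presigned_parts(operation, bucket, key, expires=86400):
    # operation: "get_object" for downloads, "put_object" for uploads
    url = s3.generate_presigned_url(operation, Params={"Bucket": bucket, "Key": key}, ExpiresIn=expires)
    location, _, query = url.partition("?")
    return location, base64.b64encode(query.encode()).decode("ascii")

location, query_b64 = presigned_parts("get_object", "fortanix-pocs-data", "conditions.csv.enc")
print(location)   # use as the dataset Location
print(query_b64)  # use as the credentials "query_string"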
Perform the following steps to create an input dataset:
- Click the Datasets icon in the CCM left panel and click the CREATE DATASET button to create a new dataset.
Figure 1: Create a New Dataset
- In the Create new dataset form, enter the following details:
- Name – the dataset name. For example: Conditions Data.
- Description (optional) – the dataset description. For example: Patients with associated conditions.
- Labels (optional) – attach one or more key-value labels to the dataset. For example: Key: Location and Value: East US.
- Location – the AWS S3 URL where data can be accessed. For example: https://fortanix-pocs-data.s3.amazonaws.com/conditions.csv.enc
- Long Description (optional) – enter the content in GitHub-flavoured Markdown file format. You can also use the Fetch Long Description button to get the Markdown file content from an external URL.
Figure 2: Fetch Long Description Dialog Box
The following is the sample long description in Markdown format:
- Strikethrough Text
~~It is Strikethrough test..~~
- Blockquote Text
> Test Blockquote.
- Bold
**A sample Description.**
- Italic
*It is Italics*
- Bold and Italic
***Bold and Italics text***
- Link: This is [Website](https://www.fortanix.com/)
- Credentials – the credentials needed to access the data. The credentials must be in the correct JSON format (a helper sketch follows this procedure) and consist of:
- Query parameters that were base64 encoded.
- The key that was used to encrypt the file.
The format is:
{
  "query_string": "<my-query-string>",
  "encryption": {
    "key": "<my-key>"
  }
}
For example:
{
  "query_string": "QVdTQWNjZXNzS2V5SWQ9QUtJQVhPNU42R0dOQ05WMzUzV1MmU2lnbmF0dXJlPVBjcEg5OW5zekcyJTJGdjg1ejRJYmd3Z1ZEeXdjJTNEJkV4cGlyZXM9MTYxMzgxNzAzNQ==",
  "encryption": {
    "key": "63F0E4C07666126226D795027862ACC5848E939881C3CFE8CB3EB47DD7B3D24A"
  }
}
Figure 3: Create Input Dataset
- Click the CREATE DATASET button to create the input dataset.
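Optionally, the Credentials JSON can be assembled with a short script. A minimal sketch, assuming the base64 query string was saved to a file named query_string.b64 (a hypothetical name) and the key is in the key.hex file from the earlier step:
#!/usr/bin/env python3
# Sketch: assemble the dataset Credentials JSON. query_string.b64 is a
# hypothetical file holding the base64 output of the echo | base64 step.
import json

query_b64 = open("query_string.b64").read().strip()
key_hex = open("key.hex").read().strip()
print(json.dumps({"query_string": query_b64, "encryption": {"key": key_hex}}, indent=2))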
3.2 Output User (Data Receiver)
Once the data has been received by the Enclave, the user application runs within the Enclave and generates output data (processed data). This data should be encrypted (using your key) before being uploaded to an untrusted store. This is achieved by defining an output dataset to be used by the Workflow.
Perform the following steps:
- Run the following command to generate an encryption key:
xxd -u -l 32 -p /dev/random | tr -d '\n' > ./key_out.hex
- Use the presign.py script included in the tar file. Run the following command to generate a pre-signed URL for the upload:
./presign.py default upload <s3-directory> conditions_output.csv.enc 86400
where <s3-directory> is the path of your S3 bucket directory.
As an example, the output of the above command will be as shown below:
https://fortanix-pocs-data.s3.amazonaws.com/conditions_output.csv.enc?AWSAccessKeyId=<key>&Signature=HFvhxaiKY0cGR9XqgGLp5zcAWac%3D&Expires=1613817880
The output consists of two parts:
- The location: https://fortanix-pocs-data.s3.amazonaws.com/conditions_output.csv.enc
- Query parameters:
AWSAccessKeyId=<key>&Signature=HFvhxaiKY0cGR9XqgGLp5zcAWac%3D&Expires=1613817880
The query parameters must be base64 encoded for the dataset definition using the following command:
echo -n 'AWSAccessKeyId=<key>&Signature=HFvhxaiKY0cGR9XqgGLp5zcAWac%3D&Expires=1613817880' | base64
- The output will be as follows:
QVdTQWNjZXNzS2V5SWQ9QUtJQVhPNU42R0dOTk1SWFZLUEEmU2lnbmF0dXJlPUhGdmh4YWlLWTBjR1I5WHFnR0xwNXpjQVdhYyUzRCZFeHBpcmVzPTE2MTM4MTc4ODA=
- Create an output dataset with the following sample values:
- Name – the dataset name. For example: Conditions processing output.
- Description (optional) – the dataset description.
- Labels (optional) – attach one or more key-value labels to the dataset. For example: Key: Location and Value: East US.
- Location – the URL where data can be accessed. For example: https://fortanix-pocs-data.s3.amazonaws.com/conditions_output.csv.enc
- Long Description (optional) – enter the content in GitHub-flavoured Markdown file format. You can also use the Fetch Long Description button to get the Markdown file content from an external URL.
Figure 4: Fetch Long Description Dialog Box
The following is the sample long description:
- Strikethrough Text
~~It is Strikethrough test..~~
- Blockquote Text
> Test Blockquote.
- Bold
**A sample Description.**
- Italic
*It is Italics*
- Bold and Italic
***Bold and Italics text***
- Link: This is [Website](https://www.fortanix.com/)
- Credentials – the credentials needed to access the data. The credentials must be in the correct JSON format and consist of:
- Query parameters that were base64 encoded.
- The key that was used to encrypt the file.
The format is:
{
  "query_string": "<my-query-string>",
  "encryption": {
    "key": "<my-key>"
  }
}
For example:
{
  "query_string": "QVdTQWNjZXNzS2V5SWQ9QUtJQVhPNU42R0dOTk1SWFZLUEEmU2lnbmF0dXJlPUhGdmh4YWlLWTBjR1I5WHFnR0xwNXpjQVdhYyUzRCZFeHBpcmVzPTE2MTM4MTc4ODA=",
  "encryption": {
    "key": "63F0E4C07666126226D795027862ACC5848E939881C3CFE8CB3EB47DD7B3D24A"
  }
}
Figure 5: Create Output Dataset
3.3 Create a General Purpose Python Docker Image
Create a Docker image that will run arbitrary protected Python code. The following files are used; they are included in the tar file provided with this example:
Dockerfile:
FROM python:3.6
RUN apt-get update && apt-get install -y python3-cryptography python3-requests python3-pandas
RUN mkdir -p /opt/fortanix/enclave-os/app-config/rw
RUN mkdir -p /demo/code /demo/input
COPY ./start.py ./utils.py ./describe.py /demo/
CMD ["/demo/start.py"]
start.py: This file is the main entry point into the application.
#!/usr/bin/python3
import utils
from subprocess import PIPE, run

def main():
    # Inputs are written into the writable app-config area.
    input_folder = "/opt/fortanix/enclave-os/app-config/rw/"
    command = ["/usr/bin/python3", "/demo/describe.py"]
    # This downloads and decrypts all input data. File names are the object names from app config.
    for i in utils.read_json_datasets("input"):
        decrypted = utils.get_dataset(i)
        open(input_folder + i.name, 'wb').write(decrypted)
        # Add the file as an input argument for our script.
        command.append(input_folder + i.name)
    print("Running script")
    result = run(command, stdout=PIPE, stderr=PIPE, universal_newlines=True)
    # For simplicity, upload just stdout/stderr/returncode.
    utils.upload_result("output", result.returncode, result.stdout, result.stderr)
    print("Execution complete")

if __name__ == "__main__":
    main()
utils.py: This file contains the set of library functions.
#!/usr/bin/env python3
import os
import string
import json
import requests
import base64
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.ciphers import (
    Cipher, algorithms, modes
)

NONCE_SIZE = 12
TAG_SIZE = 16
KEY_SIZE = 32
PORTS_PATH = "/opt/fortanix/enclave-os/app-config/rw/"

def convert_key(key_hex):
    key = key_hex.rstrip()
    if len(key) != 64:
        raise Exception("Key file must be a 64 byte hex string for AES-256-GCM")
    if not all(c in string.hexdigits for c in key):
        raise Exception("Key must be a 64 character hex stream")
    return bytes.fromhex(key)

def encrypt_buffer(key, data):
    # Output framing: nonce || ciphertext || tag
    nonce = os.urandom(NONCE_SIZE)
    cipher = Cipher(algorithms.AES(key), modes.GCM(nonce), backend=default_backend())
    encryptor = cipher.encryptor()
    return nonce + encryptor.update(data.encode()) + encryptor.finalize() + encryptor.tag

def decrypt_buffer(key, data):
    tag = data[-TAG_SIZE:]
    nonce = data[:NONCE_SIZE]
    cipher = Cipher(algorithms.AES(key), modes.GCM(nonce, tag), backend=default_backend())
    decryptor = cipher.decryptor()
    return decryptor.update(data[NONCE_SIZE:len(data) - TAG_SIZE]) + decryptor.finalize()

class JsonDataset:
    def __init__(self, location, credentials, name):
        self.location = location
        self.credentials = credentials
        self.name = name

def read_json_datasets(port):
    # Each dataset connected to the port appears as a folder containing
    # dataset/location.txt and dataset/credentials.bin.
    ports = []
    for folder in os.listdir(PORTS_PATH + port):
        subfolder = PORTS_PATH + port + "/" + folder
        if os.path.exists(subfolder + "/dataset"):
            credentials = json.load(open(subfolder + "/dataset/credentials.bin", "r"))
            location = open(subfolder + "/dataset/location.txt", "r").read()
            ports.append(JsonDataset(location, credentials, folder))
    return ports

def get_dataset(dataset):
    url = dataset.location + "?" + base64.b64decode(dataset.credentials["query_string"]).decode('ascii')
    r = requests.get(url, allow_redirects=True)
    r.raise_for_status()
    print("Retrieved dataset from location: " + dataset.location)
    key = convert_key(dataset.credentials["encryption"]["key"])
    return decrypt_buffer(key, r.content)

class RunResult:
    def __init__(self, returncode, stdout, stderr):
        self.returncode = returncode
        self.stdout = base64.b64encode(stdout.encode()).decode('ascii')
        self.stderr = base64.b64encode(stderr.encode()).decode('ascii')

def upload_result(port, returncode, stdout, stderr):
    result = RunResult(returncode, stdout, stderr)
    json_str = json.dumps(result.__dict__)
    for dataset in read_json_datasets(port):
        url = dataset.location + "?" + base64.b64decode(dataset.credentials["query_string"]).decode('ascii')
        key = convert_key(dataset.credentials["encryption"]["key"])
        print("Writing output to location: " + dataset.location)
        requests.put(url, encrypt_buffer(key, json_str))
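As a quick sanity check of the nonce || ciphertext || tag framing used by encrypt_buffer and decrypt_buffer, the following sketch round-trips a string locally (assuming utils.py is in the current directory and the cryptography package is installed):
#!/usr/bin/env python3
# Sketch: local round-trip test for the AES-256-GCM helpers in utils.py.
import os
import utils

key = os.urandom(32)  # same 32-byte key size as the key.hex material
blob = utils.encrypt_buffer(key, "hello enclave")
assert utils.decrypt_buffer(key, blob) == b"hello enclave"
print("round-trip OK")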
describe.py: This file contains the custom code called by start.py.
#!/usr/bin/python3
import pandas as pd
import sys

# Print summary statistics for the DESCRIPTION column of each input CSV.
for i in sys.argv[1:]:
    df_do = pd.read_csv(i)
    print("Dataset: " + i + "\n")
    print(df_do['DESCRIPTION'].describe())
    print("")
Use the standard docker build and docker push commands to build your Docker image and push it to your registry. For example:
docker build -t <my-registry>/simple-python-sgx .
docker push <my-registry>/simple-python-sgx
4.0 Create a Nitro Enclave OS Application and Image
To learn the steps for creating a Nitro Enclave OS application and image, refer to User's Guide: Add and Edit an Application.
5.0 Approve Tasks
From the Fortanix CCM Tasks tab, fetch the domain and build whitelisting tasks and approve them.
Figure 6: Tasks
6.0 Create Application Configuration
The application in the Docker image discovers the Python script's inputs and outputs through an Application Configuration, which defines the ports.
Perform the following steps to create an Application Configuration:
- Click the Applications tab in the CCM left panel and from the left menu select Configurations.
- Click ADD CONFIGURATION to add a new configuration.
Figure 7: Create App Configuration
- In the ADD APPLICATION CONFIGURATION window, fill in the following:
- Image – select the <my-registry>/simple-python-sgx:latest application image for which you want to create a configuration, where <my-registry> is the location of your docker registry.
is the location of your docker registry - Name and Description – Enter a name and description of the configuration.
- Ports – Enter the connections to be used in the Workflow. You can add multiple ports depending on how the connection should work. For example: "input", "output", "heartbeat", and so on.
- Labels – attach one or more key-value labels to the app config.
- Configuration items – These are key-value pairs used for configuring the app.
- For Enclave OS applications, the Key is the path of the file that contains the Value for configuring the app (see the layout sketch at the end of this section).
Figure 8: Save Configuration
- Click SAVE CONFIGURATION to save the configuration.
Figure 9: Configuration Saved
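For reference, utils.py in the example image reads each port from the writable app-config area inside the enclave. Based on that code, the layout it expects looks roughly like the following (the dataset folder name is assigned at runtime; cxhcfxdvl below is illustrative):
/opt/fortanix/enclave-os/app-config/rw/
  input/
    cxhcfxdvl/
      dataset/
        location.txt      (the dataset Location URL)
        credentials.bin   (the Credentials JSON)
  output/
    ...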
7.0 Create Workflow with Nitro Application and Datasets
Perform the following steps to create a Workflow:
- Click the Workflows icon in the CCM left panel.
- In the Workflows page, click +WORKFLOW to create a new Workflow.
Figure 10: Create Workflow
- In the CREATE WORKFLOW dialog, enter the Workflow Name and Description (optional). Click CREATE to go to the Workflow graph.
- Add an app to the Workflow graph. To add an app to a Workflow graph, drag the App icon and drop it into the graph area. Click +APPLICATION. In the ADD APPLICATION dialog, the App Owner must select an existing application image, for example: <my-registry>/simple-python-sgx:latest from the list of available application images, where <my-registry> is the location of your registry.
- For the selected application image, the App Owner must create an app config or add an existing app config.
- Click SELECT APPLICATION to select the application.
Figure 11: Add App to Workflow
- Click SAVE AS DRAFT to save the draft.
Figure 12: Save Draft
- To access the draft Workflow, click the Draft tab in the Workflows left menu.
- Add an input and output dataset to the Workflow graph. To add a dataset to a Workflow graph, drag the dataset icon and drop it into the graph area. Click +DATASET. In the DATASET dialog, the Data Owner must select from existing datasets that were created in the section above.
Figure 13: Add Input Dataset to Workflow
Figure 14: Add Output Dataset to Workflow
- Create connections between the applications and input/output datasets. To do that, drag the Input Dataset connection point and join it to the Application connection point. In the SELECT PORTS window, select the Target Port as "input". Repeat the same to connect the Application to the Output Dataset and select the Target Port as "output".
Figure 15: Create Connection
- When the Workflow is complete, click the REQUEST APPROVAL button to generate the approval process for the Workflow.
Figure 16: Request Workflow Approval
- The Workflow is in a "pending" state until all the users approve it. In the Pending tab, click SHOW APPROVAL REQUEST to approve a Workflow.
Figure 17: Workflow Pending Approval
- In the APPROVAL REQUEST - CREATE WORKFLOW dialog, click APPROVE to approve the Workflow or DECLINE to reject it.
Figure 18: Approve the Workflow
- All the users of a Workflow must approve it to finalize it. If any user declines, the Workflow is rejected. When all the users approve the Workflow, it is deployed:
- CCM configures apps to access the datasets.
- CCM creates the Workflow Application Configs.
- CCM returns the list of hashes needed to start the apps.
8.0 Run Nitro Workflow
Perform the following steps to run a Nitro Workflow:
- Run the following command to execute the application image:
docker run -it --rm --privileged -v /run/nitro_enclaves:/run/nitro_enclaves -e NODE_AGENT=http://172.31.9.232:9092/v1/ -e CCM_BACKEND=ccm.fortanix.com:443 -e APPCONFIG_ID=e545d0ba32c0edf86226306cad924bcfa2ad9f7fd74dafd0f4c1c7724759a9df 513076507034.dkr.ecr.us-west-1.amazonaws.com/development-images/ccm-automation-output-images:python-converted642
where:
- 9092 is the default node agent port.
- 172.31.9.232 is the node agent host IP address.
- APPCONFIG_ID is the runtime configuration hash of the workflow app, which can be copied from the app info of the workflow.
- 513076507034.dkr.ecr.us-west-1.amazonaws.com/development-images/ccm-automation-output-images:python-converted642 is the converted app image, found in the Images tab under the Image Name column in the Images table.
- To verify that the application is running, click the Applications tab in the Fortanix CCM UI and confirm that a running application image is displayed in the detailed view of the application.
- After the App Owner starts the application with the application config identifier, the Data Output Owner can view the output using the following steps:
- Run the following command to download the output file:
aws s3 --profile download cp s3://<s3-directory>/conditions_output.csv.enc .
For example:
aws s3 --profile download cp s3://fortanix-pocs-data/conditions_output.csv.enc .
- Run the following command to decrypt the file, using the aes_256_gcm.py script provided in the tar file:
./workflow_scripts/aes_256_gcm.py dec -K ./key_out.hex -in ./conditions_output.csv.enc -out ./output.txt
$ cat output.txt | jq .
{
  "returncode": 0,
  "stdout": "RGF0YXNldDogL29wdC9mb3J0YW5peC9lbmNsYXZlLW9zL2FwcC1jb25maWcvaW5wdXQvY3hoY2Z4ZHZsCgpjb3VudCAgICAgICAgICAgICAgICAgICAgICAgICAgIDgzNzYKdW5pcXVlICAgICAgICAgICAgICAgICAgICAgICAgICAgMTI5CnRvcCAgICAgICBWaXJhbCBzaW51c2l0aXMgKGRpc29yZGVyKQpmcmVxICAgICAgICAgICAgICAgICAgICAgICAgICAgIDEyNDgKTmFtZTogREVTQ1JJUFRJT04sIGR0eXBlOiBvYmplY3QKCg==",
  "stderr": ""
}
The following is the expected output of the file:
$ cat output.txt | jq -r .stdout | base64 -d
Dataset: /opt/fortanix/enclave-os/app-config/input/cxhcfxdvl
count 8376
unique 129
top Viral sinusitis (disorder)
freq 1248
Name: DESCRIPTION, dtype: object
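If jq is not available, the same decoding can be done with a short Python sketch (reading the output.txt produced above):
#!/usr/bin/env python3
# Sketch: decode the decrypted result JSON without jq.
import base64
import json

result = json.load(open("output.txt"))
print("returncode:", result["returncode"])
print(base64.b64decode(result["stdout"]).decode())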