Introduction
This document describes how to generate ONNX and PMML files for a Logistic Regression model and upload them to Fortanix Confidential AI using the Bring Your Own Model (BYOM) workflow. The same steps apply to the other supported scikit-learn algorithms.
Prerequisites
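- Install the following Python dependencies, all of which the script in this document imports:
  - numpy
  - scikit-learn
  - sklearn-onnx (installed as the skl2onnx package)
  - pandas
  - sklearn2pmml (requires a Java runtime)
- A user account signed in to Confidential AI.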
For instructions on how to sign up and log in, refer to the User’s Guide: Sign up for Confidential AI.
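Before running the generation script, it can help to confirm that the dependencies are importable. The following is a minimal sketch, assuming the import names used by the script in this document (sklearn for scikit-learn, skl2onnx for sklearn-onnx). The helper name missing_modules is hypothetical and not part of Confidential AI.

```python
import importlib.util

# Import names (not PyPI names) of the packages the generation script uses.
REQUIRED_MODULES = ["numpy", "sklearn", "skl2onnx", "pandas", "sklearn2pmml"]


def missing_modules(module_names):
    """Return the module names that cannot be found in this environment."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

This check only confirms that each module can be located. It does not verify versions or the Java runtime that sklearn2pmml needs.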
Generate the ONNX and PMML files
Script to Generate the Files
The script below generates the following files:
- ONNX file: logistic_regr.onnx
- PMML file: logistic_regr.pmml
- Input dataset for Confidential AI: logistic_regr_input.csv
# This is an example program that generates logistic regression
# ONNX and PMML model files from scikit-learn's iris dataset.
# This program also creates a CSV file that can be uploaded to CAI
# as the input dataset.
import numpy
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.utils.validation import column_or_1d
# Train logistic regression model
iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y)
# all parameters not specified are set to their defaults
logisticRegr = LogisticRegression(max_iter=7600)
logisticRegr.fit(X_train, y_train)
# ONNX file generation
print("Generating ONNX file....", end="")
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
initial_type = [('float_input', FloatTensorType([None, 4]))]
onx = convert_sklearn(logisticRegr, initial_types=initial_type)
with open("logistic_regr.onnx", "wb") as f:
    f.write(onx.SerializeToString())
print('DONE!')
# PMML file generation
print("Generating PMML file....", end="")
from sklearn2pmml.pipeline import PMMLPipeline
from sklearn2pmml import sklearn2pmml
import pandas as pd
from sklearn.preprocessing import StandardScaler
pmml_file = 'logistic_regr.pmml'
steps = [
    ("scaler", StandardScaler()),
    ("classifier", LogisticRegression(max_iter=7600))
]
pipeline = PMMLPipeline(steps)
xdf = pd.DataFrame(X_train, columns = ['u','v','w','x'])
ydf = pd.DataFrame(y_train, columns = ['y'])
pipeline.fit(xdf, ydf.values.ravel())
sklearn2pmml(pipeline, pmml_file, with_repr = True)
print('DONE!')
# Input csv generation
print("Generating input csv....", end="")
(X_test_rows, _) = X_test.shape
# Append a zero-filled placeholder column for the target "y"
z = numpy.zeros((X_test_rows,1))
X_test = numpy.append(X_test, z, axis=1)
numpy.savetxt("logistic_regr_input.csv", X_test, delimiter=",", header="u,v,w,x,y", comments="")
print("DONE!")
#
print("""This script has generated the following files:
1. logistic_regr.onnx
2. logistic_regr.pmml
3. logistic_regr_input.csv
The ML variables to use the model
Features: u,v,w,x
Target: y
""")
Run the script with the following command to generate the ONNX, PMML, and input dataset files. The -W ignore option suppresses Python warnings.
python3 -W ignore <filename>
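After the script finishes, a quick local check of the generated CSV can catch problems before upload. The following is a minimal sketch, assuming the header u,v,w,x,y written by the script above. The function name check_input_csv is hypothetical.

```python
import csv

# Column names the generation script writes to the CSV header.
EXPECTED_COLUMNS = ["u", "v", "w", "x", "y"]


def check_input_csv(path, expected_columns=EXPECTED_COLUMNS):
    """Return the number of data rows; raise ValueError if the file is malformed."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header != list(expected_columns):
            raise ValueError(f"unexpected header: {header}")
        row_count = 0
        for row in reader:
            if len(row) != len(expected_columns):
                raise ValueError(f"row {row_count + 1} has {len(row)} fields")
            for value in row:
                float(value)  # raises ValueError on a non-numeric field
            row_count += 1
    return row_count


if __name__ == "__main__":
    print(check_input_csv("logistic_regr_input.csv"), "data rows")
```

The check covers only the header and numeric shape of the file. It makes no claim about what Confidential AI validates on upload.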
Prepare the PMML and ONNX Model for Confidential AI
This section describes, by example, how to prepare the PMML and ONNX models for the BYOM workflow in Confidential AI.
Remove the Output Tag for PMML File
Open the PMML file logistic_regr.pmml and remove the <Output> tag, including everything between its opening and closing tags.
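The removal can also be scripted instead of editing the file by hand. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. It assumes that every Output element in the file should be removed and that the file uses a single default PMML namespace. The function name remove_output_elements is hypothetical.

```python
import xml.etree.ElementTree as ET


def _local_name(tag):
    """Strip a leading '{namespace}' from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def remove_output_elements(src_path, dst_path):
    """Write a copy of the PMML file without <Output> elements.

    Returns the number of elements removed.
    """
    tree = ET.parse(src_path)
    root = tree.getroot()
    # Keep the document's default namespace unprefixed when writing.
    if root.tag.startswith("{"):
        ET.register_namespace("", root.tag[1:].split("}", 1)[0])
    removed = 0
    # Snapshot the element list so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if _local_name(child.tag) == "Output":
                parent.remove(child)
                removed += 1
    tree.write(dst_path, encoding="utf-8", xml_declaration=True)
    return removed


if __name__ == "__main__":
    count = remove_output_elements("logistic_regr.pmml", "logistic_regr.pmml")
    print(f"Removed {count} Output element(s)")
```

Writing to a separate destination path first is a reasonable precaution if the original file should be kept for comparison.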
Upload the Input Dataset - Data Ingestion
- Upload the input dataset file logistic_regr_input.csv, generated using the script in Section: Script to Generate the Files, to Confidential AI.
- Click CREATE DATASET in the DATA INGESTION tab and select CSV Dataset to upload the dataset.
Figure 1: Upload Dataset
Figure 2: Upload Dataset
Figure 3: Dataset uploaded
Data Preparation
In the previous phase (Data Ingestion), the column names of the tabular dataset were extracted. In this phase, you can optionally specify which of these columns should be used as the features (X) and which column should be used as the target (Y) for the subsequent model training phase.
- In the DATA PREPARATION tab, click ADD VARIABLES to select the features and target.
Figure 4: Select variables
- Select u, v, w, and x as features and select y as the target.
Figure 5: Features and target
- The variables are added.
Figure 6: Variables added
Upload Model
In this phase, you can upload the trained PMML and ONNX model files generated using the script in Section: Script to Generate the Files and use them to make predictions in the next phase.
- In the ADD MODELS tab, click ADD MODEL and select Upload Model to upload the generated logistic_regr.onnx and logistic_regr.pmml model files.
Figure 7: Upload model
Figure 8: Upload ONNX model
Figure 9: Upload PMML model
- The models are uploaded.
Figure 10: Models uploaded
Data Inference
In this phase, the CSV data is passed through a machine learning model to identify and predict the output from the data.
- In the INFERENCE tab, click BUILD INFERENCE to predict the data output and download the output dataset.
Figure 11: Build inference
- In the Build Inference form, enter the Inference flow name, that is, the name of the inference model.
- In the Select model section, select UPLOADED, and then select the ONNX trained model that was uploaded in the Upload Model phase from the drop-down.
- In the Select input dataset field, select the input dataset uploaded in Section: Upload the Input Dataset - Data Ingestion. This is the dataset that is passed through the machine learning (ML) model.
- In the Select inference application section, select scikit-learn Prediction as the prediction algorithm.
- Select the ML variables.
- In the Output Configuration field, enter the name of the output dataset that will contain the predicted output.
- The Encrypt Dataset option is selected by default to generate an encryption key and add an extra layer of protection to the output data. Copy or download the key to decrypt the output data for viewing.
- Click CREATE INFERENCE FLOW to pass the data through a machine learning model and predict the output.
Figure 12: Data Inference for ONNX model
- Repeat Steps 1-9 to build inference for the PMML trained model that was uploaded in the Upload Model phase.
Figure 13: Data Inference for PMML model
- Click RUN below the ONNX and PMML inference workflows to run the models and predict the output.
Figure 14: Run Inference for ONNX model
Figure 15: Run Inference for PMML model
- If a model executes successfully, its execution status appears under the Execution Log for the ONNX and PMML models. Click the Execution Log link to view the log details.
Figure 16: Inference success
- Click the download report icon to download the execution log report.
Figure 17: Download execution log report
- After the execution completes successfully, the predicted output is ready to view. To view the output, click the DOWNLOAD button below the ONNX and PMML inference workflows.
Figure 18: Download output for ONNX
- In the DOWNLOAD dialog box, enter the Encryption key to decrypt the output.
Figure 19: Decrypt output for ONNX model
This downloads logistic-regression-onnx-output.tar.gz and logistic-regression-pmml-output.tar.gz to your filesystem. Extract output.csv from each downloaded file to get the inference results.
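The extraction step can be sketched in Python with the standard tarfile module. This is a minimal sketch under stated assumptions: the downloaded archive is already decrypted, it is a gzip-compressed tar file, and it contains a UTF-8 file named output.csv somewhere inside. The internal directory layout is not documented here, so the code searches by file name. The function name read_output_csv is hypothetical.

```python
import csv
import io
import tarfile


def read_output_csv(archive_path, member_name="output.csv"):
    """Return the rows of the first archive member whose file name matches."""
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            base_name = member.name.rsplit("/", 1)[-1]
            if member.isfile() and base_name == member_name:
                # Read the member in memory instead of extracting to disk.
                data = archive.extractfile(member).read().decode("utf-8")
                return list(csv.reader(io.StringIO(data, newline="")))
    raise FileNotFoundError(f"{member_name} not found in {archive_path}")


if __name__ == "__main__":
    for name in ("logistic-regression-onnx-output.tar.gz",
                 "logistic-regression-pmml-output.tar.gz"):
        rows = read_output_csv(name)
        print(name, "contains", len(rows), "rows")
```

Reading the member directly avoids writing archive contents to arbitrary paths, which is a sensible default when handling downloaded archives.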