Bring Your Own Model Using Logistic Regression

Introduction

This document describes how to generate ONNX and PMML files for a Logistic Regression model and upload them to Fortanix Confidential AI using the Bring Your Own Model (BYOM) workflow. The same steps can be followed for other supported scikit-learn algorithms.

Prerequisites

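  • Install the following dependencies as required:
    • numpy
    • pandas
    • scikit-learn
    • skl2onnx (the sklearn-onnx package)
    • sklearn2pmml (used for PMML generation; requires a Java runtime)
  • A user account that is signed in to Confidential AI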

For instructions on how to sign up and log in, refer to the User’s Guide: Sign up for Confidential AI.
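
Before running the script, it can help to confirm that the required packages are importable. The following is a minimal sketch using only the Python standard library. The module list is an assumption taken from the import statements in the example script in this document (which also imports pandas and sklearn2pmml), and the helper name missing_modules is hypothetical.

```python
import importlib.util

# Import names used by the example script in this document.
# The import name can differ from the package name
# (scikit-learn is imported as "sklearn").
REQUIRED_MODULES = ["numpy", "sklearn", "skl2onnx", "sklearn2pmml", "pandas"]


def missing_modules(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```

This check only confirms that the modules can be located. It does not verify versions or the Java runtime that sklearn2pmml needs.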

Generate the ONNX and PMML Files

Script to Generate the Files

The script below generates the following files:

  • ONNX file - logistic_regr.onnx
  • PMML file - logistic_regr.pmml
  • Input dataset for Confidential AI - logistic_regr_input.csv
NOTE
This Python script is just an example. You can modify it to suit your requirements.
# This is an example program that generates logistic regression
# ONNX and PMML model files from scikit-learn's iris dataset.
# It also creates a CSV file that can be uploaded to Confidential AI (CAI)
# as the input dataset.
import numpy
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Train logistic regression model
iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y)

# all parameters not specified are set to their defaults
logisticRegr = LogisticRegression(max_iter=7600)
logisticRegr.fit(X_train, y_train)

# ONNX file generation
print("Generating ONNX file....", end="")
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
initial_type = [('float_input', FloatTensorType([None, 4]))]
onx = convert_sklearn(logisticRegr, initial_types=initial_type)
with open("logistic_regr.onnx", "wb") as f:
    f.write(onx.SerializeToString())
print('DONE!')

# PMML file generation
print("Generating PMML file....", end="")
from sklearn2pmml.pipeline import PMMLPipeline
from sklearn2pmml import sklearn2pmml
import pandas as pd
from sklearn.preprocessing import StandardScaler
pmml_file = 'logistic_regr.pmml'
steps = [
    ("scaler", StandardScaler()),
    ("classifier", LogisticRegression(max_iter=7600))
]

pipeline = PMMLPipeline(steps)
xdf = pd.DataFrame(X_train, columns = ['u','v','w','x'])
ydf = pd.DataFrame(y_train, columns = ['y'])
pipeline.fit(xdf, ydf.values.ravel())
sklearn2pmml(pipeline, pmml_file, with_repr = True)
print('DONE!')

# Input csv generation
print("Generating input csv....", end="")
(X_test_rows, _) = X_test.shape
z = numpy.zeros((X_test_rows,1))
X_test = numpy.append(X_test, z, axis=1)
numpy.savetxt("logistic_regr_input.csv", X_test, delimiter=",", header="u,v,w,x,y", comments="")
print("DONE!")

# Print a summary of the generated files and the variables to select in CAI
print("""This script has generated the following files:
1. logistic_regr.onnx
2. logistic_regr.pmml
3. logistic_regr_input.csv

The ML variables to use with the model:
Features: u,v,w,x
Target: y
""")

Run the script above with the following command to generate the ONNX, PMML, and input dataset files. Replace <filename> with the name under which you saved the script.

python3 -W ignore <filename>
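
After the script finishes, you may want to confirm that the input dataset has the layout described above before uploading it. The following is a minimal sketch using only the standard library. The function name check_input_csv is hypothetical, and the expected header is taken from the header argument that the example script passes to numpy.savetxt.

```python
import csv

# Column names written by the example script: four features and the target.
EXPECTED_HEADER = ["u", "v", "w", "x", "y"]


def check_input_csv(path, expected_header=EXPECTED_HEADER):
    """Return the number of data rows, or raise ValueError on a malformed file."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header != expected_header:
            raise ValueError(f"unexpected header: {header}")
        rows = 0
        for line_no, row in enumerate(reader, start=2):
            if len(row) != len(expected_header):
                raise ValueError(
                    f"line {line_no}: expected {len(expected_header)} fields, got {len(row)}"
                )
            for value in row:
                float(value)  # raises ValueError if a field is not numeric
            rows += 1
    return rows
```

For example, check_input_csv("logistic_regr_input.csv") returns the number of test rows written by the script, or raises ValueError if the header or any row is malformed.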

Prepare the PMML and ONNX Model for Confidential AI

This section describes how to prepare the PMML and ONNX models for the BYOM workflow in Confidential AI.

Remove the Output Tag for PMML File

Open the PMML file logistic_regr.pmml and remove the <Output> element, that is, the opening and closing <Output> tags and everything between them.

NOTE
The PMML file must not contain an <Output> element. This requirement applies only to BYOM with PMML models.
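
If you prefer not to edit the file by hand, the removal can be scripted. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The function names are hypothetical, and the sketch assumes that removing every element whose local name is Output is what is wanted. It writes the result to a new file so that the original stays untouched.

```python
import xml.etree.ElementTree as ET


def remove_output_elements(root):
    """Remove every <Output> element under `root` in place; return the count."""
    removed = 0
    for parent in list(root.iter()):
        for child in list(parent):
            # Tags look like "{namespace}Output"; compare the local name only.
            if child.tag.rsplit("}", 1)[-1] == "Output":
                parent.remove(child)
                removed += 1
    return removed


def strip_output_from_pmml(src_path, dst_path):
    """Write a copy of the PMML file at `src_path` without <Output> elements."""
    tree = ET.parse(src_path)
    root = tree.getroot()
    # Reuse the document's own default namespace so that tags are not
    # rewritten with generated prefixes such as "ns0:".
    if root.tag.startswith("{"):
        ET.register_namespace("", root.tag[1:].split("}", 1)[0])
    removed = remove_output_elements(root)
    tree.write(dst_path, encoding="UTF-8", xml_declaration=True)
    return removed
```

For example, strip_output_from_pmml("logistic_regr.pmml", "logistic_regr_no_output.pmml") returns the number of elements removed. The output file name here is only an illustration. Note that ElementTree drops XML comments when parsing, so the cleaned file may differ from the original in that respect as well.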

Upload the Input Dataset - Data Ingestion

  1. Upload the input dataset file logistic_regr_input.csv, generated using the script in Section: Script to Generate the Files, to Confidential AI.
    1. Click CREATE DATASET in the DATA INGESTION tab and select CSV Dataset to upload the dataset.
    Figure 1: Upload Dataset
    Figure 2: Upload Dataset
    Figure 3: Dataset uploaded

Data Preparation

In the previous phase (Data Ingestion), the column names of the tabular dataset were extracted. In this phase, you can optionally specify which of these columns should be used as the features (X) and which column should be used as the target (Y) for the subsequent model training phase.

  1. In the DATA PREPARATION tab, click ADD VARIABLES to select the features and target.
    Figure 4: Select variables
  2. Select u, v, w, and x as features and select y as the target.
    Figure 5: Features and target
  3. The variables are added.
    Figure 6: Variables added
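
To illustrate what selecting features and a target means for the uploaded CSV, the following sketch splits rows into feature values (X) and target values (Y) by column name. It is only an illustration under stated assumptions, not how Confidential AI implements this phase. The function name split_features_target and the sample rows are hypothetical. In the file generated by the example script, the y column holds placeholder zeros.

```python
import csv
import io


def split_features_target(csv_text, feature_names, target_name):
    """Split CSV text into feature rows (X) and target values (y) by column name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    X, y = [], []
    for row in reader:
        X.append([float(row[name]) for name in feature_names])
        y.append(float(row[target_name]))
    return X, y


# Two made-up rows with the same header as logistic_regr_input.csv.
sample = "u,v,w,x,y\n5.1,3.5,1.4,0.2,0\n6.2,2.9,4.3,1.3,0\n"
X, y = split_features_target(sample, ["u", "v", "w", "x"], "y")
print(X)
print(y)
```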

Upload Model

In this phase, you upload the trained ONNX and PMML model files generated using the script in Section: Script to Generate the Files and use them to make predictions in the next phase.

  1. In the ADD MODELS tab, click ADD MODEL and select Upload Model to upload the generated logistic_regr.onnx and logistic_regr.pmml model files.
    Figure 7: Upload model
    Figure 8: Upload ONNX model
    Figure 9: Upload PMML model
  2. The models are uploaded.
    Figure 10: Models uploaded

Data Inference

In this phase, the CSV data is passed through a machine learning model to identify and predict the output from the data.

  1. In the INFERENCE tab, click BUILD INFERENCE to predict the data output and download the output dataset.
    Figure 11: Build inference
  2. In the Build Inference form, enter the Inference flow name, that is, the name of the inference model.
  3. In the Select model section, select UPLOADED, and then select the trained ONNX model that you uploaded in Section: Upload Model from the drop-down list.
  4. In the Select input dataset field, select the input dataset that you uploaded in Section: Upload the Input Dataset - Data Ingestion and want to pass through the machine learning (ML) model.
  5. In the Select inference application section, select scikit-learn Prediction as the prediction algorithm.
  6. Select the ML variables: u, v, w, and x as the features and y as the target.
  7. In the Output Configuration field, enter the name of the output dataset that will contain the predicted output.
  8. The Encrypt Dataset option is selected by default to generate an encryption key and add an extra layer of protection to the output data. Copy or download the key to decrypt the output data for viewing.
    NOTE
    Failure to save the key will result in loss of data.
  9. Click CREATE INFERENCE FLOW to pass the data through a machine learning model and predict the output.
    Figure 12: Data Inference for ONNX model
  10. Repeat Steps 1-9 to build inference for the trained PMML model that you uploaded in Section: Upload Model.
    Figure 13: Data Inference for PMML model
  11. Click RUN below the ONNX and PMML inference workflows to run the models and predict the output.
    Figure 14: Run Inference for ONNX model
    Figure 15: Run Inference for PMML model
  12. If the models run successfully, the execution status for the ONNX and PMML models appears under the Execution Log. Click the Execution Log link to view the log details.
    Figure 16: Inference success
  13. Click the download report icon to download the execution log report.
    Figure 17: Download execution log report
  14. After the execution completes successfully, the predicted output is ready to be viewed. To view the output, click the DOWNLOAD button below the ONNX and PMML inference workflows.
    Figure 18: Download output for ONNX
  15. In the DOWNLOAD dialog box, enter the Encryption key to decrypt the output.
    Figure 19: Decrypt output for ONNX model
    This downloads logistic-regression-onnx-output.tar.gz and logistic-regression-pmml-output.tar.gz to your filesystem. Extract output.csv from each downloaded file to get the inference results.
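
The extraction step can also be scripted. The following is a minimal sketch using Python's standard tarfile module. The function name extract_output_csv is hypothetical, and because the internal layout of the downloaded archives is not described here, the sketch simply looks for the first member whose file name is output.csv.

```python
import os
import tarfile


def extract_output_csv(archive_path, dest_dir):
    """Extract the first member named output.csv from a .tar.gz archive.

    Returns the path of the extracted file, or raises FileNotFoundError.
    """
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile() and os.path.basename(member.name) == "output.csv":
                os.makedirs(dest_dir, exist_ok=True)
                dest_path = os.path.join(dest_dir, "output.csv")
                # Copy the bytes directly instead of calling extractall(), so
                # paths stored inside the archive cannot write outside dest_dir.
                with archive.extractfile(member) as src, open(dest_path, "wb") as dst:
                    dst.write(src.read())
                return dest_path
    raise FileNotFoundError(f"output.csv not found in {archive_path}")
```

Both archives contain a file named output.csv, so use a separate destination directory for each, for example extract_output_csv("logistic-regression-onnx-output.tar.gz", "onnx_output") and extract_output_csv("logistic-regression-pmml-output.tar.gz", "pmml_output"). The directory names are only illustrations.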
