This article describes how to build and test a model using the K-Nearest Neighbor (KNN) algorithm in Fortanix Confidential AI.
Prerequisites
- A user signed in to a Fortanix Confidential AI account.
For instructions to sign up and log in, refer to our User’s guide: Sign up for Confidential AI.
Building the Model
- On the Data Ingestion page, click CREATE DATASET, and select CSV Dataset.
- Dataset name - Enter a name for your dataset.
- Select the Upload a file option if you want to upload your data directly to the Fortanix Confidential AI platform.
- In the File Upload section, upload the file. After a CSV file is uploaded, the headers (column names) are detected and displayed, for example UserID, Gender, Age, and so on. The number of rows is also detected and displayed.
- To track what the data is used for, you can optionally add Labels in the form of “Key:Value” pairs.
- Click CREATE DATASET to save the data. Figure 1: CSV details
- You will now see the saved dataset in the dataset table.
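The header and row-count detection described above can be reproduced locally to check a CSV file before uploading it. The following is a minimal sketch using Python's standard `csv` module, not the platform's own logic. The sample data, the `Purchased` column, and the `summarize_csv` helper are hypothetical and only illustrate the idea.

```python
import csv
import io

# Hypothetical sample data; UserID, Gender, and Age mirror the example
# headers in the text, while Purchased is an assumed target column.
SAMPLE_CSV = """UserID,Gender,Age,Purchased
1,Male,19,0
2,Female,35,0
3,Female,47,1
"""


def summarize_csv(text):
    """Return (headers, row_count) for CSV text whose first row is a header."""
    reader = csv.reader(io.StringIO(text))
    # An empty input yields an empty header list instead of raising.
    headers = next(reader, [])
    # Count only non-empty data rows after the header.
    row_count = sum(1 for row in reader if row)
    return headers, row_count


headers, row_count = summarize_csv(SAMPLE_CSV)
print(headers)
print(row_count)
```

If the headers or row count printed here differ from what the File Upload section shows, the file may contain blank lines or a missing header row.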
In this phase, you can choose the column names as a set of features and a target for the tabular dataset. You can choose multiple combinations of features and targets (collectively called Variables).
- On the Data Preparation page, click ADD VARIABLES to select the features and target. Figure 2: Add variables
- Select one or more features from the SET A FEATURE column and select one target from the SET A TARGET column.
- Click ADD to add the variables. Figure 3: Select the features and target
- Click SAVE to save the variables and proceed to the next phase, that is, build a model. Figure 4: Variables added
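Conceptually, choosing variables splits each row of the dataset into a list of feature values and a single target value. The sketch below shows that split with the standard `csv` module. It is an illustration only: the sample data, the `Purchased` target column, and the `split_variables` helper are assumptions, not part of the platform.

```python
import csv
import io

# Hypothetical sample data; Purchased is an assumed target column.
SAMPLE_CSV = """UserID,Gender,Age,Purchased
1,Male,19,0
2,Female,35,0
3,Female,47,1
"""


def split_variables(text, feature_names, target_name):
    """Split CSV rows into feature lists and target values (all as strings)."""
    reader = csv.DictReader(io.StringIO(text))
    features, targets = [], []
    for row in reader:
        # Keep the features in the order the caller selected them.
        features.append([row[name] for name in feature_names])
        targets.append(row[target_name])
    return features, targets


# One combination of variables: two features and one target.
X, y = split_variables(SAMPLE_CSV, ["Gender", "Age"], "Purchased")
print(X)
print(y)
```

Calling `split_variables` again with a different feature list corresponds to adding another combination of variables for the same dataset.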
In the build a model stage, you can select the KNN algorithm to run on the dataset defined in the previous phases, to analyze and build AI models.
In the “Build a Model” form:
- Select the BUILD A MODEL tab and click BUILD MODEL to build a training model for the dataset created in the previous phase.
- Enter the Training flow name, that is, the name of the model.
- In the Training Dataset field, select the training dataset on which you want to run the KNN algorithm and build a trained model.
- In the Algorithm field, select the k-Nearest Neighbors algorithm.
- Select ML variables that you created in the Data Preparation phase.
- Enter the Neighbor count, which is the number of nearest neighbors (k) to use in the KNN algorithm.
- In the Model name field, enter the name of the output dataset. This is the output model that will be used in the data inference phase.
- Click BUILD MODEL to run the selected algorithm on the training data and build the model for inference. Figure 5: Build a model
- After the training model is built, you will see the model created under the Training flows. To run the training model, click the RUN button below the model.
Figure 6: Run training model
- If the model executes successfully, you will see the status of the execution under the Execution Log. Click the Execution Log link to view the log details. Figure 7: Model training success
- Click the download report icon to download the execution log report. Figure 8: Execution log
- After the execution completes successfully, the model is trained and ready for inference, where input data is passed through it to predict the output.
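To make the role of the Neighbor count (k) concrete, here is a minimal sketch of how a KNN classifier predicts a label. It assumes Euclidean distance and a simple majority vote, which are common defaults but not necessarily what Fortanix Confidential AI uses internally. The `knn_predict` function and the sample points are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Predict the label of `query` by majority vote among its k nearest points."""
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each training point's Euclidean distance to the query with its label,
    # then sort by distance only (assumed metric; others are possible).
    distances = sorted(
        ((math.dist(point, query), label)
         for point, label in zip(train_points, train_labels)),
        key=lambda pair: pair[0],
    )
    nearest_labels = [label for _, label in distances[:k]]
    # The most frequent label among the k nearest neighbors wins.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature training data with two well-separated classes.
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0),
                (6.0, 6.0), (7.0, 7.0), (6.5, 5.5)]
train_labels = ["no", "no", "no", "yes", "yes", "yes"]

print(knn_predict(train_points, train_labels, (2.0, 2.0), k=3))
print(knn_predict(train_points, train_labels, (6.0, 6.5), k=3))
```

A small k makes predictions sensitive to individual nearby points, while a large k smooths them toward the overall majority class. An odd k is often chosen for two-class problems to avoid tied votes.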
In this stage, the data is passed through a machine learning model to identify and predict the output from the data.
- In the INFERENCE tab, click BUILD INFERENCE to predict the data output.
- In the Build Inference form, enter the Inference flow name, that is, the name of the inference model.
- In the Input dataset field, select the training dataset that you created in the first stage and want to pass through the machine learning model.
- In the Algorithm field, select the prediction algorithm.
- In the Model field, select the trained model that was built in the “build a model” stage.
- In the Output Configuration field, enter the name of the output dataset that will contain the predicted output.
- The output dataset is encrypted: Encrypt Dataset is enabled to add an extra layer of protection to the output data. Copy or download the encryption key, because you will need it to decrypt the output data for viewing.
- Click CREATE INFERENCE FLOW to pass the data through a machine learning model and predict the output. Figure 9: Build inference
- The inference is successfully created. Click RUN below the inference workflow to run the model and predict the output. Figure 10: Run inference
- If the model executes successfully, you will see the status of the execution under the Execution Log. Click the Execution Log link to view the log details. Figure 11: Inference success
- After the execution is completed successfully, the output is now predicted and ready to be viewed. To view the output, click the DOWNLOAD button. Figure 12: Download output
- In the DOWNLOAD dialog box, enter the Encryption key to decrypt the output. Figure 13: Decrypt output
A *.tar.gz file is generated on your local machine. Extract the contents of the file. A snapshot of the output appears as shown below.
Figure 14: Sample Output
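The downloaded archive can be extracted with any tool that handles `.tar.gz` files. As one option, the sketch below uses Python's standard `tarfile` module. It assumes the download is already decrypted. Because the real output is not available here, the demo builds a stand-in archive. The file names `output.tar.gz` and `predictions.csv`, their contents, and the `extract_output` helper are all hypothetical.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def extract_output(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    root = dest.resolve()
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            # Refuse members that would be written outside dest_dir.
            target = (root / member.name).resolve()
            if root != target and root not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        names = archive.getnames()
        archive.extractall(dest)
    return names


# Demo with a stand-in archive (hypothetical names and contents).
with tempfile.TemporaryDirectory() as tmp:
    archive_path = Path(tmp) / "output.tar.gz"
    payload = b"UserID,Prediction\n1,0\n"
    with tarfile.open(archive_path, "w:gz") as archive:
        info = tarfile.TarInfo(name="predictions.csv")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))

    extracted = extract_output(archive_path, Path(tmp) / "extracted")
    extracted_text = (Path(tmp) / "extracted" / "predictions.csv").read_text()
    print(extracted)
```

To use it on a real download, call `extract_output` with the path of the downloaded file and a destination folder of your choice.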