This article describes how to build and test a model using the K-Nearest Neighbor (KNN) algorithm in Fortanix Confidential AI.
- Prerequisite: You are signed in to a Fortanix Confidential AI account.
For instructions to sign up and log in, refer to our User’s guide: Sign up for Confidential AI.
Building the Model
- On the Data Ingestion page, click CREATE DATASET, and select CSV Dataset.
- Dataset name - Enter a name for your dataset.
- Select the Upload a file option if you want to upload your data directly to the Fortanix Confidential AI platform.
- In the File Upload section, upload the file. For a CSV dataset, the headers (column names) are detected and displayed after the file is uploaded, for example: UserID, Gender, Age, and so on. The number of rows is also detected and displayed.
- To track what the data is used for, you can optionally add Labels in the form of “Key:Value” pairs.
- Click CREATE DATASET to save the data. Figure 1: CSV details
- You will now see the saved dataset in the dataset table.
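The header and row detection described in the upload step can be reproduced locally before uploading a file. The following is a minimal sketch using Python's standard `csv` module. It is not the platform's implementation, and the sample data (including the `Purchased` column) is hypothetical.

```python
import csv
import io


def summarize_csv(text):
    """Return (headers, row_count) for CSV text whose first row is the header."""
    reader = csv.reader(io.StringIO(text))
    headers = next(reader, [])
    # Count data rows, skipping blank lines (csv.reader yields [] for them).
    row_count = sum(1 for row in reader if row)
    return headers, row_count


# Hypothetical sample; UserID, Gender, and Age echo the example in the text,
# while Purchased is an assumed target column added for illustration.
sample = "UserID,Gender,Age,Purchased\n1,Male,19,0\n2,Female,35,1\n3,Female,26,0\n"

headers, rows = summarize_csv(sample)
print(headers)
print(rows)
```

Running a check like this first helps confirm that the file has a header row and the expected number of records.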
In this phase, you can choose the column names as a set of features and a target for the tabular dataset. You can choose multiple combinations of features and targets (collectively called Variables).
- On the Data Preparation page, click ADD VARIABLES to select the features and target. Figure 2: Add variables
- Select one or more features from the SET A FEATURE column and select one target from the SET A TARGET column.
- Click ADD to add the variables. Figure 3: Select the features and target
- Click SAVE to save the variables and proceed to the next phase, that is, build a model. Figure 4: Variables added
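Conceptually, choosing features and a target splits each row of the dataset into an input part and a value to predict. The sketch below illustrates that split in plain Python. The function name `split_variables`, the column names, and the rows are hypothetical and are not part of the product.

```python
def split_variables(rows, feature_names, target_name):
    """Split dict rows into a feature matrix X and a target list y."""
    X = [[row[name] for name in feature_names] for row in rows]
    y = [row[target_name] for row in rows]
    return X, y


# Hypothetical rows; Purchased is an assumed target column.
rows = [
    {"UserID": 1, "Gender": "Male", "Age": 19, "Purchased": 0},
    {"UserID": 2, "Gender": "Female", "Age": 35, "Purchased": 1},
    {"UserID": 3, "Gender": "Female", "Age": 26, "Purchased": 0},
]

# One combination of variables: two features and one target.
X, y = split_variables(rows, ["Gender", "Age"], "Purchased")
print(X)
print(y)
```

A different combination of variables would simply pass another list of feature names or another target name over the same rows.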
In the build a model stage, you can select the KNN algorithm to run on the dataset defined in the previous phases, to analyze and build AI models.
In the “Build a Model” form:
- Select the BUILD A MODEL tab and click BUILD MODEL to build a training model for the dataset created in the previous phase.
- Enter the Training flow name, that is, the name of the model.
- In the Training Dataset field, select the training dataset on which you want to run the KNN algorithm and build a trained model.
- In the Algorithm field, select the k-Nearest Neighbors algorithm.
- Select ML variables that you created in the Data Preparation phase.
- Enter the Neighbor count, which is the number of nearest neighbors (k) to use in the KNN algorithm.
- In the Model name field, enter the name of the output dataset. This is the output model that will be used in the data inference phase.
- Click BUILD MODEL to run the selected algorithm on the training data and build the model for inference. Figure 5: Build a model
- After the training model is built, you will see the model created under the Training flows. To run the training model, click the RUN button below the model.
Figure 6: Run training model
- If the model runs successfully, the status of the execution appears under the Execution Log. Click the Execution Log link to view the log details. Figure 7: Model training success
- Click the download report icon to download the execution log report. Figure 8: Execution log
- After the execution completes successfully, the model is trained and ready for inference, where input data is passed through the trained model to predict the output.
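To make the role of the Neighbor count concrete, here is a minimal sketch of KNN classification in plain Python. It assumes Euclidean distance and a simple majority vote. The source does not state which distance metric or tie-breaking rule the platform uses, so treat both as assumptions. The training points and labels are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k):
    """Predict a label for `query` by majority vote among its k nearest rows."""
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training rows")
    # Rank training rows by Euclidean distance to the query (assumed metric).
    neighbors = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    # Majority vote; on a tie, the label seen first among the nearest wins.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical training data: two numeric features and a yes/no target.
train_X = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [6.0, 6.0], [6.5, 7.0], [7.0, 6.0]]
train_y = ["no", "no", "no", "yes", "yes", "yes"]

near_first_group = knn_predict(train_X, train_y, [2.0, 2.0], k=3)
near_second_group = knn_predict(train_X, train_y, [6.0, 6.5], k=3)
print(near_first_group, near_second_group)
```

A small k makes predictions sensitive to individual noisy rows, while a large k smooths over local structure. An odd k is a common choice for two-class targets because it avoids tied votes.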
In this stage, the data is passed through a machine learning model to identify and predict the output from the data.
- In the INFERENCE tab, click BUILD INFERENCE to predict the data output.
- In the Build Inference form, enter the Inference flow name, that is, the name of the inference model.
- In the Select model section, select TRAINED, and select the trained model that was built in the “build a model” stage from the drop-down.
- In the Select input dataset field, select the input dataset you created in the first stage that you want to pass through a machine learning model.
- In the Select inference application section, select the prediction algorithm.
- Select the ML variables.
- In the Output Configuration field, enter the name of the output dataset that will contain the predicted output.
- The Encrypt Dataset option is selected by default to generate an encryption key and add an extra layer of protection to the output data. Copy or download the key to decrypt the output data for viewing.
- Click CREATE INFERENCE FLOW to pass the data through a machine learning model and predict the output.
Figure 9: Build inference
- The inference is successfully created. Click RUN below the inference workflow to run the model and predict the output. Figure 10: Run inference
- If the model runs successfully, the status of the execution appears under the Execution Log. Click the Execution Log link to view the log details. Figure 11: Inference success
- After the execution is completed successfully, the output is now predicted and ready to be viewed. To view the output, click the DOWNLOAD button. Figure 12: Download output
- In the DOWNLOAD dialog box, enter the Encryption key to decrypt the output. Figure 13: Decrypt output
A *.tar.gz file is generated on your local machine. Extract the contents of the file. A snapshot of the output appears as shown below.
Figure 14: Sample Output
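Extracting the downloaded archive can also be scripted. The sketch below uses Python's standard `tarfile` module. Because the actual name and contents of the downloaded file are not given here, it builds a small stand-in archive first, and the file names `output.tar.gz` and `predictions.csv` are hypothetical.

```python
import tarfile
import tempfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: reject unsafe members such as "../" paths.
            archive.extractall(dest, filter="data")
        else:
            archive.extractall(dest)
    return names


# Demo with a stand-in archive; replace archive_path with the downloaded file.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "predictions.csv").write_text("UserID,Prediction\n1,0\n")
    archive_path = tmp / "output.tar.gz"
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(tmp / "predictions.csv", arcname="predictions.csv")

    extracted = extract_archive(archive_path, tmp / "extracted")
    content = (tmp / "extracted" / "predictions.csv").read_text()

print(extracted)
```

On the command line, `tar -xzf` followed by the archive name performs the same extraction.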