The options available on the k-Nearest Neighbors Prediction dialogs are described below.

Variables In Input Data

All variables in the data set are listed here.

Selected Variables

Variables listed here will be utilized in the Analytic Solver Data Mining output.

Output Variable

Select the variable whose value is to be predicted.

Partition Data

Analytic Solver Data Mining includes the ability to partition a dataset from within a classification or prediction method by selecting Partition Options on the Parameters dialog. If this option is selected, Analytic Solver Data Mining will partition your dataset (according to the partition options you set) immediately before running the prediction method. If partitioning has already occurred on the dataset, this option will be disabled. For more information on partitioning, please see the Data Mining Partitioning chapter.
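As a rough illustration of what partitioning does before the prediction method runs, the sketch below shuffles the rows and splits them into a training and a validation partition. The function name, the 60/40 split, and the seed are assumptions chosen for the example; they are not taken from the product's partition options.

```python
import random

def partition_rows(rows, train_fraction=0.6, seed=12345):
    """Shuffle rows and split them into (training, validation) partitions.

    train_fraction and seed are illustrative defaults, not product settings.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(rows)
    # A seeded generator makes the same partition reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Ten row indices split 60/40 into training and validation partitions.
train, valid = partition_rows(range(10))
```

Every row lands in exactly one partition, so the model is fitted on the training rows and assessed on rows it has not seen.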

Rescale Data

Click Rescale Data to open the Rescaling dialog.

New in V2017: use Rescaling to normalize one or more features in your data during the data preprocessing stage. Analytic Solver Data Mining provides the following methods for feature scaling: Standardization, Normalization, Adjusted Normalization, and Unit Norm. For more information on this feature, see the Rescale Continuous Data section in the Transform Continuous Data chapter earlier in this guide.
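The four method names can be illustrated with their common textbook definitions. This is only a sketch: the formulas below (in particular the [-1, 1] range for Adjusted Normalization, the population form of the standard deviation, and the Euclidean norm for Unit Norm) are assumptions, and the exact formulas used by the Rescaling dialog may differ. All function names are hypothetical.

```python
import math

def standardize(values):
    """Center to mean 0 and scale to standard deviation 1 (population form)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std if std else 0.0 for v in values]

def normalize(values):
    """Min-max scaling onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def adjusted_normalize(values):
    """Min-max scaling onto [-1, 1] (assumed meaning of Adjusted Normalization)."""
    return [2.0 * v - 1.0 for v in normalize(values)]

def unit_norm(values):
    """Divide by the Euclidean (L2) norm so the vector has length 1."""
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm if norm else 0.0 for v in values]

feature = [2.0, 4.0, 6.0]
scaled_01 = normalize(feature)
scaled_pm1 = adjusted_normalize(feature)
scaled_z = standardize(feature)
scaled_unit = unit_norm([3.0, 4.0])
```

Rescaling matters for k-nearest neighbors because distances are dominated by whichever feature has the largest numeric range unless the features are brought onto comparable scales.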

# Neighbors (k)

This is the parameter k in the k-nearest neighbors algorithm. The value of k must be at least 1 and no greater than the smaller of 50 and the total number of observations (rows): with fewer than 50 rows, k can range from 1 to the number of rows; otherwise, k can range from 1 to 50. The default value is 1.
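The limits on k can be expressed as a small check. This is a sketch of the rule as stated here, not product code; the function names are hypothetical.

```python
def valid_k_range(n_rows, cap=50):
    """Return the (lowest, highest) permitted k for a dataset with n_rows rows."""
    if n_rows < 1:
        raise ValueError("the dataset must contain at least one row")
    # k can never exceed the number of rows, and is capped at 50.
    return 1, min(n_rows, cap)

def check_k(k, n_rows):
    """Return k unchanged if it is permitted, otherwise raise ValueError."""
    lowest, highest = valid_k_range(n_rows)
    if not lowest <= k <= highest:
        raise ValueError(f"k must be between {lowest} and {highest}, got {k}")
    return k

small = valid_k_range(30)    # fewer than 50 rows
large = valid_k_range(200)   # more than 50 rows
```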

Nearest Neighbors Search

If Search 1..K is selected, Analytic Solver Data Mining will display the output for the best k between 1 and the value entered for # Neighbors (k).

If Fixed K is selected, the output will be displayed for the specified value of k. This is the default setting.
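The difference between the two settings can be sketched as follows. The example assumes Euclidean distance, a prediction equal to the average output of the k nearest training rows, and "best" meaning the lowest root mean squared error on a validation set; the criterion the product actually uses to pick the best k is not stated here, so treat that choice and all function names as assumptions.

```python
import math

def knn_predict(train_x, train_y, query, k):
    """Average the outputs of the k training rows nearest to query."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training rows")
    # Rank training rows by Euclidean distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return sum(train_y[i] for i in order[:k]) / k

def validation_rmse(train_x, train_y, valid_x, valid_y, k):
    """Root mean squared error of k-NN predictions on the validation rows."""
    errors = [knn_predict(train_x, train_y, x, k) - y
              for x, y in zip(valid_x, valid_y)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def search_best_k(train_x, train_y, valid_x, valid_y, max_k):
    """Search 1..K: score every k from 1 to max_k and return the best one."""
    scores = {k: validation_rmse(train_x, train_y, valid_x, valid_y, k)
              for k in range(1, max_k + 1)}
    return min(scores, key=scores.get), scores

train_x = [(1.0,), (2.0,), (3.0,), (10.0,)]
train_y = [1.0, 2.0, 3.0, 10.0]
valid_x = [(2.5,)]
valid_y = [2.5]

# Fixed K: a single prediction using exactly the k that was entered.
fixed_prediction = knn_predict(train_x, train_y, (2.5,), k=2)

# Search 1..K: try k = 1, 2, 3, 4 and keep the k with the lowest error.
best_k, scores = search_best_k(train_x, train_y, valid_x, valid_y, max_k=4)
```

Fixed K does one pass with the chosen k, while Search 1..K repeats the scoring for every k up to the entered value, so it takes longer on large datasets.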

Score Training Data

Select these options to show an assessment of the performance of the algorithm in predicting the Training Set. Reports are produced according to the options selected: Detailed, Summary, and Lift Charts.

Score Validation Data

These options are enabled when a Validation Set exists. Select these options to show an assessment of the performance of the algorithm in predicting the Validation Set. Reports are produced according to the options selected: Detailed, Summary, and Lift Charts.

Score Test Data

These options are enabled when a Test Set exists. Select these options to show an assessment of the performance of the algorithm in predicting the Test Set. Reports are produced according to the options selected: Detailed, Summary, and Lift Charts.
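For any of the three partitions, the distinction between a detailed and a summary assessment can be sketched as below. The contents shown (per-row residuals for the detailed view; SSE, MSE, and RMSE for the summary) are assumptions chosen as typical prediction metrics, and the function names are hypothetical; the actual reports may contain different or additional fields.

```python
import math

def detailed_report(actual, predicted):
    """One entry per row: actual value, predicted value, and residual."""
    return [{"actual": a, "predicted": p, "residual": a - p}
            for a, p in zip(actual, predicted)]

def summary_report(actual, predicted):
    """Aggregate error measures over the whole partition."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    sse = sum(r * r for r in residuals)   # sum of squared errors
    mse = sse / len(residuals)            # mean squared error
    return {"SSE": sse, "MSE": mse, "RMSE": math.sqrt(mse)}

actual = [3.0, 5.0]
predicted = [2.0, 7.0]
detail = detailed_report(actual, predicted)
summary = summary_report(actual, predicted)
```

Comparing the summary figures for the Training Set against those for the Validation or Test Set gives a quick indication of whether the chosen k generalizes to unseen rows.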

Score New Data

See the Scoring New Data section for information on the KNNP_Stored worksheet.