This classification method is widely used as a non-parametric technique in pattern recognition and statistical estimation. The k-nearest neighbors algorithm (K-NN) stores all available cases and classifies new cases based on a similarity measure such as a distance function. Because K-NN keeps the entire training dataset, no explicit learning step is required. For both classification and regression, it can be useful to weight the contributions of the neighbors; a common scheme gives each neighbor a weight of 1/d, where d is the distance to the neighbor. A prediction for a new data point X is made by searching the entire training dataset for the K most similar instances (the neighbors) and combining the output variable of those K instances. The steps to optimize K-NN are as follows: compute the distances from every unknown sample to the known samples, pick the K nearest neighbors that have the same label, and convert the distance matrix to the pairwise distances between the K neighbors (Zhang, 2006).

The difficulty with this kind of method increases when every labeled sample carries its own importance in deciding the class label of the new sample to be classified. For example, if the training dataset contains multiple class labels, then it may be difficult to identify the correct neighboring class points for a given sample. The Bayes error rate is the lowest possible error rate that any classifier can achieve in statistical classification, for example when a random outcome must be assigned to one of two classes. It is fixed by the class distributions and does not depend on the amount of data, so it serves as the reference against which the error of K-NN, with its simple nonparametric procedure and decision rule for assigning a class label to the input data, is judged.

In Figure 1, the nearest-neighbor procedure is illustrated by the predicted class labels (fault free and pump fault) for feature X and feature Y.
A point near the decision boundary is classified by a majority vote of its neighbors: it is assigned the label (fault free or pump fault) that is most common among its K nearest neighbors, as measured by a distance function. For categorical values the distance function works as follows: if X is fault free and Y is fault free, the distance is 0; if X is pump fault and Y is fault free, the distance is 1.
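The majority vote and the 1/d weighting described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the Euclidean distance, the function names (`euclidean`, `match_distance`, `knn_predict`), the small constant that guards against division by zero, and the three training points are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_distance(x, y):
    """Distance between two categorical values: 0 if equal, 1 otherwise."""
    return 0 if x == y else 1


def knn_predict(train, query, k=3, weighted=False):
    """Predict the label of `query` from its k nearest training samples.

    train    -- list of (feature_vector, label) pairs
    weighted -- if True, each neighbor votes with weight 1/d
    """
    # Sort the training samples by distance to the query and keep the k nearest.
    neighbors = sorted(train, key=lambda item: euclidean(item[0], query))[:k]

    votes = defaultdict(float)
    for features, label in neighbors:
        d = euclidean(features, query)
        # The small constant is an assumption that avoids dividing by zero
        # when the query coincides with a training sample.
        votes[label] += 1.0 / (d + 1e-9) if weighted else 1.0

    # The label with the largest (weighted) vote wins.
    return max(votes, key=votes.get)


# Hypothetical samples: (feature X, feature Y) -> class label.
train = [
    ((0.0, 0.0), "fault free"),
    ((1.0, 0.0), "pump fault"),
    ((1.1, 0.0), "pump fault"),
]
query = (0.1, 0.0)

print(knn_predict(train, query, k=3))                 # pump fault (2 votes to 1)
print(knn_predict(train, query, k=3, weighted=True))  # fault free (closest neighbor dominates)
print(match_distance("fault free", "fault free"))     # 0
print(match_distance("pump fault", "fault free"))     # 1
```

In this toy example the unweighted vote follows the two more distant pump fault samples, whereas the 1/d weighting lets the single very close fault free sample decide the label.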