Data mining and knowledge discovery are closely related. Knowledge discovery takes the patterns extracted by data mining and transforms them into useful information (Wright P., 1998). Knowledge Discovery in Databases (KDD) proceeds through a series of steps to produce the most accurate results possible: selection, pre-processing, transformation, data mining, interpretation and evaluation. Later in this paper, we discuss the use of data mining in the real world by showing its importance in estimating the number of obese and overweight people in Australia.
Data mining is a process that describes data and makes predictions from it, using methods such as classification, regression, clustering, summarization, dependency modelling, and change and deviation detection.
Appropriate technologies have simplified many tasks that were once complex. Data mining tools and techniques have made it easier to collect, sort and evaluate data. Techniques such as K-means and EM (expectation-maximization) help us extract information from a dataset.
The choice of data mining algorithm and technique depends entirely on the field being explored.
The K-Means clustering algorithm was invented in 1956. Its most common form uses an iterative refinement heuristic known as Lloyd's algorithm. Lloyd's algorithm starts by partitioning the input points into k initial sets, either at random or using some heuristic. It then calculates the mean point, or centroid, of each set. Next, it constructs a new partition by associating each point with the closest centroid. The centroids are then recalculated for the new clusters, and the algorithm repeats these two steps in alternation until convergence, which is reached when the points no longer switch clusters (or, alternatively, when the centroids no longer change).
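The two alternating steps described above can be illustrated with a minimal Python sketch. It is not taken from any particular tool used in this paper. The function names (lloyd_kmeans, squared_distance, mean_point), the max_iter and seed parameters, the use of squared Euclidean distance, and the choice to leave a centroid where it is if its cluster becomes empty are all assumptions made for illustration.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean (centroid) of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def lloyd_kmeans(points, k, max_iter=100, seed=0):
    """Illustrative Lloyd's algorithm; assumes len(points) >= k."""
    rng = random.Random(seed)

    # Initial partition: shuffle the point indices and deal them out
    # round-robin, so that each of the k sets starts non-empty.
    order = list(range(len(points)))
    rng.shuffle(order)
    assignment = [0] * len(points)
    for rank, idx in enumerate(order):
        assignment[idx] = rank % k

    centroids = [None] * k
    for _ in range(max_iter):
        # Step 1: compute the centroid of each current set.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # assumption: an empty set keeps its old centroid
                centroids[j] = mean_point(members)

        # Step 2: associate each point with its closest centroid.
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]

        # Convergence: no point switched clusters.
        if new_assignment == assignment:
            break
        assignment = new_assignment

    return centroids, assignment


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centroids, labels = lloyd_kmeans(data, k=2)
    print(centroids)
    print(labels)
```

On the small two-group dataset in the usage example, the sketch separates the three points near the origin from the three points near (10, 10). Which group receives label 0 and which receives label 1 depends on the random initial partition.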
The K-Means algorithm is an algorithm to...