Clustering in data mining is viewed as an unsupervised method of data analysis. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Clustering helps to discover groups and to identify interesting distributions in the underlying data. It is one of the most useful techniques in exploratory data analysis. It is also used in grouping, decision-making, and machine-learning settings, including data mining, document retrieval, image segmentation, classification and image processing. Traditional clustering algorithms favour clusters with spherical shapes and similar sizes, and are very fragile in the presence of outliers. Clustering plays a major role in the analysis of very large data sets, where it is useful for discovering correlations among attributes; this calls for methods that can find clusters of both spherical and non-spherical shape and that are robust to outliers. This survey focuses on clustering algorithms that are used on very large data sets and that help to find the characteristics of the data. We examine several leading clustering algorithms, such as BIRCH, BFR and CURE.
Keywords: Hierarchical, Centroid, CF Tree, Incremental Algorithm, Representation Points, Classes of Points.