Togaware DATA MINING
Desktop Survival Guide
by Graham Williams


Cluster Analysis

Clustering is one of the core tools used by the data miner. Clustering allows us to group entities in a generally unguided fashion, according to how similar they are. This is done on the basis of a measure of the distance between entities. The aim of clustering is to identify groups of entities that are close together but as a group are quite separate from other groups.
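To make the idea of grouping by distance concrete, here is a minimal sketch of k-means in plain Python. It is an illustration only, not the implementation of any particular package: the function names (euclidean, kmeans), the initial_centers parameter, and the toy data are all assumptions made for this example.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, initial_centers, iterations=10):
    """A basic k-means loop (hypothetical helper for illustration).

    points          -- list of numeric tuples (the entities)
    initial_centers -- list of k starting centers, supplied by the caller
    Returns (labels, centers), where labels[i] is the cluster of points[i].
    """
    centers = list(initial_centers)
    k = len(centers)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each entity joins its nearest center.
        labels = [min(range(k), key=lambda j: euclidean(p, centers[j]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave the center unchanged if it has no members
                centers[j] = tuple(sum(c) / len(members)
                                   for c in zip(*members))
    return labels, centers

# Toy data: two visibly separate groups of entities.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centers = kmeans(points, initial_centers=[points[0], points[3]])
print(labels)
print(centers)
```

With these starting centers the first three entities end up in one cluster and the last three in the other, which is the "close together but separate from other groups" structure described above. In practice the starting centers are usually chosen at random, and different starts can lead to different clusterings.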

The amap package includes k-means with a choice of distances such as Euclidean and Spearman (a rank-based metric). Its implementation is optimized, with a parallelized hierarchical clustering.
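The choice of distance matters because different measures can disagree about which entities are similar. The sketch below contrasts the Euclidean distance with one common rank-based formulation, the sum of squared rank differences. This is plain Python rather than the amap interface, the function names are hypothetical, and we are assuming this formulation for illustration; the exact definition used by a given package may differ.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ranks(values):
    """1-based ranks of the values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for index in order[i:j + 1]:
            result[index] = average_rank
        i = j + 1
    return result

def rank_distance(a, b):
    """Sum of squared differences between the ranks of a and b (assumed form)."""
    return sum((x - y) ** 2 for x, y in zip(ranks(a), ranks(b)))

a = (1, 2, 3, 4)
b = (10, 20, 30, 400)   # same ordering as a, very different magnitudes
c = (4, 3, 2, 1)        # similar magnitudes to a, reversed ordering

print(euclidean(a, b), euclidean(a, c))
print(rank_distance(a, b), rank_distance(a, c))
```

Under the Euclidean distance a is much closer to c than to b, while under the rank-based distance a and b are identical (distance 0.0) and a and c are far apart (distance 20.0). A rank-based measure is therefore a reasonable choice when the ordering of values matters more than their magnitudes.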




Copyright © 2004-2010 Togaware Pty Ltd
Support further development through the purchase of the PDF version of the book.
The PDF version is a formatted comprehensive draft book (with over 800 pages).
Brought to you by Togaware. This page generated: Sunday, 22 August 2010