Togaware DATA MINING
Desktop Survival Guide
by Graham Williams


Binning

Rattle provides a binning function, coded by Daniele Medri. The Rattle interface offers a choice of Quantile binning, KMeans binning, and Equal Width binning. Each option defaults to 4 bins, and we can change this to suit our needs. The generated variables are prefixed with BIN_QUn_, BIN_KMn_, or BIN_EWn_ respectively, with n replaced by the number of bins. Because the method and the number of bins both appear in the prefix, we can create multiple binnings for any variable.
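To make the difference between the options concrete, here is a minimal Python sketch of equal width and k-means binning, together with the naming scheme described above. It is an illustration only, not Rattle's own code (Rattle is written in R): the function names, the sample ages, and the choice of starting centres for k-means are all assumptions made for the example.

```python
# Illustrative sketch only. This is not Rattle's implementation;
# all function names here are hypothetical.

def equal_width_bins(values, bins=4):
    """Return a bin index (0..bins-1) per value, using intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    if width == 0:
        return [0] * len(values)
    # The maximum value falls exactly on the upper edge, so clamp it
    # into the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


def nearest_centre(value, centres):
    """Index of the centre closest to value (first one wins on ties)."""
    return min(range(len(centres)), key=lambda k: abs(value - centres[k]))


def kmeans_bins(values, bins=4, iterations=100):
    """Return a bin index per value from a simple one-dimensional k-means."""
    ordered = sorted(values)
    # Assumed starting point: centres spread evenly through the sorted data.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * bins)]
               for i in range(bins)]
    for _ in range(iterations):
        labels = [nearest_centre(v, centres) for v in values]
        updated = []
        for k in range(bins):
            members = [v for v, lab in zip(values, labels) if lab == k]
            # Keep the old centre if no value was assigned to it.
            updated.append(sum(members) / len(members) if members else centres[k])
        if updated == centres:
            break
        centres = updated
    return [nearest_centre(v, centres) for v in values]


def binned_name(variable, method, bins=4):
    """Build a name such as BIN_EW4_Age from the prefixes given in the text."""
    codes = {"quantile": "QU", "kmeans": "KM", "equalwidth": "EW"}
    return f"BIN_{codes[method]}{bins}_{variable}"


ages = [18, 22, 25, 31, 38, 40, 47, 52, 60, 90]
print(binned_name("Age", "equalwidth"), equal_width_bins(ages))
# BIN_EW4_Age [0, 0, 0, 0, 1, 1, 1, 1, 2, 3]
```

With equal width bins the single large value of 90 occupies the last bin on its own, which is the kind of imbalance that the other two methods are intended to avoid.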

Note that quantile binning is the same as equal count binning: each bin receives approximately the same number of observations.
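The equal count property can be checked with a small Python sketch. This is an illustration under stated assumptions rather than Rattle's implementation: the function name and the sample ages are hypothetical, and bins are assigned by rank, so tied values may be split across neighbouring bins.

```python
from collections import Counter

# Illustrative sketch only. This is not Rattle's implementation;
# quantile_bins is a hypothetical helper.

def quantile_bins(values, bins=4):
    """Return a bin index per value so that bins hold (nearly) equal counts."""
    # Positions of the values, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        # Map ranks 0..n-1 evenly onto bin indices 0..bins-1.
        labels[i] = rank * bins // len(values)
    return labels


ages = [45, 23, 88, 31, 26, 67, 29, 52, 34, 41, 25, 38]
labels = quantile_bins(ages)
print(sorted(Counter(labels).items()))
# [(0, 3), (1, 3), (2, 3), (3, 3)]
```

Each of the four bins holds three of the twelve ages, even though the ages themselves are unevenly spread.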

Figure 5.7: Binning Age.
Image rattle-audit-transform-binning-age

Figure 5.8: Distributions of binned Age.
Image rattle-audit-binning-age-plot



Copyright © 2004-2008 Togaware Pty Ltd