DATA MINING
Desktop Survival Guide
by Graham Williams

What is Rattle?

Rattle is a graphical data mining application built upon the statistical language R. An understanding of R is not required in order to use Rattle. However, a basic introduction to R is provided throughout this book, acting as a springboard into more sophisticated data mining directly in R itself. Rattle is simple to use, quick to deploy, and allows us to work rapidly through the data processing, modelling, and evaluation phases of a data mining project. R, on the other hand, provides a very powerful language for performing data mining well beyond the limitations that must be embodied in any graphical user interface, and beyond the canned approaches to data mining that such an interface consequently offers. When we need to fine tune and further develop our data mining projects we can migrate from Rattle to R.

Rattle uses the Gnome graphical user interface and runs under various operating systems, including GNU/Linux, Mac OS X, and MS Windows. Its intuitive user interface takes us through the basic steps of data mining, and also shows the actual R code that is used to carry out each step. Rattle exposes all of the underlying R code so that it can be run directly within R as well as saved in R scripts for future reference. The R code can be loaded into R (outside of Rattle) to repeat any data mining exercise. Being able to repeat our "experiments" is an important aspect of any scientific and deployed endeavour.
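As a minimal sketch of repeating an exercise outside of Rattle, the following assumes that the R code from a session has already been saved to a script. The file name rattle_session.R is hypothetical and is used only for illustration; the script's contents depend entirely on what was done in Rattle.

```r
# Hypothetical file name: an R script previously saved from a Rattle session.
script <- "rattle_session.R"

# Re-run the saved commands in a plain R session, echoing each
# command as it is evaluated so that the steps can be followed.
if (file.exists(script)) {
  source(script, echo = TRUE)
} else {
  message("No saved script found at: ", script)
}
```

Running the same script against the same data should reproduce the original steps, which is what makes the exercise repeatable.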

While Rattle by itself may be sufficient for all of a user's needs, particularly in the context of our introduction to data mining, it also provides a stepping stone to more sophisticated processing and modelling in R itself. It is worth emphasising that the user is not limited to how Rattle does things. For sophisticated and unconstrained data mining, the experienced user will progress to interacting directly with a powerful statistical software environment.

The purpose of this Chapter is to place us in a position to interact effectively with Rattle so that we can illustrate the data mining process. Of course, we first need to have Rattle installed on our computer. Since it is free and open source software, anyone can obtain it.

The Appendix takes us through the installation of Rattle. It is recommended that you install Rattle so that you can work along with the examples presented in this book.
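For orientation only, a typical way to install and start Rattle from within R is sketched below. It assumes that R is already installed, that a network connection to CRAN is available, and that the graphical toolkit libraries Rattle depends on are present on the system. The installation instructions in the book remain the reference for platform-specific details.

```r
# Install the rattle package from CRAN (only needs to be done once).
install.packages("rattle")

# Load the package into the current R session.
library(rattle)

# Start the Rattle graphical user interface.
rattle()
```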

In this Chapter we look at the initial interface presented by Rattle (Section 3.2) and the basic process for interacting with Rattle (Section 3.3), which implements a common data mining process. The Menus and Buttons of the interface are covered in Section , presenting the Rattle interface and its basic environment. Graphical presentations of data and models are an important component of data mining, and Rattle's mechanism for displaying and interacting with graphical plots is introduced in Section 3.5.

Brought to you by Togaware. This page generated: Sunday, 13 September 2009