Togaware DATA MINING
Desktop Survival Guide
by Graham Williams


Rattle Data Miner

In learning about data mining it is important to learn by example and by doing. Data mining is a very practical activity, often a matter of following one's nose as we weave our way through our data. Our aim through this book is to provide hands-on practice in data mining. For this we need a tool, ideally one that is neither expensive nor designed to hide how things are done. We therefore use Rattle, an open source and freely available data mining tool. It is available for anyone to download from rattle.togaware.com.

Rattle (the R Analytical Tool To Learn Easily) is a graphical data mining application built upon the statistical language R. An understanding of R is not required in order to use Rattle. However, a basic introduction is provided in Chapter 13, intended as a springboard into more sophisticated data mining in R itself. Rattle is simple to use, quick to deploy, and allows us to rapidly work through the modelling phase of a data mining project. R, on the other hand, provides a very powerful language for performing data mining, and when we need to fine-tune our data mining projects we can migrate from Rattle to R simply by taking Rattle's underlying commands and running them within the R console.

Rattle uses the GNOME graphical user interface and runs under various operating systems, including GNU/Linux, Macintosh OS X, and MS Windows. Its intuitive user interface takes us through the basic steps of data mining, and shows (through a Log tab) the actual R code that is used to perform each step. The corresponding R code can be saved to file and used as a script which can be loaded into R (outside of Rattle) to repeat any data mining exercise.
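As a minimal sketch of this workflow, assuming the rattle package is already installed in R, we might start Rattle from the R console and later replay a script saved from the Log tab. The file name rattle_log.R and the helper replay_log are hypothetical, introduced here only for illustration; they are not part of Rattle.

```r
# Start the Rattle GUI, but only in an interactive session where the
# rattle package is available (assumption: it has been installed).
if (interactive() && requireNamespace("rattle", quietly = TRUE)) {
  library(rattle)
  rattle()
}

# Hypothetical helper: replay a script saved from Rattle's Log tab in a
# plain R session. Returns FALSE if the file is missing, TRUE otherwise.
replay_log <- function(path) {
  if (!file.exists(path)) {
    return(FALSE)
  }
  # echo = TRUE prints each command as it runs, much as the Log tab shows it.
  source(path, echo = TRUE)
  TRUE
}

# Hypothetical file name for the saved log.
replay_log("rattle_log.R")
```

Because the saved script is ordinary R code, it can be edited before it is replayed, which is one way to begin fine-tuning a project outside of Rattle.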

While Rattle by itself may be sufficient for all of a user's needs, it also provides a stepping stone to more sophisticated processing and modelling in R itself. The user is not limited to how Rattle does things. For sophisticated and unconstrained data mining, the experienced user can progress to interacting directly with a powerful language.

In this chapter we present the Rattle interface and its basic environment for interaction, including menus and toolbars, and the saving and loading of projects. Chapter 3 works through the process of loading data into Rattle and Chapter 5 presents the various options within Rattle for exploring our data. Chapter 6 considers the options for transforming our data in various ways. We then go through the process of building and evaluating models in Chapters 7.1 to 9. Chapter 11 provides an insight into how Rattle works under the bonnet, and starts the user on the way to using R itself.

We begin this chapter with the instructions for installing Rattle.



Copyright © Graham.Williams@togaware.com