DATA MINING
Desktop Survival Guide
by
Graham Williams
List of Figures
The R command line.
Initial Rattle window.
Loading the weather.csv file.
Decision tree model of the weather dataset.
Decision tree plot.
Explore tab's distribution plots.
Explore tab's distribution plots: Categorics.
Initial steps of the data mining process (Tony Nolan)
The data mining process
A sample of plots
Loading the weather.csv dataset.
The CSV options of the Data tab.
The CSV file chooser
Data tab ARFF option
Loading data through an ODBC database connection
Teradata ODBC connection
Netezza ODBC connection
Netezza configuration
Loading an R binary data file.
Loading an already defined R data frame
Selected region of a spreadsheet copied to the clipboard
Loading an R data frame originally from the clipboard
Data entry spreadsheet
Select tab choosing Adjusted as a Risk variable.
Loading the weather.csv dataset.
The CSV option of the Data tab.
The CSV file chooser
Data tab ARFF option
Loading data through an ODBC database connection.
Netezza ODBC connection
Loading an already defined R data frame XXXX update XXXX.
Selected region of a spreadsheet copied to the clipboard.
Loading an R data frame originally from the clipboard
Loading an R binary data file.
Select tab choosing Adjusted as a Risk variable.
Missing value summary for a modified version of the audit dataset.
Benford stratified by Marital and Gender.
Mosaic plot of Age by Adjusted.
Correlations between keywords in documents.
Transform options.
Selection of normalisations performed on Income.
Original distribution of Age.
Normalisations of Age.
Selection of imputations.
Imputation using the mode for missing values of Age.
Binning Age.
Distributions of binned Age.
Turning Gender into an Indicator Variable.
Selection of cleanup operations.
External data change.
KMeans Iteration Interface
KMeans Iteration Plot
Informational dialog.
A sample Cost Curve for a random forest.
Evaluate tab with Score option and a CSV file.
Scores have been saved.
Load and analyse score data using the Gnumeric spreadsheet.
Distribution of scores displayed using Rattle.
R command line under MS/Windows
R Commander GUI
R GUI using ESS for Emacs
A simple time series plot of dates using the traditional graphics package.
A simple time series plot of dates using the ggplot2 package.
An ordered monthly box plot.
An approximate model of random data.
Reduced example of an alternating decision tree.
Audit risk chart from an alternating decision tree.
Togaware's Rattle Gnome Data Mining interface.
The Weka GUI chooser.
Weka explorer viewing data.
Import CSV data into Weka.
Output from running J48 (C4.5).
Fujitsu GhostMiner interface.
Sample ODMiner interface to ODM.
SAS Enterprise Miner interface (Version 4).
Statistica Data Miner graphical interface.
Copyright © Togaware Pty Ltd
Support further development through the purchase of the PDF version of the book.
The PDF version is properly formatted and forms a comprehensive book (draft with over 700 pages).
Brought to you by Togaware. This page generated: Sunday, 13 September 2009