DATA MINING
Desktop Survival Guide
by
Graham Williams
Rattle: Basic Data Mining
Subsections
Introduction
Data Mining
Types of Analysis
Data Mining Applications
A Framework for Modelling
Agile Data Mining
R
Why R and Rattle?
Data Preparation
Number of Algorithms
Repeatability
Performance
Business Case
Sample Business Case
Pros and Cons
Books on R
Getting Started
First Contact
Initial Interaction with R
Loading Our First Data Set
Our First Model--Some Details
Understanding Our Data
Evaluating the Model
Where to Now?
Chapter Exercises
Interacting with Rattle
What is Rattle?
The Initial Interface
Interacting with Rattle
Menus and Buttons
Project Menu and Buttons
Edit Menu
Tools Menu and Toolbar
Execute
Export
Settings
Help
Interacting with Plots
Summary
Data
Nomenclature
Loading Data into Rattle
CSV Data
Datasets
Reading Direct from URL
Play Golf
Weather Data
Other Data Sources
ARFF Data
ODBC Sourced Data
Setting Up a Data Source Name
Netezza Setup
Teradata Setup
R Data
R Dataset
Data Entry
Data Tab Options in Rattle
Sampling Data
Variable Roles
Automatic Role Identification
Weights Calculator
Manipulating Data
Loading Data
Data Formats
CSV Data
Locating a File to Load
Loading the File
CSV Options
Basic Data Summary
ARFF Data
ODBC Sourced Data
R Dataset
R Data
Library
Data Options
Sampling Data
Variable Roles
Automatic Role Identification
Weights Calculator
Chapter Exercises
Command Summary
Exploring Data
Summarising Data
Summary
Describe
Basics
Kurtosis
Skewness
Missing
Exploring Distributions
Box Plot
Histogram
Cumulative Distribution Plot
Benford's Law
Other Digits
Stratified Benford Plots
Bar Plot
Dot Plot
Mosaic Plot
GGobi: Interactive Data Exploration
Scatterplot
Data Viewer: Identifying Entities in Plots
Brushing
Identify Multivariate Outliers
Other Options
Quality Plots Using R
Further GGobi Documentation
Correlation Analysis
Hierarchical Correlation
Principal Components
Single Variable Overviews
Test
Transforming Data
Rescale Data
Recenter
Scale [0,1]
Rank
Median/MAD
Nolan Groups
Impute
Zero/Missing
Mean/Median/Mode
Constant
Remap
Binning
Indicator Variables
Join Categorics
Math Transforms
Outliers
Cleanup
Delete Ignored
Delete Selected
Delete Missing
Delete Obs with Missing
Other Transformations
Removing Duplicates
Chapter Exercises
Cluster Analysis
KMeans
Export KMeans Clusters
Discriminant Coordinates Plot
Number of Clusters
Hierarchical Clusters
Evaluation
The Evaluate Tab
Confusion Matrix
Measures
Graphical Measures
Cost Curves
Lift
ROC Curves
Area Under Curve
Precision versus Recall
Sensitivity versus Specificity
Predicted versus Observed
Scoring
Issues
Model Selection
Overfitting
Imbalanced Classification
Sampling
Cost Based Learning
Model Deployment and Interoperability
SQL
PMML
XML for Data
Bibliographic Notes
Moving into R
Interacting with R
Basic Command Line
Windows, Icons, Mouse, Pointer--WIMP
The Current Rattle State
Samples
Projects
The Rattle Log
Further Tuning Models
Emacs and ESS
Troubleshooting
Cairo: cairo_pdf_surface_create could not be located
A factor has new levels
Copyright © Togaware Pty Ltd
Support further development through the purchase of the PDF version of the book.
The PDF version is properly formatted and forms a comprehensive book (draft with over 700 pages).
Brought to you by Togaware. This page generated: Sunday, 13 September 2009