Togaware DATA MINING
Desktop Survival Guide
by Graham Williams


Evaluating Models

Evaluating the outcomes of data mining is important. We need to understand how well any model we build can be expected to perform, and how it compares with other models we might choose to build.

A common approach is to compute an error rate, which simply reports the proportion of cases that the model classifies incorrectly. Common methods for estimating the empirical error rate include, for example, cross-validation (CV), the Bayesian evidence framework, and the PAC framework.
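As a rough illustration of the error rate and of a cross-validated estimate, here is a minimal sketch in plain Python. The helper names `error_rate` and `k_fold_error`, the toy labels, and the majority-class "model" are all assumptions made for this example; they are not taken from the book or from any library.

```python
from collections import Counter


def error_rate(actual, predicted):
    """Proportion of cases whose predicted class differs from the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one case")
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


def k_fold_error(labels, k):
    """Average held-out error rate of a majority-class baseline over k folds.

    The 'model' here is only a stand-in (assumption): it predicts the most
    common class seen in the training folds for every held-out case.
    """
    n = len(labels)
    rates = []
    for fold in range(k):
        # Case i belongs to the held-out fold when i % k == fold.
        held_out = [labels[i] for i in range(n) if i % k == fold]
        train = [labels[i] for i in range(n) if i % k != fold]
        majority = Counter(train).most_common(1)[0][0]
        rates.append(error_rate(held_out, [majority] * len(held_out)))
    return sum(rates) / k


# Toy data (hypothetical): one of five cases is misclassified.
actual = ["yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "yes", "yes", "no"]
rate = error_rate(actual, predicted)
print(rate)  # 0.2

# Cross-validated error of the baseline on a small, imbalanced label set.
cv_rate = k_fold_error(["no"] * 8 + ["yes"] * 2, k=5)
print(cv_rate)
```

Each case is held out exactly once, so the averaged rate is measured only on cases the stand-in model did not see during training, which is the point of cross-validation.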

In this chapter we introduce several measures used to report on the performance of a model and review various approaches to evaluating the output of data mining. We cover printcp, table for producing confusion matrices, and ROCR for the graphical presentation of evaluations, as well as how to tune the presentations for your own needs.
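To make the idea of a confusion matrix concrete, the following is a minimal sketch in plain Python rather than R. It does not call printcp, table, or ROCR; the function name `confusion_matrix` and the toy data are assumptions for illustration only. It counts each (actual, predicted) pair, which is the kind of cross-tabulation a confusion matrix presents.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count how often each (actual class, predicted class) pair occurs."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return Counter(zip(actual, predicted))


# Toy data (hypothetical): five cases with two classes.
actual = ["yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "yes", "yes", "no"]

cm = confusion_matrix(actual, predicted)
classes = sorted(set(actual) | set(predicted))

# Rows are actual classes, columns are predicted classes.
for a in classes:
    print(a, [cm[(a, p)] for p in classes])
# no [2, 1]
# yes [0, 2]

# The off-diagonal cells are the misclassified cases.
errors = sum(count for (a, p), count in cm.items() if a != p)
print(errors / len(actual))  # 0.2
```

Reading along a row shows how the cases of one actual class were distributed across the predicted classes, so the diagonal holds the correct classifications.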




Copyright © Togaware Pty Ltd
Brought to you by Togaware. This page generated: Sunday, 22 August 2010