DATA MINING
Desktop Survival Guide by Graham Williams
Algorithm
The Apriori algorithm is a breadth-first, or generate-and-test, type of search algorithm. Only after exploring all possible associations containing k items does it consider those containing k+1 items. For each k, all candidates are tested to determine whether they have enough support.
The algorithm uses a simple two-step generate-and-merge process: generate the frequent itemsets of size k, then combine them to generate candidate frequent itemsets of size k+1.
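The level-wise search can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name frequent_itemsets, the representation of transactions as a list of sets, and the use of a fraction (0.001 for 0.1%) rather than a percentage for minsup are all assumptions made for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, minsup):
    """Return {itemset: support} for every itemset with support >= minsup.

    transactions: list of sets of items (an assumed representation).
    minsup: minimum support as a fraction of all transactions.
    """
    n = len(transactions)
    if n == 0:
        return {}

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: test every single item.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= minsup:
            current[frozenset([item])] = s

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        # Generate: merge pairs of frequent k-itemsets into (k+1)-candidates.
        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) != k + 1:
                continue
            # Prune: every k-subset of a candidate must itself be frequent.
            if all(frozenset(sub) in current for sub in combinations(union, k)):
                candidates.add(union)
        # Test: keep only the candidates with enough support.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= minsup:
                current[c] = s
        k += 1
    return frequent
```

Each pass over the loop handles one value of k, so no candidate of size k+1 is counted until all itemsets of size k have been tested.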
The algorithm is simple to implement and reasonably efficient in practice: although the number of possible items is generally large, the individual baskets are generally small.
The input data to the algorithm consists of entities or transactions, each transaction representing a basket of items.
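As an illustration of this input, the sketch below groups flat (transaction id, item) records into baskets. The record layout and the helper name to_baskets are assumptions for the example; real data may arrive in a different shape.

```python
from collections import defaultdict


def to_baskets(records):
    """Group (transaction_id, item) pairs into one set of items per transaction.

    The (transaction_id, item) layout is an assumed input format.
    """
    baskets = defaultdict(set)
    for transaction_id, item in records:
        baskets[transaction_id].add(item)
    return dict(baskets)


# Hypothetical records: two transactions, two items each.
records = [(1, "bread"), (1, "milk"), (2, "bread"), (2, "butter")]
baskets = to_baskets(records)
```

Using a set for each basket records only whether an item is present, which is all that support counting needs.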
The two primary tuning parameters are minsup (minimum support, expressed as a percentage of the total number of transactions in the data) and mincon (minimum confidence, expressed as a percentage of the transactions containing the rule's left hand side that also contain its right hand side). Minimum support in particular is typically quite small because of the size of the databases we are dealing with: a support of 0.1% or smaller is not unusual.
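The two measures can be computed directly from the baskets. The sketch below uses the standard definitions: support is the fraction of all transactions containing an itemset, and confidence is the support of the whole rule divided by the support of its left hand side. The function names and the list-of-sets representation are assumptions for the example, and both values are returned as fractions rather than percentages.

```python
def support(transactions, itemset):
    """Fraction of all transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(transactions, lhs, rhs):
    """Fraction of transactions containing lhs that also contain rhs."""
    lhs_support = support(transactions, lhs)
    if lhs_support == 0:
        return 0.0  # The left hand side never occurs, so the rule has no evidence.
    return support(transactions, set(lhs) | set(rhs)) / lhs_support
```

A rule is kept only if its support is at least minsup and its confidence is at least mincon.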
The procedure returns a set of association rules, each consisting of a left hand side, a right hand side, and a (support, confidence) tuple.
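One way to produce rules in this shape is sketched below. It assumes the frequent itemsets have already been found and are supplied as a dictionary mapping each itemset (a frozenset) to its support, with every subset of a frequent itemset also present. The function name generate_rules and this input format are assumptions for the example, and mincon is given as a fraction.

```python
from itertools import combinations


def generate_rules(frequent, mincon):
    """Return a list of (lhs, rhs, (support, confidence)) tuples.

    frequent: {frozenset_of_items: support}, assumed to contain every
    subset of each itemset it lists.
    """
    rules = []
    for itemset, sup in frequent.items():
        if len(itemset) < 2:
            continue  # A rule needs at least one item on each side.
        # Try every non-empty proper subset as the left hand side.
        for size in range(1, len(itemset)):
            for lhs_items in combinations(itemset, size):
                lhs = frozenset(lhs_items)
                rhs = itemset - lhs
                conf = sup / frequent[lhs]
                if conf >= mincon:
                    rules.append((lhs, rhs, (sup, conf)))
    return rules
```

The support reported for a rule is the support of the full itemset, that is, of the left and right hand sides taken together.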
Copyright © Togaware Pty Ltd