Apriori algorithm

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Exa~enwiki (talk | contribs) at 15:41, 20 April 2004 (the first efficient association rule mining algorithm). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Description

Apriori is an efficient association rule mining algorithm, devised by Agrawal et al.

Apriori (Agrawal, 94) uses breadth-first search and a hash tree structure to count candidate item sets efficiently. The algorithm generates candidate item sets (patterns) of length $k$ from the frequent item sets of length $k-1$. It then prunes every candidate that has an infrequent subpattern. By the downward closure lemma, the resulting candidate set contains all frequent item sets of length $k$. Next, the whole transaction database is scanned to determine which of the candidates are frequent (Zaki, 99). To make this counting fast, the algorithm stores the candidate item sets in a hash tree. Note: a hash tree has item sets at the leaves and hash tables at the internal nodes (Zaki, 99).
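The candidate generation and pruning step can be illustrated with a small Python sketch. It is a simplified variant, not the procedure from the original paper: instead of a prefix-based join over sorted item sets, it unions every pair of frequent item sets of length k − 1 and keeps the unions of length k. The function name `generate_candidates` and the sample data are assumptions made for illustration.

```python
from itertools import combinations

def generate_candidates(frequent_prev, k):
    """Build length-k candidates from frequent (k-1)-itemsets, then prune."""
    prev = set(frequent_prev)
    # Join step: union pairs of (k-1)-itemsets that share k-2 items,
    # so that the union has exactly k items.
    joined = {a | b for a in prev for b in prev if len(a | b) == k}
    # Prune step (downward closure): a candidate survives only if every
    # one of its (k-1)-subsets is itself frequent.
    return {
        c for c in joined
        if all(frozenset(s) in prev for s in combinations(c, k - 1))
    }

# Hypothetical frequent 2-itemsets.
frequent_2 = {frozenset(p) for p in [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d")]}
candidates_3 = generate_candidates(frequent_2, 3)
print(candidates_3)
```

Here the join produces {a, b, c}, {a, b, d} and {b, c, d}, but only {a, b, c} survives pruning, because {a, d} and {c, d} are not frequent.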

Algorithm

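Apriori($T$, $\varepsilon$)

    $L_1 \leftarrow \{\text{large 1-itemsets}\}$
    $k \leftarrow 2$
    while $L_{k-1} \neq \emptyset$
        $C_k \leftarrow \mathrm{Generate}(L_{k-1})$
        for transactions $t \in T$
            $C_t \leftarrow \mathrm{Subset}(C_k, t)$
            for candidates $c \in C_t$
                $\mathrm{count}[c] \leftarrow \mathrm{count}[c] + 1$
        $L_k \leftarrow \{c \in C_k \mid \mathrm{count}[c] \geq \varepsilon\}$
        $k \leftarrow k + 1$
    return $\bigcup_k L_k$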
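The level-wise loop can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the names `apriori`, `transactions` and `min_support` are hypothetical, the support threshold is taken to be an absolute count, and a plain scan over all candidates stands in for the hash tree lookup, so it does not reproduce the counting efficiency described above.

```python
from collections import Counter
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    # Frequent 1-itemsets.
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    current = {s: n for s, n in counts.items() if n >= min_support}
    frequent = dict(current)
    k = 2
    while current:
        # Generate length-k candidates: join, then prune by downward closure.
        joined = {a | b for a in current for b in current if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # One pass over the database: count candidates contained in each
        # transaction (assumption: linear scan instead of a hash tree).
        counts = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = apriori(baskets, min_support=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With a minimum support of 2, the three single items and the pairs {bread, milk} and {bread, butter} are frequent. The only length-3 candidate is pruned before counting, because {milk, butter} occurs in just one transaction.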

References

Rakesh Agrawal, Tomasz Imielinski, and Arun N. Swami, Mining Association Rules between Sets of Items in Large Databases, Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, 1993.

Rakesh Agrawal and Ramakrishnan Srikant, Fast Algorithms for Mining Association Rules, Proceedings of the 20th International Conference on Very Large Data Bases (VLDB), 1994.

Heikki Mannila, Hannu Toivonen, and A. Inkeri Verkamo, Efficient Algorithms for Discovering Association Rules, AAAI Workshop on Knowledge Discovery in Databases (KDD-94), 1994.

Mohammed Javeed Zaki, Srinivasan Parthasarathy, Mitsunori Ogihara, and Wei Li, Parallel Algorithms for Discovery of Association Rules, Data Mining and Knowledge Discovery, 1997.