Improved Algorithm for Mining of High Utility patterns in one phase Based on Map Reduce Framework on Hadoop

Ms. Geeta Raju Popalghat; Prof. S. B. Kothari

Abstract

Mining high utility itemsets from a transactional database
refers to the discovery of itemsets with high utility, such as
profit. Although a number of relevant algorithms have been
proposed in recent years, they generate a very large number of
candidate itemsets for high utility itemsets. Such a large
number of candidate itemsets degrades mining performance in
terms of execution time and space requirements. Earlier work
relies on two-phase candidate generation, an approach that
suffers from scalability issues due to the huge number of
candidates. This paper presents an efficient approach that
generates high utility patterns in one phase without generating
candidates. Our experiments use a linear data structure, and our
pattern-growth approach searches a reverse set enumeration tree
and prunes the search space by utility upper bounding. High
utility patterns are also identified by a closure property and a
singleton property. In this work we present a new approach that
extends these algorithms to overcome their limitations using the
MapReduce framework on Hadoop. Experimental results show that
the proposed algorithms not only reduce the number of candidates
effectively but also outperform other algorithms substantially
in terms of runtime, especially when databases contain many long
transactions.
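
To make the idea of one-phase mining with utility upper bounding concrete, here is a minimal single-machine sketch in Python. It is not the authors' algorithm: the function name mine_high_utility, the dictionary-based transaction layout, the alphabetical item order, and the sample database are all assumptions chosen for illustration, utilities are assumed to be non-negative, and the MapReduce distribution on Hadoop is not shown. The sketch grows itemsets depth-first, reports an itemset as soon as its exact utility reaches the threshold (so no separate candidate-verification phase is needed), and skips a branch when the utility so far plus the utility of all items that could still be appended falls below the threshold.

```python
from typing import Dict, List, Tuple

# Assumed layout: each transaction maps an item to its utility in that
# transaction (for example quantity * unit profit), with utilities >= 0.
Transaction = Dict[str, int]


def mine_high_utility(db: List[Transaction], min_util: int) -> Dict[Tuple[str, ...], int]:
    """Return every itemset whose total utility is at least min_util."""
    order = sorted({item for t in db for item in t})  # fixed item order
    result: Dict[Tuple[str, ...], int] = {}

    def grow(prefix: Tuple[str, ...], rows, start: int) -> None:
        # rows holds (transaction, utility of prefix in that transaction)
        for i in range(start, len(order)):
            item = order[i]
            new_rows = [(t, u + t[item]) for t, u in rows if item in t]
            if not new_rows:
                continue
            itemset = prefix + (item,)
            utility = sum(u for _, u in new_rows)
            # Upper bound for any extension: utility so far plus the utility
            # of every later item in each supporting transaction.
            bound = sum(
                u + sum(t[j] for j in order[i + 1:] if j in t)
                for t, u in new_rows
            )
            if utility >= min_util:
                result[itemset] = utility  # exact utility, found in one pass
            if bound >= min_util:
                grow(itemset, new_rows, i + 1)  # otherwise prune the branch

    grow((), [(t, 0) for t in db], 0)
    return result


# Hypothetical sample database with three transactions.
sample_db = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
]
high_utility_itemsets = mine_high_utility(sample_db, min_util=15)
```

With the sample database and a threshold of 15, the branch starting at itemset (a, b) is pruned because its upper bound is 8, while (b, c) is explored even though its own utility is only 10, since its upper bound of 18 still admits the extension (b, c, d).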
