An Improved Approximation Algorithm for Co-location Mining in Uncertain Data Sets using Probabilistic Approach

M. Sheshikala, D. Rajeswara Rao, Md. Ali Kadampur

Abstract


In this paper we investigate the colocation mining problem in the context of uncertain data. Uncertain data is data that is only partially complete. Much real-world data is uncertain, for example demographic data, sensor network data, and GIS data. Handling such data is a challenge for knowledge discovery, particularly in colocation mining. One straightforward method is to find the probabilistic prevalent colocations (PPCs), that is, all colocations that are likely to be generated from a random world. To do this, we first allow an approximation error when finding the PPCs, which reduces the computation. Next, we find all the possible worlds, split them into two different sets of worlds, and compute the prevalence probability. This probability is compared with a minimum probability threshold to decide whether a colocation is a PPC or not. Experimental results on the selected data set show a significant improvement in computational time compared with some of the existing methods used in colocation mining.
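To make the possible-world idea concrete, the following is a minimal sketch of an exact, brute-force prevalence-probability computation for a size-2 colocation. It is not the approximation algorithm of the paper. It assumes that each instance exists independently with a given probability, that two instances are neighbours when their Euclidean distance is at most `radius`, and that a colocation is prevalent in a world when its participation index reaches `min_prev`. All function and parameter names are hypothetical.

```python
from itertools import product
from math import dist


def participation_index(world, colocation, radius):
    """Minimum participation ratio of the two features in one possible world."""
    ratios = []
    for f in colocation:
        own = [pos for g, pos, _ in world if g == f]
        others = [pos for g, pos, _ in world
                  if g in colocation and g != f]
        if not own:
            # Assumption: a feature with no instance in this world has ratio 0.
            return 0.0
        # Count instances of f that have at least one neighbour of the other feature.
        hits = sum(1 for a in own if any(dist(a, b) <= radius for b in others))
        ratios.append(hits / len(own))
    return min(ratios)


def prevalence_probability(instances, colocation, radius, min_prev):
    """Sum the probabilities of all possible worlds in which the colocation is prevalent.

    instances: list of (feature, (x, y), existence_probability).
    Enumerates 2**len(instances) worlds, so it is only usable on tiny inputs.
    """
    total = 0.0
    for mask in product([False, True], repeat=len(instances)):
        # Probability of this world under the independence assumption.
        p_world = 1.0
        for present, (_, _, p) in zip(mask, instances):
            p_world *= p if present else 1.0 - p
        if p_world == 0.0:
            continue
        world = [inst for present, inst in zip(mask, instances) if present]
        if participation_index(world, colocation, radius) >= min_prev:
            total += p_world
    return total


def is_ppc(instances, colocation, radius, min_prev, min_prob):
    """A colocation is a PPC when its prevalence probability reaches min_prob."""
    return prevalence_probability(instances, colocation, radius, min_prev) >= min_prob


# Illustrative data only: one certain A instance and two uncertain B instances.
example = [("A", (0.0, 0.0), 1.0), ("B", (0.0, 1.0), 0.8), ("B", (10.0, 10.0), 0.5)]
print(is_ppc(example, ("A", "B"), radius=1.5, min_prev=0.5, min_prob=0.6))
```

Because the number of possible worlds grows exponentially with the number of uncertain instances, exhaustive enumeration of this kind quickly becomes infeasible, which is the cost that an approximation with a bounded error is meant to reduce.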


DOI: https://doi.org/10.11591/APTIKOM.J.CSIT.91



Copyright (c) 2019 APTIKOM Journal on Computer Science and Information Technologies



ISSN: 2722-323X, e-ISSN: 2722-3221

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.