Probabilistic Models for Local Patterns Analysis
Date
2015-03-15
Publisher
University of sciences and technology in Oran
Abstract
Many large organizations now maintain multiple data sources (MDS)
distributed over the different branches of an interstate company. Local pattern
analysis has become an effective strategy for MDS mining in national and
international organizations. It consists of mining each dataset separately to obtain
frequent patterns, which are then forwarded to a central site for global pattern
analysis. Various synthesizing models [2,3,4,5,6,7,8,26] have been proposed to build
global patterns from the forwarded patterns. Ideally, the rules synthesized from the
forwarded patterns should closely match the mono-mining results (i.e., the results
that would be obtained if all of the databases were put together and mined as a
single database). When a pattern is present at a site but fails to satisfy the minimum
support threshold, it is not allowed to take part in the pattern synthesizing process.
This process can therefore lose some interesting patterns that could help the
decision maker make the right decision. For such situations we propose applying a
probabilistic model in the synthesizing process. An adequate choice of probabilistic
model can improve the quality of the discovered patterns. In this paper, we
perform a comprehensive study of the probabilistic models that can be applied in
the synthesizing process, then select one of them and improve it in order to
obtain better synthesizing results. Finally, we present experiments on a public
database to demonstrate the efficiency of the proposed synthesizing method.
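
The loss described in the abstract can be illustrated with a small sketch. It is
not the method proposed in the paper: it assumes a plain weighted-average
synthesizing rule, and the site data, the threshold of 0.4, and the function names
(local_supports, synthesize, mono_mining) are hypothetical and chosen only for
illustration. It compares the synthesized support of an itemset with the support
that mono-mining would report.

```python
from collections import Counter
from itertools import combinations


def local_supports(transactions, max_size=2):
    """Return {itemset: relative support} for all itemsets up to max_size."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    n = len(transactions)
    return {itemset: c / n for itemset, c in counts.items()}


def synthesize(sites, min_support, max_size=2):
    """Weighted-average synthesis (an assumed rule, not the paper's model).

    Only patterns that are frequent at a site are forwarded; a site where the
    pattern is infrequent contributes nothing to the synthesized support.
    """
    total = sum(len(transactions) for transactions in sites)
    synthesized = Counter()
    for transactions in sites:
        weight = len(transactions) / total
        for itemset, sup in local_supports(transactions, max_size).items():
            if sup >= min_support:  # infrequent patterns are not forwarded
                synthesized[itemset] += sup * weight
    return dict(synthesized)


def mono_mining(sites, max_size=2):
    """Supports obtained by merging all sites into one database."""
    merged = [t for transactions in sites for t in transactions]
    return local_supports(merged, max_size)


# Hypothetical data: {a, b} is frequent at site A (2/4) but not at site B (1/5).
site_a = [{"a", "b"}, {"a", "b"}, {"a"}, {"b"}]
site_b = [{"a", "b"}, {"c"}, {"c"}, {"a"}, {"c"}]
sites = [site_a, site_b]
pattern = frozenset({"a", "b"})

synth = synthesize(sites, min_support=0.4)
mono = mono_mining(sites)
print(f"synthesized: {synth[pattern]:.4f}, mono-mining: {mono[pattern]:.4f}")
```

In this toy setting the synthesized support of {a, b} falls below the mono-mining
support because site B's occurrence is discarded. A probabilistic model of the
kind studied in the paper would aim to estimate that missing local support
rather than treat it as zero.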
Keywords
Global Pattern, Maximum Entropy Method, Non-derivable Itemset, Itemset Inclusion-exclusion Model
