Logic Journal of IGPL Advance Access published online on August 12, 2009

Logic Journal of IGPL, doi:10.1093/jigpal/jzp035

© The Author 2009. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Data fusion with probabilistic conditional logic

Jens Fisseler

Faculty of Mathematics and Computer Science, FernUniversität in Hagen, 58084 Hagen, Germany.
E-mail: jens.fisseler@fernuni-hagen.de

Imre Fehér

Magyar Telekom Plc., Hungary.
E-mail: feherimi@gmail.com

Data fusion is the process of combining data and information from two or more sources. One of its application areas is market research, where it is used to combine data sets from different surveys, yielding a joint data set. Most data fusion studies use statistical matching as their fusion algorithm, which has several drawbacks. Therefore, we propose a novel approach to data fusion, based on knowledge discovery and knowledge representation with probabilistic conditional logic. We evaluate our approach on synthetic and real-world data, demonstrating its feasibility.

Key Words: data fusion • graphical models • probabilistic conditionals • maximum entropy

Received for publication 2 January 2008.
