검색 상세

A study on comparison of Bayesian network structure learning algorithms for selecting an appropriate models

초록/요약

In this paper, we compare the performance between the Bayesian network structure learning algorithm provided by bnlearn package in R. The performance of the study results is evaluated by using a score or comparing between the target network and the learning network. In this paper, it was confirmed that algorithm specific performance test results using fore-mentioned methods are different. Unlike most previous studies which generally used real data, synthetic data generated based on topology was used to compare performance of contrast-specific algorithm. The aim of this paper is to provide objective guidance of selecting suitable algorithm in accordance to target network. Previous tools suffer from serious trade-off between cost and complexity, restricting most studies relevant to Bayesian network to using only real data. To address such problem, a data generator based on Bayesian network model using R is built and introduced.

more

목차

1 Introduction 3
1.1 Bayesian Network 3
1.2 Bayesian Network Structure Learning 4
2 Bayesian Network Structure Learning Algorithms in bnlearn Package 7
2.1 Constraint-based Algorithms 7
2.1.1 Grow-Shrink (GS) Markov Blanket Algorithm 7
2.1.2 Incremental Association (IAMB) Algorithm 9
2.2 Score-Based Algorithms 11
2.2.1 Hill-Climbing (HC) Algorithm 11
2.2.2 TABU Search Algorithm 12
2.3 Hybrid Algorithms 13
2.3.1 Max-Min Hill-Climbing (MMHC) Algorithm 13
2.3.2 More general 2-phase Restricted Maximization (RSMAX2) 14
3 The Comparison Methodology 15
3.1 The Number of Graphical Errors in the Learnt Structure 15
3.2 Network Scores 16
3.2.1 Bayesian Scoring Functions 16
3.2.2 Information-theoretic Scoring Functions 17
4 Data Generation with BN Data Generator in R 19
4.1 BN Data Generator Function in R 20
4.2 A Simple Example 20
5 Simulation 24
5.1 Real Data 25
5.1.1 Asia Data Set by Lauritzen and Spiegelhalter 25
5.1.2 Insurance Evaluation Network Data Set 27
5.1.3 ALARM Monitoring System Data Set 29
5.1.4 The HaliFinder Weather Forecast System Data Set 31
5.1.5 Summary 33
5.2 Synthetic Data According to Topologies 35
5.2.1 Bayesian Network Topologies 35
5.2.2 Collapse 36
5.2.3 Line 39
5.2.4 Star 42
5.2.5 Pseudo Loop 46
5.2.6 Diamond 50
5.2.7 Rhombus 53
6 Discussion 56
Appendix 58
A.1 Table for Collapse 58
A.2 Table for Line 61
A.3 Table for Star 64
A.4 Table for PseudoLoop 67
A.5 Table for Diamond 70
A.6 Table for Rhombus 72
Bibliography 75

more