TY - JOUR
T1 - Learning Bayesian network parameters via minimax algorithm
AU - Chen, Daqing
PY - 2019/3/8
Y1 - 2019/3/8
N2 - Parameter learning is an important aspect of learning in Bayesian networks. Although the maximum likelihood algorithm is often effective, it suffers from overfitting when there is insufficient data. To address this, prior distributions of model parameters are often imposed. When training a Bayesian network, the parameters of the network are optimized to fit the data. However, imposing prior distributions can reduce the fitness between parameters and data. Therefore, a trade-off is needed between fitting and overfitting. In this study, a new algorithm, named MiniMax Fitness (MMF), is developed to address this problem. The method includes three main steps. First, the maximum a posteriori estimation that combines data and prior distribution is derived. Then, the hyper-parameters of the prior distribution are optimized to minimize the fitness between the posterior estimation and the data. Finally, the order of the posterior estimation is checked and adjusted to match the order of the statistical counts from the data. In addition, we introduce an improved constrained maximum entropy method, named Prior Free Constrained Maximum Entropy (PF-CME), to facilitate parameter learning when domain knowledge is provided. Experiments show that the proposed methods outperform most existing parameter learning methods.
AB - Parameter learning is an important aspect of learning in Bayesian networks. Although the maximum likelihood algorithm is often effective, it suffers from overfitting when there is insufficient data. To address this, prior distributions of model parameters are often imposed. When training a Bayesian network, the parameters of the network are optimized to fit the data. However, imposing prior distributions can reduce the fitness between parameters and data. Therefore, a trade-off is needed between fitting and overfitting. In this study, a new algorithm, named MiniMax Fitness (MMF), is developed to address this problem. The method includes three main steps. First, the maximum a posteriori estimation that combines data and prior distribution is derived. Then, the hyper-parameters of the prior distribution are optimized to minimize the fitness between the posterior estimation and the data. Finally, the order of the posterior estimation is checked and adjusted to match the order of the statistical counts from the data. In addition, we introduce an improved constrained maximum entropy method, named Prior Free Constrained Maximum Entropy (PF-CME), to facilitate parameter learning when domain knowledge is provided. Experiments show that the proposed methods outperform most existing parameter learning methods.
U2 - 10.1016/j.ijar.2019.03.001
DO - 10.1016/j.ijar.2019.03.001
M3 - Article
SN - 0888-613X
SP - 62
EP - 75
JO - International Journal of Approximate Reasoning
JF - International Journal of Approximate Reasoning
ER -