Hybrid Learning of Bayesian Networks for Joint Modeling of Risk Factors from Survey Data
Nandini Ramanan
Dallas, Texas
Recent work has identified the potential of machine-learning approaches for predicting medical conditions such as postpartum depression and rare diseases from self-reported survey data. In particular, such approaches provide the opportunity to study the interactions between a condition and related risk factors more closely. Here we consider the problem of learning, from survey data, to model the joint distribution over several risk factors and their relationship to the occurrence of a condition.
Project status: Concept
Overview / Usage
We develop a novel hybrid Bayesian network learning algorithm that constructs the network in two steps. The first is a dependency network (DN) learning step, in which a DN is learned in a scalable and efficient manner. The second is a network reduction step that prunes edges from the DN based on mutual independence tests, turning it into a Bayesian network (BN). Our proposed approach, Mutual Independence-based Dependency (network) to Bayesian Network, or MidBN, has the salient property of being scalable while learning potentially interesting associations between risk factors, both among themselves and with the target of interest.
Methodology / Approach
We develop an algorithm that performs three steps: learning a dependency network from data, detecting its cycles, and removing the edges that are mutually independent. The overall intuition behind this approach is fairly simple: use a scalable algorithm to handle a large number of variables and learn a dense model quickly. The resulting model will generally contain cycles, so we then remove edges deemed “less informative”, as measured on the data using mutual independence tests. In this manner, we convert a DN to a BN. We provide details of the three steps that form the core of our MidBN approach below:
• Learn a dependency network. A key advantage of a DN is that, because it permits cycles, learning one is efficient and scalable to a large number of variables. While there are several different ways of learning a DN, we take an efficient approach based on the observation that trees can be used inside probabilistic models to capture context-specific independence. Thus, to learn a DN, we iterate through every variable and learn a (probabilistic) decision tree for it, using the remaining variables as candidate features. One advantage of this approach is that it learns the qualitative relationships (structure) and quantitative influences (parameters) simultaneously: the structure is simply the set of variables appearing in each tree, and the parameters are the distributions at its leaves. The other advantage is that the approach is easily parallelizable and scalable.
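As a minimal illustration of this step (not the project's actual implementation), the sketch below learns a depth-one "tree" per binary variable by selecting the single best predictor via empirical mutual information. A full DN learner would grow deeper probabilistic trees, but the structure-plus-parameters idea is the same; all class and method names are hypothetical.

```java
public class DependencyNetworkSketch {
    // data[i][j] = value (0/1) of variable j in sample i.
    // For each target variable, pick the single best predictor (a depth-one
    // "tree"); a full implementation would recursively grow deeper trees.
    public static int[] learnStumps(int[][] data) {
        int d = data[0].length;
        int[] parent = new int[d];
        for (int t = 0; t < d; t++) {
            double best = -1; int bestVar = -1;
            for (int f = 0; f < d; f++) {
                if (f == t) continue;
                double mi = mutualInformation(data, f, t);
                if (mi > best) { best = mi; bestVar = f; }
            }
            parent[t] = bestVar;
        }
        return parent;
    }

    // Empirical mutual information (in nats) between binary variables a and b.
    static double mutualInformation(int[][] data, int a, int b) {
        int n = data.length;
        double[][] joint = new double[2][2];
        double[] pa = new double[2], pb = new double[2];
        for (int[] row : data) {
            joint[row[a]][row[b]] += 1.0 / n;
            pa[row[a]] += 1.0 / n;
            pb[row[b]] += 1.0 / n;
        }
        double mi = 0;
        for (int x = 0; x < 2; x++)
            for (int y = 0; y < 2; y++)
                if (joint[x][y] > 0)
                    mi += joint[x][y] * Math.log(joint[x][y] / (pa[x] * pb[y]));
        return mi;
    }
}
```

Because each variable's tree is learned independently of the others, the outer loop over targets parallelizes trivially, which is the scalability property noted above.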
• Delete cycles to obtain BN structure. Next, the goal is to convert the DN learned in the previous step into a BN. Recall that the key difference between a DN and a BN is that DNs permit cycles. A naïve approach would be to search for an edge, remove it, and then check acyclicity and log-likelihood. We take a slightly different approach. Once a cycle is detected, we sort the edges in the cycle by their mutual information. The bottom k edges from this sorted list are deleted from the DN; we set k = 5 in all our experiments. This process is repeated iteratively until all cycles are removed from the DN. Numerous methods exist to detect cycles, including graph search and topological sorting. In our implementation, we employ depth-first search (DFS).
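The cycle-removal loop can be sketched as follows, assuming edge scores (standing in for mutual information) are available in a matrix. The DFS reconstructs one directed cycle at a time, and the cycle's k weakest edges are deleted until the graph is acyclic. Class and method names are illustrative, not the project's code.

```java
import java.util.*;

public class CycleBreaker {
    // Repeatedly find a cycle via DFS and delete its k weakest edges;
    // weight[u][v] stands in for the mutual information of edge u -> v.
    public static void breakCycles(List<Set<Integer>> adj, double[][] weight, int k) {
        List<int[]> cycle;
        while ((cycle = findCycle(adj)) != null) {
            cycle.sort(Comparator.comparingDouble((int[] e) -> weight[e[0]][e[1]]));
            for (int i = 0; i < Math.min(k, cycle.size()); i++)
                adj.get(cycle.get(i)[0]).remove(cycle.get(i)[1]);
        }
    }

    // Returns the edges of one directed cycle, or null if the graph is acyclic.
    static List<int[]> findCycle(List<Set<Integer>> adj) {
        int n = adj.size();
        int[] state = new int[n];      // 0 = unvisited, 1 = on stack, 2 = done
        int[] parentOf = new int[n];
        Arrays.fill(parentOf, -1);
        for (int s = 0; s < n; s++) {
            List<int[]> c = dfs(s, adj, state, parentOf);
            if (c != null) return c;
        }
        return null;
    }

    static List<int[]> dfs(int u, List<Set<Integer>> adj, int[] state, int[] parentOf) {
        if (state[u] != 0) return null;
        state[u] = 1;
        for (int v : adj.get(u)) {
            if (state[v] == 1) {       // back edge: walk parents to rebuild cycle
                List<int[]> cycle = new ArrayList<>();
                cycle.add(new int[]{u, v});
                for (int x = u; x != v; x = parentOf[x])
                    cycle.add(new int[]{parentOf[x], x});
                return cycle;
            }
            if (state[v] == 0) {
                parentOf[v] = u;
                List<int[]> c = dfs(v, adj, state, parentOf);
                if (c != null) return c;
            }
        }
        state[u] = 2;
        return null;
    }
}
```

Deleting the weakest edges first reflects the intuition in the text: the edges carrying the least mutual information are the ones we can most safely sacrifice to restore acyclicity.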
• Learn conditional distributions to obtain BN parameters. Once all the cycles are detected and removed, we obtain the skeleton of the BN. Since our data is fully observed, we estimate its parameters by learning the conditional distributions using maximum likelihood estimation.
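For fully observed data, the MLE step reduces to simple counting. The sketch below handles the binary, single-parent case (names are hypothetical); multi-parent variables work the same way with one count per parent configuration.

```java
public class MleCpt {
    // Maximum-likelihood CPT for a binary `child` given a binary `parent`:
    // returns p[v] = P(child = 1 | parent = v), estimated by counting.
    // Parent values never seen in the data get a uniform 0.5 fallback.
    public static double[] estimate(int[][] data, int child, int parent) {
        double[] ones = new double[2], total = new double[2];
        for (int[] row : data) {
            total[row[parent]]++;
            if (row[child] == 1) ones[row[parent]]++;
        }
        double[] p = new double[2];
        for (int v = 0; v < 2; v++)
            p[v] = total[v] > 0 ? ones[v] / total[v] : 0.5;
        return p;
    }
}
```

In practice a small pseudo-count (Laplace smoothing) is often added so that rare parent configurations in sparse survey data do not produce zero probabilities; the pure MLE shown here matches the description above.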
Technologies Used
Java
bnlearn