Getting the Most Out of Your Data: Multitask Bayesian Network Structure Learning, Predicting Good Probabilities and Ensemble Selection

Abstract

First, I consider the problem of simultaneously learning the structures of multiple Bayesian networks from multiple related datasets. I present a multitask Bayes net structure learning algorithm that learns more accurate network structures by transferring useful information between the datasets. The algorithm extends the score-and-search techniques used in traditional structure learning to the multitask case by defining a scoring function for sets of structures (one structure for each task) and an efficient procedure for searching for a high-scoring set of structures. I also address the task selection problem in the context of multitask Bayes net structure learning. Unlike in other multitask learning scenarios, the Bayes net structure learning setting admits a clear definition of task relatedness: two tasks are related if they have similar structures. This allows a set of related tasks to be selected automatically for use by multitask structure learning.
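As a rough illustration of the kind of scoring function described above, the sketch below combines per-task fit scores with a penalty on structural disagreement between tasks. It is a minimal sketch in Python, not the dissertation's actual algorithm: the names task_score and edge_diff and the pairwise edge-difference penalty are assumptions, and a real implementation would use a standard single-task Bayes net score (e.g., BIC or BDeu) together with a search procedure over sets of structures.

    from itertools import combinations

    def edge_diff(g1, g2):
        # Structures are represented as sets of directed edges (parent, child);
        # count the edges on which the two structures disagree.
        return len(g1 ^ g2)

    def multitask_score(structures, datasets, task_score, penalty=1.0):
        # Per-task fit: how well each structure explains its own dataset
        # (task_score is a placeholder for a single-task score such as BIC/BDeu).
        fit = sum(task_score(g, d) for g, d in zip(structures, datasets))
        # Relatedness prior: penalize structural disagreement between every
        # pair of tasks, pushing related tasks toward similar structures.
        disagreement = sum(edge_diff(g1, g2)
                           for g1, g2 in combinations(structures, 2))
        return fit - penalty * disagreement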

Second, I examine the relationship between the predictions made by different supervised learning algorithms and true posterior probabilities. I show that quasi-maximum-margin methods such as boosted decision trees and SVMs push probability mass away from 0 and 1, yielding a characteristic sigmoid-shaped distortion in the predicted probabilities. Naive Bayes, in contrast, pushes probabilities toward 0 and 1. Other models such as neural nets, logistic regression, and bagged trees usually do not have these biases and predict well-calibrated probabilities. I experiment with two ways of correcting the biased probabilities predicted by some learning methods: Platt Scaling and Isotonic Regression. I qualitatively examine which distortions these calibration methods are suited to correct and quantitatively examine how much data they need to be effective.
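Both calibration methods map a model's raw scores to probabilities using a held-out calibration set. The sketch below is a minimal illustration using scikit-learn and NumPy arrays; the library choice and the function names platt_scaling and isotonic_calibration are my assumptions, not part of the dissertation.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.isotonic import IsotonicRegression

    def platt_scaling(scores, labels):
        # Platt Scaling: fit a sigmoid from raw scores to probabilities
        # by running logistic regression on the one-dimensional scores.
        lr = LogisticRegression()
        lr.fit(scores.reshape(-1, 1), labels)
        return lambda s: lr.predict_proba(np.asarray(s).reshape(-1, 1))[:, 1]

    def isotonic_calibration(scores, labels):
        # Isotonic Regression: fit a non-decreasing step function from
        # raw scores to probabilities.
        iso = IsotonicRegression(out_of_bounds="clip")
        iso.fit(scores, labels)
        return iso.predict

Platt Scaling targets the sigmoid-shaped distortion typical of maximum-margin methods, while Isotonic Regression can correct any monotonic distortion but, being more flexible, tends to need more calibration data.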

Third, I present a method for constructing ensembles from libraries of thousands of models. Model libraries are generated using different learning algorithms and parameter settings. Forward stepwise selection is used to add to the ensemble the models that maximize its performance. The main drawback of ensemble selection is that the resulting ensembles are very large and slow at test time. This drawback, however, can be overcome with little or no loss in performance by using model compression.
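A minimal sketch of the forward stepwise selection loop described above, in Python: the names, the use of selection with replacement, and the simple averaging of model predictions (stored as NumPy arrays on a hillclimbing set) are assumptions made for illustration, not a definitive implementation of the dissertation's procedure.

    import numpy as np

    def ensemble_selection(predictions, labels, metric, n_steps=100):
        # predictions: dict mapping model name -> predictions on a hillclimb set
        # metric: function(preds, labels) -> score, where higher is better
        ensemble = []                                   # selected names (repeats allowed)
        running_sum = np.zeros_like(labels, dtype=float)
        for _ in range(n_steps):
            best_name, best_score = None, -np.inf
            for name, preds in predictions.items():
                # Score the ensemble as if this model were added next,
                # averaging the predictions of all selected models.
                candidate = (running_sum + preds) / (len(ensemble) + 1)
                score = metric(candidate, labels)
                if score > best_score:
                    best_name, best_score = name, score
            ensemble.append(best_name)
            running_sum = running_sum + predictions[best_name]
        return ensemble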

Sponsorship

The work in this dissertation was supported by NSF grants 0347318, 0412930, 0427914, and 0612031.

Date Issued

2008-07-30

Keywords

transfer learning; Bayesian network structure learning; probability calibration; ensemble learning

Types

dissertation or thesis
