Intelligible Models: Recovering Low Dimensional Additive Structure For Machine Learning Models

Different supervised learning models have different bias-variance tradeoffs. For low-dimensional problems, low-bias models such as boosted trees or SVMs with RBF kernels are very accurate but are unfortunately no longer interpretable by users. For high-dimensional problems, high-bias models such as regularized linear or logistic regression are usually preferred over other models because of the curse of dimensionality and the exponentially growing hypothesis space, but it is not clear whether the accuracy of those high-bias models can be improved further. Additive modeling is an excellent tool for controlling bias and variance at a finer granularity, and it offers a solution to both problems. Generalized additive models (GAMs) express the hypothesis as a sum of components, where each component can include any number of variables. Therefore, by prudently selecting the components, restricting the number of complex components, and carefully controlling the complexity of each selected component, GAMs can flexibly model hypotheses with different biases. This dissertation presents a family of additive models called intelligible models, which effectively recover low-dimensional additive structure. These low-dimensional additive components let data scientists investigate each simple component individually, which significantly improves interpretability. We first present a large-scale empirical study of various methods for fitting GAMs and demonstrate empirically that gradient boosting with shallow bagged trees yields the best accuracy. In addition, we propose a very efficient method for detecting pairwise feature interactions that scales to thousands of features. With a large-scale empirical study, we show that models with low-dimensional additive components (one- and two-dimensional components) are as accurate as complex models such as random forests.
Finally, we develop a method to carefully control the complexity of intelligible models through feature selection and by intelligently deciding whether each selected term is linear or nonlinear, and we show that on high-dimensional problems we can further improve on the accuracy of popular linear models by allowing a small set of features to act nonlinearly.
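
As a rough illustration of the additive form described in the abstract, the sketch below fits a model y ≈ b + f_1(x_1) + ... + f_d(x_d) by cycling through the features and boosting a one-split regression tree (a stump) on the current residuals of each. This is not the dissertation's implementation: it uses single stumps rather than bagged shallow trees, squared loss only, and no pairwise components. The names `fit_stump`, `fit_additive`, `shape_value`, and `predict`, the learning rate, the number of rounds, and the synthetic data are all assumptions made for the example.

```python
import math
import random


def fit_stump(x, residual):
    """Best single-split regression stump for one feature (squared loss).

    Returns (threshold, left_mean, right_mean), or None if x is constant.
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    total = sum(residual)
    best_gain, best = 0.0, None
    left_sum = 0.0
    for k in range(1, n):
        lo, hi = order[k - 1], order[k]
        left_sum += residual[lo]
        if x[lo] == x[hi]:
            continue  # cannot split between equal values
        left_mean = left_sum / k
        right_mean = (total - left_sum) / (n - k)
        # Maximizing this quantity is equivalent to minimizing the split's SSE.
        gain = k * left_mean ** 2 + (n - k) * right_mean ** 2
        if best is None or gain > best_gain:
            best_gain = gain
            best = ((x[lo] + x[hi]) / 2.0, left_mean, right_mean)
    return best


def fit_additive(X, y, rounds=100, lr=0.1):
    """Cyclic gradient boosting: one shrunken stump per feature per round."""
    n, d = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(d)]
    intercept = sum(y) / n
    pred = [intercept] * n
    shapes = [[] for _ in range(d)]  # shapes[j] holds the stumps that make up f_j
    for _ in range(rounds):
        for j in range(d):
            residual = [y[i] - pred[i] for i in range(n)]
            stump = fit_stump(cols[j], residual)
            if stump is None:
                continue
            t, left, right = stump
            shapes[j].append((t, lr * left, lr * right))
            for i in range(n):
                pred[i] += lr * (left if cols[j][i] <= t else right)
    return intercept, shapes


def shape_value(shape, v):
    """Evaluate one learned component f_j at the feature value v."""
    return sum(left if v <= t else right for t, left, right in shape)


def predict(model, row):
    intercept, shapes = model
    return intercept + sum(shape_value(s, v) for s, v in zip(shapes, row))


# Synthetic, purely additive target used only for this demonstration.
random.seed(0)
X = [[random.uniform(-3, 3), random.uniform(-2, 2)] for _ in range(200)]
y = [math.sin(a) + b * b for a, b in X]
model = fit_additive(X, y)
mse = sum((predict(model, row) - t) ** 2 for row, t in zip(X, y)) / len(y)
print("training MSE:", mse)
```

Because each component depends on a single feature, `shape_value(model[1][j], v)` can be evaluated over a grid of values and plotted on its own, which is the kind of per-component inspection the abstract refers to.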

Keywords

intelligible models; classification and regression; interaction detection


Committee Chair

Gehrke, Johannes E.

Committee Member

Kozen, Dexter Campbell
Snavely, Keith Noah
Caruana, Rich A.
Hooker, Giles J.

Degree Discipline

Computer Science

Degree Name

Ph. D., Computer Science

Degree Level

Doctor of Philosophy

dissertation or thesis
