Show simple item record

dc.contributor.authorWhite, Patrick Edward
dc.description96 pages
dc.description.abstractWe consider the problem of how to construct algorithms which deal efficiently with large amounts of data. We give new algorithms which use time and communication resources that are sublinear in the problem size for problems in various domains including statistics and combinatorics. We first consider properties of random variables. We begin with the problem of distinguishing whether two distributions over the same domain are close or far in both the $L_1$ and the $L_2$ norms. We investigate two models for representing a distribution. In one model, elements of a sample space are generated on request according to a fixed but unknown distribution. In the other, the probability assigned to each element is given explicitly in an array. We present algorithms in two settings: (1) when both distributions are represented in the first model; and, (2) when one of each representation is given. We show that the first setting is provably easier than the second setting. Next we give algorithms for testing whether two random variables are independent. In all of our algorithms, the number of samples required from the input distributions is sublinear in the domain size and nearly optimal. We then consider properties of data. Specifically, we give an algorithm which determines if a Markov Chain is rapidly mixing in sublinear time, assuming the input is in a form which allows for easy generation of sequential nodes in a random walk. Our test distinguishes Markov chains which are rapidly mixing from those which cannot be made rapidly mixing by changing a small number of edges. Finally we turn to a model in which the help of an untrusted entity is used to reliably solve a problem in sublinear time. For the problem of multidimensional bin-packing, we give an algorithm which can verify the goodness of a potential solution in sublinear time. To do this we develop tools which allow one to test that a function is approximately monotone.
dc.subjectDistribution Testing
dc.subjectMarkov Chain
dc.subjectProperty Testing
dc.subjectSublinear Algorithms
dc.titleSublinear Algorithms for Statistical, Markov Chain, and Binpacking Problems
dc.typedissertation or thesis Science of Philosophy D., Computer Science
dc.contributor.chairTardos, Éva
dc.contributor.committeeMemberShmoys, David B.
dc.contributor.committeeMemberSelman, Bart
dc.contributor.committeeMemberRubinfeld, Ronitt

Files in this item


This item appears in the following Collection(s)

Show simple item record