Extractor-based Approach to Proving Memory-Sample Lower Bounds for Learning

A recent line of work has focused on the following question: Can one prove strong unconditional lower bounds on the number of samples needed for learning under memory constraints? We study an extractor-based approach to proving such bounds for a large class of learning problems as follows.

A matrix M: A x X -> {-1,1} corresponds to the following learning problem: An unknown function f in X is chosen uniformly at random. A learner tries to learn f from a stream of samples, (a_1, b_1), (a_2, b_2) ..., where for every i, a_i in A is chosen uniformly at random and b_i = M(a_i,f). Assume that k, l, r are such that any submatrix of M, with at least 2^{-k}|A| rows and at least 2^{-l}|X| columns, has a bias of at most 2^{-r} (extractor property).

We show that any learning algorithm for the learning problem corresponding to M requires either a memory of size at least Ω(k l), or at least 2^{Ω(r)} samples. We also extend the lower bounds to a learner that is allowed two passes over the stream of samples. In particular, we show that any two-pass algorithm for learning parities of size n requires either a memory of size Ω(n^{3/2}) or at least 2^{Ω(n^{1/2})} samples. An open question would be to go beyond these techniques to prove such memory-sample tradeoffs.

Joint works with Ran Raz and Avishay Tal.



Harvard University


Sumegha Garg