Verifying The Unseen: Interactive Proofs for Label-Invariant Distribution Properties

Given i.i.d. samples drawn from an unknown distribution over a large domain [N], approximating several basic quantities, such as the distribution's support size and its Shannon Entropy, requires at least roughly (N / \log N) samples [Valiant and Valiant, STOC 2011].


Suppose, however, that we can interact with a powerful but untrusted prover, who knows the entire distribution. Can we use such a prover to approximately *verify* such statistical quantities more efficiently?


We show that this is indeed the case: a distribution's support size, its entropy, and its distance from the uniform distribution, can all be approximately verified via a constant-round interactive proof, where the verifier's running time, the sample complexity, and the communication complexity are all close to \sqrt{N}.


More generally, we give a tolerant interactive proof system with similar parameters for verifying a distribution's proximity to any label-invariant property (any property that is invariant to re-labeling of the elements in the distribution's support).


Joint work with Tal Herman.



Weizmann Institute