A Network Approach to Balanced Sampling

 

Dawn M. Rose

Department of Mathematical Sciences

Clemson University

Clemson, SC 29634-0975

 

Douglas R. Shier

Department of Mathematical Sciences

Clemson University

Clemson, SC 29634-0975

 

Abstract: A situation that frequently arises in statistical sampling is to draw a small but "representative" sample from a given population. Specifically, we may be able to stratify the population based on some measurable variate X and we wish to estimate some other (unknown) variate Z. In cases such as these, a purposeful rather than simply a random sample can be more informative. This situation gives rise to the following problem for univariate X and Z. Select a k-subset (sample) from a given n-set (population) so that the sample matches the population in its first two moments. We model this as an optimal path problem on a network G(n,k) having arc lengths that are unrestricted in sign. In particular, we seek all s-t paths in G(n,k) whose length is zero with respect to two separate lengths, one defined relative to the first moment and the other defined relative to the second moment. This problem can be shown to be NP-hard. We present solution algorithms based on a depth-first search and a modified depth-first search, as well as computational results comparing the two.

Key Words: balanced sample, bicriterion optimization, moments, networks, sampling