Abstract : We show how to uniformly distribute data at random (not to be confounded with permutation routing) in two settings that are able to deal with massive data: coarse grained parallelism and external memory. In contrast to previously known work for parallel setups, our method is able to fulfill the three criteria of uniformity, work-optimality and balance among the processors simultaneously. To guarantee the uniformity we investigate the matrix of communication requests between the processors. We show that its distribution is a generalization of the multivariate hypergeometric distribution and we give algorithms to sample it efficiently in the two settings.
https://hal.inria.fr/inria-00000900 Contributor : Jens GustedtConnect in order to contact the contributor Submitted on : Thursday, October 26, 2006 - 1:57:57 PM Last modification on : Friday, February 26, 2021 - 3:28:02 PM Long-term archiving on: : Monday, October 22, 2012 - 11:51:17 AM