[arXiv] BigDataFr recommends: Communication Efficient Checking of Big Data Operations

big data operationsBigDataFr recommends: Communication Efficient Checking of Big Data Operations

[…] Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC)

We propose fast probabilistic algorithms with low (i.e., sublinear in the input size) communication volume to check the correctness of operations in Big Data processing frameworks and distributed databases. Our checkers cover many of the commonly used operations, including sum, average, median, and minimum aggregation, as well as sorting, union, merge, and zip. An experimental evaluation of our implementation in Thrill (Bingmann et al., 2016) confirms the low overhead and high failure detection rate predicted by theoretical analysis. […]

Read more
By Lorenz Hübschle-Schneider, Peter Sanders
Source: arxiv.org

Laisser un commentaire

Votre adresse de messagerie ne sera pas publiée. Les champs obligatoires sont indiqués avec *