BigDataFr recommends: Evaluation of Data Locality Strategies for Hybrid Cloud Bursting of Iterative MapReduce
Abstract
[…] Hybrid cloud bursting (i.e., leasing temporary off-premise cloud resources to boost the overall capacity during peak utilization) is a popular and cost-effective way to deal with the increasing complexity of big data analytics. It is particularly promising for iterative MapReduce applications that reuse massive amounts of input data at each iteration, which compensates for the high overhead and cost of concurrent data transfers from the on-premise to the off-premise VMs over a weak inter-site link that is of limited capacity. In this paper we study how to combine various MapReduce data locality techniques designed for hybrid cloud bursting in order to achieve scalability for iterative MapReduce applications in a cost-effective fashion. […]
Read paper
By Francisco Clemente-Castello (1), Bogdan Nicolae (2), M. Mustafa Rafique (3), Rafael Mayo (1), Juan Carlos Fernandez (1)
Source: hal-archives-ouvertes.fr
(1) Universidad Jaume I
(2) Huawei Technologies European Research Center
(3) IBM Research – Ireland