We define and explore the design space of efficient algorithms to compute ROLLUP aggregates, using the MapReduce programming paradigm. Using a modeling approach, we explain the non-trivial trade-o. that exists between parallelism and communication costs that is inherent to a MapReduce implementation of ROLLUP. Furthermore, we design a new family of algorithms that, through a single parameter, allow to find a sweet spot in the parallelism vs. communication cost trade-o. We complement our work with an experimental approach, wherein we overcome some limitations of the model we use. Our results indicate that efficient ROLLUP aggregates require striking the good balance between parallelism and communication for both one-round and chained algorithms.

On the design space of mapreduce ROLLUP aggregates

Dell'Amico M.;
2014-01-01

Abstract

We define and explore the design space of efficient algorithms to compute ROLLUP aggregates, using the MapReduce programming paradigm. Using a modeling approach, we explain the non-trivial trade-o. that exists between parallelism and communication costs that is inherent to a MapReduce implementation of ROLLUP. Furthermore, we design a new family of algorithms that, through a single parameter, allow to find a sweet spot in the parallelism vs. communication cost trade-o. We complement our work with an experimental approach, wherein we overcome some limitations of the model we use. Our results indicate that efficient ROLLUP aggregates require striking the good balance between parallelism and communication for both one-round and chained algorithms.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11567/1070896
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? ND
social impact