September 30, 2013

Performance benchmark: Redshift vs Impala vs Shark vs Hive

The AMPLab at the University of California, Berkeley has released a new performance benchmark of scalable cloud-based query engines (via Ben Lorica at O'Reilly).

Massively parallel processing (MPP) architectures have quickly become the go-to solution for big data analytics. The benchmark confirms that Redshift is the fastest of the on-disk solutions, although it is slower than in-memory queries on Impala and Shark when the data fits in memory. Google BigQuery was not yet included in the UC Berkeley benchmark.

We built MegaPivot to give any non-technical business analyst access to cutting-edge columnar MPP cloud platforms, with the flexibility and power to run queries and aggregations over millions or even billions of rows. MegaPivot chose Google BigQuery as its platform to make it even easier to connect to unsampled Google Analytics data and to integrate with Google Apps environments.

MegaPivot is a web-based big data analytics application built on Google BigQuery infrastructure.

Build pivot tables on billions of rows in seconds.
Easily import CSV files of any size directly from your Dropbox folder.
Share your reports with your colleagues.
