To avoid latency, Impala circumvents MapReduce to directly access the data through a specialized distributed query engine that is very similar to those found in commercial parallel RDBMSs. The result is order-of-magnitude faster performance than Hive, depending on the type of query and configuration.
Sunday, May 12, 2013
Technical Overview of Cloudera Impala
In this slidecast, Justin Erickson from Cloudera presents a technical overview of Cloudera Impala, an SQL-on-Hadoop solution that lets users run real-time queries on data stored in Hadoop clusters.