BigDataFr recommends: Ibis on Impala – Python at Scale for Data Science
‘This new Cloudera Labs project promises to deliver the great Python user experience and ecosystem at Hadoop scale.
Across the user community, you will find general agreement that the Apache Hadoop stack has progressed dramatically in just the past few years. For example, Search and Impala have moved Hadoop beyond batch processing, while developers are seeing significant productivity gains and additional use cases by transitioning from MapReduce to Apache Spark.
Thanks to such advances in the ecosystem, Hadoop has evolved into a robust and powerful open source data analysis stack. A centerpiece of that stack is Impala, the MPP query engine that is still the only open source option for a truly interactive, BI-style experience (an analytic database, if you will) on Hadoop. For business analysts in particular, who are the rank-and-file of big data consumers, the Hadoop experience is becoming all but indistinguishable from that of traditional data infrastructure but with unprecedented scale, flexibility, and cost-effectiveness under the covers.
The Rise of Python for Data Science’ […]
By Marcel Kornacker and Wes McKinney
Source: blog.cloudera.com
About the authors
Marcel Kornacker is Chief Architect for Database Technology at Cloudera, and the creator of Impala.
Wes McKinney is a Software Engineer at Cloudera. He is the creator of Python’s ubiquitous pandas library and the author of the O’Reilly Media best-seller Python for Data Analysis. Previously, Wes was the founder and CEO of DataPad.