[O’R] BigDataFr recommends: The O’Reilly Data Show Podcast: Todd Lipcon (#Cloudera)

BigDataFr recommends: Resolving transactional access and analytic performance trade-offs

The O’Reilly Data Show Podcast: Todd Lipcon (#Cloudera) on hybrid and specialized tools in distributed systems.

Excerpt

HDFS and HBase

[Hadoop is] more like a file store. It allows you to upload files onto an arbitrarily sized cluster with 20-plus petabytes, in single clusters. The thing is, you can upload the files but you can’t edit them in place. To make any change, you have to basically put in a new file. What HBase does in distinction is that it has more of a tabular data model, where you can update and insert individual row-by-row data, and then randomly access that data [in] milliseconds. […]
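
To make the contrast in the excerpt concrete, here is a minimal Python sketch. It is only a toy in-memory model: the class names `WriteOnceFileStore` and `RowTable`, the helper `change_record_in_file_store`, and the CSV-style file layout are all hypothetical illustrations and are not the HDFS or HBase APIs. It assumes a file store that rejects in-place edits and a table keyed by row.

```python
class WriteOnceFileStore:
    """Toy stand-in for a write-once file store (hypothetical, not the HDFS API)."""

    def __init__(self):
        self._files = {}

    def upload(self, path, data):
        # Files cannot be edited in place: an existing path is never overwritten.
        if path in self._files:
            raise FileExistsError(f"{path} already exists; upload a new file instead")
        self._files[path] = bytes(data)

    def read(self, path):
        return self._files[path]


class RowTable:
    """Toy stand-in for a row-keyed table (hypothetical, not the HBase API)."""

    def __init__(self):
        self._rows = {}

    def put(self, row_key, column, value):
        # Insert or update a single cell of a single row.
        self._rows.setdefault(row_key, {})[column] = value

    def get(self, row_key):
        # Random access to one row by its key.
        return dict(self._rows.get(row_key, {}))


def change_record_in_file_store(store, old_path, new_path, key, new_value):
    """Change one record by rewriting the whole file under a new path."""
    lines = store.read(old_path).decode().splitlines()
    updated = [
        f"{key},{new_value}" if line.split(",")[0] == key else line
        for line in lines
    ]
    store.upload(new_path, "\n".join(updated).encode())


if __name__ == "__main__":
    store = WriteOnceFileStore()
    store.upload("/users/v1.csv", b"alice,1\nbob,2")
    # Changing bob's value means producing a whole new file.
    change_record_in_file_store(store, "/users/v1.csv", "/users/v2.csv", "bob", "3")
    print(store.read("/users/v2.csv").decode())

    table = RowTable()
    table.put("bob", "visits", 2)
    table.put("bob", "visits", 3)  # update just this row, in place
    print(table.get("bob"))
```

In this sketch, the file store pays the cost of rewriting every record to change one, while the table touches only the affected row, which is the trade-off the excerpt describes.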

Ben Lorica, Chief Data Scientist & Director of Content Strategy for Data at O’Reilly Media, Inc
Source: radar.oreilly.com
