[O’R] BigDataFr recommends: Validating data models with Kafka-based pipelines #datascientist #machinelearning #kafka #hadoop #cloudera

‘A/B testing is a popular method of using business intelligence data to assess possible changes to websites. In the past, when a business wanted to update its website in an attempt to drive more sales, decisions on the specific changes to make were driven by guesses, intuition, focus groups, and ultimately, which executive yelled louder. These days, the data-driven solution is to set up multiple copies of the website, direct users randomly to the different variations, and measure which design improves sales the most. There are a lot of details to get right, but this is the gist of things.

When it comes to back-end systems, however, we are still living in the stone age. Suppose your business has grown significantly and you notice that your existing MySQL database is becoming less responsive as the load increases. If you consider moving to a NoSQL system, you need to decide which solution to pick, and there are a lot of options: Cassandra, MongoDB, Couchbase, or even Hadoop. There are also many possible data models: normalized, wide tables, narrow tables, nested data structures, etc.’
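As a rough illustration of the A/B-testing mechanics described in the first paragraph of the excerpt, here is a minimal Python sketch of deterministic variant assignment and conversion counting. The variant names, user IDs, and sample traffic are invented for illustration and do not come from the article.

```python
import hashlib
from collections import defaultdict

VARIANTS = ["control", "new_design"]  # hypothetical site variations

def assign_variant(user_id: str) -> str:
    """Deterministically bucket a user so they always see the same variant."""
    bucket = int(hashlib.sha256(user_id.encode("utf-8")).hexdigest(), 16) % len(VARIANTS)
    return VARIANTS[bucket]

visits = defaultdict(int)
sales = defaultdict(int)

def record_visit(user_id: str, purchased: bool) -> None:
    """Tally visits and sales per variant to see which design sells more."""
    variant = assign_variant(user_id)
    visits[variant] += 1
    if purchased:
        sales[variant] += 1

# Illustrative traffic; in a real test these events would come from the website.
for uid, bought in [("u1", True), ("u2", False), ("u3", True), ("u4", False)]:
    record_visit(uid, bought)

for v in VARIANTS:
    rate = sales[v] / visits[v] if visits[v] else 0.0
    print(f"{v}: {sales[v]}/{visits[v]} conversions ({rate:.0%})")
```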
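The article's title points at the Kafka-based answer to the second paragraph: capture the production write stream once and replay it into each candidate backend and data model, so the alternatives can be compared on identical data. Below is a minimal sketch of that fan-out, assuming the kafka-python client, a hypothetical "user-events" topic, a local broker, and an invented event shape; the two writer stubs stand in for real writers (for example a wide-table model versus a nested-document model) and are not taken from the article.

```python
import json
from kafka import KafkaConsumer  # assumes the kafka-python package is installed

# Candidate data models under evaluation; these stubs stand in for real writers.
def write_wide_table(event: dict) -> None:
    # e.g. one row per (user, action) in a wide-table store
    print("wide-table candidate:", event["user_id"], event["action"])

def write_nested_document(event: dict) -> None:
    # e.g. actions nested inside a per-user document
    print("nested-document candidate:", {"user": event["user_id"], "actions": [event["action"]]})

CANDIDATE_WRITERS = [write_wide_table, write_nested_document]

# One consumer reads the event stream; every candidate model receives the
# same data, so the models can be compared under an identical workload.
consumer = KafkaConsumer(
    "user-events",                       # hypothetical topic name
    bootstrap_servers="localhost:9092",  # hypothetical broker address
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    event = message.value  # assumed to contain "user_id" and "action" fields
    for write in CANDIDATE_WRITERS:
        write(event)
```

Because every candidate sees exactly the same stream, differences in load handling or query behavior can be attributed to the data model rather than to differences in the workload.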

Read more
By Gwen Shapira
Source: radar.oreilly.com


About Gwen Shapira

Gwen Shapira is a Solutions Architect at Cloudera and leader of the IOUG Big Data SIG. She studied computer science, statistics, and operations research at Tel Aviv University, then spent the next 15 years in various technical positions in the IT industry.
She specializes in scalable and resilient solutions, and helps her customers build high-performance, large-scale data architectures using Hadoop.
Gwen Shapira is a frequent presenter at conferences and regularly publishes articles in technical magazines and on her blog.
