Three best practices for building successful data pipelines

BigDataFr recommends: Three best practices for building successful data pipelines

‘Building a good data pipeline can be technically tricky. As a data scientist who has worked at Foursquare and Google, I can honestly say that one of our biggest headaches was locking down our Extract, Transform, and Load (ETL) process.

At The Data Incubator, our team has trained more than 100 talented Ph.D. data science fellows who are now data scientists at a wide range of companies, including Capital One, the New York Times, AIG, and Palantir. We commonly hear from Data Incubator alumni and hiring managers that one of their biggest challenges is also implementing their own ETL pipelines.’ […]

By Michael Li
Source: http://radar.oreilly.com

About Michael Li

Michael Li is the founder of The Data Incubator, an eight-week fellowship that helps Ph.D.s and postdocs transition from academia into industry. Previously, he headed monetization data science at Foursquare and has worked at Google, Andreessen Horowitz, J.P. Morgan, and D.E. Shaw. He is a regular contributor to VentureBeat, The Next Web, and Harvard Business Review. Michael earned his Ph.D. at Princeton and was a Marshall Scholar in Cambridge.

