[Analyticsvidhya – Tutorial] BigDataFr recommends: Use H2O and data.table to build models on large data sets in R #machinelearning


[…] ‘Okay, I get it. data.table empowers us to do data exploration and manipulation. But what about model building? I work with 8GB of RAM. Algorithms like random forest (ntrees = 1000) take forever to run on my data set with 800,000 rows.’

I’m sure there are many R users who are trapped in a similar situation. To overcome this hurdle, I decided to write this post, which demonstrates how to use two powerful packages together: H2O and data.table.

For practical understanding, I’ve taken the data set from a previously held competition and tried to improve the score using four different machine learning algorithms (with H2O) and feature engineering (with data.table). So, get ready for a journey from rank 154 to rank 25 on the leaderboard. […]

By Manish Saraswat
Source: analyticsvidhya.com
