Big data

Predicting large text data with Spark via the R package sparklyr

Classical programming languages can be very slow on very large datasets, and sometimes fail to load them at all, because they typically use only a single core. Apache Spark, by contrast, is widely regarded as one of the fastest distributed systems and can handle large datasets with ease...

Predicting a binary response variable with the H2O framework

H2O is an open-source, distributed, scalable framework for data analysis and for training machine learning and deep learning models. It is easy to use and can handle large datasets by creating...

Introduction to sparklyr

sparklyr is an R interface for Apache Spark.