Variable treatment for R data frames (vtreat)
Variable treatment package for R data frames from Win-Vector LLC.
http://…esigning-a-package-for-variable-treatment


Gradient Boosting (GBDT, GBRT or GBM) Library for large-scale and distributed machine learning, on a single node, Hadoop YARN and more (xgboost)
An optimized general-purpose gradient boosting library. The library is parallelized and also provides an optimized distributed version. It implements machine learning algorithms under the gradient boosting framework, including generalized linear models and gradient boosted regression trees (GBDT). XGBoost can also run distributed and scale to terascale data.

Easily Tidy Data with spread() and gather() Functions (tidyr)
An evolution of reshape2. It’s designed specifically for data tidying (not general reshaping or aggregating) and works well with dplyr data pipelines.
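
As a rough illustration of the two functions named in the title, the sketch below round-trips a small table between wide and long form. The data frame and its column names (student, math, art) are made up for this example and are not taken from the package documentation.

```r
library(tidyr)

# Hypothetical wide table: one row per student, one column per exam.
wide <- data.frame(
  student = c("ann", "bob"),
  math = c(90, 72),
  art = c(85, 64),
  stringsAsFactors = FALSE
)

# gather() stacks the exam columns into key/value pairs (wide -> long).
long <- gather(wide, key = "exam", value = "score", math, art)

# spread() reverses the operation, one column per exam (long -> wide).
wide_again <- spread(long, key = exam, value = score)
```

Here long should hold one row per student and exam, and wide_again should carry the same values as wide, possibly with the exam columns in a different order.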