broom: a package for tidying statistical models into data frames
The concept of ‘tidy data’, as introduced by Hadley Wickham, offers a powerful framework for data manipulation, analysis, and visualization. Popular packages like dplyr, tidyr and ggplot2 take great advantage of this framework, as explored in several recent posts by others. But there’s an important step in a tidy data workflow that so far has been missing: the output of R statistical modeling functions isn’t tidy, meaning it’s difficult to manipulate and recombine in downstream analyses and visualizations.

A first look at rxBTrees
The gradient boosting machine, as developed by Friedman, Hastie, Tibshirani and others, has become an extremely successful algorithm for dealing with both classification and regression problems and is now an essential feature of any machine learning toolbox. R’s gbm() function (gbm package) is a particularly well-crafted implementation of the gradient boosting machine that served as a model for the rxBTrees() function, which was released last year as a feature of Revolution R Enterprise 7.3. You can think of rxBTrees() as a scaled-up version of gbm() that is designed to work with massive data sets in distributed environments.

shinyData – GUI for data analysis and reporting
Some people find it very hard to start using R because it has no GUI. There are some GUIs that offer part of R's functionality. In this post I would like to focus on one such GUI, a very new Shiny application called shinyData. I hope the app will make it easier for some people to get into the R environment. It can also reduce the time existing R users spend developing analyses and reports.

Unstructured Data

Scripts to set up a GPU / CUDA-enabled compute server with libraries to study deep learning development

Tips & Tricks 7: Plotting PCA with TPS grids
Our function plotTangentSpace() performs a Principal Components Analysis (PCA) of shape variation, plots two dimensions of tangent space for a set of Procrustes-aligned specimens, and returns the shape changes associated with the two plotted principal axes. We purposefully restricted the options for this function because plotting in R has almost endless possibilities. That is why we added the ‘verbose=TRUE’ option, so that the PC scores and PC shapes could be plotted on their own. Users are, of course, expected to do this for their publications and reports. In this week’s exercise, we explore a few options for more advanced plotting of PCAs and TPS grids with geomorph and R base functions.