Big Data and the Super Bowl; Why Semantics Will Enrich Your Data
Why include semantics in data? Knowledge integration is the key. I have added a simple use case to describe the benefits. Since the Super Bowl just ended, I will use a football-related example.
The R Journal, Volume 6/2, December 2014 – is now online!
The new issue of The R Journal is now available!
Should you teach Python or R for data science?
That’s an excellent question! It doesn’t have a simple answer (in my opinion) because both languages are great for data science, but one might be better than the other depending upon your students and your priorities.
Sharing Your Shiny Apps
For those of you who are not familiar with Shiny, I’ll briefly describe the high-level architecture. A Shiny application consists of two R files: a server file and a user interface (UI) file. The UI file generates the HTML for the page; this is where you create your buttons, checkboxes, images, and other HTML widgets. The server file is where Shiny’s real magic happens: it is where you make the buttons and checkboxes that you created in your UI actually do something. In other words, it is where R users can turn their app, using only R, into a dynamic visual masterpiece. To see your output, type runApp("appName") in your console window, with the folder that contains the appName app folder as your working directory.
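As a rough illustration of the UI/server split described above, here is a minimal sketch. It assumes the shiny package is installed; the input and output IDs ("show", "greeting") and the greeting text are made up for this example and do not come from the original post.

```r
library(shiny)

# --- UI side (would live in ui.R in a two-file app) ---
# Declares the widgets: one checkbox and one text placeholder.
ui <- fluidPage(
  checkboxInput("show", "Show greeting", value = TRUE),
  textOutput("greeting")
)

# --- Server side (would live in server.R in a two-file app) ---
# Makes the checkbox do something: the text is re-rendered
# whenever input$show changes.
server <- function(input, output) {
  output$greeting <- renderText({
    if (input$show) "Hello from Shiny" else ""
  })
}

# To launch the app interactively from a single script, uncomment:
# shinyApp(ui = ui, server = server)
```

In the two-file layout, each half would go in its own file inside the app folder, and the app would be started with runApp() as described above. Keeping both halves in one script, as here, is only a convenience for trying the idea out.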
How to Get the Frequency Table of a Categorical Variable as a Data Frame in R
One feature that I like about R is the ability to access and manipulate the outputs of many functions. For example, you can extract the kernel density estimates from density() and scale them to ensure that the resulting density integrates to 1 over its support set. I recently needed to get a frequency table of a categorical variable in R, and I wanted the output as a data frame that I can access and manipulate. This is a fairly simple and common task in statistics and data analysis, so I thought that there must be a function in base R that can easily generate this. Sadly, I could not find such a function. In this post, I will explain why the seemingly obvious table() function does not work, and I will demonstrate how the count() function in the ‘plyr’ package can achieve this goal.
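The difference between the two functions can be sketched as follows. The data frame and its position column are hypothetical sample data, not taken from the original post, and the plyr call is guarded so that the script still runs if the package is not installed.

```r
# Hypothetical categorical variable (football positions)
players <- data.frame(
  position = c("QB", "WR", "WR", "RB", "QB", "WR"),
  stringsAsFactors = FALSE
)

# table() returns an object of class "table", not a data frame
tab <- table(players$position)
print(class(tab))
print(is.data.frame(tab))

# plyr::count() returns a data frame with one row per category
# and a `freq` column holding the counts
if (requireNamespace("plyr", quietly = TRUE)) {
  freq_df <- plyr::count(players, "position")
  print(freq_df)
  print(is.data.frame(freq_df))
}
```

Because the result of count() is an ordinary data frame, its columns can be sorted, filtered, or merged like any other.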
R + ggplot2 Graph Catalog
Joanna Zhao and Jenny Bryan’s R graph catalog is meant to be a complement to the physical book, Creating More Effective Graphs, but it’s a really nice gallery in its own right. The catalog shows a series of different data visualizations, all made with R and ggplot2. Click on any of the plots and you get the R code necessary to generate the data and produce the plot.