Why use the R Language?
A brief outline of why you might want to make the effort to learn R.
Type conversion and you (or and R)
Types and type conversion can be a tricky and intricate topic, and can sometimes lead to real head-scratcher issues in R. Hence the somewhat confusing title. This is for people still relatively new to R, and I will skip some gory details. Actually, I will skip most of them; the canonical sources for type and conversion information are the official R documentation and the help pages for the functions at hand. Instead, I thought I would walk through some examples of when the type engine can behave in seemingly odd ways, and look at what is going on when mysterious errors arise and what can be done to track down their source.
Tabula: Tabula is a tool for liberating data tables locked inside PDF files.
If you’ve ever tried to do anything with data provided to you in PDFs, you know how painful it is: there’s no easy way to copy-and-paste rows of data out of PDF files. Tabula allows you to extract that data into a CSV or Microsoft Excel spreadsheet using a simple, easy-to-use interface. Tabula works on Mac, Windows and Linux.
Simple Lower US 48 Albers Maps & Local (no-API) City/State Geocoding in R
I’ve been seeing an uptick in static US “lower 48” maps with “meh” projections this year, possibly caused by a flood of new folks resolving to learn R but using pretty old documentation or tutorials. I’ve also been seeing an uptick in folks needing to geocode US city/state to lat/lon. I thought I’d tackle both in a quick post to show how to (simply) use a decent projection for lower 48 US maps and then how to use a very basic package I wrote, localgeo, to avoid having to use an external API/service for basic city/state geocoding.
What Is Big Data Discovery?
According to Gartner, ‘Big Data Discovery’ is the next big trend in analytics. It’s the logical combination of three of the hottest trends of the last few years in analytics: Big Data, Data Discovery, and Data Science. Each of these areas has seen explosive growth, but there are clear upsides and downsides to each. For example, Data Discovery excels in ease of use, but allows only limited depth of exploration, while Data Science provides powerful analysis but is slow, complex, and difficult to implement.
An introduction to ggplot
Last week at the Davis R Users’ Group, Myfanwy Johnston gave an introduction to the powerful and ubiquitous ggplot2 package for plotting in R. See below for the screencast and her particularly enlightening figure illustrating ggplot’s syntax and conceptual approach. Myfanwy also placed all her slides, code, and links to more ggplot resources in this GitHub repository.
The Mixology of Databases
If you’re like me, you’ve wondered what your database would look like if it were a beverage. Fortunately, we don’t have to wonder anymore! The DZone Guide to Database and Persistence Management’s infographic shows you the ingredients of 29 popular databa