Installing and Starting SparkR Locally on Windows OS and RStudio
With the recent release of Apache Spark 1.4.1 on July 15th, 2015, I wanted to write a step-by-step guide to help new users get up and running with SparkR locally on a Windows machine using the command shell and RStudio. SparkR provides an R frontend to Apache Spark. By using Spark’s distributed computation engine, it allows R users to run large-scale data analysis from the R shell. The steps listed here are also documented in my online book titled ‘Getting Started with SparkR for Big Data Analysis’, which can be accessed at: http://…/. These steps will get you up and running in less than 5 minutes.
Where do letters occur in words
A while back I encountered an interesting graphic showing where letters were located in English words (http://…/graphing-distribution-of-english.html). The other day I decided to do a similar one for letters in Danish words, and for this I used R.
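As a rough illustration of the idea (not the code from the post), one way to measure where letters fall in words is to compute each letter's relative position, from 0 for the first letter to 1 for the last. The word list and the `letter_positions` helper below are hypothetical; a real analysis would read a full dictionary.

```r
# Toy word list; a real analysis would read a full dictionary file instead.
words <- c("apple", "banana", "cherry", "date")

# For each word, record every letter with its relative position in the word
# (0 = first letter, 1 = last letter). Hypothetical helper for illustration.
letter_positions <- function(words) {
  words <- tolower(words)
  rows <- lapply(words, function(w) {
    chars <- strsplit(w, "")[[1]]
    n <- length(chars)
    # A one-letter word has no first/last distinction, so place it mid-way.
    rel <- if (n > 1) (seq_len(n) - 1) / (n - 1) else 0.5
    data.frame(letter = chars, position = rel, stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

pos <- letter_positions(words)

# Mean relative position per letter: values near 0 mean the letter tends to
# start words, values near 1 mean it tends to end them.
mean_pos <- tapply(pos$position, pos$letter, mean)
print(round(sort(mean_pos), 2))
```

Plotting a histogram or density of `pos$position` per letter would give a picture along the lines of the graphic described above.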
Predicting Titanic deaths on Kaggle II: gbm
Following my previous post, I have decided to try a different method: generalized boosted regression models (gbm). I have read the background in The Elements of Statistical Learning and Arthur Charpentier’s nice post on it. This data set is a good occasion to get my hands dirty.
Roll Your Own Gist Comments Notifier in R
Here’s what you’ll grok from the code:
• one way to deal with the ‘default namespace’ issue in R+XML
• one way to deal with error checking for scraping
• how to build an XML file (and, specifically, an RSS/Atom feed) with R
• how to escape XML entities with R
• how to get an XML object as a character string in R
Logistic Growth, S Curves, Bifurcations, and Lyapunov Exponents in R
If you’ve ever wondered how logistic population growth (the Verhulst model), S curves, the logistic map, bifurcation diagrams, sensitive dependence on initial conditions, ‘orbits’, deterministic chaos, and Lyapunov exponents are related to one another… this post attempts to provide a simplified explanation(!) in just 10 steps, each with some code in R so you can explore it all yourself. I’ve included some code written by other people who have explored this problem (cited below) as portions of my own code.
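To give a flavour of what such steps can look like, here is a minimal base-R sketch (not the code from the post) of the logistic map x[n+1] = r * x[n] * (1 - x[n]) and a simple estimate of its Lyapunov exponent. The function names, starting value, and iteration counts are assumptions chosen for illustration.

```r
# Iterate the logistic map x[n+1] = r * x[n] * (1 - x[n]).
logistic_map <- function(r, x0 = 0.2, n = 200) {
  x <- numeric(n)
  x[1] <- x0
  for (i in seq_len(n - 1)) {
    x[i + 1] <- r * x[i] * (1 - x[i])
  }
  x
}

# Estimate the Lyapunov exponent as the average of log|f'(x)| along the orbit,
# where f'(x) = r * (1 - 2x). A burn-in discards the initial transient.
lyapunov <- function(r, x0 = 0.2, n = 5000, burn = 500) {
  x <- logistic_map(r, x0, n)[-seq_len(burn)]
  mean(log(abs(r * (1 - 2 * x))))
}

stable  <- logistic_map(2.8)  # settles to the fixed point 1 - 1/r
chaotic <- logistic_map(3.9)  # keeps wandering without settling

print(tail(stable, 1))
print(lyapunov(2.8))  # negative: nearby orbits converge
print(lyapunov(3.9))  # positive: nearby orbits diverge
```

Sweeping `r` over a grid and plotting the tail of each orbit against `r` would produce a bifurcation diagram; the sign change of the estimated exponent marks the transition into chaos.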
A Path Towards Easier Map Projection Machinations with ggplot2
The $DAYJOB doesn’t afford much opportunity to work with cartographic datasets, but I really like maps and tinker with shapefiles and geo-data when I can, plus answer a ton of geo-questions on StackOverflow. R makes it easy—one might even say too easy—to work with maps.
An Introduction to reshape2
reshape2 is an R package written by Hadley Wickham that makes it easy to transform data between wide and long formats.
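A minimal sketch of that round trip, assuming the reshape2 package is installed, uses `melt()` to go from wide to long and `dcast()` to go back. The data frame and its column names are made up for illustration.

```r
library(reshape2)

# Hypothetical wide-format data: one row per subject, one column per measurement.
wide <- data.frame(
  id     = c(1, 2, 3),
  height = c(170, 165, 180),
  weight = c(65, 58, 82)
)

# Wide -> long: one row per (id, measure) pair.
long <- melt(wide, id.vars = "id",
             variable.name = "measure", value.name = "value")
print(long)

# Long -> wide: cast back, with one column per level of `measure`.
wide_again <- dcast(long, id ~ measure, value.var = "value")
print(wide_again)
```

The long form is usually what plotting and modelling functions expect, while the wide form is often easier to read as a table.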