Understanding Bayes: Updating priors via the likelihood

In this post I explain how to use the likelihood to update a prior into a posterior. The simplest way to illustrate likelihoods as an updating factor is to use conjugate distribution families (Raiffa & Schlaifer, 1961). A prior and likelihood are said to be conjugate when the resulting posterior distribution belongs to the same family of distributions as the prior. This means that if you have binomial data you can use a beta prior to obtain a beta posterior. If you had normal data you could use a normal prior and obtain a normal posterior. Conjugate priors are not required for doing Bayesian updating, but they make the calculations a lot easier, so they are nice to use if you can.
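
To make the beta-binomial case concrete, here is a minimal sketch in Python (not the R code from the post itself). It assumes the standard conjugate result that a Beta(alpha, beta) prior combined with binomial data gives a Beta(alpha + successes, beta + failures) posterior. The function names and the example counts are hypothetical and chosen only for illustration.

```python
def update_beta_prior(alpha, beta, successes, failures):
    """Return the posterior Beta parameters after observing binomial data.

    Conjugacy means the update is just addition: observed successes are
    added to alpha and observed failures are added to beta.
    """
    if alpha <= 0 or beta <= 0:
        raise ValueError("Beta parameters must be positive")
    if successes < 0 or failures < 0:
        raise ValueError("counts must be non-negative")
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical example: a flat Beta(1, 1) prior and 7 successes in 10 trials.
prior = (1.0, 1.0)
posterior = update_beta_prior(*prior, successes=7, failures=3)

print("posterior parameters:", posterior)   # (8.0, 4.0)
print("prior mean:", beta_mean(*prior))     # 0.5
print("posterior mean:", beta_mean(*posterior))
```

The posterior mean moves from the prior mean of 0.5 toward the observed proportion of 0.7, which is the sense in which the likelihood acts as an updating factor.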

Installing and Starting SparkR Locally on Windows OS and RStudio
With the recent release of Apache Spark 1.4.1 on July 15th, 2015, I wanted to write a step-by-step guide to help new users get up and running with SparkR locally on a Windows machine using the command shell and RStudio. SparkR provides an R frontend to Apache Spark, and Spark’s distributed computation engine allows R users to run large-scale data analysis from the R shell. The steps listed here are also documented in my online book titled ‘Getting Started with SparkR for Big Data Analysis’, which can be accessed at: http://…/. These steps will get you up and running in less than 5 minutes.

Where do letters occur in words
A while back I encountered an interesting graphic showing where letters are located in English words (http://…/graphing-distribution-of-english.html). The other day I decided to make a similar one for letters in Danish words, and for this I used R.

Predicting Titanic deaths on Kaggle II: gbm
Following my previous post, I have decided to try a different method: generalized boosted regression models (gbm). I have read the background in Elements of Statistical Learning and Arthur Charpentier’s nice post on it. This data is a nice occasion to get my hands dirty.

Here’s what you’ll grok from the code:
• one way to deal with the ‘default namespace’ issue in R+XML
• one way to deal with error checking for scraping
• how to build an XML file (and, specifically, an RSS/Atom feed) with R
• how to escape XML entities with R
• how to get an XML object as a character string in R

Logistic Growth, S Curves, Bifurcations, and Lyapunov Exponents in R
If you’ve ever wondered how logistic population growth (the Verhulst model), S curves, the logistic map, bifurcation diagrams, sensitive dependence on initial conditions, ‘orbits’, deterministic chaos, and Lyapunov exponents are related to one another… this post attempts to provide a simplified explanation(!) in just 10 steps, each with some code in R so you can explore it all yourself. I’ve included some code written by other people who have explored this problem (cited below) as portions of my own code.
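
As a rough illustration of how two of these ideas connect, the sketch below iterates the logistic map x_{n+1} = r * x_n * (1 - x_n) and estimates its Lyapunov exponent as the long-run average of log|r * (1 - 2 * x_n)|. It is written in Python rather than R and is not the code from the post. The function names, starting value, and iteration counts are assumptions made for this example.

```python
import math


def logistic_map_orbit(r, x0, n_steps):
    """Return the orbit [x0, x1, ..., x_n] of the logistic map."""
    orbit = [x0]
    for _ in range(n_steps):
        x = orbit[-1]
        orbit.append(r * x * (1.0 - x))
    return orbit


def lyapunov_exponent(r, x0=0.2, n_transient=1000, n_steps=10000):
    """Estimate the Lyapunov exponent of the logistic map at growth rate r.

    The map's derivative is r * (1 - 2x); averaging the log of its absolute
    value along an orbit measures how fast nearby orbits separate.
    """
    x = x0
    # Discard the transient so the average reflects long-run behaviour.
    for _ in range(n_transient):
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_steps):
        derivative = abs(r * (1.0 - 2.0 * x))
        # Guard against log(0) if the orbit lands exactly on x = 0.5.
        total += math.log(max(derivative, 1e-300))
        x = r * x * (1.0 - x)
    return total / n_steps


for r in (2.5, 3.2, 3.9):
    print(f"r = {r}: estimated Lyapunov exponent = {lyapunov_exponent(r):.3f}")
```

A negative estimate indicates that nearby orbits converge (a stable fixed point or cycle), while a positive estimate signals sensitive dependence on initial conditions. With these settings the estimate should come out negative at r = 2.5 and positive at r = 3.9.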

A Path Towards Easier Map Projection Machinations with ggplot2
The $DAYJOB doesn’t afford much opportunity to work with cartographic datasets, but I really like maps and tinker with shapefiles and geo-data when I can, plus answer a ton of geo-questions on StackOverflow. R makes it easy, one might even say too easy, to work with maps.

An Introduction to reshape2
reshape2 is an R package written by Hadley Wickham that makes it easy to transform data between wide and long formats.
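
To show what "wide" and "long" mean here, the following is a small sketch in plain Python, not reshape2 itself. The helpers `melt_rows` and `cast_rows` are hypothetical names that loosely mirror the idea behind reshape2's `melt` and `dcast`. The `variable` and `value` column names follow reshape2's default output, and the sample data is made up.

```python
def melt_rows(rows, id_vars):
    """Convert wide rows (one column per measurement) to long rows
    (one row per id/measurement pair)."""
    long_rows = []
    for row in rows:
        ids = {key: row[key] for key in id_vars}
        for key, value in row.items():
            if key not in id_vars:
                long_rows.append({**ids, "variable": key, "value": value})
    return long_rows


def cast_rows(long_rows, id_vars):
    """Convert long rows back to wide rows, one row per id combination."""
    wide = {}
    for row in long_rows:
        group = tuple(row[key] for key in id_vars)
        # Start each wide row with its id columns, then add measurements.
        wide_row = wide.setdefault(group, {key: row[key] for key in id_vars})
        wide_row[row["variable"]] = row["value"]
    return list(wide.values())


# Made-up wide data: one row per subject, one column per measurement.
wide_data = [
    {"subject": "a", "height": 170, "weight": 65},
    {"subject": "b", "height": 182, "weight": 80},
]

long_data = melt_rows(wide_data, id_vars=["subject"])
for row in long_data:
    print(row)

# Casting the long data back recovers the original wide rows.
print(cast_rows(long_data, id_vars=["subject"]) == wide_data)
```

In the long format every measurement gets its own row, which is the shape many plotting and modelling functions expect.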