Univariate stats sometimes fail, while multivariate modelings work well
In digital marketing, especially online, marketers and analysts love to run A/B tests to find the metric that most influences KGIs/KPIs among a huge set of explanatory metrics, such as creative components of the UI, choice of ads, background images of the page, etc. Such an influential metric is sometimes called a ‘golden feature’ or ‘golden metric’ (ridiculous as that sounds), and many people search hard for one, firmly believing that ‘once the metric is found, we can very easily raise revenue and/or profit just by raising the golden metric!!’. Ironically, quite a few A/B tests are run on that basis. But is it really true? If you find such a golden metric, can you really raise revenue, gather more users, or get more conversions? In some cases, yes; however, you should also look at a case where, in theory, it cannot work.

Getting started with Cloud Computing using R Programming
In this article, I explain the concept of cloud computing using R programming and RStudio in a step-wise manner. You will also learn about the benefits of using R for cloud computing over other software and programming languages.

Document Clustering with Python
In this guide, I will explain how to cluster a set of documents using Python. My motivating example is to identify the latent structures within the synopses of the top 100 films of all time (per an IMDB list).
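
As a rough illustration of what clustering documents involves, the sketch below turns a few synopses into TF-IDF vectors and groups any two whose cosine similarity exceeds a threshold. It uses only the Python standard library. The sample synopses, the stopword list, the function names, and the threshold-based grouping are all assumptions made for illustration; this is not necessarily the method used in the linked guide.

```python
import math
from collections import Counter

# Made-up one-line synopses, for illustration only.
SYNOPSES = [
    "a mob boss hands his crime family to his son",
    "a crime family fights a rival mob boss",
    "a farm boy joins rebels to fight an empire in space",
    "rebels in space attack the empire battle station",
]

# Tiny hand-picked stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "the", "his", "to", "in"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def tfidf_vectors(docs):
    """Return one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: c * math.log(n_docs / df[t]) for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(docs, threshold=0.1):
    """Label documents so that any chain of pairs above the threshold shares a label."""
    vectors = tfidf_vectors(docs)
    labels = [None] * len(docs)
    next_label = 0
    for start in range(len(docs)):
        if labels[start] is not None:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:  # depth-first walk over the similarity graph
            i = stack.pop()
            for j in range(len(docs)):
                if labels[j] is None and cosine(vectors[i], vectors[j]) > threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


print(cluster(SYNOPSES))
```

On real synopses you would typically add stemming (so that "fight" and "fights" match) and a proper clustering algorithm such as k-means; the threshold grouping here is only the simplest thing that shows the idea.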

Simple Data Analysis Using Apache Spark
The purpose of this tutorial is to walk through a simple Spark example: setting up the development environment and doing some simple analysis on a sample data file composed of userId, age, gender, profession, and zip code (you can download the source and the data file from GitHub https://…/SimpleSparkAnalysis).
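
To give a feel for the kind of analysis such a file supports, here is a minimal sketch in plain Python rather than Spark. The pipe-delimited layout, the sample rows, and the function names are assumptions for illustration; the tutorial's actual file format and Spark code may differ.

```python
from collections import Counter, defaultdict

# Made-up rows in the assumed order: userId|age|gender|profession|zip code.
SAMPLE = """1|24|M|technician|10001
2|53|F|other|20002
3|23|M|writer|30003
4|24|M|technician|40004
5|33|F|other|50005"""


def parse(line):
    """Split one pipe-delimited line into a record dictionary."""
    user_id, age, gender, profession, zip_code = line.split("|")
    return {
        "userId": int(user_id),
        "age": int(age),
        "gender": gender,
        "profession": profession,
        "zip": zip_code,  # kept as a string so leading zeros survive
    }


def profession_counts(records):
    """Count how many users list each profession."""
    return Counter(r["profession"] for r in records)


def mean_age_by_gender(records):
    """Average age per gender value."""
    totals = defaultdict(lambda: [0, 0])  # gender -> [sum of ages, count]
    for r in records:
        totals[r["gender"]][0] += r["age"]
        totals[r["gender"]][1] += 1
    return {g: total / count for g, (total, count) in totals.items()}


records = [parse(line) for line in SAMPLE.splitlines()]
print(profession_counts(records).most_common())
print(mean_age_by_gender(records))
```

The same two steps, counting by key and averaging by key, are the kind of aggregation a Spark job would distribute across a cluster once the file no longer fits on one machine.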

SPARQL with R in less than 5 minutes
In this article we’ll get up and running on the Semantic Web in less than 5 minutes using SPARQL with R. We’ll begin with a brief introduction to the Semantic Web then cover some simple steps for downloading and analyzing government data via a SPARQL query with the SPARQL R package.

Creating your personal, portable R code library with GitHub
I have a few helper functions I’ve created that I commonly use in my work. Until recently, I manually included these functions at the start of my R scripts by either the tried and true copy-and-paste method, or by extracting them from a local file with the source() function. The former approach has the benefit of keeping the helper code inextricably attached to the main script, but it adds a good bit of code to wade through. The latter approach keeps the code cleaner, but requires that whoever is running the code always has access to the sourced file and that it is always in the same relative path – and that makes sharing or moving code more difficult. The start of a recent project requiring me to share my helper function library prompted me to find a better solution. The resulting approach takes advantage of GitHub Gists and R’s ability to source via a web-based location to enable you to create a personal, portable library of R functions for private use or to share.

R Helper Functions
If you do a lot of R programming, you probably have a list of R helper functions set aside in a script that you include on R startup or at the top of your code. In some cases helper functions add capabilities that aren’t otherwise available. In other cases, they replicate functionality that is available elsewhere without loading unnecessary components. Below I present two of my most frequently used data manipulation helper functions as examples.

Progress bars in R using winProgressBar
Using progress bars in R scripts can provide valuable timing feedback during development and additional polish to final products. On Windows, winProgressBar and setWinProgressBar are the primary functions for creating progress bars in R.

Organize a walk around London with R
The subtitle of this post could be ‘How to plot multiple elements on interactive web maps in R’. In this experiment I will show how to include multiple elements in interactive maps created using both plotGoogleMaps and leafletR. To complete the work presented here you will need the following packages: sp, raster, plotGoogleMaps and leafletR.

Using Hadoop with R: It Depends.
In the course of working with our Hadoop users, we are often asked: what’s the best way to integrate R with Hadoop? The answer, in nearly all cases, is: it depends.