Another example of why centering predictors can be a good idea
You can perhaps already see what happened. The other researchers noticed that the two lines had essentially the same intercept, and they concluded that the two groups differed only in the slopes, not the level of the lines. But this was a mistake in the context of the data. It’s similar to the example that Jennifer and I give in our book with the regressions of earnings on height. The intercept of such a regression doesn’t have much meaning, as it corresponds to the earnings of someone with a height of zero inches.
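To make the point concrete, here is a minimal sketch in Python with NumPy. The numbers are made up for illustration and are not the data from the post; the group names, the reference height of 66 inches, and the helper `fit_line` are all assumptions. Two groups share the same intercept at a height of zero but have different slopes, and centering the predictor shows that their levels differ at a realistic height.

```python
import numpy as np

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    slope, intercept = np.polyfit(x, y, 1)
    return intercept, slope

# Hypothetical data: heights in inches, earnings in dollars (made-up numbers).
height = np.array([60.0, 63.0, 66.0, 69.0, 72.0, 75.0])
earn_a = 1000.0 + 300.0 * height  # group A
earn_b = 1000.0 + 400.0 * height  # group B

center = 66.0  # assumed reference height; any typical value would do

# Uncentered fits: the intercept is the predicted earnings at height 0.
raw_a = fit_line(height, earn_a)
raw_b = fit_line(height, earn_b)

# Centered fits: the intercept is the predicted earnings at the reference height.
cen_a = fit_line(height - center, earn_a)
cen_b = fit_line(height - center, earn_b)

print("raw intercepts:     ", raw_a[0], raw_b[0])
print("centered intercepts:", cen_a[0], cen_b[0])
print("slopes (unchanged): ", cen_a[1], cen_b[1])
```

Centering leaves the slopes alone and only moves the point at which the intercept is evaluated, so the comparison of levels happens inside the range of the data rather than at an impossible height of zero.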

From Big Data to Intelligent Applications
Is there a bigger purpose for all the work that small and large companies are doing in the Big Data domain? Bigger than trying to find needles of insight in haystacks of data? My answer is a definite ‘Yes’. Big Data’s ultimate purpose is to make humans more intelligent! It’s to save us from some (yes, not all!) of the dumb decisions that we make! Doing things more intelligently by enhancing our natural capabilities via some sort of tool has been a dream not unlike the dream of building the iron horse or the flying machine. These are the kinds of dreams that were, and still are, immensely challenging for humans to bring to reality, but we just cannot give up on achieving them. Building these tools is part of our innate desire to explore the world beyond what is naturally available to us.

What is Linear Regression? A Qualitative Exploration
When it comes to statistical modeling, few things are as tried and tested as linear regression. It’s simple, (fairly) easy to conceptualize, and fast. Unfortunately, most of the articles I’ve read about it feel closer to math textbooks than to layman’s definitions. In this post I’ll give a fairly informal definition of linear regression, give an overview of its goals, and talk about a few things you can use it for.

MapReduce for C: Run Native Code in Hadoop
We are pleased to announce the release of MapReduce for C (MR4C), an open source framework that allows you to run native code in Hadoop.

Achieve Big Results with Big Data
People get excited about Big Data, not just because it’s big but also because it holds the promise of providing big impact and return for the business. But Big Data by itself is of little use. And so far, few organizations have been able to harness it and make a big impact on their business, whether it’s for customer retention, improving financial planning, or optimizing marketing campaigns.

Host a CRAN mirror using Docker
CRAN mirrors are the backbone of everyday R usage. They provide the R website and most of the R packages available today. Currently there are about 104 official CRAN mirrors. Hosting a CRAN mirror is one way to help the R community and is explained here.

Customer segmentation – LifeCycle Grids, CLV and CAC with R
We studied a very powerful approach for customer segmentation in the previous post, which is based on the customer’s lifecycle. We used two metrics: frequency and recency. It is also possible and very helpful to add monetary value to our segmentation. If you have customer acquisition cost (CAC) and customer lifetime value (CLV), you can easily add these data to the calculations.
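As a rough illustration of the idea (the post itself works in R, and this is not its code), the sketch below computes frequency and recency per customer in Python and places a monetary figure next to them. The order log, the CLV and CAC values, the reporting date, and the choice of CLV minus CAC as the monetary figure are all assumptions made for the example.

```python
from datetime import date

# Hypothetical order log: (customer_id, order_date). Made-up data.
orders = [
    ("a", date(2015, 1, 5)),
    ("a", date(2015, 2, 10)),
    ("a", date(2015, 2, 20)),
    ("b", date(2014, 11, 1)),
    ("c", date(2015, 2, 25)),
]

# Hypothetical per-customer lifetime value and acquisition cost.
clv = {"a": 300.0, "b": 40.0, "c": 90.0}
cac = {"a": 50.0, "b": 60.0, "c": 30.0}

today = date(2015, 3, 1)  # assumed reporting date

def lifecycle_metrics(orders, today):
    """Return {customer: (frequency, recency_in_days)}."""
    seen = {}
    for customer, day in orders:
        freq, last = seen.get(customer, (0, None))
        if last is None or day > last:
            last = day
        seen[customer] = (freq + 1, last)
    return {c: (f, (today - last).days) for c, (f, last) in seen.items()}

metrics = lifecycle_metrics(orders, today)

# One simple way to add money to the grid: CLV minus CAC per customer.
summary = {c: (f, r, clv[c] - cac[c]) for c, (f, r) in metrics.items()}

for customer, (freq, recency, net) in sorted(summary.items()):
    print(customer, freq, recency, net)
```

Frequency and recency give each customer a position in the grid, and the third value shows whether customers in that position are worth more than they cost to acquire.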

Myth Busting Artificial Intelligence
We’ve all been seeing hype and excitement around artificial intelligence, big data, machine learning and deep learning. There’s also a lot of confusion about what they really mean and what’s actually possible today. These terms are used arbitrarily and sometimes interchangeably, which further perpetuates confusion. So, let’s break down these terms and offer some perspective.