Telling What’s True About Power, if practicing within the error-statistical tribe

Suppose you are reading about a statistically significant result x …

Introduction to Monte Carlo Methods

I’m going to keep this tutorial light on math, because the goal is just to give a general understanding. The idea of Monte Carlo methods is this: generate random samples of some random variable of interest, then use those samples to compute the values you’re interested in.
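The idea can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the tutorial: the helper names `monte_carlo_estimate` and `standard_normal`, the choice of a standard normal variable, the sample size, and the seed are all assumptions made for the example. It draws samples and averages a function of them to approximate an expectation.

```python
import random


def monte_carlo_estimate(sampler, func, n_samples, seed=0):
    """Approximate E[func(X)] by averaging func over random draws of X.

    sampler(rng) is a hypothetical callback that returns one draw of X
    using the supplied random.Random instance.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(n_samples):
        total += func(sampler(rng))
    return total / n_samples


def standard_normal(rng):
    """One draw from a normal distribution with mean 0 and std dev 1."""
    return rng.gauss(0.0, 1.0)


# E[X^2] for a standard normal is exactly 1, so the estimate should be near 1.
second_moment = monte_carlo_estimate(standard_normal, lambda x: x * x, 100_000)

# P(X > 1) is the expectation of an indicator function; the exact value
# is roughly 0.1587.
tail_prob = monte_carlo_estimate(
    standard_normal, lambda x: 1.0 if x > 1.0 else 0.0, 100_000
)

print("Estimated E[X^2]:", second_moment)
print("Estimated P(X > 1):", tail_prob)
```

Both quantities are computed the same way, as a sample average, which is why one helper covers them. The error of such an estimate typically shrinks in proportion to one over the square root of the number of samples.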

24 Data Science, R, Python, Excel, and Machine Learning Cheat Sheets

Here’s a good starting point.

Analytical Data Marts – data analyst’s indispensable tool

Information about provided services, customers and transactions can be stored in different database systems and data warehouses, depending on how a company operates. As a result, even the simplest analysis or report may require a significant amount of time, as well as in-depth knowledge of the database systems and their availability. For an analyst, this situation is a frequent source of difficulty: a lack of required information, or of time to analyze source data, may lead to errors in the resulting analyses and, in consequence, to financial losses.

GAM: The Predictive Modeling Silver Bullet

Imagine that you step into a room of data scientists; the dress code is casual and the scent of strong coffee is hanging in the air. You ask the data scientists if they regularly use generalized additive models (GAM) to do their work. Very few will say yes, if any at all.

Predict Social Network Influence with R and H2O Ensemble Learning

H2O is an awesome machine learning framework. It is really great for data scientists and business analysts ‘who need scalable and fast machine learning’. H2O is completely open source, and what makes it important is that it works right out of the box. There seems to be no easier way to start with scalable machine learning. It has support for R, Python, Scala and Java, and also has a REST API and its own web UI. So you can use it for research but also in production environments. H2O integrates with Apache Hadoop and Apache Spark, and its in-memory parallel processing gives it enormous power.

Learn Big Data Analytics using Top YouTube Videos, TED Talks & other resources

Companies have invested heavily in Big Data in the last few years. This rise in the use of big data analytics has resulted in high demand for skilled big data professionals. While there has been a lot of debate over the usefulness of this spending, there is a clear increase in Big Data jobs.

Taxi Trajectory Winners’ Interview: 1st place, Team

Taxi Trajectory Prediction was the first of two competitions that we hosted for the 2015 ECML PKDD conference on machine learning. Team Taxi took first place using deep learning tools developed at the MILA lab where they currently study. In this post, they share more about the competition and their winning approach.

How Google Translate squeezes deep learning onto a phone

Today we announced that the Google Translate app now does real-time visual translation of 20 more languages. So the next time you’re in Prague and can’t read a menu, we’ve got your back. But how are we able to recognize these new languages?

15 Questions All R Users Have About Plots

1. How To Draw An Empty R Plot?
2. How To Set The Axis Labels And Title Of The R Plots?
3. How To Add And Change The Spacing Of The Tick Marks Of Your R Plot?

MRAN’s Packages Spotlight

At MRAN we are attempting to provide some help with the problem of keeping up with what’s new through the old-fashioned (pre-machine learning) practice of making some idiosyncratic, but not completely capricious, human-generated recommendations. With every new release of RRO, we publish brief descriptions of packages on the Package Spotlight page in three categories: New Packages, Updated Packages and GitHub Packages. None of these lists is intended to be either comprehensive or complete in any sense.

Hockey Elbow and Other Response Time Injuries

You’ve heard of tennis elbow. Well, there’s a non-sports, performance injury that I like to call hockey elbow. An example of such an ‘injury’ is shown in Figure 1, which appeared in a recent computer performance analysis presentation. It’s a reminder of how easy it is to become complacent when doing performance analysis and possibly end up reaching the wrong conclusion.