Call matplotlib from R
I often use Python and matplotlib for exploring measurement data (e.g. from accelerometers), even when I use R for the actual analysis. The reason is that I like to be able to flexibly zoom into different parts of a plot with the mouse, and this works well for me in matplotlib. So I decided to try calling matplotlib from R using Rcpp and the Python/C API. It was surprisingly simple to get working, and I put together a small R package, Rpyplot. The package seems to work well on Ubuntu and Windows 7 for my use cases. A lot of the code is based on the informative Call Python from R through Rcpp post in the Rcpp Gallery. I decided not to use Boost.Python to keep compiling on Windows simpler. This post explains how I implemented the package and hopefully it will also allow others to extend it for their needs. If you do implement additional functionality for Rpyplot, I'd appreciate a pull request on GitHub.
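Here is a rough sketch of the kind of usage this enables. The function names (pyplot, pyrun, pyshow) and the assumption that matplotlib.pyplot is available as plt in the embedded interpreter are my own illustration and may differ from the package's actual interface; check the documentation on GitHub for the real API.

```r
# Sketch only: function names and the embedded "plt" alias are assumptions,
# not a definitive reference to the Rpyplot API.
library(Rpyplot)

time_s <- seq(0, 10, by = 0.01)                              # time axis in seconds
acc    <- sin(2 * pi * time_s) + rnorm(length(time_s), sd = 0.1)  # mock accelerometer trace

pyplot(time_s, acc)                  # push the R vectors to Python and plot them
pyrun("plt.xlabel('Time (s)')")      # run an arbitrary matplotlib command as a string
pyshow()                             # open the interactive matplotlib window for zooming
```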
A Probabilistic Theory of Deep Learning
A grand challenge in machine learning is the development of computational algorithms that match or outperform humans in perceptual inference tasks that are complicated by nuisance variation. For instance, visual object recognition involves unknown object position, orientation, and scale, while speech recognition involves unknown voice pronunciation, pitch, and speed. Recently, a new breed of deep learning algorithms has emerged for such high-nuisance inference tasks, routinely yielding pattern recognition systems with near- or super-human capabilities. But a fundamental question remains: Why do they work? Intuitions abound, but a coherent framework for understanding, analyzing, and synthesizing deep learning architectures has remained elusive. We answer this question by developing a new probabilistic framework for deep learning based on the Deep Rendering Model: a generative probabilistic model that explicitly captures latent nuisance variation. By relaxing the generative model to a discriminative one, we can recover two of the current leading deep learning systems, deep convolutional neural networks and random decision forests, providing insights into their successes and shortcomings, as well as a principled route to their improvement.
Bags, Balls and the Hypergeometric Distribution
A friend came to me with a question. The original question was a little complicated, but in essence it could be explained in terms of the familiar urn problem. So, here’s the problem: you have an urn with 50 white balls and 9 black balls. The black balls are individually numbered. Balls are drawn from the urn without replacement. What is the probability that
1. the last ball drawn from the urn is a black ball (Event 1) and
2. when the urn is refilled, the first ball drawn will be the same black ball (Event 2).
My friend thought that this would be an extremely rare event. My intuition also suggested that it would be rather unlikely. As it turns out, we were both surprised.
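As a quick check of that intuition, here is a short R simulation that estimates the probability of both events together and compares it with the exact value. The setup below is my own sketch, not code from the original post.

```r
# Estimate P(Event 1 and Event 2) by simulation and compare with the exact value.
set.seed(1)

n_white <- 50
n_black <- 9
n_total <- n_white + n_black
balls   <- c(rep(0, n_white), 1:n_black)   # 0 = white, 1..9 = numbered black balls

both_events <- function() {
  emptied <- sample(balls)                 # draw every ball without replacement
  last    <- emptied[n_total]              # Event 1 requires this to be black
  refill  <- sample(balls, 1)              # first draw after refilling the urn
  last > 0 && refill == last               # same numbered black ball both times?
}

mean(replicate(1e5, both_events()))        # simulated probability

# Exact reasoning: each of the 9 black balls is equally likely to end up last
# (9/59), and the first draw from the refilled urn matches that specific ball
# with probability 1/59.
(n_black / n_total) * (1 / n_total)
```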