Accuracy, Precision, Recall and the Glass of Water

Accuracy, precision and recall are three basic scores for measuring the performance of a model on machine learning classification tasks. Accuracy is rather easy to grasp: take the number of correct predictions and divide it by the total number of predictions, correct and wrong ones alike. Naturally, if there are no wrong predictions, correct predictions divided by correct predictions gives us 1. Perfect accuracy!
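As a minimal sketch of these three scores for a binary classifier, the snippet below counts true positives, false positives and false negatives by hand. The function name `classification_scores` and the toy labels are made up for illustration; the precision and recall formulas are the standard definitions, which the blurb above names but does not spell out.

```python
def classification_scores(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Accuracy: correct predictions divided by all predictions.
    accuracy = correct / len(pairs)
    # Guard against division by zero when no positives were predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical toy data: 8 predictions, 6 of them correct.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
accuracy, precision, recall = classification_scores(y_true, y_pred)
print(accuracy, precision, recall)
```

With no wrong predictions at all, the same function returns 1.0 for all three scores.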

Big Ideas in AI for the Next 10 Years

Despite our concerns about China taking the lead in AI, our own government efforts, mostly through DARPA, continue to provide strong leadership and funding to maintain our lead. Here’s their plan for keeping that lead over the next decade.

Reinforcement learning is going mainstream. Here’s what to expect.

The same tech that’s beating world champions in games will soon revolutionize anything that can be simulated (and because everything is physics at its core, over time that’s everything).

But, What Exactly Is AI?

For years, people viewed computers as machines that can perform mathematical operations at a much quicker rate than humans. They were initially seen as computational machines, like glorified calculators. Early scientists felt that computers could never simulate the human brain. Then scientists, researchers and (probably most importantly) science fiction authors started asking ‘or could it?’ The biggest obstacle to solving this problem came down to one major issue: the human mind can do things that scientists couldn’t understand, much less approximate.

Using SQL to detect outliers

SQL doesn’t have the features of a language like R or Python, but that doesn’t mean you can’t use it to perform an initial clean of your data by looking for abnormal points or outliers.
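As one possible illustration (not necessarily the approach the linked article takes), the sketch below runs a plain SQL query through Python's built-in `sqlite3` module and flags rows that lie more than `k` standard deviations from the mean. The table `readings`, its columns and the sample values are hypothetical. The query compares squared distances so that it does not rely on a `SQRT` or `STDDEV` function, which not every SQL engine provides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
values = [10.0, 11.0, 9.0, 10.0, 12.0, 10.0, 11.0, 9.0, 10.0, 95.0]
conn.executemany("INSERT INTO readings (value) VALUES (?)", [(v,) for v in values])

# Flag rows whose squared distance from the mean exceeds k^2 * variance,
# which is equivalent to |value - mean| > k * standard deviation.
OUTLIER_SQL = """
WITH stats AS (
    SELECT AVG(value) AS mean,
           AVG(value * value) - AVG(value) * AVG(value) AS variance
    FROM readings
)
SELECT r.id, r.value
FROM readings AS r, stats AS s
WHERE (r.value - s.mean) * (r.value - s.mean) > :k * :k * s.variance
ORDER BY r.id
"""

outliers = conn.execute(OUTLIER_SQL, {"k": 2.0}).fetchall()
print(outliers)
```

The threshold `k` is a judgment call; a single extreme value also inflates the mean and variance it is compared against, so this is best treated as a first-pass filter.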

From zero to Real-Time Hand Keypoints detection in five months with OpenCV, Tensorflow, and Fastai

In this article, I will show you, step by step, how to build your own real-time hand keypoints detector with OpenCV, Tensorflow and Fastai (Python 3.7). I will focus on the challenges I faced while building it during a fascinating, intensive five-month journey.

A Gentle Introduction to Maximum Likelihood Estimation and Maximum A Posteriori Estimation

Maximum Likelihood Estimation (MLE) and Maximum A Posteriori (MAP) estimation are methods of estimating the parameters of statistical models. Despite a bit of advanced mathematics behind the methods, the ideas behind MLE and MAP are quite simple and intuitively understandable. In this article, I’m going to explain what MLE and MAP are, with a focus on the intuition of the methods along with the mathematics behind them.
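To make the difference concrete, here is a small sketch for a coin-flip (Bernoulli) model; the example and the Beta prior are assumptions chosen for illustration and are not taken from the article. The MLE is the observed frequency of heads, while the MAP estimate is the mode of the posterior under a Beta(alpha, beta) prior, which pulls the estimate toward the prior.

```python
def bernoulli_mle(heads, flips):
    """MLE of the heads probability: the observed frequency."""
    return heads / flips


def bernoulli_map(heads, flips, alpha, beta):
    """MAP estimate under a Beta(alpha, beta) prior (mode of the posterior).

    The posterior is Beta(heads + alpha, flips - heads + beta); the formula
    below is its mode, valid when both posterior parameters exceed 1.
    """
    return (heads + alpha - 1) / (flips + alpha + beta - 2)


heads, flips = 7, 10
mle = bernoulli_mle(heads, flips)
# A Beta(5, 5) prior encodes a belief that the coin is roughly fair.
map_estimate = bernoulli_map(heads, flips, alpha=5, beta=5)
# A flat Beta(1, 1) prior adds no information, so MAP coincides with MLE.
flat_prior_map = bernoulli_map(heads, flips, alpha=1, beta=1)
print(mle, map_estimate, flat_prior_map)
```

With 7 heads in 10 flips the MLE is 0.7, while the Beta(5, 5) prior moves the MAP estimate to 11/18, closer to 0.5.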

Regression or Classification? Linear or Logistic?

If it’s one of the former options, then you should use a regression model. This means that if you’re trying to predict quantities like height, income, price, or scores, you should be using a model that outputs a continuous number. If instead the target is the probability of an observation having a binary label (e.g. the probability of being good instead of bad), then you should also choose a regression model, but the models you use will be slightly different. These models are evaluated by the mean squared error (MSE, analogous to a variance) and the root mean squared error (RMSE, analogous to a standard deviation) to quantify the amount of error within the model.
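The two error measures mentioned above can be sketched in a few lines; the helper names and the toy numbers are illustrative only.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: MSE brought back to the target's units."""
    return math.sqrt(mse(y_true, y_pred))


actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]
mse_value = mse(actual, predicted)
rmse_value = rmse(actual, predicted)
print(mse_value, rmse_value)
```

Here the residuals are 1, 0, -2 and -1, so the MSE is 6 / 4 = 1.5 and the RMSE is its square root.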

How to Lie with P-values

P-values are used in statistics and scientific publications, much less so in machine learning applications where re-sampling techniques are favored and easy to implement today thanks to modern computing power. In some sense, p-values are a relic from old times, when computing power was limited and mathematical / theoretical formulas were favored and easier to deal with than lengthy computations.
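One example of the kind of resampling technique alluded to above is a permutation test. The sketch below is an assumption-laden illustration rather than anything from the article: it enumerates every way of splitting the pooled data into two groups and reports how often the absolute difference in means is at least as large as the observed one. The function name and sample data are hypothetical, and full enumeration is only practical for small samples.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def mean_diff(indices):
        sum_a = sum(pooled[i] for i in indices)
        return sum_a / n_a - (total - sum_a) / n_b

    # The first n_a pooled entries are the original group A.
    observed = abs(mean_diff(range(n_a)))
    count = 0
    extreme = 0
    for indices in combinations(range(len(pooled)), n_a):
        count += 1
        # Small tolerance so floating-point ties count as "at least as extreme".
        if abs(mean_diff(indices)) >= observed - 1e-12:
            extreme += 1
    return extreme / count


p = permutation_p_value([12, 14, 15, 16], [8, 9, 10, 11])
print(p)
```

For larger samples, the usual shortcut is to draw a few thousand random shuffles instead of enumerating all splits.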

CNN Heat Maps: Class Activation Mapping (CAM)

This is the first post in an upcoming series about different techniques for visualizing which parts of an image a CNN is looking at in order to make a decision. Class Activation Mapping (CAM) is one technique for producing heat maps to highlight class-specific regions of images.

AutoML – A Tool to Improve Your Workflow (Updated)

After publishing my original article on this topic, Erin LeDell – Chief machine learning scientist at, provided me with some great feedback on the article. As a response to this I decided to include her comments in this article to demonstrate how easy it can be to use the AutoML features.

The Future of AI in the Face of Data Famine

The field of artificial intelligence research was founded as an academic discipline in 1956. Despite a history of 60 years, the field is still at its very beginning, and it faces a bumpier road ahead than similar disciplines, mainly because of challenges around ethics and the availability of data.

Open-sourcing AI Habitat, an advanced simulation platform for embodied AI research

From a robot asked to ‘grab my phone from the desk upstairs’ to a device that helps its visually impaired wearer navigate an unfamiliar subway system, the next generation of AI-powered assistants will need to demonstrate a broad range of abilities. Many researchers believe the most effective way to develop these skills is to focus on embodied AI, which uses interactive environments to ground systems’ training in the real world, rather than relying on static data sets. To accelerate progress in this space, we’re sharing AI Habitat, a new simulation platform created by Facebook AI that’s designed to train embodied agents (such as virtual robots) in photo-realistic 3D environments. Our goal in sharing AI Habitat is to provide the most universal simulator to date for embodied research, with an open, modular design that’s both powerful and flexible enough to bring reproducibility and standardized benchmarks to this subfield. To illustrate the benefits of this new platform, we’re also sharing Replica, a data set of hyperrealistic 3D reconstructions of a staged apartment, retail store, and other indoor spaces that were generated by a group of scientists within Facebook Reality Labs (FRL). To our knowledge, this data set contains the most photo-realistic 3D reconstructions of environments available. This level of detail narrows the training gap between virtual and physical spaces, which we believe is important for transferring the skills learned in simulation to the real world.

Towards multiverse databases

A typical backing store for a web application contains data for many users. The application makes queries on behalf of an authenticated user, but it is up to the application itself to make sure that the user only sees data they are entitled to see. ‘Any frontend can access the whole store, regardless of the application user consuming the results. Therefore, frontend code is responsible for permission checks and privacy-preserving transformations that protect user’s data. This is dangerous and error-prone, and has caused many real-world bugs … the trusted computing base (TCB) effectively includes the entire application.’ The central idea behind multiverse databases is to push the data access and privacy rules into the database itself. The database takes on responsibility for authorization and transformation, and the application retains responsibility only for authentication and correct delegation of the authenticated principal on a database call. Such a design rules out an entire class of application errors, protecting private data from accidentally leaking.

Learning Sparse Networks Using Targeted Dropout

Neural networks are easier to optimise when they have many more weights than are required for modelling the mapping from inputs to outputs. This suggests a two-stage learning procedure that first learns a large net and then prunes away connections or hidden units. But standard training does not necessarily encourage nets to be amenable to pruning. We introduce targeted dropout, a method for training a neural network so that it is robust to subsequent pruning. Before computing the gradients for each weight update, targeted dropout stochastically selects a set of units or weights to be dropped using a simple self-reinforcing sparsity criterion and then computes the gradients for the remaining weights. The resulting network is robust to post hoc pruning of weights or units that frequently occur in the dropped sets. The method improves upon more complicated sparsifying regularisers while being simple to implement and easy to tune.
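A rough sketch of the selection step described above, under stated assumptions: the abstract does not spell out the sparsity criterion, so this example assumes weight magnitude, treats the lowest-magnitude fraction `gamma` of weights as drop candidates, and drops each candidate with probability `alpha`. The function name, parameter names and flat list of weights are illustrative, not the authors' implementation.

```python
import random


def targeted_dropout(weights, gamma, alpha, rng):
    """Return a copy of `weights` with a targeted-dropout-style mask applied.

    gamma: fraction of lowest-magnitude weights that are drop candidates
           (assumed criterion; not stated in the abstract).
    alpha: probability of dropping each candidate.
    """
    n_candidates = int(gamma * len(weights))
    # Indices ordered by |w|, smallest first; the first n_candidates are targeted.
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    candidates = by_magnitude[:n_candidates]

    masked = list(weights)
    for i in candidates:
        if rng.random() < alpha:
            masked[i] = 0.0
    return masked


weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2]
# alpha=1.0 drops every candidate, which makes the result deterministic.
masked = targeted_dropout(weights, gamma=0.5, alpha=1.0, rng=random.Random(0))
print(masked)
```

In training, a mask like this would be resampled before each weight update, so that gradients are computed only for the weights that remain; after training, the targeted weights can be pruned outright.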

Deep Learning Models

A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.

AI Software Reveals the Inner Workings of Short-term Memory

Research by neuroscientists at the University of Chicago shows how short-term, working memory uses networks of neurons differently depending on the complexity of the task at hand. The researchers used modern artificial intelligence (AI) techniques to train computational neural networks to solve a range of complex behavioral tasks that required storing information in short term memory. The AI networks were based on the biological structure of the brain and revealed two distinct processes involved in short-term memory. One, a ‘silent’ process where the brain stores short-term memories without ongoing neural activity, and a second, more active process where circuits of neurons fire continuously. The study, led by Nicholas Masse, PhD, a senior scientist at UChicago, and senior author David Freedman, PhD, professor of neurobiology, was published this week in Nature Neuroscience. ‘Short-term memory is likely composed of many different processes, from very simple ones where you need to recall something you saw a few seconds ago, to more complex processes where you have to manipulate the information you are holding in memory,’ Freedman said. ‘We’ve identified how two different neural mechanisms work together to solve different kinds of memory tasks.’