Causality and graphical methods

Some of the ideas we use intuitively every day, such as causality, are not the easiest to formalise. There’s a famous saying, ‘Correlation does not imply causality’, which expresses the basic difficulty of formalising causal processes. To simplify, it means that just because two events always occur one after the other, the first did not necessarily cause the second. This book club will explore Judea Pearl’s Causality: Models, Reasoning, and Inference, an attempt to formalise causality, and will essentially take the form of the notes I make while reading the book. While others learn differently, I find that, when exploring a formal topic, it helps to have an informal understanding first. That informal view may turn out to be simplified or even incorrect, but the overview gives me a conceptual framework on which to hang new ideas. So I’m going to start my reading of Causality with the epilogue, which is the text of a less formal speech summarising Pearl’s views.

The Microsoft Infer.NET machine learning framework goes open source

It isn’t every day that one gets to announce that one of the top-tier cross-platform frameworks for model-based machine learning is open to one and all worldwide. We’re extremely excited today to open source Infer.NET on GitHub under the permissive MIT license for free use in commercial applications. Open sourcing Infer.NET represents the culmination of a long and ambitious journey. Our team at Microsoft Research in Cambridge, UK embarked on developing the framework back in 2004. We’ve learned a lot along the way about making machine learning solutions that are scalable and interpretable. Infer.NET initially was envisioned as a research tool and we released it for academic use in 2008. As a result, there have been hundreds of papers published using the framework across a variety of fields, everything from information retrieval to healthcare. In 2012 Infer.NET even won a Patents for Humanity award for aiding research in epidemiology, genetic causes of disease, deforestation and asthma.

A Step-by-Step Introduction to the Basic Object Detection Algorithms (Part 1)

How much time have you spent looking for lost room keys in an untidy, messy house? It happens to the best of us and to this day remains an incredibly frustrating experience. But what if a simple computer algorithm could locate your keys in a matter of milliseconds? That is the power of object detection algorithms. While this was a simple example, the applications of object detection span many diverse industries, from round-the-clock surveillance to real-time vehicle detection in smart cities. In short, these are powerful deep learning algorithms.

Machine Learning Black Friday Dataset

In this tutorial, you will learn about filling null values, preprocessing data, reducing dimensionality using PCA, and splitting data using K-Fold.

Add value to your visualizations in R

One of the most in-demand skills for data analysts and data scientists in 2018 is visualization. Organizations and companies have a huge demand for data visualization because these methods are very effective for recognizing patterns and gaining insight into big data.

Monte Carlo techniques to create counterfactuals

In the previous STT5100 course, last week, we saw how to use Monte Carlo simulations. The idea is that, in statistics, we observe a sample …

AI in production: The droids you’re looking for

Jonathan Ballon explains why Intel’s AI and computer vision edge technology will drive advances in machine learning and natural language processing.

Optimize your R Code using Memoization

This article describes how you can apply a programming technique called memoization to speed up your R code and resolve performance bottlenecks. Wikipedia says: In computing, […] memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.
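
The idea in the Wikipedia definition can be sketched in a few lines. The article itself targets R; the version below is a Python illustration only, and the names memoize, slow_square and calls are hypothetical, chosen for this sketch rather than taken from the article.

```python
from functools import wraps

def memoize(func):
    """Cache results of func, keyed by its positional arguments."""
    cache = {}

    @wraps(func)
    def wrapper(*args):
        # Compute only when these inputs have not been seen before.
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper

# Records every real (non-cached) evaluation so the effect is visible.
calls = []

@memoize
def slow_square(n):
    # Stand-in for an expensive computation (hypothetical example).
    calls.append(n)
    return n * n
```

Calling slow_square(4) twice evaluates the function body once; the second call is answered from the cache. This simple version assumes hashable positional arguments and a function without side effects. Python's standard library offers the same behaviour through functools.lru_cache.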

Python vs. Scala: a comparison of the basic commands

I recently started playing a little bit with Scala, and I have to say it has been kind of traumatic. I love learning new things, but after months of programming in Python, it just does not feel natural to set that aside and switch modes while solving data science problems. When learning a new language, whether a programming language or a spoken one, this is normal. We tend to fill the gaps in what we don’t know with what we do know, even if it doesn’t belong to the language we are trying to write or speak! When trying to learn a new language, it is important to be completely surrounded by it, but first of all it is important to establish clear parallels between the known language and the new one, at least in the beginning. This works for me, a bilingual person who learned a second language really quickly as an adult. At the beginning, I needed connections between Italian (the language I knew) and English (the language I was learning), but as I became more and more fluent in English, I started to forget the parallels because it was becoming natural and I no longer needed to translate in my head first. The reason I decided to write this post is, in fact, to establish parallels between Python and Scala for people who are fluent in one of the two and are starting to learn the other, like myself.

Tensorflow GPU Installation Made Easy: Use conda instead of pip

I have a well-specced GPU on which I used to play FIFA. After switching to AI, I wanted to use the GPU for deep learning instead of playing games. But I was scared of TensorFlow installations with incompatible CUDA versions. In this article I will explain the conventional approach and the new, optimized approach, and why we should dump pip and use conda instead.

Finding Science in Data Science

Quick note: This article includes fancy terms like ‘formal system’ and ‘grid search’, but these aren’t complicated terms at all! I think of a formal system as a language we use to interpret something, and a grid search is just trying every combination of things so you can stick with the best one. I do think a basic interest in, and familiarity with, experimentation and machine learning is helpful when reading this article.
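
As a rough illustration of the grid search description above (trying every combination and keeping the best one), here is a minimal Python sketch. The function grid_search, the scoring function toy_score and the parameter names x and y are all hypothetical and are not taken from the article.

```python
from itertools import product

def grid_search(score, grid):
    """Try every combination in grid and return the best (params, score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    # product(...) enumerates the full Cartesian product of candidate values.
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(**params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score

def toy_score(x, y):
    # Hypothetical objective with its maximum at x=2, y=-1.
    return -((x - 2) ** 2 + (y + 1) ** 2)

best_params, best_score = grid_search(
    toy_score, {"x": [0, 1, 2, 3], "y": [-2, -1, 0]}
)
```

In a real experiment the scoring function would train and evaluate a model for each combination, so the cost grows with the product of the number of candidate values per parameter.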

A Quick Introduction to Neural Arithmetic Logic Units

Classical neural networks are extremely flexible, but there are certain tasks they are not well suited for. Some arithmetic operations, in particular, prove challenging for neural networks. That’s why Trask et al., in their paper Neural Arithmetic Logic Units, introduce two new modules that are meant to perform well on certain arithmetic tasks. In this blog post, I’ll describe the problem that the paper aims to solve and the proposed solution, and discuss implementation details and the reproduction of results using PyTorch.

Confusion Matrix in Object Detection with TensorFlow

At the time of this writing, the TensorFlow Object Detection API is still under research and constantly evolving, so it’s not surprising to find missing pieces that could make the library much more robust for production applications.