Introductory guide to Linear Optimization in Python (with TED videos case study)

Data Science & Machine Learning are being used by organizations to solve a variety of business problems today. In order to create a real business impact, an important consideration is to bridge the gap between the data science pipeline and business decision making pipeline. The outcome of data science pipeline is uaully predictions, patterns and insights from data (typically without any notion of constraints) but that alone is insufficient for business stakeholders to take decisions. Data science output has to be fed into the business decision making pipeline which involves some sort of optimization involving constraints and decision variables which model key aspects of the business. For example, if you are running a Super Market chain – your data science pipeline would forecast the expected sales. You would then take those inputs and create an optimised inventory / sales strategy. In this article, we will show one such example of Linear optimization for selecting which TED videos to watch.


Recursive (not recurrent!) Neural Nets in TensorFlow

For the past few days I’ve been working on how to implement recursive neural networks in TensorFlow. Recursive neural networks (which I’ll call TreeNets from now on to avoid confusion with recurrent neural nets) can be used for learning tree-like structures (more generally, directed acyclic graph structures). They are highly useful for parsing natural scenes and language; see the work of Richard Socher (2011) for examples. More recently, in 2014, Ozan Irsoy used a deep variant of TreeNets to obtain some interesting NLP results.


A simple neural network with Python and Keras

If you’ve been following along with this series of blog posts, then you already know what a huge fan I am of Keras. Keras is a super powerful, easy to use Python library for building neural networks and deep learning networks. In the remainder of this blog post, I’ll demonstrate how to build a simple neural network using Python and Keras, and then apply it to the task of image classification.


Gödel’s Incompleteness Theorem And Its Implications For Artificial Intelligence

This text gives an overview of Gödel’s Incompleteness Theorem and its implications for artificial intelligence. Specifically, we deal with the question whether Gödel’s Incompleteness Theorem shows that human intelligence could not be recreated by a traditional computer. Sections 2 and 3 feature an introduction to axiomatic systems, including a brief description of their historical development and thus the background of Gödel’s Theorem. These sections provide the basic knowledge required to fully understand Gödel’s Theorem and its significance for the history of mathematics – a necessary condition for understanding the arguments to follow. Section 4 features a thorough description of Gödel’s Theorem and outlines the basic idea of its proof. Sections 5 and 6 deal with arguments advocating the view that intelligence has a non-algorithmic component on the grounds of Gödel’s Theorem. In addition to a detailed account of the arguments, these sections also feature a selection of prominent objections to these arguments raised by other authors. The last section comprises a discussion of the arguments and my own objections.


The best dataviz & infographics of the year: the Information is Beautiful Awards Longlist 2017

With more entries from more countries than ever before, it’s been a vintage year for our celebration of the world’s finest dataviz & infographics.


How to sample from multidimensional distributions using Gibbs sampling?

We will show how to perform multivariate random sampling using one of the Markov Chain Monte Carlo (MCMC) algorithms, called the Gibbs sampler.


Linear / Logistic Regression in R: Dealing With Unknown Factor Levels in Test Data

Let’s say you have data containing a categorical variable with 50 levels. When you divide the data into train and test sets, chances are you don’t have all 50 levels featuring in your training set. This often happens when you divide the data set into train and test sets according to the distribution of the outcome variable. In doing so, chances are that our explanatory categorical variable might not be distributed exactly the same way in train and test sets – so much so that certain levels of this categorical variable are missing from the training set. The more levels there are to a categorical variable, it gets difficult for that variable to be similarly represented upon splitting the data.


Data Visualization Course for First-Year Students

A little over a year ago, we decided to propose a data visualization course at the first-year level. We had been thinking about this for awhile, but never had the time to teach it given the scheduling constraints we had. When one of the other departments on campus was shut down and the faculty merged in with other departments, we felt that the time was ripe to make this proposal.


Practical Machine Learning with R and Python – Part 1

This is the 1st part of a series of posts I intend to write on some common Machine Learning Algorithms in R and Python. In this first part I cover the following Machine Learning Algorithms
• Univariate Regression
• Multivariate Regression
• Polynomial Regression
• K Nearest Neighbors Regression


Principal Component Analysis – Unsupervised Learning

Unsupervised learning is a machine learning technique in which the dataset has no target variable or no response value-Y Y .The data is unlabelled. Simply saying,there is no target value to supervise the learning process of a learner unlike in supervised learning where we have training examples which have both input variables X i Xi and target variable-Y Y i,e-(xi,yi) (xi,yi) vectors and by looking and learning from the training examples the learner generates a mapping function(also called a hypothesis) f:X i ->Y f:Xi->Y which maps X i Xi values to Y Y and learns the relationship between input variables and target variable so that we could generalize it to some random unseen test examples and predict the target value.


Forget Killer Robots—Bias Is the Real AI Danger

Google’s AI chief isn’t fretting about super-intelligent killer robots. Instead, John Giannandrea is concerned about the danger that may be lurking inside the machine-learning algorithms used to make millions of decisions every minute. “The real safety question, if you want to call it that, is that if we give these systems biased data, they will be biased,” Giannandrea said before a recent Google conference on the relationship between humans and AI systems.