The Seductive Diversion of ‘Solving’ Bias in Artificial Intelligence

Artificial intelligence evokes a mythical, objective omnipotence, but it is backed by real-world forces of money, power, and data. In service of these forces, we are being spun potent stories that drive toward widespread reliance on regressive, surveillance-based classification systems that enlist us all in an unprecedented societal experiment from which it is difficult to return. Now, more than ever, we need a robust, bold, imaginative response.

Grasp2Vec: Learning Object Representations from Self-Supervised Grasping

From a remarkably young age, people are capable of recognizing their favorite objects and picking them up, despite never being explicitly taught how to do so. According to cognitive developmental research, the ability to interact with objects in the world plays a crucial role in the emergence of object perception and manipulation capabilities, such as targeted grasping. By interacting with the world around them, people are able to learn with self-supervision: we know what actions we took, and we learn from the outcome. In robotics, this type of self-supervised learning is actively researched because it enables robotic systems to learn without the need for large amounts of training data or manual supervision. Inspired by the concept of object permanence, we propose Grasp2Vec, a simple yet highly effective algorithm for acquiring object representations. Grasp2Vec is based on the intuition that an attempt to pick up anything provides several pieces of information – if a robot grasps an object and holds it up, the object had to be in the scene before the grasp. Furthermore, the robot knows that the object it grasped is currently in its gripper, and therefore has been removed from the scene. By using this form of self supervision, the robot can learn to recognize the object by the visual change in the scene after the grasp.

Sharing Modeling Pipelines in R

Reusable modeling pipelines are a practical idea that gets re-developed many times in many contexts. wrapr supplies a particularly powerful pipeline notation, and a pipe-stage re-use system (notes here). We will demonstrate this with the vtreat data preparation system. Our example task is to fit a model on some arbitrary data. Our model will try to predict y as a function of the other columns. Our example data is 10,000 rows of 210 variables. Ten of the variables are related to the outcome to predict (y), and 200 of them are irrelevant pure noise. Since this is a synthetic example we know which is which (and deliberately encode this information in the column names).

Automated Web Scraping in R

There are many blogs and tutorials that teach you how to scrape data from a bunch of web pages once and then you’re done. But one-off web scraping is not useful for many applications that require sentiment analysis on recent or timely content, or capturing changing events and commentary, or analyzing trends in real time. As fun as it is to do an academic exercise of web scraping for one-off analysis on historical data, it is not useful when wanting to use timely or frequently updated data.

Introduction to Named Entity Recognition

In this article we will learn what is Named Entity Recognition also known as NER. We will discuss some of its use-cases and then evaluate few standard Python libraries using which we can quickly get started and solve problems at hand. In the next series of articles we will get under the hood of this class of algorithms, get more sophisticated and will create our own NER from scratch. So, let’s begin this journey.

Introducing TF-Ranking

Ranking is one of the most common problems in machine learning scenarios. From search to recommendation systems, ranking models are an important component of many mainstream machine learning architectures. In machine learning theory, ranking methods are often referred to using terms like learning-to-rank(LTR) or machine learning ranking(LTR). Despite its relevance, developing LTR models at scale remains a challenge in most machine learning frameworks. Recently, artificial intelligence(AI) engineers from Google introduced TF-Ranking, a TensorFlow-based framework for building highly scalable LTR models. The principles behind TF-Ranking are detailed in a research paper published a few weeks ago. Conceptually, a ranking problem is defined as a derivation of ordering over a list of examples that maximizes the utility of the entire list. That definition sounds similar to classification and regression problem but ranking problems are fundamentally different. While the goal of classification or regression is to predict a label or a value for each individual example as accurately as possible, the goal of ranking is to optimally sort the entire example list, such that the examples of highest relevance are presented first. To infer relevance, LTR methods attempt to learn a scoring function that maps example feature vectors to real-valued scores from labeled data. During inference, this scoring function is used to sort and rank examples.

Towards Ethical Machine Learning

I quit my job to enter an intensive data science bootcamp. I understand the value behind the vast amount of data available that enables us to create predictive machine learning algorithms. In addition to recognizing its value on a professional level, I benefit from these technologies as a consumer. Whenever I find myself in a musical rut, I rely on Spotify’s Discover Weekly. I’m often amazed by how Spotify’s algorithms and other machine learning models so accurately predict my behavior. In fact, when I first sat down to write this post, I took a break to watch one Youtube video. Twenty minutes later, I realized just how well Youtube’s recommendation algorithm works. Although I so clearly see the benefits of machine learning, it is also essential to recognize and mitigate its potential dangers.

Step-by-Step Tutorial on Linear Regression with Stochastic Gradient Descent

This article should provide you a good start for us to dive deep into deep learning. Let me walk you through the calculations step-by-step in a stochastic gradient descent for a linear regression task.

Quick ML Concepts: Tensors

Tensorflow, Tensorlab, Deep Tensorized Networks, Tensorized LSTMs… it’s no surprise that the word ‘tensor’ is embedded in the names of many machine learning technologies. But what are tensors? And how do they relate to machine learning? In part one of Quick ML Concepts, I aim to provide a short yet concise summary of what tensors are.

Machine Learning Going Meta

Machine learning has its own rule of three: in order to use it, you need three essential ingredients, namely 1) labeled data; 2) a model architecture you can optimize; and 3) a well-defined objective function. Many a discussion about applying ML to a problem are simply cut short because one of those three is not available. It is getting easier to arrive to that magical combination: ‘big data’ trends have made access to data more ubiquitous in industry, and deep learning has made it much simpler to find good model architectures to apply to a broad class of problems. Interestingly, much of the difficulty often remains in defining the right objective: one that makes business sense, provides a sufficient amount of supervision, and encompasses all the goals inherent to the problem, for instance fairness, safety or interpretability.

Understanding the concept of Hierarchical clustering Technique

Hierarchical clustering Technique is one of the popular Clustering techniques in Machine Learning. Before we try to understand the concept of Hierarchical clustering Technique let us understand about the Clustering…

Intro to Recommender System: Collaborative Filtering

Like many machine learning techniques, a recommender system makes prediction based on users’ historical behaviors. Specifically, it’s to predict user preference for a set of items based on past experience. To build a recommender system, the most two popular approaches are Content-based and Collaborative Filtering. Content-based approach requires a good amount of information of items’ own features, rather than using users’ interactions and feedbacks. For example, it can be movie attributes such as genre, year, director, actor etc., or textual content of articles that can extracted by applying Natural Language Processing. Collaborative Filtering, on the other hand, doesn’t need anything else except users’ historical preference on a set of items. In terms of preference, it usually expressed by two categories. Explicit Rating, is a rate given by a user to an item on a sliding scale, like 5 stars for Titanic. This is the most direct feedback from users to show how much they like an item. Implicit Rating, suggests users preference indirectly, such as page views, clicks, purchase records, whether or not listen to a music track, and so on. In this article, I will take a close look at collaborative filtering that is a traditional and powerful tool for recommender systems.

On integrating symbolic inference into deep neural networks

Deep neural networks have been a tremendous success story over the last couple of years. Many advances in the field of AI, such as recognizing real world objects, fluently translating natural language or playing GO at a world class level, are based on deep neural networks. However, there were only few reports concerning the limitations of this approach. One such limitation is the inability to learn from a small amount of examples. Deep neural networks usually require a huge amount of training examples, whereas humans are able to learn from one single example. If you show a cat to a child who has never seen one before, it can recognize another cat based on this single instance. Deep neural networks on the other hand require hundreds of thousands of images to learn what a cat looks like. Another limitation is the inability to make inferences based on previously learned common knowledge. When reading a text, humans tend to derive wide ranging inferences about possible interpretations of the text. Humans can do this because they can recall knowledge from very different domains and apply it to the text.

Understanding the basic Conversational AI concepts with Dialogflow

This is a beginners guide intended for understanding the different concepts around designing conversations and implementing them using Google Dialogflow. Other Conversational AI tools use the same concepts, so these should be transferable to any platform. I have used a variety of bot builders, and in my opinion Dialogflow is the easiest for rapidly creating simple bots or for a non-programmer. This overview covers how to create intents and the different parts they are made up of, training your bot, and other useful tips to help with using Dialogflow. Before implementing your first bot into dialogflow, it’s always good to map out the conversation flow, just in a mind map style. Having this visualisation will come in handy later when some conversations can be quite long and hard to keep track of. Note: The screenshots below are from Dialogflow, and the use case was for an internal business process assistant.

Building custom models using Keras (BiSeNet) Part II

This year is coming to an END, 2018 was the year that I had the most amazing Artificial Intelligence(AI) learning journey and I came to realise that Keras is a formidable high-level API for fast Deep Learning(DL) development. It reminds me of LEGOS, you just stack layer on top of layer and if you are a creative person with a wild imagination you can adapt or create custom LEGOS so you can build complex shapes and representations. As an engineer, this means a lot and saves a lot of time?-?just plug and play.

Data Science for Product Managers Certificate

Data is at the core of product management. Today’s product managers are increasingly responsible for shipping machine learning-driven product features, making critical product decisions based on machine learning techniques, and developing strong partnerships with data science counterparts. Our Data Science for Product Managers certificate provides the foundational understanding of data science and machine learning necessary for extracting business value from the data produced by your product and your organization.

A High-level Introduction to Differential Privacy

In the eyes of a data scientist, every moment of your life is a data point. From the brand of your toothpaste to the number of times you wave your hand, details that we often take for granted are crucial factors that can be used to infer our behavior and intentions. These insights, mined by multinational organizations, can be used to make aspects of our collective lives more convenient and interesting at the price of our private information being exposed or even exploited.