4 Ways to Debug your Deep Neural Network

So you’ve spent weeks building a dataset and coding up a neural network. You’ve trained the model and are getting less-than-stellar results. What do you do next? Deep learning is often seen as a black box, and I’m not going to argue with that view – what sense can you make of tens of thousands of learned parameters? But the black-box view poses an obvious problem for machine learning practitioners: how do you debug a model? In this post, I’ll walk through some techniques we’ve used at Cardiogram to debug DeepHeart, a deep neural network that predicts diseases from Apple Watch, Garmin, and WearOS data. At Cardiogram, we believe that constructing DNN architectures isn’t alchemy; it’s engineering.


When To Multiply Inside Your Neural Network?

Typical neural networks consist of linear combinations of input features with ReLU units built on top of them, and nothing else. So is there a need to introduce explicit multiplications, either on the inputs or inside the network? First, let’s consider why you might not want to multiply inside your neural network. Suppose you have a set of features and want to construct arbitrary multiplicative terms. The straightforward approach is to feed the features into the network after applying log(): multiplications turn into additions, job done! This is useful for other reasons too. When you multiply several numbers together, the product typically has a wide dynamic range, with both very large and very small values. If you work with the raw values, a large part of the region you care about will likely get squashed. Mapping the features to log space prevents such catastrophes, so that’s definitely the first consideration before injecting multipliers into the network. Secondly, neural networks can approximate arbitrary functions, and of course they can approximate a multiplier as well. To see this, we train a single-hidden-layer neural network to learn multiplication. If you have TensorFlow installed, you can copy-paste the code below and run it.
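The original snippet isn’t reproduced in this digest, but a minimal sketch of the experiment described above might look like the following. The layer width, optimizer settings, and data range are my own assumptions, not the author’s exact code.

```python
# Minimal sketch: a single-hidden-layer network learning to multiply two inputs.
# Hidden size, learning rate, and data range are illustrative assumptions.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(10000, 2)).astype("float32")   # pairs (a, b)
y = (x[:, 0] * x[:, 1]).reshape(-1, 1)                          # targets a * b

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(2,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mse")
model.fit(x, y, epochs=20, batch_size=128, verbose=0)

# Inside the training range the approximation is usually quite good.
print(model.predict(np.array([[0.5, 0.4]], dtype="float32")))   # should be close to 0.20
```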


Using Transfer Learning for NLP with Small Data

Text classification has numerous applications, from tweet sentiment and product reviews to toxic comments and more. It’s a popular project topic among Insight Fellows; however, a lot of time is spent collecting labeled datasets, cleaning data, and deciding which classification method to use. Services like Clarifai and Google AutoML have made it very easy to create image classification models with less labeled data, but it’s not as easy to create such models for text classification.


Intro to XGBoost: Predicting Life Expectancy with Supervised Learning

Today we’ll use XGBoost Boosted Trees for regression over the official Human Development Index dataset. Who said Supervised Learning was all about classification?
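For a sense of what that regression setup looks like, here is a rough sketch. The file name, target column, and hyperparameters are placeholders, not the article’s actual HDI data schema or tuning.

```python
# Rough sketch of XGBoost regression; file and column names are placeholders.
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

df = pd.read_csv("hdi.csv")                       # hypothetical flattened HDI table
X = df.drop(columns=["life_expectancy"])          # hypothetical target column
y = df["life_expectancy"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = xgb.XGBRegressor(
    n_estimators=500, learning_rate=0.05, max_depth=4,
    subsample=0.8, colsample_bytree=0.8,
)
model.fit(X_train, y_train)
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))
```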


Monotonic Optimal Binning (MOB) for Consumer Credit Risk Scorecard Development

The MOB (Monotonic Optimal Binning) package is a collection of R functions that generate monotonic binning and perform the WoE (Weight of Evidence) transformation used in consumer credit scorecard development. Being a piecewise-constant transformation in the context of logistic regression, the WoE has also been employed in other use cases, such as consumer credit loss estimation, prepayment, and even fraud detection models. In addition to monotonic binning and the WoE transformation, the Information Value and KS statistic of each independent variable are also calculated to evaluate variable predictiveness. To be useful in a production environment, the package also provides functionality that lets users batch-process binning and transformations for many independent variables concurrently on a multi-core computing engine.
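The package itself is written in R, but the WoE and Information Value calculations it relies on are simple to illustrate. Below is a rough Python sketch of the math for an already-binned variable; it is not the MOB package’s API, and the column names and sign convention are assumptions.

```python
# Sketch of WoE and IV for a pre-binned variable; not the MOB package's API.
import numpy as np
import pandas as pd

def woe_iv(df, bin_col, target_col):
    """Assumes target_col is 1 for 'bad' (default) and 0 for 'good'."""
    grp = df.groupby(bin_col)[target_col].agg(bads="sum", total="count")
    grp["goods"] = grp["total"] - grp["bads"]
    dist_good = grp["goods"] / grp["goods"].sum()
    dist_bad = grp["bads"] / grp["bads"].sum()
    grp["woe"] = np.log(dist_good / dist_bad)        # one common sign convention
    grp["iv"] = (dist_good - dist_bad) * grp["woe"]  # per-bin contribution to Information Value
    return grp, grp["iv"].sum()

# Toy example with a hypothetical binned income variable.
toy = pd.DataFrame({
    "income_bin": ["low", "low", "low", "mid", "mid", "high", "high", "high"],
    "default":    [1,     1,     0,     1,     0,     0,      0,      1],
})
bins, iv = woe_iv(toy, "income_bin", "default")
print(bins[["woe", "iv"]])
print("Information Value:", round(iv, 3))
```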


Optimize your CPU for Deep Learning

In the last few years, Deep Learning has picked up pace in academia as well as industry. Every company is now looking for an A.I.-based solution to its problems. This boom has its own merits and demerits, but that’s for another article, another day. This surge in machine learning practitioners has infiltrated academia to its roots, and almost every student in every domain has access to AI and ML knowledge via courses, MOOCs, books, articles, and of course papers.


The False Promise of Off-Policy Reinforcement Learning Algorithms

We have all witnessed the rapid development of reinforcement learning methods over the last couple of years. Most notably, off-policy methods have received the biggest attention, and the reason is quite obvious: they scale really well in comparison to other methods. Off-policy algorithms can (in principle) learn from data without interacting with the environment. This is a nice property: it means we can collect our data by any means we see fit and infer the optimal policy completely offline; in other words, we use a behavioral policy different from the one we are optimizing. Unfortunately, this doesn’t work out of the box the way most people think, as I will describe in this article.
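As a concrete illustration of the setting being discussed (not the article’s own code), here is a bare-bones sketch of tabular Q-learning run purely on a fixed batch of transitions logged by some behavioral policy. The toy environment, random data, and hyperparameters are my own assumptions.

```python
# Bare-bones off-policy setup: Q-learning over a fixed batch of logged transitions.
import numpy as np

n_states, n_actions = 5, 2
rng = np.random.default_rng(0)

# Transitions (s, a, r, s') collected earlier by a behavioral policy;
# no further interaction with the environment.
dataset = [(int(rng.integers(n_states)), int(rng.integers(n_actions)),
            float(rng.normal()), int(rng.integers(n_states))) for _ in range(1000)]

Q = np.zeros((n_states, n_actions))
gamma, alpha = 0.9, 0.1

for _ in range(50):                              # repeated sweeps over the fixed batch
    for s, a, r, s_next in dataset:
        target = r + gamma * Q[s_next].max()     # bootstraps on the greedy (target) policy
        Q[s, a] += alpha * (target - Q[s, a])

# Greedy policy extracted fully offline; with poor data coverage the value
# estimates can be badly biased, which is the kind of caveat the article raises.
print(Q.argmax(axis=1))
```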


Three Simple Theories to Help Us Understand Overfitting and Underfitting in Machine Learning Models

The two worst things that could happen to a machine learning model are to build useless knowledge or to learn nothing relevant from a training dataset. In machine learning theory, these two phenomena are described by the terms overfitting and underfitting, respectively, and they constitute two of the biggest challenges in modern deep learning solutions. I often like to compare deep learning overfitting to human hallucinations, as the former occurs when algorithms start inferring non-existent patterns in datasets. Underfitting is closer to a learning disorder that prevents people from acquiring the relevant knowledge to perform a given task. Despite its importance, there is no easy solution to overfitting, and deep learning applications often need to use techniques very specific to individual algorithms in order to avoid overfitting behaviors. The problem gets even scarier if you consider that humans are also incredibly prone to overfitting, which translates into subjective evaluations of machine learning models. Just think about how many stereotypes you used in the last week. Yeah, I know…. Today, I would like to present three different theories that are helpful for reasoning about overfitting and underfitting conditions in machine learning models.
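A quick way to see both failure modes side by side, in a toy example of my own rather than one from the article, is to fit polynomials of different degrees and compare training error against validation error:

```python
# Toy illustration of underfitting vs. overfitting via train/validation error.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0, 1, 40)).reshape(-1, 1)
y_train = np.sin(2 * np.pi * x_train).ravel() + rng.normal(0, 0.2, 40)
x_val = np.linspace(0, 1, 200).reshape(-1, 1)
y_val = np.sin(2 * np.pi * x_val).ravel()

for degree in (1, 4, 15):        # roughly: underfit, reasonable fit, overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(x_train))
    val_err = mean_squared_error(y_val, model.predict(x_val))
    print(f"degree={degree:2d}  train MSE={train_err:.3f}  val MSE={val_err:.3f}")
```

The underfit model has high error everywhere; the overfit one drives training error toward zero while validation error grows, which is the “hallucinated patterns” behavior described above.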


Complete Guide to Enterprise Chatbot Development

‘Customer service is the new marketing.’ The present-day customer has information at their fingertips. Enterprises are always on the lookout to make sure they build a water-tight customer support process and have the right systems in place. As of today, major brands and enterprises are getting started with bot development initiatives in order to reach customers with better efficiency as well as cost-effectiveness. When it comes to scaling customer support, KLM Royal Dutch Airlines handled upwards of 16,000 interactions on a weekly basis, and in 6 months the Blue Bot sent out almost 2 million messages to more than 500,000 customers. Reports suggest that close to 37% of Americans would prefer to use a chatbot to get a swift answer in an urgent situation. Apart from that, a whopping 64% of Americans consider the 24-hour availability of bots to be their best feature. Additional stats on business profitability show that 47% of consumers would purchase through a chatbot, and millennials (26- to 36-year-olds) are prepared to spend up to £481.15 on a business transaction through a bot. So far, enterprises that have adopted chatbots have done so by creating and using them in silos. Although this approach may work for businesses that need to automate a handful of tasks, it doesn’t exactly align with the high-end needs of the enterprise – scalability, agility, and cost-effectiveness across a smorgasbord of functions.


AI Search Algorithms Every Data Scientist Should Know

The post below outlines a few of the key search algorithms in AI, why they are important, and what they are used for. While in recent years search and planning algorithms have taken a back seat to machine and deep learning methods, a better understanding of these algorithms can boost the performance of your models. Additionally, as more powerful computational technologies such as quantum computing emerge, it is very likely that search-based AI will make a comeback.
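For a flavor of the algorithms covered, here is a compact sketch of A* search on a grid, one of the classic examples in this family. The grid, start/goal positions, and Manhattan-distance heuristic are my own illustrative choices, not taken from the post.

```python
# Compact A* search on a grid; layout and heuristic are illustrative choices.
import heapq

def a_star(grid, start, goal):
    """grid: 2D list where 0 = free and 1 = wall; start/goal: (row, col) tuples."""
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])   # admissible Manhattan heuristic
    frontier = [(h(start), 0, start, [start])]                # (f, g, node, path so far)
    seen = set()
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = node[0] + dr, node[1] + dc
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == 0:
                heapq.heappush(frontier, (cost + 1 + h((r, c)), cost + 1, (r, c), path + [(r, c)]))
    return None

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(a_star(grid, (0, 0), (2, 0)))   # a path that routes around the wall
```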


Hewlett Packard Enterprise integrates BlueData to accelerate Artificial Intelligence and data-driven innovation in the enterprise

Hewlett Packard Enterprise (HPE) today announced a new integrated offering that combines the HPE BlueData software platform with HPE Apollo systems and HPE Pointnext services to provide customers with a powerful solution for AI and data-driven business innovation. Today’s announcement builds on HPE’s acquisition of BlueData in November 2018 and represents an important milestone in HPE’s ongoing strategy for the AI and analytics market. BlueData is now available as a standalone HPE software solution, and customers can continue to run BlueData software on any infrastructure. With BlueData, HPE offers customers an as-a-service experience for artificial intelligence/machine learning (AI/ML) and analytics running on multi-vendor infrastructure and public cloud services, leveraging the portability of containers across on-premises, hybrid cloud, and multi-cloud environments. The result is dramatically faster deployments, down from months to minutes, with the agility and flexibility that data science teams need to deliver business value.


Automate your Flask Deployments on AWS

In an ideal world, data scientists would focus their time on analysis and modeling, but there is often seemingly unavoidable overhead in visualizing and presenting their results. Because data scientists generally don’t have an infrastructure background, they often struggle to build out servers and use the tools needed to visualize their modeling and analysis. At Insight, Fellows have to manually set up a web server on an Amazon EC2 instance to deploy a Flask app that showcases their projects, which can take 6-8 hours or more, especially for those without prior experience. But what if they could shrink that time 100-fold? In this post, I will share the work I did in the Insight DevOps Engineering program, where I launched an open source project called One-click to deploy web apps in minutes. Check it out on GitHub.
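For context, the kind of Flask app being deployed is usually only a few lines. The sketch below is a generic placeholder, not the project’s actual demo app; the deployment overhead comes from everything around it (the EC2 instance, web server, and process manager), which is what the One-click project automates.

```python
# A generic, minimal Flask app of the sort Fellows typically deploy (placeholder code).
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/")
def index():
    return "Model demo is up."

@app.route("/predict/<float:value>")
def predict(value):
    # Placeholder for a real model call.
    return jsonify({"input": value, "prediction": value * 2})

if __name__ == "__main__":
    # In production this would sit behind gunicorn/nginx rather than the dev server.
    app.run(host="0.0.0.0", port=5000)
```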


3 Awesome Visualization Techniques for every dataset

Visualizations are awesome. However, a good visualization is annoyingly hard to make. Moreover, it takes time and effort to present these visualizations to a bigger audience. We all know how to make bar plots, scatter plots, and histograms, yet we don’t pay much attention to beautifying them.
This hurts our credibility with peers and managers. You won’t feel it now, but it happens. Also, I find it essential to reuse my code. Every time I visit a new dataset, do I need to start again? What I want are some reusable graph ideas that can help us find information about the data FAST. In this post, I am also going to talk about 3 cool visual tools:
• Categorical Correlation with Graphs,
• Pairplots,
• Swarmplots and Graph Annotations using Seaborn.
In short, this post is about useful and presentable graphs.
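As a quick taste of these three plot types, here is a short seaborn sketch. The dataset and column choices are the standard seaborn “tips” example, not necessarily the ones used in the post, and the plain numeric correlation heatmap stands in for the post’s dedicated categorical-correlation technique.

```python
# Quick taste of the three plot types, using seaborn's built-in tips dataset.
import matplotlib.pyplot as plt
import seaborn as sns

tips = sns.load_dataset("tips")

# Pairplot: pairwise relationships across numeric columns, colored by a category.
sns.pairplot(tips, hue="sex")
plt.show()

# Swarmplot with a graph annotation.
ax = sns.swarmplot(data=tips, x="day", y="total_bill", hue="smoker")
ax.annotate("weekend bills run higher", xy=(2, 40), xytext=(0.2, 45),
            arrowprops=dict(arrowstyle="->"))
plt.show()

# Correlation heatmap over numeric columns, a simple stand-in for the
# categorical-correlation graphs discussed in the post.
sns.heatmap(tips.corr(numeric_only=True), annot=True, cmap="coolwarm")
plt.show()
```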