Recently a method was developed to visualize the loss landscape of deep neural networks. I personally believe this is a huge breakthrough; however, I have some doubts about the validity of the resulting visualizations. Today I will examine the authors’ visualization method and introduce a few other methods that I think are pretty cool.
In this article, we’ll focus on Markov Models, where and when they should be used, and Hidden Markov Models. This article will focus on the theoretical part. In a second article, I’ll present Python implementations of these subjects.
In this article, you will learn about GloVe, a very powerful word vector learning technique. This article will focus on explaining why GloVe is better and the motivation behind its cost function, which is the most crucial part of the algorithm. The code will be discussed in detail in a later article.
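For reference, the cost function in question (from the original GloVe paper by Pennington et al.) minimizes a weighted least-squares objective over word co-occurrence counts $X_{ij}$:

```latex
J = \sum_{i,j=1}^{V} f(X_{ij})\left( w_i^\top \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2,
\qquad
f(x) = \begin{cases} (x / x_{\max})^{\alpha} & x < x_{\max} \\ 1 & \text{otherwise} \end{cases}
```

Here $w_i$ and $\tilde{w}_j$ are the word and context vectors, $b_i$ and $\tilde{b}_j$ their biases, and the weighting function $f$ prevents very frequent co-occurrences from dominating the objective.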
If you are here, it’s likely that you are interested in analyzing tweets (or something similar) and you have a lot of them, or can get them. One of the most annoying parts of that is registering a Twitter application and sorting out the authentication and all of that. And then, if you are using Pandas, there’s no easy way to scale. So what about a system that doesn’t have to authenticate with the Twitter API, can get an almost unlimited amount of tweets, and has the power to analyze them with NLP and more? Well, you’re in for a treat, because that’s exactly what I’m going to show you right now.
So, regression… alongside other algorithms and statistical models, it is one more building block upon which Machine Learning successfully works. At its core, regression aims to find the relationship between variables, and for Machine Learning it is needed to predict outcomes based on such a relationship. Obviously, any self-respecting ML engineer has to be well versed in this subject. But wait, there is a whole slew of regressions. Linear and Logistic regression are ordinarily the first algorithms people learn. But the truth is that innumerable forms of regression can be performed, each with its own importance and specific conditions where it is best suited. So, which one to use? In this article, I have explained the most commonly used forms of regression in an understandable way, so you can decide which is most suitable for your specific task. Let’s roll.
Learning from my own mistakes and best practices, I designed a Data Science Workflow Canvas* to help others carry out their own data science projects. This canvas helps you prioritize your goals first and then work backward to achieve them. You can think about it this way: instead of following the steps in a recipe to cook a predetermined meal, you first envision what the meal looks and tastes like, and then you start developing a recipe. When working on a data science project, you usually don’t have a set of instructions for achieving a predetermined outcome. Instead, you have to determine the outcomes and the steps to achieve them. This Data Science Workflow Canvas was designed with that process in mind. Here, I’ll walk you through how to use the canvas, and I’ll share examples of how I implemented it in my own projects.
Okay, this title is deliberately provocative and dripping with hyperbole. As one of our data scientists said, ‘I was appalled by the title but completely agreed with you in the end.’ I’m a strong believer in the opportunities machine learning and data science offer, but we have to be honest with ourselves. There are growing pockets of skeptics, inflated expectations, and some are even warning of a looming credibility crisis in data science. In addition, many decision makers do not have the training needed to interpret, understand, and properly apply the output of our models.
A significant part of the recent success in deep learning goes to the ReLU activation function, which has achieved state-of-the-art results in deep CNNs for image classification problems. In this blog, we’ll discuss a robust weight initialization method that helps deeper neural models converge faster. Kaiming He et al. proposed this method in the Delving Deep into Rectifiers paper (2015). This blog takes inspiration from Fast.ai’s course Deep Learning for Coders, Part 2, taught by Jeremy Howard at USF.
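As a quick taste of the idea, here is a minimal NumPy sketch of Kaiming (He) normal initialization for a ReLU layer. The function name and shapes are my own for illustration; the key point from the paper is the standard deviation of sqrt(2 / fan_in):

```python
import numpy as np

def kaiming_normal(fan_in, fan_out):
    """Sketch of He-style initialization for a ReLU layer (hypothetical helper).

    He et al. (2015) show that drawing weights from N(0, 2 / fan_in)
    keeps the variance of activations roughly constant across ReLU
    layers, which helps deeper networks converge faster.
    """
    std = np.sqrt(2.0 / fan_in)          # scale derived for ReLU nonlinearity
    return np.random.randn(fan_in, fan_out) * std

# Example: initialize a 512 -> 256 fully connected layer
W = kaiming_normal(512, 256)
```

In PyTorch, the equivalent built-in is `torch.nn.init.kaiming_normal_`.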
How can we understand progress in Deep Learning without a map? I created one such map a couple of years ago, but it needs a drastic overhaul. In ‘Five Capability Levels of Deep Learning Intelligence’, I proposed a hierarchy of capabilities meant to track the progress of Deep Learning development.
Recurrent Neural Networks have been the recent state-of-the-art methods for various problems involving sequential data. Adding attention to these networks allows the model to focus not only on the current hidden state but also to take the previous hidden states into account, based on the decoder’s previous output. There have been various ways of implementing attention models. One such way is given in the PyTorch Tutorial, which calculates the attention to be given to each input based on the decoder’s hidden state and the embedding of the previously output word. This article will introduce you to these mechanisms briefly and then demonstrate a different way of implementing attention that does not limit the number of input samples taken into consideration when calculating attention.
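To make the length-independence point concrete, here is a minimal NumPy sketch of dot-product attention over encoder outputs. This is not the PyTorch tutorial’s mechanism (which projects to a fixed maximum length); it is a simplified variant, with shapes I chose for illustration, where the number of attended inputs is simply however many encoder outputs exist:

```python
import numpy as np

def attention(decoder_hidden, encoder_outputs):
    """Sketch of dot-product attention (illustrative, not the tutorial's exact code).

    decoder_hidden:  (hidden_dim,)          current decoder state
    encoder_outputs: (seq_len, hidden_dim)  one vector per input token;
                     seq_len may vary, so no fixed input limit is imposed.
    """
    scores = encoder_outputs @ decoder_hidden      # (seq_len,) similarity scores
    weights = np.exp(scores - scores.max())        # stable softmax over inputs
    weights /= weights.sum()
    context = weights @ encoder_outputs            # (hidden_dim,) weighted sum
    return context, weights
```

Because the softmax runs over however many rows `encoder_outputs` has, the same function handles sequences of any length.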
GANs, or Generative Adversarial Networks, are a type of neural network architecture for generating data. In the past few years, they’ve become one of the hottest subfields in deep learning, going from generating fuzzy images of digits to photorealistic images of faces. Variants of GANs have now done insane stuff, like converting images of zebras to horses and vice versa. I find GANs fascinating, and in an effort to understand them better, I thought I’d write this article and, in the process of explaining the math and code behind them, understand them better myself.
During last year’s F8 developer conference, Facebook announced the 1.0 launch of PyTorch, the company’s open-source deep learning platform. At this year’s F8, the company launched version 1.1. The small increase in version number belies the importance of this release, which focuses on making the tool more appropriate for production usage, including improvements to how it handles distributed training. ‘What we’re seeing with PyTorch is an incredible moment internally at Facebook to ship it and then an echo of that externally with large companies,’ Joe Spisak, Facebook AI’s product manager for PyTorch, told me. ‘Make no mistake, we’re not trying to monetize PyTorch […] but we want to see PyTorch have a community. And that community is starting to shift from a very research-centric community – and that continues to grow fast – into the production world.’ So with this release, the team and the more than 1,000 open-source committers who have worked on this project are addressing the shortcomings of the earlier release as users continue to push the limits. Some of those users include Microsoft, which is using PyTorch for language models that scale to a billion words, and Toyota, which is using it for some of its driver-assistance features.