In many projects I have carried out, companies with fantastic AI business ideas slowly become frustrated when they realize that they do not have enough data… However, solutions do exist! The purpose of this article is to briefly introduce some of them (the ones that have proven effective in my practice) rather than to list every existing solution. Data scarcity matters because data are at the core of any AI project: the size of a dataset is often responsible for poor performance in ML projects, and data-related issues are the main reason why great AI projects cannot be accomplished. In some projects, you come to the conclusion that there is no relevant data, or that the collection process is too difficult and time-consuming.
All of us have heard about Markov Chain Monte Carlo (MCMC) at some time or other: sometimes while reading about Bayesian statistics, sometimes while working with tools like Prophet. But MCMC is hard to understand. Whenever I read about it, I noticed that the crux is typically hidden under deep layers of mathematical noise and is not easy to decipher. I had to spend many hours to get a working understanding of the concept. This blog post is intended to explain MCMC methods simply and to show what they are useful for. I will delve into some more applications in my next post. So let us get started.
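As a small taste of what such a post covers, here is a minimal random-walk Metropolis sampler, the simplest MCMC method. This is my own toy sketch, not taken from the post: the target density (a standard normal) and the step size are illustrative assumptions.

```python
import math
import random

def metropolis_sample(log_density, start, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis: propose a nearby point, then accept it
    with probability min(1, p(proposal) / p(current))."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        # Acceptance probability from the density ratio (log scale for stability)
        accept_prob = math.exp(min(0.0, log_density(proposal) - log_density(x)))
        if rng.random() < accept_prob:
            x = proposal
        samples.append(x)
    return samples

# Target: standard normal; an unnormalized log-density is all MCMC needs.
samples = metropolis_sample(lambda x: -0.5 * x * x, start=0.0, n_steps=20000)
mean = sum(samples) / len(samples)
```

After enough steps the samples behave like draws from the target distribution, so their mean and variance approach 0 and 1, even though we never computed the normalizing constant.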
A guide to Neural Machine Translation using an Encoder-Decoder structure with attention. Includes a detailed tutorial using PyTorch in Google Colaboratory.
An introductory, hands-on guide to time series analysis and forecasting; investigating climate data using Python, Pandas, and Facebook’s Prophet library
Starting from the observation that Friedman’s partial dependence plot has exactly the same formula as Pearl’s back-door adjustment, we explore the possibility of extracting causal information from black-box models trained by machine learning algorithms. There are three requirements to make causal interpretations: a model with good predictive performance, some domain knowledge in the form of a causal diagram, and suitable visualization tools. We provide several illustrative examples and find some interesting causal relations in these datasets. (see also: https://…/whats-in-the-black-box-8f36b262362e)
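To make the formula concrete, here is a minimal sketch of Friedman's partial dependence computation: fix the feature of interest at each grid value for every row, average the model's predictions, and plot the result. The toy linear "black-box" model and the synthetic data are my own illustration, not from the paper.

```python
import numpy as np

def partial_dependence(model_predict, X, feature_idx, grid):
    """Friedman's partial dependence: for each grid value v, set the chosen
    feature to v for *every* row and average the predictions. This averaging
    over the other features is the same as the back-door adjustment formula."""
    pd_values = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature_idx] = v
        pd_values.append(model_predict(X_mod).mean())
    return np.array(pd_values)

# Toy "black-box": y = 2*x0 + x1 (any fitted model could stand in here)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
predict = lambda X: 2 * X[:, 0] + X[:, 1]

grid = np.linspace(-2, 2, 5)
pd_x0 = partial_dependence(predict, X, feature_idx=0, grid=grid)
# For this model, the PDP of x0 is the straight line 2*v + mean(x1)
```

For this toy model the partial dependence curve recovers the slope 2, i.e. the effect of x0 averaged over the distribution of the other feature.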
Sedentary behavior has become a major public health risk around the world. Experts tell us that a minimum amount of daily physical activity (PA) is necessary to maintain health and reduce the risk of chronic diseases such as diabetes, heart disease, and cancer. Some researchers have suggested that sitting for long periods of time may in itself contribute to the problem, in addition to the total amount of inactivity. A study published in 2017 found that sitting in periods of longer than 30 minutes at a time increased mortality risk after controlling for other factors; the total amount of sedentary time was a separate risk factor. A follow-up study by the same team, published in January 2019, found that there was no benefit to reducing the duration of episodes of sitting unless those episodes were replaced with physical activity (of any intensity). A study published in April 2019 found that sitting time increased by an hour between 2007 and 2016, to more than six and a half hours for adults and nearly eight and a half hours for adolescents.
Algorithms are complex mathematical equations for computers that have little practical use for people… or are they? In reality, an algorithm is just a set of steps that are followed to complete a task. You could make an algorithm for ordering pizza that would look a little something like this: choose the crust, choose the sauce, choose the amount of cheese, choose the toppings, and submit the order. We just created the pizza algorithm, and without getting a four-year degree in computer science. Impossible, I know, but that is all an algorithm is at its core: a bunch of steps to do something, though the name and its heavy use in computer science make it seem intimidating. Decisions, decisions, the bane of existence: am I making the right decision? Is this the best option? Will I regret this? Well, good news for any indecisive people out there: there’s an algorithm for it!
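The pizza algorithm above really can be written down directly as code. This is a minimal sketch of those exact steps; the function name and the dictionary shape are my own illustrative choices.

```python
def order_pizza(crust, sauce, cheese_amount, toppings):
    """The pizza algorithm from the text: each step fills in one choice,
    and the final step submits the order."""
    order = {
        "crust": crust,                 # step 1: choose the crust
        "sauce": sauce,                 # step 2: choose the sauce
        "cheese": cheese_amount,        # step 3: choose the amount of cheese
        "toppings": list(toppings),     # step 4: choose the toppings
    }
    order["submitted"] = True           # step 5: submit the order
    return order

order = order_pizza("thin", "tomato", "extra", ["mushrooms", "olives"])
```

Five steps, one task completed; that is the whole idea of an algorithm.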
We present Markov Chains and the Hidden Markov Model.
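As a minimal illustration of the Markov property, here is a simulation of a two-state weather chain, where the next state depends only on the current one. The states and transition probabilities are my own illustrative assumptions, not taken from the article.

```python
import random

# Two-state weather Markov chain: each state lists (next_state, probability).
transitions = {
    "sunny": [("sunny", 0.8), ("rainy", 0.2)],
    "rainy": [("sunny", 0.4), ("rainy", 0.6)],
}

def simulate(chain, start, n_steps, seed=0):
    """Walk the chain: the next state depends only on the current state."""
    rng = random.Random(seed)
    state, path = start, [start]
    for _ in range(n_steps):
        states, probs = zip(*chain[state])
        state = rng.choices(states, weights=probs)[0]
        path.append(state)
    return path

path = simulate(transitions, "sunny", 1000)
frac_sunny = path.count("sunny") / len(path)
# The stationary distribution gives P(sunny) = 0.4 / (0.2 + 0.4) = 2/3,
# so a long walk spends about two-thirds of its time in the sunny state.
```

A Hidden Markov Model adds one layer on top of this: the states themselves are not observed, only noisy emissions from them.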
Feature engineering is an important subject in Machine Learning. If we want our machine to learn, we need to feed it meaningful information. Deep learning architectures may not require well-crafted features, because they can actually create features by themselves, but this creates the need for a huge dataset, which in turn needs a large amount of computing power. For feature creation, we need to know about the subject the machine learning model will work on. If it is signal processing, we need to know about signal processing; if it is finance, we need to know some finance, so that we know which features we can create.
Local and global optimization is usually learned, and then somewhat forgotten, once we leave high-school calculus. For a quick review, take the cover image of this blog. For a given function, there are multiple points where the curve dips; each dip is a minimum. However, you can see that only one point is the deepest, known as the global minimum; all other points are local minima. In the first section, I will go over calculating this global value for a known function. However, once we leave the realm of the known functions of high-school math, most optimization problems deal with local optimization only. I will go over this too and provide reasons why. A quick note: there are plenty of other strategies out there that I have not mentioned in this blog, but I hope it gets you interested enough to explore more options.
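For a known one-dimensional function, the global minimum can be found by brute force: evaluate the function on a dense grid and take the smallest value. This is a minimal sketch under my own illustrative choice of function (several dips, one deepest); it only works because the function is known and cheap to evaluate everywhere.

```python
import numpy as np

# A function with several dips: the sine term makes local minima,
# the quadratic term makes the dip near the origin the deepest one.
def f(x):
    return np.sin(3 * x) + 0.1 * x ** 2

# Brute-force grid scan over a known function.
xs = np.linspace(-5, 5, 100_001)
ys = f(xs)
x_global = xs[np.argmin(ys)]   # the deepest dip: the global minimum
```

A gradient-based method started from an arbitrary point would instead roll into whichever dip is nearest, i.e. a local minimum, which is exactly why most practical optimization settles for local solutions.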
Deep Learning has shown immense success in various fields and continues to spread its wings. But one of the major issues with training any traditional neural network model is the requirement for colossal amounts of data, and for the many iterative updates across many labeled examples that this entails. Let’s take a look at the classic example of cats-vs-dogs classification. Although we have made our models better and better over the last two decades to increase accuracy, the fundamental problem mentioned above still persists: we still need loads of labeled dogs and cats to get decent accuracy.
AI technologies are slowly and steadily making a foray into our lives. They are already making some important decisions for us, such as whether we are qualified for a mortgage or what kind of movies and songs we prefer, and they are even suggesting email replies to us. Computer vision is one such actively growing subfield of AI that holds a lot of promise. Techniques like facial recognition, object detection, image recognition, and emotion analysis are being used across industries to enhance the consumer experience, reduce costs, and increase security. But what if the results of these systems are prejudiced against a particular race, gender, or region? Well, there is definitely more to it than meets the eye.
The G-computation algorithm was first introduced by Robins in 1986 to estimate the causal effect of a time-varying exposure in the presence of time-varying confounders that are themselves affected by exposure, a scenario where traditional regression-based methods fail. G-computation, or the G-formula, belongs to the G-method family, which also includes inverse-probability-weighted marginal structural models and g-estimation of structural nested models. These methods provide consistent estimates of contrasts (e.g. differences, ratios) of average potential outcomes under a less restrictive set of identification conditions than standard regression methods. In this post, I’ll explain in more detail how G-computation works in causal analysis.
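In its simplest point-exposure form, G-computation is just standardization: fit an outcome model, predict everyone's outcome under exposure and under no exposure, and average the difference. This is a minimal sketch on simulated data of my own invention (one binary confounder, a linear outcome model), not an example from the post.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated data with a confounder L that affects both exposure A and outcome Y.
n = 50_000
L = rng.binomial(1, 0.5, n)
A = rng.binomial(1, 0.3 + 0.4 * L, n)          # exposure depends on L
Y = 2.0 * A + 3.0 * L + rng.normal(0, 1, n)    # true causal effect of A is 2

# Step 1: fit an outcome model E[Y | A, L] (here, ordinary least squares).
X = np.column_stack([np.ones(n), A, L])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Step 2: predict for everyone under A=1 and under A=0, keeping their own L,
# then average. The difference is the G-computation estimate of the effect.
X1 = np.column_stack([np.ones(n), np.ones(n), L])
X0 = np.column_stack([np.ones(n), np.zeros(n), L])
ate = (X1 @ beta).mean() - (X0 @ beta).mean()
```

Because exposed subjects have higher L, the naive difference in observed means is biased upward, while the G-computation estimate recovers the true effect of 2. The time-varying case iterates this predict-and-average step over each time point.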
Do you happen to know the library AllenNLP? If you’re working on Natural Language Processing (NLP), you might have heard the name. However, I suspect only a few people actually use it, and others have tried it before but didn’t know where to start because it has so many functions. For those who aren’t familiar with AllenNLP, I will give a brief overview of the library and explain the advantages of integrating it into your project.
R-squared can help you answer the question ‘How does my model perform compared to a naive model?’. However, R-squared is far from a perfect tool. Probably the main issue is that every dataset contains a certain amount of unexplainable variation. R-squared can’t tell the difference between the explainable and the unexplainable, so it will go on and on trying to perfect its goal. If you keep ‘perfecting’ R-squared by adding more predictors, you’ll end up with misleading results and reduced precision.
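The failure mode is easy to demonstrate: add predictors of pure random noise to a regression and R-squared still creeps up, because least squares will happily "explain" the unexplainable part in-sample. This is a minimal sketch with simulated data of my own invention.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 2 * x + rng.normal(size=n)   # the added noise is genuinely unexplainable

def r_squared(X, y):
    """R-squared of an ordinary-least-squares fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

r2_base = r_squared(x.reshape(-1, 1), y)

# Add 20 predictors of pure random noise: in-sample R-squared can only go up,
# even though the new predictors carry no information about y at all.
noise = rng.normal(size=(n, 20))
r2_padded = r_squared(np.column_stack([x.reshape(-1, 1), noise]), y)
```

The padded model reports a higher R-squared while its out-of-sample precision is worse, which is why adjusted R-squared or held-out evaluation is the usual remedy.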