Artificial General Intelligence (AGI) is Impeding AI Machine Learning Success

I was at a social gathering a few weeks ago and one of the guests approached me saying, ‘I understand you know something about artificial intelligence.’ And then he went on to tell me how scary it is that within a few years we will have computers that can replace human beings. He was talking about what is referred to as artificial general intelligence (AGI).

A literature review on machine learning in supply chain management

Purpose: In recent years, a number of practical logistics applications of machine learning (ML) have emerged, especially in Supply Chain Management (SCM). By linking applied ML methods to the SCM task model, the paper identifies current applications in SCM and visualises potential research gaps. Methodology: Relevant papers applying ML in SCM are extracted through a literature review covering a ten-year period (2009-2019). The ML methods used are linked to the SCM model, creating a reciprocal mapping. Findings: The paper provides an overview of ML applications and methods currently used in the area of SCM. For each task within the SCM task model, successfully applied ML methods from industry and examples from theoretical approaches are displayed. Originality: Linking the SCM task model with current application areas of ML yields an overview of ML in SCM. This facilitates the identification of potential areas of application for companies, as well as potential future research areas for science.

A Comprehensive Guide to Attention Mechanism in Deep Learning for Everyone

What does one of the most famous quotes of the 21st century have to do with deep learning? Well, think about it. We are in the midst of an unprecedented slew of breakthroughs thanks to advancements in computation power. And if we had to trace this back to where it began, it would lead us to the Attention Mechanism. It is, to put it simply, a revolutionary concept that is changing the way we apply deep learning. The attention mechanism is one of the most valuable breakthroughs in Deep Learning research in the last decade. It has spawned the rise of so many recent breakthroughs in natural language processing (NLP), including the Transformer architecture and Google’s BERT. If you’re working in NLP (or want to), you simply must know what the Attention mechanism is and how it works. In this article, we will discuss the basics of several kinds of Attention Mechanisms, how they work, and what the underlying assumptions and intuitions behind them are. We will also provide the mathematical formulations needed to express the Attention Mechanism completely, along with relevant code showing how you can easily implement Attention-related architectures in Python.
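As a taste of what such an implementation looks like, here is a minimal scaled dot-product attention step in NumPy. This is an illustrative sketch, not the article’s own code; the shapes, names, and random inputs are assumptions for demonstration.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal scaled dot-product attention, as used in the Transformer.

    Q, K, V: arrays of shape (seq_len, d_k) / (seq_len, d_v).
    Returns the attended values and the attention weight matrix.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # query-key similarity
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out, w = scaled_dot_product_attention(Q, K, V)
```

Each row of `w` is a probability distribution over the keys, so each output row is a weighted average of the value vectors.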

Gartner and Forrester Begin to Weigh in on Automated Machine Learning (AML)

This has been a big year for AML (automated machine learning). A number of new players have emerged, and pretty much everyone acknowledges that some level of automation is appropriate to enhance the productivity of your data science team. And no, data scientists shouldn’t be alarmed by this trend. None of this approaches letting an untrained person push the button and have a useful model pop out. And yet something keeps bugging me. I’ve been following this trend since its emergence in about 2016 and there’s still no good single source to turn to for comprehensive reviews and comparisons. Another way of saying this is: if Forrester and Gartner haven’t reviewed AML, does this really represent something worth paying attention to?

Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead

It’s pretty clear from the title alone what Cynthia Rudin would like us to do! The paper is a mix of technical and philosophical arguments and comes with two main takeaways for me: firstly, a sharpening of my understanding of the difference between explainability and interpretability, and why the former may be problematic; and secondly some great pointers to techniques for creating truly interpretable models.

Reproducibility, Replicability, and Data Science

Reproducibility and replicability are cornerstones of scientific inquiry. Although there is some debate on terminology, if something is reproducible, it means that the same result can be recreated by following a specific set of steps with a consistent dataset. If something is replicable, it means that the same conclusions or outcomes can be found using slightly different data or processes. Without reproducibility, processes and findings can’t be verified. Without replicability, it is difficult to trust the findings of a single study.
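The reproducibility half of that distinction has a direct analogue in everyday data science code: fixing the random seed makes an analysis repeat exactly. A minimal illustration (the experiment below is a made-up stand-in, not from the article):

```python
import numpy as np

def run_experiment(seed):
    """A toy 'analysis': draw a sample and report its mean.

    Seeding the generator pins down every random draw, so the same
    steps on the same (generated) data recreate the same result.
    """
    rng = np.random.default_rng(seed)
    data = rng.normal(size=100)
    return data.mean()

# Reproducible: identical steps + identical seed -> identical result.
assert run_experiment(42) == run_experiment(42)

# Replicability is the stronger claim: the *conclusion* (here, that the
# mean is near zero) should also hold on slightly different data.
assert abs(run_experiment(7)) < 1.0
```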

Testing for Normality using Skewness and Kurtosis

We’ll cover the following 4 topics:
• What is normality and why should you care about it?
• What are Skewness and Kurtosis, and how can they be used to test for normality?
• How to use two commonly used tests of normality, namely the Omnibus K-squared and Jarque-Bera tests, which are based on Skewness and Kurtosis.
• How to apply these tests to a real-world data set to decide if Ordinary Least Squares regression is the appropriate model for this data set.
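As a quick preview of the two tests, here is a minimal sketch using SciPy. The samples below are simulated for illustration; they are not the article’s real-world data set.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
normal_sample = rng.normal(loc=0.0, scale=1.0, size=500)
skewed_sample = rng.exponential(scale=1.0, size=500)

# Omnibus K-squared test (D'Agostino-Pearson), built from the
# sample's skewness and kurtosis.
k2_stat, k2_p = stats.normaltest(normal_sample)

# Jarque-Bera test, also a function of skewness and kurtosis.
jb_stat, jb_p = stats.jarque_bera(skewed_sample)

# Interpretation: a small p-value rejects normality; a large one
# gives no evidence against it. An exponential sample is heavily
# skewed, so jb_p should be tiny.
```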

Beginner’s Guide to Creating an SVD Recommender System

Ever logged into Netflix and seen it suggest you watch Gravity after you spent the previous night watching Interstellar? Or bought something on Amazon and noticed it recommending products you may be interested in? Or wondered how online ad agencies show us ads based on our browsing habits? It all boils down to something called a recommendation system, which predicts what we may be interested in based on our own and others’ history of interacting with products. As promised, we’ll make a recommender system. And just so you don’t feel bad about yourself, we’ll make a pretty cool one too. We’ll build a collaborative filtering system using the SVD (Singular Value Decomposition) technique; that’s quite a notch above a basic content-based recommender system. Buckle up!
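The core idea can be sketched in a few lines of NumPy: factorize a (mean-filled) user-item rating matrix with a truncated SVD and read predicted ratings off the low-rank reconstruction. The toy ratings and the rank-2 choice below are illustrative assumptions, not the article’s data or final pipeline.

```python
import numpy as np

# Toy user-item rating matrix; 0 marks an unrated item (made-up data).
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

# Simple mean-fill of missing entries before factorizing.
mask = R > 0
filled = np.where(mask, R, R.sum() / mask.sum())

# Truncated SVD: keep only the top-k latent factors.
U, s, Vt = np.linalg.svd(filled, full_matrices=False)
k = 2
pred = (U[:, :k] * s[:k]) @ Vt[:k, :]

# pred[i, j] is the predicted rating of user i for item j, including
# the entries that were originally missing.
```

Real systems typically learn the factors only from observed entries (e.g. via gradient descent, as in the Netflix Prize era), but the low-rank intuition is the same.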

Predict figure skating world championship ranking from season performances

In the previous parts of the project, I tried to predict the ranking in the annual world championship of figure skating based on the scores that skaters earned from previous competition events in the season. The main strategy is to separate the skater effect, the intrinsic ability of each skater, from the event effect, the influence of an event on a skater’s performance, so that a more accurate ranking could be built.
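The strategy described, separating an intrinsic skater effect from an event effect, can be sketched as an additive model fit by least squares. The scores, indicator-column layout, and sizes below are hypothetical illustrations, not the project’s actual data or code.

```python
import numpy as np

# Hypothetical observations: (skater index, event index, score).
observations = [
    (0, 0, 95.0), (0, 1, 99.0),
    (1, 0, 90.0), (1, 1, 94.5),
    (2, 0, 86.5), (2, 1, 91.0),
]
n_skaters, n_events = 3, 2

# Design matrix with one indicator column per skater and per event;
# least squares then decomposes each score into skater + event effects.
X = np.zeros((len(observations), n_skaters + n_events))
y = np.zeros(len(observations))
for row, (s, e, score) in enumerate(observations):
    X[row, s] = 1.0
    X[row, n_skaters + e] = 1.0
    y[row] = score

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
skater_effect = coef[:n_skaters]
ranking = np.argsort(-skater_effect)  # rank skaters by intrinsic ability
```

The model is identifiable only up to a constant shift between the two effect blocks, but differences between skater effects, and hence the ranking, are well defined.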

The Best Classification Metric You’ve Never Heard Of

Congratulations! You’ve built a binary classifier – a fancy-schmancy neural network that uses 32 GPUs with a dedicated power station, or perhaps a simple and robust logistic regression model that runs on your old Thinkpad X220. You’ve designed the model and tuned the parameters; now the time has finally come to measure the classifier’s performance. Don’t get me wrong: ROC curves are the best choice for real-world applications. However, scalar metrics remain popular among the machine-learning community, with the four most common being accuracy, recall, precision, and F1-score. Scalar metrics are ubiquitous in textbooks, web articles, and online courses, and they are the metrics that most data scientists are familiar with. But a couple of weeks ago, I stumbled upon another scalar metric for binary classification: the Matthews Correlation Coefficient. Following my ‘discovery’, I asked around and was surprised to find that many people in the field are not familiar with this classification metric. As a born-again believer, I’m here to spread the gospel!
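For context, the Matthews Correlation Coefficient is computed directly from the binary confusion matrix, and it exposes a failure mode that accuracy hides on imbalanced data. A small self-contained sketch (the toy labels are illustrative):

```python
import math

def matthews_corrcoef(y_true, y_pred):
    """Matthews Correlation Coefficient from the binary confusion matrix:
    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    """
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# A 'classifier' that always predicts the majority class gets 90%
# accuracy on this imbalanced data, but an MCC of 0 (no better than
# a constant guess).
y_true = [1] * 90 + [0] * 10
y_pred = [1] * 100
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
mcc = matthews_corrcoef(y_true, y_pred)
```

(The same metric is available as `sklearn.metrics.matthews_corrcoef`.)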

Byte Pair Encoding – The Dark Horse of Modern NLP

Deriving meaning from rare, infrequent words. The last few years have been an exciting time to be in the field of NLP. The evolution from sparse frequency-based word vectors to dense semantic word representations, with pre-trained models like Word2vec and GloVe, set the foundation for learning the meaning of words. For many years, they served as reliable embedding layer initializations for training models in the absence of large amounts of task-specific data. But since word embedding models pre-trained on Wikipedia were limited by vocabulary size or by the frequency of word occurrences, rare words like athazagoraphobia would never be captured, resulting in unknown <unk> tokens when they occurred in text.
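Byte pair encoding sidesteps the fixed-vocabulary problem by building subword units bottom-up: start from characters and repeatedly merge the most frequent adjacent pair. A minimal sketch of the merge-learning loop (the tiny corpus is a made-up example):

```python
from collections import Counter

def bpe_merges(corpus, num_merges):
    """Learn BPE merges from a tiny corpus (minimal sketch).

    Each word starts as a sequence of characters plus an end-of-word
    marker; at every step the most frequent adjacent symbol pair is
    merged into a single new symbol.
    """
    vocab = Counter(tuple(word) + ("</w>",) for word in corpus)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

merges = bpe_merges(["low", "low", "lower", "lowest"], num_merges=3)
```

Because every unseen word still decomposes into known subwords (down to single characters), nothing maps to `<unk>`.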

Using NLP to understand laws

An unsupervised analysis of the Accessibility for Ontarians with Disabilities Act. The process of legal reasoning and decision making is heavily reliant on information stored in text. Tasks like due diligence, contract review, and legal discovery, which are traditionally time-consuming, can benefit from NLP models and be automated, saving a huge amount of time. But can NLP be leveraged to improve the public’s understanding of laws? The idea is to obtain an abstract representation of laws that would make it easier to extract the rules and obligations defined in the text, understand which entities are responsible for compliance, highlight patterns of similarity across industries and differences between public and private responsibilities, or even identify parts that are unclear.

Probabilistic Thinking: the one critical right left behind by most people

Let’s use one simple question to test whether you have left behind your essential right to think probabilistically, and look at what you might be missing when you make your next decision.