6 essential practices to successfully implement machine learning solutions in your organization.

Executive’s Guide to Successfully Becoming an AI-Driven Enterprise. McKinsey Insights recently published its Global AI Survey, discussing many aspects of the impact AI is generating across companies. What really caught my eye was the comparison between AI high-performing companies and the rest. According to it, companies with a clear enterprise-level road map of use cases, solid cross-functional collaboration between the analytics and business units, a standard AI toolset for professionals to use, an understanding that AI models need frequent updating, and systematic tracking of a comprehensive set of well-defined AI KPIs perform 3.78x better than other players in the market.

I’m Bayesed and I know it

If you’re too young to realize where the title reference comes from, I’m gonna make you lose your mind. It has something to do with parties and rocks and anthems. Actually, no, I just want you to have a good time so I’ll instead ask you to take a look at the title picture. What did you notice? I am obviously drawing your attention to both the title and picture for a reason. With the title, you might not have realized there was a ‘pattern’ to it till I pointed it out. With the picture, if you only took a quick glance, you might have seen just sheep. If you managed to figure both out without me having to point it out, you can stop reading.

Writing Linguistic Rules for Natural Language Processing

When I first started exploring data science towards the end of my Ph.D. program in linguistics, I was pleased to discover the role of linguistics – specifically, linguistic features – in the development of NLP models. At the same time, I was a bit perplexed by why there was relatively little talk of exploiting syntactic features (e.g., level of clausal embedding, presence of coordination, type of speech act, etc.) compared to other types of features, such as how many times a certain word occurs in the text (lexical features), word similarity measures (semantic features), and even where in the document a word or phrase occurs (positional features). For example, in a sentiment analysis task, we may use a list of content words (adjectives, nouns, verbs, and adverbs) as features for a model to predict the semantic orientation of user feedback (i.e., positive or negative). In another feedback classification task, we may curate domain-specific lists of words or phrases to train a model that can direct user comments to appropriate divisions of support, e.g., billing, technical, or customer service.
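To make the lexical-feature idea concrete, here is a minimal sketch of the sentiment case: count content words against small hand-made polarity lexicons and read off an orientation. The word lists below are hypothetical toy examples, not from any published resource.

```python
# Toy lexical features for sentiment: count tokens that appear in small
# hand-made polarity lexicons and score feedback by the difference.
POSITIVE = {"great", "helpful", "fast", "love", "excellent"}
NEGATIVE = {"slow", "broken", "terrible", "hate", "confusing"}

def lexical_polarity(text):
    """Return (#positive tokens, #negative tokens, orientation)."""
    tokens = text.lower().split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    orientation = "positive" if pos >= neg else "negative"
    return pos, neg, orientation

print(lexical_polarity("The support was great and surprisingly fast"))
# → (2, 0, 'positive')
```

A real system would of course use a curated, domain-specific lexicon and feed the counts into a classifier rather than thresholding them directly.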

Dynamic Meta Embeddings in Keras

Many NLP solutions make use of pretrained word embeddings. The choice of which one to use often determines final performance and is typically settled only after many trials and much manual tuning. Researchers at Facebook AI argued that the best way to make this kind of selection is to let the neural network figure it out by itself. They introduced dynamic meta-embeddings, a simple yet effective method for the supervised learning of embedding ensembles, which leads to state-of-the-art performance within the same model class on a variety of tasks. This simple but extremely efficient method learns a linear combination of a set of selected word embeddings that outperforms the naive concatenation of the same embeddings. As mentioned, the authors proved the validity of their solution on various NLP tasks. We limit ourselves to adopting these techniques in a text classification problem, where we have 2 pretrained embeddings and want to combine them intelligently in order to boost the final performance.
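The mechanics can be sketched in a toy NumPy forward pass: project each pretrained embedding to a shared dimension, score each projection with a learned vector, softmax the scores into per-token mixing weights, and sum. All matrices below are random placeholders standing in for trainable Keras layers, and GloVe/fastText are just example sources.

```python
import numpy as np

rng = np.random.default_rng(0)
d1, d2, d_shared, seq_len = 300, 100, 64, 5

emb1 = rng.normal(size=(seq_len, d1))   # e.g. GloVe vectors for 5 tokens
emb2 = rng.normal(size=(seq_len, d2))   # e.g. fastText vectors for the same tokens

P1 = rng.normal(size=(d1, d_shared))    # projection matrices (trainable in Keras)
P2 = rng.normal(size=(d2, d_shared))
a = rng.normal(size=(d_shared,))        # attention vector (trainable)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

proj = np.stack([emb1 @ P1, emb2 @ P2])        # (2, seq_len, d_shared)
scores = proj @ a                               # (2, seq_len): one score per embedding per token
weights = softmax(scores, axis=0)               # mixing weights sum to 1 over the 2 embeddings
meta = (weights[..., None] * proj).sum(axis=0)  # (seq_len, d_shared) meta-embedding
print(meta.shape)  # → (5, 64)
```

In Keras the projections would be `Dense` layers and the weights would be learned end-to-end with the classifier, so the network decides per token how much of each embedding to trust.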

Machine Learning on Encrypted Data Without Decrypting It

Suppose you have just developed a spiffy new machine learning model (using Flux.jl, of course) and now want to start deploying it for your users. How do you go about doing that? Probably the simplest thing would be to ship your model to your users and let them run it locally on their data. However, there are a number of problems with this approach:
• ML models are large and the user’s device may not have enough storage or computation to actually run the model.
• ML models are often updated frequently and you may not want to send the large model across the network that often.
• Developing ML models takes a lot of time and computational resources, which you may want to recover by charging your users for making use of your model.

Learning Data Structure Alchemy

We propose a solution based on first principles and AI to the decades-old problem of data structure design. Instead of working on individual designs that each can only be helpful in a small set of environments, we propose the construction of an engine, a Data Alchemist, which learns how to blend fine-grained data structure design principles to automatically synthesize brand new data structures.

Interpretability: Cracking open the black box – Part III

Previously, we looked at the pitfalls of the default ‘feature importance’ in tree-based models and covered permutation importance, LOOC importance, and Partial Dependence Plots. Now let’s switch lanes and look at a few model-agnostic techniques that take a bottom-up approach to explaining predictions. Instead of looking at the model and trying to come up with global explanations like feature importance, this set of methods looks at each single prediction and then tries to explain it.
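The bottom-up idea can be illustrated crudely: explain one prediction by replacing each feature of a single instance with its dataset mean and measuring how the prediction moves. This is only a toy local attribution, not LIME or SHAP proper, and the linear “fitted model” below is a hypothetical stand-in.

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
w = np.array([2.0, 0.5, 0.0])            # pretend these are fitted coefficients

def model(X):                             # stand-in for any fitted black-box model
    return X @ w

x = X[0]                                  # the single prediction we want to explain
base_pred = model(x[None, :])[0]

contributions = []
for j in range(len(x)):
    x_ref = x.copy()
    x_ref[j] = X[:, j].mean()             # "remove" feature j by averaging it out
    contributions.append(base_pred - model(x_ref[None, :])[0])

print([round(c, 3) for c in contributions])  # per-feature push on this one prediction
```

Feature 2 has zero weight, so its contribution comes out as exactly zero; the real methods in this series generalize this perturb-and-compare logic with principled sampling and weighting.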

What does a Fine-tuned BERT model look at?

There is a lot of buzz around NLP of late, especially after the advances in transfer learning and the advent of architectures like the Transformer. As someone from the applied side of machine learning, I feel it is not only important to have models that can surpass state-of-the-art results on many benchmarks; it is also important to have models that are trustable, understandable, and not a complete black box. This post is an attempt to understand what BERT learns during task-specific training. Let’s start with how attention is implemented in a Transformer and how it can be leveraged for understanding the model (feel free to skip this section if you are already familiar with it).
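For reference, scaled dot-product attention, the quantity these “what does BERT look at” analyses inspect, can be computed in a few lines of NumPy (toy random tensors, a single head):

```python
import numpy as np

rng = np.random.default_rng(1)
seq_len, d_k = 4, 8
Q = rng.normal(size=(seq_len, d_k))   # queries, one row per token
K = rng.normal(size=(seq_len, d_k))   # keys
V = rng.normal(size=(seq_len, d_k))   # values

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

A = softmax(Q @ K.T / np.sqrt(d_k))   # (seq_len, seq_len) attention weights
out = A @ V                           # each output row is a weighted mix of values
print(A.round(2))                     # row i shows where token i "looks"
```

In BERT, Q, K, and V come from learned projections of the previous layer, and it is exactly the matrix `A` per head and layer that interpretability work visualizes.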

Variance, Attractors and Behavior of Chaotic Statistical Systems

We study the properties of a typical chaotic system to derive general insights that apply to a large class of unusual statistical distributions. The purpose is to create a unified theory of these systems. These systems can be deterministic or random, yet due to their gentle chaotic nature, they exhibit the same behavior in both cases. They lead to new models with numerous applications in Fintech, cryptography, simulation and benchmarking tests of statistical hypotheses. They are also related to numeration systems. One of the highlights in this article is the discovery of a simple variance formula for an infinite sum of highly correlated random variables. We also try to find and characterize attractor distributions: these are the limiting distributions for the systems in question, just like the Gaussian attractor is the universal attractor with finite variance in the central limit theorem framework. Each of these systems is governed by a specific functional equation, typically a stochastic integral equation whose solutions are the attractors. This equation helps establish many of their properties. The material discussed here is state-of-the-art and original, yet presented in a format accessible to professionals with limited exposure to statistical science. Physicists, statisticians, data scientists and people interested in signal processing, chaos modeling, or dynamical systems will find this article particularly interesting. Connection to other similar chaotic systems is also discussed.
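The article’s specific variance formula is not reproduced in this summary; it specializes the standard identity for the variance of a sum of correlated random variables, where the correlation terms are what make these systems hard:

```latex
\operatorname{Var}\!\Big(\sum_{i=1}^{n} X_i\Big)
  = \sum_{i=1}^{n} \operatorname{Var}(X_i)
  + 2 \sum_{1 \le i < j \le n} \operatorname{Cov}(X_i, X_j)
```

For independent variables the covariance terms vanish; the article’s contribution concerns the infinite, highly correlated case, where the double sum must be tamed.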

Is Data Science dying?

People who have been in the industry for a long time can relate to this. Many years ago the industry went crazy for a similar skill set known as Business Analytics. Nowadays, the term Data Scientist is exploding on the internet, and it is the modern job that seems most promising.

History of AI; Labeling ‘AI’ correctly; Excerpts from upcoming ‘AI Bill of Rights’

What is the difference between artificial intelligence and true intelligence? Artificial intelligence, to me, is when a group purposefully tries to make a single individual more intelligent. Before we can talk about the history of AI, we must accurately define it as well as label it. We must decipher what AI stands for; then we can go back to the roots of AI and where its future is heading.

How PyTorch lets you build and experiment with a neural net

We show, step-by-step, a simple example of building a classifier neural network in PyTorch and highlight how easy it is to experiment with advanced concepts such as custom layers and activation functions.
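To give a flavor of the kind of experiment meant here, below is the forward pass of a tiny two-layer classifier with a custom activation, sketched in plain NumPy. The “shifted ReLU” is a hypothetical example of my own; in PyTorch it would be a few-line `nn.Module` with autograd handling the backward pass, which is precisely the ease of experimentation the article demonstrates.

```python
import numpy as np

def shifted_relu(x, shift=0.1):
    # hypothetical custom activation: ReLU translated by a small offset
    return np.maximum(0.0, x - shift)

rng = np.random.default_rng(7)
W1, b1 = rng.normal(size=(4, 16)), np.zeros(16)   # layer 1: 4 -> 16
W2, b2 = rng.normal(size=(16, 3)), np.zeros(3)    # layer 2: 16 -> 3 classes

def forward(x):
    h = shifted_relu(x @ W1 + b1)                  # hidden layer + custom activation
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)       # softmax class probabilities

probs = forward(rng.normal(size=(2, 4)))           # two random input samples
print(probs.shape)  # → (2, 3)
```

The point of PyTorch is that swapping `shifted_relu` for any other differentiable function requires no manual gradient code at all.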