I had no idea how to build a Machine Learning Pipeline. But here’s what I figured.

As a postgraduate studying Artificial Intelligence (AI), my exposure to Machine Learning (ML) is largely academic. Yet, when given the task of creating a simple ML pipeline for a time series forecast model, I realised how clueless I was. I could also barely find any specific information or code on this topic, so I decided to write this article. It presents a basic structure for how a simple ML pipeline can be created (more information may be supplemented over time).
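
For a concrete flavour of what such a pipeline can look like, here is a minimal sketch assuming scikit-learn, pandas, and lag features built from a univariate series; the names, data, and model choice below are illustrative assumptions, not taken from the article.

```python
# Minimal, illustrative sketch of a simple ML pipeline for a time series
# forecast (names and structure are assumptions, not the article's code).
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestRegressor

def make_lag_features(series: pd.Series, n_lags: int = 3) -> pd.DataFrame:
    """Turn a univariate series into a supervised-learning table of lag features."""
    df = pd.DataFrame({"y": series})
    for lag in range(1, n_lags + 1):
        df[f"lag_{lag}"] = series.shift(lag)
    return df.dropna()

# Hypothetical data: a plain numeric series standing in for real observations.
data = pd.Series(range(100), dtype=float)
table = make_lag_features(data)
X, y = table.drop(columns="y"), table["y"]

# The "pipeline" itself: preprocessing and model bundled into one object.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", RandomForestRegressor(n_estimators=100, random_state=0)),
])
pipeline.fit(X[:-10], y[:-10])      # train on all but the last 10 points
print(pipeline.predict(X[-10:]))    # forecast the held-out tail
```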


John Allspaw: People are the adaptable element of complex systems

All work in software involves people facing multiple tangled layers of trade-offs and coping with complexity. Uncertainty, ambiguity, and dilemmas are part of the everyday experience in modern enterprises. Exploring and understanding how people successfully cope with these challenges is core to Resilience Engineering. I will talk about the apparent irony of finding sources of resilience (sustaining the capacity to adapt to the unforeseen) by examining closely what would otherwise be categorized as failure: the messy details of critical incidents.


A Visual Guide to Using BERT for the First Time

Progress in machine learning models that process language has been accelerating rapidly over the last couple of years. This progress has left the research lab and started powering some of the leading digital products. A great example of this is the recent announcement of how the BERT model is now a major force behind Google Search. Google believes this step (or progress in natural language understanding as applied in search) represents ‘the biggest leap forward in the past five years, and one of the biggest leaps forward in the history of Search’. This post is a simple tutorial for how to use a variant of BERT to classify sentences. This is an example that is basic enough as a first intro, yet advanced enough to showcase some of the key concepts involved.
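
As a taste of the workflow the post walks through, here is a compressed sketch that embeds a few placeholder sentences with DistilBERT (via the Hugging Face transformers library) and trains a simple classifier on the embeddings; the sentences, labels, and classifier choice are illustrative assumptions.

```python
# Sketch: use a BERT variant (DistilBERT) to embed sentences, then train a
# simple classifier on top. Sentences and labels are placeholders.
import torch
from transformers import DistilBertTokenizer, DistilBertModel
from sklearn.linear_model import LogisticRegression

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
bert = DistilBertModel.from_pretrained("distilbert-base-uncased")

sentences = ["a visual masterpiece", "utterly boring",
             "loved every minute", "a waste of time"]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

# Tokenize with padding so every sentence fits in one tensor.
tokens = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    output = bert(**tokens)

# Use the hidden state of the first ([CLS]) token as a fixed-size sentence embedding.
features = output.last_hidden_state[:, 0, :].numpy()

clf = LogisticRegression().fit(features, labels)
print(clf.predict(features))
```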


CCSM: Scalable statistical anomaly detection to resolve app crashes faster

Our family of mobile apps is used by more than 2 billion people every month – on a wide variety of mobile devices. We employ rigorous code review and testing processes, but, as with any system, software bugs still sometimes slip through and may even cause our apps to crash. Resolving these crashes and other reliability issues in a timely manner is a top priority. To help us respond as quickly as possible, we have been creating a collection of services that use machine learning (ML) to aid engineers in diagnosing and resolving software reliability and performance issues. As part of this collection, we recently implemented continuous contrast set mining (CCSM), an anomaly-detection framework that uses contrast set mining (CSM) techniques to locate statistically ‘interesting’ (defined by several statistical properties) sets of features in groups. A novel algorithm we’ve developed extends standard contrast set mining from categorical data to continuous data, inspired by tree search algorithms and multiple hypothesis testing. Our model is more than 40 times faster than naive baseline approaches, enabling us to scale to challenging new data sets and use cases.
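
The CCSM algorithm itself is not spelled out in this summary, but the underlying contrast-set-mining idea (flag feature values whose frequency differs significantly between groups) can be illustrated in a few lines. The toy data, features, and significance threshold below are purely hypothetical and are not Facebook's implementation.

```python
# Toy illustration of the general contrast-set-mining idea (not Facebook's
# CCSM algorithm): flag categorical feature values whose distribution differs
# significantly between a "crash" group and a "no-crash" group.
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical crash-report data.
df = pd.DataFrame({
    "crashed": [1, 1, 1, 0, 0, 0, 0, 1, 0, 0],
    "device":  ["A", "A", "A", "B", "B", "A", "B", "A", "B", "B"],
    "os_ver":  ["9", "10", "9", "10", "10", "9", "10", "9", "9", "10"],
})

for feature in ["device", "os_ver"]:
    table = pd.crosstab(df[feature], df["crashed"])
    chi2, p_value, _, _ = chi2_contingency(table)
    # Naive threshold; a real system would correct for multiple hypothesis testing.
    if p_value < 0.05:
        print(f"{feature} looks statistically 'interesting' (p={p_value:.3f})")
```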


Human-Machine Collaboration: The Future of Work

Today, organizations are rethinking work as we know it. We are seeing a fundamental shift in the work model to one that fosters human-machine collaboration, enables new skills and worker experiences, and supports an environment unbounded by time or physical space. Many companies call this the ‘future of work.’ However, the reality is that we are seeing many of these changes occurring in the present day. ‘Digital workers’ are making up a growing share of the workforce. We define a ‘digital worker’ as technology – including artificial intelligence (AI), intelligent process automation (IPA), augmented reality/virtual reality (AR/VR), and software robotics – that automates and augments work previously accomplished by humans.


Unsupervised Sentiment Analysis

How to extract sentiment from data without any labels. One of the common applications of NLP methods is sentiment analysis, where you try to extract information about the writer’s emotions from the data. At least at the beginning, you would try to distinguish between positive and negative sentiment, and eventually neutral as well, or even retrieve a score associated with a given opinion based only on the text.
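
For a sense of what label-free sentiment extraction can look like in practice, here is a minimal lexicon-based sketch using NLTK’s VADER analyzer; this is one common unsupervised approach, not necessarily the technique the article itself uses, and the example reviews are placeholders.

```python
# Minimal label-free sentiment scoring with a lexicon-based analyzer (VADER).
# This is one unsupervised approach; the article may use a different one.
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
analyzer = SentimentIntensityAnalyzer()

reviews = [
    "The product arrived quickly and works great.",
    "Terrible experience, it broke after one day.",
    "It's okay, nothing special.",
]
for text in reviews:
    scores = analyzer.polarity_scores(text)  # dict with neg/neu/pos/compound
    c = scores["compound"]
    label = "positive" if c > 0.05 else ("negative" if c < -0.05 else "neutral")
    print(label, round(c, 3), text)
```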


Stock market forecasting using Time Series analysis

Most research on the stock or share market focuses narrowly on buy and sell decisions, but it fails to address the dimensionality and expectations of a new investor. The common perception of the stock market in society is that it is too risky for investment or unsuitable for trading, so many people are not even interested. Understanding the seasonal variance and steady flow of an index can help both existing and naïve investors make an informed decision about investing in the stock/share market. For these kinds of problems, time series analysis is the best tool for forecasting the trend, or even future values. The trend chart will provide adequate guidance for the investor. So let us understand this concept in great detail and use a machine learning technique to forecast stocks.
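
As a small illustration of the kind of trend and seasonal view the article is pointing at, here is a sketch that decomposes a synthetic price series with statsmodels; the data and seasonal period are made up, not real market figures.

```python
# Sketch: decompose a (synthetic) price series into trend and seasonal parts,
# the kind of view useful for judging an index. Not real market data.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic "index" with an upward trend plus a yearly-ish cycle.
dates = pd.date_range("2015-01-01", periods=60, freq="M")
values = np.linspace(100, 200, 60) + 10 * np.sin(np.arange(60) * 2 * np.pi / 12)
prices = pd.Series(values, index=dates)

result = seasonal_decompose(prices, model="additive", period=12)
print(result.trend.dropna().tail())   # smoothed long-run direction
print(result.seasonal.head(12))       # repeating within-year pattern
```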


Time Series Forecasting with LSTMs using TensorFlow 2 and Keras in Python

Introduction to data preparation and prediction for Time Series forecasting using LSTMs. Learn about Time Series and making predictions using Recurrent Neural Networks. Prepare sequence data and use LSTMs to make simple predictions.
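
A minimal sketch of that recipe, assuming TensorFlow 2 and a synthetic sine-wave series in place of real data, might look like this; the window size and hyperparameters are illustrative.

```python
# Sketch of the basic recipe: slice a series into (window, next-value) pairs,
# then fit a small LSTM. Synthetic data; hyperparameters are illustrative.
import numpy as np
import tensorflow as tf

def make_windows(series, window=10):
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return np.array(X)[..., np.newaxis], np.array(y)  # LSTM expects 3-D input

series = np.sin(np.linspace(0, 20, 500)).astype("float32")
X, y = make_windows(series)

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(X.shape[1], 1)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32, verbose=0)

print(model.predict(X[-1:]))  # one-step-ahead prediction for the last window
```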


Dataset unavailable? No problem!

What if you couldn’t find a dataset even on those sites? What would you do? Would you blame the internet for not giving you the dataset you need, or would you curse the whole universe? Well, I would do neither. I would create my own dataset, and trust me, it wouldn’t take longer than 5 minutes. Now, let me show you how to create your own dataset quickly. We’ll be using a Python package called Faker.
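
Here is a minimal sketch of that idea; the columns are arbitrary examples, and Faker offers many other providers to match your use case.

```python
# Sketch: generate a small synthetic dataset with Faker. The columns here are
# arbitrary examples; swap in whichever Faker providers fit your needs.
import pandas as pd
from faker import Faker

fake = Faker()
rows = [
    {
        "name": fake.name(),
        "email": fake.email(),
        "company": fake.company(),
        "signup_date": fake.date_between(start_date="-2y", end_date="today"),
    }
    for _ in range(100)
]
df = pd.DataFrame(rows)
print(df.head())
```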


Microsoft Introduces Icebreaker to Address the Famous Ice-Start Challenge in Machine Learning

The new technique allows the deployment of machine learning models that operate with minimal training data. The acquisition and labeling of training data remains one of the major challenges for the mainstream adoption of machine learning solutions. Within the machine learning research community, several efforts such as weakly supervised learning and one-shot learning have emerged to address this issue. Microsoft Research recently incubated a group called Minimum Data AI to work on different solutions for machine learning models that can operate without the need for large training datasets. Recently, that group published a paper unveiling Icebreaker, a framework for ‘wise training data acquisition’ which allows the deployment of machine learning models that can operate with little or no training data. The current evolution of machine learning research and technologies has prioritized supervised models that need to know quite a bit about the world before they can produce any relevant knowledge. In real-world scenarios, the acquisition and maintenance of high-quality training datasets proves quite challenging and sometimes impossible. In machine learning theory, we refer to this dilemma as the ice(cold)-start problem.


Moving AI and ML from research into production

Dean Wampler discusses the challenges and opportunities businesses face when moving AI from discussions to production.