What does it take to build a Chatbot for your application?

You might have spent hours on end building an application with the best in-app user experience, but you fall short when it comes to the latest and hottest feature in the market: a conversational interface that you can integrate with messaging apps, email, voice assistants, IVR, and many other channels. What now? Should you start researching how to build a conversational agent? What should you think about?


Ecom Data Series: What is RFM segmentation?

RFM is a data modeling method used to analyze customer value. It stands for recency, frequency, and monetary: three metrics that summarize what your customers did. Recency measures the time (usually in days) between a customer's last order and today. Frequency counts how many orders the customer placed in total, and monetary is the average amount they spent per order.
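
As a minimal sketch (the order history below is made up for illustration), the three metrics can be computed directly from a customer's orders:

```python
from datetime import date

# Hypothetical order history for one customer: (order_date, amount)
orders = [
    (date(2019, 5, 1), 120.0),
    (date(2019, 6, 15), 80.0),
    (date(2019, 7, 10), 100.0),
]
today = date(2019, 7, 20)

recency = (today - max(d for d, _ in orders)).days   # days since last order
frequency = len(orders)                              # total number of orders
monetary = sum(a for _, a in orders) / len(orders)   # average spend per order

print(recency, frequency, monetary)  # 10 3 100.0
```

Customers are then typically scored by binning each of the three values, but the raw metrics are as simple as this.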


Correlation & Causation – How Alcohol Affects Life Expectancy

We hear the phrase "correlation does not imply causation" over and over again as beginner statisticians and data scientists. But what does it actually mean? This small analysis explores the topic with the help of R and simple regressions, focusing on how alcohol impacts health.
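
As a toy illustration of the underlying pitfall (not from the article, which uses R; this Python sketch and its "lifestyle" confounder are invented), a simple regression can pick up a strong association created entirely by a third variable:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical confounder: overall lifestyle drives both drinking and health.
lifestyle = rng.normal(size=n)
alcohol = 0.8 * lifestyle + rng.normal(size=n)        # drinking driven by lifestyle
life_exp = 70 + 2.0 * lifestyle + rng.normal(size=n)  # health driven by lifestyle only

# A simple regression of life expectancy on alcohol finds a clear slope,
# even though alcohol has no direct effect in this simulation.
slope = np.polyfit(alcohol, life_exp, 1)[0]
print(round(slope, 2))
```

The regression sees correlation; only knowledge of the data-generating process reveals there is no causation.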


Machine learning using TensorFlow for Absolute Beginners

Welcome to this article, where you will learn how to train your first machine learning model using TensorFlow and use it for predictions! As the title suggests, this tutorial is aimed at someone with no prior understanding of how to use a machine learning model. The only prerequisite is a basic understanding of the Python programming language. I have tried to keep things simple here and only introduce the basic concepts of machine learning and neural networks. What is TensorFlow? TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state of the art in ML and lets developers easily build and deploy ML-powered applications. Learn more about TensorFlow here.
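
To give a flavor of what such a first model does under the hood, here is a minimal sketch of the kind of training loop TensorFlow automates, written in plain numpy (the toy task, learning y = 2x - 1, is a common beginner exercise; this code is my own sketch, not from the tutorial):

```python
import numpy as np

# Toy data for the mapping y = 2x - 1
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0])

w, b = 0.0, 0.0  # the model's trainable parameters
lr = 0.01        # learning rate

for _ in range(2000):
    pred = w * xs + b
    grad_w = 2 * np.mean((pred - ys) * xs)  # d(MSE)/dw
    grad_b = 2 * np.mean(pred - ys)         # d(MSE)/db
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # close to 2.0 and -1.0
```

TensorFlow's value is that it computes these gradients automatically for arbitrarily deep networks; the loop above is what a single "neuron" learns by gradient descent.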


Sentiment analysis and the moon landing

Over the last couple of days, media around the world have joined in celebrating the 50th anniversary of the epic lunar mission that took us to the moon for the first time. It is clear from the archived material that the milestone was widely celebrated as a pinnacle of human achievement. At the same time, however, there were voices raised about the huge costs associated with the Apollo program. Could the resources have been spent more wisely? The argument goes that we are still struggling with challenges here on earth, so why go looking for more in space? How could we go about finding out what the general opinion is today? There have been some dramatic changes since 1969 that allow us to get a broader picture of the most widespread sentiments. We have a wealth of information through social media, combined with the advent of powerful algorithms that allow us to automate the processing of data in a variety of ways. Imagine having machines that can tell us about the emotional content of a statement! The application of sentiment analysis is playing an ever-increasing role in our society, ranging from product development to gauging the popularity of public policies. We will explore how sentiment analysis is done and see how it can be put to use to decode the public mood based on Twitter data collected in and around the 50th anniversary.
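
As a toy illustration of the core idea (the word lists below are invented, not a real sentiment lexicon, and real systems are far more sophisticated), a minimal lexicon-based scorer just counts positive and negative words:

```python
# Toy sentiment lexicon (made-up word lists for illustration only).
POSITIVE = {"great", "epic", "achievement", "celebrating", "inspiring"}
NEGATIVE = {"waste", "costs", "struggling", "pointless"}

def sentiment(text: str) -> int:
    """Score a text: +1 per positive word, -1 per negative word."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

print(sentiment("an epic achievement worth celebrating"))      # 3
print(sentiment("a waste of money while we are struggling"))   # -2
```

Applied at scale to tweets, even a scorer this crude can reveal the rough balance of opinion; modern approaches replace the word lists with learned models.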


Transforming Skewed Data for Machine Learning

Skewed data is common in data science; skew is the degree of distortion from a normal distribution. For example, below is a plot of house prices from Kaggle's House Price Competition that is right-skewed, meaning a minority of the values are very large.
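
As a quick sketch (using synthetic lognormal "prices" rather than the actual Kaggle data), a log transform pulls a right-skewed distribution back toward symmetric:

```python
import numpy as np

def skewness(x):
    """Sample skewness: the third standardized moment."""
    x = np.asarray(x, dtype=float)
    return np.mean(((x - x.mean()) / x.std()) ** 3)

rng = np.random.default_rng(1)
# Synthetic right-skewed, price-like data (stand-in for the Kaggle prices).
prices = rng.lognormal(mean=12, sigma=0.5, size=5000)

raw_skew = skewness(prices)
log_skew = skewness(np.log1p(prices))  # log1p compresses the long right tail

print(round(raw_skew, 2), round(log_skew, 2))
```

The raw data has strongly positive skew, while the log-transformed data is close to symmetric, which many models and statistical tests prefer.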


Microsoft and the learnings from its failed Tay artificial intelligence bot

In March 2016, Microsoft sent its artificial intelligence (AI) bot Tay out into the wild to see how it interacted with humans. According to Microsoft Cybersecurity Field CTO Diana Kelley, the team behind Tay wanted the bot to pick up natural language and thought Twitter was the best place for it to go. ‘A great example of AI and ML going awry is Tay,’ Kelley told RSA Conference 2019 Asia Pacific and Japan in Singapore last week. Tay was targeted at American 18-to-24-year-olds and was ‘designed to engage and entertain people where they connect with each other online through casual and playful conversation’. In less than 24 hours after its arrival on Twitter, Tay gained more than 50,000 followers and produced nearly 100,000 tweets. Tay started out fairly sweet; it said hello and called humans cool. But Tay began interacting with other Twitter users, and its machine learning (ML) architecture hoovered up all the interactions: good, bad, and awful. Some of Tay’s tweets were highly offensive, and in less than 16 hours Tay had turned into a brazen anti-Semite and was taken offline for re-tooling.


If you can identify what’s in these images, you’re smarter than AI

Computer vision has improved massively in recent years, but it’s still capable of making serious errors. So much so that there’s a whole field of research dedicated to studying pictures that are routinely misidentified by AI, known as ‘adversarial images’. Think of them as optical illusions for computers: while you see a cat up a tree, the AI sees a squirrel. There’s a great need to study these images. As we put machine vision systems at the heart of new technology like AI security cameras and self-driving cars, we’re trusting that computers see the world the same way we do. Adversarial images prove that they don’t.


Method evaluation, parameterization, and result validation in unsupervised data mining: A critical survey

Machine Learning (ML) and Data Mining (DM) build tools intended to help users solve data-related problems that are infeasible for ‘unaugmented’ humans. Tools need manuals, however, and in the case of ML/DM methods, this means guidance with respect to which technique to choose, how to parameterize it, and how to interpret derived results to arrive at knowledge about the phenomena underlying the data. While such information is available in the literature, it has not yet been collected in one place. We survey three types of work for clustering and pattern mining: (1) comparisons of existing techniques, (2) evaluations of different parameterization options and studies providing guidance for setting parameter values, and (3) work comparing mining results with the ground truth. We find that although interesting results exist, as a whole the body of work on these questions is too limited. In addition, we survey recent studies in the field of community detection, as a contrasting example. We argue that an objective obstacle for performing needed studies is a lack of data and survey the state of available data, pointing out certain limitations. As a solution, we propose to augment existing data by artificially generated data, review the state-of-the-art in data generation in unsupervised mining, and identify shortcomings. In more general terms, we call for the development of a true ‘Data Science’ that – based on work in other domains, results in ML, and existing tools – develops needed data generators and builds up the knowledge needed to effectively employ unsupervised mining techniques.
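
As a minimal illustration of the kind of artificial data generation the authors call for (the function and its parameters below are hypothetical, not from the survey), one can generate clustered data with a known ground truth against which clustering methods can then be evaluated:

```python
import numpy as np

def make_blobs(n_per_cluster, centers, spread, seed=0):
    """Generate simple Gaussian-blob data with known cluster labels."""
    rng = np.random.default_rng(seed)
    points, labels = [], []
    for label, center in enumerate(centers):
        points.append(
            rng.normal(loc=center, scale=spread, size=(n_per_cluster, len(center)))
        )
        labels += [label] * n_per_cluster
    return np.vstack(points), np.array(labels)

# 300 two-dimensional points in three well-separated clusters.
X, y = make_blobs(100, centers=[(0, 0), (5, 5), (0, 5)], spread=0.5)
print(X.shape, sorted(set(y.tolist())))  # (300, 2) [0, 1, 2]
```

Because the generating process is known, any clustering result can be scored against `y`, which is exactly the kind of controlled benchmark the survey argues is missing for real data.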


Discovering the influence of sarcasm in social media responses

Sarcasm in verbal and nonverbal communication is known to attract more attention and create deeper influence than other negative responses. Many people are adept at including sarcasm in written communication; thus, sarcastic comments have the potential to stimulate the virality of social media content. Although diverse computational approaches have been used to detect sarcasm in social media, the use of text mining to explore the influential role of sarcasm in spreading negative content is limited. Using tweets during a service disruption at a leading Australian organization as a case study, we explore this phenomenon with a text mining framework that combines statistical modeling and natural language processing (NLP) techniques. Our work targets two main outcomes: quantifying the influence of sarcasm and exploring how topical relationships in the conversations change over time. We found that sarcastic expressions during the service disruption were more frequent than on regular days, and that negative sarcastic tweets attracted significantly more social media responses than literal negative expressions. The content analysis showed that consumers who initially complained sarcastically about the outage tended eventually to widen the negative sarcasm, in a cascading effect, toward the organization’s internal issues and strategies. Organizations could use such insights to enable proactive decision-making during crises. Moreover, detailed exploration of these impacts would elevate current text mining applications to better capture the impact of sarcasm expressed by stakeholders in a social media environment, which can significantly affect an organization’s reputation and goodwill.


Decentralized and Collaborative AI: How Microsoft Research is Using Blockchains to Build More Transparent Machine Learning Models

The biggest challenge of the next decade of artificial intelligence (AI) is going to be determining whether data and intelligence remain a privilege of a handful of large technology companies based in a few countries or can be democratized to the rest of the world. The centralized nature of machine learning and AI applications foments a ‘rich get richer’ dynamic in which only companies with access to high-quality datasets and data science talent are able to take advantage of AI opportunities. The field of decentralized AI is one of the leading trends looking to address this challenge. Although still impractical for many real-world implementations, the decentralized AI space has been steadily gaining traction within the AI community. Recently, AI researchers from Microsoft open sourced the Decentralized & Collaborative AI on Blockchain project, which enables the implementation of decentralized machine learning models based on blockchain technologies.


What is Two-Stream Self-Attention in XLNet

In my previous post, What is XLNet and why it outperforms BERT, I mainly talked about the difference between XLNet (an AR language model) and BERT (an AE language model) and about Permutation Language Modeling. I believe that having an intuitive understanding of XLNet is far more important than the implementation details, so I only explained Permutation Language Modeling and did not mention another important part, the Two-Stream Self-Attention architecture. But as Jiaming Chen mentioned in the comments, Two-Stream Self-Attention is another highlight of the XLNet paper, so I wrote this post to explain it as clearly as possible.
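
As a schematic sketch of the core distinction (my own reconstruction in numpy, not code from the post or the XLNet paper), the two streams can be pictured as two attention masks over a factorization order: the content stream may attend to the current token itself, while the query stream may not.

```python
import numpy as np

# Hypothetical factorization order (a permutation) of 4 token positions.
perm = [2, 0, 3, 1]
rank = {pos: i for i, pos in enumerate(perm)}  # position -> order index
n = len(perm)

content_mask = np.zeros((n, n), dtype=int)  # 1 = position i may attend to j
query_mask = np.zeros((n, n), dtype=int)
for i in range(n):
    for j in range(n):
        if rank[j] < rank[i]:          # j comes earlier in the factorization order
            content_mask[i, j] = query_mask[i, j] = 1
        elif i == j:
            content_mask[i, j] = 1     # only the content stream sees the token itself

print(content_mask)
print(query_mask)
```

The query stream's inability to see its own token is what lets XLNet predict a token's content from its position and context without leaking the answer.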


Are All Explainable Models Trustworthy?

Explainable AI, or Explainable Data Science, is one of the top buzzwords in Data Science at the moment. A reason frequently given for making models more explainable is that they will then be trusted more readily by users, and sometimes it appears people assume the two ideas are almost synonymous. For example, the paper introducing the influential LIME method of explaining black-box models was titled ‘Why Should I Trust You?’, as if having an explanation for how a model came to its decision were a short, direct step away from trusting it. Is this really the case, however? An immediate problem with this equivalence is that trust is given at an emotional level, whereas an explanation is a more technical artifact: the assumption behind explaining a model is that there is a certain set of pieces of information that can be provided to ensure the user understands what the model is doing. Gaining trust, in contrast, means crossing a number of emotional thresholds. Hence, while it is true that an overly opaque model can be a huge obstacle to gaining a user’s trust, it isn’t the whole story – and there may even be occasions when an opaque model is trustworthy if certain other conditions are met.


Uncertainty Sampling Cheatsheet

When a supervised machine learning model makes a prediction, it often reports a confidence score for that prediction. If the model is uncertain (low confidence), then human feedback can help. Getting human feedback when a model is uncertain is a type of Active Learning known as Uncertainty Sampling.
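
A minimal sketch of the least-confidence flavor of uncertainty sampling (the document names and predicted probabilities below are invented): rank unlabeled examples by the model's confidence in its top class, and send the least confident ones to a human.

```python
# Made-up model outputs: per-document class probabilities.
predictions = {
    "doc_a": [0.98, 0.01, 0.01],  # very confident
    "doc_b": [0.40, 0.35, 0.25],  # very uncertain
    "doc_c": [0.70, 0.20, 0.10],  # somewhat confident
}

def least_confident(preds, k):
    """Return the k examples whose top-class probability is lowest."""
    return sorted(preds, key=lambda name: max(preds[name]))[:k]

print(least_confident(predictions, 2))  # ['doc_b', 'doc_c']
```

Other flavors (margin sampling, entropy sampling) differ only in the uncertainty score used as the sort key.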


7 Steps to Ensure and Sustain Data Quality

Several years ago, I met a senior director from a large company. He mentioned that the company he worked for was facing data quality issues that eroded customer satisfaction, and that he had spent months investigating the potential causes and how to fix them. ‘What have you found?’ I asked eagerly. ‘It is a tough issue. I did not find a single cause; on the contrary, many things went wrong,’ he replied. He then started citing a long list of contributors to the data quality issues – almost every department in the company was involved, and it was hard for him to decide where to begin next. This is typical when dealing with data quality, which relates directly to how an organization does its business and to the entire life cycle of the data itself.


Pitfalls of Data Normalization

This is the fourth article of the column Mathematical Statistics and Machine Learning for Life Sciences. In this column, as well as in Deep Learning for Life Sciences, I have repeatedly emphasized that the data we work with in Life Sciences are high-dimensional, a fact we do not always realize and properly take into account. Today we are going to talk about another typical pitfall: how, by performing data normalization, you might end up in the simplex space, where Euclidean distance is no longer valid and classical frequentist statistics break down.
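
As a tiny demonstration (the count data below are made up), per-sample normalization sends each observation onto the simplex, and a log-ratio transform, such as the centered log-ratio (CLR) from compositional data analysis, is one standard way back to unconstrained Euclidean space:

```python
import numpy as np

rng = np.random.default_rng(2)
counts = rng.integers(1, 100, size=(5, 4)).astype(float)  # hypothetical count data

# Per-sample normalization (e.g. to proportions) constrains each row to sum to 1,
# i.e. the data now live on a simplex, not in ordinary Euclidean space.
props = counts / counts.sum(axis=1, keepdims=True)
print(props.sum(axis=1))

# Centered log-ratio (CLR) transform: log proportions minus their per-row mean.
clr = np.log(props) - np.log(props).mean(axis=1, keepdims=True)
print(np.allclose(clr.sum(axis=1), 0))  # CLR rows sum to zero
```

On the simplex, the components are not free to vary independently, which is why naive Euclidean distances and standard tests can mislead; transforms like CLR restore a space where they behave.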


Signal Detection Theory vs. Logistic Regression

I recently came across a paper that explained the equality between the parameters of signal detection theory (SDT) and the parameters of logistic regression in which the state (‘absent’/’present’) is used to predict the response (‘yes’/’no’, but also applicable in scale-rating designs) (DeCarlo, 1998; DOI: 10.1037/1082-989X.3.2.186). Here is a short simulation-proof for this equality.
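
A minimal version of such a simulation-proof might look like the following (my own sketch, assuming logistic decision noise as in DeCarlo's formulation; the sample size, sensitivity d, and criterion c are arbitrary choices, and the logistic regression is fitted by plain gradient descent rather than a stats package):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000
d, c = 1.5, 0.5  # true SDT parameters: sensitivity and criterion

state = rng.integers(0, 2, size=n)            # 0 = absent, 1 = present
evidence = d * state + rng.logistic(size=n)   # logistic noise on the decision axis
response = (evidence > c).astype(float)       # "yes" if evidence exceeds criterion

# Fit logistic regression of response on state by gradient descent.
# Under logistic noise, the slope should recover d and the intercept -c.
b0, b1 = 0.0, 0.0
for _ in range(5000):
    p = 1 / (1 + np.exp(-(b0 + b1 * state)))  # predicted P(yes)
    b0 -= 0.1 * np.mean(p - response)
    b1 -= 0.1 * np.mean((p - response) * state)

print(round(b1, 1), round(b0, 1))  # close to d and -c
```

The recovered slope approximating d and intercept approximating -c is exactly the equality the paper establishes analytically.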