DialogFlow: A Simple Way to Build your Voicebots & Chatbots

Nowadays, businesses, whether B2B or B2C, rely heavily on chatbots to automate their processes and reduce human workloads. There are various NLP platforms that chatbot development companies use to build chatbots, and one of the best among them is DialogFlow. The platform was previously called API.AI; it was acquired by Google in 2016 and renamed DialogFlow. DialogFlow is a Google-owned natural language processing platform that can be used to build conversational applications such as chatbots and voice bots. It provides use-case-specific, engaging voice and text-based conversations, powered by AI. The complexities of human conversation are still an art that machines lack, but a domain-specific bot is the closest thing we can build to overcome them. DialogFlow can be integrated with multiple platforms, including the web, Facebook, Slack, Twitter, and Skype.
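To give a feel for the developer experience, here is a minimal sketch of querying a DialogFlow agent from Python with the official google-cloud-dialogflow client; the project ID, session ID, and utterance are placeholders, and a configured service-account credential is assumed:

```python
# pip install google-cloud-dialogflow
# Assumes GOOGLE_APPLICATION_CREDENTIALS points at a service-account key with
# access to a Dialogflow agent; project and session IDs below are placeholders.
from google.cloud import dialogflow

def detect_intent(project_id: str, session_id: str, text: str,
                  language_code: str = "en") -> str:
    """Send one user utterance to the agent and return its text reply."""
    client = dialogflow.SessionsClient()
    session = client.session_path(project_id, session_id)
    query_input = dialogflow.QueryInput(
        text=dialogflow.TextInput(text=text, language_code=language_code)
    )
    response = client.detect_intent(
        request={"session": session, "query_input": query_input}
    )
    return response.query_result.fulfillment_text

if __name__ == "__main__":
    print(detect_intent("my-gcp-project", "session-123", "Hi there!"))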


Geek Girls Rising: Myth or Reality

I recently participated in the 2019 Kaggle ML & DS Survey Challenge. The survey, now in its third year, aims to offer a comprehensive view of the state of data science and machine learning. The challenge is open to all, and the notebook that tells a unique and creative story about the data science community wins a prize. The challenge description says: ‘The challenge is to deeply explore (through data) the impact, priorities, or concerns of a specific group of data science and machine learning practitioners.’ This was an excellent opportunity for me to explore the dataset with respect to women’s participation in the survey, worldwide. The objective of the notebook was to analyze the survey data to answer an important question: is women’s participation in STEM really improving, or is it just hype? I employed my analytical skills to investigate whether things are improving or whether there is still much left to be done.
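As a flavour of the underlying analysis, here is a minimal pandas sketch computing the share of women among respondents; the file and column names (gender in Q2, country in Q3, question text in the first data row) are assumptions about the published 2019 survey schema, not code from the notebook:

```python
# Minimal sketch: what share of 2019 survey respondents identify as women?
# File/column names assume the published Kaggle survey schema (gender in Q2,
# country in Q3, question text stored in the first data row).
import pandas as pd

df = pd.read_csv("multiple_choice_responses.csv", low_memory=False, skiprows=[1])

by_gender = df["Q2"].value_counts(normalize=True) * 100
print(by_gender.round(1))  # the headline participation gap under study

# The same computation per country shows where participation is strongest:
by_country = (
    df.groupby("Q3")["Q2"]
      .apply(lambda g: (g == "Female").mean() * 100)
      .sort_values(ascending=False)
)
print(by_country.head())
```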


KNIME Analytics Platform is the “killer app” for machine learning and statistics

KNIME Analytics Platform is the strongest and most comprehensive free platform for drag-and-drop analytics, machine learning, statistics, and ETL that I’ve found to date. The fact that there’s neither a paywall nor locked features means the barrier to entry is nonexistent. Connectors to data sources (both on-premises and in the cloud) are available for all major providers, making it easy to move data between environments. SQL Server to Azure? No problem. Google Sheets to Amazon Redshift? Sure, why not. How about applying a machine learning algorithm and filtering/transforming the results? You’re covered.


Evolution of text representation in NLP

How text representation has progressed in NLP, and what the advantages of each method are. Information can be represented in multiple ways while keeping the same meaning. We may pass information through multiple languages, and we may represent some things with mathematical expressions or with drawings. We choose the representation best suited to convey the information we want to pass. In Natural Language Processing we have to convert text data into something the machine can manipulate: numbers! There are a number (no pun intended) of ways to make this conversion. A simple one would be to give each word in a text a particular id. But not all representations are equal; some are more sophisticated and carry more information than others, which impacts the performance of an NLP model. With today’s computing power we can build ML models capable of performing very complex tasks and handling a lot of data, and we want to feed our models as much information as we can get.
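To make the simplest representation concrete, here is a short sketch in plain Python that assigns each word an integer id; the toy corpus is invented for illustration:

```python
# Build a vocabulary that maps each word to an integer id, then encode a sentence.
corpus = ["the cat sat on the mat", "the dog ate my homework"]

vocab = {}
for sentence in corpus:
    for word in sentence.split():
        if word not in vocab:
            vocab[word] = len(vocab)  # next unused id

print(vocab)       # {'the': 0, 'cat': 1, 'sat': 2, ...}
encoded = [vocab[w] for w in "the cat ate the mat".split()]
print(encoded)     # [0, 1, 6, 0, 4]
```

The limitation is visible immediately: the ids are arbitrary, so ‘cat’ is no closer to ‘dog’ than to ‘homework’. Richer representations, such as learned embeddings, carry exactly the information these bare ids discard.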


Introducing Redpoint’s ML Workflow Landscape

ML is not only a hot buzzword but is becoming an expected component of modern businesses. Companies that want to stay competitive invest in ML infrastructure that supports the data preparation, training, serving, and management of ML algorithms. Overarching ML workflow themes include accelerating time to value and optimizing production; we delineate sub-category trends below. As with any new market, heightened interest results in a deluge of options for the operator. We categorize ~280 offerings, from academic research to open-source projects to commercial offerings (both startup and established), to provide a comprehensive picture of the ML workflow landscape. We are excited about innovation in the space and look forward to speaking with startups offering ML-focused solutions. Businesses adopt ML technology to remain competitive: according to McKinsey, businesses leveraging AI have ‘an insurmountable advantage’ over rivals, and 71% of extensive AI adopters expect a 10% increase in revenue. Deloitte predicts the number of ML pilots and implementations will double in 2018 compared to 2017, and double again by 2020. The C-suite’s emphasis on AI and ML will only increase.


Deep Java Library (DJL) – a Deep Learning Toolkit for Java Developers

Deep Java Library (DJL) is an open-source library created by Amazon for developing machine learning (ML) and deep learning (DL) models natively in Java while simplifying the use of deep learning frameworks. I recently used DJL to develop a footwear classification model and found the toolkit super intuitive and easy to use; it’s obvious a lot of thought went into the design and into how Java developers would use it. DJL’s APIs abstract commonly used functions for developing models and orchestrating infrastructure management. I found that the high-level APIs used to train, test, and run inference allowed me to use my knowledge of Java and the ML lifecycle to develop a model in less than an hour with minimal code.


Successor Uncertainties

I describe here our recent NeurIPS paper [1] [code], which introduces Successor Uncertainties (SU), a state-of-the-art method for efficient exploration in model-free reinforcement learning. The lead authors are David Janz and Jiri Hron, two PhD students from the Cambridge Machine Learning group; the work originated during an internship by David Janz at Microsoft Research Cambridge. The main insight behind SU is to describe Q-functions using probabilistic models that directly take into account the correlations between Q-function values implied by the Bellman equation. These correlations had been ignored in previous work. The result is a method that outperforms competitors on tabular benchmarks and Atari games while remaining fast and highly scalable.
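For reference, the identity that induces these correlations is the standard Bellman equation for the action-value function (textbook notation, not reproduced from the paper):

```latex
% Bellman equation for Q under policy \pi: the value at (s, a) is tied to the
% values at successor pairs (s', a'), so they cannot be modelled independently.
Q^{\pi}(s,a) \;=\;
\mathbb{E}_{\,s' \sim P(\cdot \mid s,a),\; a' \sim \pi(\cdot \mid s')}
\left[ r(s,a) + \gamma \, Q^{\pi}(s',a') \right]
```

Any posterior over Q-values that respects this identity must couple neighbouring state-action pairs, which is the structure that factorised uncertainty models discard.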


How to do data quality with DataOps

The costs of poor data quality are so high that many have trouble believing the statistics. Gartner estimated that the average organization takes a $15M hit from poor data quality every year. For some organizations, it can even be fatal. I’m often reminded of a story told by my Data Science Innovation Summit co-presenter, Dan Enthoven from Domino Data Labs, about the high-frequency trading firm Knight Capital, which deployed a faulty update to its algorithm without testing its effect. Within a day, the firm had automated away nearly all of its capital and had to orchestrate an emergency sale to another firm. He also speaks of a credit card company that failed to validate the FICO credit score field from a third-party provider. When the company later switched providers, the new provider indicated ‘no credit’ with an illegal value of 999 (850 is the highest legal value). Because there was no data quality check in place, their automated approvals algorithm started approving these applications with huge credit limits, leading to major losses.
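That second failure is exactly what a simple range assertion in the pipeline prevents. A minimal sketch, assuming incoming applications arrive in a pandas DataFrame with a hypothetical fico_score column (legal FICO scores run from 300 to 850):

```python
# A data quality gate: reject records whose FICO score falls outside the
# legal 300-850 range before they reach the approvals model.
# DataFrame and column names are hypothetical.
import pandas as pd

def validate_fico(df: pd.DataFrame, col: str = "fico_score") -> pd.DataFrame:
    bad = df[~df[col].between(300, 850)]
    if not bad.empty:
        # Fail loudly instead of silently approving applications scored 999.
        raise ValueError(
            f"{len(bad)} rows have illegal {col} values:\n{bad.head()}"
        )
    return df

applications = pd.DataFrame({"fico_score": [720, 999, 610]})
validate_fico(applications)  # raises: 1 row has an illegal fico_score (999)
```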


A collection of must-know resources for every Natural Language Processing (NLP) practitioner

After a year of thorough reading from multiple sources, here is my compilation of the best learning resources to help anyone start their journey into the fascinating world of NLP. A wide variety of tasks fall under the broader area of NLP, such as Machine Translation, Question Answering, Text Summarization, Dialogue Systems, Speech Recognition, etc. However, to work in any of these fields, the underlying prerequisite knowledge is the same, and it is this that I am going to discuss briefly in this blog.


Illustrating Online Learning through Temporal Differences

Fundamentals of Reinforcement Learning. Over the course of our articles covering the fundamentals of reinforcement learning at GradientCrescent, we’ve studied both model-based and sample-based approaches to reinforcement learning. Briefly, the former class requires knowledge of the complete probability distributions of all possible state transitions and is exemplified by Markov Decision Processes. In contrast, sample-based learning methods allow state values to be determined simply through repeated observation, eliminating the need for transition dynamics. In our last article, we discussed the application of Monte Carlo approaches to determining the values of different states and actions simply through environmental sampling. More generally, Monte Carlo approaches belong to the offline family of learning approaches, insofar as they update the values of states only when the terminal state is reached, at the end of an episode. While this may seem sufficient for many controlled or simulated environments, it would be woefully inadequate for applications requiring rapid adaptation, such as the training of autonomous vehicles. Using offline learning in such applications could result in an accident, as a delay in updating state values could lead to unacceptable loss of life or property.
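To make the online/offline contrast concrete, here is a minimal sketch of the tabular TD(0) value update, which revises a state’s value after every single transition instead of waiting for the episode to end (standard textbook form, not code from the article):

```python
# Tabular TD(0): update the value of the current state from one observed
# transition (s, r, s'), without waiting for the episode to terminate.
from collections import defaultdict

def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.99, terminal=False):
    """V(s) <- V(s) + alpha * [r + gamma * V(s') - V(s)]"""
    target = r + (0.0 if terminal else gamma * V[s_next])
    V[s] += alpha * (target - V[s])
    return V

V = defaultdict(float)
# One step of experience is enough to start learning:
td0_update(V, s="A", r=1.0, s_next="B")
print(V["A"])  # 0.1 -- updated online, mid-episode
```

A Monte Carlo learner would leave V("A") untouched until the episode finished; the TD learner has already moved, which is what makes it suitable for the rapidly changing settings described above.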


Biased Algorithms Are Easier to Fix Than Biased People

In one study published 15 years ago, two people applied for a job. Their résumés were about as similar as two résumés can be. One person was named Jamal, the other Brendan. In a study published this year, two patients sought medical care. Both were grappling with diabetes and high blood pressure. One patient was black, the other was white. Both studies documented racial injustice: In the first, the applicant with a black-sounding name got fewer job interviews. In the second, the black patient received worse care. But they differed in one crucial respect. In the first, hiring managers made biased decisions. In the second, the culprit was a computer program.