
AnalytiXon

~ Broaden your Horizon

Category Archives: Distilled News

Distilled News

Tuesday, 17 Dec 2019


DialogFlow: A Simple Way to Build your Voicebots & Chatbots

Nowadays, businesses, whether B2B or B2C, rely heavily on chatbots to automate their processes and reduce human workloads. There are various NLP chatbot platforms used by chatbot development companies, and one of the best among them is DialogFlow. The platform was previously called API.AI; it was acquired by Google in 2016 and renamed DialogFlow. DialogFlow is a Google-owned natural language processing platform that can be used to build conversational applications such as chatbots and voice bots. It provides use-case-specific, engaging voice- and text-based conversations, powered by AI. The complexity of human conversation is still an art that machines lack, but a domain-specific bot is the closest thing we can build to handle it. DialogFlow can be integrated with multiple platforms, including the Web, Facebook, Slack, Twitter, and Skype.


Geek Girls Rising: Myth or Reality

I recently participated in the 2019 Kaggle ML & DS Survey Challenge. The survey, now in its third year, aims to offer a comprehensive view of the state of data science and machine learning. The challenge is open to all, and the notebook that tells a unique and creative story about the data science community wins a prize. The challenge description says: ‘The challenge is to deeply explore (through data) the impact, priorities, or concerns of a specific group of data science and machine learning practitioners.’ This was an excellent opportunity for me to explore the dataset with respect to women’s participation in the survey worldwide. The objective of the notebook was to analyze the survey data to answer an important question: is women’s participation in STEM really improving, or is it just hype? I employed my analytical skills to investigate whether things seem to be improving, or whether there is still much left to be done.


KNIME Analytics Platform is the “killer app” for machine learning and statistics

KNIME Analytics Platform is the strongest and most comprehensive free platform for drag-and-drop analytics, machine learning, statistics, and ETL that I’ve found to date. The fact that there’s neither a paywall nor locked features means the barrier to entry is nonexistent. Connectors to data sources (both on-premise and on the cloud) are available for all major providers, making it easy to move data between environments. SQL Server to Azure? No problem. Google Sheets to Amazon Redshift? Sure, why not. How about applying a machine learning algorithm and filtering/transforming the results? You’re covered.


Evolution of text representation in NLP

How the representation of text has progressed in the NLP context, and what the advantages of each method are. Information can be represented in multiple ways while keeping the same meaning. We may pass information through multiple languages, or represent some things with mathematical expressions or drawings. We choose the representation best suited to convey the information we want to pass. In Natural Language Processing we have to convert text data into something the machine can manipulate: numbers! There are a number (no pun intended) of ways to make this conversion. A simple one is to give each word in a text a unique id. But not all representations are equal; some are more sophisticated and carry more information than others, which impacts the performance of an NLP model. With today’s computing power we can build ML models capable of performing very complex tasks and handling a lot of data, and we want to feed our models as much information as we can.
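
As a concrete illustration of that simplest scheme, here is a minimal sketch in plain Python (the toy corpus is invented for the example) that gives each word a unique id:

    # Build a word-to-id vocabulary from a toy corpus.
    corpus = ["the cat sat on the mat", "the dog sat on the log"]

    vocab = {}
    for sentence in corpus:
        for word in sentence.split():
            if word not in vocab:
                vocab[word] = len(vocab)  # assign the next unused id

    # Encode a sentence as a list of ids.
    encoded = [vocab[w] for w in "the cat sat".split()]
    print(vocab)    # {'the': 0, 'cat': 1, 'sat': 2, 'on': 3, 'mat': 4, 'dog': 5, 'log': 6}
    print(encoded)  # [0, 1, 2]

Such ids carry no notion of similarity between words – exactly the limitation that the more sophisticated representations address.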


Introducing Redpoint’s ML Workflow Landscape

ML is not only a hot buzzword but is becoming an expected component of modern businesses. Companies that want to stay competitive invest in ML infrastructure that supports the data prep, training, serving, and management of ML algorithms. Overarching ML workflow themes include accelerating time to value and production optimization. We delineate sub-category trends below. As with any new market, heightened interest results in a deluge of options for the operator. We categorize ~280 offerings, from academic research to open source projects to commercial offerings (both startup and established), to provide a comprehensive picture of the ML workflow landscape. We are excited about innovation in the space and look forward to speaking with startups offering ML-focused solutions. Businesses adopt ML technology to remain competitive. According to McKinsey, businesses leveraging AI have ‘an insurmountable advantage’ over rivals, and 71% of extensive AI adopters expect a 10% increase in revenue. Deloitte predicts the number of ML pilots and implementations will double in 2018 compared to 2017, and double again by 2020. The C-suite’s emphasis on AI and ML will only increase.


Deep Java Library(DJL) – a Deep Learning Toolkit for Java Developers

Deep Java Library (DJL) is an open-source library created by Amazon to develop machine learning (ML) and deep learning (DL) models natively in Java while simplifying the use of deep learning frameworks. I recently used DJL to develop a footwear classification model and found the toolkit super intuitive and easy to use; it’s obvious a lot of thought went into the design and how Java developers would use it. DJL APIs abstract commonly used functions to develop models and orchestrate infrastructure management. I found that the high-level APIs used to train, test and run inference allowed me to use my knowledge of Java and the ML lifecycle to develop a model in less than an hour with minimal code.


Successor Uncertainties

I describe here our recent NeurIPS paper [1] [code], which introduces Successor Uncertainties (SU), a state-of-the-art method for efficient exploration in model-free reinforcement learning. The lead authors are David Janz and Jiri Hron, two PhD students from the Cambridge Machine Learning group, and the work originated during an internship by David Janz at Microsoft Research Cambridge. The main insight behind SU is to describe Q-functions using probabilistic models that directly take into account the correlations between Q-function values given by the Bellman equation. These correlations had been ignored in previous work. The result is a method that outperforms competitors on tabular benchmarks and Atari games, while still being fast and highly scalable.


How to do data quality with DataOps

The costs of poor data quality are so high that many have trouble believing the stats. Gartner estimated that the average organization takes a $15M hit due to poor data quality every year. For some organizations, it can even be fatal. I’m often reminded of a story told by my Data Science Innovation Summit co-presenter, Dan Enthoven from Domino Data Labs, about a high-frequency trading firm, Knight Capital, which deployed a faulty update to its algorithm without testing its effect. Within a day, the firm had automated away nearly all of its capital and had to orchestrate an emergency sale to another firm. He also speaks of a credit card company that failed to validate the FICO credit score field from a third-party provider. When the company later switched providers, the new provider indicated ‘no credit’ with an illegal value of 999 (850 is the highest legal value). Because there was no data quality check in place, their automated approvals algorithm started approving these applications with huge credit limits, leading to major losses.
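
A minimal sketch of the kind of range check that would have caught that sentinel value (the column name is hypothetical; legal FICO scores run from 300 to 850):

    import pandas as pd

    def validate_fico(df: pd.DataFrame, col: str = "fico_score") -> pd.DataFrame:
        # Flag any row whose score falls outside the legal FICO range.
        bad = ~df[col].between(300, 850)
        if bad.any():
            raise ValueError(f"{bad.sum()} rows with illegal {col} values, "
                             f"e.g. {df.loc[bad, col].unique()[:5]}")
        return df

    applications = pd.DataFrame({"fico_score": [720, 999, 680]})
    validate_fico(applications)  # raises: 1 rows with illegal fico_score values, e.g. [999]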


A collection of must known resources for every Natural Language Processing (NLP) practitioner

After a year of thorough reading from multiple sources, here is my compilation of the best learning resources to help anyone start their journey into the fascinating world of NLP. There are a variety of tasks which come under the broader area of NLP, such as Machine Translation, Question Answering, Text Summarization, Dialogue Systems, Speech Recognition, etc. However, to work in any of these fields, the underlying prerequisite knowledge is the same, which is what I am going to discuss briefly in this blog.


Illustrating Online Learning through Temporal Differences

Fundamentals of Reinforcement Learning. Over the course of our articles covering the fundamentals of reinforcement learning at GradientCrescent, we’ve studied both model-based and sample-based approaches to reinforcement learning. Briefly, the former class is characterized by requiring knowledge of the complete probability distributions of all possible state transitions, and is exemplified by Markov Decision Processes. In contrast, sample-based learning methods allow for the determination of state values simply through repeated observations, eliminating the need for transition dynamics. In our last article, we discussed the applications of Monte Carlo approaches in determining the values of different states and actions simply through environmental sampling. More generally, Monte Carlo approaches belong to the offline family of learning approaches, in that they allow updates to the values of states only when the terminal state is reached, at the end of an episode. While this may seem sufficient for many controlled or simulated environments, it would be woefully inadequate for applications requiring rapid changes, such as the training of autonomous vehicles, where a delay in updating the state values could result in an unacceptable loss of life or property.
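
Temporal-difference methods address this by updating online, after every step. A minimal sketch of the tabular TD(0) update rule on a made-up two-step trajectory (alpha and gamma are chosen arbitrarily for illustration):

    # Tabular TD(0): V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))
    alpha, gamma = 0.1, 0.99
    V = {s: 0.0 for s in ("A", "B", "terminal")}

    # Toy trajectory of (state, reward, next_state) transitions.
    trajectory = [("A", 0.0, "B"), ("B", 1.0, "terminal")]

    for s, r, s_next in trajectory:
        td_target = r + gamma * V[s_next]
        V[s] += alpha * (td_target - V[s])  # update immediately; no need to wait for episode end

    print(V)  # {'A': 0.0, 'B': 0.1, 'terminal': 0.0}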


Biased Algorithms Are Easier to Fix Than Biased People

In one study published 15 years ago, two people applied for a job. Their résumés were about as similar as two résumés can be. One person was named Jamal, the other Brendan. In a study published this year, two patients sought medical care. Both were grappling with diabetes and high blood pressure. One patient was black, the other was white. Both studies documented racial injustice: In the first, the applicant with a black-sounding name got fewer job interviews. In the second, the black patient received worse care. But they differed in one crucial respect. In the first, hiring managers made biased decisions. In the second, the culprit was a computer program.

Distilled News

Saturday, 14 Dec 2019


Knowledge is Everything: Using Representation Learning to Optimize Feature Extraction and Knowledge Quality

Representation Learning is one of the most effective techniques to streamline feature extraction and knowledge building in deep learning models.


Animated storytelling using the Javascript D3 library

D3 is the most flexible data visualization tool available and allows you to create great data storytelling illustrations.


Figuring out a Fair Price of a Used Car in a Data Science Way

What is the ordinary way of figuring out the price of a used car? You search for similar vehicles, estimate a rough baseline price and then fine-tune it depending on the current mileage, color, number of options, etc. You use both domain knowledge and analysis of the current market state. If you go deeper, you may consider selling the car in a different region of the country where the average price is higher. You may even investigate how long cars stay listed in the catalog and detect overpriced samples, to make a more informed decision.


Machine Learning: A Practical Guide To Managing Risk

A fundamental question raised by the increasing use of machine learning (ML) is quickly becoming one of the biggest challenges for data-driven organizations, data scientists, and legal personnel around the world. This challenge arises in various forms and has been described in various ways by practitioners and academics alike, but all relate to the basic ability to assert a causal connection between inputs to models and how that input data impacts model output. According to Bain & Company, investments in automation in the US alone will approach $8 trillion in the coming years, many premised on recent advances in ML. But these advances have far outpaced the legal and ethical frameworks for managing this technology. There is simply no commonly agreed-upon framework for governing the risks – legal, reputational, ethical, and more – associated with ML. This post aims to provide a template for effectively managing this risk in practice, with the goal of giving lawyers, compliance personnel, data scientists, and engineers a framework to safely create, deploy, and maintain ML models, and to enable effective communication between these distinct organizational perspectives. The ultimate aim of this article is to enable data science and compliance teams to create better, more accurate, and more compliant ML models.


AutoGluon: Deep Learning AutoML

AutoGluon is a new open source AutoML library that automates deep learning (DL) and machine learning (ML) for real world applications involving image, text and tabular datasets. Whether you are new to ML or an experienced practitioner, AutoGluon will simplify your workflow. With AutoGluon, you can develop and refine state-of-the-art DL models using just a few lines of Python code. In this post, we explain the benefits of AutoGluon, demonstrate how to install it on Linux, and get started using AutoGluon to solve real-world problems with state-of-the-art performance within minutes.


Dynamic Charts: Make Your Data Move

Dynamic charts give us an intuitive and interactive experience. This article will explain what dynamic charts are and how to make them.


eXtreme Deep Factorization Machine(xDeepFM)

The new buzz in the recommendation system domain. We are living in times where we are spoilt for choice. Take the example of food: you have thousands of restaurant-dish combinations to choose from. Just imagine the amount of time it could take to choose a restaurant and dish manually. Thankfully, we never have to, as the Zomatos and Swiggys of the world have done this for us using recommendation engines. We are going to cover one such algorithm, xDeepFM, in this blog.


Lessons I’ve Learned Developing An AI Strategy

How do large firms build AI strategies? How do they build a competitive advantage in the AI era? How do they build an AIoT ecosystem? I worked on these questions for the past two years as a consultant for a global technology company. In this article, I wanted to show how a large firm thinks and builds its AI strategy. Many companies are trying to transition from a state of low maturity to an AI-first model. However, becoming an AI-first organization is proving to be extremely difficult but also highly rewarding…


Inference in Graph Database

As described in W3C standards, inference is, briefly, the discovery of new edges within a graph based on a given ontology. On the Semantic Web, data is modeled as triples: two resources and one relationship between them. Inference means deriving new triples from the data and some additional information, which may come in the form of vocabularies (ontologies) and rule sets (logic). For example, say we have two resources, John the human and Max the dog, and the information that John has Max and that Max is a Dog. If we also have a vocabulary that includes the information that a Dog is a Mammal, we can infer that Max is a Mammal and that John has a Mammal.
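
A minimal sketch of this style of rule-based inference over triples (plain Python; the tiny rule engine and vocabulary are illustrative, not any particular RDF library):

    # Facts and vocabulary as (subject, predicate, object) triples.
    facts = {("John", "has", "Max"), ("Max", "is", "Dog")}
    vocabulary = {("Dog", "is", "Mammal")}

    # Naive forward chaining: apply two rules until no new triples appear.
    #   1. X is A,  A is B  =>  X is B   (subclass transitivity)
    #   2. X has Y, Y is B  =>  X has B  (property generalization)
    triples = facts | vocabulary
    changed = True
    while changed:
        changed = False
        for s, p, o in list(triples):
            for s2, p2, o2 in list(triples):
                if p == "is" and p2 == "is" and o == s2:
                    new = (s, "is", o2)
                elif p == "has" and p2 == "is" and o == s2:
                    new = (s, "has", o2)
                else:
                    continue
                if new not in triples:
                    triples.add(new)
                    changed = True

    print(("Max", "is", "Mammal") in triples)    # True
    print(("John", "has", "Mammal") in triples)  # True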


Powerful Artificial Intelligence Narratives Hidden in Plain Sight

This article explores the importance of narratives in public and policy debates, and why fruitful stories can help advance the development and adoption of Artificial Intelligence.


About Text Vectorization

The magic of converting text to numbers. This post will walk you through the basics of text vectorization, which is converting text to vectors (lists of numbers). We present Bag of Words (BOW) and its flavors: Frequency Vectors, One Hot Encoding (OHE), and Term Frequency/Inverse Document Frequency (TF/IDF).
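
To make these flavors tangible, here is a minimal sketch using scikit-learn’s built-in vectorizers on a made-up two-document corpus:

    from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

    corpus = ["the cat sat on the mat", "the dog ate my homework"]

    # Bag of Words: raw term counts per document.
    bow = CountVectorizer()
    print(bow.fit_transform(corpus).toarray())
    print(bow.get_feature_names_out())

    # TF-IDF: counts reweighted by how rare each term is across documents.
    tfidf = TfidfVectorizer()
    print(tfidf.fit_transform(corpus).toarray().round(2))

One Hot Encoding is the special case where counts are clipped to 0/1, available here as CountVectorizer(binary=True).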


SVM Optimization on a GPU with KernelML

This notebook shows the optimization of a multi-class, linear support vector machine using a simulation-based optimizer. Any simulation-based optimizer could be used with the Cuda kernel in this notebook. I used KernelML, my custom optimizer, in this example. The runtime for this script should be set to use the GPU: Runtime->Change runtime type.

Distilled News

Friday, 13 Dec 2019


Failure Modes in Machine Learning

In the last two years, more than 200 papers have been written on how Machine Learning (ML) can fail because of adversarial attacks on the algorithms and data; this number balloons if we incorporate non-adversarial failure modes. The spate of papers has made it difficult for ML practitioners, let alone engineers, lawyers and policymakers, to keep up with the attacks against and defenses of ML systems. However, as these systems become more pervasive, the need to understand how they fail, whether by the hand of an adversary or due to the inherent design of a system, will only become more pressing. The purpose of this document is to tabulate both of these failure modes in a single place.


Screenshot-to-Code

A neural network that transforms a design mock-up into a static website.


Search Optimization for Large Data Sets for GDPR

This post describes our approach to addressing the challenges of cost and scale in scanning for GDPR compliance in Adobe Experience Platform.


How to use a Machine Learning Model to Make Predictions on Streaming Data using PySpark

Picture this – every second, more than 8,500 Tweets are sent, more than 900 photos are uploaded to Instagram, more than 4,200 Skype calls are made, more than 78,000 Google searches happen, and more than 2 million emails are sent (according to Internet Live Stats). We are generating data at an unprecedented pace and scale right now. What a great time to be working in the data science space! But with great data come equally complex challenges. Primarily – how do we collect data at this scale? How do we ensure that our machine learning pipeline continues to churn out results as soon as the data is generated and collected? These are significant challenges the industry is facing, and the reason the concept of streaming data is gaining more traction among organizations.


Text Generation with Python

This article is a little bit different from the others I have already published. I am not writing it to share some pieces of code but to share with you the first article almost totally written with GPT-2. The introduction and the conclusion are written by me, not by the GPT-2 model. The rest of the article is generated by the model, with some tricks. The topic of the article is text generation…


Fairness Indicators: Scalable Infrastructure for Fair ML Systems

While industry and academia continue to explore the benefits of using machine learning (ML) to make better products and tackle important problems, algorithms and the datasets on which they are trained also have the ability to reflect or reinforce unfair biases. For example, consistently flagging non-toxic text comments from certain groups as ‘spam’ or ‘high toxicity’ in a moderation system leads to exclusion of those groups from conversation. In 2018, we shared how Google uses AI to make products more useful, highlighting AI principles that will guide our work moving forward. The second principle, ‘Avoid creating or reinforcing unfair bias,’ outlines our commitment to reduce unjust biases and minimize their impacts on people.


Lessons Learned from Developing ML for Healthcare

In an effort to improve guidance for research at the intersection of ML and healthcare, we have written a pair of articles, published in Nature Materials and the Journal of the American Medical Association (JAMA). The first is for ML practitioners to better understand how to develop ML solutions for healthcare, and the other is for doctors who desire a better understanding of whether ML could help improve their clinical work.


AI, Analytics, Machine Learning, Data Science, Deep Learning Technology Main Developments in 2019 and Key Trends for 2020

We asked leading experts – what are the most important developments of 2019 and 2020 key trends in AI, Analytics, Machine Learning, Data Science, and Deep Learning? This blog focuses mainly on technology and deployment.


Interpretability: Cracking open the black box, Part 2

The second part in a series on leveraging techniques to take a look inside the black box of AI, this guide considers post-hoc interpretation that is useful when the model is not transparent.


Towards a new Theory of Learning: Statistical Mechanics of Deep Neural Networks

Here, I am going to sketch out the ideas we are currently researching to develop a new theory of generalization for Deep Neural Networks. We have a lot of work to do, but I think we have made enough progress to present these ideas, informally, to flesh out the basics.


Dimensionality reduction method through autoencoders

We’ve already talked about dimensionality reduction long and hard on this blog, usually focusing on PCA. In my latest post I introduced another way to reduce dimensions based on autoencoders. However, at the time I focused on how to use autoencoders as predictors, while now I’d like to consider them as a dimensionality reduction technique. Just a reminder of how autoencoders work: the procedure starts by compressing the original data into a short code, ignoring noise. Then the algorithm uncompresses that code to generate an output as close as possible to the original input.
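
A minimal sketch of that idea in Keras (the layer sizes and the 2-dimensional bottleneck are arbitrary choices for illustration):

    import numpy as np
    from tensorflow import keras

    X = np.random.rand(1000, 20)  # stand-in for real data, 20 features

    inputs = keras.Input(shape=(20,))
    code = keras.layers.Dense(2, activation="relu")(inputs)       # bottleneck: 20 -> 2
    outputs = keras.layers.Dense(20, activation="sigmoid")(code)  # reconstruction: 2 -> 20

    autoencoder = keras.Model(inputs, outputs)
    autoencoder.compile(optimizer="adam", loss="mse")
    autoencoder.fit(X, X, epochs=10, verbose=0)  # train to reproduce the input

    # For dimensionality reduction, keep only the encoder half.
    encoder = keras.Model(inputs, code)
    X_reduced = encoder.predict(X)  # shape (1000, 2)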


Outlier Detection (Part 2): Multivariate

Mahalanobis distance | Robust estimates (MCD): Example in R
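
The Mahalanobis distance at the heart of that approach is straightforward to compute directly; here is a minimal sketch in Python/numpy (the article itself works in R), with synthetic correlated data:

    import numpy as np
    from scipy.stats import chi2

    rng = np.random.default_rng(0)
    X = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=500)

    mu = X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))

    # Squared Mahalanobis distance: D^2 = (x - mu)^T S^-1 (x - mu)
    diff = X - mu
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

    # Flag points beyond a chi-square cutoff (2 degrees of freedom).
    outliers = d2 > chi2.ppf(0.975, df=2)
    print(outliers.sum(), "points flagged")

Robust variants such as MCD replace mu and the covariance with estimates that are less influenced by the outliers themselves.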

Distilled News

Thursday, 12 Dec 2019


Dynamic Programming for Data Scientists

Algorithms and data structures are an integral part of data science. While most of us data scientists don’t take a proper algorithms course while studying, they are important all the same. Many companies ask data structures and algorithms questions as part of their interview process for hiring data scientists. Now, the question many people ask is: what is the use of asking a data scientist such questions? The way I like to describe it is that a data structure question may be thought of as a coding aptitude test. We have all taken aptitude tests at various stages of our lives, and while they are not a perfect proxy for judging someone, almost nothing ever really is. So, why not a standard algorithm test to judge people’s coding ability? But let’s not kid ourselves: they will require the same zeal to crack as your data science interviews, so you might want to give some time to studying algorithms and data structure questions. This post is about fast-tracking that study and explaining Dynamic Programming concepts for data scientists in an easy-to-understand way.
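
As a flavor of what such questions look like, here is the canonical warm-up example – computing Fibonacci numbers by caching overlapping subproblems (a generic illustration, not taken from the post itself):

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def fib(n: int) -> int:
        # Memoization (top-down DP) turns the naive O(2^n) recursion into O(n).
        if n < 2:
            return n
        return fib(n - 1) + fib(n - 2)

    def fib_bottom_up(n: int) -> int:
        # The equivalent bottom-up formulation, the other classic DP style.
        a, b = 0, 1
        for _ in range(n):
            a, b = b, a + b
        return a

    assert fib(50) == fib_bottom_up(50) == 12586269025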


Why You Are Using t-SNE Wrong

t-SNE has become a very popular technique for visualizing high dimensional data. It’s extremely common to take the features from an inner layer of a deep learning model and plot them in 2 dimensions using t-SNE to reduce the dimensionality. Unfortunately, most people just use scikit-learn’s implementation without actually understanding the results, and end up misinterpreting what they mean.
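
For reference, a minimal sketch of the usage in question, with the perplexity knob that is so often left at its default (toy random features for illustration):

    import numpy as np
    from sklearn.manifold import TSNE

    X = np.random.rand(200, 64)  # stand-in for inner-layer features

    # Perplexity (~ the effective number of neighbors) strongly shapes the picture;
    # compare several values rather than trusting one default run.
    for perplexity in (5, 30, 50):
        X_2d = TSNE(n_components=2, perplexity=perplexity, random_state=0).fit_transform(X)
        print(perplexity, X_2d.shape)  # (200, 2) each

Because t-SNE preserves neither global distances nor cluster sizes, the output should be read qualitatively, not measured.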


Sentiment Analysis using ALBERT

Every researcher or NLP practitioner is well aware of BERT, which came out in 2018. Since then, the NLP industry has been transformed to a much larger extent. ALBERT, which is A Lite BERT, was made with the focus of making BERT as light as possible by reducing its parameter size.

Distilled News

Tuesday, 10 Dec 2019


Data-Science Observability For Executives

Observability for data science (DS) is a new and emerging field, sometimes mentioned in tandem with MLOps or AIOps. New offerings are being developed by young startups to address the lack of monitoring and alerts for everything data science. However, they mostly address data scientists and engineers, who are, of course, the first personas to feel the pain of managing multiple models. I will try to argue that data science observability should also be aimed at decision-makers such as high- and mid-level managers: the people who are responsible for spending, funding and managing the data science operation, and who are, most importantly, accountable for its impact on the company’s clients, business, product, sales and, let’s not forget, the bottom line.


An Introduction to Discretization Techniques for Data Scientists

Discretization is the process through which we can transform continuous variables, models or functions into a discrete form. We do this by creating a set of contiguous intervals (or bins) that go across the range of our desired variable/model/function.
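
A minimal sketch of the two most common binning strategies using pandas (the bin counts are chosen arbitrarily):

    import numpy as np
    import pandas as pd

    ages = pd.Series(np.random.randint(18, 90, size=100))

    # Equal-width bins: intervals of identical length.
    equal_width = pd.cut(ages, bins=4)

    # Equal-frequency (quantile) bins: roughly the same number of points per bin.
    equal_freq = pd.qcut(ages, q=4)

    print(equal_width.value_counts().sort_index())
    print(equal_freq.value_counts().sort_index())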


Traditional AI vs. Modern AI.

The evolution of Artificial Intelligence and the new wave of ‘Future AI’


Automate a Data Science Workflow – Movie Reviewer Sentiment Analysis

I’m very resistant to point-and-click solutions. And I think my resistance is in good faith and for good reasons. We’ve waited a long time for drag-and-drop solutions for web apps, mobile apps, and a whole host of other things. But fundamentally, I think a solution like KNIME is perfect for letting the user introduce just the right amount of flexibility and simplification as necessary. For me, boxing up all my steps in a data science workflow has given me a whole new level of management over my projects.


Catch Me if You Can: Outlier Detection

Outlier detection is an interesting data mining task that is used quite extensively to detect anomalies in data. Outliers are points that exhibit significantly different properties than the majority of the points. To this end, outlier detection has very interesting applications such as credit card fraud detection (suspicious transactions), traffic management (drunk/rash driving) or network intrusions (hacks). Due to the time-critical nature of these applications, there is a need for scalable outlier detection techniques. In this project, we will aim to detect outliers in a taxi dataset (Beijing), using a technique that relies only on spatio-temporal characteristics to detect outliers in very large datasets. We will be using the geo-coordinates and the timestamps collected by the GPS on these taxis.


BERT Visualization in Embedding Projector

This story shows how to visualize pre-trained BERT embeddings in Tensorflow’s Tensorboard Embedding Projector. The story uses around 50 unique sentences and their BERT embeddings generated with TensorFlow Hub BERT models.


Multi-Label Text Classification

In the case of binary classification we just ask a yes/no type of question. If there are multiple possible answers and only one is to be chosen, it’s multiclass classification. In our example we can’t really select only one label; I would say that all of them match the photo. The goal of multi-label classification is to assign a set of relevant labels to a single instance. However, most widely known algorithms are designed for single-label classification problems. In this article four approaches for multi-label classification available in the scikit-multilearn library are described and a sample analysis is introduced.
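
For orientation, here is a minimal multi-label sketch using scikit-learn’s OneVsRestClassifier – a problem-transformation baseline analogous to scikit-multilearn’s Binary Relevance approach – on synthetic data:

    from sklearn.datasets import make_multilabel_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import f1_score
    from sklearn.model_selection import train_test_split
    from sklearn.multiclass import OneVsRestClassifier

    # Each sample may carry several of 5 labels at once.
    X, Y = make_multilabel_classification(n_samples=500, n_classes=5, random_state=0)
    X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)

    # Fit one independent binary classifier per label.
    clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X_tr, Y_tr)
    print(f1_score(Y_te, clf.predict(X_te), average="micro"))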


Features correlations: data leakage, confounded features and other things that can make your Deep Learning model fail

As the plot the boss is showing seems to suggest, the more employees have shaved heads, the more the company’s sales increase. If you were that boss, would you consider the same action on your employees? Probably not. In fact, you recognize that there is no causality between the two sets of events, and that their behaviour is similar just by chance. More clearly: the shaved heads do not cause the sales. So, we have just spotted the existence of at least two possible categories of correlations: without and with causality. We also agreed that only the second one is interesting, while the other is useless, when not misleading. But let’s dive deeper.


Talking with BERT

The growth of knowledge and research around language models has been amazing in the past few years. For BERT especially, we have seen some incredible uses of this massive pre-trained language model on tasks like text classification, prediction, and question answering. I’ve recently written about research into some of the limitations of BERT when performing certain language tasks. Further, I did some testing of my own, creating a question-answering system to get a feel for how it could be used. It has been great to see, and to try in practice, some of the many capabilities of language models.


SQL vs noSQL: Two Approaches to ETL Applications

In this article I will explore the differences between SQL and noSQL ETL pipelines. The focus is on the transform and load stages – that is, what happens once the data has been loaded into the application. The extraction part is simple: it involves reading files and some basic data wrangling. By comparing how the datasets are divided post-extraction, you can see how the choice of database impacts the application architecture. You can also identify the strengths and weaknesses of choosing particular databases.


Why is the Kernelized Support Vector Machine (SVM) ML’s Most Beautiful Algorithm?

Machine learning has more than a few beautiful algorithms that are helping data scientists and researchers transform business models and societies altogether. It offers a package of both supervised and unsupervised algorithms that can be trained according to the requirements of the problem. Even though a majority of new-age machine learning applications are moving towards exploring deep learning theories and neural networks, a lot can be done with existing algorithms. One class of such beautiful machine learning algorithms is the support vector machine. Even though people haven’t used them much since the advent of neural networks, they still have a lot of scope in research and in answering complex problems. The beauty of support vector machines lies in the fact that they can be reproduced using maximum likelihood estimates and understood in terms of a Bayesian classifier. Tweaks to the optimal Bayesian classifier, with the appropriate assumption of a prior, can give you a classic support vector machine. Support vector machines are a supervised learning algorithm when it comes to classification, but with progress in research, their scope for unsupervised clustering methods is being explored.
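
To see the kernel trick at work, here is a minimal scikit-learn sketch on synthetic data that no linear boundary can separate:

    from sklearn.datasets import make_circles
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    # Concentric circles: linearly inseparable in the original 2-D space.
    X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    linear = SVC(kernel="linear").fit(X_tr, y_tr)
    rbf = SVC(kernel="rbf", gamma="scale").fit(X_tr, y_tr)

    print("linear:", linear.score(X_te, y_te))  # near chance
    print("rbf:   ", rbf.score(X_te, y_te))     # near perfect

The RBF kernel implicitly maps the points into a space where the circles become separable – a hint of the elegance the article describes.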


A Creative Approach Towards Feature Selection

Feature selection is one of the most important things when it comes to feature engineering. We need to reduce the number of features so that we can interpret the model better, make it less computationally stressful to train, remove redundant effects and make the model generalise better. In some cases feature selection becomes extremely important, because otherwise the input dimensional space is too big, making it difficult for the model to train.
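
Before getting creative, it helps to know the standard baseline; a minimal univariate feature-selection sketch with scikit-learn:

    from sklearn.datasets import load_breast_cancer
    from sklearn.feature_selection import SelectKBest, f_classif

    X, y = load_breast_cancer(return_X_y=True)  # 30 features

    # Keep the 10 features with the strongest ANOVA F-score against the target.
    selector = SelectKBest(score_func=f_classif, k=10)
    X_reduced = selector.fit_transform(X, y)

    print(X.shape, "->", X_reduced.shape)  # (569, 30) -> (569, 10)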


Exploratory Data Analysis …A topic that is neglected in Data Science Projects

Exploratory Data Analysis (EDA) is the first step in your data analysis process, developed by John Tukey in the 1970s. In statistics, exploratory data analysis is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. As the name suggests, it is the step in which we get to explore the data set.


A Comprehensive Guide To Data Imputation

In the real world, missing data is a nearly inevitable problem. Only a special few can swerve it – usually through large investments in data collection. The issue is crucial because the way we handle missing data has a direct impact on our findings, and it also eats into time management. Therefore, it should always be a priority to handle missing data properly, which can be much harder than it seems. The difficulty arises as we realize that not all missing data is created equal, even though it all looks the same – a blank space – and that different types of missing data must be handled differently. In this article, we review the types of missing data, as well as basic and advanced methods to tackle them.
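
A minimal sketch of the basic end of that spectrum, using scikit-learn’s SimpleImputer on a toy matrix:

    import numpy as np
    from sklearn.impute import SimpleImputer

    X = np.array([[1.0, 2.0],
                  [np.nan, 3.0],
                  [7.0, np.nan]])

    # Replace each missing value with its column mean.
    imputer = SimpleImputer(strategy="mean")
    print(imputer.fit_transform(X))
    # [[1.  2. ]
    #  [4.  3. ]
    #  [7.  2.5]]

More advanced methods go further, but the right choice depends on why the data is missing in the first place.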


Build Data Pipelines with Apache Airflow

Originally created at Airbnb in 2014, Airflow is an open-source data orchestration framework that allows developers to programmatically author, schedule, and monitor data pipelines. Airflow experience is one of the most in-demand technical skills for Data Engineering (another one is Oozie) as it is listed as a skill requirement in many Data Engineer job postings. In this blog post, I will explain core concepts and workflow creation in Airflow, with source code examples to help you create your first data pipeline using Airflow.
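
For a feel of the core concepts, here is a minimal DAG sketch with two dependent tasks (the schedule and callables are placeholders; operator import paths vary slightly between Airflow versions):

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        print("pulling data")

    def load():
        print("writing data")

    with DAG(
        dag_id="example_pipeline",
        start_date=datetime(2019, 12, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_load = PythonOperator(task_id="load", python_callable=load)

        t_extract >> t_load  # extract must finish before load runs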

Distilled News

Tuesday, 10 Dec 2019


Amazon CodeGuru (Preview)

Amazon CodeGuru is a machine learning service for automated code reviews and application performance recommendations. It helps you find the most expensive lines of code that hurt application performance and keep you up all night troubleshooting, then gives you specific recommendations to fix or improve your code. CodeGuru is powered by machine learning, best practices, and hard-learned lessons across millions of code reviews and thousands of applications profiled on open source projects and internally at Amazon. With CodeGuru, you can find and fix code issues such as resource leaks, potential concurrency race conditions, and wasted CPU cycles. And with low, on-demand pricing, it is inexpensive enough to use for every code review and application you run. CodeGuru supports Java applications today, with support for more languages coming soon. CodeGuru helps you catch problems faster and earlier, so you can build and run better software.


European AI Policy Conference

AI is emerging as the most important technology in a new wave of digital innovation that is transforming industries around the world. Businesses in Europe are at the forefront of some of the latest advancements in the field, and European universities are home to the greatest concentration of AI researchers in the world. Every week, new case studies emerge showing the potential opportunities that can arise from greater use of the technology. To fully realize its vision for AI, Europe needs an influx of resources and talent, plus some important policy changes. Join the Center for Data Innovation to discuss why European success in AI is important, how the EU compares to other world leaders today, and what steps European policymakers should take to be more competitive in AI.


Statistics for Data Science in One Picture

There’s no doubt about it, probability and statistics is an enormous field, encompassing topics from the familiar (like the average) to the complex (regression analysis, correlation coefficients and hypothesis testing to name but a few). If you want to be a great data scientist, you have to know some basic statistics. The following picture shows which statistics topics you must know if you’re going to excel in data science.


Understanding Transfer Learning for Medical Imaging

As deep neural networks are applied to an increasingly diverse set of domains, transfer learning has emerged as a highly popular technique in developing deep learning models. In transfer learning, the neural network is trained in two stages: 1) pretraining, where the network is generally trained on a large-scale benchmark dataset representing a wide diversity of labels/categories (e.g., ImageNet); and 2) fine-tuning, where the pretrained network is further trained on the specific target task of interest, which may have fewer labeled examples than the pretraining dataset. The pretraining step helps the network learn general features that can be reused on the target task.
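
A minimal Keras sketch of that two-stage recipe (the dataset, head size and hyperparameters are placeholders, not the paper’s setup):

    from tensorflow import keras

    # Stage 1 is already done for us: ResNet50 pretrained on ImageNet.
    base = keras.applications.ResNet50(weights="imagenet", include_top=False,
                                       pooling="avg", input_shape=(224, 224, 3))
    base.trainable = False  # freeze the general-purpose features

    # Stage 2: fine-tune a small task-specific head on the target task.
    model = keras.Sequential([
        base,
        keras.layers.Dense(2, activation="softmax"),  # e.g. disease vs. healthy
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(train_images, train_labels, epochs=5)  # target task with few labels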


The 4 Hottest Trends in Data Science for 2020

Companies all over the world across a wide variety of industries have been going through what people are calling a digital transformation. That is, businesses are taking traditional business processes such as hiring, marketing, pricing, and strategy, and using digital technologies to make them 10 times better. Data Science has become an integral part of those transformations. With Data Science, organizations no longer have to make their important decisions based on hunches, best-guesses, or small surveys. Instead, they’re analyzing large amounts of real data to base their decisions on real, data-driven facts. That’s really what Data Science is all about – creating value through data. This trend of integrating data into the core business processes has grown significantly, with an increase in interest by over four times in the past 5 years according to Google Search Trends. Data is giving companies a sharp advantage over their competitors. With more data and better Data Scientists to use it, companies can acquire information about the market that their competitors might not even know existed. It’s become a game of Data or perish.


Why software engineering processes and tools don’t work for machine learning

While AI may be the new electricity, significant challenges remain to realize its potential. Here we examine why data scientists and teams can’t rely on software engineering tools and processes for machine learning.


Activation Functions and Optimizers for Deep Learning Models

Deep Learning (DL) models are revolutionizing the business and technology world with jaw-dropping performances in one application area after another – image classification, object detection, object tracking, pose recognition, video analytics, synthetic picture generation – just to name a few. However, they are nothing like classical Machine Learning (ML) algorithms/techniques. DL models use millions of parameters and create extremely complex and highly nonlinear internal representations of the images or datasets fed to them. Whereas for classical ML, domain experts and data scientists often have to write hand-crafted algorithms to extract and represent high-dimensional features from the raw data, deep learning models extract and work on these complex features automatically.
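
To make the first half of that pairing concrete, here is a minimal numpy sketch of three common activation functions (definitions only; optimizers are a separate topic):

    import numpy as np

    def relu(x):
        # max(0, x): cheap to compute and the default choice for hidden layers.
        return np.maximum(0.0, x)

    def sigmoid(x):
        # Squashes to (0, 1); common for binary outputs.
        return 1.0 / (1.0 + np.exp(-x))

    def tanh(x):
        # Squashes to (-1, 1); a zero-centered relative of the sigmoid.
        return np.tanh(x)

    x = np.array([-2.0, 0.0, 2.0])
    print(relu(x), sigmoid(x).round(3), tanh(x).round(3))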


Data science curriculum roadmap

We venture to suggest a curriculum roadmap after receiving multiple requests for one from academic partners. As a group, we have spent the vast majority of our time in industry, although many of us have spent time in one academic capacity or another. What follows is a set of broad recommendations, and it will inevitably require a lot of adjustment in each implementation. Given that caveat, here are our curriculum recommendations.


Understanding Dimensionality Reduction

We all understand that more data means better AI. That sounds great! But with the recent explosion of information, we often end up with the problem of too much data! We need all that data, but it turns out to be too much for our processing. Hence we need to look into ways of streamlining the available data so that it can be compressed without losing value. Dimensionality reduction is an important technique that achieves this end.


Missing Data?

Three weeks into my journey to become a data scientist and I’ve officially been baptized… by fire, that is! I chose to attend Flatiron’s Data Science 15-week bootcamp to transition out of finance. So far, the program has exceeded expectations (and my expectations were high). While the curriculum is rigorous and fast-paced, it’s well constructed, the instructors are dedicated to helping students learn, and my cohort is amazing – everyone is friendly, helpful, smart, and undoubtedly will go on to accomplish great things. This series of blog posts is dedicated to them… Here’s to the future data scientists!


Process Capability Analysis with R

Process capability analysis represents a significant component of the Measure phase of the DMAIC (Define, Measure, Analyze, Improve, Control) cycle in a Six Sigma project. This analysis measures how well the performance of a process fits the customer’s requirements, which are translated into specification limits for the characteristics of interest of the product to be manufactured or produced. The results of this analysis may help industrial engineers identify variation within a process and develop further action plans that lead to better yield, lower variation and fewer defects.
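
The core capability indices are simple to compute; a minimal sketch in Python (the article itself works in R), with made-up specification limits:

    import numpy as np

    rng = np.random.default_rng(0)
    measurements = rng.normal(loc=10.0, scale=0.1, size=200)
    LSL, USL = 9.7, 10.3  # specification limits from customer requirements

    mu, sigma = measurements.mean(), measurements.std(ddof=1)

    cp = (USL - LSL) / (6 * sigma)               # potential capability
    cpk = min(USL - mu, mu - LSL) / (3 * sigma)  # penalizes an off-center process

    print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")  # values around 1.33+ are typically deemed capable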


How to Navigate Artificial Intelligence Landscape?

Understanding Various Terminologies & Roles in Artificial Intelligence Projects. Artificial Intelligence (AI) is a complex and evolving field. The first challenge an AI aspirant faces is understanding the landscape and how to navigate it. Consider this: if you are travelling to a new city and you don’t have a map, you will have trouble navigating the city and will need to ask a lot of random people along the way, without knowing how much they know about the place. Similarly, all newcomers to AI have this trouble, and there are two ways to deal with it: arrange the map (or a guide), or travel yourself and learn from experience.

Distilled News

Thursday, 05 Dec 2019


Chatbots: A User’s Guide

A Brief History of Chatbot Technology, and the Problems and Solutions that this Software Offers for Businesses


Point Biserial Correlation with Python

Linear regression is a classic technique to determine the correlation between two or more continuous features of a dataset. This is of course only ideal if the features have an almost linear relationship.
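
The point biserial correlation itself – a continuous variable against a binary one – is a one-liner with scipy; a minimal sketch on synthetic data:

    import numpy as np
    from scipy.stats import pointbiserialr

    rng = np.random.default_rng(0)
    group = rng.integers(0, 2, size=100)             # binary feature
    score = 50 + 10 * group + rng.normal(0, 5, 100)  # continuous feature shifted by group

    r, p_value = pointbiserialr(group, score)
    print(f"r = {r:.2f}, p = {p_value:.3g}")  # a strong positive correlation is expected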


The Rise of Chatbots and AI Assistants

How Invisible Interfaces such as Chatbots and Intelligent Assistants are Redefining Brand Experiences and Customer Engagement


Eigenvectors and Eigenvalues – All you need to know

There are too many reasons why eigenvalues are so important in mathematics to list them all. Here is a short list of the applications that come to mind right now:
• Principal Components Analysis (PCA) in dimensionality reduction and object/image recognition. (See PCA)
• Face recognition by computing eigenvectors of images (See Eigenfaces).
• Physics – stability analysis, the physics of rotating bodies (See Stability Theory).
• Google uses it to rank pages for your search results (See PageRank).
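
To see these objects concretely, here is a minimal numpy sketch on a toy matrix:

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

    eigenvalues, eigenvectors = np.linalg.eig(A)
    print(eigenvalues)   # [3. 1.] (order may vary)
    print(eigenvectors)  # columns are the corresponding eigenvectors

    # Check the defining property A v = lambda v for the first pair.
    v, lam = eigenvectors[:, 0], eigenvalues[0]
    print(np.allclose(A @ v, lam * v))  # True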


Graph Neural Networks and Permutation invariance

Using invariance theory for learning on graph and relational data. In one of the previous posts, we discussed a way to learn aggregate functions. The specific need arises when analyzing relational data, where a conclusion has to be made based on an unknown number of records – for example, classification of customers based on transaction history. A more general case is learning on a graph, where we have to predict a node’s property based on the properties of the vertices connected to the current vertex, as well as the properties of the edges that terminate on it. To illustrate, consider the problem of predicting customer churn based on transaction history. Below is an ER diagram of the famous Northwind database.


Fine-tuning BERT with Keras and tf.Module

In this experiment we convert a pre-trained BERT model checkpoint into a trainable Keras layer, which we use to solve a text classification task. We achieve this by using a tf.Module, which is a neat abstraction designed to handle pre-trained Tensorflow models. Exported modules can be easily integrated into other models, which facilitates experiments with powerful NN architectures.


Understanding Dimensionality Reduction for Machine Learning

Dimensionality reduction is a technique in Machine Learning that reduces the number of features in your dataset. The great thing about dimensionality reduction is that it does not negatively affect your machine learning model’s performance; in some cases, the technique has even increased the accuracy of a model. By reducing the number of features in our dataset, we also reduce the storage space required for the data, and our Python interpreter needs less time to go through the dataset. In this post, we will take a look at two of the most popular dimensionality reduction techniques, Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA).
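
A minimal sketch of both techniques side by side in scikit-learn (the 4-feature iris data is just a convenient stand-in):

    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    X, y = load_iris(return_X_y=True)  # 150 samples, 4 features

    # PCA: unsupervised; keeps the directions of maximum variance.
    X_pca = PCA(n_components=2).fit_transform(X)

    # LDA: supervised; keeps the directions that best separate the classes.
    X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)

    print(X_pca.shape, X_lda.shape)  # (150, 2) (150, 2)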


[NLP] Basics: Understanding Regular Expressions

When I started learning natural language processing, regular expressions truly felt like a foreign language. I struggled to understand the syntax and it would take me hours to write a regular expression that would return the input I was looking for. Naturally, I tried to stay away from them as long as I could. But the truth is, as a data scientist, you’ll have to engage with regular expressions one day or the other. They form part of the basic techniques in natural language processing and learning them will go a long way to making you a more efficient programmer. So it’s time to sit down and get to it. Think of learning regular expressions like a grammar class: they’re painful, it will seem incomprehensible at first but once you understand it and learn it, you’ll feel so relieved it’s behind you. And I promise you, it’s not that hard in the end.
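
As a small taste of the grammar, here is a minimal sketch with Python’s re module (the pattern is illustrative, not a production email matcher):

    import re

    text = "Contact us at support@example.com or sales@example.org."

    # One token at a time: [\w.+-]+ user part, @ literal, [\w-]+\.\w+ domain.
    emails = re.findall(r"[\w.+-]+@[\w-]+\.\w+", text)
    print(emails)  # ['support@example.com', 'sales@example.org']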


The Hidden Peculiarities of Realtime Data Streaming Applications

With the increasing number of open-source frameworks such as Apache Flink, Apache Spark, Apache Storm, and cloud frameworks such as Google Dataflow, creating realtime data-processing jobs has become quite easy. The APIs are well defined, and the standard concepts such as Map-Reduce follow almost similar semantics across all frameworks. However, still today, a developer starting in the realtime data processing world struggles with some of the peculiarities of this domain. Due to this, they unknowingly create a path that leads to rather common errors in the application. Let’s take a look at a few of the odd concepts which you might need to conquer while designing your realtime application.


Introduction to Bayesian Belief Networks

Bayesian Belief Network or Bayesian Network or Belief Network is a Probabilistic Graphical Model (PGM) that represents conditional dependencies between random variables through a Directed Acyclic Graph (DAG).


Build pipelines with Pandas using “pdpipe”

We show how to build intuitive and useful pipelines with Pandas DataFrame using a wonderful little library called pdpipe.
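
A hedged sketch of the chaining style the library is known for (stage names as in pdpipe’s docs at the time; verify against your installed version):

    import pandas as pd
    import pdpipe as pdp

    df = pd.DataFrame({
        "price": [250000, 310000, 180000],
        "size": [1200, 1600, 900],
        "city": ["austin", "dallas", "austin"],
    })

    # Stages chain with '+': drop a column, then one-hot encode a categorical.
    pipeline = pdp.ColDrop("size") + pdp.OneHotEncode("city")
    print(pipeline(df))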

Distilled News

Wednesday, 04 Dec 2019


BERTasticity – Part 1

Understanding Transformers – the core behind the mammoth (BERT). In the language modelling domain, BERT is something that has created quite a stir since it was introduced. A lot of similar models have come out since then, constantly competing over which one is better. Some of the alternatives include:
• GPT,
• GPT-2,
• RoBERTa,
• DistilBERT,
• XLNet, etc.
BERT and these alternatives have found applications in numerous NLP problem statements like Machine Translation, Text Summarisation, Question Answering, etc. One rule for any famous algorithm is that its introduction is followed by a lot of explanations and a lot of packages enabling you to apply it. This article can also be counted among them, but it may differ in its approach and examples.


Forecasting in Python with Facebook Prophet

In this post, I’ll explain how to forecast using Facebook’s Prophet and demonstrate a few advanced techniques for handling trend inconsistencies by using domain knowledge. There are a lot of Prophet tutorials floating around the web, but none of them went into any depth about tuning a Prophet model, or about integrating analyst knowledge to help a model navigate the data. I intend to do both of those with this post.
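
For reference before the advanced tuning, the basic Prophet workflow is only a few lines (the CSV is a placeholder; the ds/y column names are required by the library):

    import pandas as pd
    from fbprophet import Prophet  # packaged as 'prophet' in newer releases

    # Prophet expects a dataframe with columns ds (dates) and y (values).
    df = pd.read_csv("daily_metric.csv")  # hypothetical input file

    m = Prophet()
    m.fit(df)

    future = m.make_future_dataframe(periods=365)  # extend one year ahead
    forecast = m.predict(future)
    print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())

The tuning the post describes happens through constructor arguments such as changepoint_prior_scale and analyst-supplied changepoints.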


Automated testing with ‘testthat’ in practice

You test your code. We know you do. How else are you sure that your changes don’t break the program? But after you commit, you discard those pesky scripts and throw away code. Don’t you think it’s a bit of a waste to dump all that effort that took you quite a decent chunk of your day to conjure? Well, here you are, so let’s see another way. A better way.


Human in the Loop Auto Machine Learning with Dabl

The most time-consuming parts of any machine learning project are the initial phases of data cleaning, pre-processing and analysis. Prior to training a model, you first need to go through a lengthy process: carrying out tasks such as dealing with missing values, converting categorical variables to numerical ones, and analysing the data to inform feature engineering and selection.


An overview of several recommendation systems

Collaborative filtering, KNN, deep learning, transfer learning, TF-IDF, etc. – we explore all of these.


Economics of data science

Everywhere we look, data science – perhaps simply a combination of statistical analysis, machine learning and data analytics as Cassie Kozyrkov put it in one of her articles – is on a hot streak. So much data generated, so many questions to ask.


Google’s new ‘Explainable AI’ (xAI) service

Google has started offering a new service for ‘explainable AI’ or XAI, as it is fashionably called. Presently offered tools are modest, but the intent is in the right direction.


A Practical Way to Include an Ethics Review in Your Development Processes

Imagine you are in a meeting with 5-8 people. It’s a development meeting, early in the product cycle. Maybe it is a sprint planning session, with ideas flowing from the group about how to approach the problem. You can feel the energy and excitement of the fully engaged team members, with each suggestion bringing about a better solution. The team has brought its best ideas to the table and collectively shaped them, and the meeting is about to wrap up. But, is your solution ethical? Is it ‘ok’ for your customers, or employees? How can you know? Aren’t these questions for the lawyers and people that don’t do the real development work? Most of us in this function/space aren’t used to asking these questions, and our tools don’t account for ethical debates. How can we do it?


Best Artificial Intelligence Technologies to know in 2019

1. Natural Language Generation
2. Speech Recognition
3. Virtual Agents
4. Machine Learning Platforms
5. AI-Optimized Hardware
6. Decision Management
7. Deep Learning Platforms
8. Biometrics
9. Robotic Process Automation
10. Text Analytics and Natural Language Processing
11. Digital Twin/AI Modeling
12. Cyber Defense
13. Compliance
14. Knowledge Worker Aid
15. Content Creation
16. Peer-to-Peer Networks
17. Emotion Recognition
18. Image Recognition
19. Advertising and Marketing Automation


Julia Box: Google Colab for Julia

JuliaBox is similar to Colab, but rather than running Python, it runs Julia. Just like Colab, JuliaBox is free.


Advantages and Disadvantages of Artificial Intelligence

Advantages:
1) Reduction in Human Error
2) Takes risks instead of Humans
3) Available 24×7
4) Helping in Repetitive Jobs
5) Digital Assistance
6) Faster Decisions
7) Daily Applications
8) New Inventions
Disadvantages:
1) High Costs of Creation
2) Making Humans Lazy
3) Unemployment
4) No Emotions
5) Lacking Out of Box Thinking


Introducing Deep Java Library(DJL)

We are excited to announce the Deep Java Library (DJL), an open source library to develop, train and run Deep learning models in Java using intuitive, high-level APIs. If you are a Java user interested in learning Deep learning, DJL is a great way to start learning. If you’re a Java developer working with Deep learning models, DJL will simplify the way you train and run predictions. In this post, we will show how to run a prediction with a pre-trained Deep learning model in minutes.

Distilled News

Monday, 02 Dec 2019


I had no idea how to build a Machine Learning Pipeline. But here’s what I figured.

As a postgraduate studying Artificial Intelligence (AI), my exposure to Machine Learning (ML) is largely academic. Yet, when given a task to create a simple ML pipeline for a time series forecast model, I realised how clueless I was. I could barely find any specific information or code on this topic, hence I decided to write about it. This article presents a basic structure for how a simple ML pipeline can be created (more information may be supplemented over time).
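
For readers in the same spot, a minimal sketch of one common backbone – scikit-learn’s Pipeline – with illustrative steps, not the author’s exact setup:

    from sklearn.datasets import make_regression
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = make_regression(n_samples=300, n_features=5, noise=10, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Chain preprocessing and model so they are fit and reused as one unit.
    pipe = Pipeline([
        ("scale", StandardScaler()),
        ("model", Ridge()),
    ])
    pipe.fit(X_tr, y_tr)
    print(pipe.score(X_te, y_te))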


John Allspaw: People are the adaptable element of complex systems

All work in software involves people facing multiple tangled layers of trade-offs and coping with complexity. Uncertainty, ambiguity, and dilemmas are part of the everyday experience in modern enterprises. Exploring and understanding how people successfully cope with these challenges is core to Resilience Engineering. I will talk about the apparent irony of finding sources of resilience (sustaining the capacity to adapt to the unforeseen) by examining closely what would otherwise be categorized as failure: the messy details of critical incidents.


A Visual Guide to Using BERT for the First Time

Progress has been rapidly accelerating in machine learning models that process language over the last couple of years. This progress has left the research lab and started powering some of the leading digital products. A great example of this is the recent announcement of how the BERT model is now a major force behind Google Search. Google believes this step (or progress in natural language understanding as applied in search) represents ‘the biggest leap forward in the past five years, and one of the biggest leaps forward in the history of Search’. This post is a simple tutorial for how to use a variant of BERT to classify sentences. This is an example that is basic enough as a first intro, yet advanced enough to showcase some of the key concepts involved.


CCSM: Scalable statistical anomaly detection to resolve app crashes faster

Our family of mobile apps is used by more than 2 billion people every month – on a wide variety of mobile devices. We employ rigorous code review and testing processes, but, as with any system, software bugs still sometimes slip through and may even cause our apps to crash. Resolving these crashes and other reliability issues in a timely manner is a top priority. To help us respond as quickly as possible, we have been creating a collection of services that use machine learning (ML) to aid engineers in diagnosing and resolving software reliability and performance issues. As part of this collection, we recently implemented continuous contrast set mining (CCSM), an anomaly-detection framework that uses contrast set mining (CSM) techniques to locate statistically ‘interesting’ (defined by several statistical properties) sets of features in groups. A novel algorithm we’ve developed extends standard contrast set mining from categorical data to continuous data, inspired by tree search algorithms and multiple hypothesis testing. Our model is more than 40 times faster than naive baseline approaches, enabling us to scale to challenging new data sets and use cases.


Human-Machine Collaboration: The Future of Work

Today, organizations are rethinking work as we know it. We are seeing a fundamental shift in the work model to one that fosters human-machine collaboration, enables new skills and worker experiences, and supports an environment unbounded by time or physical space. Many companies call this the ‘future of work,’ but the reality is that many of these changes are occurring in the present day. ‘Digital workers’ are making up a growing share of the workforce. We define a ‘digital worker’ as technology – including artificial intelligence (AI), intelligent process automation (IPA), augmented reality/virtual reality (AR/VR), and software robotics – that automates and augments work previously accomplished by humans.


Unsupervised Sentiment Analysis

How to extract sentiment from data without any labels. One of the common applications of NLP methods is sentiment analysis, where you try to extract information about the writer’s emotions from the data. Mainly, at least at the beginning, you would try to distinguish between positive and negative sentiment, possibly also neutral, or even retrieve a score associated with a given opinion based only on text.


Stock market forecasting using Time Series analysis

Most research on the stock or share market focuses narrowly on buy and sell decisions, and it fails to address the dimensionality and expectations of a new investor. The common perception of the stock market is that it is too risky for investment or unsuitable for trading, so most people are not even interested. Understanding the seasonal variance and steady flow of an index helps both existing and new investors decide whether to invest in the stock/share market. For these kinds of problems, time series analysis is a well-suited tool for forecasting the trend, or even the future, and the resulting trend chart provides adequate guidance for the investor. So let us understand this concept in detail and use a machine learning technique to forecast stocks.
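
A minimal sketch of the kind of trend/seasonality analysis described, using statsmodels' seasonal decomposition (the file name, column name, and monthly resampling are assumptions; any price series with a date index will do):

```python
# Decompose a price series into trend, seasonal, and residual components.
# File name, column, and resampling frequency are illustrative assumptions.
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

prices = pd.read_csv("stock_prices.csv", parse_dates=["Date"], index_col="Date")
monthly = prices["Close"].resample("M").last()       # month-end closing prices

result = seasonal_decompose(monthly, model="additive", period=12)  # yearly cycle
result.plot()                                        # trend chart plus seasonality
plt.show()
```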


Time Series Forecasting with LSTMs using TensorFlow 2 and Keras in Python

Introduction to data preparation and prediction for Time Series forecasting using LSTMs. Learn about Time Series and making predictions using Recurrent Neural Networks. Prepare sequence data and use LSTMs to make simple predictions.
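
As a taste of that workflow, here is a minimal TensorFlow 2 / Keras sketch (the synthetic series, window size, and layer sizes are illustrative assumptions, not the tutorial's own choices):

```python
# Minimal LSTM forecasting sketch: slice a univariate series into windows,
# fit an LSTM, predict the next step. All sizes here are illustrative.
import numpy as np
import tensorflow as tf

series = np.sin(np.linspace(0, 100, 1000))           # stand-in for real data

def make_windows(data, window=30):
    X = np.array([data[i:i + window] for i in range(len(data) - window)])
    y = data[window:]
    return X[..., np.newaxis], y                     # (samples, timesteps, features)

X, y = make_windows(series)
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(X.shape[1], 1)),
    tf.keras.layers.Dense(1),
])
model.compile(loss="mse", optimizer="adam")
model.fit(X, y, epochs=5, verbose=0)
print(model.predict(X[-1:]))                         # one-step-ahead forecast
```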


Dataset unavailable? No problem!

What if you couldn't find a dataset even on those sites? What would you do? Would you blame the internet for not giving you the dataset you need, or would you curse the whole universe? Well, I would do neither. I would create my own dataset, and trust me, it wouldn't take longer than five minutes. Now, let me show you how to create your own dataset quickly. We'll be using a Python package called Faker.
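
A quick sketch of generating such a dataset with Faker (the chosen fields and row count are arbitrary; Faker ships many more providers):

```python
# Generate a synthetic tabular dataset with Faker and save it as CSV.
# The fields below are an arbitrary illustrative selection.
import pandas as pd
from faker import Faker

fake = Faker()
rows = [{
    "name":    fake.name(),
    "email":   fake.email(),
    "address": fake.address().replace("\n", ", "),
    "joined":  fake.date_this_decade(),
} for _ in range(100)]

df = pd.DataFrame(rows)
df.to_csv("fake_users.csv", index=False)
print(df.head())
```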


Microsoft Introduces Icebreaker to Address the Famous Ice-Start Challenge in Machine Learning

The new technique allows the deployment of machine learning models that operate with minimal training data. The acquisition and labeling of training data remains one of the major challenges for the mainstream adoption of machine learning solutions. Within the machine learning research community, several efforts such as weakly supervised learning and one-shot learning have been created to address this issue. Microsoft Research recently incubated a group called Minimum Data AI to work on different solutions for machine learning models that can operate without the need for large training datasets. Recently, that group published a paper unveiling Icebreaker, a framework for 'wise training data acquisition' which allows the deployment of machine learning models that can operate with little or no training data. The current evolution of machine learning research and technologies has prioritized supervised models that need to know quite a bit about the world before they can produce any relevant knowledge. In real-world scenarios, the acquisition and maintenance of high-quality training datasets proves quite challenging and sometimes impossible. In machine learning theory, we refer to this dilemma as the ice(cold)-start problem.


Moving AI and ML from research into production

Dean Wampler discusses the challenges and opportunities businesses face when moving AI from discussions to production.

Distilled News

01 Sunday Dec 2019

Posted by Michael Laux in Distilled News

≈ Leave a comment

6 essential practices to successfully implement machine learning solutions in your organization.

Executive's Guide to Successfully Becoming an AI-Driven Enterprise. McKinsey Insights recently published its Global AI Survey and discussed many aspects of the impact AI is generating across multiple companies. What really caught my eye was the comparison between AI high-performing companies and the rest. According to that comparison, companies with a clear enterprise-level road map of use cases, solid cross-functional collaboration between the analytics and business units, a standard AI toolset for professionals to use, an understanding that AI models need frequent updating, and systematic tracking of a comprehensive set of well-defined KPIs for AI perform 3.78x better than other players in the market.


I’m Bayesed and I know it

If you’re too young to realize where the title reference comes from, I’m gonna make you lose your mind. It has something to do with parties and rocks and anthems. Actually, no, I just want you to have a good time so I’ll instead ask you to take a look at the title picture. What did you notice? I am obviously drawing your attention to both the title and picture for a reason. With the title, you might not have realized there was a ‘pattern’ to it till I pointed it out. With the picture, if you only took a quick glance, you might have seen just sheep. If you managed to figure both out without me having to point it out, you can stop reading.


Writing Linguistic Rules for Natural Language Processing

When I first started exploring data science towards the end of my Ph.D. program in linguistics, I was pleased to discover the role of linguistics – specifically, linguistic features – in the development of NLP models. At the same time, I was a bit perplexed by why there was relatively little talk of exploiting syntactic features (e.g., level of clausal embedding, presence of coordination, type of speech act, etc.) compared to other types of features, such as how many times a certain word occurs in the text (lexical features), word similarity measures (semantic features), and even where in the document a word or phrase occurs (positional features). For example, in a sentiment analysis task, we may use a list of content words (adjectives, nouns, verbs, and adverbs) as features for a model to predict the semantic orientation of user feedback (i.e., positive or negative). In another feedback classification task, we may curate domain-specific lists of words or phrases to train a model that can direct user comments to appropriate divisions of support, e.g., billing, technical, or customer service.
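
As a hedged sketch of what encoding a linguistic rule can look like in practice, here is spaCy's rule-based Matcher keying on part-of-speech tags (the adverb + adjective pattern, e.g. 'really slow', is an illustrative choice, not a rule from the article):

```python
# One way to write a linguistic rule: spaCy's Matcher over POS tags.
# The pattern below (adverb followed by adjective) is purely illustrative.
import spacy
from spacy.matcher import Matcher

nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)
matcher.add("INTENSIFIED_ADJ", [[{"POS": "ADV"}, {"POS": "ADJ"}]])

doc = nlp("The billing page is really slow and genuinely confusing.")
for _, start, end in matcher(doc):
    print(doc[start:end].text)   # e.g. "really slow", "genuinely confusing"
```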


Dynamic Meta Embeddings in Keras

Many NLP solutions make use of pretrained word embeddings. The choice of which one to use is often related to the final performance and is arrived at after a lot of trials and manual tuning. Researchers at Facebook AI argued that the best way to make this kind of selection is to let the neural network figure it out by itself. They introduced dynamic meta-embeddings, a simple yet effective method for the supervised learning of embedding ensembles, which leads to state-of-the-art performance within the same model class on a variety of tasks. This simple but extremely efficient method learns a linear combination of a set of selected word embeddings, which outperforms the naive concatenation of the various embeddings. As mentioned, the authors proved the validity of their solution on various tasks in the NLP domain. We limit ourselves to adopting these techniques in a text classification problem, where we have two pretrained embeddings and want to combine them intelligently in order to boost the final performance.
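
A simplified Keras sketch of the idea follows: two frozen embedding matrices combined through learned, attention-style weights. The random matrices stand in for real pretrained embeddings, and this is a stripped-down take on dynamic meta-embeddings, not the authors' exact implementation:

```python
# Simplified dynamic meta-embeddings: stack two frozen embeddings and learn
# softmax weights over them per token. Random matrices stand in for
# pretrained vectors (GloVe, fastText, ...); sizes are illustrative.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

vocab_size, dim, seq_len = 10_000, 100, 50
emb_a = np.random.rand(vocab_size, dim).astype("float32")
emb_b = np.random.rand(vocab_size, dim).astype("float32")

tokens = layers.Input(shape=(seq_len,), dtype="int32")
a = layers.Embedding(vocab_size, dim, weights=[emb_a], trainable=False)(tokens)
b = layers.Embedding(vocab_size, dim, weights=[emb_b], trainable=False)(tokens)

stacked = layers.Lambda(lambda t: tf.stack(t, axis=-2))([a, b])  # (b, seq, 2, dim)
scores = layers.Dense(1)(stacked)                                # score per embedding
weights = layers.Lambda(lambda s: tf.nn.softmax(s, axis=-2))(scores)
mixed = layers.Lambda(
    lambda t: tf.reduce_sum(t[0] * t[1], axis=-2))([weights, stacked])  # (b, seq, dim)

out = layers.Dense(1, activation="sigmoid")(layers.GlobalAveragePooling1D()(mixed))
model = tf.keras.Model(tokens, out)
model.compile(loss="binary_crossentropy", optimizer="adam")
model.summary()
```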


Machine Learning on Encrypted Data Without Decrypting It

Suppose you have just developed a spiffy new machine learning model (using Flux.jl, of course) and now want to start deploying it for your users. How do you go about doing that? Probably the simplest thing would be to just ship your model to your users and let them run it locally on their data. However, there are a number of problems with this approach (see the toy sketch after this list):
• ML models are large and the user’s device may not have enough storage or computation to actually run the model.
• ML models are often updated frequently and you may not want to send the large model across the network that often.
• Developing ML models takes a lot of time and computational resources, which you may want to recover by charging your users for making use of your model.
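
The title's answer is to evaluate the model on encrypted inputs. As a purely pedagogical illustration of the underlying homomorphic idea (the post itself works in Julia with a real lattice-based scheme; this toy cipher is an assumption for illustration only), here is an additively homomorphic toy in Python:

```python
# Toy additive homomorphism: E(m) = (m + k) mod n lets a server add two
# ciphertexts without seeing the plaintexts. Real systems use lattice-based
# schemes such as CKKS; this toy cipher is purely pedagogical.
import random

n = 2**61 - 1                          # public modulus

def encrypt(m, key):  return (m + key) % n
def decrypt(c, key):  return (c - key) % n

k1, k2 = random.randrange(n), random.randrange(n)
c1, c2 = encrypt(20, k1), encrypt(22, k2)

c_sum = (c1 + c2) % n                  # server adds ciphertexts blindly
print(decrypt(c_sum, (k1 + k2) % n))   # -> 42, recovered only by the key holder
```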


Learning Data Structure Alchemy

We propose a solution based on first principles and AI to the decades-old problem of data structure design. Instead of working on individual designs that each can only be helpful in a small set of environments, we propose the construction of an engine, a Data Alchemist, which learns how to blend fine-grained data structure design principles to automatically synthesize brand new data structures.


Interpretability: Cracking open the black box – Part III

Previously, we looked at the pitfalls of the default 'feature importance' in tree-based models and talked about permutation importance, LOOC importance, and partial dependence plots. Now let's switch lanes and look at a few model-agnostic techniques which take a bottom-up way of explaining predictions. Instead of looking at the model and trying to come up with global explanations like feature importance, this set of methods looks at each single prediction and then tries to explain it.
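
SHAP is one widely used technique in this family; a minimal sketch of explaining a single prediction with the shap library follows (the dataset, model, and background-sample size are assumptions, and the article may cover other methods such as LIME):

```python
# Local, model-agnostic explanation of one prediction with SHAP's
# KernelExplainer. Dataset and model here are illustrative stand-ins.
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
model = RandomForestClassifier(n_estimators=50).fit(data.data, data.target)

background = shap.sample(data.data, 50)              # summarize the training data
explainer = shap.KernelExplainer(model.predict_proba, background)
shap_values = explainer.shap_values(data.data[:1])   # explain one prediction
print(shap_values)                                   # per-feature contributions
```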


What does a Fine-tuned BERT model look at?

There is a lot of buzz around NLP of late, especially after the advances in transfer learning techniques and the advent of architectures like transformers. As someone from the applied side of machine learning, I feel that it is not only important to have models that can surpass state-of-the-art results on many benchmarks; it is also important to have models that are trustworthy, understandable, and not complete black boxes. This post is an attempt to understand what BERT learns during task-specific training. Let's start with how attention is implemented in a Transformer and how it can be leveraged for understanding the model (feel free to skip this section if you are already aware of it).
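
As a starting point for this kind of analysis, here is a minimal sketch of pulling attention weights out of a BERT model with Hugging Face transformers (the base model name is a placeholder; a fine-tuned checkpoint would be loaded the same way):

```python
# Extract per-layer, per-head attention weights from BERT for inspection.
# "bert-base-uncased" is a placeholder; swap in a fine-tuned checkpoint.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tokenizer("The movie was surprisingly good.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

attentions = outputs.attentions      # tuple: one tensor per layer
print(len(attentions))               # 12 layers for bert-base
print(attentions[0].shape)           # (batch, heads, seq_len, seq_len)
```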


Variance, Attractors and Behavior of Chaotic Statistical Systems

We study the properties of a typical chaotic system to derive general insights that apply to a large class of unusual statistical distributions. The purpose is to create a unified theory of these systems. These systems can be deterministic or random, yet due to their gentle chaotic nature, they exhibit the same behavior in both cases. They lead to new models with numerous applications in Fintech, cryptography, simulation and benchmarking tests of statistical hypotheses. They are also related to numeration systems. One of the highlights in this article is the discovery of a simple variance formula for an infinite sum of highly correlated random variables. We also try to find and characterize attractor distributions: these are the limiting distributions for the systems in question, just like the Gaussian attractor is the universal attractor with finite variance in the central limit theorem framework. Each of these systems is governed by a specific functional equation, typically a stochastic integral equation whose solutions are the attractors. This equation helps establish many of their properties. The material discussed here is state-of-the-art and original, yet presented in a format accessible to professionals with limited exposure to statistical science. Physicists, statisticians, data scientists and people interested in signal processing, chaos modeling, or dynamical systems will find this article particularly interesting. Connection to other similar chaotic systems is also discussed.
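
A toy simulation in the spirit of such systems is sketched below: iterate a classic chaotic map and measure the empirical variance of sums of the (highly correlated) iterates across random starting points. The logistic map is an illustrative assumption here; the article defines and studies its own family of systems and its own variance formula:

```python
# Toy experiment: empirical variance of a sum of correlated iterates of a
# chaotic map. The logistic map below is illustrative; the article's systems
# and variance formula are its own.
import numpy as np

def orbit(x0, n=1000):
    xs = np.empty(n)
    x = x0
    for i in range(n):
        x = 4.0 * x * (1.0 - x)      # fully chaotic logistic map
        xs[i] = x
    return xs

sums = [orbit(x0).sum() for x0 in np.random.rand(200)]
print(np.var(sums))                  # empirical variance of the correlated sum
```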


Is Data Science dying?

People who have been in the industry for a long time can relate to this. Many years ago, the industry went crazy for a similar skill set known as Business Analytics. Nowadays, the term Data Scientist is exploding on the internet, and it is the modern job that seems very promising.


History of AI; Labeling ‘AI’ correctly; Excerpts from upcoming ‘AI Bill of Rights’

What is the difference between artificial intelligence and true intelligence? Artificial intelligence, to me, is when a group purposefully tries to make a single individual more intelligent. Before we can talk about the history of AI, we must accurately define it as well as label it. We must decipher what AI stands for; then we can go back to the roots of AI and where its future is heading.


How PyTorch lets you build and experiment with a neural net

We show, step-by-step, a simple example of building a classifier neural network in PyTorch and highlight how easy it is to experiment with advanced concepts such as custom layers and activation functions.
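
In that spirit, here is a minimal PyTorch sketch of a classifier with a custom activation function (the layer sizes and the Swish-style activation are illustrative choices, not necessarily the article's):

```python
# Small PyTorch classifier with a custom activation module, illustrating how
# easily experimental components drop in. All sizes here are illustrative.
import torch
import torch.nn as nn

class Swish(nn.Module):               # custom activation: x * sigmoid(x)
    def forward(self, x):
        return x * torch.sigmoid(x)

model = nn.Sequential(
    nn.Linear(20, 64),
    Swish(),
    nn.Linear(64, 3),                 # 3-class classifier head
)

x = torch.randn(8, 20)                # a dummy input batch
logits = model(x)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 3, (8,)))
loss.backward()                       # gradients flow through the custom layer
print(logits.shape)
```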