With ML.NET, you can create custom ML models using C# or F# without having to leave the .NET ecosystem. ML.NET lets you re-use all the knowledge, skills, code, and libraries you already have as a .NET developer so that you can easily integrate machine learning into your web, mobile, desktop, gaming, and IoT apps.

Graph Processing on FPGAs: Taxonomy, Survey, Challenges

Graph processing has become an important part of various areas, such as machine learning, computational sciences, medical applications, social network analysis, and many others. Various graphs, for example web or social networks, may contain up to trillions of edges. The sheer size of such datasets, combined with the irregular nature of graph processing, poses unique challenges for the runtime and the consumed power. Field Programmable Gate Arrays (FPGAs) can be an energy-efficient solution to deliver specialized hardware for graph processing. This is reflected by the recent interest in developing various graph algorithms and graph processing frameworks on FPGAs. To facilitate understanding of this emerging domain, we present the first survey and taxonomy on graph computations on FPGAs. Our survey describes and categorizes existing schemes and explains key ideas. Finally, we discuss research and engineering challenges to outline the future of graph computations on FPGAs.

Predicting the next search keyword using Deep Learning

Next Word Prediction, or Language Modeling, is the task of predicting which word comes next. You might be using it daily when you write texts or emails without realizing it. E-commerce, especially grocery e-commerce, can benefit extensively from such features.
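The core idea can be illustrated with a minimal bigram frequency model over a toy grocery-query corpus (the corpus and words below are made up for illustration; a production system would use a neural language model trained on real query logs):

```python
from collections import Counter, defaultdict

# Toy corpus; a real system would train on millions of search queries.
corpus = [
    "buy fresh milk",
    "buy fresh bread",
    "buy organic milk",
    "order fresh bread",
]

# Count bigram frequencies: how often each word follows another.
bigrams = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev_word, next_word in zip(words, words[1:]):
        bigrams[prev_word][next_word] += 1

def predict_next(word):
    """Return the word most frequently observed after `word`."""
    if word not in bigrams:
        return None
    return bigrams[word].most_common(1)[0][0]
```

Here `predict_next("buy")` returns "fresh", since "fresh" follows "buy" most often in the toy corpus.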

Machine Learning: Naive Bayes Classifier And Naive Assumption Explained

In contrast to other machine learning algorithms that run through multiple iterations in order to converge towards some solution, naive Bayes classifies data solely on the basis of conditional probabilities. Naive Bayes has the following advantages:
• extremely fast for both training and making predictions
• interpretable
• doesn’t require hyperparameter tuning
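To make the "solely based on conditional probabilities" point concrete, here is a minimal naive Bayes classifier built from scratch on a made-up toy dataset (a sketch, not the article's code), using add-one (Laplace) smoothing for unseen feature values:

```python
from collections import Counter, defaultdict

# Tiny illustrative dataset: (outlook, temperature) -> play tennis?
data = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("overcast", "hot"), "yes"),
    (("rain", "mild"), "yes"),
    (("rain", "cool"), "yes"),
    (("overcast", "cool"), "yes"),
]

class_counts = Counter(label for _, label in data)
# feature_counts[i][label][value] = count of value for feature i given label
feature_counts = defaultdict(lambda: defaultdict(Counter))
vocab = defaultdict(set)  # distinct observed values per feature
for features, label in data:
    for i, value in enumerate(features):
        feature_counts[i][label][value] += 1
        vocab[i].add(value)

def predict(features):
    """Pick the class maximizing P(class) * prod_i P(feature_i | class),
    with add-one (Laplace) smoothing."""
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for i, value in enumerate(features):
            seen = feature_counts[i][label]
            score *= (seen[value] + 1) / (sum(seen.values()) + len(vocab[i]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Note that both "training" (counting) and prediction are single passes over the data, which is why naive Bayes is so fast.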

Generative Temporal Memory Models

I recently came across a DeepMind paper that presents generative temporal models with memory, in order to learn long-range dependencies based on temporally-distant, past observations. Given the current popularity of adversarial models, it made sense to take a few minutes to understand how memory-enabled generative adversarial nets (GANs) might perform.

Random Forest Regression model explained in depth

In my previous article, I presented the Decision Tree Regressor algorithm. If you haven’t read that article, I would urge you to do so before continuing, because the Decision Tree is the main building block of a Random Forest. Random Forest is a flexible, easy-to-use machine learning algorithm that produces great results most of the time with minimal time spent on hyper-parameter tuning. It has gained popularity due to its simplicity and the fact that it can be used for both classification and regression tasks. In this article, I will present the Random Forest Regression model in detail.
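As a quick illustration of the idea (a sketch on synthetic data, not the article's own code), scikit-learn's RandomForestRegressor averages the predictions of many decision trees, each trained on a bootstrap sample:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic 1-D regression problem: y = x^2 plus a little noise.
rng = np.random.RandomState(42)
X = rng.uniform(-3, 3, size=(200, 1))
y = X.ravel() ** 2 + rng.normal(scale=0.1, size=200)

# An ensemble of decision trees, each fit to a bootstrap sample;
# the forest's prediction is the average over trees.
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X, y)

pred = model.predict([[2.0]])[0]  # should land near 2.0 ** 2 = 4.0
```

Averaging many deep, decorrelated trees reduces the variance of a single tree, which is the main reason the forest generalizes better than its building block.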

Automated Data Profiling Using Python

This blog is about automating the data profiling stage of the Exploratory Data Analysis (EDA) process. We will automate data profiling using Python and produce a Microsoft Word document containing the profiling results. The key advantage of producing an MS Word document as the output is that it can be used to capture the discussions and decisions made with domain experts regarding data quality, data transformations, and feature engineering for further modelling and visualisations.
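A minimal sketch of the profiling step itself, on a made-up toy DataFrame (pandas only; the Word-document export, typically done with a package such as python-docx, is omitted here to keep the sketch dependency-free):

```python
import pandas as pd

def profile(df):
    """Build a per-column profiling summary: dtype, null counts,
    and cardinality. The resulting table is what would be written
    into the Word report."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "non_null": df.notna().sum(),
        "missing": df.isna().sum(),
        "unique": df.nunique(),
    })

# Toy data for illustration.
df = pd.DataFrame({
    "age": [25, 32, None, 41],
    "city": ["NY", "NY", "LA", "SF"],
})
summary = profile(df)
```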

Intermittent demand, Croston and Die Hard

I have recently been confronted with a kind of data set and problem that I was not even aware existed: intermittent demand data. Intermittent demand arises when the demand for a certain good arrives sporadically.
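Croston's method, the classic forecaster for such series, smooths the nonzero demand sizes and the intervals between them separately, then forecasts their ratio. A minimal sketch, assuming one common formulation (initialization conventions vary across implementations):

```python
def croston(demand, alpha=0.1):
    """Croston's method for intermittent demand.
    Smooths nonzero demand sizes (z_hat) and inter-demand intervals
    (p_hat) with simple exponential smoothing; the per-period
    forecast is z_hat / p_hat. Both estimates are initialized at the
    first nonzero demand."""
    z_hat = p_hat = None  # smoothed demand size and interval
    q = 0                 # periods since the last nonzero demand
    for d in demand:
        q += 1
        if d > 0:
            if z_hat is None:  # initialize at the first demand
                z_hat, p_hat = float(d), float(q)
            else:
                z_hat = alpha * d + (1 - alpha) * z_hat
                p_hat = alpha * q + (1 - alpha) * p_hat
            q = 0
    if z_hat is None:
        return 0.0  # no demand ever observed
    return z_hat / p_hat

forecast = croston([3, 0, 0, 2], alpha=0.1)
```

For the series above, the smoothed size is 2.9 and the smoothed interval 1.2, giving an average demand of about 2.42 units per period.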

Machine Learning: Lessons Learned from the Enterprise

There’s a huge difference between the purely academic exercise of training Machine Learning (ML) models versus building end-to-end Data Science solutions to real enterprise problems. This article summarizes the lessons learned after two years of our team engaging with dozens of enterprise clients from different industries including manufacturing, financial services, retail, entertainment, and healthcare, among others. What are the most common ML problems faced by the enterprise? What is beyond training an ML model? How to address data preparation? How to scale to large datasets? Why is feature engineering so crucial? How to go from a model to a fully capable system in production? Do I need a Data Science platform if every single data science tool is available in open source? These are some of the questions that will be addressed, exposing some challenges, pitfalls, and best practices through specific industry examples.

Custom object detection for non-data scientists

In general terms, by the end of this tutorial you will be able to pick up your dataset, load it in a Jupyter notebook, and train and use your model 🙂 The picture above is the result of the example that we are going to work through here.

Traditional vs Deep Learning Algorithms used in Blockchain in the Retail Industry – III

Continuing from my previous blogs, ‘Traditional vs Deep Learning in Retail Industry’ and ‘Deep Learning vs Deep Reinforcement Learning Algorithms in Retail Industry’, this blog highlights the different ML algorithms used in blockchain transactions, with special emphasis on bitcoin in retail payments. This blog is structured as follows:
• Overview of the role of blockchain in the retail industry.
• Different traditional (SecureSVM, Bagging, Boosting, Clustering) vs deep learning algorithms (LSTM, CNN, and GAN) used in bitcoin retail payments.

19 entities for 104 languages: A new era of NER with the DeepPavlov multilingual BERT

There’s hardly anyone left in the data science community who wouldn’t agree that the release of BERT was the most exciting event in the NLP field. For those who still haven’t heard: BERT is a transformer-based technique for pretraining contextual word representations that enables state-of-the-art results across a wide array of natural language processing tasks. The BERT paper was recognized as the best long paper of the year by the North American Chapter of the Association for Computational Linguistics. Google Research released several pretrained BERT models, including the multilingual, Chinese, and English-language BERT.

Predicting Customer Lifetime Value with “Buy ‘Til You Die” probabilistic models in Python

What is a customer worth? How many more times will a customer purchase before churning? How likely are they to churn within the next 3 months? And above all, how long should we expect a customer to remain ‘alive’? While these are very common questions among Marketing, Product, VC, and Corporate Finance professionals, it is always hard to answer them properly with accurate numbers.
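One "Buy 'Til You Die" model, BG/NBD (Fader, Hardie & Lee, 2005), gives a closed-form answer to the "alive" question. A minimal sketch of that expression, with illustrative parameter values; in practice the model parameters (r, alpha, a, b) are fitted to transaction data by maximum likelihood, e.g. with the lifetimes Python package:

```python
def p_alive(x, t_x, T, r, alpha, a, b):
    """P(customer is still 'alive') under the BG/NBD model.
    x   = number of repeat purchases in the observation window,
    t_x = time of the last purchase,
    T   = length of the observation window,
    (r, alpha, a, b) = fitted model parameters.
    A customer with no repeat purchases is alive with probability 1
    under this formula."""
    if x == 0:
        return 1.0
    term = (a / (b + x - 1)) * ((alpha + T) / (alpha + t_x)) ** (r + x)
    return 1.0 / (1.0 + term)
```

Intuitively, the longer a customer goes without purchasing (larger T relative to t_x), the lower the probability they are still alive.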

The Prosecutor’s Fallacy

You know that you are innocent, but physical evidence at the scene of the crime matches your description. The prosecutor argues that you are guilty because the odds of finding this evidence given that you are innocent are so small that the jury should discard the probability that you did not actually commit the crime. But those numbers don’t add up. The prosecutor has misapplied conditional probability and neglected the prior odds of you, the defendant, being guilty before they introduced the evidence.
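The correct calculation applies Bayes' theorem with the prior included. A small sketch with hypothetical numbers (a 1-in-1,000 match probability for a random innocent person, and 10,000 people who could plausibly have committed the crime):

```python
# Hypothetical numbers for illustration.
match_given_innocent = 1 / 1000
population = 10_000

p_guilty_prior = 1 / population          # prior: any one person
p_innocent_prior = 1 - p_guilty_prior

# The truly guilty person matches with certainty.
p_match = 1.0 * p_guilty_prior + match_given_innocent * p_innocent_prior

# Bayes' theorem: P(guilty | evidence matches).
p_guilty_given_match = (1.0 * p_guilty_prior) / p_match
```

With these numbers, roughly ten innocent people in the population would also match, so the posterior probability of guilt given a match is only about 9% rather than the 99.9% the prosecutor implies.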

An Introduction to the Powerful Bayes’ Theorem for Data Science Professionals

Probability is at the very core of a lot of data science algorithms. In fact, the solutions to so many data science problems are probabilistic in nature – hence I always advise focusing on learning statistics and probability before jumping into the algorithms. But I’ve seen a lot of aspiring data scientists shunning statistics, especially Bayesian statistics. It remains incomprehensible to a lot of analysts and data scientists. I’m sure a lot of you are nodding your head at this! Bayes’ Theorem, a major aspect of Bayesian statistics, is named after Thomas Bayes, an eighteenth-century Presbyterian minister. The very fact that we’re still learning about it shows how influential his work has been across centuries! Bayes’ Theorem enables us to work on complex data science problems and is still taught at leading universities worldwide.

API requests in Python

In this article, you will learn how to make API requests in Python.
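A dependency-free sketch of the basic pattern, using only the standard library (the article may well use the popular third-party requests package instead; the endpoint below is hypothetical and is not actually called here):

```python
import json
import urllib.parse
import urllib.request

def build_url(base, params):
    """Append URL-encoded query parameters to a base URL."""
    return base + "?" + urllib.parse.urlencode(params)

def get_json(url, timeout=10):
    """Perform a GET request and decode the JSON response body.
    (Defined but not executed here, since the endpoint is made up.)"""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)

# Hypothetical API endpoint for illustration:
url = build_url("https://api.example.com/users",
                {"page": 2, "q": "data science"})
```

Calling `get_json(url)` would then return the decoded JSON payload as Python dicts and lists.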

How to measure distances in machine learning

In machine learning, many supervised and unsupervised algorithms use distance metrics to understand patterns in the input data and to recognize similarities among data points. Choosing a good distance metric improves how well a classification or clustering algorithm performs.
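A minimal sketch of three of the most common metrics with NumPy:

```python
import numpy as np

def euclidean(a, b):
    """Straight-line (L2) distance."""
    return np.sqrt(np.sum((a - b) ** 2))

def manhattan(a, b):
    """City-block (L1) distance: sum of absolute differences."""
    return np.sum(np.abs(a - b))

def cosine_distance(a, b):
    """1 - cosine similarity: compares direction, ignoring magnitude."""
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([0.0, 0.0])
b = np.array([3.0, 4.0])
```

For the points above, the Euclidean distance is 5 (the 3-4-5 triangle) while the Manhattan distance is 7, which illustrates why the choice of metric changes which neighbors an algorithm considers "close".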

The Recommender Canvas: Everything You Wanted To Know About Recommender System Design & Its Necessity Today

The internet is evolving day by day, and when users shop online, they are flooded with thousands of results, leaving them in a dilemma over choosing the best possible product that suits their requirements. Have you ever thought about how Google Ads knows precisely what you need and displays all those products in its ads, or how Netflix gives you movie recommendations you might be interested in? Yes, everything is made possible through an exciting concept called recommender systems.
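A minimal item-based collaborative-filtering sketch on a made-up rating matrix (real recommenders handle sparsity, implicit feedback, and scale far more carefully; treating unrated entries as zeros is a simplification here):

```python
import numpy as np

# Toy user-item rating matrix (rows = users, cols = items); 0 = unrated.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

def item_similarity(R):
    """Cosine similarity between item rating columns."""
    norms = np.linalg.norm(R, axis=0)
    return (R.T @ R) / np.outer(norms, norms)

def recommend(R, user, k=1):
    """Score a user's unrated items by similarity-weighted ratings
    and return the indices of the top-k items."""
    sim = item_similarity(R)
    scores = sim @ R[user]
    scores[R[user] > 0] = -np.inf  # don't re-recommend rated items
    return np.argsort(scores)[::-1][:k]
```

For user 0 (who liked items 0 and 1), the only unrated item, item 2, is returned, with its score driven by how similar it is to the items the user already rated.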

An ‘Equation-to-Code’ Machine Learning Project Walk-Through – Part 2 Non-Linear Separable Problem

A detailed explanation of the math behind the equations, building the practical math foundations for your machine learning or deep learning journey.
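A sketch of the core idea for a non-linearly separable problem (synthetic data, not the article's own code): by adding squared features, a linear decision boundary in the lifted space corresponds to a circle in the original space, so plain logistic regression trained with gradient descent can fit it.

```python
import numpy as np

# Points inside a circle are class 1, outside are class 0:
# not separable by any straight line in (x1, x2).
rng = np.random.RandomState(0)
X = rng.uniform(-2, 2, size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1.5).astype(float)

# Lift the data: bias, linear, and squared terms.
Phi = np.column_stack([np.ones(len(X)), X, X ** 2])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Plain batch gradient descent on the logistic loss.
w = np.zeros(Phi.shape[1])
for _ in range(5000):
    grad = Phi.T @ (sigmoid(Phi @ w) - y) / len(y)
    w -= 0.5 * grad

accuracy = np.mean((sigmoid(Phi @ w) > 0.5) == y)
```

Because the true boundary x1² + x2² = 1.5 is exactly expressible in the lifted feature space, the model can separate the classes almost perfectly after training.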