Automated Dashboard for Credit Modelling with Decision Trees and Random Forests in R

In this article, you learn how to make an Automated Dashboard for Credit Modelling with Decision Trees and Random Forests in R. First, you need to install the rmarkdown package into your R library. Once rmarkdown is installed, you create a new rmarkdown script in R.

Automated Dashboard for Classification Neural Network in R

In this article, you learn how to make an Automated Dashboard for a Classification Neural Network in R. First, you need to install the rmarkdown package into your R library. Once rmarkdown is installed, you create a new rmarkdown script in R.

Think your Data Different

In the last couple of years, deep learning (DL) has become a main enabler for applications in many domains such as vision, NLP, audio, clickstream data, etc. Recently, researchers have started to successfully apply deep learning methods to graph datasets in domains like social networks, recommender systems and biology, where data is inherently structured as graphs. So how do Graph Neural Networks work? And why do we need them?
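The core idea can be sketched in a few lines of NumPy. This toy example (my own, not from the article) implements one graph-convolution layer in the style of Kipf & Welling: each node updates its features by averaging over its neighbours, so information propagates along the graph's edges.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One message-passing step: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])                        # add self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))  # normalise by degree
    return np.maximum(0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)  # ReLU

# A 3-node path graph with 2 features per node and 2 output channels
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.random.rand(3, 2)   # node features
W = np.random.rand(2, 2)   # learnable weights (random here for illustration)
H_next = gcn_layer(A, H, W)
print(H_next.shape)  # (3, 2): one updated feature vector per node
```

Stacking several such layers lets each node see progressively larger neighbourhoods, which is what gives GNNs their expressive power on structured data.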

Deep Learning Vision for Non-Vision Tasks

In recent years, deep learning has revolutionized computer vision. Thanks to transfer learning and excellent learning resources, anyone can start getting state-of-the-art results within days or even hours, by using a pre-trained model and adapting it to their domain. As deep learning becomes commoditized, what is needed is its creative application to different domains. Today, deep learning in computer vision has largely solved visual object classification, detection, and recognition; in these areas, deep neural networks outperform humans. Even if your data is not visual, you can still leverage the power of these vision deep learning models, mostly CNNs. To do that, you transform your data from the non-vision domain into images and then use one of the models trained on images with your data. You will be surprised how powerful this approach is! In this post, I will present three cases where companies used deep learning creatively, applying vision deep learning models to non-vision domains. In each of these cases, a non-computer-vision problem was transformed and stated in such a way as to leverage the power of a deep learning model suited to image classification.
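To make the "transform your data into images" step concrete, here is a minimal sketch of my own (not one of the article's three cases): a 1-D sensor signal is reshaped into a 2-D array and rescaled to 0-255 pixel intensities, at which point it looks like a grayscale image a CNN could consume.

```python
import numpy as np

# Fake 1-D sensor data: 1024 samples of a sine wave
signal = np.sin(np.linspace(0, 20 * np.pi, 1024))

# Lay the series out as a 32x32 grid, one row per window of 32 samples
image = signal.reshape(32, 32)

# Rescale values to the 0-255 range of grayscale pixel intensities
image = ((image - image.min()) / (image.max() - image.min()) * 255).astype(np.uint8)
print(image.shape, image.dtype)  # (32, 32) uint8
```

Real applications use richer transforms (spectrograms, recurrence plots, Gramian angular fields), but the principle is the same: once the data is an image, any pre-trained vision model applies.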

Model Evaluation Techniques for Classification models

In machine learning, we often use classification models to predict outcomes for population data. Classification, one of the two branches of supervised learning, deals with data from different categories. The training dataset trains the model to predict the unknown labels of population data. There are multiple algorithms, namely Logistic Regression, K-Nearest Neighbours, Decision Trees, Naive Bayes, etc. All these algorithms have their own style of execution and different techniques of prediction. But in the end, we need to measure the effectiveness of an algorithm. To find the most suitable algorithm for a particular business problem, there are a few model evaluation techniques. This article discusses the different model evaluation techniques.
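The basic evaluation workflow can be sketched with scikit-learn on synthetic data (my own toy example, not the article's): hold out a test set, fit a classifier on the training portion, and score its predictions with the metrics the article goes on to discuss.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

# Synthetic binary-classification data standing in for a business dataset
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

acc = accuracy_score(y_test, y_pred)        # fraction of correct predictions
cm = confusion_matrix(y_test, y_pred)       # rows: true class, cols: predicted
print(acc)
print(cm)
```

Accuracy is only the starting point; the confusion matrix exposes the per-class errors from which precision, recall, and F1 are derived.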

Curse of Dimensionality

In Machine Learning, we often have high-dimensional data. If we’re recording 60 different metrics for each of our shoppers, we’re working in a space with 60 dimensions. If we’re analyzing grayscale images sized 50×50, we’re working in a space with 2,500 dimensions. If the images are RGB-colored, the dimensionality increases to 7,500 dimensions (one dimension for each color channel in each pixel in the image).
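The dimension counts above fall straight out of array shapes, as a quick NumPy check shows:

```python
import numpy as np

shoppers = np.zeros((1000, 60))   # 1,000 shoppers x 60 recorded metrics
gray = np.zeros((50, 50))         # one grayscale 50x50 image
rgb = np.zeros((50, 50, 3))       # one dimension per colour channel per pixel

print(shoppers.shape[1])   # 60 dimensions per shopper
print(gray.size)           # 2500 dimensions
print(rgb.size)            # 7500 dimensions
```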

Pancake: A Python package for model stacking

In a previous post, I provided a discussion of model stacking, a popular approach in data science competitions for boosting predictive performance. Since then, the post has attracted some attention, so I decided to put together a Python package which provides a simple API to stack models with minimal effort. In this post, I will present the Pancake package, which is designed to simplify the stacking process and help the user experiment with stacking efficiently. For readers who are new to model stacking, I recommend reading the previous post first. For readers who would like to learn the implementation details, I recommend going over the documentation in the repository. There are several packages out there which provide user-friendly APIs for stacking (in Python and R). The Pancake package differs mostly in the way stacking is implemented. Additionally, I have tried to document the fine details as clearly as possible to give the user insight into the inner workings of the stacking process.
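To illustrate the idea such packages automate, here is a generic stacking sketch using scikit-learn's `StackingClassifier`. Note this is not Pancake's API (see its repository documentation for that); it only shows the underlying technique: base learners produce out-of-fold predictions, and a second-level meta-learner is trained on those predictions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("svm", SVC(probability=True, random_state=0))],
    final_estimator=LogisticRegression(),  # meta-learner on stacked predictions
    cv=5,  # base predictions come from 5-fold CV to avoid leakage
)
score = stack.fit(X_tr, y_tr).score(X_te, y_te)
print(score)
```

The `cv` argument is the crucial detail: training the meta-learner on in-fold predictions would leak the base models' training data and inflate the stack's apparent skill.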

A Gentle Introduction to the XGBoost Library

If things don’t go your way in predictive modeling, use XGBoost. The XGBoost algorithm has become the ultimate weapon of many data scientists. It’s a highly sophisticated algorithm, powerful enough to deal with all sorts of irregularities in data. In this article, you will discover XGBoost and get a gentle introduction to what it is, where it came from and how you can learn more.

The Data Fabric for Machine Learning. Part 1.

How the new advances in semantics and the data fabric can help us be better at Machine Learning. Also, a new definition of machine learning.

Generating, With Style: The Mechanics Behind NVIDIA’s Highly Realistic GAN Images

Hearing that jaw-dropping results are being produced by some novel flavor of GAN is hardly a new experience if you follow the field, but even by recently heightened standards, these images are stunning. For the first time, I’m confident I wouldn’t personally be able to differentiate them from real images. Reading between the lines of the paper’s framing, it seems the primary goal of this approach was actually to create a generator architecture in which global and local image features are represented in a more separable way, and can as a result be more easily configured to specify and control the image you want to generate. The fact that the images are also astonishingly realistic appears to have been a pleasant side effect.

An intuitive guide to Gaussian processes

Gaussian processes are a powerful algorithm for both regression and classification. Their greatest practical advantage is that they can give a reliable estimate of their own uncertainty. By the end of this maths-free, high-level post, I aim to have given you an intuitive idea of what Gaussian processes are and what makes them unique among other algorithms.
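That self-reported uncertainty is easy to see in code. This small sketch (toy data of my own, not from the post) fits scikit-learn's `GaussianProcessRegressor` and asks for the standard deviation alongside each prediction: the uncertainty is small near the training points and grows away from them.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Four noiseless observations of sin(x)
X_train = np.array([[1.0], [3.0], [5.0], [6.0]])
y_train = np.sin(X_train).ravel()

gp = GaussianProcessRegressor(kernel=RBF(), random_state=0).fit(X_train, y_train)

# One query near the data (3.1) and one far from it (9.0)
X_new = np.array([[3.1], [9.0]])
mean, std = gp.predict(X_new, return_std=True)
print(std)  # the second std is larger: the GP knows less out there
```

No extra machinery is needed: the predictive standard deviation comes directly from the posterior, which is what sets GPs apart from most point-estimate models.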

Support Vector Machine: MNIST Digit Classification with Python; Including my Handwritten Digits

Following the previous detailed discussions of the SVM algorithm, I will finish this series with an application of SVM to classify handwritten digits. Here we will use the MNIST database of handwritten digits and classify numbers from 0 to 9 using SVM. The original data-set is complicated to process, so I am using the data-set processed by Joseph Redmon. I have followed the Kaggle competition procedures and you can download the data-set from Kaggle itself. The data-set is based on gray-scale images of handwritten digits, each 28 pixels in height and 28 pixels in width. Each pixel has a number associated with it, where 0 represents a dark pixel and 255 represents a white pixel. Both the train and test data-sets have 785 columns, where the ‘label’ column represents the handwritten digit and the remaining 784 columns represent the (28, 28) pixel values. The train and test data-sets contain 60,000 and 10,000 samples respectively. I will use several techniques like GridSearchCV and Pipeline which I introduced in a previous post, and some new concepts like representing a gray-scale image as a numpy array. I have used 12,000 and 5,000 samples from the training and test data-sets respectively, just to reduce computation time; it is recommended to use the full sets to obtain a better score and avoid selection bias.
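A compressed sketch of the Pipeline + GridSearchCV workflow the post describes is below. To keep it self-contained, scikit-learn's built-in 8x8 digits data stands in for the 28x28 MNIST files; the parameter grid is illustrative, not the post's.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Built-in 8x8 digit images as a stand-in for the MNIST data-set
digits = load_digits()
X_tr, X_te, y_tr, y_te = train_test_split(
    digits.data, digits.target, test_size=0.3, random_state=0)

# Pipeline: scale the pixel values, then fit the SVM
pipe = Pipeline([("scale", StandardScaler()), ("svm", SVC())])

# Grid-search over SVM hyperparameters with 3-fold cross-validation
grid = GridSearchCV(pipe,
                    {"svm__C": [1, 10], "svm__gamma": ["scale", 0.01]},
                    cv=3)
grid.fit(X_tr, y_tr)
print(grid.best_params_, grid.score(X_te, y_te))
```

Chaining the scaler inside the pipeline matters: it guarantees the scaling statistics are re-fit on each cross-validation fold, so the grid search never peeks at validation data.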

Your Guide to Natural Language Processing (NLP)

Everything we express (either verbally or in writing) carries huge amounts of information. The topic we choose, our tone, our selection of words: everything adds some type of information that can be interpreted and from which value can be extracted. In theory, we can understand and even predict human behaviour using that information. But there is a problem: one person may generate hundreds or thousands of words in a declaration, each sentence with its corresponding complexity. If you want to scale up and analyze several hundred, thousand or million people or declarations in a given geography, the situation becomes unmanageable.

Understanding the Magic of Neural Networks

Everything ‘neural’ is (again) the latest craze in machine learning and artificial intelligence. Now, what is the magic here? Let us dive directly into a (supposedly a little silly) example: we have three protagonists in the fairy tale Little Red Riding Hood: the wolf, the grandmother and the woodcutter. They all have certain qualities, and Little Red Riding Hood reacts in certain ways towards them. For example, the grandmother has big eyes, is kindly and wrinkled; Little Red Riding Hood will approach her, kiss her on the cheek and offer her food (the behaviour ‘flirt with’ towards the woodcutter is a little sexist, but we kept it to reproduce the original example from Jones, W. & Hoskins, J.: Back-Propagation, Byte, 1987). We will build and train a neural network which gets the qualities as inputs and Little Red Riding Hood’s behaviour as output, i.e. we train it to learn the adequate behaviour for each quality.
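The setup can be sketched in a few lines with scikit-learn's `MLPClassifier`. The encoding below is my own simplification (the original Jones & Hoskins example uses multiple simultaneous behaviour outputs and hand-rolled back-propagation): each protagonist is a binary vector of qualities, and the network learns one behaviour label per quality pattern.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Qualities: [big ears, big eyes, big teeth, kindly, wrinkled, handsome]
X = np.array([
    [1, 1, 1, 0, 0, 0],   # the wolf
    [0, 1, 0, 1, 1, 0],   # the grandmother
    [0, 0, 0, 1, 0, 1],   # the woodcutter
])
y = ["run away", "kiss on cheek", "offer food"]

# A tiny multilayer perceptron: 6 inputs -> 4 hidden units -> 3 behaviours
net = MLPClassifier(hidden_layer_sizes=(4,), solver="lbfgs",
                    max_iter=2000, random_state=0)
net.fit(X, y)
print(net.predict([[1, 1, 1, 0, 0, 0]]))  # query with the wolf's qualities
```

With only three training examples the network simply memorizes the mapping; the point of the article's example is to watch the hidden layer form internal representations of the qualities along the way.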

Desnapify

Desnapify is a deep convolutional generative adversarial network (DCGAN) trained to remove Snapchat filters from selfie images. It is based on the excellent pix2pix project by Isola et al., and specifically the Keras implementation by Thibault de Boissiere.