Data Science with Python explained

An overview of using Python for data science, including NumPy, SciPy, pandas, scikit-learn, XGBoost, TensorFlow, and Keras.

The Deployment Pain

In October 2017, I was running the KNIME booth at the ODSC London conference. At the booth, we had the usual conference material to distribute: informative papers, various gadgets, and even Swiss chocolates. Amongst the gadgets we had magnets, more specifically four types of magnets representing abstract nodes in an abstract workflow for data analytics: Read, Transform, Analyze, and Deploy. These magnets were quite popular. Conference attendees would come to play with them, to assemble them in a workflow, or to take a few home to their fridges.

DeViSE Zero-shot learning

‘DeViSE: A Deep Visual-Semantic Embedding Model’ by Frome et al. (2013) is a truly beautiful paper. The authors present a novel image classification method that leverages semantic knowledge learned by language models. The method can make zero-shot predictions for tens of thousands of image labels not observed during training. I’ll explain the method in detail, and I have deployed a DeViSE model on AWS using Docker so that you can experiment with it.
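To make the zero-shot idea concrete, here is a minimal sketch of the prediction step only, assuming an image has already been projected into the same vector space as the label word embeddings. The three-dimensional vectors, the label set, and the helper names `cosine` and `predict_label` are all hypothetical and chosen for illustration; they are not taken from the paper or from the deployed model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 3-d word embeddings; a real language model would use
# hundreds of dimensions and many thousands of labels.
label_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.3, 0.1],
    "zebra": [0.1, 0.9, 0.2],  # assumed to have no training images
}

def predict_label(image_embedding, embeddings=label_embeddings):
    """Return the label whose word vector is most similar to the image vector."""
    return max(embeddings, key=lambda name: cosine(image_embedding, embeddings[name]))

# Hypothetical output of the visual model after projection into the word space.
projected_image = [0.2, 0.8, 0.3]
prediction = predict_label(projected_image)
print(prediction)
```

Because the lookup only needs a word vector for each candidate label, a label such as "zebra" can be predicted even if no zebra image was seen during training, provided the projection places the image near that word vector.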

D3 for Data Scientists, Part I: A re-usable template for combining R and D3 to build interactive visualizations

The path to D3 mastery is dark and full of terrors. D3 itself is a JavaScript (JS) library and on top of that, you’ll need a basic understanding of HTML (Hypertext Markup Language) and CSS (Cascading Style Sheets) to get the most out of it. If you’re a data scientist, chances are JavaScript, at best, ranks as your fourth language, after R, Python, and SQL. In this three-part series of blog posts, I will show you step-by-step how you can combine R with D3, HTML, and CSS to create a fully interactive data visualization from scratch.

Intro to Grafana: Installation, Configuration, and Building the First Dashboard

Grafana is an open-source, nightly-built dashboarding, analytics, and monitoring platform that is tailored for connecting to a variety of sources such as Elasticsearch, InfluxDB, Graphite, Prometheus, AWS CloudWatch, and many others. One of Grafana’s biggest highlights is the ability to bring several data sources together in one dashboard by adding rows that host individual panels (each with its own visual type).

How to be an Artificial Intelligence (AI) Expert?

Artificial Intelligence has grown at a rapid pace over the last decade. You have seen it all unfold before your eyes. From self-driving cars to Google Brain, artificial intelligence has been at the centre of these amazing, high-impact projects. Artificial Intelligence (AI) made headlines recently when people started reporting that Alexa was laughing unexpectedly. Those news reports led to the usual jokes about computers taking over the world, but there’s nothing funny about considering AI as a career field. Just the fact that five out of six Americans use AI services in one form or another every day proves that this is a viable career option. Why AI? Well, there can be many reasons for students selecting this as their career track or for professionals changing their career track towards AI. Let us have a look at some of those reasons.

How to build an Autonomous Sailboat Using Machine Learning

In this blog post we will take apart this challenge and focus on the first sub-task: using machine learning to find the optimal course to steer in a sailing race. You will learn how to win a sailing race and the basic machine learning concepts needed to accomplish this.

The Best Public Datasets for Machine Learning

What are the best datasets for machine learning? After spending hours upon hours reviewing datasets, we have created a cheat sheet of high-quality, diverse machine learning datasets.

Something You Don’t Know About Data Files If You Are New to Data Science: Importing Data Files from the Web, Part 1

To master data science, you have to understand how to manage your data and import it from the web, because approximately 90% of real-world data comes straight from the internet.
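As a rough illustration of what importing a data file from the web can look like, the sketch below downloads a CSV resource and parses it using only the Python standard library. The helper name `read_csv_from_url` and the sample data are assumptions made for this example; a `data:` URL stands in for a real web address so that the code runs without network access.

```python
import csv
import io
import urllib.request

def read_csv_from_url(url, encoding="utf-8"):
    """Download a CSV resource and return its rows as a list of dicts."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))

# A data: URL stands in for a real address so the sketch runs offline.
# In practice you would pass an http(s) URL that points at a CSV file.
sample_url = "data:text/csv,name,value%0Aa,1%0Ab,2"
rows = read_csv_from_url(sample_url)
print(rows)
```

Every value comes back as a string, so numeric columns still need to be converted before analysis.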

Machine Learning Operations

This repository contains a curated list of awesome open source libraries that will help you deploy, monitor, version and scale your machine learning.

Machine Learning for Transactional Analytics: Acquisition Cost vs. Lifetime Value

Transactional Analytics is a mechanism that connects application performance, business outcomes, and users in real time. Once the real-time data is collected and correlated, it gives insight into the customer experience and business outcomes. Transactional Analytics can be used to answer questions about the performance of the business and its KPIs in real time. Correlating business data with performance data supports business growth, and automated data gathering shortens time to value. Moreover, application performance can be optimized if one hundred percent of business transactions are automatically collected and correlated. The details of every business transaction in the application need to be captured and its performance analyzed. The relationships among the data about a particular application should be auto-correlated to optimize that application’s performance.

Data science productionization: scale

You can wait until you are surprised by the unexpected, or you can build systems to limit the extent to which the expected can hurt you.

Why Norms Matter – Machine Learning

In this article, I will review the most common norms used in these situations, namely the L¹ and L² norms. I will describe their similarities and differences, as well as when to use which norm. In addition, I’ll show how to visualize these norms and their use in optimization problems.
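For reference, the L¹ norm of a vector is the sum of the absolute values of its components, and the L² norm is the square root of the sum of their squares. The short sketch below computes both for a small example vector; the function names `l1_norm` and `l2_norm` are chosen here for illustration and do not come from the article.

```python
import math

def l1_norm(vector):
    """L1 norm: sum of the absolute values of the components."""
    return sum(abs(x) for x in vector)

def l2_norm(vector):
    """L2 norm: square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in vector))

v = [3.0, -4.0]
print(l1_norm(v))  # 7.0
print(l2_norm(v))  # 5.0
```

For this vector the L¹ norm (7.0) is larger than the L² norm (5.0); in general the L¹ norm is never smaller than the L² norm of the same vector.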

Review: Hypercolumn (Instance Segmentation)

In this story, Hypercolumn is reviewed. The term ‘Hypercolumn’ is borrowed from neuroscience, where it describes a set of V1 neurons sensitive to edges at multiple orientations and multiple frequencies arranged in a columnar structure. Borrowing this idea boosts prediction accuracy; the work was published at CVPR 2015 and has over 800 citations. When it was published at CVPR, the first author, Dr. Bharath Hariharan, was a PhD student at the University of California, Berkeley. When Hypercolumn was later extended for TPAMI in 2017, Dr. Hariharan had become a Postdoctoral Researcher at Facebook AI Research (FAIR). After that, another famous

The Actual Difference Between Statistics and Machine Learning

No, they are not the same. If machine learning is just glorified statistics, then architecture is just glorified sand-castle construction. I am, to be quite honest, tired of hearing this debate reiterated on social media and at my university on a near-daily basis. Usually, it is accompanied by somewhat vague statements meant to explain away the issue. Both sides are guilty of this. I hope that by the end of this article you will have a more informed position on these somewhat vague terms.