A Beginner’s Guide to Convolutional Neural Networks (CNNs)

Convolutional neural networks are deep artificial neural networks that are used primarily to classify images (e.g. name what they see), cluster them by similarity (photo search), and perform object recognition within scenes. They are algorithms that can identify faces, individuals, street signs, tumors, platypuses and many other aspects of visual data. Convolutional networks perform optical character recognition (OCR) to digitize text and make natural-language processing possible on analog and hand-written documents, where the images are symbols to be transcribed. CNNs can also be applied to sound when it is represented visually as a spectrogram. More recently, convolutional networks have been applied directly to text analytics as well as graph data with graph convolutional networks. The efficacy of convolutional nets (ConvNets or CNNs) in image recognition is one of the main reasons why the world has woken up to the efficacy of deep learning. They are powering major advances in computer vision (CV), which has obvious applications for self-driving cars, robotics, drones, security, medical diagnoses, and treatments for the visually impaired.


Intro to Forecasting

Forecasting is another technique that uses structured data (often obtained by using techniques from Natural Language Processing and Object Recognition) to inform decision-making. Forecasting techniques predict future outcomes or states. Why would we want to forecast? Say you’re buying a house. It might be useful to predict what the value of your investment will be in a year. What if you renovated the kitchen? Forecasting techniques can help you determine how much value a kitchen remodel might add. For a business, budgeting is incredibly important. If you can predict demand, customer churn, preventative maintenance costs, and yield, to name a few, you can efficiently deploy resources across your business. If you have an online store and can accurately predict what your customer might buy next, you can surface that item in their search results or in advertisements to increase the probability of a sale.


Want Europe to have the best AI? Reform the GDPR

Artificial intelligence is rapidly transforming the global economy and society. From accelerating the development of pharmaceuticals to automating factories and farms, many countries are already seeing the benefits of AI. Unfortunately, it is becoming increasingly clear that the European Union’s data-processing regulations will limit the potential of AI in Europe. Data provides the building blocks for AI, and with serious restrictions on how they use it, European businesses will not be able to use the technology to its full potential.


Data cubes in Python

Data cubes are a popular way to display multidimensional data, and the method has become increasingly common. In this article you will learn to use Python for data cubes.
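To make the idea concrete, here is a minimal sketch of a data cube built with only the Python standard library. It is not taken from the article: the `sales` rows, the dimension names, and the helper functions `build_cube`, `roll_up`, and `slice_cube` are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quarter, units sold).
sales = [
    ("north", "widget", "Q1", 10),
    ("north", "gadget", "Q1", 4),
    ("south", "widget", "Q1", 7),
    ("north", "widget", "Q2", 12),
    ("south", "gadget", "Q2", 5),
]

# Assumed dimension order; each cube cell is keyed by one value per dimension.
DIMENSIONS = ("region", "product", "quarter")


def build_cube(rows):
    """Aggregate the measure into cells keyed by (region, product, quarter)."""
    cube = defaultdict(int)
    for region, product, quarter, units in rows:
        cube[(region, product, quarter)] += units
    return dict(cube)


def roll_up(cube, keep):
    """Sum the cube over every dimension that is not named in `keep`."""
    idx = [DIMENSIONS.index(name) for name in keep]
    out = defaultdict(int)
    for key, units in cube.items():
        out[tuple(key[i] for i in idx)] += units
    return dict(out)


def slice_cube(cube, dimension, value):
    """Keep only the cells where `dimension` equals `value`."""
    i = DIMENSIONS.index(dimension)
    return {key: units for key, units in cube.items() if key[i] == value}


cube = build_cube(sales)
by_region = roll_up(cube, ["region"])        # totals per region
q1_only = slice_cube(cube, "quarter", "Q1")  # cells for the first quarter
```

Rolling up and slicing are the two operations most often associated with data cubes; a real project would more likely lean on a dataframe or array library than on plain dictionaries.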


Boost your Image Classification Model

Image classification is assumed to be a nearly solved problem. The fun part is when you have to use all your cunning to gain that extra 1% accuracy. I came across such a situation when I participated in the Intel Scene Classification Challenge hosted by Analytics Vidhya. I thoroughly enjoyed the contest as I tried to squeeze every last drop of performance out of my deep learning model. The techniques below can in general be applied to any image classification problem at hand.


Creating data frame using structure() function in R

The structure() function is a simple yet powerful function that describes a given object with given attributes. It is part of the base R library, so there is no need to load any additional library. Because the function was part of the S language, it has been in the base library since the early versions, which makes it both backward and forward compatible.


Demystifying Regular Expressions in R

In this post, we will learn about using regular expressions in R. While it is aimed at absolute beginners, we hope experienced users will find it useful as well. The post is broadly divided into three sections. In the first section, we will introduce the pattern-matching functions in base R, such as grep and grepl, as we will be using them in the rest of the post. Once the reader is comfortable with these functions, we will learn about regular expressions while exploring the names of R packages by probing the following:
• how many package names include the letter r?
• how many package names begin or end with the letter r?
• how many package names include the words data or plot?
In the final section, we will go through four case studies, including simple email validation. If you plan to try the case studies, please do not skip any of the topics in the second section.
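The three questions above each map onto a short regular expression. The post itself works in R with grep and grepl; the sketch below is only a rough analogue using Python's `re` module. The `packages` list is a small hypothetical sample, not the real list of R packages, and `count_matches` is an illustrative helper.

```python
import re

# Hypothetical sample of package names; the post explores the real list of R packages.
packages = [
    "dplyr", "ggplot2", "data.table", "rmarkdown",
    "stringr", "plotly", "forecast", "MASS",
]


def count_matches(pattern, names):
    """Count the names in which `pattern` matches anywhere in the string."""
    return sum(1 for name in names if re.search(pattern, name))


# How many names include the letter r? (case-sensitive)
includes_r = count_matches(r"r", packages)

# How many names begin or end with the letter r?
# ^ anchors the match to the start of the name, $ to the end.
begins_or_ends_r = count_matches(r"^r|r$", packages)

# How many names include the words data or plot?
# | separates the two alternatives.
data_or_plot = count_matches(r"data|plot", packages)
```

The anchors `^` and `$` and the alternation operator `|` behave the same way in R's regular expressions, so the patterns should carry over, although the surrounding function calls differ.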


Instagram Data Analysis Using Panoply and Mode

This project is built on top of the data challenge that Panoply released in April 2019. Panoply is a cloud data warehouse that lets you easily gather data from different sources (e.g. AWS S3, Google Analytics, etc.) into one place and then connect to different Business Intelligence tools (e.g. Chartio, Mode, etc.) for analytics and insights. Panoply has recently integrated its data warehouse with the Instagram API to collect data. This challenge is about using Panoply as an ETL tool to explore Instagram data for marketing use (e.g. promotion, segmentation, etc.).


Epileptic Seizure Classification ML Algorithms

Binary Classification Machine Learning Algorithms in Python: Epilepsy is a disorder of the central nervous system (CNS), affecting about 1.2% of the population (3.4 million people) in the US, and more than 65 million people globally. Additionally, about 1 in 26 people will develop epilepsy at some point during their lifetime. There are many kinds of seizures, each with different symptoms, such as losing consciousness, jerking movements, or confusion. Some seizures are much harder to detect visually; the patients will usually exhibit symptoms such as not responding or staring blankly for a brief period. Seizures can happen unexpectedly and can result in injuries from falling or biting the tongue, or in loss of control of one’s urine or stool. These are some of the reasons why seizure detection is of utmost importance for patients under medical supervision who are suspected to be seizure prone. This project will use binary classification methods to predict whether an individual is having a seizure or not.


Viewing text through the eyes of a machine

How to make black-box language models more transparent: We have been able to lift the lid on Convolutional Neural Networks (CNNs) in computer vision tasks for a number of years now. This has brought with it significant improvements to the field through:
• Increased robustness of models;
• Visibility of, and reduction in, model bias; and
• A better understanding of how adversarial images can alter the outputs of deep learning models.
With such clear benefits attributed to better model understanding, why do we not seem to have the same level of focus on model interpretability in the field of Natural Language Processing (NLP)?


Four Mistakes You Make When Labeling Data

Data labeling needs to be done fast, at scale, and with high accuracy, without any one of those compromising the others. The first step in creating a quality annotation pipeline is anticipating common problems and accommodating them. This post shows four of the most common errors that come up in text annotation projects and how text annotation tools like LightTag can help solve them.


Autonomous Agents And Multi-Agent Systems 101: Agents And Deception

This article provides a brief introduction to the area of autonomous agents and multi-agent systems. It also presents a perspective on the deception mechanisms used by agents.


Getting to Know Natural Language Understanding

We like to imagine talking to computers the way Picard spoke to Data in Next Generation, but in reality, natural language processing is more than just teaching a computer to understand words. The subtext of how and why we use the words we do is notoriously difficult for computers to comprehend. Instead of Data, we get frustrations with our assistants and endless SNL jokes.


The Fundamentals of Reinforcement Learning

If men were made in the image of God, robots were certainly made in the image of men. Our insights into how we think, how we learn, and even how the networks of neurons in our brains communicate with each other have led to the development of artificial intelligence, machine learning, and deep learning, the cornerstones of data science. Today, robots can do more than ever before. There is now a suitcase robot that will follow your phone’s geolocation, Travelmate. Moley Robotics has created a robot chef. The Grillbot and the BratWurst Bot were designed for your BBQ parties. Kobi can take care of your yard, and WinBot will wash your windows for you.


An introduction to Convolutional Neural Networks

Describing what Convolutional Neural Networks are, how they function, how they can be used and why they are so powerful.


Why Swift May Be the Next Big Thing in Deep Learning

If you are into programming, when you hear Swift, you will probably think about app development for iOS or macOS. If you’re into deep learning, then you must have heard about Swift for TensorFlow (abbreviated as S4TF). Then, you can ask yourself: ‘Why would Google create a version of TensorFlow for Swift? There are already versions for Python and C++; why add another language?’ In this post, I will try to answer this question and outline the reasons why you should carefully follow S4TF as well as the Swift language itself. The goal of this post is not to give very detailed explanations but to provide a general overview with plenty of links so that you can go and dig deeper if you are interested.


Deploy ML models at scale

Let’s assume that you have built an ML model and that you are happy with its performance. The next step is to deploy the model into production. In this blog series I will cover how you can deploy your model for large-scale consumption within a scalable infrastructure on AWS using a Docker container service. In this blog I will start with the first step: building an API framework for the ML model and running it on your local machine.


Machine Learning has never been this easy: Feature Engineering Concepts in 6 questions

This article is written for people who are keen to quickly master the machine learning concepts and skills required for machine learning jobs by going through a set of popular and useful questions. Any comments and suggestions are welcome.


Technical Debt in Machine Learning – Part 1

This post is a collection of excerpts from the paper Hidden Technical Debt in Machine Learning Systems.


Cross Validation: A Beginner’s Guide

In beginning your journey into the world of machine learning and data science, there is often a temptation to jump into algorithms and model creation without gaining an understanding of how to test the effectiveness of a generated model on real-world data. Cross validation is a form of model validation which attempts to improve on the basic methods of hold-out validation by leveraging subsets of our data and an understanding of the bias/variance trade-off in order to gain a better understanding of how our models will actually perform when applied outside of the data they were trained on. Don’t worry, it’ll all be explained! This article seeks to be a beginning-to-execution guide for three methods of model validation (hold-out, k-fold, and LOOCV) and the concepts behind them, with links and references to guide you to further reading. We make use of scikit-learn, pandas, NumPy and other Python libraries in the given examples.
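The three validation schemes differ only in how they divide sample indices into training and test sets. The sketch below shows that division using only the Python standard library. It is an illustration under assumed names (`hold_out_split`, `k_fold_splits`, `loocv_splits`) and is not the article's code, which relies on scikit-learn.

```python
import random


def hold_out_split(n_samples, test_fraction=0.25, seed=0):
    """Shuffle the indices once and reserve a fraction of them for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_splits(n_samples, k):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    indices = list(range(n_samples))
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def loocv_splits(n_samples):
    """Leave-one-out is k-fold with k equal to the number of samples."""
    return k_fold_splits(n_samples, n_samples)


train_idx, test_idx = hold_out_split(10)
folds = list(k_fold_splits(10, 3))
loo = list(loocv_splits(4))
```

Hold-out evaluates the model on a single split, k-fold averages the score over k splits, and LOOCV takes k to its extreme at the cost of fitting one model per sample. For brevity this k-fold sketch does not shuffle, which is usually advisable when the data has a meaningful order.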