Let’s Talk About AI Ethics; We’re On A Deadline

The first industrial revolution, powered by steam, launched mass production. The second added electricity to everything. The third added computing power. This new revolution, powered by artificial intelligence (AI), is adding cognitive capabilities to everything, and it is a game changer. Code that learns is both powerful and dangerous. It threatens the basic rules of markets and civic life. AI requires a new technical and civic infrastructure, a new way to conduct business, a new way to be together in community. AI and enabling technologies like robotics and autonomous vehicles will change lives and livelihoods. Great benefits and unprecedented wealth will be created, but with them will come waves of disruption. Compared to prior revolutions, this one is occurring at exponential speed, and while its impacts are ubiquitous, control is concentrated. AI is a centralizing force: it plows through monster data sets in seconds, aggregating benefits and wealth at unprecedented speed.

Benefits and Risks of Artificial Intelligence

From Siri to self-driving cars, artificial intelligence (AI) is progressing rapidly. While science fiction often portrays AI as robots with human-like characteristics, AI can encompass anything from Google’s search algorithms to IBM’s Watson to autonomous weapons. Artificial intelligence today is properly known as narrow AI (or weak AI), in that it is designed to perform a narrow task (e.g. only facial recognition, only internet searches, or only driving a car). However, the long-term goal of many researchers is to create general AI (AGI, or strong AI). While narrow AI may outperform humans at whatever its specific task is, like playing chess or solving equations, AGI would outperform humans at nearly every cognitive task.

A GAMEBOY supercomputer

It is 2016. Deep learning is everywhere. Image recognition can be considered more or less solved by convolutional neural networks, and my research interests gravitate towards neural networks with memory and reinforcement learning. Specifically, in a paper published by Google DeepMind, it was shown that it is possible to achieve human or even superhuman-level performance on a variety of Atari 2600 (a home game console released in 1977) games using a simple reinforcement learning algorithm called Deep Q-Network (DQN), all of it done just by observing the gameplay. That caught my attention.
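The Bellman update at the heart of DQN can be illustrated in tabular form. Here is a minimal sketch in plain NumPy, where DQN would replace the table with a convolutional network over raw frames; the state, action, and reward values are made up for illustration:

```python
import numpy as np

# Tabular Q-learning core: DQN approximates this table with a network,
# but the Bellman update rule is the same.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.5, 0.9  # learning rate, discount factor

def q_update(s, a, r, s_next):
    # Bellman target: reward plus discounted value of the best next action
    target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])

# One illustrative transition: state 0, action 1, reward 1.0, next state 2
q_update(0, 1, 1.0, 2)
print(Q[0, 1])  # 0.5 after a single update from zero initialization
```

In the full DQN setup, this update is applied to minibatches sampled from a replay buffer, with a separate target network stabilizing the bootstrap term.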

Using XGBoost for time series prediction tasks

Recently, Kaggle master Kazanova, along with some of his friends, released a ‘How to Win a Data Science Competition’ Coursera course. The course included a final project, which was itself a time series prediction problem. Here I will describe how I got a top-10 position as of writing this article.
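A common way to feed a time series to a tree-based model like XGBoost is to recast it as a supervised table of lag features. A minimal sketch with pandas follows; the toy series and lag count are illustrative, not the actual competition data or the author's feature set:

```python
import pandas as pd

# Tree models have no built-in notion of time, so the usual trick is to
# turn the series into a table where each row's features are past values.
y = pd.Series([10, 12, 13, 15, 14, 16, 18], name="sales")  # toy series

def make_lag_features(series, n_lags=3):
    df = pd.DataFrame({"y": series})
    for k in range(1, n_lags + 1):
        df[f"lag_{k}"] = series.shift(k)  # value from k steps earlier
    return df.dropna()  # first n_lags rows have undefined lags

supervised = make_lag_features(y)
X, target = supervised.drop(columns="y"), supervised["y"]
# X and target can now be fed to any regressor, e.g. xgboost.XGBRegressor().
print(X.shape)  # (4, 3): 7 points minus the 3 rows consumed by the lags
```

Rolling statistics, calendar features, and target encodings are built the same way: as ordinary columns alongside the lags.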

Overview of Encoding Methodologies

In this tutorial, you will get a glimpse of encoding techniques along with some advanced references that will help you tackle categorical variables.
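Two of the most common baselines such a tutorial covers are ordinal (label) encoding and one-hot encoding. A minimal sketch with pandas, using an illustrative column:

```python
import pandas as pd

# A toy categorical column (names and values are illustrative).
df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# Ordinal encoding: one integer per category (imposes an artificial order).
df["color_code"] = df["color"].astype("category").cat.codes

# One-hot encoding: one binary column per category (no implied order).
one_hot = pd.get_dummies(df["color"], prefix="color")

print(one_hot.shape)  # (4, 3): one column per distinct color
```

Which to use depends on the model: tree ensembles often tolerate ordinal codes, while linear models usually need one-hot columns.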

Six Steps to Master Machine Learning with Data Preparation

By following six critical steps, teams can prepare data for both analytics and machine learning initiatives, accelerating data science projects and automating the data-to-insight pipeline.
Step 1: Data collection
Step 2: Data exploration and profiling
Step 3: Formatting data to make it consistent
Step 4: Improving data quality
Step 5: Feature engineering
Step 6: Splitting data into training and evaluation sets
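Step 6, for example, can be sketched as a simple shuffled hold-out split. This is a minimal NumPy version with illustrative toy arrays and an assumed 20% evaluation fraction:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_eval_split(X, y, eval_frac=0.2):
    idx = rng.permutation(len(X))          # shuffle to avoid ordering bias
    n_eval = int(len(X) * eval_frac)
    eval_idx, train_idx = idx[:n_eval], idx[n_eval:]
    return X[train_idx], y[train_idx], X[eval_idx], y[eval_idx]

X = np.arange(20).reshape(10, 2)  # 10 toy samples, 2 features
y = np.arange(10)
X_tr, y_tr, X_ev, y_ev = train_eval_split(X, y)
print(len(X_tr), len(X_ev))  # 8 2
```

In practice, time series data would be split chronologically rather than shuffled, and a stratified split is preferable for imbalanced classes.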

Machine Learning Explainability vs Interpretability: Two concepts that could help restore trust in AI

R 3.5.2 now available

R 3.5.2, the latest version of the R language for statistical computation and graphics from the R Foundation, was released today. (This release is codenamed ‘Eggshell Igloo’, likely in reference to this or this Peanuts cartoon.) Compared to R 3.5.1, this update includes only bug fixes, so R scripts and packages compatible with R 3.5.0 or R 3.5.1 should work without modification.

Introducing TAPAS

Forecasting the performance of a deep neural network is a nightmare for every data scientist. Every month, dozens of new deep learning research algorithms are published, making incredible claims about their performance. However, applying those algorithms to real-world problems requires a leap of faith that the model can achieve similar levels of performance on unseen datasets. Not surprisingly, many of the research algorithms that performed incredibly well on specific datasets fail miserably when applied to different domains, a clear manifestation of the famous ‘No Free Lunch’ theorem.

Very recently, researchers from IBM’s artificial intelligence (AI) lab in Zurich published a new paper proposing a method that uses neural networks to predict the performance of a new model prior to training. At first, the idea of estimating the performance of a deep learning model before training sounds ludicrous. After all, training is the main way in which deep neural networks build their knowledge and structure. However, IBM approached this problem from an unexpected angle by trying to recreate human intuition. If you show a deep learning expert a neural network and a sample dataset and ask whether the specific network will perform well, the expert will have an intuition about it. For instance, they will immediately recognize whether the neural network is a good fit for the target problem or whether the different layers are structured in a way that optimizes accuracy. IBM set out to model a similar intuitive criterion in the form of a neural network.

Progressive Learning and Network Growing in TensorFlow

In many real-world applications, new training data becomes available after a network has already been trained. Especially with big neural networks, it would be very tedious to retrain the complete model every time new information becomes available. It would be much easier to simply add new nodes to the network for each new class or other piece of information introduced, and keep all the previously trained weights. In this post I will give a brief overview of how to do this in TensorFlow and how the network performance is affected by it. I will demonstrate the procedure and results on the CIFAR-10 data set, but the Git repository linked at the end of this post also provides code and results for MNIST and CIFAR-100 and can easily be adapted to any other data set of this kind. The network I use has a very simple structure: two convolutional layers followed by one fully connected layer with 512 neurons and one read-out layer with as many neurons as there are classes in the data set.
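The growing step can be sketched in plain NumPy: keep the trained read-out weights and append freshly initialized columns for the new classes. Shapes and initialization here are illustrative; in TensorFlow this would correspond to building a wider output layer and assigning the old weights into it, which is not shown:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are the trained read-out weights: 512 hidden units
# feeding 10 output classes (stand-in values for illustration).
hidden, old_classes, new_classes = 512, 10, 2
W_old = rng.normal(0, 0.1, size=(hidden, old_classes))
b_old = np.zeros(old_classes)

# Grow the read-out layer: append fresh columns for the new classes
# while leaving every previously trained weight untouched.
W_new_cols = rng.normal(0, 0.1, size=(hidden, new_classes))
W = np.concatenate([W_old, W_new_cols], axis=1)
b = np.concatenate([b_old, np.zeros(new_classes)])

print(W.shape)  # (512, 12)
```

The preserved columns mean the network's predictions on the old classes start out unchanged; only the new output units need to be learned from scratch.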

Understanding AI and ML for Mobile app development

Last time I published this blog where I explained one application of AI and ML, ‘Vision’, and also briefly explained using ML Kit in mobile development, a platform offered by Google to integrate ML features into Android and iOS apps. This article is a prequel to that one, in which I explain the very basics of artificial intelligence and machine learning. I will shed some light on the different kinds of machine learning and what happens underneath when the Google Photos app detects our faces or when Gmail suggests full sentences as replies.

The Mathematics Behind Principal Component Analysis

The central idea of principal component analysis (PCA) is to reduce the dimensionality of a data set consisting of a large number of interrelated variables while retaining as much as possible of the variation present in the data set. This is achieved by transforming to a new set of variables, the principal components (PCs), which are uncorrelated, and which are ordered so that the first few retain most of the variation present in all of the original variables.
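The transformation described above can be sketched as a covariance eigendecomposition: center the data, diagonalize the covariance matrix, and project onto the eigenvectors sorted by eigenvalue. The toy data below is illustrative:

```python
import numpy as np

# Toy two-variable data set (illustrative values).
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])

Xc = X - X.mean(axis=0)                 # center each variable
cov = np.cov(Xc, rowvar=False)          # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
order = eigvals.argsort()[::-1]         # sort descending by variance
components = eigvecs[:, order]          # the PCs, uncorrelated by construction
scores = Xc @ components                # data expressed in the PC basis

# Each PC's share of the total variance, largest first.
explained = eigvals[order] / eigvals.sum()
print(explained[0] > explained[1])  # True
```

Keeping only the first few columns of `scores` gives the reduced-dimension representation while retaining the bulk of the variance, which is exactly the trade-off the article describes.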

Planet Beehive

Exploring our planet’s tourist activities.

Feature Engineering, Explained

A brief introduction to feature engineering, covering coordinate transformation, continuous data, categorical features, missing values, normalization, and more.
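Two of the listed techniques, handling missing values and normalization, can be sketched with pandas; the column name and values below are illustrative:

```python
import numpy as np
import pandas as pd

# Toy frame with one missing continuous value.
df = pd.DataFrame({"age": [20.0, 30.0, np.nan, 40.0]})

# Missing values: impute with the median (here, 30) so outliers
# do not drag the fill value around.
df["age"] = df["age"].fillna(df["age"].median())

# Normalization: min-max scale the feature into [0, 1].
df["age_norm"] = (df["age"] - df["age"].min()) / (df["age"].max() - df["age"].min())

print(df["age_norm"].tolist())  # [0.0, 0.5, 0.5, 1.0]
```

The other techniques in the list, coordinate transformations and categorical encodings among them, follow the same pattern of deriving new columns from existing ones.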