Microsoft Azure Machine Learning for Data Scientists

Well, as we all know, data science is a vast field covering multiple disciplines, using various scientific methods and algorithms to extract insights from data, be it structured or unstructured. That makes it a very difficult field to learn and master, and it requires a lot of hands-on practice.


Radio Wave Classifier in Python

In my last blog I summarized a research paper that investigated the use of residual neural networks for radio signal classification. In this blog I will get you started with Google Cloud Platform and show you how to build a ResNet signal classifier in Python with Keras. Here is a link to my GitHub with the ResNet code: GitHub.
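To give a flavor of what the Keras side of this looks like, here is a minimal sketch of a 1D residual unit applied to I/Q radio samples. The layer sizes, input length (1024 samples, 2 channels), and 24-class output are illustrative assumptions, not the exact architecture from the article or the linked repo.

```python
# A minimal sketch of a ResNet-style 1D classifier for radio signals,
# assuming Keras with the TensorFlow backend. Sizes are illustrative.
from tensorflow.keras import layers, models

def residual_unit(x, filters=32, kernel_size=5):
    """Two 1D convolutions with a skip connection around them."""
    shortcut = x
    y = layers.Conv1D(filters, kernel_size, padding="same", activation="relu")(x)
    y = layers.Conv1D(filters, kernel_size, padding="same")(y)
    y = layers.Add()([shortcut, y])      # the residual (skip) connection
    return layers.Activation("relu")(y)

# Hypothetical input: 1024 complex samples stored as two channels (I and Q),
# classified into one of 24 modulation classes.
inputs = layers.Input(shape=(1024, 2))
x = layers.Conv1D(32, 5, padding="same", activation="relu")(inputs)
x = residual_unit(x)
x = layers.MaxPooling1D(2)(x)
x = residual_unit(x)
x = layers.GlobalAveragePooling1D()(x)
outputs = layers.Dense(24, activation="softmax")(x)

model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```

The skip connection is the defining ResNet idea: the block learns a correction to its input rather than a full transformation, which keeps deep stacks of convolutions trainable.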


Publish Data Science Articles to the Web using Jupyter, Github and Kyso

Data science is exploding; more and more organizations are using data to power, well, everything. But it can sometimes still be a little difficult to publish data-science-based reports. This might be because the charts are interactive, because you want a reproducible document, or because it can be difficult to move from data-science tools to more readable formats. In this article I’ll go through how, using Python in Jupyter notebooks together with Github and Kyso, we can go from getting some raw data to publishing an awesome chart to the web in a few minutes.
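As a taste of the notebook end of that workflow, here is a minimal sketch of an interactive chart built in a Jupyter cell with Plotly Express; the CSV file name and column names are hypothetical stand-ins for whatever raw data you start from.

```python
# Minimal interactive chart in a Jupyter notebook cell.
# "raw_data.csv" and its columns are hypothetical placeholders.
import pandas as pd
import plotly.express as px

df = pd.read_csv("raw_data.csv")           # load your raw data
fig = px.line(df, x="date", y="value",     # interactive line chart
              title="My awesome chart")
fig.show()                                 # renders inline in the notebook
```

From there, the blurb's pipeline is about distribution: commit the notebook to Github so Kyso can pick it up and render it as a readable web post.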


Let’s Stop Treating Algorithms Like They’re All Created Equal

A recent poll found that most Americans think algorithms are unfair. Unfortunately, the poll was itself biased and an example of the very phenomenon it decries. All around us, algorithms are invisibly at work. They’re recommending music and surfacing news, finding cancerous tumors, and making self-driving cars a reality. But do people trust them? Not really, according to a Pew Research Center survey taken last year. When asked whether computer programs will always reflect the biases of their designers, 58 percent of respondents thought they would. This finding illustrates a serious tension between computing technology, whose influence on people’s lives is only expected to grow, and the people affected by it.


Use Scikit-Learn Pipelines to clean data and train models faster

If you’re looking for a way to organize your data processing workflow and decrease code redundancy, Scikit-Learn Pipelines will make a great addition to your data science toolkit. After explaining what they are and why they’re used, I’ll show you how to put them to work automating data processing for worldwide box office revenue predictions.
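As a taste of what that looks like, here is a minimal sketch of a Pipeline chaining imputation, scaling, and a regressor into one estimator; the synthetic features are stand-ins for box office data and are not taken from the article.

```python
# Minimal Scikit-Learn Pipeline: imputation, scaling, and a model
# chained into a single estimator. The data below is synthetic.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((200, 3))                # stand-ins for budget, runtime, ...
y = 3 * X[:, 0] + rng.normal(size=200)  # stand-in revenue target
X[::10, 0] = np.nan                     # simulate missing values

pipe = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # fill missing values
    ("scale", StandardScaler()),                   # standardize features
    ("model", Ridge()),                            # fit the regressor
])

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
pipe.fit(X_train, y_train)              # each step runs in order
print("R^2 on held-out data:", pipe.score(X_test, y_test))
```

Because the whole chain is a single estimator, the same preprocessing is applied consistently at train and predict time, which is exactly the redundancy-killing benefit the article describes.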


AI Safety and Intellectual Debt

A friend shared the New Yorker article The Hidden Costs of Automated Thinking by Jonathan Zittrain. He refers first to the pharmaceutical business and how not all drugs are fully understood beyond the fact that they work. He draws a parallel from this to the discussion around automation and artificial intelligence, machine learning techniques in particular. He notes that ‘theory-free’ advances can be indispensable to the development of life-saving drugs, but that they come with a cost. He mentions that altering a few pixels in a photograph can fool an algorithm, and that such systems can have unknown gaps. In this article, I would like first to attempt to understand slightly better who Jonathan is, and second to reflect on this concept of intellectual debt. The subtitle of this piece is taken from one of the headings of Jonathan Zittrain’s post on Medium called Intellectual Debt: With Great Power Comes Great Ignorance. As a quick disclaimer, these texts are short reflections written as part of my project #500daysofAI and as such will not be comprehensive; it is a process of learning every day about the topic.


Not 1, not 2…but 5 ways to Correlate

The term ‘correlation’ refers to a mutual relationship or association between two things. In almost any business, or for personal reasons, it is useful to express something in terms of its relationship with other things. For example, sales might increase when marketing spends more on TV advertisement, or ice-cream sales might increase when the temperature rises. Often, correlation is the first step to understanding these relationships and subsequently building better business and statistical models. A short Python sketch of the first three techniques follows the list below.
1. Scatter Plot – the basic technique
2. Pearson Correlation Coefficient – having something measurable
3. Using a Correlation Matrix – going big … matrix style
4. Principal Component Analysis (PCA) – another interesting way to find correlations
5. Lasso Regression … see only what’s important
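Here is a minimal sketch of techniques 1–3 on synthetic data, assuming pandas, SciPy, and Matplotlib; the variables (temperature, ice-cream sales, TV spend) are illustrative stand-ins, not data from the article.

```python
# Minimal sketch of three ways to look at correlation, on synthetic data.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import pearsonr

rng = np.random.default_rng(42)
temperature = rng.normal(25, 5, 100)
ice_cream_sales = 10 * temperature + rng.normal(0, 20, 100)
tv_spend = rng.normal(100, 30, 100)
df = pd.DataFrame({"temperature": temperature,
                   "ice_cream_sales": ice_cream_sales,
                   "tv_spend": tv_spend})

# 1. Scatter plot: eyeball the relationship first.
df.plot.scatter(x="temperature", y="ice_cream_sales")
plt.show()

# 2. Pearson correlation coefficient: put a number on it.
r, p = pearsonr(df["temperature"], df["ice_cream_sales"])
print(f"Pearson r = {r:.2f} (p = {p:.3f})")

# 3. Correlation matrix: all pairwise correlations at once.
print(df.corr())
```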


Machine Learning for Unbalanced Datasets using Neural Networks

There are a few ways to address unbalanced datasets: from the built-in class_weight option in logistic regression and other sklearn estimators, to manual oversampling and SMOTE. We will look at whether neural networks can serve as a reliable out-of-the-box solution and what parameters can be tweaked to achieve better performance.
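As one illustration of the class_weight idea applied to a neural network, here is a minimal sketch of a small Keras classifier trained with class weights on an imbalanced synthetic dataset; the architecture and weighting scheme are assumptions for illustration, not the article's setup.

```python
# Minimal sketch: class weights on an imbalanced binary problem in Keras.
import numpy as np
from tensorflow.keras import layers, models

rng = np.random.default_rng(0)
# Synthetic unbalanced dataset: roughly 95% negatives, 5% positives.
X = rng.normal(size=(2000, 10))
y = (rng.random(2000) < 0.05).astype("float32")

model = models.Sequential([
    layers.Input(shape=(10,)),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Weight each class inversely to its frequency so the minority class
# contributes as much to the loss as the majority class.
n_pos = y.sum()
n_neg = len(y) - n_pos
class_weight = {0: len(y) / (2 * n_neg), 1: len(y) / (2 * n_pos)}

model.fit(X, y, epochs=5, batch_size=64, class_weight=class_weight)
```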


Making Fairness an Intrinsic Part of Machine Learning

The suitability of Machine Learning models is traditionally measured by their accuracy. A model that scores well on metrics like RMSE, MAPE, AUC, ROC, or Gini is considered high performing. While such accuracy metrics are important, are there other metrics that the data science community has been ignoring so far? The answer is yes – in the pursuit of accuracy, most models sacrifice ‘fairness’ and ‘interpretability.’ Rarely does a data scientist dissect a model to find out whether it follows all ethical norms. This is where machine learning fairness and interpretability of models come into the picture.
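As a concrete example of a fairness metric one might track alongside accuracy, here is a minimal sketch computing the demographic parity gap – the difference in positive prediction rates between two groups – on hypothetical predictions; the data, the group attribute, and the 0.1 flagging threshold are all illustrative assumptions.

```python
# Minimal sketch: demographic parity gap on hypothetical predictions.
import numpy as np

rng = np.random.default_rng(1)
group = rng.integers(0, 2, 1000)                  # hypothetical protected attribute
y_pred = (rng.random(1000) + 0.1 * group) > 0.5   # predictions skewed by group

rate_a = y_pred[group == 0].mean()   # positive prediction rate, group 0
rate_b = y_pred[group == 1].mean()   # positive prediction rate, group 1
parity_gap = abs(rate_a - rate_b)
print(f"positive-rate gap between groups: {parity_gap:.3f}")

# Illustrative rule of thumb: flag the model if the gap is large.
if parity_gap > 0.1:
    print("model may be unfair with respect to this attribute")
```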


Teaching AI to plan using language in a new open-source strategy game

When humans face a complex challenge, we create a plan composed of individual, related steps. Often, these plans are formed as natural language sentences. This approach enables us to achieve our goal and also adapt to new challenges, because we can leverage elements of previous plans to tackle new tasks, rather than starting from scratch each time. Facebook AI has developed a new method of teaching AI to plan effectively, using natural language to break down complex problems into high-level plans and lower-level actions. Our system innovates by using two AI models – one that gives instructions in natural language and one that interprets and executes them – and it takes advantage of the structure in natural language in order to address unfamiliar tasks and situations. We’ve tested our approach using a new real-time strategy game called MiniRTSv2, and found it outperforms AI systems that simply try to directly imitate human gameplay.


The Work of the Future: Shaping Technology and Institutions

Technological change has been reshaping human life and work for centuries. The mechanization that began with the Industrial Revolution enabled dramatic improvements in human health, well-being, and quality of life – not only in the developed countries of the West, but increasingly throughout the world. At the same time, economic and social disruptions often accompanied those changes, with painful and lasting results for workers, their families, and communities. Along the way, valuable skills, industries, and ways of life were lost. Ultimately new and unforeseen occupations, industries, and amenities took their place. But the benefits of these upheavals often took decades to arrive. And the eventual beneficiaries were not necessarily those who bore the initial costs.

The world now stands on the cusp of a technological revolution in artificial intelligence and robotics that may prove as transformative for economic growth and human potential as were electrification, mass production, and electronic telecommunications in their eras. New and emerging technologies will raise aggregate economic output and boost the wealth of nations. Will these developments enable people to attain higher living standards, better working conditions, greater economic security, and improved health and longevity?

The answers to these questions are not predetermined. They depend upon the institutions, investments, and policies that we deploy to harness the opportunities and confront the challenges posed by this new era. How can we move beyond unhelpful prognostications about the supposed end of work and toward insights that will enable policymakers, businesses, and people to better navigate the disruptions that are coming and underway? What lessons should we take from previous epochs of rapid technological change? How is it different this time? And how can we strengthen institutions, make investments, and forge policies to ensure that the labor market of the 21st century enables workers to contribute and succeed?


Artificial Intelligence Confronts a ‘Reproducibility’ Crisis

A few years ago, Joelle Pineau, a computer science professor at McGill, was helping her students design a new algorithm when they fell into a rut. Her lab studies reinforcement learning, a type of artificial intelligence that’s used, among other things, to help virtual characters (‘half cheetah’ and ‘ant’ are popular) teach themselves how to move about in virtual worlds. It’s a prerequisite to building autonomous robots and cars. Pineau’s students hoped to improve on another lab’s system. But first they had to rebuild it, and their design, for reasons unknown, was falling short of its promised results. Until, that is, the students tried some ‘creative manipulations’ that didn’t appear in the other lab’s paper.