Reinforcement Learning: Introduction to Monte Carlo Learning using the OpenAI Gym Toolkit

What’s the first thing that comes to your mind when you hear the words ‘reinforcement learning’? The most common thought is – too complex with way too much math. But I’m here to assure you that this is quite a fascinating field of study – and I aim to break down these techniques in my articles into easy-to-understand concepts. I’m sure you must have heard of OpenAI and DeepMind. These are two leading AI organizations that have made significant progress in this field. A team of OpenAI bots was able to defeat a team of amateur gamers in Dota 2, a phenomenally popular and complex battle arena game.

Things that Aren’t Working in Deep Learning

This may be the golden age of deep learning but a lot can be learned by looking at where deep neural nets aren’t working yet. This can be a guide to calming the hype. It can also be a roadmap to future opportunities once these barriers are behind us.

Lecture slides: Introduction to Adjoint Differentiation and Back-Propagation in Machine Learning and Finance

This is a work in progress, feedback is highly appreciated.

7 Lessons That Will Teach You All You Need To Know About Machine Learning

1. Have an Understanding of Python
2. Deepen your Understanding of Statistics
3. Learn as much Theory as you Possibly can
4. Dive into Target Practice
5. Skills which will help to lay a Foundation in Machine Learning
6. Go Through Python Packages
7. Deep Learning in Python

Ultimate Python Cheatsheet: Data Science Workflow with Python

At Business Science, we are developing a revolutionary system for teaching Business Analysis with Python (Business Analysis with Python is a new course we are developing at Business Science University). The system is revolutionary for a number of reasons (we’ll get to these in a minute). The cornerstone of our teaching process is the Data Science with Python Workflow, which is an adaptation of the Data Science with R workflow originally taught by Hadley Wickham and Garrett Grolemund in the excellent book, R For Data Science. The NEW Python Cheatsheet links the documentation, cheatsheets, and key resources available for the most widely used Python packages into one meta-cheatsheet that illustrates the workflow.

Time is Partial, or: why do distributed consistency models and weak memory models look so similar, anyway?

There’s only one hard problem in computer science: recognising that cache invalidation errors are misnamed. They’re just off-by-one errors in the time domain.

RIP wordclouds, long live CHATTERPLOTS

Replacing ‘the pie chart of text data’ with a tidy approach in R

Deploying a Python Web App on AWS

While I enjoy doing data science and programming projects for the personal thrill that comes with building something of my own, there is also a certain joy in sharing your project online with anyone in the world. Fortunately, thanks to Amazon Web Services (AWS), in a few minutes, we can deploy a Python web application to the entire world for free. In this article, we’ll see how to deploy a deep learning web app to AWS on a free EC2 instance. This article will work with the app built in Deploying a Keras Deep Learning Model as a Web Application in Python using the model developed in Recurrent Neural Networks by Example in Python. Neither of these is required, just know that our application generates novel patent abstracts with an RNN. All the code for the project can be found on GitHub.

Ensemble Learning Using Scikit-learn

Ensemble learning uses multiple machine learning models to try to make better predictions on a dataset. An ensemble model works by training different models on a dataset and having each model make predictions individually. The predictions of these models are then combined in the ensemble model to make a final prediction. Every model has its strengths and weaknesses. Combining individual models can help offset the weaknesses of any single one. In this tutorial, we will be using a Voting Classifier in which the ensemble model makes the prediction by majority vote. For example, if we use three models and they predict [1, 0, 1] for the target variable, the ensemble’s final prediction would be 1, since two out of the three models predicted 1.
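The majority-vote rule described above can be sketched in a few lines of plain Python. This is only an illustration of the voting step, not the tutorial's scikit-learn code; the helper name `majority_vote` is hypothetical, and the predictions are the [1, 0, 1] example from the text rather than output from trained models.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label that the largest number of models predicted.

    `predictions` holds one predicted label per model for a single sample.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    # most_common(1) returns a list with the single (label, count) pair
    # that has the highest count.
    label, _count = Counter(predictions).most_common(1)[0]
    return label


# Three models vote on one sample, as in the example above.
model_predictions = [1, 0, 1]
final_prediction = majority_vote(model_predictions)
print(final_prediction)
```

With an even number of models a tie is possible, so a real ensemble needs an explicit tie-breaking rule; this sketch simply takes whatever `Counter.most_common` returns first.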

Master Python through building real-world applications (Part 1)

The internet is a mess, and at times, without apt resources, learning a new programming language can be a tedious task. In that case, the majority of learners give up or pick something else to play with. So, let me assure you of one thing before we start: this is not just another ‘learn Python programming’ post you stumble upon while surfing the internet. Trust me, it’s not. What we are going to do in this series of 10 posts is use Python to build 10 real-world applications and, as we go along, learn other important and necessary tools to master our Python skills for Data Science.

Machine Learning | An Introduction

Machine Learning is undeniably one of the most influential and powerful technologies in today’s world. More importantly, we are far from seeing its full potential. There’s no doubt it will continue to make headlines for the foreseeable future. This article is designed as an introduction to Machine Learning concepts, covering all the fundamental ideas without being too high level. Machine learning is a tool for turning information into knowledge. In the past 50 years, there has been an explosion of data. This mass of data is useless unless we analyse it and find the patterns hidden within. Machine learning techniques are used to automatically find the valuable underlying patterns within complex data that we would otherwise struggle to discover. The hidden patterns and knowledge about a problem can be used to predict future events and perform all kinds of complex decision making.

A History of Triggering Artificial Neuron

It all started when I learned about Deep Belief Nets four years ago. At that time, people said ‘it is inspired by a biological neural network’ and I just accepted that. After I dived deeper into it, many questions popped up in my head, especially when I saw the sigmoid function. At first, it seemed ridiculous to me because this sigmoid function just maps any given number to the range 0 to 1. So, why do we need this? My first guess was that maybe it just normalizes the output so it won’t explode at some point. This question stayed in the corner of my head, sitting peacefully and waiting to be recalled someday.

Clean Code for a Data Scientist (?)

‘I’m a Data Scientist. I don’t need to write clean code because most of my code is throwaway anyway.’ ‘Clean code and agile are good for developing software. They don’t make sense in my work.’ The number of times I have heard the above, and the reluctance to even try some of the suggestions on clean code, baffles me. Well, let me tell you: you don’t need to write clean code for software development either. You don’t need to practice agile for software development either. One can build perfectly working software without either (maintaining, modifying, and scaling it will get difficult, but that’s not the focus of this article). Where you do need to follow clean code practices is when you are working in a TEAM! That holds irrespective of whether you are developing software or an algorithm, or trying out multiple algorithms.

Forecasting Air Pollution with Recurrent Neural Networks

After the citizen science project of Curieuze Neuzen, I wanted to learn more about air pollution and see if I could make a data science project out of it. On the website of the European Environment Agency you can find a huge amount of data and information about air pollution. In this notebook we will focus on the air quality in Belgium, and more specifically on the pollution by sulphur dioxide (SO2). The data can be downloaded via https://…/be. The zip file contains separate files for different air pollutants and aggregation levels. The first digit in each file name represents the pollutant ID as described in the vocabulary. The file used in this notebook is BE_1_2013-2015_aggregated_timeseries.csv. This is the SO2 pollution in Belgium, but you can also find similar data for other European countries.
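As a rough illustration of the file-naming scheme, the sketch below pulls the country code, pollutant ID, and year range out of the file name quoted above. It assumes the layout `<country>_<pollutant ID>_<start>-<end>_<description>.csv`, inferred from that single file name; other files in the archive may not follow it, and the helper `parse_filename` is hypothetical rather than part of the notebook.

```python
def parse_filename(filename):
    """Split an air-quality file name into its underscore-separated parts.

    Assumed layout (inferred from one example file name):
    <country>_<pollutant ID>_<start year>-<end year>_<description>.csv
    """
    stem = filename.rsplit(".", 1)[0]  # drop the .csv extension
    country, pollutant_id, years, *description = stem.split("_")
    start_year, end_year = (int(year) for year in years.split("-"))
    return {
        "country": country,
        "pollutant_id": int(pollutant_id),
        "start_year": start_year,
        "end_year": end_year,
        "description": "_".join(description),
    }


info = parse_filename("BE_1_2013-2015_aggregated_timeseries.csv")
print(info["country"], info["pollutant_id"], info["start_year"], info["end_year"])
```

Here pollutant ID 1 corresponds to SO2, as stated in the text; the mapping for other IDs would have to come from the vocabulary the article refers to.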