Toward an AI Physicist for Unsupervised Learning

We investigate opportunities and challenges for improving unsupervised machine learning using four common strategies with a long history in physics: divide-and-conquer, Occam’s Razor, unification, and lifelong learning. Instead of using one model to learn everything, we propose a novel paradigm centered around the learning and manipulation of *theories*, which parsimoniously predict both aspects of the future (from past observations) and the domain in which these predictions are accurate. Specifically, we propose a novel generalized-mean-loss to encourage each theory to specialize in its comparatively advantageous domain, and a differentiable description length objective to downweight bad data and ‘snap’ learned theories into simple symbolic formulas. Theories are stored in a ‘theory hub’, which continuously unifies learned theories and can propose theories when encountering new environments. We test our implementation, the ‘AI Physicist’ learning agent, on a suite of increasingly complex physics environments. From unsupervised observation of trajectories through worlds involving random combinations of gravity, electromagnetism, harmonic motion and elastic bounces, our agent typically learns faster and produces mean-squared prediction errors about a billion times smaller than a standard feedforward neural net of comparable complexity, typically recovering integer and rational theory parameters exactly. Our agent successfully identifies domains with different laws of motion also for a nonlinear chaotic double pendulum in a piecewise constant force field.


Storm is a tool for the analysis of systems involving random or probabilistic phenomena. Given an input model and a quantitative specification, it can determine whether the input model conforms to the specification. It has been designed with performance and modularity in mind.


MonoCorpus is a note taking app for software and machine learning engineers meant to encourage learning, sharing, and easier development. Increase documentation for yourself and your team without slowing your velocity. Take notes as part of your process instead of dedicating time to writing them.


Odin – Auto-Scaling Group Deployer. Deploy your 12-factor-applications to AWS easily and securely with the Odin. Odin is a AWS Step Function base on the step framework that deploys services as Auto-Scaling Groups (ASG’s) to AWS.

A Practical Implementation of the Faster R-CNN Algorithm for Object Detection (Part 2 – with Python codes)

Which algorithm do you use for object detection tasks? I have tried out quite a few of them in my quest to build the most precise model in the least amount of time. And this journey, spanning multiple hackathons and real-world datasets, has usually always led me to the R-CNN family of algorithms. It has been an incredible useful framework for me, and that’s why I decided to pen down my learnings in the form of a series of articles. The aim behind this series is to showcase how useful the different types of R-CNN algorithms are. The first part received an overwhelmingly positive response from our community, and I’m thrilled to present part two!

Keras Application for Pre-trained Model

In earlier posts, we learned about classic convolutional neural network (CNN) architectures (LeNet-5, AlexNet, VGG16, and ResNets). We created all the models from scratch using Keras but we didn’t train them because training such deep neural networks to require high computation cost and time. But thanks to transfer learning where a model trained on one task can be applied to other tasks. In other words, a model trained on one task can be adjusted or finetune to work for another task without explicitly training a new model from scratch. We refer such model as a pre-trained model.

New R Cheatsheet: Data Science Workflow with R

Teaching R is our mission at Business Science University because R is the most efficient language for exploring data, performing business analysis, and applying data science to business to extract ROI for an organization. R has an amazing ecosystem of tools that seemlessly work together, which has been termed the ‘tidyverse’. The ‘tidyverse’ ecosystem can take business analysis to a new level. Here’s the key: 99.9% of business analysts don’t know R (and its awesome power). With our learning system, you can learn R quickly, setting yourself on a career tragectory in data science.

Hands on Apache Beam, building data pipelines in Python

Apache Beam is an open-source SDK which allows you to build multiple data pipelines from batch or stream based integrations and run it in a direct or distributed way. You can add various transformations in each pipeline. But the real power of Beam comes from the fact that it is not based on a specific compute engine and therefore is platform independant. You declare which ‘runner’ you want to use to compute your transformation. It is using your local computing resource by default, but you can specify a Spark engine for example or Cloud Dataflow… In this article, I will create a pipeline ingesting a csv file, computing the mean of the Open and Close columns fo a historical S&P500 dataset. The goal here is not to give an extensive tutorial on Beam features, but rather to give you an overall idea of what you can do with it and if it is worth for you going deeper in building custom pipelines with Beam. Though I only write about batch processing, streaming pipelines are a powerful feature of Beam!

Facebook Open Sources Horizon to Streamline the Implementation of Reinforcement Learning Solutions

Reinforcement learning is one of the most exciting areas of development in the current artificial intelligence(AI) landscape. From AlphaGo to OpenAI Five, reinforcement learning has been at the center of major AI breakthroughs in the last few years. And yet, the implementation of reinforcement learning remains difficult enough that only specialized teams with advanced AI research skills have been able to pursue those efforts. Not surprisingly, most of the interesting reinforcement learning applications we see today come from AI powerhouses like Google, Microsoft, Amazon, Apple or Facebook. In the case of Facebook, the social media giant has been using reinforcement learning across different scenarios such as intelligent notifications or the M assistant. Recently, the Facebook engineering team open sourced Horizon, a framework that brings together some of the best practices Facebook has learned in the implementation of reinforcement learning solutions so that they can be used by mainstream developers.

Review: U-Net (Biomedical Image Segmentation)

In this story, U-Net is reviewed. U-Net is one of the famous Fully Convolutional Networks (FCN) in biomedical image segmentation, which has been published in 2015 MICCAI with more than 3000 citations while I was writing this story. In the field of biomedical image annotation, we always need experts, who acquired the related knowledge, to annotate each image. And they also consume large amount of time to annotate. If the annotation process becomes automatic, less human efforts and lower cost can be achieved. Or it can be act as a assisted role to reduce the human mistake.

4 Awesome things you can do with Keras and the code you need to make it happen

Keras is one of the most widely used Deep Learning frameworks out there. It’s very easy to use and yet is still on par in terms of performance with the more complex libraries like TensorFlow, Caffe, and MXNet. Unless you’re in an application that requires some very low-level and complex code, Keras will give you the best bang for your buck! But there’s more to Keras than meets the eye. Today we’re sharing a few lesser known yet awesome things you can do with Keras along with the code you need to make it happen. These will help you code all of your custom things directly in Keras without having to switch to those other more tedious and complex libraries.

Recurrent Neural Networks by Example in Python

The first time I attempted to study recurrent neural networks, I made the mistake of trying to learn the theory behind things like LSTMs and GRUs first. After several frustrating days looking at linear algebra equations, I happened on the following passage in Deep Learning with Python: In summary, you don’t need to understand everything about the specific architecture of an LSTM cell; as a human, it shouldn’t be your job to understand it. Just keep in mind what the LSTM cell is meant to do: allow past information to be reinjected at a later time.

Choose an open source license

An open source license protects contributors and users. Businesses and savvy developers won’t touch a project without this protection.

Analyzing train travel times and fares in Singapore

Analyzing train travel times and fares in Singapore. I set out to try to understand the accessibility of different train stations in Singapore and how travel times and fares vary by stations. The purpose of the analysis is to identify areas that are more convenient, based on train as the sole mode of transport. There are many ways to do the analysis:
• Looking at number of stations reachable within X mins
• Looking at the range/ distribution of travel times
• Looking at the range/ distribution of travel fares

Optimize data preparation code using Python concurrent futures

Get your Python code for data preparation to perform significantly faster with just a few lines of code. Take advantage of the build in Concurrent futures. This post will discuss and show how to utilize all your CPU cores when executing your Python code for data preparation by just adding a few lines of extra code.

Deep (Double) Q-Learning

Making Deep Q-Learning Great Again. Part III of the Series about self-learning AI-Agents.

Decrypting your Machine Learning model using LIME

Recent times have seen a renewed focus on model interpretability. ML Experts are able to understand the importance of a model interpretability in it’s subsequent adaption by business. The problem with model explainability is that it’s very hard to define a model’s decision boundary in human understandable manner. LIME is a python library which tries to solve for model interpretability by producing locally faithful explanations. Below is an example of one such explanation for a text classification problem.

Why Data Science Fails

Just as mergers often fail to live up to their promise, so have many data science teams. A KPMG study indicated that 83% of merger deals were unable to boost shareholder returns. And, while I think data science has received general success in companies that have built teams, at many companies it has far underwhelmed its promise. Data science often fails to achieve its full promise in many companies for the same reasons mergers do not. It is like turkeys voting for Christmas. The people required to participate in the change will very likely, or see themselves as very likely, to lose control as part of their participation.

R vs Python vs MATLAB vs Octave

As machine learning engineer I would like to compare 4 machine learning programming languages(tools). Let’s take this a bit deeper. Since most of us are concerned with ML and analysis being a big part of why we are using these programs.I want to list a few advantages and disadvantages of each for who want to start learning them as a data scientist.

Understanding Gradient Boosting Machines

Although most winning models in Kaggle competitions are ensembles of some advanced machine learning algorithms, one particular model that is usually a part of such ensembles is the Gradient Boosting Machines. Take for an example, in this post, the winner of the Allstate Claims Severity Kaggle Competition, Alexey Noskov attributes his success in the competition to another variant of GBM algorithm called XGBOOST. However despite its massive popularity, many professionals still use this algorithm as a black box. As such, the purpose of this article is to lay an intuitive framework for this powerful machine learning technique.

How to design an Artificial Intelligent System. Part 1?—?Concept development.

I am an applied mathematician with a Masters degree in Complex Systems Modelling, and performed data translation research in computational neuroscience. For several years, I ran consultancy agency where I analysed different sources of business data to derive the best solutions for client automation. Within a series of five articles, I will demonstrate a roadmap of data-driven projects, relevant for both research and development. In final chapters, I will show the deploying of a model using machine learning approaches. This manual is aimed for data scientists recently joining projects, however, if you are a decision maker it might give you a better understanding of technological processes.

Why it’s Not Difficult to Train A Neural Network with a Dynamic Structure Anymore!

Finally, the open source community addressed the demands for dynamic structure in Neural Networks. We saw 3 major library releases in last 3 months that support dynamic structures.

Demystifying KL Divergence

If you want to intuitively understand what the KL divergence is, you are in the right place, I’ll demystify the KL divergence for you. As I’m going to explain the KL divergence from the information theory point of view, it is required to know the entropy and the cross-entropy concepts to fully apprehend this article. If you are not familiar with them, you may want to read the following two articles: one for the entropy and the other for the cross-entropy. If you are ready, read on.