AI & Architecture

Artificial Intelligence, as a discipline, has been permeating countless fields, bringing new means and methods to previously unresolved challenges across industries. The advent of AI in Architecture, described in a previous article, is still in its early days but offers promising results. More than a mere opportunity, such potential represents for us a major step forward, about to reshape the architectural discipline. Our work proposes to evidence this promise when applied to the built environment. Specifically, we offer to apply AI to floor plan analysis and generation. Our ultimate goal is three-fold: (1) to generate floor plans, i.e. optimize the generation of a large and highly diverse set of floor plan designs; (2) to qualify floor plans, i.e. offer a proper classification methodology; and (3) to allow users to ‘browse’ through the generated design options.

Gate Recurrent Units explained using Matrices: Part 1

Oftentimes we get consumed with using deep learning frameworks that perform all of the operations needed to build our models. However, there is real value in first understanding the basic matrix operations used under the hood. In this tutorial we will walk you through the simple matrix operations needed to understand how a GRU works.
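The matrix view of a GRU can be sketched in a few lines of NumPy. The sketch below follows the common formulation (update gate, reset gate, candidate state); the weight shapes and the omission of bias terms are simplifying assumptions to keep the matrix algebra in focus, and some GRU variants swap the roles of z and (1 − z):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU time step written as plain matrix operations.

    x: input vector (n_in,), h_prev: previous hidden state (n_h,).
    W*: input weights (n_h, n_in), U*: recurrent weights (n_h, n_h).
    Biases are omitted for clarity.
    """
    z = sigmoid(Wz @ x + Uz @ h_prev)               # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev))   # candidate state
    return (1 - z) * h_prev + z * h_tilde           # new hidden state

# Tiny example: 3-dim inputs, 2-dim hidden state, random weights
rng = np.random.default_rng(0)
n_in, n_h = 3, 2
Wz, Wr, Wh = (rng.normal(size=(n_h, n_in)) for _ in range(3))
Uz, Ur, Uh = (rng.normal(size=(n_h, n_h)) for _ in range(3))

h = np.zeros(n_h)
for x in rng.normal(size=(5, n_in)):   # run 5 time steps
    h = gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh)
print(h.shape)  # hidden state stays (2,)
```

Because the new state is a convex combination of the previous state and a tanh-bounded candidate, every component of h stays inside (−1, 1).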

Brief introduction to Markov chains

In 1998, Lawrence Page, Sergey Brin, Rajeev Motwani and Terry Winograd published ‘The PageRank Citation Ranking: Bringing Order to the Web’, an article in which they introduced the now famous PageRank algorithm at the origin of Google. A little more than two decades later, Google has become a giant and, even if the algorithm has evolved a lot, PageRank is still a ‘symbol’ of the Google ranking algorithm (even if few people can really say how much weight it still carries in the algorithm). From a theoretical point of view, it is interesting to notice that one common interpretation of the PageRank algorithm relies on the simple but fundamental mathematical notion of Markov chains. We will see in this article that Markov chains are powerful tools for stochastic modelling that can be useful to any data scientist. More specifically, we will answer basic questions such as: what are Markov chains, what good properties do they have, and what can be done with them?
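The PageRank interpretation boils down to one Markov-chain fact: a well-behaved chain converges to a stationary distribution. A minimal sketch (the 3-state transition matrix below is made up for illustration):

```python
import numpy as np

# A tiny 3-state Markov chain: row i gives P(next state | current state i).
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.4, 0.4],
])

# Evolve an initial distribution: pi_{t+1} = pi_t @ P.
pi = np.array([1.0, 0.0, 0.0])
for _ in range(100):
    pi = pi @ P

# For an irreducible, aperiodic chain this converges to the unique
# stationary distribution pi* satisfying pi* = pi* @ P. PageRank is
# (conceptually) the stationary distribution of a random walk over
# the web's link graph.
print(np.round(pi, 4))
```

The limiting distribution no longer depends on the starting state, which is exactly why it can serve as an importance score for web pages.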

Bayesian Optimization for Hyper-Parameter

In the past several weeks, I spent a tremendous amount of time reading the literature on automatic parameter tuning in the context of Machine Learning (ML), most of which can be classified into two major categories, i.e. search and optimization. Search mechanisms, such as grid search, random search, and Sobol sequences, can be somewhat computationally expensive. However, they are extremely easy to implement and parallelize on a multi-core PC, as shown in https://…m-random-in-hyper-parameter-optimization. On the other hand, optimization algorithms, especially gradient-free optimizers such as Nelder-Mead simplex and particle swarm, are often able to quickly locate close-to-optimal solutions in cases where finding the global optimum is neither feasible nor necessary, as shown in https://…/direct-optimization-of-hyper-parameter and https://…-free-optimization-for-glmnet-parameters.
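To make the "search" category concrete, here is a minimal random-search sketch. The two-parameter objective is a hypothetical stand-in for a validation loss (a real run would train and score a model at each trial); sampling on a log scale is the usual convention for learning rates and regularization strengths:

```python
import numpy as np

# Hypothetical toy objective standing in for a validation loss as a
# function of two hyper-parameters (minimum at lr=1e-2, reg=1e-1).
def val_loss(lr, reg):
    return (np.log10(lr) + 2) ** 2 + (np.log10(reg) + 1) ** 2

rng = np.random.default_rng(42)
best = (None, np.inf)
for _ in range(200):                    # 200 independent random trials
    lr = 10 ** rng.uniform(-5, 0)       # sample on a log scale
    reg = 10 ** rng.uniform(-4, 1)
    loss = val_loss(lr, reg)
    if loss < best[1]:
        best = ((lr, reg), loss)

print(best)
```

Because each trial is independent, the loop parallelizes trivially across cores, which is exactly the ease-of-parallelization advantage noted above.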

Python Tutorial: Short Stop To Introduce Main Statistical Concepts

I planned to introduce in this session the merging algorithm for our two data sets via the name variables. To outline the overall approach understandably and concisely, it’s essential to introduce some statistical concepts beforehand. I decided to make a short bus stop and embed the work into the statistical framework which underlies any data exploration project. So no programming today, just some concepts which are vital for the overall understanding. I plan more statistics bus stops in the future to get the scientific context of our program established.

Performing Classification in TensorFlow

In this article, I will explain how to perform classification using the TensorFlow library in Python. We’ll be working with the California Census Data and will try to use various features of individuals to predict which income class they belong to (>50k or <=50k). The data can be accessed at my GitHub profile in the TensorFlow repository. Here is the link to access the data.

Time Series in Python – Part 3: Forecasting taxi trips with LSTMs

LSTM (Long Short-Term Memory) is a type of recurrent neural network (RNN) architecture, proposed in 1997 by Sepp Hochreiter and Jürgen Schmidhuber. RNNs are deep neural networks specially designed to handle sequential data via recurrence mechanisms. They behave in an autoregressive manner, as they keep track of the past via internal states (hence the ‘memory’ part). They have been used extensively for speech recognition, machine translation, speech synthesis, etc. But what are LSTMs worth when used on time series? Well, they can prove very useful for modelling non-linear relationships, assuming the available data is large enough.
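Before any LSTM can forecast a series, the series has to be reframed as supervised (window → next value) pairs. A minimal sketch of that step, with made-up trip counts standing in for the taxi data:

```python
import numpy as np

def make_sequences(series, window):
    """Frame a univariate series as (input window -> next value) pairs,
    the supervised shape an LSTM expects for one-step forecasting."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return np.array(X), np.array(y)

# Hypothetical hourly trip counts (illustrative only)
trips = np.array([12, 15, 14, 18, 22, 25, 24, 30, 28, 26], dtype=float)
X, y = make_sequences(trips, window=3)
print(X.shape, y.shape)  # (7, 3) (7,)

# Recurrent layers typically also expect a trailing feature axis:
X = X.reshape(-1, 3, 1)
```

With 10 observations and a window of 3, we get 7 training pairs; the first window [12, 15, 14] is paired with the next observed value, 18.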

How AI will change the future of Software Development in 2019

In this article, we’ll explore how the future of software development and its processes will be streamlined by the disruptive technology of Artificial Intelligence and its subsets. Most software developers, in India and all over the world, have been solving deterministic problems by applying their logic and writing code that follows certain rules. Software built this way is the result of human-driven phases and processes. However, we now have new technologies that can assist us in solving intricate problems with ease.

Explaining data science, AI, ML and deep learning to management – a presentation and a script – Part 1 of 3

If you are a data scientist, a machine learning engineer, an AI specialist, or whatever the heck you want to call yourself, chances are that at some point in your professional career you have encountered at least one manager who doesn’t understand the differences between some of the concepts listed in the title. After all, why should they?! Chances are that you have struggled to explain those differences without resorting to technical jargon, and you have alienated your already confused manager by doing so. Fear no more! We, at Yuxi Global, have developed a nearly perfect solution for this kind of situation. As the Head of Data Analytics at Yuxi Global, I was tasked with the construction of a Prezi presentation to aid in explaining what ‘Data Science’, ‘Artificial Intelligence’, ‘Machine Learning’ (ML) and ‘Deep Learning’ (DL) mean to middle and upper management folks. The ultimate goal was to help them guide their strategic as well as commercial decisions about what services and kinds of solutions we want to develop for clients in these areas. After doing this, we thought it would be valuable to share our work with the world, to save every data scientist out there many hours of intense labor with presentation software. The presentation link is just below this paragraph. Further below, we also share a script that presents the material in a fully linear fashion. Of course, you can pick bits and pieces on which to focus the presentation and jump between them. That was one of the reasons for choosing Prezi as our presentation solution.

Demystifying Maths of Gradient Boosting

Boosting is an ensemble learning technique. Conceptually, these techniques involve: 1. training base learners; 2. using all of the models to come to a final prediction. Ensemble learning techniques come in different types, which differ in how they implement the learning process for the base learners and then use their outputs to produce the final result. Techniques used in ensemble learning include Bootstrap Aggregation (a.k.a. Bagging), Boosting, Cascading models and Stacked Ensemble Models. In this article, we shall briefly discuss Bagging and then move on to Gradient Boosting, which is the focus of this article. There are a lot of sources that explain the steps of the Gradient Boosting algorithm. But if you try to find a source that explains what each step really does to make the entire algorithm work, you will probably find articles that use squared error as the running example. Those explanations are very nice, but the problem is that they focus so much on squared error that they almost fail to convey the generalised idea. Gradient Boosting is a generic model that works with any differentiable loss function; seeing it work with squared loss alone does not completely explain what it does during the learning process. In this article, I intend to explain the algorithm through a more generic approach.
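The generic idea can be sketched in a few dozen lines: each round fits a small base learner to the negative gradient of the loss at the current prediction, and the loss enters only through its gradient. This is a simplified illustration (depth-1 stumps on a 1-D toy problem, no line search for leaf values), not a production implementation:

```python
import numpy as np

def fit_stump(x, residual):
    """Fit a depth-1 regression stump (best single threshold) to residuals."""
    best_err, best_stump = np.inf, None
    for t in np.unique(x):
        left, right = residual[x <= t], residual[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        pred = np.where(x <= t, left.mean(), right.mean())
        err = ((residual - pred) ** 2).sum()
        if err < best_err:
            best_err, best_stump = err, (t, left.mean(), right.mean())
    return best_stump

def boost(x, y, grad, n_rounds=100, lr=0.1):
    """Generic gradient boosting: each round fits a stump to the
    negative gradient of the loss at the current prediction F."""
    F = np.full_like(y, y.mean())        # initial constant prediction
    for _ in range(n_rounds):
        t, lo, hi = fit_stump(x, -grad(y, F))   # pseudo-residuals
        F = F + lr * np.where(x <= t, lo, hi)   # shrunken update
    return F

# Squared loss L = (y - F)^2 / 2 has gradient dL/dF = F - y.
# Swapping in grad = lambda y, F: np.sign(F - y) would boost under
# absolute loss instead -- same algorithm, different loss.
x = np.linspace(0, 1, 40)
y = np.sin(2 * np.pi * x)
F = boost(x, y, grad=lambda y, F: F - y)
print(round(float(np.mean((y - F) ** 2)), 4))
```

Note that the only loss-specific piece is the `grad` function, which is precisely the generality the squared-error-only explanations obscure.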

Real-Time Streaming and Anomaly detection Pipeline on AWS

Understand how your TensorFlow Model is Making Predictions

Machine learning can answer questions more quickly and accurately than ever before. As machine learning is used in more mission-critical applications, it is increasingly important to understand how these predictions are derived. In this blog post, we’ll build a neural network model using the Keras API from TensorFlow, an open-source machine learning framework. Once our model is trained, we’ll integrate it with SHAP, an interpretability library. We’ll use SHAP to learn which factors are correlated with the model predictions.

An example in causal inference designed to frustrate: an estimate pretty much guaranteed to be biased

I am putting together a brief lecture introducing causal inference for graduate students studying biostatistics. As part of this lecture, I thought it would be helpful to spend a little time describing directed acyclic graphs (DAGs), since they are an extremely helpful tool for communicating assumptions about the causal relationships underlying a researcher’s data. The strength of DAGs is that they help us think about how these underlying relationships in the data might lead to biases in causal effect estimation, and suggest ways to estimate causal effects that eliminate these biases. (For a real introduction to DAGs, you could take a look at this paper by Greenland, Pearl, and Robins or, better yet, take a look at Part I of this book on causal inference by Hernán and Robins.)
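The canonical DAG-driven bias is confounding, and it is easy to demonstrate by simulation. The sketch below uses made-up coefficients on the DAG Z → X, Z → Y, X → Y with a true effect of X on Y equal to 1: regressing Y on X alone is biased, while adjusting for Z closes the back-door path:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
z = rng.normal(size=n)                        # confounder
x = 0.8 * z + rng.normal(size=n)              # Z affects "treatment" X
y = 1.0 * x + 1.5 * z + rng.normal(size=n)    # Z also affects outcome Y

# Naive estimate: regress Y on X alone (slope = cov(x, y) / var(x)).
naive = np.cov(x, y)[0, 1] / np.var(x)

# Adjusted estimate: regress Y on X and Z jointly (blocks the back door).
design = np.column_stack([x, z, np.ones(n)])
adjusted = np.linalg.lstsq(design, y, rcond=None)[0][0]

print(round(naive, 3), round(adjusted, 3))
```

The naive slope lands well above the true effect of 1 (the back-door path through Z inflates it), while the adjusted slope recovers it, which is exactly the story the DAG tells before any data is collected.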

Simple guide for ensemble learning methods

Before this one, I had published a post on ‘Holy grail for Bias variance trade-off, Overfitting and Underfitting’. That comprehensive article serves as an important prequel to this post if you are a newbie or would just like to brush up on the concepts of bias and variance before diving full force into the sea of ensemble modelling. Everyone else in the audience can readily move on to learn more about ensemble modelling from my pen. I will use some real-life examples to simplify the what, why and how of ensemble models, with a focus on bagging and boosting techniques.
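The mechanics of bagging can be sketched very compactly: fit one model per bootstrap resample, then average. Bagging is usually paired with high-variance learners like deep trees; the straight-line fit below is just a stand-in so the resample-and-average loop stays readable:

```python
import numpy as np

rng = np.random.default_rng(7)

# Noisy linear data; the base model is a straight-line fit (np.polyfit).
x = np.linspace(0, 1, 30)
y = 2 * x + rng.normal(scale=0.5, size=30)

def bagged_fit(x, y, n_models=100):
    """Bagging: fit one model per bootstrap resample, then average them."""
    coefs = []
    for _ in range(n_models):
        idx = rng.integers(0, len(x), size=len(x))  # sample with replacement
        coefs.append(np.polyfit(x[idx], y[idx], deg=1))
    return np.mean(coefs, axis=0)  # averaged (slope, intercept)

slope, intercept = bagged_fit(x, y)
print(round(slope, 2), round(intercept, 2))
```

Averaging many models fit on perturbed copies of the data is what drives the variance reduction discussed in the bias-variance prequel; boosting, by contrast, fits its models sequentially, each one correcting the last.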

Custom Transformers and ML Data Pipelines with Python

How you can use inheritance and sklearn to write your own custom transformers and pipelines for machine learning preprocessing. 80% of the total time spent on most data science projects goes into cleaning and preprocessing the data. We’ve all heard that, right? So it only makes sense to find ways to automate the preprocessing and cleaning as much as we can. Scikit-Learn pipelines are composed of steps, each of which has to be some kind of transformer, except the last step, which can be a transformer or an estimator such as a machine learning model. When I say transformer, I mean transformers such as the Normalizer, StandardScaler or the One Hot Encoder, to name a few. But what if, before using any of those, I wanted to write my own custom transformer, not provided by Scikit-Learn, that would take the weighted average of the 3rd, 7th and 11th columns in my dataset with a weight vector I provide as an argument, create a new column with the result, and drop the original columns? And, most importantly, what if I also wanted my custom transformer to seamlessly integrate with my existing Scikit-Learn pipeline and its other transformers? Sounds great, and lucky for us, Scikit-Learn allows us to do exactly that.
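A sketch of that exact transformer, using the standard sklearn pattern of inheriting from `BaseEstimator` and `TransformerMixin`. The class name and its `cols`/`weights` parameters are illustrative choices, not sklearn API (note that the "3rd, 7th and 11th columns" become zero-based indices 2, 6 and 10):

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

class WeightedAverage(BaseEstimator, TransformerMixin):
    """Replace the given columns with their weighted average."""

    def __init__(self, cols, weights):
        self.cols = cols
        self.weights = weights

    def fit(self, X, y=None):
        return self  # stateless: nothing to learn from the data

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        w = np.asarray(self.weights, dtype=float)
        avg = X[:, self.cols] @ w / w.sum()          # the new column
        keep = [i for i in range(X.shape[1]) if i not in self.cols]
        return np.column_stack([X[:, keep], avg])    # drop the originals

# The custom step drops in alongside built-in transformers:
pipe = Pipeline([
    ("wavg", WeightedAverage(cols=[2, 6, 10], weights=[0.5, 0.3, 0.2])),
    ("scale", StandardScaler()),
])
X = np.arange(24, dtype=float).reshape(2, 12)
print(pipe.fit_transform(X).shape)  # 12 columns in, 10 out
```

Because `fit` and `transform` follow the sklearn contract, the pipeline can call `fit_transform` on every step uniformly, which is the "seamless integration" the article is after.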

Using Rstudio Jobs for training many models in parallel

Recently, Rstudio added the Jobs feature, which allows you to run R scripts in the background. Computations are done in a separate R session that is not interactive, but just runs the script. In the meantime your regular R session stays live, so you can do other work while waiting for the Job to complete. Instead of refreshing your Twitter for the 156th time, you can stay productive! (I am actually writing this blog in Rmarkdown while I am waiting for my results to come in.) The number of jobs you can spin up is not limited to one. As each new job is started on a different core, you can start as many jobs as your system has cores (although leaving one idle is a good idea for other processes, like your interactive R session).

Regression: Kernel and Nearest Neighbor Approach

In this article, I will talk about the kernel and nearest neighbor approaches, which form a major class of non-parametric methods for solving a regression problem.
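Both methods predict at a point by averaging nearby training targets; they differ only in how "nearby" is weighted. A minimal sketch on synthetic 1-D data (the sine curve, noise level, k and bandwidth are all illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 1, 50))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=50)

def knn_predict(x_train, y_train, x0, k=5):
    """k-NN regression: unweighted average of the k nearest targets."""
    idx = np.argsort(np.abs(x_train - x0))[:k]
    return y_train[idx].mean()

def kernel_predict(x_train, y_train, x0, h=0.1):
    """Nadaraya-Watson: Gaussian-kernel weighted average of all targets."""
    w = np.exp(-0.5 * ((x_train - x0) / h) ** 2)
    return (w @ y_train) / w.sum()

x0 = 0.25  # true regression function here is sin(pi/2) = 1
print(round(knn_predict(x, y, x0), 2), round(kernel_predict(x, y, x0), 2))
```

k-NN uses a hard cutoff (the k closest points, equally weighted), while the kernel estimator lets every point vote with a smoothly decaying weight; k and the bandwidth h play the same bias-variance role in the two methods.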