Future and Automation: Key Things AI Can & Can’t Do (Yet)

Everything you need to know about the current stage of AI development.
Things the Algorithmic Mind Can’t Do (but Tries To):
#1: Emotions, Empathy, Spontaneity
#2: Be Creative (On Its Own, Anyway)
#3: Be Ethically and Politically Correct
#4: Make a Clear and Consistent Decision


Meta-learning of Adversarial Generative Models

Convolutional neural networks have been successful at generating realistic human head images when trained on a large dataset of images of a single person. In many practical scenarios, however, such personalized talking head models need to be learned from only a few image views of a person, sometimes from a single image. To address this limitation, Samsung AI Research, together with the Skolkovo Institute of Science and Technology, developed a few-shot approach that, with its high-capacity generators and discriminators, can do the following (a rough sketch of the adaptation step follows the list):
• Perform lengthy meta-learning on a large dataset of videos.
• Frame few- and one-shot learning of neural talking head models of previously unseen people as adversarial training problems.
• Initialize the parameters of both the generator and the discriminator in a person-specific way.
• Train quickly and efficiently on just a few images.
• Tune tens of millions of parameters.
• Scale to generate personalized talking head models of new people and even portrait paintings.
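To give a flavour of that adaptation step, here is a rough sketch of fine-tuning a generator/discriminator pair on a handful of frames of one person. The tiny stand-in networks, loss weighting, and learning rates are my own placeholders, not the Samsung/Skoltech implementation; in the real system both networks start from meta-learned weights rather than random initialization.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny stand-in networks so the sketch runs; the real models are far larger
# and are initialized from a lengthy meta-learning stage, not at random.
generator = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(64, 3, 3, padding=1), nn.Tanh())
discriminator = nn.Sequential(nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
                              nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1))
opt_g = torch.optim.Adam(generator.parameters(), lr=5e-5)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def fine_tune_step(landmarks, frames):
    """landmarks, frames: (k, 3, H, W) pose sketches and target images of one person."""
    # Generator step: match the few real frames and fool the discriminator.
    fake = generator(landmarks)
    fake_logits = discriminator(fake)
    g_loss = F.l1_loss(fake, frames) + F.binary_cross_entropy_with_logits(
        fake_logits, torch.ones_like(fake_logits))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

    # Discriminator step: separate the real frames from the generator's output.
    real_logits = discriminator(frames)
    fake_logits = discriminator(generator(landmarks).detach())
    d_loss = (F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
              + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    return g_loss.item(), d_loss.item()

# Toy few-shot set: four random "frames" of one person and matching pose sketches.
landmarks = torch.randn(4, 3, 64, 64)
frames = torch.randn(4, 3, 64, 64)
for step in range(10):
    g_loss, d_loss = fine_tune_step(landmarks, frames)
```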


To understand LSTM architecture, code a forward pass with just NumPy

If you’re like me, you learn best by starting simple and building from the ground up. No matter how often I read colah’s famous LSTM post or Karpathy’s post on RNNs (great resources!), the LSTM network architecture seemed overly complicated and the gates stayed hazy. Eventually, I wrote an LSTM forward pass with just NumPy. I highly recommend doing this exercise, because you can examine the hidden and cell states, inputs, and outputs, and see clearly how they are created.
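If you want to try the exercise yourself, here is a minimal sketch of a single LSTM step in NumPy. The shapes and the one-weight-matrix-per-gate layout are my own choices for clarity, not the article’s code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step for a single example.

    x_t:    (input_dim,)   current input
    h_prev: (hidden_dim,)  previous hidden state
    c_prev: (hidden_dim,)  previous cell state
    params: W_* of shape (hidden_dim, input_dim + hidden_dim) and b_* of
            shape (hidden_dim,) for the f, i, o, and g gates.
    """
    z = np.concatenate([x_t, h_prev])                 # stack input and previous hidden state
    f = sigmoid(params["W_f"] @ z + params["b_f"])    # forget gate
    i = sigmoid(params["W_i"] @ z + params["b_i"])    # input gate
    o = sigmoid(params["W_o"] @ z + params["b_o"])    # output gate
    g = np.tanh(params["W_g"] @ z + params["b_g"])    # candidate cell state
    c_t = f * c_prev + i * g                          # new cell state
    h_t = o * np.tanh(c_t)                            # new hidden state
    return h_t, c_t

# Run a toy sequence so the intermediate states can be inspected.
rng = np.random.default_rng(0)
input_dim, hidden_dim, T = 3, 4, 5
params = {}
for gate in ("f", "i", "o", "g"):
    params[f"W_{gate}"] = rng.normal(scale=0.1, size=(hidden_dim, input_dim + hidden_dim))
    params[f"b_{gate}"] = np.zeros(hidden_dim)

h, c = np.zeros(hidden_dim), np.zeros(hidden_dim)
for t in range(T):
    x_t = rng.normal(size=input_dim)
    h, c = lstm_step(x_t, h, c, params)
    print(f"t={t}  h={np.round(h, 3)}  c={np.round(c, 3)}")
```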


Reverse Engineering the Walk Score Algorithm

I live in Seattle and recently moved to a different neighborhood. According to Walk Score’s proprietary algorithms, I moved from the 9th most walkable Seattle neighborhood to the 30th. I can still easily walk to a local coffee shop and barber, but that’s about it! I can tell that I’ve moved to a considerably less walkable neighborhood, but it’s unclear how to quantify the magnitude or what goes into a walkability score. I’ve previously used the Walk Score API as a data source for predicting clustering of electric scooter locations. Walk Score is a website that takes an address and computes a measure of its walkability on a scale from 0 to 100, using proprietary algorithms and various data streams.
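For reference, pulling a score from the Walk Score API looks roughly like the sketch below. The endpoint, parameter names, and response fields are from memory of the public API documentation and should be verified; the key, address, and coordinates are placeholders.

```python
import requests

WS_API_KEY = "your-walkscore-api-key"          # placeholder
ADDRESS = "1234 Example St, Seattle, WA"       # placeholder
LAT, LON = 47.6062, -122.3321                  # placeholder coordinates

# Request a JSON walkability score for the address (check current docs for exact params).
resp = requests.get(
    "https://api.walkscore.com/score",
    params={
        "format": "json",
        "address": ADDRESS,
        "lat": LAT,
        "lon": LON,
        "wsapikey": WS_API_KEY,
    },
    timeout=10,
)
resp.raise_for_status()
data = resp.json()
print(data.get("walkscore"), data.get("description"))
```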


Introduction to Latent Matrix Factorization Recommender Systems

Latent Matrix Factorization is an incredibly powerful method to use when creating a Recommender System. Ever since Latent Matrix Factorization was shown to outperform other recommendation methods in the Netflix Recommendation contest, it’s been a cornerstone of building Recommender Systems. This article aims to give you some intuition for when to use Latent Matrix Factorization for recommendation and for why it works. If you’d like to see a full implementation, you can go to my Kaggle kernel: Probabilistic Matrix Factorization.
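To make the idea concrete, here is a minimal sketch of latent matrix factorization trained with SGD. The rating triples, factor count, and hyperparameters are invented for illustration; this is not the Kaggle kernel’s code.

```python
import numpy as np

# (user, item, rating) triples for a tiny toy dataset
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
n_users, n_items, k = 3, 3, 2            # k = number of latent factors
lr, reg, epochs = 0.05, 0.02, 200

rng = np.random.default_rng(42)
P = rng.normal(scale=0.1, size=(n_users, k))   # user factor matrix
Q = rng.normal(scale=0.1, size=(n_items, k))   # item factor matrix

for _ in range(epochs):
    for u, i, r in ratings:
        err = r - P[u] @ Q[i]                   # prediction error for this rating
        P[u] += lr * (err * Q[i] - reg * P[u])  # SGD step with L2 regularization
        Q[i] += lr * (err * P[u] - reg * Q[i])

# Predict an unseen rating: user 0 on item 2
print(round(float(P[0] @ Q[2]), 2))
```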


Hands-on Graph Neural Networks with PyTorch & PyTorch Geometric

In my last article, I introduced the concept of the Graph Neural Network (GNN) and some of its recent advancements. Since this topic is getting seriously hyped up, I decided to make this tutorial on how to easily implement a Graph Neural Network in your project. You will learn how to construct your own GNN with PyTorch Geometric, and how to use a GNN to solve a real-world problem (RecSys Challenge 2015). In this blog post, we will be using PyTorch and PyTorch Geometric (PyG), a Graph Neural Network framework built on top of PyTorch that runs blazingly fast. Well… how fast is it? In terms of training time, it is up to 80% faster than another popular Graph Neural Network library, DGL!
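If you just want to see what PyG code looks like, here is a minimal sketch of a two-layer GCN trained on a toy graph. It is not the RecSys Challenge 2015 model from the tutorial; the graph, feature sizes, and training loop are invented for illustration.

```python
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv

class GCN(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, num_classes):
        super().__init__()
        self.conv1 = GCNConv(in_channels, hidden_channels)
        self.conv2 = GCNConv(hidden_channels, num_classes)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))   # message passing + nonlinearity
        return self.conv2(x, edge_index)        # per-node class scores

# A tiny toy graph: 4 nodes, 3 undirected edges (each listed in both directions).
edge_index = torch.tensor([[0, 1, 1, 2, 2, 3],
                           [1, 0, 2, 1, 3, 2]], dtype=torch.long)
x = torch.randn(4, 8)                           # 8 features per node
y = torch.tensor([0, 1, 0, 1])                  # node labels
data = Data(x=x, edge_index=edge_index, y=y)

model = GCN(in_channels=8, hidden_channels=16, num_classes=2)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for epoch in range(50):
    optimizer.zero_grad()
    out = model(data.x, data.edge_index)
    loss = F.cross_entropy(out, data.y)
    loss.backward()
    optimizer.step()
```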


What if AI model understanding were easy?

Let’s talk about the What-If Tool, as in ‘What if getting a look at your model performance and data during ML/AI development weren’t such a royal pain in the butt?’ (Or ignore my chatter and scroll straight to the walkthrough screenshots below!)


Interactive charts with chartbookR

There is nothing worse than charts overloaded with information. One solution to this is interactive charts that let users select the time series they’re interested in, zoom in on them, and focus on individual data points. The chartbookR package makes creating interactive charts very easy, with minimal coding. This tutorial outlines how to create such charts and how to add further interactive features.


Who Is Your Golden Goose?: Cohort Analysis

Customer segmentation is the technique of dividing customers into groups based on their purchase patterns in order to identify the most profitable groups. Depending on the market, segmentation can use various criteria, such as geographic, demographic, or behavioral characteristics. The technique assumes that groups with different features require different marketing approaches, and it aims to find the groups that can boost profitability the most. Today, we are going to discuss how to do customer segmentation analysis with the online retail dataset from the UCI ML repository. The analysis focuses on two steps: computing the RFM values and building clusters with the K-means algorithm. The dataset and the full code are also available on my GitHub. The original resource for this note is the course ‘Customer Segmentation Analysis in Python.’
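To give a flavour of those two steps, here is a hedged sketch with pandas and scikit-learn. The column names follow the UCI online retail dataset, but the file path and cluster count are placeholders, and this is not the notebook’s actual code.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Load the transactions (placeholder path) and compute the spend per line item.
df = pd.read_csv("online_retail.csv", parse_dates=["InvoiceDate"])
df["amount"] = df["Quantity"] * df["UnitPrice"]

# Step 1: RFM values per customer (recency in days, frequency of invoices, monetary spend).
snapshot = df["InvoiceDate"].max() + pd.Timedelta(days=1)
rfm = df.groupby("CustomerID").agg(
    recency=("InvoiceDate", lambda d: (snapshot - d.max()).days),
    frequency=("InvoiceNo", "nunique"),
    monetary=("amount", "sum"),
)

# Step 2: scale the RFM features and cluster customers into segments with K-means.
X = StandardScaler().fit_transform(rfm)
rfm["segment"] = KMeans(n_clusters=4, random_state=42, n_init=10).fit_predict(X)
print(rfm.groupby("segment").mean())
```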


Python leads the 11 top Data Science, Machine Learning platforms: Trends and Analysis

Python continues to lead the top Data Science platforms, but R and RapidMiner hold their share; almost 50% have used Deep Learning tools; SQL is steady; consolidation continues.


EfficientNet: Improving Accuracy and Efficiency through AutoML and Model Scaling

Convolutional neural networks (CNNs) are commonly developed at a fixed resource cost, and then scaled up in order to achieve better accuracy when more resources are made available. For example, ResNet can be scaled up from ResNet-18 to ResNet-200 by increasing the number of layers, and recently, GPipe achieved 84.3% ImageNet top-1 accuracy by scaling up a baseline CNN by a factor of four. The conventional practice for model scaling is to arbitrarily increase the CNN depth or width, or to use larger input image resolution for training and evaluation. While these methods do improve accuracy, they usually require tedious manual tuning, and still often yield suboptimal performance. What if, instead, we could find a more principled method to scale up a CNN to obtain better accuracy and efficiency? In our ICML 2019 paper, ‘EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks’, we propose a novel model scaling method that uses a simple yet highly effective compound coefficient to scale up CNNs in a more structured manner. Unlike conventional approaches that arbitrarily scale network dimensions, such as width, depth, and resolution, our method uniformly scales each dimension with a fixed set of scaling coefficients. Powered by this novel scaling method and recent progress on AutoML, we have developed a family of models, called EfficientNets, which surpass state-of-the-art accuracy with up to 10x better efficiency (smaller and faster).
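To make the compound coefficient concrete, here is a tiny numeric sketch. The constants α = 1.2, β = 1.1, γ = 1.15 are the values reported in the paper for the B0 baseline, but the helper itself is only an illustration (not Google’s code), and the released EfficientNet variants do not follow this formula exactly.

```python
# Compound scaling: depth ~ alpha**phi, width ~ beta**phi, resolution ~ gamma**phi,
# with alpha * beta**2 * gamma**2 ≈ 2 so FLOPs roughly double each time phi grows by 1.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def compound_scale(phi, base_depth=1.0, base_width=1.0, base_resolution=224):
    depth = base_depth * ALPHA ** phi            # multiplier on the number of layers
    width = base_width * BETA ** phi             # multiplier on the number of channels
    resolution = int(round(base_resolution * GAMMA ** phi))  # input image size
    return depth, width, resolution

for phi in range(4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution {r}")
```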


Ideas: Design Methodologies for Data Sprints

I recently spent four days at a research lab with a group of data scientists and a few behavioral scientists diving into a large, messy data set to see if we might find insights related to group dynamics. It was framed as an initial dive to look around, get a first assessment of the potential for useful research insights, and then build a proposal for two to three months of work based on what was found that week. I attended, in theory, in the role of a behavioral scientist, but since I’m also obsessed with creative problem-solving processes, or design thinking (and I’m opinionated as hell), the following are some reflections on the process we used and ideas for what’s next.


Asynchronous Workflows in Data Science

Pointlessly staring at live logs and waiting for a miracle to happen is a huge time sink for data scientists everywhere. Instead, one should strive for an asynchronous workflow. In this article, we define asynchronous workflows, figure out some of the obstacles, and finally point you toward a follow-up article for a concrete look at one. Before we can say anything about asynchronous workflows, we have to define the synchronous counterpart. A synchronous workflow is the iterative cycle of creation and testing, where only one or the other is taking place at any one time. For example, in software development, you write code, test it, then write some more code, then test that code, write even more code, test it again, code, test, code, test, code… repeat until IPO. Simple! A synchronous workflow is often so ingrained into our way of doing things that we do not even recognize it as a thing. This is also the reason why it often isn’t even questioned. It is hard to question something that you don’t even realize exists.
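As one tiny illustration of the idea in code (not the article’s setup), you can make a long-running step non-blocking and let a callback tell you when it finishes, instead of watching the logs. The train() function and its result are stand-ins for any slow job.

```python
import time
from concurrent.futures import ProcessPoolExecutor

def train(config):
    time.sleep(5)                      # stand-in for a long training run
    return {"config": config, "score": 0.91}

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        future = pool.submit(train, {"lr": 1e-3})
        # The callback fires when the job finishes, so there is no need to
        # sit and watch the logs in the meantime.
        future.add_done_callback(lambda f: print("done:", f.result()))
        print("job submitted; doing other work while it runs...")
```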


ML Algorithms: One SD (σ) - Association Rule Learning Algorithms

The obvious question to ask when facing a wide variety of machine learning algorithms is, ‘Which algorithm is better for a specific task, and which one should I use?’ The answer varies depending on several factors, including: (1) the size, quality, and nature of the data; (2) the available computational time; (3) the urgency of the task; and (4) what you want to do with the data. This is one section of the many algorithms I wrote about in a previous article. In this part, I try to display and briefly explain, as simply as possible, the main algorithms (though not all of them) available for association rule learning tasks.
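For a quick feel of association rule learning, here is a hedged Apriori sketch using mlxtend (one common choice, not necessarily what the article uses); the toy transactions and thresholds are invented.

```python
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

transactions = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]

# One-hot encode the transactions into a boolean item matrix.
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions), columns=te.columns_)

# Frequent itemsets with at least 60% support, then rules with decent confidence.
itemsets = apriori(onehot, min_support=0.6, use_colnames=True)
rules = association_rules(itemsets, metric="confidence", min_threshold=0.7)
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])
```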


R-Squared in One Picture

R-squared measures how well your data fits a regression line. More specifically, it’s how much of the variation in the response variable your linear model explains, expressed as a percentage (0 to 100%). The percentage is problem specific, so you can’t compare R-squared across different situations; you can use it to compare different models for one specific set of data. R-squared is also influenced by the number of observations: an R-squared of 0.80 on 100 observations doesn’t mean the same thing as an R-squared of 0.80 on 1,000 observations. One way around this is to compute R-squared on multiple sub-samples of 100 observations and then take the median. That way, you can compare an R-squared on (say) 1,000 observations with one on 100 observations.
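As a quick worked example of the definition R² = 1 − SS_res / SS_tot, here is a short NumPy snippet on invented data.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
y = 2.0 * x + 1.0 + rng.normal(scale=2.0, size=x.size)   # noisy linear data

slope, intercept = np.polyfit(x, y, deg=1)               # simple least-squares line
y_hat = slope * x + intercept

ss_res = np.sum((y - y_hat) ** 2)        # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)     # total sum of squares
r_squared = 1 - ss_res / ss_tot
print(f"R-squared: {r_squared:.3f}")     # fraction of variance explained by the line
```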


Visualize correlation matrices in Python

When working with data, it is helpful to build a correlation matrix to describe the data and the associations between variables. In this article, you will learn how to visualize correlation matrices in Python.
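Here is a minimal sketch of one common approach, using pandas to compute the matrix and seaborn to plot it; the random DataFrame is a stand-in for your own data, and the article may use different tools.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(200, 4)), columns=["a", "b", "c", "d"])
df["e"] = df["a"] * 0.8 + rng.normal(scale=0.5, size=200)   # add a correlated column

corr = df.corr()                         # pairwise Pearson correlations
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1)
plt.title("Correlation matrix")
plt.tight_layout()
plt.show()
```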