Meet ALBERT: a new ‘Lite BERT’ from Google & Toyota with State of the Art NLP performance and 18x fewer parameters.

Your previous NLP models are parameter inefficient and kind of obsolete. Have a great day. Google Research and Toyota Technological Institute jointly released a new paper, ‘ALBERT: A Lite BERT for Self-supervised Learning of Language Representations’, that introduces the world to what is arguably BERT’s successor: a much smaller and smarter Lite BERT called ALBERT.

AI for portfolio management: from Markowitz to Reinforcement Learning

The evolution of quantitative asset management techniques with empirical evaluation and Python source code. Artificial intelligence, machine learning, big data, and other buzzwords are disrupting decision making in almost every area of finance. In the back office, machine learning is widely applied to spot anomalies in execution logs, to manage risk, and to detect fraudulent transactions. In the front office, AI is used for customer segmentation and support and for pricing derivatives. But of course, the most interesting applications of AI in finance are on the buy side and are related to finding the predictive signal in the noise and catching that alpha. They include, but are not restricted to, time series forecasting, regime-switching detection, market segmentation, and, of course, asset portfolio management. This article is fully devoted to the last of these: we will review classical mathematical methods for portfolio optimization, unsupervised and supervised machine learning approaches, reinforcement learning agents, and some more exotic options. The material is closely tied to the in-house expertise of Neurons Lab, where I am co-founder and CTO, and to the course I taught at the UCU data science summer school. As always, you can find all source code on GitHub and the results of the experiments further in this article.
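To make the "classical mathematical methods" concrete, here is a minimal sketch of the simplest Markowitz-style case: the minimum-variance mix of two assets, which has a closed form. The variances and covariance are made-up numbers, the function names are hypothetical, and this is not the code from the article's GitHub repository.

```python
def min_variance_weights(var_a, var_b, cov_ab):
    """Weights (w_a, w_b) of the fully invested two-asset portfolio with the
    lowest variance. Short positions are allowed (weights may leave [0, 1])."""
    denom = var_a + var_b - 2.0 * cov_ab  # variance of the return spread a - b
    if denom <= 0:
        raise ValueError("the two return series differ only by a constant")
    w_a = (var_b - cov_ab) / denom
    return w_a, 1.0 - w_a


def portfolio_variance(w_a, w_b, var_a, var_b, cov_ab):
    """Variance of the portfolio return w_a * r_a + w_b * r_b."""
    return w_a ** 2 * var_a + w_b ** 2 * var_b + 2.0 * w_a * w_b * cov_ab


# Hypothetical annualized inputs, chosen only for illustration.
var_a, var_b, cov_ab = 0.04, 0.09, 0.006
w_a, w_b = min_variance_weights(var_a, var_b, cov_ab)
min_var = portfolio_variance(w_a, w_b, var_a, var_b, cov_ab)
print(f"w_a={w_a:.3f}, w_b={w_b:.3f}, variance={min_var:.4f}")
```

With more than two assets the same idea becomes a quadratic program over a full covariance matrix, which is usually handed to a numerical solver rather than solved by hand.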

Automation via Reinforcement Learning

The dream of reinforcement learning is that it can one day be used to derive automated solutions to real-world tasks, with little-to-no human effort. Unfortunately, in its current state, RL fails to deliver. There have been basically no real-world problems solved by DRL; even on toy problems, the solutions found are often brittle and fail to generalize to new environments. This means that the per-task human effort – i.e. task-specific engineering effort and hyperparameter tuning – is quite high. Algorithms are sample-inefficient, making them expensive in terms of both data collection effort and compute effort, too. Currently RL-based automated solutions compare very unfavorably to alternatives (such as hiring a team of roboticists to engineer a solution, or just not automating at all). But reinforcement learning, especially deep RL, is an exciting research area precisely because of its enormous unfulfilled potential. Improvements in RL directly translate into an improved ability to automate complex, cognitively-demanding tasks, which is where humanity, collectively, currently spends a lot of effort. If we can push reinforcement learning forward far enough, we will be able to take tasks that currently require substantial human effort, and solve them with no human effort: just a little bit of data and a lot of computation. With this in mind, let’s delve a bit more into what it means to automate a task with reinforcement learning. The basic process can be decomposed into two steps: first reduce the problem to RL by writing it as an MDP or POMDP, and then solve for the optimal policy of the MDP or POMDP. The optimal policy then allows us to fully automate the task, completing it any number of times with no further human effort.
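The two steps can be sketched on a toy problem. The three-state MDP below is invented for illustration, and value iteration is just one textbook way to solve a small, fully known MDP; it assumes the transition probabilities are given, which is exactly what real RL problems usually lack.

```python
# Step 1: write the task as an MDP (hypothetical toy example).
# transitions[state][action] = list of (probability, next_state, reward)
transitions = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 0.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(0.8, 2, 1.0), (0.2, 1, 0.0)]},
    2: {"stay": [(1.0, 2, 0.0)], "go": [(1.0, 2, 0.0)]},  # absorbing goal state
}


def q_value(values, outcomes, gamma):
    """Expected discounted return of one action, given current state values."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions, gamma=0.9, tol=1e-8):
    """Step 2: solve for the optimal values, then read off a greedy policy."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            best = max(q_value(values, o, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {
        s: max(actions, key=lambda a: q_value(values, actions[a], gamma))
        for s, actions in transitions.items()
    }
    return values, policy


values, policy = value_iteration(transitions)
print(values)
print(policy)
```

Once `policy` is computed, running the task is just a table lookup per state, which is the "no further human effort" part of the argument.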

AI equal with human experts in medical diagnosis, study finds

Artificial intelligence is on a par with human experts when it comes to making medical diagnoses based on images, a review has found. The potential for artificial intelligence in healthcare has caused excitement, with advocates saying it will ease the strain on resources, free up time for doctor-patient interactions and even aid the development of tailored treatment. Last month the government announced £250m of funding for a new NHS artificial intelligence laboratory.

Feature Selection: Beyond feature importance?

In machine learning, feature selection is the process of choosing the features that are most useful for your prediction. Although it sounds simple, it is one of the most complex problems in creating a new machine learning model. In this post, I will share with you some of the approaches that were researched during the last project I led at Fiverr. You will get some ideas on the basic method I tried and also the more complex approach, which got the best results: removing over 60% of the features while maintaining accuracy and achieving more stability for our model. I’ll also be sharing our improvement to this algorithm.

Debating the AI Safety Debate

As I am moving into the area of AI Safety within the field of artificial intelligence (AI) I find myself both confused and perplexed. Where do you even start? I covered the financial developments in OpenAI yesterday, and they are one of the foremost authorities on AI Safety. As such I thought it would be interesting to look at one of their papers. The paper that I will be looking at is called AI safety via debate published October 2018. You can of course read the article yourself in arXiv, and critique my article in turn; that would be the ideal situation. This debate about AI debates is of course ongoing.

A Human Centered approach to AI

Tools and approaches to help CX Designers and Product Owners approach emergent tech projects.

First impressions about Uber’s Ludwig. A simple machine learning tool. Or not?

Not so long ago, Uber presented a promising tool intended to simplify work with deep learning algorithms. Ludwig is a toolbox that lets you train and test deep learning models without writing code, so we were very interested to check it out. Is it a machine that will take our jobs or a useful tool that we can work with? The interesting part is that Ludwig provides a Python API, so it should be simple to integrate the model with your applications.

Attention, please: forget about Recurrent Neural Networks

Some say translation from one language to another is more an art than a science. Not long ago, Douglas Hofstadter pointed out the ‘shallowness’ of machine translation in an article published in The Atlantic. Despite the limitations, it’s hard to deny that automatic translation software works quite well in many cases, and that the technology behind it has broad applications in any context where information flows from one realm to another, such as RNA-to-protein encoding in genomics. Until 2015, the field of sequence-to-sequence mapping (or translation) was dominated by recurrent neural networks, and in particular by long short-term memory (LSTM) networks. I covered the basics of these architectures in a previous post, where LSTMs are applied to the kinematic reconstruction of the decay of pairs of top quarks at the LHC. Then something new happened: the ResNet architecture and the Attention mechanism were proposed, paving the way towards a more general framework for this kind of task. These novel architectures also solved another problem along the way: because of their intrinsically sequential nature, RNNs are hard to train on parallel hardware such as a GPU. And here’s where convolutional neural networks come in very handy.
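As a rough illustration of why attention parallelizes where an RNN does not, here is a minimal sketch of scaled dot-product attention in plain Python. Choosing this particular variant is an assumption (the blurb only says "the Attention mechanism"), and the toy queries, keys, and values are invented.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors.
    Returns (outputs, weights); each weights row sums to 1."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:  # each query is independent of the others
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights


# Toy inputs: each query points strongly at one of the two keys.
queries = [[5.0, 0.0], [0.0, 5.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0]]
outputs, weights = attention(queries, keys, values)
print(weights)
```

No iteration of the loop over `queries` depends on a previous one, so all positions can be computed at once as matrix products on a GPU, whereas an RNN has to finish step t before it can start step t+1.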

AI, Truth, and Society: Deepfakes at the front of the Technological Cold War

This is the first part of our special feature series on Deepfakes, exploring the latest developments and implications in this nascent field of AI. We will be covering detailed implementations of generation and countering strategies in future articles, so please stay tuned to GradientCrescent to learn more.

Data science effectiveness as a UX problem

We data scientists spend so much of our effort helping you understand your users that… you forget that we are users too. Data scientists are users too. There are many instances where it feels like someone attempted to make a data science tool for data scientists without ever having met a live one. If you take that product development approach, you remind me of bros trying to break into the tampon market without ever consulting a woman. What could possibly go wrong…?

Implementing Adversarial Attacks and Defenses in Keras & Tensorflow 2.0

State-of-the-art image classification is essential for self-driving cars; a single misclassification can lead to the loss of human life. Adversarial attacks are a method of creating imperceptible changes to an image that can cause seemingly robust image classification techniques to misclassify an image consistently. In this article, we will cover the basics of implementing adversarial attacks and how we can defend our models against them.
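To show the basic idea without any deep learning framework, here is a minimal sketch of one common attack, the fast gradient sign method (FGSM), applied to a hand-written logistic regression. Picking FGSM is an assumption (the blurb does not name a specific attack); the weights and input are made up, and this is not the article's Keras/TensorFlow code.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(g):
    return (g > 0) - (g < 0)


def fgsm(weights, bias, x, label, epsilon):
    """Move every feature by +/- epsilon in the direction that increases the
    binary cross-entropy loss. For this model dL/dx = (p - label) * w."""
    p = predict(weights, bias, x)
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model and input; the true label is 1.
weights, bias = [2.0, -3.0, 1.0], 0.0
x, label, epsilon = [0.5, -0.2, 0.1], 1, 0.4
x_adv = fgsm(weights, bias, x, label, epsilon)
print(predict(weights, bias, x), predict(weights, bias, x_adv))
```

For a real image model the gradient would come from automatic differentiation rather than a closed form, epsilon would be much smaller relative to the pixel range, and the perturbed pixels would normally be clipped back to valid values.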