RInside is stressing the CRAN system a little in that it triggers a number of NOTE and WARNING messages. Some of these are par for the course as we get close to R internals, not all of which are ‘officially’ in the API. My continued thanks to the CRAN team for supporting the package. It has (once again!) been nearly two years since the last release, and a number of nice extensions, build robustifications (mostly for Windows) and fixes have been submitted over this period – see below for the three key pull requests. There are no new user-facing changes. The most recent change, and the one triggering the release, was based on an rchk report: the one place where we call Rf_eval() could conceivably have a memory allocation race, so two additional PROTECT calls make it more watertight. The joys of programming with the C API …

How to Get to the Data-Enabled Data Center

Despite their many promising benefits, advancements in artificial intelligence (AI) and deep learning (DL) are creating some of the most challenging workloads in modern computing history and putting significant strain on the underlying I/O, storage, compute and network. Current enterprise and research data center IT infrastructures can’t keep up with the demanding needs of AI and DL. Designed to handle modest workloads, minimal scalability, limited performance needs, and small data volumes, these platforms are highly bottlenecked and lack the fundamental capabilities needed for AI-enabled deployments. An AI-enabled data center must be able to concurrently and efficiently service the entire spectrum of activities involved in the AI and DL process, including data ingest, training, and inference. The IT infrastructure supporting an AI-enabled data center must adapt and scale rapidly, efficiently, and reliably as data volumes grow and application workloads become more intense, complex, and diverse. It must seamlessly and continuously handle transitions between different phases of experimental training and production inference in order to provide more accurate answers, faster. In short, the IT infrastructure is key to realizing the full potential of AI and DL in business and research.

Multi-view Clustering: A Survey

In the big data era, data are generated from different sources or observed from different views. These data are referred to as multi-view data. Unleashing the power of knowledge in multi-view data is very important in big data mining and analysis. This calls for advanced techniques that consider the diversity of different views while fusing these data. Multi-view Clustering (MvC) has attracted increasing attention in recent years by aiming to exploit complementary and consensus information across multiple views. This paper summarizes a large number of multi-view clustering algorithms, provides a taxonomy according to the mechanisms and principles involved, and classifies these algorithms into five categories, namely, co-training style algorithms, multi-kernel learning, multi-view graph clustering, multi-view subspace clustering, and multi-task multi-view clustering. Therein, multi-view graph clustering is further categorized as graph-based, network-based, and spectral-based methods. Multi-view subspace clustering is further divided into subspace learning-based and non-negative matrix factorization-based methods. This paper not only introduces the mechanisms for each category of methods, but also gives a few examples of how these techniques are used. In addition, it lists some publicly available multi-view datasets. Overall, this paper serves as an introductory text and survey for multi-view clustering.

Applications of Distance Correlation to Time Series

The use of empirical characteristic functions for inference problems, including estimation in some special parametric settings and testing for goodness of fit, has a long history dating back to the 1970s (see, for example, Feuerverger and Mureika (1977), Csorgo (1981a, 1981b, 1981c), Feuerverger (1993)). More recently, there has been renewed interest in using empirical characteristic functions in other inference settings. The distance covariance and correlation, developed by Szekely and Rizzo (2009) for measuring dependence and testing independence between two random vectors, are perhaps the best known illustrations of this. We apply these ideas to stationary univariate and multivariate time series to measure lagged auto- and cross-dependence in a time series. Assuming strong mixing, we establish the relevant asymptotic theory for the sample auto- and cross-distance correlation functions. We also apply the auto-distance correlation function (ADCF) to the residuals of an autoregressive process as a test of goodness of fit. Under the null that an autoregressive model is true, the limit distribution of the empirical ADCF can differ markedly from the corresponding one based on an iid sequence. We illustrate the use of the empirical auto- and cross-distance correlation functions for testing dependence and cross-dependence of time series in a variety of different contexts.
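To make the ADCF a little more concrete, here is a minimal sketch for a univariate series. It assumes the standard double-centered distance matrix form of the sample distance correlation and simply applies it to the series and its lagged copy. The function names (`distance_correlation`, `adcf`) are illustrative and not taken from the paper, and the paper's asymptotic theory and test calibration are not reproduced here.

```python
import math


def _centered_distances(x):
    """Pairwise |x_i - x_j| distances, double-centered (row, column, grand mean)."""
    n = len(x)
    d = [[abs(x[i] - x[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in d]  # equals the column means (d is symmetric)
    grand_mean = sum(row_mean) / n
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
            for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)                       # squared distance covariance
    dvar = mean_product(a, a) * mean_product(b, b)   # product of distance variances
    if dvar <= 0.0:
        return 0.0  # at least one sequence is constant
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar))


def adcf(series, max_lag):
    """Empirical auto-distance correlation at lags 0..max_lag (assumed sketch)."""
    n = len(series)
    return [distance_correlation(series[:n - h], series[h:])
            for h in range(max_lag + 1)]


if __name__ == "__main__":
    x = [-2.0, -1.0, 0.0, 1.0, 2.0]
    y = [v * v for v in x]  # nonlinear dependence that Pearson correlation misses
    print(distance_correlation(x, y))
    print(adcf([0.1, 0.5, 0.3, 0.9, 0.7, 1.2, 1.0, 1.6], max_lag=3))
```

The pure-Python loops cost O(n²) per lag, which is fine for illustration but slow for long series.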

11 Myths About Artificial Intelligence and the Edge

1. AI is science fiction.
2. AI must be edge-device-driven or cloud-based.
3. AI at the Edge must be fast.
4. Humans will always beat the best AI.
5. AI is a threat to our privacy.
6. High-speed hardware and cloud technologies are required by all AI systems.
7. AI at the edge is expensive.
8. AI is too power-hungry to deploy at the edge.
9. Systems using AI at the edge are complex to design.
10. Edge computing can be a server.
11. Edge AI operates on high-resolution images.

Story of every other DataScience Enthusiast !!!

There are so many newbies around the world who are eager to test the waters with the ‘Sexiest job of the 21st Century’ but are unaware of the prevailing trends. Having the correct information at the right time will help you prepare better and make an informed decision (or you need to have your own ‘Mike’ 🙂 ). These insights will help other newbies analyse their options and make a good decision about joining the data science wave.

An Empirical Model of Large-Batch Training

In an increasing number of domains it has been demonstrated that deep learning models can be trained using relatively large batch sizes without sacrificing data efficiency. However, the limits of this massive data parallelism seem to differ from domain to domain, ranging from batches of tens of thousands in ImageNet to batches of millions in RL agents that play the game Dota 2. To our knowledge there is limited conceptual understanding of why these limits to batch size differ or how we might choose the correct batch size in a new domain. In this paper, we demonstrate that a simple and easy-to-measure statistic called the gradient noise scale predicts the largest useful batch size across many domains and applications, including a number of supervised learning datasets (MNIST, SVHN, CIFAR-10, ImageNet, Billion Word), reinforcement learning domains (Atari and Dota), and even generative model training (autoencoders on SVHN). We find that the noise scale increases as the loss decreases over a training run and depends on the model size primarily through improved model performance. Our empirically-motivated theory also describes the tradeoff between compute-efficiency and time-efficiency, and provides a rough model of the benefits of adaptive batch-size training.
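As a rough illustration of what such a statistic looks like, the sketch below computes a naive plug-in version of the paper's simplified noise scale, B_simple = tr(Σ) / |G|², where Σ is the per-example gradient covariance and G is the true gradient. Both quantities are estimated here directly from a list of per-example gradients. This plug-in estimator and the function name `simple_noise_scale` are assumptions made for illustration. The paper itself describes a different estimation procedure, and with few examples the squared norm of the mean gradient is a biased estimate of |G|².

```python
def simple_noise_scale(per_example_grads):
    """Naive plug-in estimate of tr(Sigma) / |G|^2 from per-example gradients.

    per_example_grads: list of equal-length lists of floats, one flattened
    gradient vector per training example (illustrative input format).
    """
    n = len(per_example_grads)
    if n < 2:
        raise ValueError("need at least two per-example gradients")
    dim = len(per_example_grads[0])

    # Mean gradient across examples, used as a stand-in for the true gradient G.
    mean = [sum(g[k] for g in per_example_grads) / n for k in range(dim)]

    # Trace of the sample covariance: sum of per-coordinate sample variances.
    trace_cov = sum(
        sum((g[k] - mean[k]) ** 2 for g in per_example_grads) / (n - 1)
        for k in range(dim)
    )

    grad_sq_norm = sum(m * m for m in mean)
    if grad_sq_norm == 0.0:
        raise ValueError("mean gradient is zero; noise scale is undefined")
    return trace_cov / grad_sq_norm


if __name__ == "__main__":
    grads = [[1.0, 0.2], [3.0, -0.2], [2.0, 0.1], [2.5, -0.1]]
    print(simple_noise_scale(grads))
```

Intuitively, a large value means individual gradients are noisy relative to the mean gradient, so averaging over a larger batch still helps. A small value means a small batch already points in nearly the right direction.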

Deep Learning with Satellite Data

While at the University of Sannio in Benevento, Italy this January, my friend Tuomas Oikarinen and I created a (semi-automated) pipeline for downloading publicly available images, and trained a 3-D Convolutional Neural Network on the data. Ultimately, our model achieves a balanced accuracy of around 0.65 on Sentinel-2 optical satellite imagery. I will go into more detail regarding the results (and why this model might actually be useful).

What Are Machine Learning Models Hiding?

Machine learning is eating the world. The abundance of training data has helped ML achieve amazing results for object recognition, natural language processing, predictive analytics, and all manner of other tasks. Much of this training data is very sensitive, including personal photos, search queries, location traces, and health-care records.

AI Safety Needs Social Scientists

We’ve written a paper arguing that long-term AI safety research needs social scientists to ensure AI alignment algorithms succeed when actual humans are involved. Properly aligning advanced AI systems with human values requires resolving many uncertainties related to the psychology of human rationality, emotion, and biases. The aim of this paper is to spark further collaboration between machine learning and social science researchers, and we plan to hire social scientists to work on this full time at OpenAI.

Time series forecasting with deep stacked unidirectional and bidirectional LSTMs

This post assumes the reader has a basic understanding of how LSTMs work. However, you can get a brief introduction to LSTMs here. Also, if you are an absolute beginner to time series forecasting, I recommend you check out this blog. The main objective of this post is to showcase how deep stacked unidirectional and bidirectional LSTMs can be applied to time series data as a Seq-2-Seq based encoder-decoder model. We’ll first introduce the architecture and then look at the code that implements it.

Neural-Symbolic VQN – Disentangled Reasoning – Or – The answer: disentanglement

An explanation of an interpretable deep learning system.

AutoML for predictive modeling

Automating machine learning is a topic of growing importance, as the first results are being used in practice and are bringing significant cost reductions. My talk at the ML Prague conference maps state-of-the-art techniques and open source AutoML frameworks, mostly in the field of predictive modeling.

Deep into End-to-end Neural Coreference Model

In the previous article about the end-to-end neural coreference model, we saw its results and its application to chatbots. Would you like to dig deeper into how the model works? This article will satisfy your curiosity. It contains formulas for more detail, but I have tried to make the theoretical parts of the papers more accessible. Medium does not support superscript, subscript, or LaTeX-like syntax, which causes some inconvenience when reading this article. Before we get to know this model better, several concepts about coreference will help our understanding.

Artificial Intelligence can never be truly intelligent

Let’s say I’m locked in a room and given a large batch of Chinese writing. I don’t know any Chinese, either written or spoken. I can’t even differentiate the writing from other similar scripts, such as Japanese. Now, I receive a second batch of writing, but this time with a set of instructions in English (which I do know) on how to correlate the first batch with the second. I use these English instructions to find patterns and common symbols in the writing. I then receive a third batch of writing, once again with a set of English instructions, which help me correlate it with the first two batches. These instructions also help me frame a response using the same set of symbols and characters. This goes on. After a while I get so good at this game that nobody, just by looking at my responses, can tell whether I’m a native Chinese speaker or not. But does this mean I understand Chinese? Of course not. This brilliant example, called the Chinese room argument, from John R. Searle’s Minds, Brains and Programs (1980)[1], highlights the fundamental flaws in our understanding of Artificial Intelligence. This article is an ode to the idea that the ‘intelligence’ we try to manufacture artificially is not really the ‘intelligence’ that you and I identify intrinsically as a trait of our species. It is but an imitation or an illusion of human intelligence.

Data Scientists Role and Ethics

We need to have that ethical understanding, we need to have that training, and we need to have something akin to a Hippocratic oath. And we need to actually have proper licenses so that if you actually do something unethical, perhaps you have some kind of penalty, or disbarment, or some kind of recourse, something to say this is not what we want to do as an industry, and then figure out ways to remediate people who go off the rails and do things because people just aren’t trained and they don’t know.