Heterogeneous Multi-Task Metric Learning (HMTML)
Distance metric learning (DML) plays a crucial role in diverse machine learning algorithms and applications. When labeled information in the target domain is limited, transfer metric learning (TML) helps to learn the metric by leveraging sufficient information from other related domains. Multi-task metric learning (MTML), which can be regarded as a special case of TML, performs transfer across all related domains. Current TML tools usually assume that all domains share the same feature representation. However, in real-world applications, data may be drawn from heterogeneous domains. Heterogeneous transfer learning approaches can remedy this drawback by deriving a metric from the learned transformation across different domains, but they are often limited to handling only two domains. To appropriately handle multiple domains, we develop a novel heterogeneous multi-task metric learning (HMTML) framework. In HMTML, the metrics of all domains are learned together. The transformations derived from the metrics are used to induce a common subspace, and the high-order covariance among the predictive structures of these domains is maximized in this subspace. A few heterogeneous transfer learning approaches do deal with multiple domains, but they ignore the high-order statistics (correlation information), which can only be exploited by examining all domains simultaneously. Compared with them, the proposed HMTML can effectively explore such high-order information, thus obtaining more reliable feature transformations and metrics. The effectiveness of our method is validated by extensive experiments on text categorization, scene classification, and social image annotation. …
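
The link between a metric and a transformation mentioned in the abstract can be illustrated with a small sketch. It assumes the standard Mahalanobis form, where a transformation U induces the metric M = U Uᵀ, so the learned distance equals the Euclidean distance after projecting with Uᵀ. The matrices `U_text` and `U_image`, the feature vectors, and all function names are hypothetical, and the high-order covariance objective of HMTML is not implemented here.

```python
import math

def project(U, x):
    """Map x (length d) into the common subspace as U^T x, where U is d x r."""
    d, r = len(U), len(U[0])
    return [sum(U[i][k] * x[i] for i in range(d)) for k in range(r)]

def metric_from_transform(U):
    """Return M = U U^T (d x d), the Mahalanobis metric induced by U."""
    d, r = len(U), len(U[0])
    return [[sum(U[i][k] * U[j][k] for k in range(r)) for j in range(d)]
            for i in range(d)]

def mahalanobis(M, x, y):
    """Distance sqrt((x - y)^T M (x - y)) under the metric M."""
    diff = [a - b for a, b in zip(x, y)]
    d = len(diff)
    q = sum(diff[i] * M[i][j] * diff[j] for i in range(d) for j in range(d))
    return math.sqrt(max(q, 0.0))

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Hypothetical transformations for two domains with different feature sizes.
U_text = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]                # 3-dim -> 2-dim
U_image = [[0.5, 0.0], [0.0, 0.5], [1.0, 0.0], [0.0, 1.0]]   # 4-dim -> 2-dim

x1, x2 = [1.0, 0.0, 2.0], [0.0, 1.0, 1.0]
M_text = metric_from_transform(U_text)

# The metric distance matches the Euclidean distance in the projected space.
d_metric = mahalanobis(M_text, x1, x2)
d_projected = euclidean(project(U_text, x1), project(U_text, x2))

# Samples from both domains land in the same 2-dim subspace.
z_text = project(U_text, x1)
z_image = project(U_image, [1.0, 1.0, 1.0, 1.0])
print(d_metric, d_projected, z_text, z_image)
```

Because both projections have the same length, samples from heterogeneous domains become directly comparable, which is the property a common subspace is meant to provide.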

ArrayFire Library
The ArrayFire accelerated computing library is a free, general-purpose, open-source library that simplifies the process of developing software that targets parallel and massively-parallel architectures including CPUs, GPUs, and other hardware acceleration devices. ArrayFire is used on devices ranging from low-powered mobile phones to high-powered GPU-enabled supercomputers. It supports CPUs from all major vendors (Intel, AMD, Arm), GPUs from the dominant manufacturers (NVIDIA, AMD, and Qualcomm), and a variety of other accelerator devices on Windows, Mac, and Linux. …

Deep Probabilistic Programming
Deep probabilistic programming combines deep neural networks (for automatic hierarchical representation learning) with probabilistic models (for principled handling of uncertainty). Unfortunately, it is difficult to write deep probabilistic models, because existing programming frameworks lack concise, high-level, and clean ways to express them. To ease this task, we extend Stan, a popular high-level probabilistic programming language, to use deep neural networks written in PyTorch. Training deep probabilistic models works best with variational inference, so we also extend Stan to support it. We implement these extensions by translating Stan programs to Pyro. Our translation clarifies the relationship between different families of probabilistic programming languages. Overall, our paper is a step towards making deep probabilistic programming easier. …

HellaSwag
Recent work by Zellers et al. (2018) introduced a new task of commonsense natural language inference: given an event description such as ‘A woman sits at a piano,’ a machine must select the most likely followup: ‘She sets her fingers on the keys.’ With the introduction of BERT, near human-level performance was reached. Does this mean that machines can perform human-level commonsense inference? In this paper, we show that commonsense inference still proves difficult for even state-of-the-art models, by presenting HellaSwag, a new challenge dataset. Though its questions are trivial for humans (>95% accuracy), state-of-the-art models struggle (<48%). We achieve this via Adversarial Filtering (AF), a data collection paradigm wherein a series of discriminators iteratively select an adversarial set of machine-generated wrong answers. AF proves to be surprisingly robust. The key insight is to scale up the length and complexity of the dataset examples towards a critical ‘Goldilocks’ zone wherein generated text is ridiculous to humans, yet often misclassified by state-of-the-art models. Our construction of HellaSwag, and its resulting difficulty, sheds light on the inner workings of deep pretrained models. More broadly, it suggests a new path forward for NLP research, in which benchmarks co-evolve with the state of the art in an adversarial way, so as to present ever-harder challenges. …
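
The Adversarial Filtering loop described above can be sketched as a toy. This is a heavily simplified illustration and not the authors' procedure: each ending is reduced to a single hypothetical "realism" feature, the discriminator is a one-dimensional threshold refit every round, and there is no train/test split. All names and numbers are assumptions made for the example.

```python
from statistics import mean

def train_discriminator(real, fake):
    """Fit a toy threshold classifier: values above the threshold count as real."""
    return (mean(real) + mean(fake)) / 2

def adversarial_filter(real, selected, pools, rounds=5):
    """Iteratively swap easily detected wrong endings for harder candidates.

    real:     one feature value per human-written ending
    selected: per example, the wrong endings currently in the dataset
    pools:    per example, unused machine-generated candidates
    """
    selected = [list(s) for s in selected]   # work on copies
    pools = [list(p) for p in pools]
    for _ in range(rounds):
        # Refit the discriminator on the current dataset.
        flat = [f for s in selected for f in s]
        threshold = train_discriminator(real, flat)
        replaced = 0
        for sel, pool in zip(selected, pools):
            for i, feature in enumerate(sel):
                # An ending below the threshold is easily spotted as fake.
                if feature < threshold and pool:
                    hardest = max(pool)
                    if hardest > feature:
                        pool.remove(hardest)
                        sel[i] = hardest     # the easy ending is discarded
                        replaced += 1
        if replaced == 0:                    # nothing left to improve
            break
    return selected

# Hypothetical feature values: higher means "looks more human-written".
real = [0.9, 1.0, 0.95]
initial = [[0.1, 0.2], [0.3, 0.1], [0.2, 0.4]]
pools = [[0.6, 0.8, 0.05], [0.7, 0.5], [0.85, 0.3]]

filtered = adversarial_filter(real, initial, pools)
before = mean(f for s in initial for f in s)
after = mean(f for s in filtered for f in s)
print(filtered, before, after)
```

In this toy, the average feature of the wrong endings moves toward that of the real ones after filtering, mirroring the idea that each round leaves the discriminator with a harder dataset.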