Unicorn: Continual Learning with a Universal, Off-policy Agent

Some real-world domains are best characterized as a single task, but for others this perspective is limiting. Instead, some tasks continually grow in complexity, in tandem with the agent’s competence. In continual learning, also referred to as lifelong learning, there are no explicit task boundaries or curricula. As learning agents have become more powerful, continual learning remains one of the frontiers that has resisted quick progress. To test continual learning capabilities we consider a challenging 3D domain with an implicit sequence of tasks and sparse rewards. We propose a novel agent architecture called Unicorn, which demonstrates strong continual learning and outperforms several baseline agents on the proposed domain. The agent achieves this by jointly representing and learning multiple policies efficiently, using a parallel off-policy learning setup.

Content-Based Citation Recommendation

We present a content-based method for recommending citations in an academic paper draft. We embed a given query document into a vector space, then use its nearest neighbors as candidates, and rerank the candidates using a discriminative model trained to distinguish between observed and unobserved citations. Unlike previous work, our method does not require metadata such as author names, which can be missing, e.g., during the peer review process. Without using metadata, our method outperforms the best reported results on PubMed and DBLP datasets with relative improvements of over 18% in F1@20 and over 22% in MRR. We show empirically that, although adding metadata improves the performance on standard metrics, it favors self-citations which are less useful in a citation recommendation setup. We release an online portal (http://…/citeomatic) for citation recommendation based on our method, and a new dataset OpenCorpus of 7 million research articles to facilitate future research on this task.
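The retrieve-then-rerank pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the embeddings, the `recommend` function, and the lookup-table `toy_reranker` (standing in for the trained discriminative model) are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(query_vec, corpus, rerank_score, n_candidates=3, top_k=2):
    """Stage 1: take the nearest neighbors of the query embedding.
    Stage 2: reorder those candidates with a separate scoring model."""
    by_similarity = sorted(
        corpus, key=lambda doc_id: cosine(query_vec, corpus[doc_id]), reverse=True
    )
    candidates = by_similarity[:n_candidates]
    reranked = sorted(
        candidates, key=lambda doc_id: rerank_score(query_vec, doc_id), reverse=True
    )
    return reranked[:top_k]


# Toy embeddings (hypothetical values, not from the paper).
corpus = {
    "paper_a": [1.0, 0.0],
    "paper_b": [0.9, 0.1],
    "paper_c": [0.0, 1.0],
    "paper_d": [0.7, 0.7],
}

# Stand-in for the trained discriminative reranker: a fixed lookup table.
toy_scores = {"paper_a": 0.2, "paper_b": 0.9, "paper_c": 0.5, "paper_d": 0.6}


def toy_reranker(query_vec, doc_id):
    return toy_scores[doc_id]


result = recommend([1.0, 0.0], corpus, toy_reranker)
print(result)
```

The point of the two stages is that the cheap similarity search narrows the corpus, so the more expensive scoring model only has to look at a handful of candidates.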

Structured Control Nets for Deep Reinforcement Learning

In recent years, Deep Reinforcement Learning has made impressive advances in solving several important benchmark problems for sequential decision making. Many control applications use a generic multilayer perceptron (MLP) for non-vision parts of the policy network. In this work, we propose a new neural network architecture for the policy network representation that is simple yet effective. The proposed Structured Control Net (SCN) splits the generic MLP into two separate sub-modules: a nonlinear control module and a linear control module. Intuitively, the nonlinear control is for forward-looking and global control, while the linear control stabilizes the local dynamics around the residual of global control. We hypothesize that this will bring together the benefits of both linear and nonlinear policies: improve training sample efficiency, final episodic reward, and generalization of learned policy, while requiring a smaller network and being generally applicable to different training methods. We validated our hypothesis with competitive results on simulations from OpenAI MuJoCo, Roboschool, Atari, and a custom 2D urban driving environment, with various ablation and generalization tests, trained with multiple black-box and policy gradient training methods. The proposed architecture has the potential to improve upon broader control tasks by incorporating problem specific priors into the architecture. As a case study, we demonstrate much improved performance for locomotion tasks by emulating the biological central pattern generators (CPGs) as the nonlinear part of the architecture.
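As a rough illustration of the split into a linear and a nonlinear control module, the sketch below computes an action from both parts. It assumes the two module outputs are simply added, which the abstract does not state; the parameter names (`K`, `b`, `W1`, `W2`) and values are hypothetical, and the tiny tanh network only stands in for whatever nonlinear module is actually used.

```python
import math


def linear_module(state, K, b):
    """Linear control term: K @ state + b."""
    return [sum(k * s for k, s in zip(row, state)) + bias for row, bias in zip(K, b)]


def nonlinear_module(state, W1, W2):
    """One-hidden-layer tanh network standing in for the nonlinear control term."""
    hidden = [math.tanh(sum(w * s for w, s in zip(row, state))) for row in W1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in W2]


def scn_action(state, K, b, W1, W2):
    """Assumed combination rule: the two module outputs are added."""
    u_lin = linear_module(state, K, b)
    u_nonlin = nonlinear_module(state, W1, W2)
    return [l + n for l, n in zip(u_lin, u_nonlin)]


# Hypothetical parameters: 2-D state, 1-D action, 2 hidden units.
K = [[0.5, -0.25]]
b = [0.1]
W1 = [[1.0, 0.0], [0.0, 1.0]]
W2 = [[0.0, 0.0]]  # zeroed so the nonlinear term vanishes in this example

action = scn_action([2.0, 4.0], K, b, W1, W2)
print(action)
```

Keeping the two modules separate is what allows the nonlinear part to be swapped for a problem-specific prior, as in the central pattern generator case study mentioned above.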

Eliciting Expertise without Verification

A central question of crowd-sourcing is how to elicit expertise from agents. This is even more difficult when answers cannot be directly verified. A key challenge is that sophisticated agents may strategically withhold effort or information when they believe their payoff will be based upon comparison with other agents whose reports will likely omit this information due to lack of effort or expertise. Our work defines a natural model for this setting based on the assumption that more sophisticated agents know the beliefs of less sophisticated agents. We then provide a mechanism design framework for this setting. From this framework, we design several novel mechanisms, for both the single and multiple question settings, that (1) encourage agents to invest effort and provide their information honestly; (2) output a correct ‘hierarchy’ of the information when agents are rational.

Deep learning algorithm for data-driven simulation of noisy dynamical system

We present a deep learning model, DE-LSTM, for the simulation of a stochastic process with underlying nonlinear dynamics. The deep learning model aims to approximate the probability density function of a stochastic process via numerical discretization, and the underlying nonlinear dynamics is modeled by the Long Short-Term Memory (LSTM) network. After the numerical discretization by a softmax function, the function estimation problem is solved as a multi-label classification problem. A penalized maximum log likelihood method is proposed to impose smoothness in the predicted probability distribution. It is shown that LSTM is a state space model, where the internal dynamics consists of a system of relaxation processes. A sequential Monte Carlo method is outlined to compute the time evolution of the probability distribution. The behavior of DE-LSTM is investigated by using the Ornstein-Uhlenbeck process and noisy observations of the Mackey-Glass equation and forced Van der Pol oscillators. While the probability distribution computed by the conventional maximum log likelihood method makes a good prediction of the first and second moments, the Kullback-Leibler divergence shows that the penalized maximum log likelihood method results in a probability distribution closer to the ground truth. It is shown that DE-LSTM makes a good prediction of the probability distribution without assuming any distributional properties of the noise. For a multiple-step forecast, it is found that the prediction uncertainty, denoted by the 95% confidence interval, does not grow monotonically in time. For a chaotic system, the Mackey-Glass time series, the 95% confidence interval first grows, then exhibits an oscillatory behavior, instead of growing indefinitely, while for the forced Van der Pol oscillator, the prediction uncertainty does not grow in time even for a 3,000-step forecast.
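The idea of turning density estimation into classification over discretized values can be illustrated with a small sketch. It assumes equal-width bins on a fixed interval and a piecewise-constant density; the helper names (`bin_index`, `density_from_logits`) are hypothetical, and the logits would come from the LSTM in the actual model rather than being supplied by hand.

```python
import math


def bin_index(x, lo, hi, n_bins):
    """Class label for a continuous value: index of the equal-width bin containing x."""
    clipped = min(max(x, lo), hi)
    return min(int((clipped - lo) / (hi - lo) * n_bins), n_bins - 1)


def softmax(logits):
    """Numerically stable softmax: probabilities over the bins."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def density_from_logits(logits, lo, hi):
    """Piecewise-constant density estimate: bin probability divided by bin width."""
    width = (hi - lo) / len(logits)
    return [p / width for p in softmax(logits)]


label = bin_index(0.3, 0.0, 1.0, n_bins=4)
density = density_from_logits([0.0, 0.0, 0.0, 0.0], lo=0.0, hi=2.0)
print(label, density)
```

With uniform logits the estimate is the uniform density on the interval, which is a quick sanity check that the bin probabilities integrate to one.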

Learning to Make Predictions on Graphs with Autoencoders

We examine two fundamental tasks associated with graph representation learning: link prediction and semi-supervised node classification. We present a densely connected autoencoder architecture capable of learning a joint representation of both local graph structure and available external node features for the multi-task learning of link prediction and node classification. To the best of our knowledge, this is the first architecture that can be efficiently trained end-to-end in a single learning stage to simultaneously perform link prediction and node classification. We provide comprehensive empirical evaluation of our models on a range of challenging benchmark graph-structured datasets, and demonstrate significant improvement in accuracy over related methods for graph representation learning. Code implementation is available at https://…/graph-representation-learning

BigDataBench: A Dwarf-based Big Data and AI Benchmark Suite

As the architecture, system, data management, and machine learning communities pay greater attention to innovative big data and data-driven artificial intelligence (in short, AI) algorithms, architectures, and systems, the pressure of benchmarking rises. However, the complexity, diversity, frequently changing workloads, and rapid evolution of big data and, especially, AI systems raise great challenges in benchmarking. First, for the sake of conciseness, benchmarking scalability, portability cost, reproducibility, and better interpretation of performance data, we need to understand the abstractions of frequently appearing units of computation, which we call dwarfs, among big data and AI workloads. Second, for the sake of fairness, the benchmarks must include diversity of data and workloads. Third, for co-design of software and hardware, the benchmarks should be consistent across different communities. Rather than creating a new benchmark or proxy for every possible workload, we propose using dwarf-based benchmarks (the combination of eight dwarfs) to represent the diversity of big data and AI workloads. The current version, BigDataBench 4.0, provides 13 representative real-world data sets and 47 big data and AI benchmarks, including seven workload types: online service, offline analytics, graph analytics, AI, data warehouse, NoSQL, and streaming. BigDataBench 4.0 is publicly available from http://…/BigDataBench. Also, for the first time, we comprehensively characterize the benchmarks of seven workload types in BigDataBench 4.0, in addition to traditional benchmarks like SPECCPU, PARSEC and HPCC, in a hierarchical manner and drill down on five levels, using the Top-Down analysis from an architecture perspective.

An efficient $k$-means-type algorithm for clustering datasets with incomplete records

The k-means algorithm is the most popular nonparametric clustering method in use, but cannot generally be applied to data sets with missing observations. The usual practice with such data sets is to either impute the values under an assumption of a missing-at-random mechanism or to ignore the incomplete records, and then to use the desired clustering method. We develop an efficient version of the k-means algorithm that allows for clustering cases where not all the features have observations recorded. Our extension is called k_m-means and reduces to the k-means algorithm when all records are complete. We also provide strategies to initialize our algorithm and to estimate the number of groups in the data set. Illustrations and simulations demonstrate the efficacy of our approach in a variety of settings and patterns of missing data. Our methods are also applied to the clustering of gamma-ray bursts and to the analysis of activation images obtained from a functional Magnetic Resonance Imaging experiment.
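One plausible way to cluster records with missing features is sketched below. It assumes that distances and center updates use only the observed entries of each record, which is an illustrative guess and not necessarily how k_m-means is defined; missing values are marked with `None`, and all function names and data are hypothetical.

```python
def partial_sq_distance(record, center):
    """Squared Euclidean distance using only the observed (non-None) features."""
    return sum((x - c) ** 2 for x, c in zip(record, center) if x is not None)


def assign(records, centers):
    """Assign each record to the center with the smallest partial distance."""
    return [
        min(range(len(centers)), key=lambda j: partial_sq_distance(r, centers[j]))
        for r in records
    ]


def update_centers(records, labels, centers):
    """Recompute each center coordinate as the mean of the observed values in its cluster."""
    new_centers = []
    for j, old in enumerate(centers):
        coords = []
        for d in range(len(old)):
            observed = [
                r[d] for r, lab in zip(records, labels) if lab == j and r[d] is not None
            ]
            # Keep the old coordinate if no member of the cluster observed this feature.
            coords.append(sum(observed) / len(observed) if observed else old[d])
        new_centers.append(coords)
    return new_centers


def cluster_incomplete(records, centers, n_iter=10):
    """Alternate assignment and center updates for a fixed number of iterations."""
    for _ in range(n_iter):
        labels = assign(records, centers)
        centers = update_centers(records, labels, centers)
    return labels, centers


# Toy data with missing entries marked as None (hypothetical values).
records = [[0.0, 0.0], [0.0, None], [None, 1.0], [10.0, 10.0], [None, 9.0], [11.0, None]]
labels, centers = cluster_incomplete(records, centers=[[0.0, 0.0], [10.0, 10.0]])
print(labels, centers)
```

When every record is complete, the partial distance is the ordinary squared Euclidean distance, so this sketch reduces to standard k-means iterations, matching the reduction property stated above.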

Missing Data Reconstruction in Remote Sensing image with a Unified Spatial-Temporal-Spectral Deep Convolutional Neural Network

Because of the internal malfunction of satellite sensors and poor atmospheric conditions such as thick cloud, the acquired remote sensing data often suffer from missing information, i.e., the data usability is greatly reduced. In this paper, a novel method of missing information reconstruction in remote sensing images is proposed. The unified spatial-temporal-spectral framework based on a deep convolutional neural network (STS-CNN) employs a unified deep convolutional neural network combined with spatial-temporal-spectral supplementary information. In addition, to address the fact that most methods can only deal with a single missing information reconstruction task, the proposed approach can solve three typical missing information reconstruction tasks: 1) dead lines in Aqua MODIS band 6; 2) the Landsat ETM+ Scan Line Corrector (SLC)-off problem; and 3) thick cloud removal. It should be noted that the proposed model can use multi-source data (spatial, spectral, and temporal) as the input of the unified framework. The results of both simulated and real-data experiments demonstrate that the proposed model exhibits high effectiveness in the three missing information reconstruction tasks listed above.

Approximate Positively Correlated Distributions and Approximation Algorithms for D-optimal Design

Experimental design is a classical problem in statistics and has also found new applications in machine learning. In the experimental design problem, the aim is to estimate an unknown vector x in m dimensions from linear measurements, where Gaussian noise is introduced in each measurement. The goal is to pick k out of the given n experiments so as to make the most accurate estimate of the unknown parameter x. Given a set S of chosen experiments, the maximum likelihood estimate x’ can be obtained by a least squares computation. One of the robust measures of error estimation is the D-optimality criterion, which aims to minimize the generalized variance of the estimator. This corresponds to minimizing the volume of the standard confidence ellipsoid for the estimation error x-x’. The problem gives rise to two natural variants depending on whether repetitions are allowed or not. The latter variant, while being more general, has also found applications in the geographical location of sensors. In this work, we first show a 1/e-approximation for the D-optimal design problem with and without repetitions, giving us the first constant factor approximation for the problem. We also consider the case when the number of experiments chosen is much larger than the dimension of the measurements and provide an asymptotically optimal approximation algorithm.
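To make the D-optimality objective concrete: minimizing the generalized variance of the least squares estimator is equivalent to maximizing the determinant of the information matrix, the sum of outer products of the chosen measurement vectors. The sketch below evaluates that determinant by brute force on a tiny instance without repetitions. Exhaustive search is used purely for illustration and is not the approximation algorithm from the paper; the measurement vectors are hypothetical.

```python
from itertools import combinations


def information_matrix(vectors):
    """Sum of outer products a a^T for 2-D measurement vectors."""
    m = [[0.0, 0.0], [0.0, 0.0]]
    for a in vectors:
        for i in range(2):
            for j in range(2):
                m[i][j] += a[i] * a[j]
    return m


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def best_design(experiments, k):
    """Exhaustively pick the k experiments (no repetitions) with the largest determinant."""
    return max(
        combinations(range(len(experiments)), k),
        key=lambda subset: det2(information_matrix([experiments[i] for i in subset])),
    )


# Hypothetical measurement vectors in m = 2 dimensions.
experiments = [(1.0, 0.0), (2.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
chosen = best_design(experiments, k=2)
print(chosen)
```

Brute force enumerates every size-k subset, which is only feasible for toy inputs; that cost is what motivates approximation algorithms for this problem.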

Novel Approaches to Accelerating the Convergence Rate of Markov Decision Process for Search Result Diversification

Recently, some studies have utilized the Markov Decision Process for diversifying (MDP-DIV) the search results in information retrieval. Although it delivers promising performance, MDP-DIV suffers from very slow convergence, which hinders its usability in real applications. In this paper, we aim to improve MDP-DIV by speeding up its convergence without sacrificing much accuracy. The slow convergence has two main causes: the large action space and data scarcity. On the one hand, the sequential decision making at each position needs to evaluate the query-document relevance for the entire candidate set, which results in a huge search space for the MDP; on the other hand, due to the data scarcity, the agent has to perform more ‘trial and error’ interactions with the environment. To tackle this problem, we propose the MDP-DIV-kNN and MDP-DIV-NTN methods. The MDP-DIV-kNN method adopts a k nearest neighbor strategy, i.e., discarding the k nearest neighbors of the recently-selected action (document), to reduce the diversification search space. The MDP-DIV-NTN method employs a pre-trained diversification neural tensor network (NTN-DIV) as the evaluation model, and combines the results with MDP to produce the final ranking solution. The experimental results demonstrate that the two proposed methods indeed accelerate the convergence of MDP-DIV, making it 3x faster, while accuracy barely degrades or even improves.
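The k nearest neighbor pruning step can be sketched as follows. In this illustration a greedy pick by relevance score stands in for the learned MDP policy, so it shows only how discarding neighbors of the selected document shrinks the candidate set; the relevance scores, embeddings, and function names are hypothetical.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two document embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def prune_knn(selected, candidates, vectors, k):
    """Drop the k candidates closest to the document that was just selected."""
    nearest = sorted(candidates, key=lambda d: sq_dist(vectors[selected], vectors[d]))[:k]
    return [d for d in candidates if d not in nearest]


def diversified_ranking(relevance, vectors, k, length):
    """Greedy stand-in for the MDP policy: pick the most relevant remaining
    document, then shrink the action space by discarding its k nearest neighbors."""
    candidates = list(relevance)
    ranking = []
    while candidates and len(ranking) < length:
        pick = max(candidates, key=lambda d: relevance[d])
        ranking.append(pick)
        candidates.remove(pick)
        candidates = prune_knn(pick, candidates, vectors, k)
    return ranking


# Hypothetical relevance scores and document embeddings.
relevance = {"d1": 0.9, "d2": 0.85, "d3": 0.6, "d4": 0.5}
vectors = {"d1": [0.0, 0.0], "d2": [0.1, 0.0], "d3": [5.0, 5.0], "d4": [5.1, 5.0]}

ranking = diversified_ranking(relevance, vectors, k=1, length=2)
print(ranking)
```

Here `d2` is nearly a duplicate of `d1`, so it is removed after `d1` is chosen and the second slot goes to a document from a different region of the embedding space.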

Sequence-Aware Recommender Systems

Recommender systems are one of the most successful applications of data mining and machine learning technology in practice. Academic research in the field is historically often based on the matrix completion problem formulation, where for each user-item pair only one interaction (e.g., a rating) is considered. In many application domains, however, multiple user-item interactions of different types can be recorded over time, and a number of recent works have shown that this information can be used to build richer individual user models and to discover additional behavioral patterns that can be leveraged in the recommendation process. In this work we review existing works that consider information from such sequentially-ordered user-item interaction logs in the recommendation process. Based on this review, we propose a categorization of the corresponding recommendation tasks and goals, summarize existing algorithmic solutions, discuss methodological approaches when benchmarking what we call sequence-aware recommender systems, and outline open challenges in the area.

Coloring black boxes: visualization of neural network decisions

Neural networks are commonly regarded as black boxes performing incomprehensible functions. For classification problems, networks provide maps from a high-dimensional feature space to a K-dimensional image space. Images of training vectors are projected on polygon vertices, providing visualization of the network function. Such visualization may show the dynamics of learning, allow for comparison of different networks, display training vectors around which potential problems may arise, show differences due to regularization and optimization procedures, investigate stability of network classification under perturbation of original vectors, and place a new data sample in relation to training data, allowing for estimation of confidence in classification of a given sample. An illustrative example for the three-class Wine data and five-class Satimage data is described. The visualization method proposed here is applicable to any black box system that provides continuous outputs.

Benchmarking Distributed Stream Processing Engines

Over the last years, stream data processing has been gaining attention both in industry and in academia due to its wide range of applications. To fulfill the need for scalable and efficient stream analytics, numerous open source stream data processing systems (SDPSs) have been developed, with high throughput and low latency being their key performance targets. In this paper, we propose a framework to evaluate the performance of three SDPSs, namely Apache Storm, Apache Spark, and Apache Flink. Our evaluation focuses in particular on measuring the throughput and latency of windowed operations. For this benchmark, we design workloads based on real-life, industrial use-cases. The main contribution of this work is threefold. First, we give a definition of latency and throughput for stateful operators. Second, we completely separate the system under test and driver, so that the measurement results are closer to actual system performance under real conditions. Third, we build the first driver to test the actual sustainable performance of a system under test. Our detailed evaluation highlights that there is no single winner, but rather, each system excels in individual use-cases.

Database Aggregation

Knowledge can be represented compactly in a multitude of ways, from a set of propositional formulas, to a Kripke model, to a database. In this paper we study the aggregation of information coming from multiple sources, each source submitting a database modelled as a first-order relational structure. In the presence of an integrity constraint, we identify classes of aggregators that respect it in the aggregated database, provided all individual databases satisfy it. We also characterise languages for first-order queries on which the answer to queries on the aggregated database coincides with the aggregation of the answers to the query obtained on each individual database. This contribution is meant to be a first step on the application of techniques from rational choice theory to knowledge representation in databases.

Loss-aware Weight Quantization of Deep Networks

The huge size of deep networks hinders their use in small computing devices. In this paper, we consider compressing the network by weight quantization. We extend a recently proposed loss-aware weight binarization scheme to ternarization, with possibly different scaling parameters for the positive and negative weights, and m-bit (where m > 2) quantization. Experiments on feedforward and recurrent neural networks show that the proposed scheme outperforms state-of-the-art weight quantization algorithms, and is as accurate as (or even more accurate than) the full-precision network.
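To show what ternarization with different scaling parameters for positive and negative weights looks like, here is a simple threshold-based sketch. It is not the loss-aware scheme from the paper: the fixed `threshold` argument and the choice of each scale as the mean magnitude of the weights on that side are assumptions made only for illustration.

```python
def ternarize(weights, threshold):
    """Map each weight to {-beta, 0, +alpha} with separate scales for each sign."""
    pos = [w for w in weights if w > threshold]
    neg = [-w for w in weights if w < -threshold]
    # Each scale is the mean magnitude of the weights it represents (assumed rule).
    alpha = sum(pos) / len(pos) if pos else 0.0
    beta = sum(neg) / len(neg) if neg else 0.0
    quantized = [
        alpha if w > threshold else (-beta if w < -threshold else 0.0) for w in weights
    ]
    return quantized, alpha, beta


# Hypothetical weight vector.
weights = [0.5, 1.5, -0.25, 0.125, -2.0, -4.0]
quantized, alpha, beta = ternarize(weights, threshold=0.25)
print(quantized, alpha, beta)
```

Because the negative weights in this example are larger in magnitude than the positive ones, a single shared scale would fit both sides poorly, which is the motivation for allowing two.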

Ranking Sentences for Extractive Summarization with Reinforcement Learning

Single document summarization is the task of producing a shorter version of a document while preserving its principal information content. In this paper we conceptualize extractive summarization as a sentence ranking task and propose a novel training algorithm which globally optimizes the ROUGE evaluation metric through a reinforcement learning objective. We use our algorithm to train a neural summarization model on the CNN and DailyMail datasets and demonstrate experimentally that it outperforms state-of-the-art extractive and abstractive systems when evaluated automatically and by humans.

On detecting changes in the jumps of arbitrary size of a time-continuous stochastic process

This paper introduces test and estimation procedures for abrupt and gradual changes in the entire jump behaviour of a discretely observed Ito semimartingale. In contrast to existing work we analyse jumps of arbitrary size which are not restricted to a minimum height. Our methods are based on weak convergence of a truncated sequential empirical distribution function of the jump characteristic of the underlying Ito semimartingale. Critical values for the new tests are obtained by a multiplier bootstrap approach and we investigate the performance of the tests also under local alternatives. An extensive simulation study shows the finite-sample properties of the new procedures.

Learning Optimal Policies from Observational Data

Choosing optimal (or at least better) policies is an important problem in domains from medicine to education to finance and many others. One approach to this problem is through controlled experiments/trials – but controlled experiments are expensive. Hence it is important to choose the best policies on the basis of observational data. This presents two difficult challenges: (i) missing counterfactuals, and (ii) selection bias. This paper presents theoretical bounds on estimation errors of counterfactuals from observational data by making connections to domain adaptation theory. It also presents a principled way of choosing optimal policies using domain adversarial neural networks. We illustrate the effectiveness of domain adversarial training together with various features of our algorithm on a semi-synthetic breast cancer dataset and a supervised UCI dataset (Statlog).

SPLATNet: Sparse Lattice Networks for Point Cloud Processing
Quantum entropy and polarization measurements of the two-photon system
Complete intersection P-partition rings
On the quotient set of the distance set
Conflict and Convention in Dynamic Networks
A Bayesian Mark Interaction Model for Analysis of Tumor Pathology Images
Sleep-deprived Fatigue Pattern Analysis using Large-Scale Selfies from Social Med
High Order Recurrent Neural Networks for Acoustic Modelling
Proportional Volume Sampling and Approximation Algorithms for A-Optimal Design
On Looking for Local Expansion Invariants in Argumentation Semantics
Kemeny’s Function for Markov Chains and Markov Renewal Processes
Diverse Exploration for Fast and Safe Policy Improvement
Deep Multimodal Learning for Emotion Recognition in Spoken Language
Learning Without Mixing: Towards A Sharp Analysis of Linear System Identification
The Rise-Contact involution on Tamari intervals
The left-curtain martingale coupling in the presence of atoms
A Class of Tests for Trend in Time Censored Recurrent Event Data
A Cut-And-Choose Mechanism to Prevent Gerrymandering
The Edge-Isoperimetric Problem on Sierpinski Graphs: Final Resolution
Time Consistent Stopping For The Mean-Standard Deviation Problem — The Discrete Time Case
Weighted cogrowth formula for free groups
Real-Time End-to-End Action Detection with Two-Stream Networks
Budget Constrained Bidding by Model-free Reinforcement Learning in Display Advertising
Do WaveNets Dream of Acoustic Waves?
Reusing Weights in Subword-aware Neural Language Models
Control and Sensing Co-design
Multi-scale Spectrum Sensing in 5G Cognitive Networks
EmotionLines: An Emotion Corpus of Multi-Party Conversations
On Abruptly-Changing and Slowly-Varying Multiarmed Bandit Problems
A role of asymmetry in linear response of globally coupled oscillator systems
Locally Adaptive Learning Loss for Semantic Image Segmentation
Boltzmann transport theory for many body localization
Towards end-to-end spoken language understanding
Bounds on the Zero-Error List-Decoding Capacity of the $q/(q-1)$ Channel
Harnessing Structures in Big Data via Guaranteed Low-Rank Matrix Estimation
The Squared Coefficient of Variation for MMPP is Greater than Unity
Adaptive specular reflection detection and inpainting in colonoscopy video frames
Kernel Recursive ABC: Point Estimation with Intractable Likelihood
Local moment matching: A unified methodology for symmetric functional estimation and distribution estimation under Wasserstein distance
Solving Linear Inverse Problems Using GAN Priors: An Algorithm with Provable Guarantees
Exponentially Consistent Kernel Two-Sample Tests
A Game Problem for Heat Equation
Pontryagin’s maximum principle for optimal control of the nonlocal Cahn-Hilliard-Navier-Stokes systems in two dimensions
Geometric Lower Bounds for Distributed Parameter Estimation under Communication Constraints
Challenges of Growing Social Media Networks From the Bottom-Up Through the Agent Perspective
Exact Sampling of Determinantal Point Processes without Eigendecomposition
Critical Liouville measure as a limit of subcritical measures
Efficient Neural Audio Synthesis
Ratio ergodic theorems: From Hopf to Birkhoff and Kingman
Another Identity for Complete Bell Polynomials based on Ramanujan’s Congruences
Deterministic factoring with oracles
A Matrix Approach for Weighted Argumentation Frameworks: a Preliminary Report
Faithful Semantical Embedding of a Dyadic Deontic Logic in HOL
A Dual Certificates Analysis of Compressive Off-the-Grid Recovery
AEkNN: An AutoEncoder kNN-based classifier with built-in dimensionality reduction
Parameterized verification of synchronization in constrained reconfigurable broadcast networks
Optimized Algorithms to Sample Determinantal Point Processes
Random triangles in random graphs
IPA: Invariant-preserving Applications for Weakly-consistent Replicated Databases
The asymptotic behaviour of convex combinations of firmly nonexpansive mappings
GPU Implementation and Optimization of a Flexible MAP Decoder for Synchronization Correction
Graph polynomials and symmetries
Central Limit theorem for toric Kähler manifolds
Interpretable Charge Predictions for Criminal Cases: Learning to Generate Court Views from Fact Descriptions
The Laplacian spectrum of power graphs of some finite abelian p-groups
Graph Similarity and Approximate Isomorphism
Fast and Sample Near-Optimal Algorithms for Learning Multidimensional Histograms
Closed-form solution to cooperative visual-inertial structure from motion
6D Pose Estimation using an Improved Method based on Point Pair Features
Non-Uniqueness of Stationary Solutions in Extremum Seeking Control
SimCommSys: Taking the errors out of error-correcting code simulations
The Weighted Kendall and High-order Kernels for Permutations
Training wide residual networks for deployment using a single bit for each weight
ZpL: a p-adic precision package
Weighted Double Deep Multiagent Reinforcement Learning in Stochastic Cooperative Environments
Can Neural Networks Understand Logical Entailment?
Continuous-time Markov games with asymmetric information
Computation of optimal transport and related hedging problems via penalization and neural networks
Unsupervised Grammar Induction with Depth-bounded PCFG
Semantic Vector Spaces for Broadening Consideration of Consequences
The Parameterized Hardness of the k-Center Problem in Transportation Networks
Indic Handwritten Script Identification using Offline-Online Multimodal Deep Network
On Streaming Algorithms for the Steiner Cycle and Path Cover Problem on Interval Graphs and Falling Platforms in Video Games
Nonparametric Estimation of a distribution function from doubly truncated data under dependence
Feedback Control of Scalar Conservation Laws with Application to Density Control in Freeways by Means of Variable Speed Limits
Reservoir computing with simple oscillators: Virtual and real networks
Time-Varying Block Codes for Synchronization Errors: MAP Decoder and Practical Issues
Learning Weighted Representations for Generalization Across Designs
Evaluating Scoped Meaning Representations
Distributed Optimal Power Flow using ALADIN
Limiting gaming opportunities on incentive-based demand response programs
Accelerate iterated filtering
Visualizing the Flow of Discourse with a Concept Ontology
Mastery Learning in Practice: A (Mostly) Descriptive Analysis of Log Data from the Cognitive Tutor Algebra I Effectiveness Trial
Simple derivation of the $(- λH)^{5/2}$ tail for the 1D KPZ equation
Variable selection via Group LASSO Approach : Application to the Cox Regression and frailty model
2D Navier-Stokes equation with cylindrical fractional Brownian noise
Empirical Risk Minimization under Fairness Constraints
Conditional infimum and recovery of monotone processes
An Approach to Vehicle Trajectory Prediction Using Automatically Generated Traffic Maps
Fully Asynchronous Push-Sum With Growing Intercommunication Intervals
Network Models for Multiobjective Discrete Optimization
Interactive Image Manipulation with Natural Language Instruction Commands
Comparative Analysis of Unsupervised Algorithms for Breast MRI Lesion Segmentation
Homomorphism Extension
Skew cyclic codes over $F_{p}+uF_{p}+\dots+u^{k-1}F_{p}$
Synchronization Strings: List Decoding for Insertions and Deletions
Modeling goal chances in soccer: a Bayesian inference approach
Learning Latent Permutations with Gumbel-Sinkhorn Networks
Numerical performance of optimized Frolov lattices in tensor product reproducing kernel Sobolev spaces
Double/De-Biased Machine Learning Using Regularized Riesz Representers
On a new conjecture about super-monochromatic factorisations and ultimate periodicity
Langevin Monte Carlo and JKO splitting
The Poset of Mesh Patterns
An Algorithmic Framework to Control Bias in Bandit-based Personalization
A Quantum-Search-Aided Dynamic Programming Framework for Pareto Optimal Routing in Wireless Multihop Networks
Advantages of versatile neural-network decoding for topological codes