We consider the problem of communication over a channel with a causal jamming adversary subject to quadratic constraints. A sender Alice wishes to communicate a message to a receiver Bob by transmitting a real-valued length-$n$ codeword $\mathbf{x}=(x_1,\ldots,x_n)$ through a communication channel. Alice and Bob do not share common randomness. Knowing Alice’s encoding strategy, an adversarial jammer James chooses a real-valued length-$n$ noise sequence $\mathbf{s}=(s_1,\ldots,s_n)$ in a causal manner, i.e., each $s_t$ ($1\le t\le n$) can depend only on $x_1,\ldots,x_t$. Bob receives $\mathbf{y}$, the sum of Alice’s transmission $\mathbf{x}$ and James’s jamming vector $\mathbf{s}$, and is required to reliably estimate Alice’s message from this sum. In addition, Alice’s and James’s transmission powers are restricted by quadratic constraints $P>0$ and $N>0$, respectively. In this work, we characterize the capacity of this channel as the limit superior of the optimal values of a series of optimizations. Upper and lower bounds on the optimal values are provided both analytically and numerically. Interestingly, unlike many communication problems, in this causal setting Alice’s optimal codebook may not have a uniform power allocation; for certain SNRs, a codebook with a two-level uniform power allocation results in a strictly higher rate than a codebook with a uniform power allocation would.
We present a sample path dependent measure of causal influence between two time series. The proposed measure is a random variable whose expected sum is the directed information. A realization of the proposed measure may be used to identify the specific patterns in the data that yield a greater flow of information from one process to another, even in stationary processes. We demonstrate how sequential prediction theory may be leveraged to obtain accurate estimates of the causal measure at each point in time and introduce a notion of regret for assessing the performance of estimators of the measure. We prove a finite sample bound on this regret that is determined by the regret of the sequential predictors used in obtaining the estimate. We estimate the causal measure for a simulated collection of binary Markov processes using a Bayesian updating approach. Finally, given that the measure is a function of time, we demonstrate how estimators of the causal measure may be extended to effectively capture causality in time-varying scenarios.
This paper proposes a novel semiparametric multivariate D-vine time series model (mDvine) that enables the simultaneous copula-based modeling of both temporal and cross-sectional dependence for multivariate time series. To construct the mDvine, we first build a semiparametric univariate D-vine time series model (uDvine) based on a D-vine. The uDvine generalizes the existing first-order copula-based Markov chain models to Markov chains of arbitrary order. Building upon the uDvine, we then construct the mDvine by joining multiple uDvines via another parametric copula. As a simple and tractable model, the mDvine provides flexible models for the marginal behavior of time series and can also generate sophisticated temporal and cross-sectional dependence structures. Probabilistic properties of both the uDvine and mDvine are studied in detail. Furthermore, robust and computationally efficient procedures, including a sequential model selection method and a two-stage MLE, are proposed for model estimation and inference, and their statistical properties are investigated. Numerical experiments are conducted to demonstrate the flexibility of the mDvine, and to examine the performance of the sequential model selection procedure and the two-stage MLE. Real data applications on the Australian electricity price and the Ireland wind speed data demonstrate the superior performance of the mDvine over traditional multivariate time series models.
Learning compressed representations of multivariate time series (MTS) facilitates the analysis and processing of the data in the presence of noise, redundant information, and large numbers of variables and time steps. However, classic dimensionality reduction approaches are not designed to process sequential data, especially in the presence of missing values. In this work, we propose a novel autoencoder architecture based on recurrent neural networks to generate compressed representations of MTS, which may contain missing values and have variable lengths. Our autoencoder learns fixed-length vectorial representations, whose pairwise similarities are aligned with a kernel function that operates in input space and handles missing values. This allows relationships to be preserved in the low-dimensional vector space even in the presence of missing values. To highlight the main features of the proposed autoencoder, we first investigate its performance in controlled experiments. Subsequently, we show how the learned representations can be exploited in several benchmark and real-world classification tasks on medical data. Finally, based on the proposed architecture, we conceive a framework for one-class classification and imputation of missing data in time series extracted from ECG signals.
In pursuit of efficient and scalable data analytics, the insight that ‘one size does not fit all’ has given rise to a plethora of specialized data processing platforms and today’s complex data analytics are moving beyond the limits of a single platform. To cope with these new requirements, we present a cross-platform optimizer that allocates the subtasks of data analytic tasks to the most suitable platforms. Our main contributions are: (i) a mechanism based on graph transformations to explore alternative execution strategies; (ii) a novel graph-based approach to efficiently plan data movement among subtasks and platforms; and (iii) an efficient plan enumeration algorithm, based on a novel enumeration algebra. We extensively evaluate our optimizer under diverse real tasks. The results show that our optimizer is capable of selecting the most efficient platform combination for a given task, freeing data analysts from the need to choose and orchestrate platforms. In particular, our optimizer allows certain tasks to run more than one order of magnitude faster than on state-of-the-art platforms, such as Spark.
Statistical tests can only reject the null hypothesis, never prove it. However, when researchers test modeling assumptions, they often interpret the failure to reject a null of ‘no violation’ as evidence that the assumption holds. We discuss the statistical and conceptual problems with this approach. We show that equivalence/non-inferiority tests, while giving correct Type I error, have low power to rule out many violations that are practically significant. We suggest sensitivity analyses that may be more appropriate than hypothesis testing.
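As a rough illustration of the equivalence-testing idea mentioned above, the following sketch runs two one-sided z-tests (TOST) on a sample mean. It is not the authors' procedure: the function name `tost_z`, the known standard deviation `sigma`, and the equivalence margin `margin` are all assumptions made for this example.

```python
from statistics import NormalDist, mean


def tost_z(sample, sigma, margin, alpha=0.05):
    """Two one-sided z-tests for |mu| < margin, assuming a known sigma (illustrative)."""
    n = len(sample)
    se = sigma / n ** 0.5                      # standard error of the sample mean
    m = mean(sample)
    std_normal = NormalDist()
    # One-sided test of H0: mu <= -margin (reject for large z).
    p_lower = 1 - std_normal.cdf((m + margin) / se)
    # One-sided test of H0: mu >= +margin (reject for small z).
    p_upper = std_normal.cdf((m - margin) / se)
    p_value = max(p_lower, p_upper)            # both one-sided nulls must be rejected
    return p_value, p_value < alpha


# Same observed mean (exactly zero), different sample sizes.
p_small, equivalent_small = tost_z([0.0] * 4, sigma=1.0, margin=0.5)
p_large, equivalent_large = tost_z([0.0] * 100, sigma=1.0, margin=0.5)
```

With the same observed mean, the small sample fails to establish equivalence while the large one succeeds, which is one way to see the power concern raised in the abstract.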
Cooperation is a fundamental social mechanism, whose effects on human performance have been investigated in several environments. Online games are modern-day natural settings in which cooperation strongly affects human behavior. Every day, millions of players connect and play together in team-based games: the patterns of cooperation can either foster or hinder individual skill learning and performance. This work has three goals: (i) identifying teammates’ influence on players’ performance in the short and long term, (ii) designing a computational framework to recommend teammates to improve players’ performance, and (iii) demonstrating that such improvements can be predicted via deep learning. We leverage a large dataset from Dota 2, a popular Multiplayer Online Battle Arena game. We generate a directed co-play network, whose link weights depict the effect of teammates on players’ performance. Specifically, we propose a measure of network influence that captures skill transfer from player to player over time. We then use this framing to design a recommendation system that suggests new teammates based on a modified deep neural autoencoder, and we demonstrate its state-of-the-art recommendation performance. We finally provide insights into skill transfer effects: our experimental results demonstrate that such dynamics can be predicted using deep neural networks.
In many real-world networks, a vertex is usually associated with a transaction database that comprehensively describes the behaviour of the vertex. A typical example is the social network, where the behaviour of every user is depicted by a transaction database that stores the content they post each day. A transaction database is a set of transactions, where a transaction is a set of items. Every path of the network is a sequence of vertices that induces multiple sequences of transactions. The sequences of transactions induced by all of the paths in the network form an extremely large sequence database. Finding frequent sequential patterns in such a sequence database discovers interesting subsequences that frequently appear in many paths of the network. However, it is a challenging task, since the sequence database induced by a database graph is too large to be explicitly induced and stored. In this paper, we propose the novel notion of a database graph, which naturally models a wide spectrum of real-world networks by associating each vertex with a transaction database. Our goal is to find the top-k frequent sequential patterns in the sequence database induced from a database graph. We prove that this problem is #P-hard. To tackle this problem, we propose an efficient two-step sampling algorithm that approximates the top-k frequent sequential patterns with a provable quality guarantee. Extensive experimental results on synthetic and real-world data sets demonstrate the effectiveness and efficiency of our method.
We give optimal sorting algorithms in the evolving data framework, where an algorithm’s input data is changing while the algorithm is executing. In this framework, instead of producing a final output, an algorithm attempts to maintain an output close to the correct output for the current state of the data, repeatedly updating its best estimate of a correct output over time. We show that a simple repeated insertion-sort algorithm can maintain an O(n) Kendall tau distance, with high probability, between a maintained list and an underlying total order of n items in an evolving data model where each comparison is followed by a swap between a random consecutive pair of items in the underlying total order. This result is asymptotically optimal, since there is an Omega(n) lower bound for Kendall tau distance for this problem. Our result closes the gap between this lower bound and the previous best algorithm for this problem, which maintains a Kendall tau distance of O(n log log n) with high probability. It also confirms previous experimental results that suggested that insertion sort tends to perform better than quicksort in practice.
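The evolving data model described above can be simulated in a few lines. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the names `kendall_tau` and `simulate`, the number of passes, and the quadratic-time distance computation are all choices made for this example. Each comparison reads the current true order and is then followed by one swap of a random consecutive pair in that order.

```python
import random


def kendall_tau(maintained, rank):
    """Count item pairs that `maintained` orders differently from the true ranks."""
    n = len(maintained)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if rank[maintained[i]] > rank[maintained[j]]
    )


def simulate(n=64, passes=20, seed=0):
    """Run repeated insertion sort while the underlying order evolves (n >= 2)."""
    rng = random.Random(seed)
    truth = list(range(n))                # truth[p] is the item at true position p
    rng.shuffle(truth)
    rank = {item: p for p, item in enumerate(truth)}
    maintained = list(range(n))
    rng.shuffle(maintained)

    def less(a, b):
        answer = rank[a] < rank[b]        # compare against the current true order
        p = rng.randrange(n - 1)          # then swap a random consecutive true pair
        u, v = truth[p], truth[p + 1]
        truth[p], truth[p + 1] = v, u
        rank[u], rank[v] = p + 1, p
        return answer

    for _ in range(passes):               # repeated insertion-sort passes
        for i in range(1, n):
            j = i
            while j > 0 and less(maintained[j], maintained[j - 1]):
                maintained[j], maintained[j - 1] = maintained[j - 1], maintained[j]
                j -= 1
    return kendall_tau(maintained, rank)


distance = simulate()
```

Calling `simulate` for several values of `n` and comparing the returned distance with `n` gives an informal feel for the linear behaviour claimed in the abstract; a simulation of this kind is not evidence for the high-probability bound itself.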
Online reviews play an important role in influencing buyers’ daily purchase decisions. However, fake and meaningless reviews, which cannot reflect users’ genuine purchase experience and opinions, widely exist on the Web and make it difficult for users to make the right choices. Therefore, it is desirable to build a fair model that evaluates the quality of products by distinguishing spam reviews from genuine ones. We present an end-to-end trainable unified model that leverages the appealing properties of autoencoders and random forests. A stochastic decision tree model is implemented to guide the global parameter learning process. Extensive experiments were conducted on a large Amazon review dataset. The proposed model consistently outperforms a series of compared methods.
Getting a better understanding of user behavior is important for advancing information retrieval systems. Existing work focuses on modeling and predicting single interaction events, such as clicks. In this paper, we focus for the first time on modeling and predicting sequences of interaction events, in particular sequences of clicks. We formulate the problem of click sequence prediction and propose a click sequence model (CSM) that aims to predict the order in which a user will interact with search engine results. CSM is based on a neural network that follows the encoder-decoder architecture. The encoder computes contextual embeddings of the results. The decoder predicts the sequence of positions of the clicked results. It uses an attention mechanism to extract necessary information about the results at each timestep. We optimize the parameters of CSM by maximizing the likelihood of observed click sequences. We test the effectiveness of CSM on three new tasks: (i) predicting click sequences, (ii) predicting the number of clicks, and (iii) predicting whether or not a user will interact with the results in the order these results are presented on a search engine result page (SERP). We also show that CSM achieves state-of-the-art results on a standard click prediction task, where the goal is to predict an unordered set of results a user will click on.
In the last decade, machine-learning-based compilation has moved from an obscure research niche to a mainstream activity. In this article, we describe the relationship between machine learning and compiler optimisation and introduce the main concepts of features, models, training and deployment. We then provide a comprehensive survey and a road map for the wide variety of different research areas. We conclude with a discussion of open issues in the area and potential research directions. This paper provides both an accessible introduction to the fast-moving area of machine-learning-based compilation and a detailed bibliography of its main achievements.
We explore the fundamental limits of heterogeneous distributed detection in an anonymous sensor network with n sensors and a single fusion center. The fusion center collects a single observation from each of the n sensors to detect a binary parameter. The sensors are clustered into multiple groups, and different groups follow different distributions under a given hypothesis. The key challenge for the fusion center is the anonymity of sensors: although it knows the exact number of sensors and the distribution of observations in each group, it does not know which group each sensor belongs to. It is hence natural to treat the problem as a composite hypothesis testing problem. First, we propose an optimal test called the mixture likelihood ratio test, which is a randomized threshold test based on the ratio of the uniform mixture of all the possible distributions under one hypothesis to that under the other hypothesis. Optimality is shown by first arguing that there exists an optimal test that is symmetric, that is, it does not depend on the order of observations across the sensors, and then proving that the mixture likelihood ratio test is optimal among all symmetric tests. Second, we focus on the Neyman-Pearson setting and characterize the error exponent of the worst-case type-II error probability as n tends to infinity, assuming the number of sensors in each group is proportional to n. Finally, we generalize our result to find the collection of all achievable type-I and type-II error exponents, showing that the boundary of the region can be obtained by solving a convex optimization problem. Our results elucidate the price of anonymity in heterogeneous distributed detection. The results are also applied to distributed detection under Byzantine attacks, which hints that the conventional approach based on simple hypothesis testing might be too pessimistic.
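The mixture likelihood ratio described above can be computed by brute force for a handful of sensors. The following sketch is an illustration under stated assumptions rather than the authors' implementation: the function names, the finite observation alphabet, and the representation of each group's distribution as a list of probabilities are hypothetical, and the randomization at the threshold is omitted. It averages the product likelihood over every distinct assignment of sensors to groups, which equals the uniform mixture over all orderings because each distinct assignment arises from the same number of orderings.

```python
from itertools import permutations
from math import prod


def mixture_likelihood(obs, group_pmfs, group_sizes):
    """Average likelihood of `obs` over all assignments of sensors to groups.

    obs[i] is the symbol seen by sensor i, group_pmfs[k][x] is the probability
    that a group-k sensor reports x, and group_sizes[k] is the group-k count.
    """
    labels = [k for k, size in enumerate(group_sizes) for _ in range(size)]
    assignments = set(permutations(labels))    # distinct sensor-to-group labelings
    total = sum(
        prod(group_pmfs[k][x] for k, x in zip(assignment, obs))
        for assignment in assignments
    )
    return total / len(assignments)


def mixture_likelihood_ratio(obs, pmfs_h0, pmfs_h1, group_sizes):
    """Ratio of the mixture likelihood under H1 to that under H0."""
    return (mixture_likelihood(obs, pmfs_h1, group_sizes)
            / mixture_likelihood(obs, pmfs_h0, group_sizes))


# Hypothetical example: two sensors in group 0, one in group 1, binary symbols.
H0 = [[0.8, 0.2], [0.5, 0.5]]
H1 = [[0.2, 0.8], [0.5, 0.5]]
ratio = mixture_likelihood_ratio((1, 1, 0), H0, H1, [2, 1])
```

The enumeration grows factorially with the number of sensors, so this form is only practical for very small networks; it is meant to show why the statistic does not depend on the order of the observations.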