Quadratically Constrained Channels with Causal Adversaries

We consider the problem of communication over a channel with a causal jamming adversary subject to quadratic constraints. A sender Alice wishes to communicate a message to a receiver Bob by transmitting a real-valued length-n codeword \mathbf{x}=x_1,...,x_n through a communication channel. Alice and Bob do not share common randomness. Knowing Alice’s encoding strategy, an adversarial jammer James chooses a real-valued length-n noise sequence \mathbf{s}=s_1,...,s_n in a causal manner, i.e., each s_t (1<=t<=n) can only depend on x_1,...,x_t. Bob receives \mathbf{y}, the sum of Alice’s transmission \mathbf{x} and James’s jamming vector \mathbf{s}, and is required to reliably estimate Alice’s message from this sum. In addition, Alice’s and James’s transmission powers are restricted by quadratic constraints P>0 and N>0. In this work, we characterize the channel capacity for such a channel as the limit superior of the optimal values of a series of optimizations. Upper and lower bounds on the optimal values are provided both analytically and numerically. Interestingly, unlike many communication problems, in this causal setting Alice’s optimal codebook may not have a uniform power allocation: for certain SNRs, a codebook with a two-level uniform power allocation results in a strictly higher rate than a codebook with a uniform power allocation would.
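The channel model described above can be written compactly as follows. This is only a sketch: the abstract does not state how the power constraints are normalised, so the per-block form with n P and n N is an assumption, and the symbol \mathrm{jam}_t for James's causal strategy is a hypothetical name introduced here.

```latex
\begin{align*}
  \mathbf{y} &= \mathbf{x} + \mathbf{s}, \\
  s_t &= \mathrm{jam}_t(x_1,\dots,x_t), \qquad 1 \le t \le n, \\
  \|\mathbf{x}\|_2^2 = \sum_{t=1}^{n} x_t^2 &\le nP, \qquad
  \|\mathbf{s}\|_2^2 = \sum_{t=1}^{n} s_t^2 \le nN .
\end{align*}
```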

A Sample Path Measure of Causal Influence

We present a sample path dependent measure of causal influence between two time series. The proposed measure is a random variable whose expected sum is the directed information. A realization of the proposed measure may be used to identify the specific patterns in the data that yield a greater flow of information from one process to another, even in stationary processes. We demonstrate how sequential prediction theory may be leveraged to obtain accurate estimates of the causal measure at each point in time and introduce a notion of regret for assessing the performance of estimators of the measure. We prove a finite sample bound on this regret that is determined by the regret of the sequential predictors used in obtaining the estimate. We estimate the causal measure for a simulated collection of binary Markov processes using a Bayesian updating approach. Finally, given that the measure is a function of time, we demonstrate how estimators of the causal measure may be extended to effectively capture causality in time-varying scenarios.

Semiparametric multivariate D-vine time series model

This paper proposes a novel semiparametric multivariate D-vine time series model (mDvine) that enables the simultaneous copula-based modeling of both temporal and cross-sectional dependence for multivariate time series. To construct the mDvine, we first build a semiparametric univariate D-vine time series model (uDvine) based on a D-vine. The uDvine generalizes the existing first-order copula-based Markov chain models to Markov chains of arbitrary order. Building upon the uDvine, we then construct the mDvine by joining multiple uDvines via another parametric copula. As a simple and tractable model, the mDvine provides flexible models for marginal behavior of time series and can also generate sophisticated temporal and cross-sectional dependence structures. Probabilistic properties of both the uDvine and mDvine are studied in detail. Furthermore, robust and computationally efficient procedures, including a sequential model selection method and a two-stage MLE, are proposed for model estimation and inference, and their statistical properties are investigated. Numerical experiments are conducted to demonstrate the flexibility of the mDvine, and to examine the performance of the sequential model selection procedure and the two-stage MLE. Real data applications on the Australian electricity price and the Ireland wind speed data demonstrate the superior performance of the mDvine over traditional multivariate time series models.

Learning representations for multivariate time series with missing data using Temporal Kernelized Autoencoders

Learning compressed representations of multivariate time series (MTS) facilitates the analysis and processing of the data in the presence of noise, redundant information, and a large number of variables and time steps. However, classic dimensionality reduction approaches are not designed to process sequential data, especially in the presence of missing values. In this work, we propose a novel autoencoder architecture based on recurrent neural networks to generate compressed representations of MTS, which may contain missing values and have variable lengths. Our autoencoder learns fixed-length vectorial representations, whose pairwise similarities are aligned with a kernel function that operates in input space and handles missing values. This allows relationships to be preserved in the low-dimensional vector space even in the presence of missing values. To highlight the main features of the proposed autoencoder, we first investigate its performance in controlled experiments. Subsequently, we show how the learned representations can be exploited both in several benchmark and real-world classification tasks on medical data. Finally, based on the proposed architecture, we conceive a framework for one-class classification and imputation of missing data in time series extracted from ECG signals.

RHEEMix in the Data Jungle — A Cross-Platform Query Optimizer —

In pursuit of efficient and scalable data analytics, the insight that ‘one size does not fit all’ has given rise to a plethora of specialized data processing platforms and today’s complex data analytics are moving beyond the limits of a single platform. To cope with these new requirements, we present a cross-platform optimizer that allocates the subtasks of data analytic tasks to the most suitable platforms. Our main contributions are: (i) a mechanism based on graph transformations to explore alternative execution strategies; (ii) a novel graph-based approach to efficiently plan data movement among subtasks and platforms; and (iii) an efficient plan enumeration algorithm, based on a novel enumeration algebra. We extensively evaluate our optimizer under diverse real tasks. The results show that our optimizer is capable of selecting the most efficient platform combination for a given task, freeing data analysts from the need to choose and orchestrate platforms. In particular, our optimizer allows certain tasks to run more than one order of magnitude faster than on state-of-the-art platforms, such as Spark.

Seeking evidence of absence: Reconsidering tests of model assumptions

Statistical tests can only reject the null hypothesis, never prove it. However, when researchers test modeling assumptions, they often interpret the failure to reject a null of ‘no violation’ as evidence that the assumption holds. We discuss the statistical and conceptual problems with this approach. We show that equivalence/non-inferiority tests, while giving correct Type I error, have low power to rule out many violations that are practically significant. We suggest sensitivity analyses that may be more appropriate than hypothesis testing.

Deep Neural Networks for Optimal Team Composition

Cooperation is a fundamental social mechanism, whose effects on human performance have been investigated in several environments. Online games are modern-day natural settings in which cooperation strongly affects human behavior. Every day, millions of players connect and play together in team-based games: the patterns of cooperation can either foster or hinder individual skill learning and performance. This work has three goals: (i) identifying teammates’ influence on players’ performance in the short and long term, (ii) designing a computational framework to recommend teammates to improve players’ performance, and (iii) setting out to demonstrate that such improvements can be predicted via deep learning. We leverage a large dataset from Dota 2, a popular Multiplayer Online Battle Arena game. We generate a directed co-play network, whose links’ weights depict the effect of teammates on players’ performance. Specifically, we propose a measure of network influence that captures skill transfer from player to player over time. We then use such framing to design a recommendation system to suggest new teammates based on a modified deep neural autoencoder and we demonstrate its state-of-the-art recommendation performance. We finally provide insights into skill transfer effects: our experimental results demonstrate that such dynamics can be predicted using deep neural networks.

Mining Top-k Sequential Patterns in Database Graphs: A New Challenging Problem and a Sampling-based Approach

In many real world networks, a vertex is usually associated with a transaction database that comprehensively describes the behaviour of the vertex. A typical example is the social network, where the behaviour of every user is depicted by a transaction database that stores his daily posted contents. A transaction database is a set of transactions, where a transaction is a set of items. Every path of the network is a sequence of vertices that induces multiple sequences of transactions. The sequences of transactions induced by all of the paths in the network form an extremely large sequence database. Finding frequent sequential patterns from such a sequence database discovers interesting subsequences that frequently appear in many paths of the network. However, it is a challenging task, since the sequence database induced by a database graph is too large to be explicitly induced and stored. In this paper, we propose the novel notion of database graph, which naturally models a wide spectrum of real world networks by associating each vertex with a transaction database. Our goal is to find the top-k frequent sequential patterns in the sequence database induced from a database graph. We prove that this problem is #P-hard. To tackle this problem, we propose an efficient two-step sampling algorithm that approximates the top-k frequent sequential patterns with a provable quality guarantee. Extensive experimental results on synthetic and real-world data sets demonstrate the effectiveness and efficiency of our method.
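To illustrate why the induced sequence database grows so quickly, here is a minimal Python sketch of a database graph. It assumes one reading of the abstract, namely that a path induces one sequence for every way of picking a single transaction from each vertex on the path; the abstract does not spell this out. The names `databases` and `induced_sequences` are hypothetical and not taken from the paper.

```python
from itertools import product

# Hypothetical toy database graph: each vertex maps to its transaction
# database, i.e. a list of transactions, each transaction a set of items.
databases = {
    "u": [frozenset({"a", "b"}), frozenset({"c"})],
    "v": [frozenset({"a"}), frozenset({"b", "c"}), frozenset({"d"})],
    "w": [frozenset({"a", "d"})],
}


def induced_sequences(path, databases):
    """Enumerate the transaction sequences induced by one path.

    Assumption: a sequence picks exactly one transaction per vertex,
    in path order, so the count is the product of the database sizes.
    """
    return list(product(*(databases[v] for v in path)))


if __name__ == "__main__":
    sequences = induced_sequences(["u", "v", "w"], databases)
    print(len(sequences))  # grows multiplicatively with the path length
```

Under this reading, even a single path contributes a number of sequences equal to the product of its vertices' database sizes, which is consistent with the abstract's point that the full sequence database cannot be materialised.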

Optimally Sorting Evolving Data

We give optimal sorting algorithms in the evolving data framework, where an algorithm’s input data is changing while the algorithm is executing. In this framework, instead of producing a final output, an algorithm attempts to maintain an output close to the correct output for the current state of the data, repeatedly updating its best estimate of a correct output over time. We show that a simple repeated insertion-sort algorithm can maintain an O(n) Kendall tau distance, with high probability, between a maintained list and an underlying total order of n items in an evolving data model where each comparison is followed by a swap between a random consecutive pair of items in the underlying total order. This result is asymptotically optimal, since there is an Omega(n) lower bound for Kendall tau distance for this problem. Our result closes the gap between this lower bound and the previous best algorithm for this problem, which maintains a Kendall tau distance of O(n log log n) with high probability. It also confirms previous experimental results that suggested that insertion sort tends to perform better than quicksort in practice.
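The evolving data model described above can be simulated in a few lines of Python. This is a rough sketch rather than the authors' algorithm or analysis: the function names, the seeded generator, and the choice to run a fixed number of insertion-sort passes are all assumptions made for illustration.

```python
import random


def kendall_tau(maintained, rank):
    """Count pairs that `maintained` orders differently from the true order."""
    n = len(maintained)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if rank[maintained[i]] > rank[maintained[j]]
    )


def simulate(n, passes, seed=0):
    """Run repeated insertion sort while the true order evolves (n >= 2).

    After every comparison, one random adjacent pair of the underlying
    total order is swapped, as in the model described in the abstract.
    Returns the Kendall tau distance after the last pass.
    """
    rng = random.Random(seed)
    truth = list(range(n))  # truth[p] is the item at position p
    rng.shuffle(truth)
    rank = {item: p for p, item in enumerate(truth)}
    maintained = list(range(n))

    def less(a, b):
        answer = rank[a] < rank[b]
        # Evolution step: swap a random consecutive pair in the true order.
        p = rng.randrange(n - 1)
        left, right = truth[p], truth[p + 1]
        truth[p], truth[p + 1] = right, left
        rank[left], rank[right] = p + 1, p
        return answer

    for _ in range(passes):
        for i in range(1, n):
            j = i
            while j > 0 and less(maintained[j], maintained[j - 1]):
                maintained[j], maintained[j - 1] = maintained[j - 1], maintained[j]
                j -= 1
    return kendall_tau(maintained, rank)


if __name__ == "__main__":
    for n in (50, 100, 200):
        print(n, simulate(n, passes=5))
```

Printing the distance for several values of n gives an informal feel for how it scales, although a simulation of this size says nothing rigorous about the O(n) bound.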

Opinion Fraud Detection via Neural Autoencoder Decision Forest

Online reviews play an important role in influencing buyers’ daily purchase decisions. However, fake and meaningless reviews, which cannot reflect users’ genuine purchase experience and opinions, widely exist on the Web and pose great challenges for users to make the right choices. Therefore, it is desirable to build a fair model that evaluates the quality of products by distinguishing spamming reviews. We present an end-to-end trainable unified model to leverage the appealing properties from Autoencoder and random forest. A stochastic decision tree model is implemented to guide the global parameter learning process. Extensive experiments were conducted on a large Amazon review dataset. The proposed model consistently outperforms a series of compared methods.

A Click Sequence Model for Web Search

Getting a better understanding of user behavior is important for advancing information retrieval systems. Existing work focuses on modeling and predicting single interaction events, such as clicks. In this paper, we for the first time focus on modeling and predicting sequences of interaction events, and in particular, sequences of clicks. We formulate the problem of click sequence prediction and propose a click sequence model (CSM) that aims to predict the order in which a user will interact with search engine results. CSM is based on a neural network that follows the encoder-decoder architecture. The encoder computes contextual embeddings of the results. The decoder predicts the sequence of positions of the clicked results. It uses an attention mechanism to extract necessary information about the results at each timestep. We optimize the parameters of CSM by maximizing the likelihood of observed click sequences. We test the effectiveness of CSM on three new tasks: (i) predicting click sequences, (ii) predicting the number of clicks, and (iii) predicting whether or not a user will interact with the results in the order these results are presented on a search engine result page (SERP). Also, we show that CSM achieves state-of-the-art results on a standard click prediction task, where the goal is to predict an unordered set of results a user will click on.

Machine Learning in Compiler Optimisation

In the last decade, machine learning based compilation has moved from an obscure research niche to a mainstream activity. In this article, we describe the relationship between machine learning and compiler optimisation and introduce the main concepts of features, models, training and deployment. We then provide a comprehensive survey and a road map for the wide variety of different research areas. We conclude with a discussion on open issues in the area and potential research directions. This paper provides both an accessible introduction to the fast moving area of machine learning based compilation and a detailed bibliography of its main achievements.

Anonymous Heterogeneous Distributed Detection: Optimal Decision Rules, Error Exponents, and the Price of Anonymity

We explore the fundamental limits of heterogeneous distributed detection in an anonymous sensor network with n sensors and a single fusion center. The fusion center collects the single observation from each of the n sensors to detect a binary parameter. The sensors are clustered into multiple groups, and different groups follow different distributions under a given hypothesis. The key challenge for the fusion center is the anonymity of sensors — although it knows the exact number of sensors and the distribution of observations in each group, it does not know which group each sensor belongs to. It is hence natural to consider it as a composite hypothesis testing problem. First, we propose an optimal test called mixture likelihood ratio test, which is a randomized threshold test based on the ratio of the uniform mixture of all the possible distributions under one hypothesis to that under the other hypothesis. Optimality is shown by first arguing that there exists an optimal test that is symmetric, that is, it does not depend on the order of observations across the sensors, and then proving that the mixture likelihood ratio test is optimal among all symmetric tests. Second, we focus on the Neyman-Pearson setting and characterize the error exponent of the worst-case type-II error probability as n tends to infinity, assuming the number of sensors in each group is proportional to n. Finally, we generalize our result to find the collection of all achievable type-I and type-II error exponents, showing that the boundary of the region can be obtained by solving a convex optimization problem. Our results elucidate the price of anonymity in heterogeneous distributed detection. The results are also applied to distributed detection under Byzantine attacks, which hints that the conventional approach based on simple hypothesis testing might be too pessimistic.
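The statistic behind the mixture likelihood ratio test can be sketched for a toy network by brute-force enumeration. The code below is an illustrative assumption, not the paper's construction: it uses discrete observations, averages the product likelihood over every distinct assignment of sensors to groups, and leaves out the randomised tie-breaking at the threshold. The names `mixture_likelihood`, `pmfs_h0` and `pmfs_h1` and the example distributions are hypothetical.

```python
from itertools import permutations
from math import prod


def mixture_likelihood(observations, group_labels, pmfs):
    """Uniform mixture over all distinct sensor-to-group assignments.

    `group_labels` lists one label per sensor (so group sizes are known),
    and `pmfs[label][x]` is the probability of observing x in that group.
    Enumeration is factorial in the number of sensors: toy sizes only.
    """
    assignments = set(permutations(group_labels))
    total = sum(
        prod(pmfs[g][x] for g, x in zip(assignment, observations))
        for assignment in assignments
    )
    return total / len(assignments)


def mixture_likelihood_ratio(observations, group_labels, pmfs_h0, pmfs_h1):
    """Ratio of the mixture under H1 to the mixture under H0.

    Assumes the H0 mixture is strictly positive for the given observations.
    """
    return mixture_likelihood(observations, group_labels, pmfs_h1) / (
        mixture_likelihood(observations, group_labels, pmfs_h0)
    )


if __name__ == "__main__":
    # Hypothetical example: two sensors in group "a", one in group "b".
    labels = ["a", "a", "b"]
    pmfs_h0 = {"a": {0: 0.9, 1: 0.1}, "b": {0: 0.6, 1: 0.4}}
    pmfs_h1 = {"a": {0: 0.2, 1: 0.8}, "b": {0: 0.5, 1: 0.5}}
    print(mixture_likelihood_ratio((1, 0, 1), labels, pmfs_h0, pmfs_h1))
```

Because the statistic averages over all assignments, it is unchanged when the observations are reordered, which matches the symmetry property the abstract relies on.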

On the Fairness of Wi-Fi and LTE-LAA Coexistence
The interacting 2D Bose gas and nonlinear Gibbs measures
Efficient Shortest Paths in Scale-Free Networks with Underlying Hyperbolic Geometry
Necessary and Sufficient Budgets in Information Source Finding with Querying: Adaptivity Gap
Stolarsky’s invariance principle for projective spaces
A Mixed Classification-Regression Framework for 3D Pose Estimation from 2D Images
Post-Specialisation: Retrofitting Vectors of Words Unseen in Lexical Resources
On the Cauchy problem for stochastic integrodifferential parabolic equations in the scale of Lp-spaces of generalized smoothness
A Systematic Approach to Incremental Redundancy over Erasure Channels
Spatial shrinkage via the product independent Gaussian process prior
Towards blockchain-based robonomics: autonomous agents behavior validation
Multimodal Hierarchical Reinforcement Learning Policy for Task-Oriented Visual Dialog
Tiling with punctured intervals
Analysis of Relaxation Time in Random Walk with Jumps
Matroid fragility and relaxations of circuit hyperplanes
QGLBT for polytopes
Sparse Blind Deconvolution for Distributed Radar Autofocus Imaging
Continuous-time integral dynamics for monotone aggregative games with coupling constraints
Delay and Peak-Age Violation Probability in Short-Packet Transmissions
Optimal Linear Instrumental Variables Approximations
Tracking the Orientation and Axes Lengths of an Elliptical Extended Object
Fully Automated Segmentation of Hyperreflective Foci in Optical Coherence Tomography Images
Capturing Edge Attributes via Network Embedding
A convergent hierarchy of non-linear eigenproblems to compute the joint spectral radius of nonnegative matrices
Fused Density Estimation: Theory and Methods
Improved training of end-to-end attention models for speech recognition
Highly Scalable Image Reconstruction using Deep Neural Networks with Bandpass Filtering
Bayesian parameter identification in Cahn-Hilliard models for biological growth
The Effectiveness of Instance Normalization: a Strong Baseline for Single Image Dehazing
Investor Reaction to Financial Disclosures Across Topics: An Application of Latent Dirichlet Allocation
Vecchia approximations of Gaussian-process predictions
Inverse point source location with the Helmholtz equation on a bounded domain
Extremal properties of the extended skew-normal distribution
Subsampling Sequential Monte Carlo for Static Bayesian Models
Multivariate Spatial-Temporal Variable Selection with Applications to Seasonal Tropical Cyclone Modeling
Mining and Forecasting Career Trajectories of Music Artists
Reliability Estimation for Networks with Minimal Flow Demand and Random Link Capacities
Network Enhancement: a general method to denoise weighted biological networks
wubi2en: Character-level Chinese-English Translation through ASCII Encoding
$P$-Matchings in Graphs: A Brief Survey with Some Open Problems
Perfect Domination in Knights Graphs
Multi-scale metrics and self-organizing maps: a computational approach to the structure of sensory maps
Optimal Achievable Rates for Computation With Random Homologous Codes
Attention-Aware Compositional Network for Person Re-identification
Wisdom in Sum of Parts: Multi-Platform Activity Prediction in Social Collaborative Sites
Learning Heterogeneous Knowledge Base Embeddings for Explainable Recommendation
Nonparametric Estimation of Conditional Expectation with Auxiliary Information and Dimension Reduction
SPG-Net: Segmentation Prediction and Guidance Network for Image Inpainting
Efficient Distributed Computation of MIS and Generalized MIS in Linear Hypergraphs
A Collision-Free Path Planning Algorithm for Unmanned Aerial Vehicle Delivery
Reward Estimation for Variance Reduction in Deep Reinforcement Learning
Anchor Cascade for Efficient Face Detection
A Symbolic Approach to Explaining Bayesian Network Classifiers
Combinatorics of certain abelian Lie group arrangements and chromatic quasi-polynomials
Learning Word Embeddings for Low-resource Languages by PU Learning
Dispersion Bound for the Wyner-Ahlswede-Körner Network via Reverse Hypercontractivity on Types
PSGAN: A Generative Adversarial Network for Remote Sensing Image Pan-Sharpening
Interpretable Proximate Factors for Large Dimensions
Computer-aided mechanism design: designing revenue-optimal mechanisms via neural networks
New Techniques for Preserving Global Structure and Denoising with Low Information Loss in Single-Image Super-Resolution
Edit Probability for Scene Text Recognition
Interactive Proofs with Polynomial-Time Quantum Prover for Computing the Order of Solvable Groups
Communication-Efficient Byzantine Agreement without Erasures
VLSI Architecture of Compact Non-RLL Beacon-based Visible Light Communication Transmitter and Receiver
Cross Domain Regularization for Neural Ranking Models Using Adversarial Learning
Characterizing and decomposing classes of threshold, split, and bipartite graphs via 1-Sperner hypergraphs
Normality and Gap Phenomena in Optimal Unbounded Control
N-BaIoT: Network-based Detection of IoT Botnet Attacks Using Deep Autoencoders
Rainbow triangles in arc-colored tournaments
Inhomogeneous percolation on ladder graphs
Deep Directional Statistics: Pose Estimation with Uncertainty Quantification
Ergodicity for Neutral Type SDEs with Infinite Length of Memory
Spatial Poisson Processes for Fatigue Crack Initiation
Decoding Decoders: Finding Optimal Representation Spaces for Unsupervised Similarity Tasks
Exact Lexicographic Scheduling and Approximate Rescheduling
Robust Classification with Convolutional Prototype Learning
Controlling the privacy loss with the input feature maps of the layers in convolutional neural networks
Non-polyhedral extensions of the Frank-and-Wolfe theorem
Object Tracking with Correlation Filters using Selective Single Background Patch
On the $α$-spectral radius of graphs
Dealing with Categorical and Integer-valued Variables in Bayesian Optimization with Gaussian Processes
Pushing the limits of optical information storage using deep learning
Skeap & Leap: Scalable Distributed Priority Queues for constant and arbitrary Priorities
Tight bounds for undirected graph exploration with pebbles and multiple agents
Full 3D Reconstruction of Transparent Objects
Analysis of Hard-Thresholding for Distributed Compressed Sensing with One-Bit Measurements
Joint Action Unit localisation and intensity estimation through heatmap regression
Parameterized circuit complexity of model checking first-order logic on sparse structures
Assessing Security and Performances of Consensus algorithms for Permissioned Blockchains
Deterministically Maintaining a $(2+ε)$-Approximate Minimum Vertex Cover in $O(1/ε^2)$ Amortized Update Time
Discrete Scaling Based on Operator Theory
Diffusion Based Network Embedding
A median-type condition for graph tiling
Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding
Exact explosive synchronization transitions in Kuramoto oscillators with time-delayed coupling
Deep 2.5D Vehicle Classification with Sparse SfM Depth Prior for Automated Toll Systems
FlowFields++: Accurate Optical Flow Correspondences Meet Robust Interpolation
Thermodynamic Properties of Molecular Communication
Solving Sudoku with Ant Colony Optimisation
A Unified Framework of Deep Neural Networks by Capsules
On Visual Hallmarks of Robustness to Adversarial Malware
The $ν$-Tamari lattice as the rotation lattice of $ν$-trees
Stochastic Modelling of Urban Structure
Anisotropic scaling limits of long-range dependent linear random fields on ${\mathbb {Z}}^3$
Minimum Segmentation for Pan-genomic Founder Reconstruction in Optimal Time
Concentration inequalities for randomly permuted sums
Loyalty Programs in the Sharing Economy: Optimality and Competition
Policy Optimization with Second-Order Advantage Information
Distributionally robust optimization with polynomial densities: theory, models and algorithms
Secure Mobile Edge Computing in IoT via Collaborative Online Learning
From synaptic interactions to collective dynamics in random neuronal networks models: critical role of eigenvectors and transient behavior
Phase retrieval for Fourier Ptychography under varying amount of measurements
Layered Optical Flow Estimation Using a Deep Neural Network with a Soft Mask
Using Machine Learning to Assess the Risk of and Prevent Water Main Breaks
Brussels Sprouts, Noncrossing Trees, and Parking Functions
Fundamental Limits of Decentralized Caching in Fog-RANs with Wireless Fronthaul
A Reinforced Topic-Aware Convolutional Sequence-to-Sequence Model for Abstractive Text Summarization
On the Limitations of Unsupervised Bilingual Dictionary Induction
Presburger Arithmetic with algebraic scalar multiplications
Multicast Networks Solvable over Every Finite Field
On three soft rectangle packing problems with guillotine constraints