Distilled-Exposition Enhanced Matching Network (DEMN)
This paper proposes a Distilled-Exposition Enhanced Matching Network (DEMN) for story-cloze test, which is still a challenging task in story comprehension. We divide a complete story into three narrative segments: an \textit{exposition}, a \textit{climax}, and an \textit{ending}. The model consists of three modules: input module, matching module, and distillation module. The input module provides semantic representations for the three segments and then feeds them into the other two modules. The matching module collects interaction features between the ending and the climax. The distillation module distills the crucial semantic information in the exposition and infuses it into the matching module in two different ways. We evaluate our single and ensemble model on ROCStories Corpus \cite{Mostafazadeh2016ACA}, achieving an accuracy of 80.1\% and 81.2\% on the test set respectively. The experimental results demonstrate that our DEMN model achieves a state-of-the-art performance. …

Truncated Variance Reduction (TruVaR)
We present a new algorithm, truncated variance reduction (TruVaR), that treats Bayesian optimization (BO) and level-set estimation (LSE) with Gaussian processes in a unified fashion. The algorithm greedily shrinks a sum of truncated variances within a set of potential maximizers (BO) or unclassified points (LSE), which is updated based on confidence bounds. TruVaR is effective in several important settings that are typically non-trivial to incorporate into myopic algorithms, including pointwise costs and heteroscedastic noise. We provide a general theoretical guarantee for TruVaR covering these aspects, and use it to recover and strengthen existing results on BO and LSE. Moreover, we provide a new result for a setting where one can select from a number of noise levels having associated costs. We demonstrate the effectiveness of the algorithm on both synthetic and real-world data sets. …

Proportional Degree
Several algorithms have been proposed to filter information on a complete graph of correlations across stocks to build a stock-correlation network. Among them the planar maximally filtered graph (PMFG) algorithm uses $3n-6$ edges to build a graph whose features include a high frequency of small cliques and a good clustering of stocks. We propose a new algorithm which we call proportional degree (PD) to filter information on the complete graph of normalised mutual information (NMI) across stocks. Our results show that the PD algorithm produces a network showing better homogeneity with respect to cliques, as compared to economic sectoral classification than its PMFG counterpart. We also show that the partition of the PD network obtained through normalised spectral clustering (NSC) agrees better with the NSC of the complete graph than the corresponding one obtained from PMFG. Finally, we show that the clusters in the PD network are more robust with respect to the removal of random sets of edges than those in the PMFG network. …

Non-Stationary Streaming PCA
We consider the problem of streaming principal component analysis (PCA) when the observations are noisy and generated in a non-stationary environment. Given $T$, $p$-dimensional noisy observations sampled from a non-stationary variant of the spiked covariance model, our goal is to construct the best linear $k$-dimensional subspace of the terminal observations. We study the effect of non-stationarity by establishing a lower bound on the number of samples and the corresponding recovery error obtained by any algorithm. We establish the convergence behaviour of the noisy power method using a novel proof technique which maybe of independent interest. We conclude that the recovery guarantee of the noisy power method matches the fundamental limit, thereby generalizing existing results on streaming PCA to a non-stationary setting. …