NEXTSUM google
Existing approaches to automatic summarization assume that a length limit for the summary is given, and view content selection as an optimization problem to maximize informativeness and minimize redundancy within this budget. This framework ignores the fact that human-written summaries have rich internal structure which can be exploited to train a summarization system. We present NEXTSUM, a novel approach to summarization based on a model that predicts the next sentence to include in the summary using not only the source article, but also the summary produced so far. We show that such a model successfully captures summary-specific discourse moves, and leads to better content selection performance, in addition to automatically predicting how long the target summary should be. We perform experiments on the New York Times Annotated Corpus of summaries, where NEXTSUM outperforms lead and content-model summarization baselines by significant margins. We also show that the lengths of summaries produced by our system correlates with the lengths of the human-written gold standards. …

Rand-Interleaving google
Ranking functions return ranked lists of items, and users often interact with these items. How to evaluate ranking functions using historical interaction logs, also known as off-policy evaluation, is an important but challenging problem. The commonly used Inverse Propensity Scores (IPS) approaches work better for the single item case, but suffer from extremely low data efficiency for the ranked list case. In this paper, we study how to improve the data efficiency of IPS approaches in the offline comparison setting. We propose two approaches Trunc-match and Rand-interleaving for offline comparison using uniformly randomized data. We show that these methods can improve the data efficiency and also the comparison sensitivity based on one of the largest email search engines. …

SPIKE google
Maintaining web-services is a mission-critical task. Any downtime of web-based services means loss of revenue. Worse, such downtimes can damage the reputation of an organization as a reliable service provider (and in the current competitive web services market, such a loss of reputation causes extensive loss of future revenue). To address this issue, we developed SPIKE, a data mining tool which can predict upcoming service breakdowns, half an hour into the future. Such predictions let an organization alert and assemble the tiger team to address the problem (e.g. by reconfiguring cloud hardware in order to reduce the likelihood of that breakdown). SPIKE utilizes (a) regression tree learning (with CART); (b) synthetic minority over-sampling (to handle how rare spikes are in our data); (c) hyperparameter optimization (to learn best settings for our local data) and (d) a technique we called ‘topology sampling’ where training vectors are built from extensive details of an individual node plus summary details on all their neighbors. In the experiments reported here, SPIKE predicted service spikes 30 minutes into future with recalls and precision of 75% and above. Also, SPIKE performed relatively better than other widely-used learning methods (neural nets, random forests, logistic regression). …

Known-class Aware Adaptation (KAA) google
Existing domain adaptation methods generally assume different domains have the identical label space, which is quite restrict for real-world applications. In this paper, we focus on a more realistic and challenging case of open set domain adaptation. Particularly, in open set domain adaptation, we allow the classes from the source and target domains to be partially overlapped. In this case, the assumption of conventional distribution alignment does not hold anymore, due to the different label spaces in two domains. To tackle this challenge, we propose a new approach coined as Known-class Aware Self-Ensemble (KASE), which is built upon the recently developed self-ensemble model. In KASE, we first introduce a Known-class Aware Recognition (KAR) module to identify the known and unknown classes from the target domain, which is achieved by encouraging a low cross-entropy for known classes and a high entropy based on the source data from the unknown class. Then, we develop a Known-class Aware Adaptation (KAA) module to better adapt from the source domain to the target by reweighing the adaptation loss based on the likeliness to belong to known classes of unlabeled target samples as predicted by KAR. Extensive experiments on multiple benchmark datasets demonstrate the effectiveness of our approach. …