Shift Compensation Network (SCN)
Citizen science projects are successful at gathering rich datasets for various applications. Nevertheless, the data collected by citizen scientists are often biased, aligned more with the citizens' preferences than with scientific objectives. We propose the Shift Compensation Network (SCN), an end-to-end learning scheme that learns the shift from the scientific objectives to the biased data and compensates for it by re-weighting the training data. Applied to bird observation data from the citizen science project eBird, we demonstrate how SCN quantifies the data distribution shift and outperforms supervised learning models that do not address the data bias. Compared with other competing models in the context of covariate shift, we further demonstrate the advantage of SCN in both effectiveness and the capability to handle massive high-dimensional data. …
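The re-weighting the abstract describes is a form of importance weighting for covariate shift. Below is a minimal PyTorch sketch of that general idea, not of SCN itself; all names are illustrative. It assumes a logistic discriminator has been trained to distinguish samples representative of the scientific objective from the biased citizen-collected samples, so its odds ratio approximates the density ratio used as a per-example weight.

```python
import torch
import torch.nn as nn

# Illustrative sketch of covariate-shift compensation by re-weighting.
# Assumes `discriminator` outputs the logit of P(target-distribution | x),
# so the density ratio w(x) = p_target(x) / p_biased(x) is its odds ratio
# (up to a class-prior constant, omitted here for simplicity).

def importance_weights(discriminator, x):
    """w(x) = g(x) / (1 - g(x)) = exp(logit) for a logistic discriminator g."""
    logits = discriminator(x)               # shape: (batch, 1)
    return torch.exp(logits).squeeze(1)     # sigmoid odds equal exp(logit)

def weighted_loss(model, discriminator, x, y):
    """Per-example cross-entropy, re-weighted to compensate the data bias."""
    with torch.no_grad():
        w = importance_weights(discriminator, x)
        w = w / w.mean()                    # normalize weights to mean 1
    per_example = nn.functional.cross_entropy(model(x), y, reduction="none")
    return (w * per_example).mean()
```

Examples over-represented in the biased data receive weights below 1 and under-represented ones weights above 1, so the weighted loss approximates the expected loss under the scientific objective.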
Markov-Modulated Linear Regression
Classical linear regression is considered for the case where the regression parameters depend on an external random environment, described as a continuous-time Markov chain with a finite state space. The expected sojourn times in the various states enter the model as additional regressors. The formulas needed to estimate the regression parameters are derived, and a numerical example illustrates the results obtained. …
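As a rough numerical illustration of the setup (not the paper's estimator; all names and values are made up), the sketch below computes the expected sojourn times of a two-state continuous-time Markov chain from its generator, uses them as regressors, and fits the coefficients by ordinary least squares. It assumes each observation starts in a known state.

```python
import numpy as np
from scipy.linalg import expm

# The expected sojourn time in state j over [0, T], starting from state i, is
#     m_ij(T) = integral_0^T [exp(Q s)]_ij ds,
# and the response is modelled as y = sum_j beta_j * m_ij(T) + noise.

def expected_sojourn_times(Q, T, n_steps=1000):
    """Approximate m(T) = int_0^T exp(Q s) ds entrywise (trapezoidal rule)."""
    grid = np.linspace(0.0, T, n_steps + 1)
    P = np.array([expm(Q * s) for s in grid])        # P(s) = exp(Q s)
    return ((P[:-1] + P[1:]) / 2).sum(axis=0) * (T / n_steps)

# Hypothetical two-state environment; every observation starts in state 0.
Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])
T_obs = np.array([1.0, 2.0, 3.0, 5.0])               # observation horizons
X = np.vstack([expected_sojourn_times(Q, T)[0] for T in T_obs])

beta_true = np.array([0.5, 2.0])                     # synthetic ground truth
rng = np.random.default_rng(0)
y = X @ beta_true + 0.01 * rng.standard_normal(len(T_obs))

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)     # least-squares estimate
print(beta_hat)                                      # close to beta_true
```

Note that each row of the sojourn-time matrix sums to the horizon T, since the chain must be in some state at every instant; that gives a quick sanity check on the integration.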
BERTSUM
BERT, a pre-trained Transformer model, has achieved ground-breaking performance on multiple NLP tasks. In this paper, we describe BERTSUM, a simple variant of BERT, for extractive summarization. Our system is the state of the art on the CNN/Dailymail dataset, outperforming the previous best-performing system by 1.65 on ROUGE-L. The code to reproduce our results is available at https://…/BertSum …
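As a rough illustration of the extractive idea BERTSUM builds on (not the authors' code, whose link above is truncated), the sketch below prepends a [CLS] token to every sentence so BERT yields one vector per sentence, then scores each vector with a linear layer. It omits BERTSUM's interval segment embeddings and trained summarization layers; the scorer here is untrained and purely illustrative.

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
scorer = torch.nn.Linear(bert.config.hidden_size, 1)  # untrained, illustrative

sentences = ["The cat sat on the mat.", "It was sunny.", "Cats like mats."]
# One [CLS] per sentence, so BERT emits one summary vector per sentence.
text = " ".join(f"[CLS] {s} [SEP]" for s in sentences)
inputs = tokenizer(text, return_tensors="pt", add_special_tokens=False)

with torch.no_grad():
    hidden = bert(**inputs).last_hidden_state.squeeze(0)  # (seq_len, hidden)

cls_id = tokenizer.cls_token_id
cls_positions = (inputs["input_ids"].squeeze(0) == cls_id).nonzero().squeeze(1)
scores = scorer(hidden[cls_positions]).squeeze(1)      # one score per sentence
top = scores.argsort(descending=True)[:2]              # keep the top sentences
print([sentences[i] for i in sorted(top.tolist())])    # extractive summary
```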
Attentive Neural Process
Neural Processes (NPs) (Garnelo et al., 2018a;b) approach regression by learning to map a context set of observed input-output pairs to a distribution over regression functions. Each function models the distribution of the output given an input, conditioned on the context. NPs have the benefit of fitting observed data efficiently, with linear complexity in the number of context input-output pairs, and can learn a wide family of conditional distributions: they learn predictive distributions conditioned on context sets of arbitrary size. Nonetheless, we show that NPs suffer from a fundamental drawback of underfitting, giving inaccurate predictions at the inputs of the observed data they condition on. We address this issue by incorporating attention into NPs, allowing each input location to attend to the relevant context points for the prediction. We show that this greatly improves the accuracy of predictions, results in noticeably faster training, and expands the range of functions that can be modelled. …
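A minimal PyTorch sketch of the key change, with illustrative names and dimensions: instead of averaging the per-pair context representations into a single vector, each target input cross-attends over the context, so a prediction near an observed point is dominated by that point's representation, which is what counteracts the underfitting.

```python
import torch
import torch.nn as nn

class CrossAttentionAggregator(nn.Module):
    """Target inputs query the context instead of using a mean-pooled summary."""
    def __init__(self, x_dim, r_dim, n_heads=4):
        super().__init__()
        self.q_proj = nn.Linear(x_dim, r_dim)   # target inputs -> queries
        self.k_proj = nn.Linear(x_dim, r_dim)   # context inputs -> keys
        self.attn = nn.MultiheadAttention(r_dim, n_heads, batch_first=True)

    def forward(self, x_context, r_context, x_target):
        # r_context: per-pair representations from the NP encoder (B, N, r_dim)
        q = self.q_proj(x_target)               # (B, M, r_dim)
        k = self.k_proj(x_context)              # (B, N, r_dim)
        out, _ = self.attn(q, k, r_context)     # values are the representations
        return out                              # (B, M, r_dim) -> NP decoder

# Usage: one representation per target point, not one shared average.
agg = CrossAttentionAggregator(x_dim=1, r_dim=64)
xc, rc, xt = torch.randn(8, 10, 1), torch.randn(8, 10, 64), torch.randn(8, 15, 1)
print(agg(xc, rc, xt).shape)                    # torch.Size([8, 15, 64])
```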