Fairness-aware Generative Adversarial Network (FairGAN) google
Fairness-aware learning is increasingly important in data mining. Discrimination prevention aims to prevent discrimination in the training data before it is used to conduct predictive analysis. In this paper, we focus on fair data generation that ensures the generated data is discrimination free. Inspired by generative adversarial networks (GAN), we present fairness-aware generative adversarial networks, called FairGAN, which are able to learn a generator producing fair data and also preserving good data utility. Compared with the naive fair data generation models, FairGAN further ensures the classifiers which are trained on generated data can achieve fair classification on real data. Experiments on a real dataset show the effectiveness of FairGAN. …

Bi-Directional Long Short Term Memory Network (BLSTM) google
Most existing methods for biomedical entity recognition task rely on explicit feature engineering where many features either are specific to a particular task or depends on output of other existing NLP tools. Neural architectures have been shown across various domains that efforts for explicit feature design can be reduced. In this work we propose an unified framework using bi-directional long short term memory network (BLSTM) for named entity recognition (NER) tasks in biomedical and clinical domains. Three important characteristics of the framework are as follows – (1) model learns contextual as well as morphological features using two different BLSTM in hierarchy, (2) model uses first order linear conditional random field (CRF) in its output layer in cascade of BLSTM to infer label or tag sequence, (3) model does not use any domain specific features or dictionary, i.e., in another words, same set of features are used in the three NER tasks, namely, disease name recognition (Disease NER), drug name recognition (Drug NER) and clinical entity recognition (Clinical NER). We compare performance of the proposed model with existing state-of-the-art models on the standard benchmark datasets of the three tasks. We show empirically that the proposed framework outperforms all existing models. Further our analysis of CRF layer and word-embedding obtained using character based embedding show their importance. …

FactChecker google
We present a novel natural language query interface, the FactChecker, aimed at text summaries of relational data sets. The tool focuses on natural language claims that translate into an SQL query and a claimed query result. Similar in spirit to a spell checker, the FactChecker marks up text passages that seem to be inconsistent with the actual data. At the heart of the system is a probabilistic model that reasons about the input document in a holistic fashion. Based on claim keywords and the document structure, it maps each text claim to a probability distribution over associated query translations. By efficiently executing tens to hundreds of thousands of candidate translations for a typical input document, the system maps text claims to correctness probabilities. This process becomes practical via a specialized processing backend, avoiding redundant work via query merging and result caching. Verification is an interactive process in which users are shown tentative results, enabling them to take corrective actions if necessary. Our system was tested on a set of 53 public articles containing 392 claims. Our test cases include articles from major newspapers, summaries of survey results, and Wikipedia articles. Our tool revealed erroneous claims in roughly a third of test cases. A detailed user study shows that users using our tool are in average six times faster at checking text summaries, compared to generic SQL interfaces. In fully automated verification, our tool achieves significantly higher recall and precision than baselines from the areas of natural language query interfaces and fact checking. …

Advertisements