iFair google
People are rated and ranked, towards algorithmic decision making in an increasing number of applications, typically based on machine learning. Research on how to incorporate fairness into such tasks has prevalently pursued the paradigm of group fairness: ensuring that each ethnic or social group receives its fair share in the outcome of classifiers and rankings. In contrast, the alternative paradigm of individual fairness has received relatively little attention. This paper introduces a method for probabilistically clustering user records into a low-rank representation that captures individual fairness yet also achieves high accuracy in classification and regression models. Our notion of individual fairness requires that users who are similar in all task-relevant attributes such as job qualification, and disregarding all potentially discriminating attributes such as gender, should have similar outcomes. Since the case for fairness is ubiquitous across many tasks, we aim to learn general representations that can be applied to arbitrary downstream use-cases. We demonstrate the versatility of our method by applying it to classification and learning-to-rank tasks on two real-world datasets. Our experiments show substantial improvements over the best prior work for this setting. …

Interrupted Time Series (ITS) google
Interrupted time series analysis, sometimes known as quasi-experimental time series analysis, is an approach for the analysis of a single time series of data known or conjectured to be affected by interventions (controlled external influences). Interrupted time series design is the design of experiments based on the interrupted time series approach.
Program Evaluation: Interrupted Time Series in R


Representer Values google
We propose to explain the predictions of a deep neural network, by pointing to the set of what we call representer points in the training set, for a given test point prediction. Specifically, we show that we can decompose the pre-activation prediction of a neural network into a linear combination of activations of training points, with the weights corresponding to what we call representer values, which thus capture the importance of that training point on the learned parameters of the network. But it provides a deeper understanding of the network than simply training point influence: with positive representer values corresponding to excitatory training points, and negative values corresponding to inhibitory points, which as we show provides considerably more insight. Our method is also much more scalable, allowing for real-time feedback in a manner not feasible with influence functions. …

Persistent Homology for Generative Model (PHom-GeM) google
Generative neural network models, including Generative Adversarial Network (GAN) and Auto-Encoders (AE), are among the most popular neural network models to generate adversarial data. The GAN model is composed of a generator that produces synthetic data and of a discriminator that discriminates between the generator’s output and the true data. AE consist of an encoder which maps the model distribution to a latent manifold and of a decoder which maps the latent manifold to a reconstructed distribution. However, generative models are known to provoke chaotically scattered reconstructed distribution during their training, and consequently, incomplete generated adversarial distributions. Current distance measures fail to address this problem because they are not able to acknowledge the shape of the data manifold, i.e. its topological features, and the scale at which the manifold should be analyzed. We propose Persistent Homology for Generative Models, PHom-GeM, a new methodology to assess and measure the distribution of a generative model. PHom-GeM minimizes an objective function between the true and the reconstructed distributions and uses persistent homology, the study of the topological features of a space at different spatial resolutions, to compare the nature of the true and the generated distributions. Our experiments underline the potential of persistent homology for Wasserstein GAN in comparison to Wasserstein AE and Variational AE. The experiments are conducted on a real-world data set particularly challenging for traditional distance measures and generative neural network models. PHom-GeM is the first methodology to propose a topological distance measure, the bottleneck distance, for generative models used to compare adversarial samples in the context of credit card transactions. …