Modeling User Exposure in Recommendation

Collaborative filtering analyzes user preferences for items (e.g., books, movies, restaurants, academic papers) by exploiting the similarity patterns across users. In implicit feedback settings, all the items, including the ones that a user did not consume, are taken into consideration. But this assumption does not accord with the common-sense understanding that users have a limited scope and awareness of items. For example, a user might not have heard of a certain paper, or might live too far away from a restaurant to experience it. In the language of causal analysis, the assignment mechanism (i.e., the items that a user is exposed to) is a latent variable that may change for various user/item combinations. In this paper, we propose a new probabilistic approach that directly incorporates user exposure to items into collaborative filtering. The exposure is modeled as a latent variable and the model infers its value from data. In doing so, we recover one of the most successful state-of-the-art approaches as a special case of our model, and provide a plug-in method for conditioning exposure on various forms of exposure covariates (e.g., topics in text, venue locations). We show that our scalable inference algorithm outperforms existing benchmarks in four different domains, both with and without exposure covariates.
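The idea of treating exposure as a latent variable inferred from data can be illustrated with a small sketch. The abstract does not spell out the generative model, so everything below is an assumption made for illustration: a Bernoulli exposure indicator with prior `mu`, a Gaussian click model around the factor score `theta_u . beta_i` for exposed items, and a point mass at zero for unexposed items. The function names are hypothetical. Under those assumptions, the posterior probability that a user was exposed to an item they did not click follows from Bayes' rule.

```python
import math


def gaussian_pdf(x, mean, precision):
    """Density of N(mean, 1/precision) evaluated at x."""
    coeff = math.sqrt(precision / (2.0 * math.pi))
    return coeff * math.exp(-0.5 * precision * (x - mean) ** 2)


def exposure_posterior(mu, theta_u, beta_i, precision=1.0):
    """P(exposed | item not clicked) for one user/item pair.

    Assumed generative story (an illustration, not taken from the abstract):
      a ~ Bernoulli(mu)                           exposure indicator
      y | a = 1 ~ N(theta_u . beta_i, 1/precision)
      y | a = 0 is exactly 0                      unexposed items are never clicked
    """
    score = sum(t * b for t, b in zip(theta_u, beta_i))
    # Weight of "exposed but chose not to click" versus "never exposed".
    exposed_unclicked = mu * gaussian_pdf(0.0, score, precision)
    return exposed_unclicked / (exposed_unclicked + (1.0 - mu))


# A user whose factors match the item poorly is plausibly exposed but
# uninterested; a strong match with no click points to non-exposure.
weak_match = exposure_posterior(0.5, [0.0, 0.0], [1.0, 1.0])
strong_match = exposure_posterior(0.5, [1.5, 1.5], [1.0, 1.0])
```

In this sketch an unclicked item with a high predicted score is explained mostly by non-exposure, which is the intuition behind not treating every unclicked item as a dislike.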

Confusing Deep Convolution Networks by Relabelling

Deep convolutional neural networks have become the gold standard for image recognition tasks, demonstrating many current state-of-the-art results and even achieving near-human-level performance on some tasks. Despite this, it has been shown that they can be fooled into misclassifying previously correctly classified natural images and into giving erroneous high-confidence classifications to nonsense synthetic images. In this paper we extend that work by presenting a straightforward way to perturb an image so that it acquires any other label from within the dataset while the perturbed image remains visually indistinguishable from the original.
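To make the notion of a targeted perturbation concrete, here is a toy sketch. It is not the paper's procedure, which the abstract does not describe. It assumes a hypothetical linear softmax classifier with weight matrix `W` in place of a deep network, and it uses small signed-gradient steps on the cross-entropy loss toward a chosen target label. All names and values are illustrative.

```python
import math


def logits(W, x):
    """Class scores of a linear classifier: one dot product per row of W."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def predict(W, x):
    z = logits(W, x)
    return max(range(len(z)), key=z.__getitem__)


def targeted_perturbation(W, x, target, step=0.01, max_iters=1000):
    """Nudge x in small signed-gradient steps until `target` is predicted."""
    x_adv = list(x)
    for _ in range(max_iters):
        if predict(W, x_adv) == target:
            break
        p = softmax(logits(W, x_adv))
        # Gradient of -log p[target] w.r.t. the input of a linear softmax
        # model: sum over classes k of (p_k - 1[k == target]) * W[k].
        grad = [
            sum((p[k] - (1.0 if k == target else 0.0)) * W[k][j]
                for k in range(len(W)))
            for j in range(len(x_adv))
        ]
        # Move each coordinate by a fixed small amount against the gradient.
        x_adv = [xi - step * ((g > 0) - (g < 0)) for xi, g in zip(x_adv, grad)]
    return x_adv


W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # hypothetical 3-class model
x = [1.0, 0.2]                              # originally assigned class 0
x_adv = targeted_perturbation(W, x, target=1)
```

In two dimensions the required change is not small relative to the input. The premise of the abstract is that in high-dimensional image space many tiny per-pixel changes can add up to a label flip.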

mgm: Structure Estimation for Mixed Graphical Models in high-dimensional Data

We present the R package mgm for the estimation of mixed graphical models underlying multivariate probability distributions over variables with different domains in high-dimensional data. Our method goes beyond existing methods in that it is the first general method to combine variables with categorical, count-measure and continuous domains while modeling all variables on their proper domain, which avoids possible loss of information due to transformations. In addition to presenting the estimation function, we provide a function to sample from pairwise mixed distributions and apply our method to a medical dataset.

Cascaded High Dimensional Histograms: An Approach to Interpretable Density Estimation for Categorical Data

We consider the problem of interpretable density estimation for high dimensional categorical data. In one or two dimensions, we would naturally consider histograms (bar charts) for simple density estimation problems. However, histograms do not scale to higher dimensions in an interpretable way, and one cannot usually visualize a high dimensional histogram. This work presents an alternative to the histogram for higher dimensions that can be directly visualized. These density models are in the form of a cascaded set of conditions (a tree structure), where each node in the tree is estimated to have constant density. We present two algorithms for this task, where the first one allows the user to specify the number of desired leaves in the tree as a Bayesian prior. The second algorithm allows the user to specify the desired number of branches within the prior. Our results indicate that the new approach yields sparser trees than other approaches that achieve similar test performance.
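The model form described above, a cascade of conditions whose leaves carry constant density, can be sketched as a small data structure. This is only an illustration of evaluating such a tree, not of the paper's two fitting algorithms or their priors. The `Node` class, the feature names, and the masses below are hypothetical. For categorical data, a leaf's density is taken here to be its probability mass divided by the number of category combinations it covers.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class Node:
    # Internal nodes: go left when x[feature] is in `values`, else right.
    feature: Optional[str] = None
    values: Optional[Set[str]] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    # Leaves: total probability mass and number of cells covered.
    mass: float = 0.0
    cells: int = 1


def density(node: Node, x: Dict[str, str]) -> float:
    """Follow the cascade of conditions to a leaf and return its constant density."""
    while node.feature is not None:
        node = node.left if x[node.feature] in node.values else node.right
    return node.mass / node.cells


# Hypothetical domain: color in {red, green, blue}, size in {S, L}, so 6 cells.
tree = Node(
    feature="color", values={"red"},
    left=Node(mass=0.5, cells=2),      # red, any size
    right=Node(
        feature="size", values={"S"},
        left=Node(mass=0.3, cells=2),  # green or blue, size S
        right=Node(mass=0.2, cells=2), # green or blue, size L
    ),
)

# Summing the density over every cell should recover total mass 1.
total = sum(
    density(tree, {"color": c, "size": s})
    for c in ("red", "green", "blue")
    for s in ("S", "L")
)
```

Each leaf can be read as a rule such as "color is red", which is what makes the tree visualizable where a six-cell (or much larger) histogram would not be.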

Liquidity, risk measures, and concentration of measure

Law invariant risk measures and information divergences

Maximal $k$-Edge-Colorable Subgraphs, Vizing’s Theorem, and Tuza’s Conjecture

Bayesian updating and model class selection with Subset Simulation

Sobolev and SBV Representation Theorems for large volume limit Gibbs measures

Simplified vine copula models: Approximations based on the simplifying assumption

Opacity Proof for CaPR+ Algorithm

On a conjecture of Mohar concerning Kempe equivalence of regular graphs

Scaling Exponents for Ordered Maxima

The Bismut-Elworthy-Li formula for mean-field stochastic differential equations

Large deviations for spatially extended random neural networks

Uniform Asymptotics for Compound Poisson Processes with Regularly Varying Jumps and Vanishing Drift

Parrondo games with two-dimensional spatial dependence

Quantile Cross-Spectral Measures of Dependence between Economic Variables

Using MapReduce for Large-scale Medical Image Analysis

A Note on Altermatic Number

On the complexity of switching linear regression

A probabilistic interpretation of the parametrix method

A New Method for Partial Correction of Residual Confounding in Time-Series and other Observational Studies

Completely regular codes with different parameters and the same distance-regular coset graphs

Nonconvex Nonsmooth Low-Rank Minimization via Iteratively Reweighted Nuclear Norm

On Computing the Galois Lattice of Bipartite Distance Hereditary Graphs

Simple and Efficient Reliable Broadcast in the Presence of Byzantine Processes

A diffusion process associated with Fréchet means

General linear-fractional branching processes with discrete time

Correlation structure and variable selection in generalized estimating equations via composite likelihood information criteria

Forbidden Subgraph Characterization of Quasi-line Graphs

Extrema of locally stationary Gaussian fields on growing manifolds

A conditional randomization test to account for covariate imbalance in randomized experiments

On Generalized Hadamard Matrices and Difference Matrices: $Z_6$

Learning in the Rational Speech Acts Model

Avalanches in Tip-Driven Interfaces in Random Media

Cross-sectional Markov model for trend analysis of observed discrete distributions of population characteristics

Freshman or Fresher? Quantifying the Geographic Variation of Internet Language

Models for generalized spherical and related distributions

Random-Cluster Dynamics in $\mathbb{Z}^2$