Linear k-Junta
We study the problem of testing if a function depends on a small number of linear directions of its input data. We call a function $f$ a \emph{linear $k$-junta} if it is completely determined by some $k$-dimensional subspace of the input space. In this paper, we study the problem of testing whether a given $n$ variable function $f : \mathbb{R}^n \to$, is a linear $k$-junta or $\epsilon$-far from all linear $k$-juntas, where the closeness is measured with respect to the Gaussian measure on $\mathbb{R}^n$. This problems is a common generalization of (i) The combinatorial problem of junta testing on the hypercube which tests whether a Boolean function is dependent on at most $k$ of its variables and (ii) Geometric testing problems such as testing if a function is an intersection of $k$ halfspaces. We prove the existence of a $\mathsf{poly}(k \cdot s/\epsilon)$-query non-adaptive tester for linear $k$-juntas with surface area at most $s$. The polynomial dependence on $s$ is necessary as we provide a $\mathsf{poly}(s)$ lower bound on the query complexity of any non-adaptive test for linear juntas. Moreover, we show that if the function is a linear $k$-junta with surface area at most $s$, then there is a $(s \cdot k)^{O(k)}$-query non-adaptive algorithm to learn the function \emph{up to a rotation of the basis}. {We also use this procedure to obtain a non-adaptive tester (with the same query complexity) for subclasses of linear $k$-juntas closed under rotation.} …

Hyperspherical Prototype Network (HPN)
This paper introduces hyperspherical prototype networks, which unify regression and classification by prototypes on hyperspherical output spaces. Rather than defining prototypes as the mean output vector over training examples per class, we propose hyperspheres as output spaces to define class prototypes a priori with large margin separation. By doing so, we do not require any prototype updating, we can handle any training size, and the output dimensionality is no longer constrained to the number of classes. Furthermore, hyperspherical prototype networks generalize to regression, by optimizing outputs as an interpolation between two prototypes on the hypersphere. Since both tasks are now defined by the same loss function, they can be jointly optimized for multi-task problems. Experimental evaluation shows the benefits of hyperspherical prototype networks for classification, regression, and their combination. …

Deep Diffeomorphic Normalizing Flow (DDNF)
The Normalizing Flow (NF) models a general probability density by estimating an invertible transformation applied on samples drawn from a known distribution. We introduce a new type of NF, called Deep Diffeomorphic Normalizing Flow (DDNF). A diffeomorphic flow is an invertible function where both the function and its inverse are smooth. We construct the flow using an ordinary differential equation (ODE) governed by a time-varying smooth vector field. We use a neural network to parametrize the smooth vector field and a recursive neural network (RNN) for approximating the solution of the ODE. Each cell in the RNN is a residual network implementing one Euler integration step. The architecture of our flow enables efficient likelihood evaluation, straightforward flow inversion, and results in highly flexible density estimation. An end-to-end trained DDNF achieves competitive results with state-of-the-art methods on a suite of density estimation and variational inference tasks. Finally, our method brings concepts from Riemannian geometry that, we believe, can open a new research direction for neural density estimation. …

GANsfer Learning
Medical imaging is a domain which suffers from a paucity of manually annotated data for the training of learning algorithms. Manually delineating pathological regions at a pixel level is a time consuming process, especially in 3D images, and often requires the time of a trained expert. As a result, supervised machine learning solutions must make do with small amounts of labelled data, despite there often being additional unlabelled data available. Whilst of less value than labelled images, these unlabelled images can contain potentially useful information. In this paper we propose combining both labelled and unlabelled data within a GAN framework, before using the resulting network to produce images for use when training a segmentation network. We explore the task of deep grey matter multi-class segmentation in an AD dataset and show that the proposed method leads to a significant improvement in segmentation results, particularly in cases where the amount of labelled data is restricted. We show that this improvement is largely driven by a greater ability to segment the structures known to be the most affected by AD, thereby demonstrating the benefits of exposing the system to more examples of pathological anatomical variation. We also show how a shift in domain of the training data from young and healthy towards older and more pathological examples leads to better segmentations of the latter cases, and that this leads to a significant improvement in the ability for the computed segmentations to stratify cases of AD. …