Semantically Aligned Bias Reducing (SABR) google
Zero shot learning (ZSL) aims to recognize unseen classes by exploiting semantic relationships between seen and unseen classes. Two major problems faced by ZSL algorithms are the hubness problem and the bias towards the seen classes. Existing ZSL methods focus on only one of these problems in the conventional and generalized ZSL setting. In this work, we propose a novel approach, Semantically Aligned Bias Reducing (SABR) ZSL, which focuses on solving both the problems. It overcomes the hubness problem by learning a latent space that preserves the semantic relationship between the labels while encoding the discriminating information about the classes. Further, we also propose ways to reduce the bias of the seen classes through a simple cross-validation process in the inductive setting and a novel weak transfer constraint in the transductive setting. Extensive experiments on three benchmark datasets suggest that the proposed model significantly outperforms existing state-of-the-art algorithms by ~1.5-9% in the conventional ZSL setting and by ~2-14% in the generalized ZSL for both the inductive and transductive settings. …

Very Sparse Tucker Factorization (VeST) google
Given a large tensor, how can we decompose it to sparse core tensor and factor matrices such that it is easier to interpret the results? How can we do this without reducing the accuracy? Existing approaches either output dense results or give low accuracy. In this paper, we propose VeST, a tensor factorization method for partially observable data to output a very sparse core tensor and factor matrices. VeST performs initial decomposition, determines unimportant entries in the decomposition results, removes the unimportant entries, and carefully updates the remaining entries. To determine unimportant entries, we define and use entry-wise ‘responsibility’ for the decomposed results. The entries are updated iteratively in a coordinate descent manner in parallel for scalable computation. Extensive experiments show that our method VeST is at least 2.2 times more sparse and at least 2.8 times more accurate compared to competitors. Moreover, VeST is scalable in terms of input order, dimension, and the number of observable entries. Thanks to VeST, we successfully interpret the result of real-world tensor data based on the sparsity pattern of the resulting factor matrices. …

L1-Regularized Maximum Likelihood Estimator google
We consider the problem of estimating the parameters of a multivariate Bernoulli process with auto-regressive feedback in the high-dimensional setting where the number of samples available is much less than the number of parameters. This problem arises in learning interconnections of networks of dynamical systems with spiking or binary-valued data. We allow the process to depend on its past up to a lag $p$, for a general $p \ge 1$, allowing for more realistic modeling in many applications. We propose and analyze an $\ell_1$-regularized maximum likelihood estimator (MLE) under the assumption that the parameter tensor is approximately sparse. Rigorous analysis of such estimators is made challenging by the dependent and non-Gaussian nature of the process as well as the presence of the nonlinearities and multi-level feedback. We derive precise upper bounds on the mean-squared estimation error in terms of the number of samples, dimensions of the process, the lag $p$ and other key statistical properties of the model. The ideas presented can be used in the high-dimensional analysis of regularized $M$-estimators for other sparse nonlinear and non-Gaussian processes with long-range dependence. …

Alpha-Pooling google
Convolutional neural networks (CNNs) have achieved remarkable performance in many applications, especially image recognition. As a crucial component of CNNs, sub-sampling plays an important role, and max pooling and arithmetic average pooling are commonly used sub-sampling methods. In addition to the two pooling methods, however, there could be many other pooling types, such as geometric average, harmonic average, and so on. Since it is not easy for algorithms to find the best pooling method, human experts choose types of pooling, which might not be optimal for different tasks. Following deep learning philosophy, the type of pooling can be driven by data for a given task. In this paper, we propose {\em alpha-pooling}, which has a trainable parameter $\alpha$ to decide the type of pooling. Alpha-pooling is a general pooling method including max pooling and arithmetic average pooling as a special case, depending on the parameter $\alpha$. In experiments, alpha-pooling improves the accuracy of image recognition tasks, and we found that max pooling is not the optimal pooling scheme. Moreover each layer has different optimal pooling types. …