Restricted Connection Orthogonal Matching Pursuit for Sparse Subspace Clustering (RCOMP-SSC)
Sparse Subspace Clustering (SSC) is one of the most popular methods for clustering data points into their underlying subspaces. However, SSC can carry a heavy computational burden. Applying Orthogonal Matching Pursuit (OMP) to SSC accelerates the computation, but at the cost of clustering accuracy. In this paper, we propose a noise-robust algorithm, Restricted Connection Orthogonal Matching Pursuit for Sparse Subspace Clustering (RCOMP-SSC), which improves clustering accuracy while keeping computational time low by restricting the number of connections of each data point during the OMP iterations. We also develop a control-matrix framework to realize RCOMP-SSC; the framework extends to other data point selection strategies. Our analysis and experiments on synthetic data and two real-world databases (EYaleB and USPS) demonstrate the superiority of our algorithm over other clustering methods in terms of accuracy and computational time. …
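
The abstract does not spell out how connections are restricted, so the following is only a rough sketch of the general idea, not the authors' algorithm. It assumes that "restricting connections" means capping how many other points may select a given point as a neighbour, and it treats the boolean `allowed` list as one row of a hypothetical control matrix. The function names and the `max_in_degree` parameter are illustrative assumptions.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def omp_neighbors(points, i, k, allowed):
    """Greedy OMP: pick up to k neighbours for points[i] among indices j with allowed[j]."""
    residual = list(points[i])
    basis = []    # orthonormal basis spanning the neighbours chosen so far
    chosen = []
    for _ in range(k):
        best, best_score = None, 1e-12
        for j, p in enumerate(points):
            if j == i or j in chosen or not allowed[j]:
                continue
            score = abs(dot(p, residual))  # correlation with the current residual
            if score > best_score:
                best, best_score = j, score
        if best is None:
            break
        chosen.append(best)
        # Gram-Schmidt step: keeps the residual orthogonal to all chosen points,
        # which matches the least-squares residual used by OMP.
        q = list(points[best])
        for b in basis:
            c = dot(q, b)
            q = [x - c * y for x, y in zip(q, b)]
        norm = math.sqrt(dot(q, q))
        if norm > 1e-12:
            q = [x / norm for x in q]
            basis.append(q)
            c = dot(residual, q)
            residual = [x - c * y for x, y in zip(residual, q)]
    return chosen

def restricted_omp_graph(points, k, max_in_degree):
    """Assumed restriction: no point may be selected by more than max_in_degree others."""
    n = len(points)
    in_degree = [0] * n
    graph = {}
    for i in range(n):
        # One row of a hypothetical control matrix: which points are still selectable.
        allowed = [in_degree[j] < max_in_degree for j in range(n)]
        graph[i] = omp_neighbors(points, i, k, allowed)
        for j in graph[i]:
            in_degree[j] += 1
    return graph
```

In a full SSC pipeline the resulting neighbour graph would be symmetrized into an affinity matrix and passed to spectral clustering; that step is omitted here.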

Deep Curiosity Loop (DCL)
Inspired by infants’ intrinsic motivation to learn, which values informative sensory channels contingent on their immediate social environment, we developed a deep curiosity loop (DCL) architecture. The DCL is composed of a learner, which attempts to learn a forward model of the agent’s state-action transition, and a novel reinforcement-learning (RL) component, namely an Action-Convolution Deep Q-Network, which uses the learner’s prediction error as reward. The agent’s environment consists of visual social scenes taken from sitcom video streams, so both the learner and the RL component are constructed as deep convolutional neural networks. The agent’s learner learns to predict the zeroth order of the dynamics of visual scenes, resulting in intrinsic rewards proportional to changes within its social environment. The sources of these socially informative changes within the sitcom are predominantly motions of faces and hands, leading to the unsupervised curiosity-based learning of social interaction features. Face and hand detection is represented by the value function, and the social-interaction optical flow is represented by the policy. Our results suggest that face and hand detection are emergent properties of curiosity-based learning embedded in social environments. …
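
To make the reward signal concrete, here is a minimal sketch of prediction error used as intrinsic reward. It is not the DCL architecture: the learner is replaced by a hypothetical stand-in that predicts the next frame to be identical to the current one (one possible reading of "zeroth order", assumed here), frames are flat lists of pixel values, and all function names are illustrative.

```python
def prediction_error(predicted, actual):
    """Mean squared error between two flat frames of equal length."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

def zeroth_order_forward_model(frame, action):
    """Hypothetical stand-in learner: predicts that the scene does not change."""
    return list(frame)

def intrinsic_rewards(frames, actions, forward_model=zeroth_order_forward_model):
    """Reward at step t is the learner's error when predicting frame t+1."""
    rewards = []
    for t in range(len(frames) - 1):
        predicted = forward_model(frames[t], actions[t])
        rewards.append(prediction_error(predicted, frames[t + 1]))
    return rewards
```

Under this stand-in, a static scene yields zero reward and any change between consecutive frames yields a positive one, which mirrors the abstract's statement that rewards are proportional to changes in the environment.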

Bayesian Kernel Machine Regression – Causal Mediation Analysis (BKMR-CMA)
Exposure to complex mixtures is a real-world scenario. As such, it is important to understand the mechanisms through which a mixture operates in order to reduce the burden of disease. Currently, there are few methods in the causal mediation analysis literature to estimate the direct and indirect effects of an exposure mixture on an outcome operating through an intermediate (mediator) variable. This paper presents new statistical methodology to estimate the natural direct effect (NDE), natural indirect effect (NIE), and controlled direct effects (CDEs) of a potentially complex mixture exposure on an outcome through a mediator variable. We implement Bayesian kernel machine regression (BKMR) to allow for all possible interactions and nonlinear effects of the co-exposures on the mediator, and of the co-exposures and mediator on the outcome. From the posterior predictive distributions of the mediator and the outcome, we simulate counterfactual outcomes to obtain posterior samples, estimates, and credible intervals (CI) of the NDE, NIE, and CDE. We perform a simulation study showing that when the exposure-mediator and exposure-mediator-outcome relationships are complex, our proposed Bayesian kernel machine regression causal mediation analysis (BKMR-CMA) performs better than current mediation methods. We apply our methodology to quantify the contribution of birth length as a mediator between in utero co-exposure to arsenic, manganese, and lead and children’s neurodevelopment in a prospective birth cohort in rural Bangladesh. We found a negative association between co-exposure to lead, arsenic, and manganese and neurodevelopment, a negative association between exposure to this metal mixture and birth length, and evidence that birth length mediates the effect of co-exposure to lead, arsenic, and manganese on children’s neurodevelopment. …
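
The counterfactual contrasts behind the NDE and NIE can be illustrated with a toy simulation. This sketch does not use BKMR or posterior predictive draws: it assumes a scalar exposure, known linear structural equations, and made-up coefficients (`alpha`, `beta`, `gamma`), purely to show which counterfactual outcomes are compared. It uses the standard definitions NDE = E[Y(a, M(a*)) - Y(a*, M(a*))] and NIE = E[Y(a, M(a)) - Y(a, M(a*))].

```python
import random

def simulate_effects(a, a_star, n_draws=1000, seed=0):
    """Average NDE and NIE over simulated counterfactual draws (toy linear model)."""
    rng = random.Random(seed)
    alpha, beta, gamma = 0.5, 1.0, 2.0  # hypothetical toy coefficients

    def mediator(exposure, noise):
        return alpha * exposure + noise

    def outcome(exposure, m, noise):
        return beta * exposure + gamma * m + noise

    nde_total = nie_total = 0.0
    for _ in range(n_draws):
        e_m, e_y = rng.gauss(0, 1), rng.gauss(0, 1)
        m_a = mediator(a, e_m)          # mediator under exposure a
        m_star = mediator(a_star, e_m)  # mediator under reference exposure a*
        y_a_ma = outcome(a, m_a, e_y)
        y_a_mstar = outcome(a, m_star, e_y)
        y_star_mstar = outcome(a_star, m_star, e_y)
        nde_total += y_a_mstar - y_star_mstar  # change exposure, hold mediator at M(a*)
        nie_total += y_a_ma - y_a_mstar        # hold exposure at a, change mediator
    return nde_total / n_draws, nie_total / n_draws
```

In this linear toy model the NDE equals `beta * (a - a_star)` and the NIE equals `gamma * alpha * (a - a_star)`, and the two sum to the total effect. In the setting the abstract describes, the mediator and outcome functions would instead be draws from fitted models, so repeating the computation per posterior sample gives a distribution for each effect.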

Poisson PCA
In this paper, we study the problem of computing a Principal Component Analysis of data affected by Poisson noise. We assume samples are drawn from independent Poisson distributions. We want to estimate principal components of a fixed transformation of the latent Poisson means. Our motivating example is microbiome data, though the methods apply to many other situations. We develop a semiparametric approach to correct the bias of variance estimators, both for untransformed and transformed (with particular attention to log-transformation) Poisson means. Furthermore, we incorporate methods for correcting for different exposure or sequencing depth in the data. In addition to identifying the principal components, we also address the non-trivial problem of computing the principal scores in this semiparametric framework. Most previous approaches tend to take a more parametric line, for example the Poisson-log-normal (PLN) model. We compare our method with the PLN approach and find that our method is better at identifying the main principal components of the latent log-transformed Poisson means and, as a further major advantage, takes far less time to compute. Comparing methods on real data, we see that our method also appears to be more robust to outliers than the parametric method. …
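
For the untransformed case, the bias being corrected can be seen from the law of total variance: if each count X_j is Poisson with latent mean Λ_j, then Var(X_j) = Var(Λ_j) + E[Λ_j], while the off-diagonal covariances are unaffected. The sketch below applies that textbook correction by subtracting the column means from the diagonal of the sample covariance. It is an illustration under that assumption only, not the paper's semiparametric estimator, and it does not cover the log-transformed case or sequencing-depth correction. Function names are illustrative.

```python
def column_means(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sample_covariance(rows):
    """Unbiased sample covariance matrix of the columns of rows."""
    n = len(rows)
    means = column_means(rows)
    p = len(means)
    cov = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [x - m for x, m in zip(row, means)]
        for j in range(p):
            for k in range(p):
                cov[j][k] += d[j] * d[k] / (n - 1)
    return cov

def corrected_covariance(counts):
    """Estimate the covariance of the latent means as Cov(X) - diag(mean(X))."""
    cov = sample_covariance(counts)
    means = column_means(counts)
    for j, m in enumerate(means):
        cov[j][j] -= m  # remove the Poisson noise variance, which equals the mean
    return cov
```

Principal components of the latent means would then be taken from the eigenvectors of the corrected matrix rather than of the raw sample covariance. Note that the corrected matrix is not guaranteed to be positive semidefinite in small samples.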