Bayesian Robust Principal Component Regression google
Principal component regression is a linear regression model with principal components as regressors. This type of modelling is particularly useful for prediction in settings with high-dimensional covariates. Surprisingly, the existing literature treating of Bayesian approaches is relatively sparse. In this paper, we aim at filling some gaps through the following practical contribution: we introduce a Bayesian approach with detailed guidelines for a straightforward implementation. The approach features two characteristics that we believe are important. First, it effectively involves the relevant principal components in the prediction process. This is achieved in two steps. The first one is model selection; the second one is to average out the predictions obtained from the selected models according to model averaging mechanisms, allowing to account for model uncertainty. The model posterior probabilities are required for model selection and model averaging. For this purpose, we include a procedure leading to an efficient reversible jump algorithm. The second characteristic of our approach is whole robustness, meaning that the impact of outliers on inference gradually vanishes as they approach plus or minus infinity. The conclusions obtained are consequently consistent with the majority of observations (the bulk of the data). …

Uniform Manifold Approximation and Projection (UMAP) google
UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology. The result is a practical scalable algorithm that applies to real world data. The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance. Furthermore, UMAP has no computational restrictions on embedding dimension, making it viable as a general purpose dimension reduction technique for machine learning. …

MLJAR google
MLJAR is a platform for rapid prototyping, development and deploying pattern recognition algorithms. It works with many data types – basically all data are arrays 🙂 …

Network Estimation Across Unequal Sample Sizes (NExUS) google
Network-based analyses of high-throughput genomics data provide a holistic, systems-level understanding of various biological mechanisms for a common population. However, when estimating multiple networks across heterogeneous sub-populations, varying sample sizes pose a challenge in the estimation and inference, as network differences may be driven by differences in power. We are particularly interested in addressing this challenge in the context of proteomic networks for related cancers, as the number of subjects available for rare cancer (sub-)types is often limited. We develop NExUS (Network Estimation across Unequal Sample sizes), a Bayesian method that enables joint learning of multiple networks while avoiding artefactual relationship between sample size and network sparsity. We demonstrate through simulations that NExUS outperforms existing network estimation methods in this context, and apply it to learn network similarity and shared pathway activity for groups of cancers with related origins represented in The Cancer Genome Atlas (TCGA) proteomic data. …

Advertisements