Correlated Components Analysis
How does one find data dimensions that are reliably expressed across repetitions? For example, in neuroscience one may want to identify combinations of brain signals that are reliably activated across multiple trials or subjects. For a clinical assessment with multiple ratings, one may want to identify an aggregate score that is reliably reproduced across raters. The approach proposed here, ‘correlated components analysis’, is to identify components that maximally correlate between repetitions (e.g. trials, subjects, raters). This can be expressed as the maximization of the ratio of between-repetition to within-repetition covariance, resulting in a generalized eigenvalue problem. We show that covariances can be computed efficiently without explicitly considering all pairs of repetitions, that the result is equivalent to multi-class linear discriminant analysis for unbiased signals, and that the approach also maximizes reliability, defined as the mean divided by the deviation across repetitions. We also extend the method to non-linear components using kernels, discuss regularization to improve numerical stability, present parametric and non-parametric tests to establish statistical significance, and provide code. …
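
The covariance-ratio formulation above can be illustrated with a minimal sketch. This is not the authors' published code. It assumes NumPy is available, that the data are stored as an array of shape (repetitions, samples, dimensions), and that the within-repetition covariance is well conditioned (no regularization is applied). The function name `corrca` and all variable names are hypothetical.

```python
import numpy as np

def corrca(X):
    """Sketch of linear correlated components analysis.

    X has shape (N, T, D): N repetitions, T samples, D dimensions.
    Returns projection vectors (columns of V) and the ratio
    between-/within-repetition covariance, scaled by 1/(N-1).
    """
    N, T, D = X.shape
    Xc = X - X.mean(axis=1, keepdims=True)  # center each repetition over samples

    # Within-repetition covariance, summed over repetitions.
    Rw = sum(Xc[l].T @ Xc[l] for l in range(N)) / T

    # Covariance of the summed signal equals the sum over ALL pairs (l, k),
    # so no explicit loop over pairs of repetitions is needed.
    S = Xc.sum(axis=0)
    Rt = S.T @ S / T

    # Between-repetition covariance: all pairs minus the l == k terms.
    Rb = Rt - Rw

    # Generalized eigenproblem Rb v = lambda Rw v, solved by whitening
    # with the Cholesky factor of Rw (assumes Rw is positive definite).
    L = np.linalg.cholesky(Rw)
    Linv = np.linalg.inv(L)
    evals, U = np.linalg.eigh(Linv @ Rb @ Linv.T)

    order = np.argsort(evals)[::-1]      # strongest component first
    V = Linv.T @ U[:, order]             # map back to the original space
    return V, evals[order] / (N - 1)
```

Calling `V, isc = corrca(X)` returns the components sorted by decreasing between-repetition correlation. Subtracting `Rw` from the covariance of the summed signal is one way to obtain the pairwise terms without enumerating pairs, which is the kind of shortcut the abstract alludes to.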

Manhattan Plot
A Manhattan plot is a type of scatter plot, usually used to display data with a large number of data-points – many of non-zero amplitude, and with a distribution of higher-magnitude values, for instance in genome-wide association studies (GWAS).
It gains its name from the similarity of such a plot to the Manhattan skyline: a profile of skyscrapers towering above the lower level “buildings” which vary around a lower height.
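
As a rough illustration of how such a plot is usually laid out for GWAS data, the sketch below computes the scatter coordinates only, without drawing anything. It assumes the common convention of plotting -log10(p-value) against genomic position with chromosomes placed end to end. The function name `manhattan_coordinates` and the record layout are hypothetical.

```python
import math

def manhattan_coordinates(records, chrom_order):
    """Compute (x, y) points for a Manhattan plot.

    records: list of (chromosome, position, p_value) tuples.
    chrom_order: list of chromosome labels in plotting order.
    x is the position shifted by the lengths of preceding chromosomes,
    y is -log10(p_value), so small p-values become tall "skyscrapers".
    """
    # Approximate each chromosome's length by its largest observed position.
    max_pos = {c: 0 for c in chrom_order}
    for chrom, pos, _ in records:
        max_pos[chrom] = max(max_pos[chrom], pos)

    # Cumulative offsets place chromosomes side by side on the x axis.
    offsets, running = {}, 0
    for c in chrom_order:
        offsets[c] = running
        running += max_pos[c]

    return [(offsets[chrom] + pos, -math.log10(p)) for chrom, pos, p in records]
```

The resulting points can be passed to any scatter-plot routine, typically with alternating colors per chromosome.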


Limited Memory Steepest Descent Method (LMSD)
The possibilities inherent in steepest descent methods have been considerably amplified by the introduction of the Barzilai-Borwein choice of step-size, and other related ideas. These methods have proved to be competitive with conjugate gradient methods for large-dimension unconstrained minimization problems. This paper suggests a method which is able to take advantage of the availability of a few additional ‘long’ vectors of storage to achieve a significant improvement in performance, both for quadratic and non-quadratic objective functions. It makes use of certain Ritz values related to the Lanczos process (Lanczos in J Res Nat Bur Stand 45:255-282, 1950). Some underlying theory is provided, and numerical evidence is set out showing that the new method provides a competitive and simpler alternative to the state-of-the-art L-BFGS limited memory method. …
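
For context, the Barzilai-Borwein step-size mentioned above can be sketched in a few lines. This is plain Barzilai-Borwein gradient descent, not the LMSD method itself, which additionally uses Ritz values from several stored gradients. The function name, the initial step `alpha0`, and the stopping rule are assumptions made for illustration.

```python
def bb_gradient_descent(grad, x0, alpha0=1e-3, tol=1e-8, max_iter=1000):
    """Gradient descent with the Barzilai-Borwein step alpha = (s.s) / (s.y).

    grad: function returning the gradient (list of floats) at a point.
    x0: starting point (list of floats).
    Returns the final iterate and the number of iterations used.
    """
    x = list(x0)
    g = grad(x)
    alpha = alpha0  # the first step has no history, so use a small fixed step
    for k in range(max_iter):
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            return x, k
        x_new = [xi - alpha * gi for xi, gi in zip(x, g)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]   # change in the iterate
        y = [a - b for a, b in zip(g_new, g)]   # change in the gradient
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 0:  # keep the previous step if curvature is not positive
            alpha = sum(si * si for si in s) / sy
        x, g = x_new, g_new
    return x, max_iter
```

For a quadratic objective, `grad` could be, for example, `lambda x: [a * xi - 1.0 for a, xi in zip([1.0, 10.0, 100.0], x)]`. No line search is used here, so on non-quadratic functions this plain variant carries no convergence safeguard.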