Topological Data Analysis (TDA)
Topological data analysis (TDA) is a relatively new area of study with applications in areas such as data mining and computer vision. Its main problems are:
1. how one infers high-dimensional structure from low-dimensional representations; and
2. how one assembles discrete points into global structure.
The human brain readily extracts global structure from representations of strictly lower dimension; for example, we infer a 3D environment from the 2D image formed by each eye. Global structure is also inferred when discrete data are converted into continuous images: dot-matrix printers and televisions, for example, convey images as arrays of discrete points.
The main method used by topological data analysis is:
1. Replace a set of data points with a family of simplicial complexes, indexed by a proximity parameter.
2. Analyse these topological complexes via algebraic topology – specifically, via the theory of persistent homology.
3. Encode the persistent homology of the data set as a persistence diagram or barcode, which is a parameterized version of a Betti number.
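The three steps above can be sketched for the simplest case, dimension 0, where persistent homology only tracks connected components. The sketch below is a minimal illustration and not a general TDA implementation. It assumes a Vietoris-Rips style filtration in which an edge joins two points once their Euclidean distance is at most the proximity parameter. The function names `h0_barcode` and `betti0` are hypothetical.

```python
from itertools import combinations
from math import dist, inf


def h0_barcode(points):
    """Return the dimension-0 barcode as a list of (birth, death) pairs.

    Assumption: an edge enters the filtration when the Euclidean distance
    between its endpoints is <= the proximity parameter.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over the points

    def find(i):
        # Find the component representative, halving paths as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Step 1: the family of complexes is determined by the sorted edge lengths.
    edges = sorted(
        (dist(points[i], points[j]), i, j) for i, j in combinations(range(n), 2)
    )

    # Step 2: every point is a component born at 0; a component dies
    # whenever an edge merges two previously separate components.
    bars = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            bars.append((0.0, length))

    # One component survives at every scale.
    if n:
        bars.append((0.0, inf))
    return bars


def betti0(bars, eps):
    """Step 3: read the Betti number at scale eps off the barcode."""
    return sum(1 for birth, death in bars if birth <= eps < death)


# Made-up example: two well-separated pairs of points on a line.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
bars = h0_barcode(points)
```

Here two bars end at 1.0, one ends at 4.0 and one never ends, so `betti0(bars, 2.0)` counts two components. The long bars are the features that persist across scales; higher-dimensional homology (loops, voids) needs a boundary-matrix reduction that this sketch does not attempt.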
Topological Analysis of Data …
Simple Temporal Point Process (SPP)
A simple temporal point process (SPP) is an important class of time series in which a realization of the process consists solely of the times at which events occur. Typical examples of point process data are neuronal spike patterns, or spike trains, for which a large number of distance and similarity metrics have been proposed. A marked point process (MPP) extends a simple temporal point process by associating a vector-valued mark with each of the temporal points in the SPP. Analyses of MPPs are of practical importance because instances of MPPs include recordings of natural disasters such as earthquakes and tornadoes. …
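The difference between the two kinds of data can be made concrete with a small sketch. The representation below is an assumption chosen for illustration: an SPP realization is a sorted list of event times, and an MPP realization attaches a mark to each time. The names `MarkedEvent`, `inter_event_intervals` and `ground_process` are hypothetical, and all values are made up.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MarkedEvent:
    time: float               # when the event occurred
    mark: Tuple[float, ...]   # vector-valued mark attached to the event


def inter_event_intervals(times: List[float]) -> List[float]:
    """Gaps between consecutive event times of an SPP realization."""
    ordered = sorted(times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def ground_process(events: List[MarkedEvent]) -> List[float]:
    """Drop the marks of an MPP realization to recover the underlying SPP."""
    return sorted(event.time for event in events)


# SPP: a made-up spike train, recorded as event times in seconds.
spike_train = [1.0, 3.0, 6.0]

# MPP: made-up events with a two-component mark, e.g. (magnitude, depth).
quakes = [
    MarkedEvent(time=10.0, mark=(5.8, 12.0)),
    MarkedEvent(time=2.5, mark=(4.1, 30.0)),
]
```

Calling `inter_event_intervals(spike_train)` gives the gaps between spikes, and `ground_process(quakes)` shows that an MPP reduces to an SPP once the marks are discarded.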
Generalized Estimating Equations (GEE)
In statistics, a generalized estimating equation (GEE) is used to estimate the parameters of a generalized linear model with a possibly unknown correlation between outcomes. Provided the mean model is correctly specified and mild regularity conditions hold, parameter estimates from the GEE are consistent even when the covariance structure is misspecified. The focus of the GEE is on estimating the average response over the population (‘population-averaged’ effects) rather than the regression parameters that would enable prediction of the effect of changing one or more covariates on a given individual. GEEs are usually used in conjunction with Huber-White standard error estimates, also known as ‘robust standard error’ or ‘sandwich variance’ estimates. In the case of a linear model with a working independence variance structure, these are known as ‘heteroscedasticity consistent standard error’ estimators. Indeed, the GEE unified several independent formulations of these standard error estimators in a general framework. GEEs belong to a class of semiparametric regression techniques because they rely on specification of only the first two moments. They are a popular alternative to the likelihood-based generalized linear mixed model, which is more sensitive to variance structure specification. They are commonly used in large epidemiological studies, especially multi-site cohort studies, because they can handle many types of unmeasured dependence between outcomes. …
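The special case mentioned above, a linear model with a working independence structure, is small enough to sketch by hand. The code below is not a GEE solver. It assumes a single covariate with an intercept and one observation per cluster, fits the model by ordinary least squares, and computes the uncorrected (HC0) sandwich standard error of the slope. The function name `ols_with_robust_se` is hypothetical and the data are made up.

```python
from math import sqrt


def ols_with_robust_se(x, y):
    """Fit y = intercept + slope * x and return (slope, intercept, robust_se).

    robust_se is the HC0 sandwich standard error of the slope, with no
    small-sample correction.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # "Bread": the usual least-squares quantities.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # "Meat": squared residuals weight each observation individually,
    # so no constant error variance is assumed.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    meat = sum((xi - x_bar) ** 2 * ei ** 2 for xi, ei in zip(x, residuals))

    robust_se = sqrt(meat / sxx ** 2)
    return slope, intercept, robust_se


# Made-up data for illustration.
slope, intercept, robust_se = ols_with_robust_se([0.0, 1.0, 2.0], [0.0, 1.0, 4.0])
```

The same bread-meat-bread structure carries over to clustered data, where the meat sums over clusters rather than single observations; in practice a statistics package would be used for that general case.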
Topological Data Analysis (TDA)