Variable Selection in Clustering by Mixture Models for Discrete Data (ClustMMDD)
An implementation of a variable selection procedure for clustering discrete data with mixtures of multinomial models. Genotype data are an example of such data: each locus of a diploid individual carries two unordered observations (alleles). Choosing the number of clusters and the clustering variables is treated as a single model selection problem, in which competing models are characterized by the number of clusters K and the subset S of clustering variables. Competing models are compared using penalized maximum likelihood criteria. The criteria considered are asymptotic ones, such as the Akaike and Bayesian information criteria, and a family of penalized criteria whose penalty function is calibrated from the data.
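
The comparison of competing models (K, S) by AIC and BIC can be illustrated with a minimal Python sketch. It is not the package's interface: the function names, the `candidates` layout, and the numbers in the usage note are hypothetical, and the maximized log-likelihood and the number of free parameters of each fitted model are assumed to be supplied by the caller. The criteria are written in the common "smaller is better" form, which is equivalent to maximizing a penalized log-likelihood.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion (smaller is better)."""
    return -2.0 * log_lik + 2.0 * n_params


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion (smaller is better)."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


def select_model(candidates, n_obs, criterion="bic"):
    """Return the (K, S) key with the smallest criterion value.

    candidates: dict mapping (K, S) -> (maximized log-likelihood, free parameters).
    This layout is an assumption made for illustration only.
    """
    def score(item):
        log_lik, n_params = item[1]
        if criterion == "bic":
            return bic(log_lik, n_params, n_obs)
        return aic(log_lik, n_params)

    return min(candidates.items(), key=score)[0]
```

With made-up inputs such as `{(2, ("L1",)): (-100.0, 5), (3, ("L1", "L2")): (-92.0, 12)}` and `n_obs=100`, BIC prefers the smaller model while AIC prefers the larger one, which shows how the choice of penalty changes the selected pair (K, S).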

Methods for Estimating Causal Effects from Observational Data (CausalFX)
Estimate causal effects of one variable on another, currently for binary data only. Methods include instrumental variable bounds, adjustment by a given covariate set, adjustment by an induced covariate set using a variation of the PC algorithm, and an effect bounding method (the Witness Protection Program) based on covariate adjustment with observable independence constraints.
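
As an illustration of adjustment by a given covariate set with binary data, the following minimal Python sketch computes the standard stratified estimate: the sum over z of [P(Y=1 | X=1, Z=z) - P(Y=1 | X=0, Z=z)] P(Z=z). It is not the package's interface. The function name and the record layout are hypothetical, and the estimate is a causal effect only under the assumption that the covariate set Z is sufficient for adjustment.

```python
from collections import Counter


def adjusted_effect(records):
    """Average effect of binary x on binary y, adjusting for covariates z.

    records: iterable of (x, y, z) with x and y in {0, 1} and z a hashable
    value (for example a tuple of covariate values). Hypothetical layout.
    """
    records = list(records)
    n = len(records)
    z_counts = Counter(z for _, _, z in records)
    effect = 0.0
    for z, n_z in z_counts.items():
        p_y = {}
        for x in (0, 1):
            ys = [yi for xi, yi, zi in records if xi == x and zi == z]
            if not ys:
                # Without both treatment levels the stratum contrast is undefined.
                raise ValueError(f"no records with x={x} in stratum z={z!r}")
            p_y[x] = sum(ys) / len(ys)
        # Weight the within-stratum contrast by the empirical P(Z = z).
        effect += (p_y[1] - p_y[0]) * n_z / n
    return effect
```

The sketch raises an error when a stratum lacks one of the two treatment levels, because the stratum-specific contrast cannot be estimated from the data in that case.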

Create 2D Principal Component Plots with Bootstrapping (pcaBootPlot)
Draws a 2D principal component plot using the first two principal components of the original and bootstrapped data, to give a sense of the variability in the plot.
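
The idea can be sketched in plain Python as follows. This is not the package's code: the function names are hypothetical, and the bootstrap scheme is an assumption, since the description does not say what is resampled. Here the variables (columns) are resampled with replacement, the first two principal component scores of the samples (rows) are recomputed, and each bootstrap component is sign-aligned with the original. Plotting is left out; the returned coordinates can be passed to any plotting library.

```python
import random


def _center_columns(X):
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[v - m for v, m in zip(row, means)] for row in X]


def _matvec(G, v):
    return [sum(g * x for g, x in zip(row, v)) for row in G]


def top_two_scores(X, iters=500, seed=0):
    """Scores of each row on the first two principal components.

    Uses power iteration with deflation on the Gram matrix of the
    column-centered data, so no external libraries are needed.
    """
    Xc = _center_columns(X)
    n = len(Xc)
    G = [[sum(a * b for a, b in zip(ri, rj)) for rj in Xc] for ri in Xc]
    rng = random.Random(seed)
    components = []
    for _ in range(2):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            w = _matvec(G, v)
            norm = sum(x * x for x in w) ** 0.5
            if norm < 1e-12:  # no variance left for this component
                v = [0.0] * n
                break
            v = [x / norm for x in w]
        lam = max(sum(a * b for a, b in zip(v, _matvec(G, v))), 0.0)
        components.append([lam ** 0.5 * x for x in v])
        # Deflate so the next pass finds the following component.
        G = [[G[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return list(zip(*components))


def bootstrap_pc_scores(X, n_boot=20, seed=0):
    """Return the original scores and a list of sign-aligned bootstrap scores."""
    rng = random.Random(seed)
    p = len(X[0])
    original = top_two_scores(X)
    clouds = []
    for _ in range(n_boot):
        # Assumed scheme: resample variables (columns) with replacement.
        cols = [rng.randrange(p) for _ in range(p)]
        boot = top_two_scores([[row[c] for c in cols] for row in X])
        signs = []
        for k in range(2):
            dot = sum(b[k] * o[k] for b, o in zip(boot, original))
            signs.append(-1.0 if dot < 0 else 1.0)
        clouds.append([(signs[0] * b[0], signs[1] * b[1]) for b in boot])
    return original, clouds
```

Drawing each sample's bootstrap coordinates around its original position gives a cloud whose spread indicates how stable that sample's location is. Sign alignment alone does not handle the case where two components swap order between resamples, so this remains a simplification.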