Software and Code


Statistical methods to evaluate the association between observed variables and their estimated latent variables. Latent variables may be estimated by principal component analysis (PCA), logistic factor analysis (LFA), and other techniques. The underlying resampling strategy also extends to hard clustering, where cluster centers serve as estimates of the latent variables.


Association test with Principal Components
Statistical test of cluster memberships with the mtcars example
Unsupervised evaluation of cell identities in single cell genomics
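
The resampling idea behind the association test with principal components can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the interface of the software itself: the function names (`pc_scores`, `f_stats`, `jackstraw_pca_sketch`) and the parameters `r`, `s`, and `B` are hypothetical, rows are taken to be variables and columns observations, and an F-statistic is assumed as the association measure. In each round a small number of rows is permuted to break their link to the latent structure, the principal components are re-estimated, and the statistics of the permuted rows are collected as an empirical null.

```python
import numpy as np

def pc_scores(Y, r):
    """Top-r principal component scores (one row per observation) of row-centered Y."""
    Yc = Y - Y.mean(axis=1, keepdims=True)
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    return Vt[:r].T  # shape (n_observations, r)

def f_stats(Y, L):
    """Per-row F-statistic: regression on L plus intercept vs. intercept only."""
    n, r = L.shape
    X = np.column_stack([np.ones(n), L])
    rss0 = ((Y - Y.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
    beta, *_ = np.linalg.lstsq(X, Y.T, rcond=None)
    rss1 = ((Y.T - X @ beta) ** 2).sum(axis=0)
    return ((rss0 - rss1) / r) / (rss1 / (n - r - 1))

def jackstraw_pca_sketch(Y, r=1, s=10, B=100, seed=0):
    """Illustrative resampling test; all names and defaults are assumptions."""
    rng = np.random.default_rng(seed)
    m, _ = Y.shape
    observed = f_stats(Y, pc_scores(Y, r))
    null = []
    for _ in range(B):
        Yb = Y.copy()
        rows = rng.choice(m, size=s, replace=False)
        for i in rows:
            # Permuting a row removes its association with the latent variables.
            Yb[i] = rng.permutation(Yb[i])
        # Re-estimate the PCs on the partly permuted data, keep only null rows.
        null.append(f_stats(Yb[rows], pc_scores(Yb, r)))
    null = np.concatenate(null)
    # Empirical p-value: share of null statistics at least as large as observed.
    exceed = (null[None, :] >= observed[:, None]).sum(axis=1)
    pvalues = (1 + exceed) / (1 + null.size)
    return observed, pvalues
```

Keeping `s` much smaller than the number of variables is the design choice that matters here: the re-estimated components then stay close to the original ones, so the null statistics reflect the overfitting that arises when the same data are used both to estimate the latent variables and to test against them.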


Statistical tests of similarity between binary data using the Jaccard/Tanimoto similarity coefficient, defined as the ratio of the size of the intersection to the size of the union. Biochemical fingerprints, genomic intervals, and ecological communities are some examples of binary data in the life sciences.
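
As a concrete illustration, the coefficient and a simple significance test can be sketched in Python. This is a minimal sketch under assumptions, not the software's own interface: the function names `jaccard` and `jaccard_permutation_test` are hypothetical, the value 0 for two all-zero vectors is an assumed convention, and a one-sided permutation test stands in for whichever testing procedures the software provides.

```python
import numpy as np

def jaccard(x, y):
    """Jaccard/Tanimoto coefficient of two binary vectors: |x AND y| / |x OR y|."""
    x = np.asarray(x, dtype=bool)
    y = np.asarray(y, dtype=bool)
    union = np.logical_or(x, y).sum()
    if union == 0:
        return 0.0  # assumed convention when both vectors are all zeros
    return float(np.logical_and(x, y).sum() / union)

def jaccard_permutation_test(x, y, B=2000, seed=0):
    """One-sided permutation p-value for similarity above chance (illustrative)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=bool)
    y = np.asarray(y, dtype=bool)
    observed = jaccard(x, y)
    # Shuffling y keeps its number of ones but breaks any pairing with x.
    null = np.array([jaccard(x, rng.permutation(y)) for _ in range(B)])
    pvalue = (1 + (null >= observed).sum()) / (1 + B)
    return observed, pvalue

# Two vectors sharing 2 ones out of 4 positions where either has a one.
print(jaccard([1, 1, 0, 1, 0], [1, 0, 0, 1, 1]))  # 0.5
```

The permutation null conditions on how many ones each vector contains, which matters because two dense vectors overlap substantially by chance alone.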

Concept Saliency Maps

Evaluate and visualize latent representations of high-level concepts in generative models, such as variational autoencoders (VAEs).


Jackstraw weighted shrinkage estimation for high-dimensional latent variable models. The jackstraw is used to estimate sparse loadings (i.e., coefficients) in principal component analysis (PCA), logistic factor analysis (LFA), and related techniques.