Uncertainty-Aware Principal Component Analysis
We present a technique to perform dimensionality reduction on data that is subject to uncertainty. Our method is a generalization of traditional principal component analysis (PCA) to multivariate probability distributions. In comparison to non-linear methods, linear dimensionality reduction techniques have the advantage that the characteristics of such probability distributions remain intact after projection. We derive a representation of the covariance matrix that respects potential uncertainty in each of the observations, building the mathematical foundation of our new method uncertainty-aware PCA. In addition to the accuracy and performance gained by our approach over sampling-based strategies, our formulation allows us to perform sensitivity analysis with regard to the uncertainty in the data. For this, we propose factor traces as a novel visualization that enables us to better understand the influence of uncertainty on the chosen principal components. We provide multiple examples of our technique using real-world datasets and show how to propagate multivariate normal distributions through PCA in closed-form. Furthermore, we discuss extensions and limitations of our approach. …

Waterfall Bandits
A popular approach to selling online advertising is by a waterfall, where a publisher makes sequential price offers to ad networks for an inventory, and chooses the winner in that order. The publisher picks the order and prices to maximize her revenue. A traditional solution is to learn the demand model and then subsequently solve the optimization problem for the given demand model. This will incur a linear regret. We design an online learning algorithm for solving this problem, which interleaves learning and optimization, and prove that this algorithm has sublinear regret. We evaluate the algorithm on both synthetic and real-world data, and show that it quickly learns high quality pricing strategies. This is the first principled study of learning a waterfall design online by sequential experimentation. …

TopoResNet
Skin cancer is one of the most common cancers in the United States. As technological advancements are made, algorithmic diagnosis of skin lesions is becoming more important. In this paper, we develop algorithms for segmenting the actual diseased area of skin in a given image of a skin lesion, and for classifying different types of skin lesions pictured in a given image. The cores of the algorithms used were based in persistent homology, an algebraic topology technique that is part of the rising field of Topological Data Analysis (TDA). The segmentation algorithm utilizes a similar concept to persistent homology that captures the robustness of segmented regions. For classification, we design two families of topological features from persistence diagrams—which we refer to as {\em persistence statistics} (PS) and {\em persistence curves} (PC), and use linear support vector machine as classifiers. We also combined those topological features, PS and PC, into ResNet-101 model, which we call {\em TopoResNet-101}, the results show that PS and PC are effective in two folds—improving classification performances and stabilizing the training process. Although convolutional features are the most important learning targets in CNN models, global information of images may be lost in the training process. Because topological features were extracted globally, our results show that the global property of topological features provide additional information to machine learning models. …

LSH Count
An important question that arises in the study of high dimensional vector representations learned from data is: given a set $\mathcal{D}$ of vectors and a query $q$, estimate the number of points within a specified distance threshold of $q$. We develop two estimators, LSH Count and Multi-Probe Count that use locality sensitive hashing to preprocess the data to accurately and efficiently estimate the answers to such questions via importance sampling. A key innovation is the ability to maintain a small number of hash tables via preprocessing data structures and algorithms that sample from multiple buckets in each hash table. We give bounds on the space requirements and sample complexity of our schemes, and demonstrate their effectiveness in experiments on a standard word embedding dataset. …