Fast and Scalable Structural SVM with Slack Rescaling

We present an efficient method for training slack-rescaled structural SVM. Although finding the most violating label in a margin-rescaled formulation is often easy since the target function decomposes with respect to the structure, this is not the case for a slack-rescaled formulation, and finding the most violated label might be very difficult. Our core contribution is an efficient method for finding the most-violating-label in a slack-rescaled formulation, given an oracle that returns the most-violating-label in a (slightly modified) margin-rescaled formulation. We show that our method enables accurate and scalable training for slack-rescaled SVMs, reducing runtime by an order of magnitude compared to previous approaches to slack-rescaled SVMs.

Semantic, Cognitive, and Perceptual Computing: Advances toward Computing for Human Experience

The World Wide Web continues to evolve and serve as the infrastructure for carrying massive amounts of multimodal and multisensory observations. These observations capture various situations pertinent to people’s needs and interests along with all their idiosyncrasies. To support human-centered computing that empower people in making better and timely decisions, we look towards computation that is inspired by human perception and cognition. Toward this goal, we discuss computing paradigms of semantic computing, cognitive computing, and an emerging aspect of computing, which we call perceptual computing. In our view, these offer a continuum to make the most out of vast, growing, and diverse data pertinent to human needs and interests. We propose details of perceptual computing characterized by interpretation and exploration operations comparable to the interleaving of bottom and top brain processing. This article consists of two parts. First we describe semantic computing, cognitive computing, and perceptual computing to lay out distinctions while acknowledging their complementary capabilities. We then provide a conceptual overview of the newest of these three paradigms–perceptual computing. For further insights, we focus on an application scenario of asthma management converting massive, heterogeneous and multimodal (big) data into actionable information or smart data.

Optimality of Spectral Algorithms for Community Detection in the Labeled Stochastic Block Model

We consider the problem of community detection in the labeled Stochastic Block Model (labeled SBM) with a finite number K of communities of sizes linearly growing with the network size n. Every pair of nodes is labeled independently at random, and label \ell appears with probability p(i,j,\ell) between two nodes in community i and j, respectively. One observes a realization of these random labels, and the objective is to reconstruct the communities from this observation. Under mild assumptions on the parameters p, we show that under spectral algorithms, the number of misclassified nodes does not exceed s with high probability as n grows large, whenever \bar{p}n=\omega(1) (where \bar{p}=\max_{i,j,\ell\ge 1}p(i,j,\ell)), s=o(n) and \frac{n D(p)}{ \log (n/s)} >1, where D(p), referred to as the {\it divergence}, is an appropriately defined function of the parameters p=(p(i,j,\ell), i,j, \ell). We further show that \frac{n D(p)}{ \log (n/s)} >1 is actually necessary to obtain less than s misclassified nodes asymptotically. This establishes the optimality of spectral algorithms, i.e., when \bar{p}n=\omega(1) and nD(p)=\omega(1), no algorithm can perform better in terms of expected misclassified nodes than spectral algorithms.

Fact Checking in Large Knowledge Graphs – A Discriminative Predicate Path Mining Approach

Traditional fact checking by experts and analysts cannot keep pace with the volume of newly created information. It is important and necessary, therefore, to enhance our ability to computationally determine whether some statement of fact is true or false. We view this problem as a link-prediction task in a knowledge graph, and show that a new model of the top discriminative predicate paths is able to understand the meaning of some statement and accurately determine its veracity. We evaluate our approach by examining thousands of claims related to history, geography, biology, and politics using a public, million node knowledge graph extracted from Wikipedia and PubMedDB. Not only does our approach significantly outperform related models, we also find that the discriminative predicate path model is easily interpretable and provides sensible reasons for the final determination.

Unsupervised Ensemble Learning with Dependent Classifiers

In unsupervised ensemble learning, one obtains predictions from multiple sources or classifiers, yet without knowing the reliability and expertise of each source, and with no labeled data to assess it. The task is to combine these possibly conflicting predictions into an accurate meta-learner. Most works to date assumed perfect diversity between the different sources, a property known as conditional independence. In realistic scenarios, however, this assumption is often violated, and ensemble learners based on it can be severely sub-optimal. The key challenges we address in this paper are:\ (i) how to detect, in an unsupervised manner, strong violations of conditional independence; and (ii) construct a suitable meta-learner. To this end we introduce a statistical model that allows for dependencies between classifiers. Our main contributions are the development of novel unsupervised methods to detect strongly dependent classifiers, better estimate their accuracies, and construct an improved meta-learner. Using both artificial and real datasets, we showcase the importance of taking classifier dependencies into account and the competitive performance of our approach.

Qualitative Projection Using Deep Neural Networks

Deep neural networks (DNN) abstract by demodulating the output of linear filters. In this article, we refine this definition of abstraction to show that the inputs of a DNN are abstracted with respect to the filters. Or, to restate, the abstraction is qualified by the filters. This leads us to introduce the notion of qualitative projection. We use qualitative projection to abstract MNIST hand-written digits with respect to the various dogs, horses, planes and cars of the CIFAR dataset. We then classify the MNIST digits according to the magnitude of their dogness, horseness, planeness and carness qualities, illustrating the generality of qualitative projection.

NYTRO: When Subsampling Meets Early Stopping

Early stopping is a well known approach to reduce the time complexity for performing training and model selection of large scale learning machines. On the other hand, memory/space (rather than time) complexity is the main constraint in many applications, and randomized subsampling techniques have been proposed to tackle this issue. In this paper we ask whether early stopping and subsampling ideas can be combined in a fruitful way. We consider the question in a least squares regression setting and propose a form of randomized iterative regularization based on early stopping and subsampling. In this context, we analyze the statistical and computational properties of the proposed method. Theoretical results are complemented and validated by a thorough experimental analysis.

Structure estimation for mixed graphical models in high-dimensional data

Undirected graphical models are a key component in the analysis of complex observational data in a large variety of disciplines. In many of these applications one is interested in estimating the undirected graphical model underlying a distribution over variables with different domains. Despite the pervasive need for such an estimation method, to date there is no such method that models all variables on their proper domain. We close this methodological gap by combining a new class of mixed graphical models with a structure estimation approach based on generalized covariance matrices. We report the performance of our methods using simulations, illustrate the method with a dataset on Autism Spectrum Disorder (ASD) and provide an implementation as an R-package.

An exact algorithm and efficient importance sampling for computing two-locus likelihoods under variable population size

Size biased couplings and the spectral gap for random regular graphs

Extractors in Paley graphs: a random model

Frog model wakeup time on the complete graph

One-point concentration of the clique and chromatic numbers of the random Cayley graph on F_2^n

Additive triples of bijections, or the toroidal semiqueens problem

Dichromatic number and fractional chromatic number

A latent shared-component generative model for real-time disease surveillance using Twitter data

Transductive Optimization of Top k Precision

Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches

Improving prediction performance of stellar parameters using functional models

Abelian covers of alternating groups

Max-margin Metric Learning for Speaker Recognition

Binary Speaker Embedding

Generalized $Γ$ calculus and application to interacting particles on a graph

Ranks of subgroups in boundedly generated groups

Online unmixing of multitemporal hyperspectral images accounting for spectral variability

Approximation Algorithm for Minimum Weight Connected $m$-Fold Dominating Set

Theory of heterogeneous viscoelasticity

Global spectral fluctuations in the Gaussian Unitary Ensemble

Mortality Risk Minimisation and Optional Martingale Representation Theorem for Enlarged Filtration

On the connectivity Waiter-Client game

Games with Penalties on Winning

Distances between nested densities and a measure of the impact of the prior in Bayesian statistics

Finite dynamical systems, hat games, and coding theory

The impact of model detail on power grid resilience measures

Exponential convergence to quasi-stationary distribution for absorbed one-dimensional diffusions with killing

Combinatorics of poly-Bernoulli numbers

Cretan(4t+1) Matrices

Models with time-varying predictors for meningitis in Navrongo, Ghana

Shortest circuit covers of signed graphs

When Two Choices Are not Enough: Balancing at Scale in Distributed~Stream~Processing

Single Memristor Logic Gates: From NOT to a Full Adder

Protein Structure Prediction by Protein Alignments