We present an efficient method for training slack-rescaled structural SVM. Although finding the most violating label in a margin-rescaled formulation is often easy since the target function decomposes with respect to the structure, this is not the case for a slack-rescaled formulation, and finding the most violated label might be very difficult. Our core contribution is an efficient method for finding the most-violating-label in a slack-rescaled formulation, given an oracle that returns the most-violating-label in a (slightly modified) margin-rescaled formulation. We show that our method enables accurate and scalable training for slack-rescaled SVMs, reducing runtime by an order of magnitude compared to previous approaches to slack-rescaled SVMs.
The World Wide Web continues to evolve and serve as the infrastructure for carrying massive amounts of multimodal and multisensory observations. These observations capture various situations pertinent to people’s needs and interests along with all their idiosyncrasies. To support human-centered computing that empower people in making better and timely decisions, we look towards computation that is inspired by human perception and cognition. Toward this goal, we discuss computing paradigms of semantic computing, cognitive computing, and an emerging aspect of computing, which we call perceptual computing. In our view, these offer a continuum to make the most out of vast, growing, and diverse data pertinent to human needs and interests. We propose details of perceptual computing characterized by interpretation and exploration operations comparable to the interleaving of bottom and top brain processing. This article consists of two parts. First we describe semantic computing, cognitive computing, and perceptual computing to lay out distinctions while acknowledging their complementary capabilities. We then provide a conceptual overview of the newest of these three paradigms–perceptual computing. For further insights, we focus on an application scenario of asthma management converting massive, heterogeneous and multimodal (big) data into actionable information or smart data.
We consider the problem of community detection in the labeled Stochastic Block Model (labeled SBM) with a finite number $K$ of communities of sizes linearly growing with the network size $n$. Every pair of nodes is labeled independently at random, and label $\ell$ appears with probability $p(i,j,\ell)$ between two nodes in community $i$ and $j$, respectively. One observes a realization of these random labels, and the objective is to reconstruct the communities from this observation. Under mild assumptions on the parameters $p$, we show that under spectral algorithms, the number of misclassified nodes does not exceed $s$ with high probability as $n$ grows large, whenever $\bar{p}n=\omega(1)$ (where $\bar{p}=\max_{i,j,\ell\ge 1}p(i,j,\ell)$), $s=o(n)$ and $\frac{n D(p)}{ \log (n/s)} >1$, where $D(p)$, referred to as the {\it divergence}, is an appropriately defined function of the parameters $p=(p(i,j,\ell), i,j, \ell)$. We further show that $\frac{n D(p)}{ \log (n/s)} >1$ is actually necessary to obtain less than $s$ misclassified nodes asymptotically. This establishes the optimality of spectral algorithms, i.e., when $\bar{p}n=\omega(1)$ and $nD(p)=\omega(1)$, no algorithm can perform better in terms of expected misclassified nodes than spectral algorithms.
Traditional fact checking by experts and analysts cannot keep pace with the volume of newly created information. It is important and necessary, therefore, to enhance our ability to computationally determine whether some statement of fact is true or false. We view this problem as a link-prediction task in a knowledge graph, and show that a new model of the top discriminative predicate paths is able to understand the meaning of some statement and accurately determine its veracity. We evaluate our approach by examining thousands of claims related to history, geography, biology, and politics using a public, million node knowledge graph extracted from Wikipedia and PubMedDB. Not only does our approach significantly outperform related models, we also find that the discriminative predicate path model is easily interpretable and provides sensible reasons for the final determination.
In unsupervised ensemble learning, one obtains predictions from multiple sources or classifiers, yet without knowing the reliability and expertise of each source, and with no labeled data to assess it. The task is to combine these possibly conflicting predictions into an accurate meta-learner. Most works to date assumed perfect diversity between the different sources, a property known as conditional independence. In realistic scenarios, however, this assumption is often violated, and ensemble learners based on it can be severely sub-optimal. The key challenges we address in this paper are:\ (i) how to detect, in an unsupervised manner, strong violations of conditional independence; and (ii) construct a suitable meta-learner. To this end we introduce a statistical model that allows for dependencies between classifiers. Our main contributions are the development of novel unsupervised methods to detect strongly dependent classifiers, better estimate their accuracies, and construct an improved meta-learner. Using both artificial and real datasets, we showcase the importance of taking classifier dependencies into account and the competitive performance of our approach.
Deep neural networks (DNN) abstract by demodulating the output of linear filters. In this article, we refine this definition of abstraction to show that the inputs of a DNN are abstracted with respect to the filters. Or, to restate, the abstraction is qualified by the filters. This leads us to introduce the notion of qualitative projection. We use qualitative projection to abstract MNIST hand-written digits with respect to the various dogs, horses, planes and cars of the CIFAR dataset. We then classify the MNIST digits according to the magnitude of their dogness, horseness, planeness and carness qualities, illustrating the generality of qualitative projection.
Early stopping is a well known approach to reduce the time complexity for performing training and model selection of large scale learning machines. On the other hand, memory/space (rather than time) complexity is the main constraint in many applications, and randomized subsampling techniques have been proposed to tackle this issue. In this paper we ask whether early stopping and subsampling ideas can be combined in a fruitful way. We consider the question in a least squares regression setting and propose a form of randomized iterative regularization based on early stopping and subsampling. In this context, we analyze the statistical and computational properties of the proposed method. Theoretical results are complemented and validated by a thorough experimental analysis.
Undirected graphical models are a key component in the analysis of complex observational data in a large variety of disciplines. In many of these applications one is interested in estimating the undirected graphical model underlying a distribution over variables with different domains. Despite the pervasive need for such an estimation method, to date there is no such method that models all variables on their proper domain. We close this methodological gap by combining a new class of mixed graphical models with a structure estimation approach based on generalized covariance matrices. We report the performance of our methods using simulations, illustrate the method with a dataset on Autism Spectrum Disorder (ASD) and provide an implementation as an R-package.