• Visually Indicated Sounds
• Interlacements and the Wired Uniform Spanning Forest
• A characterization of dissimilarity families of trees
• The Longevity of Lava Dome Eruptions
• Families of multiweights and pseudostars
• Parisi formula, disorder chaos and fluctuation for the ground state energy in the spherical mixed p-spin models
• Coarsening with a frozen vertex
• Invariance, quasi-invariance and unimodularity for random graphs
• Shotgun assembly of random regular graphs
• Efficient nonparametric inference for discretely observed compound Poisson processes
• Approximate Hubel-Wiesel Modules and the Data Structures of Neural Computation
• GELATO and SAGE: An Integrated Framework for MS Annotation
• On degree sequences of undirected, directed, and bidirected graphs
• Convexified Modularity Maximization for Degree-corrected Stochastic Block Models
• Recognizing Entailment and Contradiction by Tree-based Convolution
• Fréchet Barycenters and a Law of Large Numbers for Measures on the Real Line
• Evaluating Hive and Spark SQL with BigBench
• Webs of stars or how to triangulate free sums of point configurations
• Pseudolikelihood inference for Gibbsian T-tessellations … and point processes
• The data arrangement problem on binary trees
• Data Migration among Different Clouds
• Renewal theorems for a class of processes with dependent interarrival times
• Hook formulas for skew shapes
• Communicating with sentences: A multi-word naming game model
• On restricted edge-connectivity of replacement product graphs
• Volume, facets and dual polytopes of twinned chain polytopes
• Random Steiner systems and bounded degree coboundary expanders of every dimension
• Data Driven SMART Intercontinental Overlay Networks
• Identifying Seizure Onset Zone from the Causal Connectivity Inferred Using Directed Information
• Feedforward Sequential Memory Networks: A New Structure to Learn Long-term Dependency
• Post-regularization Inference for Dynamic Nonparanormal Graphical Models
• Using Causal Discovery to Track Information Flow in Spatio-Temporal Data – A Testbed and Experimental Results Using Advection-Diffusion Simulations
• Asymptotic lower bound for the gap of Hermitian matrices having ergodic ground states and infinitesimal off-diagonal elements
• Statistical and Computational Guarantees for the Baum-Welch Algorithm
• Incidences with curves in R^d
• Graph reduction techniques and the multiplicity of the Laplacian eigenvalues
• Eventual Wait-Free Synchronization
• Robust Semi-supervised Least Squares Classification by Implicit Constraints
• Large Deviations on a Cayley Tree I: Rate Functions
• Beyond in-phase and anti-phase coordination in a model of joint action
• New Perspectives on $k$-Support and Cluster Norms
• Fully Developed Turbulence in the view of Horizontal Visibility Graphs
• RADICAL-Pilot: Scalable Execution of Heterogeneous and Dynamic Workloads on Supercomputers
• Parametric Inference of Hidden Discrete-Time Diffusion Processes by Deconvolution
• Estimation of Kullback-Leibler losses for noisy recovery problems within the exponential family
• Product of Independent Cauchy-Lorentz Random Matrices
• Electricity Demand Forecasting by Multi-Task Learning
• Self-Excitation: An Enabler for Online Thermal Estimation and Model Predictive Control of Buildings
• On a class of transformations of sequences of complex numbers
• Chi-Square Mixture Representations for the Distribution of the Scalar Schur Complement in a Noncentral Wishart Matrix
• Pattern avoiding permutations and independent sets in graphs
• Counterexamples to conjectures on graph distance measures based on topological indexes
• Decremental Single-Source Shortest Paths on Undirected Graphs in Near-Linear Total Update Time
• Sublinear-Time Maintenance of Breadth-First Spanning Trees in Partially Dynamic Networks
• Circulant $S_2$ graphs
• Curvature and higher order Buser inequalities for the graph connection Laplacian
• Extracting list colorings from large independent sets
• Efficient Estimation of Quantiles in Missing Data Models
• Equivariant Euler characteristics of partition posets
• A Dynamic Linear Model to Forecast Hotel Registrations in Puerto Rico Using Google Trends Data
• Regularization by noise: a McKean-Vlasov case
• The Raney Numbers and $(s,s+1)$-Core Partitions
• The Improvement of Negative Sentences Translation in English-to-Korean Machine Translation
• Statistical Learning under Nonstationary Mixing Processes
• Risk Aversion in the Small and in the Large under Rank-Dependent Utility
• Device and System Level Design Considerations for Analog-Non-Volatile-Memory Based Neuromorphic Architectures
• Euler Polytopes and Convex Matroid Optimization
• An Analytical Evaluation of Matricizing Least-Square-Errors Curve Fitting to Support High Performance Computation on Large Datasets
• Discovering topic structures of a temporally evolving document corpus
• Classification of graphs using contractible transformations. Homotopy equivalence of graphs. Basic representatives and complexity of homotopy equivalence classes
• Inducing Generalized Multi-Label Rules with Learning Classifier Systems
• Micro-Differential Evolution: Diversity Enhancement and Comparative Study
• Partition functions of integrable lattice models and combinatorics of symmetric polynomials
• A Relaxed Drift Diffusion Model for Phylogenetic Trait Evolution
• Sparse Reconstruction of Compressive Sensing MRI using Cross-Domain Stochastically Fully Connected Conditional Random Fields
• Nonsymmetric Dependence Measures: the Discrete Case
• Toward a Research Agenda in Adversarial Reasoning: Computational Approaches to Anticipating the Opponent’s Intent and Actions
• Multi-Level Cause-Effect Systems
• A scalable quasi-Bayesian framework for Gaussian graphical models
• Probabilistic Model-Based Approach for Heart Beat Detection
We introduce a class of random graphs with a hierarchical community structure, which we call the hierarchical configuration model. On the inter-community level, the graph is a configuration model, and on the intra-community level, every vertex in the configuration model is replaced by a community: a small graph. These communities may have any shape, as long as they are connected. For these hierarchical graphs, we find the size of the largest component, the degree distribution and the clustering coefficient. Furthermore, we determine the conditions under which a giant percolation cluster exists, and find its size.
In the last ten years, the employment of symbolic methods has substantially extended both the theory and the applications of statistics and probability. This survey reviews the development of a symbolic technique arising from classical umbral calculus, as introduced by Rota and Taylor in The usefulness of this symbolic technique is twofold. The first is to show how new algebraic identities drive in discovering insights among topics apparently very far from each other and related to probability and statistics. One of the main tools is a formal generalization of the convolution of identical probability distributions, which allows us to employ compound Poisson random variables in various topics that are only somewhat interrelated. Having got a different and deeper viewpoint, the second goal is to show how to set up algorithmic processes performing efficiently algebraic calculations. In particular, the challenge of finding these symbolic procedures should lead to a new method, and it poses new problems involving both computational and conceptual issues. Evidence of efficiency in applying this symbolic method will be shown within statistical inference, parameter estimation, L\’evy processes, and, more generally, problems involving multivariate functions. The symbolic representation of Sheffer polynomial sequences allows us to carry out a unifying theory of classical, Boolean and free cumulants. Recent connections within random matrices have extended the applications of the symbolic method.
The tripartite graph is one of the commonest topological structures in social tagging systems such as Delicious, which has three types of nodes (i.e., users, URLs and tags). Traditional recommender systems developed based on collaborative filtering for the social tagging systems bring very high demands on CPU time cost. In this paper, to overcome this drawback, we propose a novel approach that extracts non-overlapping user clusters and corresponding overlapping item clusters simultaneously through coarse clustering to accelerate the user-based collaborative filtering and develop a fast recommendation algorithm for the social tagging systems. The experimental results show that the proposed approach is able to dramatically reduce the processing time cost greater than and relatively enhance the accuracy in comparison with the ordinary user-based collaborative filtering algorithm.
Bag-of-ngram based methods still achieve state-of-the-art results for tasks such as sentiment classification of long movie reviews, despite semantic information is partially lost for these methods. Many document embeddings methods have been proposed to capture semantics, but they still can’t outperform bag-of-ngram based methods on this task. In this paper, we modify the architecture of the recently proposed Paragraph Vector, allowing it to learn document vectors by predicting not only words, but n-gram features as well. Our model is able to capture both semantics and word order in documents while keeping the expressive power of learned vectors. Experimental results on IMDB movie review dataset shows that our model outperforms previous deep learning models and bag-of-ngram based models due to the above advantages. More robust results are also obtained when our model is combined with other models. The source code of our model will be also published together with this paper.
Mining movement data to reveal interesting behavioral patterns has gained attention in recent years. One such pattern is the convoy pattern which consists of at least m objects moving together for at least k consecutive time instants where m and k are user-defined parameters. Existing algorithms for detecting convoy patterns, however do not scale to real-life dataset sizes. Therefore a distributed algorithm for convoy mining is inevitable. In this paper, we discuss the problem of convoy mining and analyze different data partitioning strategies to pave the way for a generic distributed convoy pattern mining algorithm.
We explore the problem of binary classification in machine learning, with a twist – the classifier is allowed to abstain on any datum, professing ignorance about the true class label without committing to any prediction. This is directly motivated by applications like medical diagnosis and fraud risk assessment, in which incorrect predictions have potentially calamitous consequences. We focus on a recent spate of theoretically driven work in this area that characterizes how allowing abstentions can lead to fewer errors in very general settings. Two areas are highlighted: the surprising possibility of zero-error learning, and the fundamental tradeoff between predicting sufficiently often and avoiding incorrect predictions. We review efficient algorithms with provable guarantees for each of these areas. We also discuss connections to other scenarios, notably active learning, as they suggest promising directions of further inquiry in this emerging field.
Multi-relational learning has received lots of attention from researchers in various research communities. Most existing methods either suffer from superlinear per-iteration cost, or are sensitive to the given ranks. To address both issues, we propose a scalable core tensor trace norm Regularized Orthogonal Iteration Decomposition (ROID) method for full or incomplete tensor analytics, which can be generalized as a graph Laplacian regularized version by using auxiliary information or a sparse higher-order orthogonal iteration (SHOOI) version. We first induce the equivalence relation of the Schatten p-norm (0<p<\infty) of a low multi-linear rank tensor and its core tensor. Then we achieve a much smaller matrix trace norm minimization problem. Finally, we develop two efficient augmented Lagrange multiplier algorithms to solve our problems with convergence guarantees. Extensive experiments using both real and synthetic datasets, even though with only a few observations, verified both the efficiency and effectiveness of our methods.
In this work we discuss a novel model prior probability for variable selection in linear regression. The idea is to determine the prior mass in an objective sense, by considering the worth of each of the possible regression models, given the number of covariates under consideration. Through a simulation study, we show that the proposed prior outperforms the uniform prior and the Scott \& Berger prior in a scenario of no prior knowledge about the size of the true regression models. We illustrate the use of the prior using two well-known data sets with, respectively, 15 and 4 covariates.
The report proposes a new approach for inverse reinforcement learning based on deep Gaussian process (GP), which is capable of learning complicated reward structures with few demonstrations. The model stacks multiple latent GP layers to learn abstract representations of the state feature space, which is linked to the demonstrations through the Maximum Entropy learning framework. As analytic derivation of the model evidence is prohibitive due to the nonlinearity of latent variables, variational inference is employed for approximate inference, based on a special choice of variational distributions. This guards the model from over training, achieving the Automatic Occam’s Razor. Experiments on the benchmark test, i.e., object world, as well as a new setup, i.e., binary world, are performed, where the proposed method outperforms state-of-the-art approaches.
Vehicles are becoming more and more connected, this opens up a larger attack surface which not only affects the passengers inside vehicles, but also people around them. These vulnerabilities exist because modern systems are built on the comparatively less secure and old CAN bus framework which lacks even basic authentication. Since a new protocol can only help future vehicles and not older vehicles, our approach tries to solve the issue as a data analytics problem and use machine learning techniques to secure cars. We develop a Hidden Markov Model to detect anomalous states from real data collected from vehicles. Using this model, while a vehicle is in operation, we are able to detect and issue alerts. Our model could be integrated as a plug-n-play device in all new and old cars.
Stochastic gradient Markov chain Monte Carlo (SG-MCMC) methods are Bayesian analogs to popular stochastic optimization methods; however, this connection is not well studied. We explore this relationship by applying simulated annealing to an SGMCMC algorithm. Furthermore, we extend recent SG-MCMC methods with two key components: i) adaptive preconditioners (as in ADAgrad or RMSprop), and ii) adaptive element-wise momentum weights. The zero-temperature limit gives a novel stochastic optimization method with adaptive element-wise momentum weights, while conventional optimization methods only have a shared, static momentum weight. Under certain assumptions, our theoretical analysis suggests the proposed simulated annealing approach converges close to the global optima. Experiments on several deep neural network models show state-of-the-art results compared to related stochastic optimization algorithms.
The histogram method is a powerful non-parametric approach for estimating the probability density function of a continuous variable. But the construction of a histogram, compared to the parametric approaches, demands a large number of observations to capture the underlying density function. Thus it is not suitable for analyzing a sparse data set, a collection of units with a small size of data. In this paper, by employing the probabilistic topic model, we develop a novel Bayesian approach to alleviating the sparsity problem in the conventional histogram estimation. Our method estimates a unit’s density function as a mixture of basis histograms, in which the number of bins for each basis, as well as their heights, is determined automatically. The estimation procedure is performed by using the fast and easy-to-implement collapsed Gibbs sampling. We apply the proposed method to synthetic data, showing that it performs well.