Hierarchical Latent Word Clustering

This paper presents a new Bayesian non-parametric model that extends Hierarchical Dirichlet Allocation to extract tree-structured word clusters from text data. The inference algorithm of the model collects words in a cluster if they share a similar distribution over documents. In our experiments, we observed meaningful hierarchical structures on the NIPS corpus and on radiology reports collected from public repositories.


Model-Coupled Autoencoder for Time Series Visualisation

We present an approach for the visualisation of a set of time series that combines an echo state network with an autoencoder. For each time series in the dataset we train an echo state network, using a common and fixed reservoir of hidden neurons, and use the optimised readout weights as the new representation. Dimensionality reduction is then performed via an autoencoder on the readout weight representations. The crux of the work is to equip the autoencoder with a loss function that correctly interprets the reconstructed readout weights by associating them with a reconstruction error measured in the data space of sequences. This amounts to measuring the predictive performance that the reconstructed readout weights exhibit on their corresponding sequences when plugged back into the echo state network with the same fixed reservoir. We demonstrate that the proposed visualisation framework can deal with both real-valued and binary sequences. We derive magnification factors in order to analyse distance preservation and distortion in the visualisation space. The versatility and advantages of the proposed method are demonstrated on datasets of time series that originate from diverse domains.
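
The readout-weight representation and the data-space error described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the tanh reservoir update, the one-step-ahead prediction target, and the ridge-regression fit are all choices made here for the example, and the autoencoder itself is omitted.

```python
import numpy as np

def make_reservoir(n_hidden, seed=0, spectral_radius=0.9):
    """Random reservoir, created once and shared by every series (assumed setup)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n_hidden, n_hidden))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    w_in = rng.standard_normal(n_hidden)
    return W, w_in

def reservoir_states(series, W, w_in):
    """Drive the fixed reservoir with a scalar series; one state per input step."""
    x = np.zeros(W.shape[0])
    states = []
    for y in series[:-1]:
        x = np.tanh(W @ x + w_in * y)
        states.append(x)
    return np.array(states)  # shape (T-1, n_hidden)

def fit_readout(series, W, w_in, ridge=1e-4):
    """Ridge-regression readout predicting the next value from the current state."""
    X = reservoir_states(series, W, w_in)
    targets = np.asarray(series[1:], dtype=float)
    A = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ targets)

def prediction_error(readout, series, W, w_in):
    """Data-space error: how well a (possibly reconstructed) readout predicts the series."""
    X = reservoir_states(series, W, w_in)
    targets = np.asarray(series[1:], dtype=float)
    return float(np.mean((X @ readout - targets) ** 2))
```

In this sketch each series is summarised by the vector returned from `fit_readout`, and an autoencoder trained on those vectors would score a reconstructed vector with `prediction_error` on the original series rather than with a distance between weight vectors.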


Incremental Spectral Sparsification for Large-Scale Graph-Based Semi-Supervised Learning

While the harmonic function solution performs well in many semi-supervised learning (SSL) tasks, it is known to scale poorly with the number of samples. Recent successful and scalable methods, such as the eigenfunction method, focus on efficiently approximating the whole spectrum of the graph Laplacian constructed from the data. This is in contrast to various subsampling and quantization methods proposed in the past, which may fail to preserve the graph spectra. However, the impact of the approximation of the spectrum on the final generalization error is either unknown or requires strong assumptions on the data. In this paper, we introduce Sparse-HFS, an efficient edge-sparsification algorithm for SSL. By constructing an edge-sparse and spectrally similar graph, we are able to leverage the approximation guarantees of spectral sparsification methods to bound the generalization error of Sparse-HFS. As a result, we obtain a theoretically grounded approximation scheme for graph-based SSL that also empirically matches the performance of known large-scale methods.
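
For orientation, the harmonic function solution mentioned above can be sketched in its textbook dense form: with graph Laplacian L = D - W split into labeled (l) and unlabeled (u) blocks, the unlabeled scores solve L_uu f_u = W_ul f_l. This is not Sparse-HFS and involves no sparsification; the function name and the dense NumPy solve are assumptions made for the example.

```python
import numpy as np

def harmonic_solution(W, labeled_idx, labels):
    """Textbook harmonic function solution on a weighted graph.

    W           : (n, n) symmetric non-negative weight matrix
    labeled_idx : indices of labeled nodes
    labels      : real-valued labels for those nodes
    Returns a length-n score vector that equals `labels` on labeled nodes.
    """
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W  # unnormalised graph Laplacian
    labeled_idx = np.asarray(labeled_idx)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)

    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]
    f_l = np.asarray(labels, dtype=float)

    f = np.zeros(n)
    f[labeled_idx] = f_l
    # Dense solve: cubic in the number of unlabeled nodes.
    f[unlabeled_idx] = np.linalg.solve(L_uu, W_ul @ f_l)
    return f
```

On a four-node path graph with unit weights and end labels 0 and 1, this interpolates linearly across the two interior nodes. The dense solve is what makes the plain method expensive as the number of samples grows.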


Using quantum theory to reduce the complexity of input-output processes

RepEx: A Flexible Framework for Scalable Replica Exchange Molecular Dynamics Simulations

Exact extreme value statistics at mixed order transitions

Optimal Quadrature Subsampling for Least Squares Polynomial Approximations

Arithmetics and combinatorics of tropical Severi varieties of univariate polynomials

Optimal Composition Ordering Problems for Piecewise Linear Functions

The Local Cut Lemma

Pell’s equation and series expansions for irrational numbers

Data-driven Rank Breaking for Efficient Rank Aggregation

A second proof of the Shareshian–Wachs conjecture, by way of a new Hopf algebra

Perfect Sampling of Generalized Jackson Network

Depinning as a coagulation process

Product space for two processes with independent increments under nonlinear expectations

Single- and Multi-level Network Sparsification by Algebraic Distance

High-Field Limit from a Stochastic BGK Model to a Scalar Conservation Law with Stochastic Forcing

A New Approach for Testing Properties of Discrete Distributions

Goodness-of-fit tests for extended Log-GARCH models and specification tests against the EGARCH

Thrackles containing a standard musquash

A deviation bound for $\alpha$-dependent sequences with applications to intermittent maps

Efficient parameter inference in general hidden Markov models using the filter derivatives

Mean Field Dynamics of a Network of Wilson-Cowan Neurons with Electrical Synapses

q-series and tails of colored Jones polynomials

Treeable Graphings Are Local Limits of Finite Graphs

Regularization and the small-ball method I: sparse recovery

Efficient Processing of Very Large Graphs in a Small Cluster

The quenched asymptotics for nonlocal Schrödinger operators with Poissonian potentials

The graphs with all but two eigenvalues equal to $-2$ or $0$

On the Ritt property and weak type maximal inequalities for convolution powers on $\ell^1(\mathbb{Z})$

Polynomials with palindromic and unimodal coefficients

The Continuous Configuration Model: A Null for Community Detection on Weighted Networks

A Repulsive-Attractive Metropolis Algorithm for Multimodality

Network-inspired design of broadband materials with reduced dimensionality

Total positivity of Riordan arrays

Real-space renormalization for the finite temperature statics and dynamics of the Dyson Long-Ranged Ferromagnetic and Spin-Glass models

Total positivity of recursive matrices

On Structured Sparsity of Phonological Posteriors for Linguistic Parsing

A note on the central limit theorem for a one-sided reflected Ornstein-Uhlenbeck process

Ext and Tor on two-dimensional cyclic quotient singularities

Semiparametric stationarity and fractional unit roots tests based on data-driven multidimensional increment ratio statistics

Optimal exponential bounds for aggregation of estimators for the Kullback-Leibler loss

Fluid Dynamics Modeling: The Numerical Solution Of 2D Navier Hyperbolic Equations

The Critical Domain Size of Stochastic Population Models

On the maximum likelihood estimator for the Generalized Extreme-Value distribution

The diffeomorphism type of small hyperplane arrangements is combinatorially determined

A conformally invariant growth process of SLE excursions

Asymptotics of the Karhunen-Loeve expansion for the fractional Brownian motion

Basker: A Threaded Sparse LU Factorization Utilizing Hierarchical Parallelism and Data Layouts

Local universality for real roots of random trigonometric polynomials

Strong convergence rates for nonlinearity-truncated Euler-type approximations of stochastic Ginzburg-Landau equations

A New Pivot Selection Algorithm for Symmetric Indefinite Factorization Arising in Quadratic Programming with Block Constraint Matrices

Cores, joins and the Fano-flow conjectures

A Confidence-Based Approach for Balancing Fairness and Accuracy

Adaptive confidence sets in shape restricted regression

Syntax-Semantics Interaction Parsing Strategies. Inside SYNTAGMA

Higher spin six vertex model and symmetric rational functions

Local community detection by seed expansion: from conductance to weighted kernel 1-mean optimization

Automatic Matching of Bullet Lands