On Asymptotic Optimality in Sequential Changepoint Detection: Non-iid Case

We consider a sequential Bayesian changepoint detection problem for a general stochastic model, assuming that the observed data may be dependent and non-identically distributed and that the prior distribution of the change point is arbitrary, not necessarily geometric. Tartakovsky and Veeravalli (2004) developed a general asymptotic theory of changepoint detection in the non-iid case in discrete time, and Baron and Tartakovsky (2006) did so in continuous time, assuming a certain stability of the log-likelihood ratio process. This stability property was formulated in terms of the r-quick convergence of the normalized log-likelihood ratio process to a positive and finite number, which can be interpreted as the limiting Kullback-Leibler information between the ‘change’ and ‘no change’ hypotheses. In these papers, it was conjectured that the r-quick convergence can be relaxed to r-complete convergence, which is typically much easier to verify in particular examples. In the present paper, we justify this conjecture by showing that the Shiryaev change detection procedure is nearly optimal, minimizing asymptotically (as the probability of false alarm vanishes) the moments of the delay to detection up to order r whenever r-complete convergence holds. We also study asymptotic properties of the Shiryaev-Roberts detection procedure in the Bayesian context.
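
As a rough illustration of the Shiryaev-Roberts statistic mentioned above, the following Python sketch runs its standard recursion R_n = (1 + R_{n-1}) * LR_n with R_0 = 0 and raises an alarm once R_n reaches a threshold. The iid Gaussian mean-shift model, the function name `shiryaev_roberts_stop`, and the threshold value are assumptions chosen for illustration only; the paper itself treats general non-iid models and arbitrary priors.

```python
import math


def shiryaev_roberts_stop(xs, mu, threshold):
    """Return the 1-based index of the first alarm, or None if none is raised.

    Assumed toy model (not from the paper): observations are N(0, 1) before
    the change and N(mu, 1) after it, independent across time.
    """
    r = 0.0  # R_0 = 0
    for n, x in enumerate(xs, start=1):
        # Likelihood ratio of N(mu, 1) against N(0, 1) for one observation.
        lr = math.exp(mu * x - 0.5 * mu * mu)
        # Shiryaev-Roberts recursion.
        r = (1.0 + r) * lr
        if r >= threshold:
            return n
    return None


# Hypothetical data: five pre-change values at 0, then values at the new mean.
data = [0.0] * 5 + [2.0] * 10
print(shiryaev_roberts_stop(data, mu=2.0, threshold=100.0))  # prints 8
```

In this toy input the change occurs after the fifth observation and the alarm is raised at the eighth, so the detection delay is three samples. A larger threshold lowers the false alarm rate at the cost of a longer delay.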

A Sensitivity Analysis of (and Practitioners’ Guide to) Convolutional Neural Networks for Sentence Classification

Convolutional Neural Networks (CNNs) have recently achieved remarkably strong performance on sentence classification tasks (Kim, 2014; Kalchbrenner et al., 2014; Wang et al., 2015). However, these models require practitioners to specify the exact model architecture and accompanying hyper-parameters, e.g., the choice of filter region size, regularization parameters, and so on. It is currently unknown how sensitive model performance is to changes in these configurations for the task of sentence classification. We thus conduct an empirical sensitivity analysis of one-layer CNNs to explore the effect of each part of the architecture on the performance; our aim is to assess the robustness of the model and to distinguish between important and comparatively inconsequential design decisions for sentence classification. We focus on one-layer CNNs (to the exclusion of more complex models) due to their comparative simplicity and strong empirical performance (Kim, 2014). We derive practical advice from our extensive empirical results for those interested in getting the most out of CNNs for sentence classification.

Statistical Matching using Fractional Imputation

Statistical matching is a technique for integrating two or more data sets when the information available for matching records for individual participants across data sets is incomplete. Statistical matching can be viewed as a missing data problem where a researcher wants to perform a joint analysis of variables that are never jointly observed. A conditional independence assumption is often used to create imputed data for statistical matching. We consider an alternative approach to statistical matching without using the conditional independence assumption. We apply the parametric fractional imputation of Kim (2011) to create imputed data, using an instrumental variable assumption to identify the joint distribution. We also present variance estimators appropriate for the imputation procedure. We explain how the method applies directly to the analysis of data from split questionnaire designs and measurement error models.

A Scalable Empirical Bayes Approach to Variable Selection

We develop a model-based empirical Bayes approach to variable selection problems in which the number of predictors is very large, possibly much larger than the number of responses (the so-called ‘large p, small n’ problem). We consider the multiple linear regression setting, where the response is assumed to be a continuous variable that is a linear function of the predictors plus error. The explanatory variables in the linear model can have a positive effect on the response, a negative effect, or no effect. We model the effects of the linear predictors as a three-component mixture in which a key assumption is that only a small (unknown) fraction of the candidate predictors have a non-zero effect on the response variable. By treating the coefficients as random effects, we develop an approach that is computationally efficient because the number of parameters that have to be estimated is small and remains constant regardless of the number of explanatory variables. The model parameters are estimated using the EM algorithm, which is scalable and converges significantly faster than simulation-based methods.

Kernel spectral clustering of large dimensional data

This article proposes a first analysis of kernel spectral clustering methods in the regime where the dimension p of the data vectors to be clustered and their number n grow large at the same rate. We demonstrate, under a k-class Gaussian mixture model, that the normalized Laplacian matrix associated with the kernel matrix asymptotically behaves similarly to a so-called spiked random matrix. Some of the isolated eigenvalue-eigenvector pairs in this model are shown to carry the clustering information upon a separability condition classical in spiked matrix models. We evaluate precisely the position of these eigenvalues and the content of the eigenvectors, which unveil important properties concerning spectral clustering, in particular in simple toy models. Our results are then compared to the practical clustering of images from the MNIST database, thereby revealing an important match between theory and practice.
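
To make the idea of eigenvectors carrying the clustering information concrete, here is a minimal Python sketch of two-class kernel spectral clustering on a tiny toy data set. It is only an illustration under stated assumptions: the Gaussian kernel, the normalization D^{-1/2} K D^{-1/2} (one common choice, which may differ from the paper's by scaling), the power-iteration solver, the function name `spectral_bipartition`, and the sample points are all hypothetical and not taken from the article.

```python
import math


def spectral_bipartition(points, sigma=1.0, iters=200):
    """Split points into two groups using the second eigenvector of
    M = D^{-1/2} K D^{-1/2}, where K is an assumed Gaussian kernel matrix."""
    n = len(points)
    # Gaussian kernel K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)).
    K = [[math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)) / (2 * sigma ** 2))
          for q in points] for p in points]
    deg = [sum(row) for row in K]  # diagonal of the degree matrix D
    M = [[K[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    # The leading eigenvector of M is known in closed form:
    # u is proportional to D^{1/2} 1, with eigenvalue 1.
    norm = math.sqrt(sum(deg))
    u = [math.sqrt(d) / norm for d in deg]
    # Power iteration on the orthogonal complement of u converges to the
    # second eigenvector, whose signs separate the two groups.
    v = [float(i + 1) for i in range(n)]  # arbitrary deterministic start
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(u, v))
        v = [a - dot * b for a, b in zip(v, u)]          # remove u component
        v = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        scale = math.sqrt(sum(a * a for a in v))
        v = [a / scale for a in v]
    return [1 if a > 0 else 0 for a in v]


# Hypothetical toy data: four points near (0, 0) and four near (3, 3).
toy = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.3), (0.1, -0.2),
       (3.0, 3.0), (3.2, 2.9), (2.8, 3.1), (3.1, 3.3)]
labels = spectral_bipartition(toy)
print(labels[:4], labels[4:])
```

The two printed groups receive different labels; which group gets 0 and which gets 1 depends on the sign of the computed eigenvector and carries no meaning. The large-dimensional regime studied in the article is precisely where such a naive reading of the eigenvectors needs the careful analysis the authors provide.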

Heteroscedasticity Testing for Regression Models: A Dimension Reduction-based Model Adaptive

Heteroscedasticity testing is important in regression analysis. Existing local smoothing tests suffer severely from the curse of dimensionality, even when the number of covariates is moderate, because of their use of nonparametric estimation. In this paper, a dimension reduction-based model adaptive test is proposed which behaves like a local smoothing test as if the number of covariates were equal to the number of their linear combinations in the mean regression function, in particular, equal to 1 when the mean function contains a single index. The test statistic is asymptotically normal under the null hypothesis, so that critical values are easily determined. The finite sample performance of the test is examined by simulations and a real data analysis.

Adopting Robustness and Optimality in Fitting and Learning

Search for the Heisenberg spin glass on rewired square lattices with antiferromagnetic interaction

Quantitative approximation schemes for glasses

Complex Politics: A Quantitative Semantic and Topological Analysis of UK House of Commons Debates

Gene network reconstruction using global-local shrinkage priors

Improved Deep Learning Baselines for Ubuntu Corpus Dialogs

Fast graph operations in quantum computation

Number of rational points of symmetric complete intersections over a finite field and applications

Many Body Localization Transition in the strong disorder limit: entanglement entropy from the statistics of rare extensive resonances

Hybrid Dialog State Tracker

Effects of disorder in Graphene and T-Graphene: an augmented space approach

On the discretization in time of the stochastic Allen-Cahn equation

Bounds for the multivariate normal approximation of the maximum likelihood estimator

On the lattice of antichains of finite intervals

Model distances for vine copulas in high dimensions with application to testing the simplifying assumption

Segregating Markov chains

Optimal detection of multi-sample aligned sparse signals

All or Nothing at All

Mesoscopic fluctuations for unitary invariant ensembles

Existence of invariant densities for semiflows with jumps

Maximum Loss of Certain Lévy Processes

Elastic regularization in restricted Boltzmann machines: Dealing with $p\gg N$

A note on the best attainable rates of convergence for estimates of the shape parameter of regular variation

An invariance principle for stochastic series I. Gaussian limits

On the Complexity of Rainbow Coloring Problems

Relative Cayley graphs of finite groups

A language model based approach towards large scale and lightweight language identification systems

Transport estimates for random measures in dimension one

Correcting the estimator for the mean vectors in a multivariate errors-in-variables regression model

Excluding a full grid minor

Self 2-distance graphs

UAVs using Bayesian Optimization to Locate WiFi Devices

Dual Control for Approximate Bayesian Reinforcement Learning

Coupling Importance Sampling and Multilevel Monte Carlo using Sample Average Approximation

Time Asymptotics for a Critical Case in Fragmentation and Growth-Fragmentation Equations

Splitting and time reversal for Markov additive processes

List coloring digraphs

Linear-Vertex Kernel for the Problem of Packing $r$-Stars into a Graph without Long Induced Paths

A note on stochastic Navier-Stokes equations with not regular multiplicative noise

A progressive mesh method for physical simulations using lattice Boltzmann method on single-node multi-gpu architectures

The Amplituhedron and the One-loop Grassmannian Measure

Dimension reduction-based significance testing in nonparametric regression

$\ell_1$-regularized Neural Networks are Improperly Learnable in Polynomial Time

Large deviation principle of occupation measure for stochastic real Ginzburg-Landau equation driven by $\alpha$-stable noises

Bridge Correlational Neural Networks for Multilingual Multimodal Representation Learning

A Multilevel Coordinate Search Algorithm for Well Placement, Control and Joint Optimization

Default Bayesian Analysis with Global-Local Shrinkage Priors

The intrinsic value of HFO features as a biomarker of epileptic activity

Estimates of the coverage of parameter space by Latin Hypercube and Orthogonal sampling: connections between Populations of Models and Experimental Designs

Variations on a theme of Kasteleyn, with application to the totally nonnegative Grassmannian

Spacing Distribution of a Bernoulli Sampled Sequence

Consistent Estimation of Low-Dimensional Latent Structure in High-Dimensional Data

Homotopy type of intervals of the second higher Bruhat orders

Incidences between planes over finite fields

Spectral analysis of the Gram matrix of mixture models

Stability and Turán numbers of a class of hypergraphs via Lagrangians