Deep Learning for Single-View Instance Recognition

Deep learning methods have typically been trained on large datasets in which many training examples are available. However, many real-world product datasets have only a small number of images available for each product. We explore the use of deep learning methods for recognizing object instances when we have only a single training example per class. We show that feedforward neural networks outperform state-of-the-art methods for recognizing objects from novel viewpoints even when trained from just a single image per object. To further improve our performance on this task, we propose to take advantage of a supplementary dataset in which we observe a separate set of objects from multiple viewpoints. We introduce a new approach for training deep learning methods for instance recognition with limited training data, in which we use an auxiliary multi-view dataset to train our network to be robust to viewpoint changes. We find that this approach leads to a more robust classifier for recognizing objects from novel viewpoints, outperforming previous state-of-the-art approaches including keypoint-matching, template-based techniques, and sparse coding.

Exponential bounds for the hypergeometric distribution

We establish exponential bounds for the hypergeometric distribution which include a finite sampling correction factor, but are otherwise analogous to bounds for the binomial distribution due to Le\’on and Perron (2003) and Talagrand (1994). We also establish a convex ordering for sampling without replacement from populations of real numbers between zero and one: a population of all zeros or ones (and hence yielding a hypergeometric distribution in the upper bound) gives the extreme case.

Nonlinear stability and ergodicity of ensemble based Kalman filters

The ensemble Kalman filter (EnKF) and ensemble square root filter (ESRF) are data assimilation methods used to combine high dimensional, nonlinear dynamical models with observed data. Despite their widespread usage in climate science and oil reservoir simulation, very little is known about the long-time behavior of these methods and why they are effective when applied with modest ensemble sizes in large dimensional turbulent dynamical systems. By following the basic principles of energy dissipation and controllability of filters, this paper establishes a simple, systematic and rigorous framework for the nonlinear analysis of EnKF and ESRF with arbitrary ensemble size, focusing on the dynamical properties of boundedness and geometric ergodicity. The time uniform boundedness guarantees that the filter estimate will not diverge to machine infinity in finite time, which is a potential threat for EnKF and ESQF known as the catastrophic filter divergence. Geometric ergodicity ensures in addition that the filter has a unique invariant measure and that initialization errors will dissipate exponentially in time. We establish these results by introducing a natural notion of observable energy dissipation. The time uniform bound is achieved through a simple Lyapunov function argument, this result applies to systems with complete observations and strong kinetic energy dissipation, but also to concrete examples with incomplete observations. With the Lyapunov function argument established, the geometric ergodicity is obtained by verifying the controllability of the filter processes; in particular, such analysis for ESQF relies on a careful multivariate perturbation analysis of the covariance eigen-structure.

Nonlinear stability of the ensemble Kalman filter with adaptive covariance inflation

The Ensemble Kalman filter and Ensemble square root filters are data assimilation methods used to combine high dimensional nonlinear models with observed data. These methods have proved to be indispensable tools in science and engineering as they allow computationally cheap, low dimensional ensemble state approximation for extremely high dimensional turbulent forecast models. From a theoretical perspective, these methods are poorly understood, with the exception of a recently established but still incomplete nonlinear stability theory. Moreover, recent numerical and theoretical studies of catastrophic filter divergence have indicated that stability is a genuine mathematical concern and can not be taken for granted in implementation. In this article we propose a simple modification of ensemble based methods which resolves these stability issues entirely. The method involves a new type of adaptive covariance inflation, which comes with minimal additional cost. We develop a complete nonlinear stability theory for the adaptive method, yielding Lyapunov functions and geometric ergodicity under weak assumptions. We present numerical evidence which suggests the adaptive methods have improved accuracy over standard methods and completely eliminate catastrophic filter divergence. This enhanced stability allows for the use of extremely cheap, unstable forecast integrators, which would otherwise lead to widespread filter malfunction.

Distributed Mini-Batch SDCA

We present an improved analysis of mini-batched stochastic dual coordinate ascent for regularized empirical loss minimization (i.e. SVM and SVM-type objectives). Our analysis allows for flexible sampling schemes, including where data is distribute across machines, and combines a dependence on the smoothness of the loss and/or the data spread (measured through the spectral norm).

Interacting partially directed self avoiding walk : scaling limits

This paper is dedicated to the investigation of a 1+1 dimensional self-interacting and partially directed self-avoiding walk, usually referred to by the acronym IPDSAW and introduced in \cite{ZL68} by Zwanzig and Lauritzen to study the collapse transition of an homopolymer dipped in a poor solvant. In \cite{POBG93}, physicists displayed numerical results concerning the typical growth rate of some geometric features of the path as its length L diverges. From this perspective the quantities of interest are the projections of the path onto the horizontal axis (also called horizontal extension) and onto the vertical axis for which it is useful to define the lower and the upper envelopes of the path. With the help of a new random walk representation, we proved in \cite{CNGP13} that the path grows horizontally like \sqrt{L} in its collapsed regime and that, once rescaled by \sqrt{L} vertically and horizontally, its upper and lower envelopes converge to some deterministic Wulff shapes. In the present paper, we bring the geometric investigation of the path several steps further. In the extended regime, we prove a law of large number for the horizontal extension of the polymer rescaled by its total length L, we provide a precise asymptotics of the partition function and we show that its lower and upper envelopes, once rescaled in time by L and in space by \sqrt{L}, converge to the same Brownian motion. At criticality, we identify the limiting distribution of the horizontal extension rescaled by L^{2/3} and we show that the excess partition function decays as L^{2/3} with an explicit prefactor. In the collapsed regime, we identify the joint limiting distribution of the fluctuations of the upper and lower envelopes around their associated limiting Wulff shapes, rescaled in time by \sqrt{L} and in space by L^{1/4}.

On the pseudoachromatic index of the complete graph III

Let \Pi_q be the projective plane of order q , let \psi(m):=\psi(L(K_m)) the pseudoachromatic number of the complete line graph of order m , let a\in \{ 3,4,\dots,\tfrac{q}{2}+1 \} and m_a=(q+1)^2-a . In this paper, we improve the upper bound of \psi(m) given by Araujo-Pardo et al. [J Graph Theory 66 (2011), 89–97] and Jamison [Discrete Math. 74 (1989), 99–115] in the following values: if x\geq 2 is an integer and m\in \{4x^2-x,\dots,4x^2+3x-3\} then \psi(m) \leq 2x(m-x-1). On the other hand, if q is even and there exists \Pi_q we give a complete edge-colouring of K_{m_a} with (m_a-a)q colours. Moreover, using this colouring we extend the previous results for a=\{-1,0,1,2\} given by Araujo-Pardo et al. in [J Graph Theory 66 (2011), 89–97] and [Bol. Soc. Mat. Mex. (2014) 20:17–28] proving that \psi(m_a)=(m_a-a)q for a\in \{3,4,\dots,\left\lceil \frac{1+\sqrt{4q+9}}{2}\right\rceil -1 \} .

Inspection games in a mean field setting

In this paper, we present a new development of inspection games in a mean field setting. In our dynamic version of an inspection game, there is one inspector and a large number N interacting inspectees with a finite state space. By applying the mean field game methodology, we present a solution as an epsilon-equilibrium to this type of inspection games, where epsilon goes to 0 as N tends to infinity. In order to facilitate numerical analysis of this new type inspection game, we conduct an approximation analysis, that is we approximate the optimal Lipschitz continuous switching strategies by smooth switching strategies. We show that any approximating smooth switching strategy is also an epsilon-equilibrium solution to the inspection game with a large and finite number N of inspectees with epsilon being of order 1/N.

How Data Volume Affects Spark Based Data Analytics on a Scale-up Server

Sheer increase in volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark is gaining popularity for exhibiting superior scale-out performance on the commodity machines, the impact of data volume on the performance of Spark based data analytics in scale-up configuration is not well understood. We present a deep-dive analysis of Spark based applications on a large scale-up server machine. Our analysis reveals that Spark based data analytics are DRAM bound and do not benefit by using more than 12 cores for an executor. By enlarging input data size, application performance degrades significantly due to substantial increase in wait time during I/O operations and garbage collection, despite 10\% better instruction retirement rate (due to lower L1 cache misses and higher core utilization). We match memory behaviour with the garbage collector to improve performance of applications between 1.6x to 3x.

Approximating Dense Max 2-CSPs

In this paper, we present a polynomial-time algorithm that approximates sufficiently high-value Max 2-CSPs on sufficiently dense graphs to within O(N^{\varepsilon}) approximation ratio for any constant \varepsilon > 0. Using this algorithm, we also achieve similar results for free games, projection games on sufficiently dense random graphs, and the Densest k-Subgraph problem with sufficiently dense optimal solution. Note, however, that algorithms with similar guarantees to the last algorithm were in fact discovered prior to our work by Feige et al. and Suzuki and Tokuyama. In addition, our idea for the above algorithms yields the following by-product: a quasi-polynomial time approximation scheme (QPTAS) for satisfiable dense Max 2-CSPs with better running time than the known algorithms.

Large Covariance Estimation through Elliptical Factor Models

We proposed a general Principal Orthogonal complEment Thresholding (POET) framework for large-scale covariance matrix estimation based on an approximate factor model. A set of high level sufficient conditions for the procedure to achieve optimal rates of convergence under different matrix norms were brought up to better understand how POET works. Such a framework allows us to recover the results for sub-Gaussian in a more transparent way that only depends on the concentration properties of the sample covariance matrix. As a new theoretical contribution, for the first time, such a framework allows us to exploit conditional sparsity covariance structure for the heavy-tailed data. In particular, for the elliptical data, we proposed a robust estimator based on marginal and multivariate Kendall’s tau to satisfy these conditions. In addition, conditional graphical model was also studied under the same framework. The technical tools developed in this paper are of general interest to high dimensional principal component analysis. Thorough numerical results were also provided to back up the developed theory.

VMF-SNE: Embedding for Spherical Data

T-SNE is a well-known approach to embedding high-dimensional data and has been widely used in data visualization. The basic assumption of t-SNE is that the data are non-constrained in the Euclidean space and the local proximity can be modelled by Gaussian distributions. This assumption does not hold for a wide range of data types in practical applications, for instance spherical data for which the local proximity is better modelled by the von Mises-Fisher (vMF) distribution instead of the Gaussian. This paper presents a vMF-SNE embedding algorithm to embed spherical data. An iterative process is derived to produce an efficient embedding. The results on a simulation data set demonstrated that vMF-SNE produces better embeddings than t-SNE for spherical data.

Beyond the Valley of the Covariance Function

Discussion of ‘Cross-Covariance Functions for Multivariate Geostatistics’ by Genton and Kleiber [arXiv:1507.08017].

The Submodular Secretary Problem Goes Linear

During the last decade, the matroid secretary problem (MSP) became one of the most prominent classes of online selection problems. Partially linked to its numerous applications in mechanism design, substantial interest arose also in the study of nonlinear versions of MSP, with a focus on the submodular matroid secretary problem (SMSP). So far, O(1)-competitive algorithms have been obtained for SMSP over some basic matroid classes. This created some hope that, analogously to the matroid secretary conjecture, one may even obtain O(1)-competitive algorithms for SMSP over any matroid. However, up to now, most questions related to SMSP remained open, including whether SMSP may be substantially more difficult than MSP; and more generally, to what extend MSP and SMSP are related. Our goal is to address these points by presenting general black-box reductions from SMSP to MSP. In particular, we show that any O(1)-competitive algorithm for MSP, even restricted to a particular matroid class, can be transformed in a black-box way to an O(1)-competitive algorithm for SMSP over the same matroid class. This implies that the matroid secretary conjecture is equivalent to the same conjecture for SMSP. Hence, in this sense SMSP is not harder than MSP. Also, to find O(1)-competitive algorithms for SMSP over a particular matroid class, it suffices to consider MSP over the same matroid class. Using our reductions we obtain many first and improved O(1)-competitive algorithms for SMSP over various matroid classes by leveraging known algorithms for MSP. Moreover, our reductions imply an O(loglog(rank))-competitive algorithm for SMSP, thus, matching the currently best asymptotic algorithm for MSP, and substantially improving on the previously best O(log(rank))-competitive algorithm for SMSP.

On the Flexibility of Multivariate Covariance Models: Comment on the Paper by Genton and Kleiber

Discussion of ‘Cross-Covariance Functions for Multivariate Geostatistics’ by Genton and Kleiber [arXiv:1507.08017].

A critique of Birnbaum’s counter-example to the likelihood principle: What happened did not have to happen

The likelihood principle (LP) is controversial in part due to claimed counter-examples. Birnbaum’s counter-example is important because of his role in the development of the LP and because it leads to a conclusion that adherence to the LP would always lead to favouring of a false, or at least uninteresting, hypothesis, specifically the determinist hypothesis that things had to turn out the way they did. The example has a single observation is drawn from a population with unknown mean and either high spread or zero spread, and Birnbaum’s analysis shows that no matter what value is observed the LP will suggest that there is strong support for the hypothesis that the spread is zero. That conflicts with the intuitive account wherein no single observation contains information relevant to the spread parameter and so no evidence-based preference between the hypothesised spread values should be possible. Two objections to the alleged counter-example are provided. First, the statistical model of Birnbaum’s example contains two parameters, mean and spread, but only spread is of interest, and the analysis allows the mean parameter to dominates the likelihood function and thus it allows the nuisance parameter to determine the result. An alternative analysis that eliminates the influence of the nuisance parameter gives a result that matches intuitive expectations. Second, Birnbaum’s model has two parameters and only one data point and so it is overfitted. Elimination of the overfitting by either parameter omission or a second datum eliminates the apparently counter-intuitive consequences of the LP. Thus it is demonstrated that Birnbaum’s example does not support the idea that the LP leads inevitably to the uninteresting determinist hypothesis to prevailing over all others. Birnbaum’s example fails as a general counter-example to the likelihood principle.

Tag-Weighted Topic Model For Large-scale Semi-Structured Documents

To date, there have been massive Semi-Structured Documents (SSDs) during the evolution of the Internet. These SSDs contain both unstructured features (e.g., plain text) and metadata (e.g., tags). Most previous works focused on modeling the unstructured text, and recently, some other methods have been proposed to model the unstructured text with specific tags. To build a general model for SSDs remains an important problem in terms of both model fitness and efficiency. We propose a novel method to model the SSDs by a so-called Tag-Weighted Topic Model (TWTM). TWTM is a framework that leverages both the tags and words information, not only to learn the document-topic and topic-word distributions, but also to infer the tag-topic distributions for text mining tasks. We present an efficient variational inference method with an EM algorithm for estimating the model parameters. Meanwhile, we propose three large-scale solutions for our model under the MapReduce distributed computing platform for modeling large-scale SSDs. The experimental results show the effectiveness, efficiency and the robustness by comparing our model with the state-of-the-art methods in document modeling, tags prediction and text classification. We also show the performance of the three distributed solutions in terms of time and accuracy on document modeling.

Capturing Multivariate Spatial Dependence: Model, Estimate and then Predict

Physical processes rarely occur in isolation, rather they influence and interact with one another. Thus, there is great benefit in modeling potential dependence between both spatial locations and different processes. It is the interaction between these two dependencies that is the focus of Genton and Kleiber’s paper under discussion. We see the problem of ensuring that any multivariate spatial covariance matrix is nonnegative definite as important, but we also see it as a means to an end. That ‘end’ is solving the scientific problem of predicting a multivariate field. [arXiv:1507.08017].

When Doesn’t Cokriging Outperform Kriging?

Although cokriging in theory should yield smaller or equal prediction variance than kriging, this outperformance sometimes is hard to see in practice. This should motivate theoretical studies on cokriging. In general, there is a lack of theoretical results for cokriging. In this work, we provide some theoretical results to compare cokriging with kriging by examining some explicit models and specific sampling schemes. [arXiv:1507.08017]

Rejoinder of ‘Cross-Covariance Functions for Multivariate Geostatistics’

Rejoinder of “Cross-Covariance Functions for Multivariate Geostatistics” by Genton and Kleiber [arXiv:1507.08017].

Convolution Preserves Partial Synchronicity of Log-concave Sequences

In a recent proof of the log-concavity of genus polynomials of some families of graphs, Gross et al. defined the weakly synchronicity relation between log-concave sequences, and conjectured that the convolution operation by any log-concave sequence preserves weakly synchronicity. We disprove it by providing a counterexample. Furthermore, we find the so-called partial synchronicity relation between log-concave sequences, which is (i) weaker than the synchronicity, (ii) stronger than the weakly synchronicity, and (iii) preserved by the convolution operation.

Robustness to outliers in location-scale parameter model using log-regularly varying distributions

Estimating the location and scale parameters is common in statistics, using, for instance, the well-known sample mean and standard deviation. However, inference can be contaminated by the presence of outliers if modeling is done with light-tailed distributions such as the normal distribution. In this paper, we study robustness to outliers in location-scale parameter models using both the Bayesian and frequentist approaches. We find sufficient conditions (e.g., on tail behavior of the model) to obtain whole robustness to outliers, in the sense that the impact of the outliers gradually decreases to nothing as the conflict grows infinitely. To this end, we introduce the family of log-Pareto-tailed symmetric distributions that belongs to the larger family of log-regularly varying distributions.

Metadata Embeddings for User and Item Cold-start Recommendations

I present a hybrid matrix factorisation model representing users and items as linear combinations of their content features’ latent factors. The model outperforms both collaborative and content-based models in cold-start or sparse interaction data scenarios (using both user and item metadata), and performs at least as well as a pure collaborative matrix factorisation model where interaction data is abundant. Additionally, feature embeddings produced by the model encode semantic information in a way reminiscent of word embedding approaches, making them useful for a range of related tasks such as tag recommendations.

Optimal designs for the proportional interference model

The interference model has been widely used and studied in block experiments where the treatment for a particular plot has effects on its neighbor plots. In this paper, we study optimal circular designs for the proportional interference model, in which the neighbor effects of a treatment are proportional to its direct effect. Kiefer’s equivalence theorems for estimating both the direct and total treatment effects are developed with respect to the criteria of A, D, E and T. Parallel studies are carried out for the undirectional model, where the neighbor effects do not depend on whether they are from the left or right. Moreover, the connection between optimal designs for the directional and undiretional models is built. Importantly, one can easily develop a computer program for finding optimal designs based on these theorems.

Optimal estimates for short horizon travel time prediction in urban areas

Increasing popularity of mobile route planning applications based on GPS technology provides opportunities for collecting traffic data in urban environments. One of the main challenges for travel time estimation and prediction in such a setting is how to aggregate data from vehicles that have followed different routes, and predict travel time for other routes of interest. One approach is to predict travel times for route segments, and sum those estimates to obtain a prediction for the whole route. We study how to obtain optimal predictions in this scenario. It appears that the optimal estimate, minimizing the expected mean absolute error, is a combination of the mean and the median travel times on each segment, where the combination function depends on the number of segments in the route of interest. We present a methodology for obtaining such predictions, and demonstrate its effectiveness with a case study using travel time data from a district of St. Petersburg collected over one year. The proposed methodology can be applied for real-time prediction of expected travel times in an urban road network.

Generalised and Quotient Models for Random And/Or Trees and Application to Satisfiability

This article is motivated by the following satisfiability question: pick uniformly at random an and/or Boolean expression of length n, built on a set of k_n Boolean variables. What is the probability that this expression is satisfiable? asymptotically when n tends to infinity? The model of random Boolean expressions developed in the present paper is the model of Boolean Catalan trees, already extensively studied in the literature for a constant sequence (k_n)_{n\geq 1}. The fundamental breakthrough of this paper is to generalise the previous results to any (reasonable) sequence of integers (k_n)_{n\geq 1}, which enables us, in particular, to solve the above satisfiability question. We also analyse the effect of introducing a natural equivalence relation on the set of Boolean expressions. This new ‘quotient’ model happens to exhibit a very interesting threshold (or saturation) phenomenon at k_n = n/ln n.

On The Chromatic Number of Matching Graphs

In an earlier paper, the present authors (2013) introduced the altermatic number of graphs and used Tucker’s Lemma, an equivalent combinatorial version of the Borsuk-Ulam Theorem, to show that the altermatic number is a lower bound for the chromatic number. A matching graph has the set of all matchings of a specified size of a graph as vertex set and two vertices are adjacent if the corresponding matchings are edge-disjoint. It is known that the Kneser graphs, the Schrijver graphs, and the permutation graphs can be represented by matching graphs. In this paper, as a generalization of the well-known result of Schrijver about the chromatic number of Schrijver graphs, we determine the chromatic number of a large family of matching graphs by specifying their altermatic number. In particular, we determine the chromatic number of these matching graphs in terms of the generalized Turan number of matchings.

A central limit theorem and a law of the iterated logarithm for the Biggins martingale of the supercritical branching random walk

Let (W_n(\theta))_{n\in\mathbb N_0} be the Biggins martingale associated with a supercritical branching random walk and denote by W_\infty(\theta) its limit. Assuming essentially that the martingale (W_n(2\theta))_{n\in\mathbb N_0} is uniformly integrable and that \text{Var} W_1(\theta) is finite, we prove a functional central limit theorem for the tail process (W_\infty(\theta) - W_{n+r}(\theta))_{r\in\mathbb N_0} and a law of the iterated logarithm for W_\infty(\theta)-W_n(\theta), as n\to\infty.

Image sets of fractional Brownian sheets

Let B^H = \{ B^H(t), t\in\mathbb{R}^N \} be an (N,d)-fractional Brownian sheet with Hurst index H=(H_1,\dotsc,H_N)\in (0,1)^N. The main objective of the present paper is to study the Hausdorff dimension of the image sets B^H(F+t), F\subset\mathbb{R}^N and t\in\mathbb{R}^N, in the dimension case d<\tfrac{1}{H_1}+\cdots+\tfrac{1}{H_N}. Following the seminal work of Kaufman (1989), we establish uniform dimensional properties on B^H, answering questions raised by Khoshnevisan et al (2006) and Wu and Xiao (2009). For the purpose of this work, we introduce a refinement of the sectorial local-nondeterminism property which can be of independent interest to the study of other fine properties of fractional Brownian sheets.

A Model for Foraging Ants, Controlled by Spiking Neural Networks and Double Pheromones

A model of an Ant System where ants are controlled by a spiking neural circuit and a second order pheromone mechanism in a foraging task is presented. A neural circuit is trained for individual ants and subsequently the ants are exposed to a virtual environment where a swarm of ants performed a resource foraging task. The model comprises an associative and unsupervised learning strategy for the neural circuit of the ant. The neural circuit adapts to the environment by means of classical conditioning. The initially unknown environment includes different types of stimuli representing food and obstacles which, when they come in direct contact with the ant, elicit a reflex response in the motor neural system of the ant: moving towards or away from the source of the stimulus. The ants are released on a landscape with multiple food sources where one ant alone would have difficulty harvesting the landscape to maximum efficiency. The introduction of a double pheromone mechanism yields better results than traditional ant colony optimization strategies. Traditional ant systems include mainly a positive reinforcement pheromone. This approach uses a second pheromone that acts as a marker for forbidden paths (negative feedback). This blockade is not permanent and is controlled by the evaporation rate of the pheromones. The combined action of both pheromones acts as a collective stigmergic memory of the swarm, which reduces the search space of the problem. This paper explores how the adaptation and learning abilities observed in biologically inspired cognitive architectures is synergistically enhanced by swarm optimization strategies. The model portraits two forms of artificial intelligent behaviour: at the individual level the spiking neural network is the main controller and at the collective level the pheromone distribution is a map towards the solution emerged by the colony.

Semiparametric GEE analysis in partially linear single-index models for longitudinal data

In this article, we study a partially linear single-index model for longitudinal data under a general framework which includes both the sparse and dense longitudinal data cases. A semiparametric estimation method based on a combination of the local linear smoothing and generalized estimation equations (GEE) is introduced to estimate the two parameter vectors as well as the unknown link function. Under some mild conditions, we derive the asymptotic properties of the proposed parametric and nonparametric estimators in different scenarios, from which we find that the convergence rates and asymptotic variances of the proposed estimators for sparse longitudinal data would be substantially different from those for dense longitudinal data. We also discuss the estimation of the covariance (or weight) matrices involved in the semiparametric GEE method. Furthermore, we provide some numerical studies including Monte Carlo simulation and an empirical application to illustrate our methodology and theory.

Framework for learning agents in quantum environments

In this paper we provide a broad framework for describing learning agents in general quantum environments. We analyze the types of classically specified environments which allow for quantum enhancements in learning, by contrasting environments to quantum oracles. We show that whether or not quantum improvements are at all possible depends on the internal structure of the quantum environment. If the environments are constructed and the internal structure is appropriately chosen, or if the agent has limited capacities to influence the internal states of the environment, we show that improvements in learning times are possible in a broad range of scenarios. Such scenarios we call luck-favoring settings. The case of constructed environments is particularly relevant for the class of model-based learning agents, where our results imply a near-generic improvement.

Spectral analysis of the diffusion operator with random jumps from the boundary

Using an operator-theoretic framework in a Hilbert-space setting, we perform a detailed spectral analysis of the one-dimensional Laplacian in a bounded interval, subject to specific non-self-adjoint connected boundary conditions modelling a random jump from the boundary to a point inside the interval. In accordance with previous works, we find that all the eigenvalues are real. As the new results, we derive and analyse the adjoint operator, determine the geometric and algebraic multiplicities of the eigenvalues, write down formulae for the eigenfunctions together with the generalised eigenfunctions and study their basis properties. It turns out that the latter heavily depend on Diophantine properties of the interior point. Finally, we find a closed formula for the metric operator that provides a similarity transform of the problem to a self-adjoint operator.

Cost optimization of data flows based on task re-ordering

Analyzing big data in a highly dynamic environment becomes more and more critical because of the increasingly need for end-to-end processing of this data. Modern data flows are quite complex and there are not efficient, cost-based, fully-automated, scalable optimization solutions that can facilitate flow designers. The state-of-the-art proposals fail to provide near optimal solutions even for simple data flows. To tackle this problem, we introduce a set of approximate algorithms for defining the execution order of the constituent tasks, in order to minimize the total execution cost of a data flow. We also present the advantages of the parallel execution of data flows. We validated our proposals in both a real tool and synthetic flows and the results show that we can achieve significant speed-ups, moving much closer to optimal solutions.

Randomised Rounding with Applications

We develop new techniques for rounding packing integer programs using iterative randomized rounding. It is based on a novel application of multidimensional Brownian motion in \mathbb{R}^n. Let \overset{\sim}{x} \in {[0,1]}^n be a fractional feasible solution of a packing constraint A x \leq 1,\ \ A \in {\{0,1 \}}^{m\times n} that maximizes a linear objective function. The independent randomized rounding method of Raghavan-Thompson rounds each variable x_i to 1 with probability \overset{\sim}{x_i} and 0 otherwise. The expected value of the rounded objective function matches the fractional optimum and no constraint is violated by more than O(\frac{\log m} {\log\log m}).In contrast, our algorithm iteratively transforms \overset{\sim}{x} to \hat{x} \in {\{ 0,1\}}^{n} using a random walk, such that the expected values of \hat{x}_i‘s are consistent with the Raghavan-Thompson rounding. In addition, it gives us intermediate values x' which can then be used to bias the rounding towards a superior solution.The reduced dependencies between the constraints of the sparser system can be exploited using {\it Lovasz Local Lemma}. For m randomly chosen packing constraints in n variables, with k variables in each inequality, the constraints are satisfied within O(\frac{\log (mkp\log m/n) }{\log\log (mkp\log m/n)}) with high probability where p is the ratio between the maximum and minimum coefficients of the linear objective function. Further, we explore trade-offs between approximation factors and error, and present applications to well-known problems like circuit-switching, maximum independent set of rectangles and hypergraph b-matching. Our methods apply to the weighted instances of the problems and are likely to lead to better insights for even dependent rounding.

A New Approach to Examine q-Steiner Systems

The interest in q-analogs of codes and designs has been increased in the last few years as a consequence of their new application in error-correction for random network coding. Two of the most intriguing problems are the existence question of an infinite family of q-analog of Steiner systems, excepts for spreads, and the existence question for q-analog for the Fano plane. We exhibit a new method to attack these problems. In the process we define a new family of designs whose existence is implied from the existence of q-analog of Steiner systems, but their existence can be also independent. We present necessary conditions for the existence for such designs, trivial constructions for such designs, and a nontrivial recursive construction.

A global sensitivity analysis approach for morphogenesis models

Morphogenesis is a developmental process in which cells organize into shapes and patterns. Complex, multi-factorial models are commonly used to study morphogenesis. It is difficult to understand the relation between the uncertainty in the input and the output of such `black-box’ models, giving rise to the need for sensitivity analysis tools. In this paper, we introduce a workflow for a global sensitivity analysis approach to study the impact of single parameters and the interactions between them on the output of morphogenesis models. To demonstrate the workflow, we used a published, well-studied model of vascular morphogenesis. The parameters of the model represent cell properties and behaviors that drive the mechanisms of angiogenic sprouting. The global sensitivity analysis correctly identified the dominant parameters in the model, consistent with previous studies. Additionally, the analysis provides information on the relative impact of single parameters and of interactions between them. The uncertainty in the output of the model was largely caused by single parameters. The parameter interactions, although of low impact, provided new insights in the mechanisms of \emph{in silico} sprouting. Finally, the analysis indicated that the model could be reduced by one parameter. We propose global sensitivity analysis as an alternative approach to study and validate the mechanisms of morphogenesis. Comparison of the ranking of the impact of the model parameters to knowledge derived from experimental data and validation of manipulation experiments can help to falsify models and to find the operand mechanisms in morphogenesis. The workflow is applicable to all `black-box’ models, including high-throughput \emph{in vitro} models in which an output measure is affected by a set of experimental perturbations.

Large time zero temperature dynamics of the spherical p=2-spin model of finite size

We revisit the long time dynamics of the spherical fully connected spin-glass model, i.e. the spherical p=2-spin model, when the number of spins N is large but finite. At T=0 where the system is in a (trivial) spin-glass phase, and on long time scale t \gtrsim {\cal O}{(N^{2/3})} we show that the behavior of physical observables, like the energy, correlation and response functions, is controlled by the density of near-extreme eigenvalues at the edge of the spectrum of the coupling matrix J, and are thus non self-averaging. We show that the late time decay of these observables, once averaged over the disorder, is controlled by new universal exponents which we compute exactly.

Kac’s Walk on $n$-sphere mixes in $n\log n$ steps

Determining the mixing time of Kac’s random walk on the sphere \mathrm{S}^{n-1} is a long-standing open problem. We show that the total variation mixing time of Kac’s walk on \mathrm{S}^{n-1} is between n \log(n) and 200 \,n \log(n). Our bound is thus optimal up to a constant factor, improving on the best-known upper bound of O(n^{5} \log(n)^{2}) due to Jiang. Our main tool is a `non-Markovian’ coupling recently introduced by the second author for obtaining the convergence rates of certain high dimensional Gibbs samplers in continuous state spaces.

CRISNER: A Practically Efficient Reasoner for Qualitative Preferences

We present CRISNER (Conditional & Relative Importance Statement Network PrEference Reasoner), a tool that provides practically efficient as well as exact reasoning about qualitative preferences in popular ceteris paribus preference languages such as CP-nets, TCP-nets, CP-theories, etc. The tool uses a model checking engine to translate preference specifications and queries into appropriate Kripke models and verifiable properties over them respectively. The distinguishing features of the tool are: (1) exact and provably correct query answering for testing dominance, consistency with respect to a preference specification, and testing equivalence and subsumption of two sets of preferences; (2) automatic generation of proofs evidencing the correctness of answer produced by CRISNER to any of the above queries; (3) XML inputs and outputs that make it portable and pluggable into other applications. We also describe the extensible architecture of CRISNER, which can be extended to new reference formalisms based on ceteris paribus semantics that may be developed in the future.

Metropolized Randomized Maximum Likelihood for sampling from multimodal distributions

This article describes a method for using optimization to derive efficient independent transition functions for Markov chain Monte Carlo simulations. Our interest is in sampling from a posterior density \pi(x) for problems in which the dimension of the model space is large, \pi(x) is multimodal with regions of low probability separating the modes, and evaluation of the likelihood is expensive. We restrict our attention to the special case for which the target density is the product of a multivariate Gaussian prior and a likelihood function for which the errors in observations are additive and Gaussian.

Orthogonal parallel MCMC methods for sampling and optimization

Monte Carlo (MC) methods are widely used in statistics, signal processing and machine learning. A well-known class of MC methods are Markov Chain Monte Carlo (MCMC) algorithms. In order to foster better exploration of the state space, specially in high-dimensional applications, several schemes employing multiple parallel MCMC chains have been recently introduced. In this work, we describe a novel parallel interacting MCMC scheme, called orthogonal MCMC (O-MCMC), where a set of vertical parallel MCMC chains share information using some horizontal MCMC techniques working on the entire population of current states. More specifically, the vertical chains are led by random-walk proposals, whereas the horizontal MCMC techniques employ independent proposals, thus allowing an efficient combination of global exploration and local approximation. The interaction is contained in these horizontal iterations. Within the analysis of different implementations of O-MCMC, novel schemes for reducing the overall computational cost of parallel multiple try Metropolis (MTM) chains are also presented. Furthermore, a modified version of O-MCMC for optimization is provided by considering parallel simulated annealing (SA) algorithms. Finally, we also discuss the application of O-MCMC in a big bata framework. Numerical results show the advantages of the proposed sampling scheme in terms of efficiency in the estimation, as well as robustness in terms of independence with respect to initial values and the choice of the parameters.

Brownian motion and Random Walk above Quenched Random Wall

We study the probability of a random walk staying above a trajectory of another random walk. More precisely, let \{B_{n}\}_{n\in\mathbb{N}} and \{W_{n}\}_{n\in\mathbb{N}} be two centered random walks (subject to moment conditions). We establish that \mathbb{P}(\forall_{n\leq N}B_{n}\geq W_{n}|W)\sim N^{-\gamma}, where \gamma is a non-random exponent and \sim is understood on the log scale. In the classical setting (i.e. W_{n}\equiv0) it is well-known that \gamma=1/2. We prove that for any non-trivial wall W one has \gamma>1/2 and the exponent \gamma depends only on \mathrm{Var}(B_{1})/\mathrm{Var}(W_{1}). Further, we prove that these results still hold if B depends weakly on W, this problem naturally emerges in studies of branching random walks in a time-inhomogenous random environment. They are valid also in the continuous time setting, when B and W are (possibly perturbed) Brownian motions. Finally, we present an analogue for Ornstein-Uhlenbeck processes. This time the decay is exponential \exp(-\gamma N).

Generalized Ensemble Model for Document Ranking

A generalized ensemble model (gEnM) for document ranking is proposed in this paper. The gEnM linearly combines basis document retrieval models and tries to retrieve relevant documents at high positions. In order to obtain the optimal linear combination of multiple document retrieval models or rankers, an optimization program is formulated by directly maximizing the mean average precision. Both supervised and unsupervised learning algorithms are presented to solve this program. For the supervised scheme, two approaches are considered based on the data setting, namely batch and online setting. In the batch setting, we propose a revised Newton’s algorithm, gEnM.BAT, by approximating the derivative and Hessian matrix. In the online setting, we advocate a stochastic gradient descent (SGD) based algorithm—gEnM.ON. As for the unsupervised scheme, an unsupervised ensemble model (UnsEnM) by iteratively co-learning from each constituent ranker is presented. Experimental study on benchmark data sets verifies the effectiveness of the proposed algorithms. Therefore, with appropriate algorithms, the gEnM is a viable option in diverse practical information retrieval applications.

On the Markus-Spielman-Srivastava inequality for sums of rank-one matrices

We extend the result of Markus, Spielman, and Srivastava about the sum of rank-one symmetric random matrices to the case when the isotropy assumption on the random matrices is relaxed.

Likelihood-free inference in high-dimensional models

Methods that bypass analytical evaluations of the likelihood function have become an indispensable tool for statistical inference in many fields of science. These so-called likelihood-free methods rely on accepting and rejecting simulations based on summary statistics, which limits them to low dimensional models for which the absolute likelihood is large enough to result in manageable acceptance rates. To get around these issues, we introduce a novel, likelihood-free Markov-Chain Monte Carlo (MCMC) method combining two key innovations: updating only one parameter per iteration and accepting or rejecting this update based on subsets of statistics sufficient for this parameter. This increases acceptance rates dramatically, rendering this approach suitable even for models of very high dimensionality. We further derive that for linear models, a one dimensional combination of statistics per parameter is sufficient and can be found empirically with simulations. Finally, we demonstrate that our method readily scales to models of very high dimensionality using both toy models as well as by jointly inferring the effective population size, the distribution of fitness effects of new mutations (DFE) and selection coefficients for each locus from data of a recent experiment on the evolution of drug-resistance in Influenza.

Local likelihood estimation for covariance functions with spatially-varying parameters: the convoSPAT package for R

In spite of the interest in and appeal of convolution-based approaches for nonstationary spatial modeling, off-the-shelf software for model fitting does not as of yet exist. Convolution-based models are highly flexible yet notoriously difficult to fit, even with relatively small data sets. The general lack of pre-packaged options for model fitting makes it difficult to compare new methodology in nonstationary modeling with other existing methods, and as a result most new models are simply compared to stationary models. Using a convolution-based approach, we present a new nonstationary covariance function for spatial Gaussian process models that allows for efficient computing in two ways: first, by representing the spatially-varying parameters via a discrete mixture or ‘mixture component’ model, and second, by estimating the mixture component parameters through a local likelihood approach. In order to make computation for a convolution-based nonstationary spatial model readily available, this paper also presents and describes the convoSPAT package for R. The nonstationary model is fit to both a synthetic data set and a real data application involving annual precipitation to demonstrate the capabilities of the package.

Comparing the rankings obtained from two biodiversity indices: the Fair Proportion Index and the Shapley Value

The Shapley Value and the Fair Proportion Index of phylogenetic trees have been frequently discussed as prioritization tools in conservation biology. Both indices rank species according to their contribution to total phylogenetic diversity, allowing for a simple conservation criterion. While both indices have their specific advantages and drawbacks, it has recently been shown that both values are closely related. However, as different authors use different definitions of the Shapley Value, the specific degree of relatedness depends on the specific version of the Shapley Value – it ranges from a high correlation index to equality of the indices. In this note, we first give an overview of the different indices. Then we turn our attention to the mere ranking order provided by either of the indices. We show that even though the chance of two rankings being exactly identical (when obtained from different versions of the Shapley Value) decreases with an increasing number of taxa, the distance between the two rankings converges to zero, i.e. the rankings are becoming more and more alike. Moreover, we introduce our software package FairShapley, which was implemented in Perl and with which all calculations have been performed.

Heritability Estimation in Matrix-Variate Mixed models — A Bayesian Approach

Since the emergence of genome-wide association studies (GWASs), estimation of the narrow sense heritability explained by common single-nucleotide polymorphisms (SNPs) via linear mixed model approaches became widely used. As in most GWASs, most of the heritability analyses are performed using univariate approaches i.e. considering each phenotype independently. In this study, we propose a Bayesian matrix-variate mixed model that takes into account the genetic correlation between phenotypes in addition to the genetic correlation between individuals which is usually modelled via a relatedness matrix. We showed that when the relatedness matrix is estimated using all the genome-wide SNPs, our model is equivalent to a matrix normal regression with matrix normal prior on the effect sizes. Using real data we demonstrate that there is a boost in the heritability explained when phenotypes are jointly modelled (25-35% increase). In fact based on their standard error, the joint modelling provides more accurate estimates of the heritability over the univariate modelling. Moreover, our Bayesian approach provides slightly higher estimates of heritability compared to the maximum likelihood method. On the other hand, although our method performs less well in phenotype prediction, we note that an initial imputation step relatively increases the prediction accuracy.

Moment conditions and Bayesian nonparametrics

Models phrased though moment conditions are central to much of modern statistics and econometrics. Here these moment conditions are embedded within a nonparametric Bayesian setup. Handling such a model is not probabilistically straightforward as the posterior has support on a manifold. We solve the relevant issues, building new probability and computational tools using Hausdorff measures to analyze them on real and simulated data. These new methods can be applied widely, including providing Bayesian analysis of quasi-likelihoods, linear and nonlinear regression and quantile regression, missing data, set identified models, and hierarchical models.

Asymptotics for the determinant of the combinatorial Laplacian on hypercubic lattices and the quartered Aztec diamond

In this paper, we compute asymptotics for the determinant of the combinatorial Laplacian on a sequence of d-dimensional orthotope square lattices as the number of vertices in each dimension grows at the same rate. It is related to the number of spanning trees by the well-known matrix tree theorem. Asymptotics for 2 and 3 component rooted spanning forests in these graphs are also derived. Moreover, we express the number of spanning trees in a 2-dimensional square lattice in terms of the one in a 2-dimensional discrete torus and also in the quartered Aztec diamond. As a consequence, we find an asymptotic expansion of the number of spanning trees in a subgraph of \mathbb{Z}^2 with a triangular boundary.

A General Hidden State Random Walk Model for Animal Movement

In this paper, we propose a general hidden state random walk model to describe the movement of an animal that takes into account movement taxis with respect to features of the environment. A circular-linear process models the direction and distance between two consecutive localizations of the animal. A hidden process structure accounts for the animal’s change in movement behavior. The originality of the proposed approach is that several environmental targets can be included in the directional model. An EM algorithm is devised to fit this model and an application to the analysis of the movement of caribou in Canada’s boreal forest is presented