**Towers for commuting endomorphisms, and combinatorial applications**

commuting non-invertible measure-preserving transformations, and we present

several combinatorial applications.

**A multiscale strategy for Bayesian inference using transport maps**

from observational data. Bayesian inference provides a mechanism for capturing

the resulting parameter uncertainty, but typically at a high computational

cost. This work introduces a multiscale decomposition that exploits conditional

independence across scales, when present in certain classes of inverse

problems, to decouple Bayesian inference into two stages: (1) a computationally

tractable coarse-scale inference problem; and (2) a mapping of the

low-dimensional coarse-scale posterior distribution into the original

high-dimensional parameter space. This decomposition relies on a

characterization of the non-Gaussian joint distribution of coarse- and

fine-scale quantities via optimal transport maps. We demonstrate our approach

on a sequence of inverse problems arising in subsurface flow, using the

multiscale finite element method to discretize the steady state pressure

equation. We compare the multiscale strategy with full-dimensional Markov chain

Monte Carlo on a problem of moderate dimension (100 parameters) and then use it

to infer a conductivity field described by over 10,000 parameters.

**String Comparison in $V$-Order: New Lexicographic Properties & On-line
Applications**

Factorization Families (UMFFs), which are themselves generalizations of Lyndon

words. $V$-order has recently been proposed as an alternative to

lexicographical order in the computation of suffix arrays and in the

suffix-sorting induced by the Burrows-Wheeler transform. Efficient $V$-ordering

of strings thus becomes a matter of considerable interest. In this paper we

present new and surprising results on $V$-order in strings, then go on to

explore the algorithmic consequences.

**Truth Serums for Massively Crowdsourced Evaluation Tasks**

absence of verifiability is a major challenge faced while crowdsourcing many

types of evaluation tasks like labeling images, grading assignments in online

courses, etc. In this paper, we propose new reward mechanisms for such settings

that, unlike most previously studied mechanisms, impose minimal assumptions on

the structure and knowledge of the underlying generating model, can account for

heterogeneity in the agents’ abilities, require no extraneous elicitation from

them, and furthermore allow their beliefs to be (almost) arbitrary. Moreover,

these mechanisms have the simple and intuitive structure of output agreement

mechanisms, which, despite not incentivizing truthful behavior, have

nevertheless been quite popular in practice. We achieve this by leveraging a

typical characteristic of many of these settings, which is the existence of a

large number of similar tasks.

**Convergence of Pseudo Posterior Distributions under Informative Sampling**

correlated with the response of interest and induces a dependence among sampled

observations. Unadjusted model-based inference performed on data acquired under

an informative sampling design can be biased concerning parameters of the

population generating distribution if the sample design is not accounted for in

the model. Known marginal inclusion probabilities may be used to weight the

likelihood contribution of each observed unit to form a “pseudo” posterior

distribution with the intent to adjust for the design. This article extends a

theoretical result on the consistency of the posterior distribution, defined on

an analyst-specified model space, at the true generating distribution to the

sampling-weighted pseudo posterior distribution used to account for an

informative sampling design. We construct conditions on known marginal and

pairwise inclusion probabilities that define a class of sampling designs where

consistency of the pseudo posterior is achieved, in probability. We demonstrate

the result on an application concerning the Bureau of Labor Statistics Job

Openings and Labor Turnover Survey.

**Weighted cumulative entropies: An extension of CRE and CE**

[5], for a system or component lifetime. Representing properties of cumulative

entropies, several bounds and inequalities for the WCRE is proposed

data with quadratic sufficient statistic. In this paper, we show that given a

single realization from this model, the maximum pseudolikelihood estimate

(MPLE) of the natural parameter is $\sqrt{a_N}$ consistent at a point whenever

the log-partition function has order $a_N$ in a neighborhood of that point.

This gives consistency rates of the MPLE for ferromagnetic Ising models on

general weighted graphs in all regimes, extending results of Chatterjee (2007)

where only $\sqrt N$-consistency of the MPLE was shown. It is also shown that

consistent testing, and hence estimation, is impossible in the high temperature

phase in ferromagnetic Ising models on a converging sequence of weighted

graphs, which includes the Curie-Weiss model. In this regime, the sufficient

statistic is distributed as a weighted sum of independent $\chi^2_1$ random

variables, and the asymptotic power of the most powerful test is determined.

**The Digital Synaptic Neural Substrate: A New Approach to Computational
Creativity**

Synaptic Neural Substrate’ (DSNS). It uses selected attributes from objects in

various domains (e.g. chess problems, classical music, renowned artworks) and

recombines them in such a way as to generate new attributes that can then, in

principle, be used to create novel objects of creative value to humans relating

to any one of the source domains. This allows some of the burden of creative

content generation to be passed from humans to machines. The approach was

tested in the domain of chess problem composition. We used it to automatically

compose numerous sets of chess problems based on attributes extracted and

recombined from chess problems and tournament games by humans, renowned

paintings, computer-evolved abstract art, photographs of people, and classical

music tracks. The quality of these generated chess problems was then assessed

automatically using an existing and experimentally-validated computational

chess aesthetics model. They were also assessed by human experts in the domain.

The results suggest that attributes collected and recombined from chess and

other domains using the DSNS approach can indeed be used to automatically

generate chess problems of reasonably high aesthetic quality. In particular, a

low quality chess source (i.e. tournament game sequences between weak players)

used in combination with actual photographs of people was able to produce

three-move chess problems of comparable quality or better to those generated

using a high quality chess source (i.e. published compositions by human

experts), and more efficiently as well. Why information from a foreign domain

can be integrated and functional in this way remains an open question for now.

The DSNS approach is, in principle, scalable and applicable to any domain in

which objects have attributes that can be represented using real numbers.

**Sharp upper and lower bounds for the spectral radius of a nonnegative
irreducible matrix**

radius of a nonnegative irreducible matrix. We also apply these bounds to

various matrices associated with a graph or a digraph, obtain some new results

or known results about various spectral radii, including the adjacency spectral

radius, the signless Laplacian spectral radius, the distance spectral radius,

the distance signless Laplacian spectral radius of a graph or a digraph.

**Path-factors involving paths of order seven and nine**

number of components $C$ of $G-X$ with $|V(C)|=i$): (i)~If a graph $G$

satisfies $c_{1}(G-X)+\frac{1}{3}c_{3}(G-X)+\frac{1}{3}c_{5}(G-X)\leq

\frac{2}{3}|X|$ for all $X\subseteq V(G)$, then $G$ has a

$\{P_{2},P_{7}\}$-factor. (ii)~If a graph $G$ satisfies

$c_{1}(G-X)+c_{3}(G-X)+\frac{2}{3}c_{5}(G-X)+\frac{1}{3}c_{7}(G-X)\leq

\frac{2}{3}|X|$ for all $X\subseteq V(G)$, then $G$ has a

$\{P_{2},P_{9}\}$-factor.

process with random occurrence times and magnitudes, and extrapolating

life-time maximum or design loads from the data is a common task in structural

reliability analyses. In this paper, we consider a stationary live load

sequence that arrive according to a dependent point process and allow for a

weakened mixing-type dependence in the load pulse magnitudes that

asymptotically decreases to zero with increasing separation in the sequence.

Inclusion of dependence in the model eliminates the unnecessary conservatism

introduced by the i.i.d. (independent and identically distributed) assumption

often made in determining maximum live load distribution. The scale of

fluctuation of the loading process is used to identify clusters of exceedances

above high thresholds which in turn is used to estimate the extremal index of

the process. A Bayesian updating of the empirical distribution function,

derived from the distribution of order statistics in a dependent stationary

series, is performed. The pulse arrival instants are modeled as a Cox process

goverened by a stationary lognormal intensity. An illustrative example utilizes

in-service peak strain data from ambient traffic collected on a high volume

highway bridge, and analyzes the asymptotic behavior of the maximum load.

**Range Predecessor and Lempel-Ziv Parsing**

important and widely-used algorithmic tools in data compression and string

processing. We show that the Lempel-Ziv parsing of a string of length $n$ on an

alphabet of size $\sigma$ can be computed in $O(n\log\log\sigma)$ time ($O(n)$

time if we allow randomization) using $O(n\log\sigma)$ bits of working space;

that is, using space proportional to that of the input string in bits. The

previous fastest algorithm using $O(n\log\sigma)$ space takes

$O(n(\log\sigma+\log\log n))$ time. We also consider the important rightmost

variant of the problem, where the goal is to associate with each phrase of the

parsing its most recent occurrence in the input string. We solve this problem

in $O(n(1 + (\log\sigma/\sqrt{\log n}))$ time, using the same working space as

above. The previous best solution for rightmost parsing uses

$O(n(1+\log\sigma/\log\log n))$ time and $O(n\log n)$ space. As a bonus, in our

solution for rightmost parsing we provide a faster construction method for

efficient 2D orthogonal range reporting, which is of independent interest.

**Remoteness and distance eigenvalues of a graph**

$\rho$ of $G$ is the maximum average distance from a vertex to all others and

$\partial_1\geq\cdots\geq \partial_n$ are the distance eigenvalues of $G$. In

\cite{AH}, Aouchiche and Hansen conjectured that $\rho+\partial_3>0$ when

$d\geq 3$ and $\rho+\partial_{\lfloor\frac{7d}{8}\rfloor}>0.$ In this paper, we

confirm these two conjectures. Furthermore, we give lower bounds on

$\partial_n+\rho$ and $\partial_1-\rho$ when $G\ncong K_n$ and the extremal

graphs are characterized.

**On Liveness of Dynamic Storage**

Paxos, RAMBO, and RDS, do not ensure liveness (wait-freedom) in asynchronous

runs with infinitely many reconfigurations. We prove that this is inherent for

asynchronous dynamic storage algorithms, including ones that use $\Omega$ or

$\diamond S$ oracles. Our result holds even if only one process may fail,

provided that machines that were successfully removed from the system’s

configuration may be switched off by an administrator. Intuitively, the

impossibility relies on the fact that a correct process can be suspected to

have failed at any time, i.e., its failure is indistinguishable to other

processes from slow delivery of its messages, and so the system should be able

to reconfigure without waiting for this process to complete its pending

operations.

To circumvent this result, we define a dynamic eventually perfect failure

detector, and present an algorithm that uses it to emulate wait-free dynamic

atomic storage (with no restrictions on reconfiguration rate). Together, our

results thus draw a sharp line between oracles like $\Omega$ and $\diamond S$,

which allow some correct process to continue to be suspected forever, and a

dynamic eventually perfect one, which does not.

**A family of non-Schurian $p$-Schur rings over groups of order $p^3$**

order $p^3$ is Schurian. In this article, we consider the Schurity problem of

non-commutative $p$-Schur rings over groups of order $p^3$. In particular, it

is given a family of non-Schurian $p$-Schur rings over groups of order $p^3$.

**Proper Scoring and Sufficiency**

theory, statistics, statistical mechanics, and portfolio theory. We demonstrate

that all these topics involve some kind of optimization that leads directly to

the use of Bregman divergences. If a sufficiency condition is also fulfilled

the Bregman divergence must be proportional to information divergence. The

sufficiency condition has quite different consequences in the different areas

of application, and often it is not fulfilled. Therefore the sufficiency

condition can be used to explain when results from one area can be transferred

directly from one area to another and when one will experience differences.

**Compressed Sensing without Sparsity Assumptions**

$x\in\mathbb{R}^p$ can be accurately recovered from an underdetermined set of

$n$ linear measurements with $n\ll p$, provided that $x$ is sufficiently

sparse. However, in applications, the degree of sparsity $\|x\|_0$ is typically

unknown, and the problem of directly estimating $\|x\|_0$ has been a

longstanding gap between theory and practice. A closely related issue is that

$\|x\|_0$ is a highly idealized measure of sparsity, and for real signals with

entries not exactly equal to 0, the value $\|x\|_0=p$ is not a useful

description of compressibility. In our previous conference paper that examined

these problems, Lopes 2013, we considered an alternative measure of “soft”

sparsity, $\|x\|_1^2/\|x\|_2^2$, and designed a procedure to estimate

$\|x\|_1^2/\|x\|_2^2$ that does not rely on sparsity assumptions.

The present work offers a new deconvolution-based method for estimating

unknown sparsity, which has wider applicability and sharper theoretical

guarantees. Whereas our earlier work was limited to estimating the quantity

$\|x\|_1^2/\|x\|_2^2$, the current paper introduces a family of entropy-based

sparsity measures $s_q(x):=\big(\frac{\|x\|_q}{\|x\|_1}\big)^{\frac{q}{1-q}}$

parameterized by $q\in[0,\infty]$. Two other main advantages of the new

approach are that it handles measurement noise with infinite variance, and that

it yields confidence intervals for $s_q(x)$ with asymptotically exact coverage

probability (whereas our previous intervals were conservative). In addition to

confidence intervals, we also analyze several other aspects of our proposed

estimator $\hat{s}_q(x)$ and show that randomized measurements are an essential

aspect of our procedure.

**Dimensionality-reduced subspace clustering**

high-dimensional data points into a union of low-dimensional linear subspaces,

whose number, orientations, and dimensions are all unknown. In practice one may

have access to dimensionality-reduced observations of the data only, resulting,

e.g., from undersampling due to complexity and speed constraints on the

acquisition device or mechanism. More pertinently, even if the high-dimensional

data set is available it is often desirable to first project the data points

into a lower-dimensional space and to perform clustering there; this reduces

storage requirements and computational cost. The purpose of this paper is to

quantify the impact of dimensionality reduction through random projection on

the performance of three subspace clustering algorithms, all of which are based

on principles from sparse signal recovery. Specifically, we analyze the

thresholding based subspace clustering (TSC) algorithm, the sparse subspace

clustering (SSC) algorithm, and an orthogonal matching pursuit variant thereof

(SSC-OMP). We find, for all three algorithms, that dimensionality reduction

down to the order of the subspace dimensions is possible without incurring

significant performance degradation. The mathematical engine behind our theory

is a result quantifying how the affinities between subspaces change under

random projections. Experiments on synthetic and real data complement our

theoretical results.

and show that selection procedures based on priors from these classes are

consistent for linear models even when the number of covariates $p$ increases

sub-exponentially with the sample size $n$, provided that certain regularity

constraints are satisfied. We also demonstrate that the resulting prior

densities impose fully adaptive penalties on regression parameters,

distinguishing our model selection procedure from existing $L_0$ penalized

likelihood methods. To implement this framework, we propose a scalable

algorithm called Simplified Shotgun Stochastic Search with Screening (S5) that

efficiently explores high-dimensional model spaces. Compared to standard MCMC

algorithms, S5 can dramatically speed the rate at which Bayesian model

selection procedures identify high posterior probability models. Finally, we

present a detailed comparison between the proposed procedure and the most

commonly used alternative methods for variable selection using precision-recall

curves and more standard simulation scenarios.

**Improved initialisation of model-based clustering using Gaussian
hierarchical partitions**

crucial. Various starting points in the parameter space often lead to different

local maxima of the likelihood function and, so to different clustering

partitions. Among the several approaches available in the literature,

model-based agglomerative hierarchical clustering is used to provide initial

partitions in the popular MCLUST R package. This choice is computationally

convenient and often yields good clustering partitions. However, in certain

circumstances, poor initial partitions may cause the EM algorithm to converge

to a local maximum of the likelihood function. We propose several simple and

fast refinements based on data transformations and illustrate them through data

examples.

**Spectra of Random Hypermatrices and Hypergraphs**

homogeneous adjacency eigenvalues of random and complete uniform hypergraphs.

There is a natural conjecture arising from analogy to random matrix theory that

connects these spectra to that of the all-ones hypermatrix. Several of the

ingredients along a possible path to this conjecture are established, and may

be of independent interest in spectral hypergraph/hypermatrix theory. In

particular, we provide a bound on the spectral radius of the symmetric

Bernoulli hyperensemble.

**Improved Approximations for Cubic and Cubic Bipartite TSP**

on cubic graphs, and cubic bipartite graphs. For cubic bipartite graphs with n

nodes, we improve on recent results of Karp and Ravi (2014) by giving a simple

“local improvement” algorithm that finds a tour of length at most 5/4 n – 2.

For 2-connected cubic graphs, we show that the techniques of Moemke and

Svensson (2011) can be combined with the techniques of Correa, Larre and Soto

(2012), to obtain a tour of length at most (4/3-1/8754)n.

**Obstructions to combinatorial formulas for plethysm**

quasi-polynomials arising in formulas for plethysm. We demonstrate, on

$S^3(S^k)$, that these need not be counting functions of lattice points in

convex bodies, even when restricted to single rays.

**Inhomogeneous random graphs, isolated vertices, and Poisson
approximation**

points $x,y$ connected with probability $\phi(x,y)$. Suppose the number of

points is large but the mean number of isolated points is $O(1)$. We give

general criteria for the latter to be approximately Poisson distributed. More

generally, we consider the number of vertices of fixed degree, the number of

components of fixed order, and the number of edges. We use a general result on

Poisson approximation by Stein’s method for a set of points selected from a

Poisson point process. This method also gives a good Poisson approximation for

Poisson U-statistics.

**Scale invariant volatility estimation in scalar diffusions**

observed at equidistant time points. We use the spectral representation of the

volatility in terms of the invariant density and an eigenpair of the

infinitesimal generator to construct a scale invariant estimator; we prove

minimax optimal convergence rates both for high and low frequency observations.

The proofs are based on a posteriori error bounds for generalized eigenvalue

problems and path properties of scalar diffusions. To ensure that the model is

uniformly defined across scales we consider diffusions with compact state space

and boundary reflection. The finite sample performance is illustrated by a

numerical example.

**Acyclicity for Groups and Vector Spaces**

proved that torsion-free groups admit this property. In this paper, we

introduce a duality of acyclic matching as a tool for classi?cation of some

Abelian groups, moreover, we study matchings for vector spaces and give a

connection between matchings in groups and vector spaces. Our tools mix

additive number theory, combinatorics and algebra.

**A Framework of Sparse Online Learning and Its Applications**

today. In this paper, we address several open challenges of big data stream

classification, including high volume, high velocity, high dimensionality, high

sparsity, and high class-imbalance. Many existing studies in data mining

literature solve data stream classification tasks in a batch learning setting,

which suffers from poor efficiency and scalability when dealing with big data.

To overcome the limitations, this paper investigates an online learning

framework for big data stream classification tasks. Unlike some existing online

data stream classification techniques that are often based on first-order

online learning, we propose a framework of Sparse Online Classification (SOC)

for data stream classification, which includes some state-of-the-art

first-order sparse online learning algorithms as special cases and allows us to

derive a new effective second-order online learning algorithm for data stream

classification. In addition, we also propose a new cost-sensitive sparse online

learning algorithm by extending the framework with application to tackle online

anomaly detection tasks where class distribution of data could be very

imbalanced. We also analyze the theoretical bounds of the proposed method, and

finally conduct an extensive set of experiments, in which encouraging results

validate the efficacy of the proposed algorithms in comparison to a family of

state-of-the-art techniques on a variety of data stream classification tasks.

**True Online Emphatic TD($λ$): Quick Reference and Implementation
Guide**

TD($\lambda$), a model-free temporal-difference algorithm for learning to make

long-term predictions which combines the emphasis idea (Sutton, Mahmood & White

2015) and the true-online idea (van Seijen & Sutton 2014). The setting used

here includes linear function approximation, the possibility of off-policy

training, and all the generality of general value functions, as well as the

emphasis algorithm’s notion of “interest”.

**Finite-time Stability and synchronization with Finite-time**

systems. Here, we will give an insightful look at this issue and explore the

intrinsic property by mathematical analysis. Several theorems and corollaries

show how to find the settling time. Moreover, we apply the corresponding theory

on the fixed-time synchronization problem of nonlinearly coupled systems and a

numerical simulation is also given to demonstrate its efficiency.

**On the Use of Cauchy Prior Distributions for Bayesian Logistic
Regression**

predictors can perfectly classify part or all of the observations in the

sample, and as a result, finite maximum likelihood estimates of the regression

coefficients do not exist. Gelman et al (2008) recommended independent Cauchy

distributions as default priors for the regression coefficients in logistic

regression, even in the case of separation, and reported posterior modes in

their analyses. As the mean does not exist for the Cauchy prior, a natural

question is whether the posterior means of the regression coefficients exist

under separation. We prove two theorems that provide necessary and sufficient

conditions for the existence of posterior means under independent Cauchy priors

for the logit link and a general family of link functions, including the probit

link. For full Bayesian inference, we develop a Gibbs sampler based on

Polya-Gamma data augmentation to sample from the posterior distribution under

independent Student-t priors including Cauchy priors, and provide a companion R

package in the supplement. We demonstrate empirically that even when the

posterior means of the regression coefficients exist under separation, the

magnitude of the posterior samples for Cauchy priors may be unusually large,

and the corresponding Markov chain shows extremely slow mixing.

**On some estimators of the Hurst index of the solution of SDE driven by a
fractional Brownian motion**

parameter of solutions of stochastic differential equations are proposed. The

estimators are based on discrete observations of the underlying processes.

**Stable low-rank matrix recovery via null space properties**

possibly noisy set of linear measurements arises in a number of areas. In order

to derive rigorous recovery results, the measurement map is usually modeled

probabilistically. We derive sufficient conditions on the minimal amount of

measurements ensuring recovery via convex optimization. We establish our

results via certain properties of the null space of the measurement map. In the

setting where the measurements are realized as Frobenius inner products with

independent standard Gaussian random matrices we show that $10 r (n_1 + n_2)$

measurements are enough to uniformly and stably recover an $n_1 \times n_2$

matrix of rank at most $r$. We then significantly generalize this result by

only requiring independent mean-zero, variance one entries with four finite

moments at the cost of replacing $10$ by some universal constant. We also study

the case of recovering Hermitian rank-$r$ matrices from measurement matrices

proportional to rank-one projectors. For $m \geq C r n$ rank-one projective

measurements onto independent standard Gaussian vectors, we show that nuclear

norm minimization uniformly and stably reconstructs Hermitian rank-$r$ matrices

with high probability. Next, we partially de-randomize this by establishing an

analogous statement for projectors onto independent elements of a complex

projective 4-designs at the cost of a slightly higher sampling rate $m \geq C

rn \log n$. Moreover, if the Hermitian matrix to be recovered is known to be

positive semidefinite, then we show that the nuclear norm minimization approach

may be replaced by minimizing the $\ell_2$-norm of the residual subject to the

positive semidefinite constraint. Then no estimate of the noise level is

required a priori. We discuss applications in quantum physics and the phase

retrieval problem.

of $v$, denoted by $\zeta(v)$ as $\zeta(v)={\max}_{H: v\in H}~\delta(H)$, where

the maximum is taken over all subgraphs of $G$ containing the vertex $v$. We

show that the degenerate degree sequence of any graph can be determined by an

efficient algorithm. A $k$-independent set in $G$ is any set $S$ of vertices

such that $\Delta(G[S])\leq k$. The largest cardinality of any $k$-independent

set is denoted by $\alpha_k(G)$. For $k\in \{1, 2, 3\}$, we prove that

$\alpha_{k-1}(G)\geq {\sum}_{v\in G} \min \{1, 1/(\zeta(v)+(1/k))\}$. Using the

concept of cheap vertices we strengthen our bound for the independence number.

The resulting lower bounds improve greatly the famous Caro-Wei bound and also

the best known bounds for $\alpha_1(G)$ and $\alpha_2(G)$ for some families of

graphs. We show that the equality in our bound for independence number happens

for a large class of graphs. Our bounds are achieved by Cheap-Greedy algorithms

for $\alpha_k(G)$ which are designed by the concept of cheap sets. At the end,

a bound for $\alpha_k(G)$ is presented, where $G$ is a forest and $k$ an

arbitrary non-negative integer.

**Hypohamiltonian planar cubic graphs with girth five**

hamiltonian if any vertex is removed. Many hypohamiltonian planar cubic graphs

have been found, starting with constructions of Thomassen in 1981. However, all

the examples found until now had 4-cycles. In this note we present the first

examples of hypohamiltonian planar cubic graphs with cyclic connectivity five,

and thus girth five. We show by computer search that the smallest members of

this class are three graphs with 76 vertices.

**Task Selection for Bandit-Based Task Assignment in Heterogeneous
Crowdsourcing**

(assigning the labeling task to a suitable worker) are two major challenges in

task assignment for crowdsourcing. Recently, worker selection has been

successfully addressed by the bandit-based task assignment (BBTA) method, while

task selection has not been thoroughly investigated yet. In this paper, we

experimentally compare several task selection strategies borrowed from active

learning literature, and show that the least confidence strategy significantly

improves the performance of task assignment in crowdsourcing.

**A Neural Prototype for a Virtual Chemical Spectrophotometer**

(Ni) and cobalt (Co) was developed based on an artificial neural network (ANN).

The developed ANN correlates the respective concentrations of Co and Ni given

the absorbance profile of a Co-Ni mixture based on the Beer’s Law. The virtual

chemical spectrometer was trained using a 3-layer jump connection neural

network model (NNM) with 126 input nodes corresponding to the 126 absorbance

readings from 350 nm to 600 nm, 70 nodes in the hidden layer using a logistic

activation function, and 2 nodes in the output layer with a logistic function.

Test result shows that the NNM has correlation coefficients of 0.9953 and

0.9922 when predicting [Co] and [Ni], respectively. We observed, however, that

the NNM has a duality property and that there exists a real-world practical

application in solving the dual problem: Predict the Co-Ni mixture’s absorbance

profile given [Co] and [Ni]. It turns out that the dual problem is much harder

to solve because the intended output has a much bigger cardinality than that of

the input. Thus, we trained the dual ANN, a 3-layer jump connection nets with 2

input nodes corresponding to [Co] and [Ni], 70-logistic-activated nodes in the

hidden layer, and 126 output nodes corresponding to the 126 absorbance readings

from 250 nm to 600 nm. Test result shows that the dual NNM has correlation

coefficients that range from 0.9050 through 0.9980 at 356 nm through 578 nm

with the maximum coefficient observed at 480 nm. This means that the dual ANN

can be used to predict the absorbance profile given the respective Co-Ni

concentrations which can be of importance in creating academic models for a

virtual chemical spectrophotometer.

**Modeling Website Workload Using Neural Networks**

number of requests received by 1998 FIFA World Cup website. Modeling is done by

means of time-series forecasting. The log traces of the website, available

through the Internet Traffic Archive (ITA), are processed to obtain two

time-series data sets that are used for finding the following measurements:

requests/day and requests/second. These are modeled by training and simulating

ANN. The method followed to collect and process the data, and perform the

experiments have been detailed in this article. In total, 13 cases have been

tried and their results have been presented, discussed, compared and

summarized. Lastly, future works have also been mentioned.

**Non-hereditary Minimum Deep Coalescence trees**

describing the evolutionary history of a set of species. But the trees derived

from geneti data with the help of tree inference methods are gene trees that

need not coincide with the species tree. This can for example happen when

so-called deep coalescence events take place. It is also known that species

trees can differ from their most likely gene trees. Therefore, as a means to

find the species tree, it has been suggested to use subtrees of the gene trees,

for example triples, and to puzzle them together in order to find the species

tree. In this paper, we will show that this approach may lead to wrong trees

regarding the minimum deep coalescence criterion (MDC). In particular, we

present an example in which the optimal MDC tree is unique, but none of its

triple subtrees fulfills the MDC criterion. In this sense, MDC is a

non-hereditary tree reconstruction method.

**Projective modules over polyhedral semirings**

monoid. The analysis extends to the case of the semiring of convex,

piecewise-affine functions on a polyhedron, for which projective modules

correspond to convex families of weight polyhedra for the general linear group.

**Spatial mixing and approximate counting for Potts model on graphs with
bounded average degree**

establish its connection to the strong spatial mixing for spin systems. More

specifically, we show that for anti-ferromagnetic Potts model on families of

graphs characterized by a specific contraction function, the model exhibits

strong spatial mixing, and if further the graphs exhibit certain local sparsity

which are very natural and easy to satisfy by typical sparse graphs, then we

also have FPTAS for computing the partition function.

This new characterization of strong spatial mixing of multi-spin system does

not require maximum degree of the graphs to be bounded, but instead it relates

the decay of correlation of the model to a notion of effective average degree

measured by the contraction of a function on the family of graphs. It also

generalizes other notion of effective average degree which may determine the

strong spatial mixing, such as the connective constant, whose connection to

strong spatial mixing is only known for very simple models and is not

extendable to general spin systems.

As direct consequences: (1) we obtain FPTAS for the partition function of

$q$-state anti-ferromagnetic Potts model with activity $0\le\beta<1$ on graphs

of maximum degree bounded by $d$ when $q> 3(1-\beta)d+1$, improving the

previous best bound $\beta> 3(1-\beta)d$ and asymptotically approaching the

inapproximability threshold $q=(1-\beta)d$, and (2) we obtain an efficient

sampler (in the same sense of fully polynomial-time almost uniform sampler,

FPAUS) for the Potts model on Erd\H{o}s-R\’enyi random graph

$\mathcal{G}(n,d/n)$ with sufficiently large constant $d$, provided that $q>

3(1-\beta)d+4$. In particular when $\beta=0$, the sampler becomes an FPAUS for

for proper $q$-coloring in $\mathcal{G}(n,d/n)$ with $q> 3d+4$, improving the

current best bound $q> 5.5d$ for FPAUS for $q$-coloring in

$\mathcal{G}(n,d/n)$.

function is a computationally challenging task that arises in a number of

applications. For the common function of the inverse of a large, sparse matrix,

the standard approach is based on a Monte Carlo method which converges slowly.

We present a different approach by exploiting the pattern correlation between

the diagonal of the inverse of the matrix and the diagonal of some approximate

inverse that can be computed inexpensively. We leverage various sampling and

fitting techniques to fit the diagonal of the approximation to the diagonal of

the inverse. Based on a dynamic evaluation of the variance, the proposed method

can be used as a variance reduction method for Monte Carlo in some cases.

Furthermore, the presented method may serve as a standalone kernel for

providing a fast trace estimate with a small number of samples. An extensive

set of experiments with various technique combinations demonstrates the

effectiveness of our method in some real applications.

**Consistency of plug-in confidence sets for classification in
semi-supervised learning**

applications such as medical diagnoses, wrong prediction can be fatal. For

classification, there already exist procedures that allow to not classify data

when the confidence in their prediction is weak. This approach is known as

classification with reject option. In the present paper, we provide new

methodology for this approach. Predicting a new instance via a confidence set,

we ensure an exact control of the probability of classification. Moreover, we

show that this methodology is easily implementable and entails attractive

theoretical and numerical properties.

**A Note on Boolean Lattices and Farey Sequences III**

**A Deterministic Algorithm for Maximizing Submodular Functions**

by Feige, Mirrokni, and Vondrak [FOCS’07] who provided a deterministic

local-search based algorithm that guarantees an approximation ratio of $\frac 1

3$, as well as a randomized $\frac 2 5$-approximation algorithm. An extensive

line of research followed and various algorithms with improving approximation

ratios were developed, all of them are randomized. Finally, Buchbinder et al.

[FOCS’12] presented a randomized $\frac 1 2$-approximation algorithm, which is

the best possible.

This paper gives the first deterministic algorithm for maximizing a

non-negative submodular function that achieves an approximation ratio better

than $\frac 1 3$. The approximation ratio of our algorithm is $\frac 2 5$. Our

algorithm is based on recursive composition of solutions obtained by the local

search algorithm of Feige et al. We show that the $\frac 2 5$ approximation

ratio can be guaranteed when the recursion depth is $2$, and leave open the

question of whether the approximation ratio improves as the recursion depth

increases.

**Estimator Selection: End-Performance Metric Aspects**

been introduced. In this context, the distance of the estimated system from the

true one is measured in terms of a particular end-performance metric. This

treatment leads to superior unknown system estimates to classical experiment

designs based on usual pointwise functional distances of the estimated system

from the true one. The separation of the system estimator from the experiment

design is done within this new framework by choosing and fixing the estimation

method to either a maximum likelihood (ML) approach or a Bayesian estimator

such as the minimum mean square error (MMSE). Since the MMSE estimator delivers

a system estimate with lower mean square error (MSE) than the ML estimator for

finite-length experiments, it is usually considered the best choice in practice

in signal processing and control applications. Within the application-oriented

framework a related meaningful question is: Are there end-performance metrics

for which the ML estimator outperforms the MMSE when the experiment is

finite-length? In this paper, we affirmatively answer this question based on a

simple linear Gaussian regression example.

**Lévy insurance risk processes with parisian type severity of debt**

negative L\’evy insurance risk process. More precisely, we study the

Gerber-Shiu distribution for a ruin model where at each time the surplus goes

negative, an independent negative random level is considered. If a negative

excursion of the surplus exceeds such random level then the insurance company

goes out of business. Our methodology uses excursion theory and relies on the

description of the excursion measure away from 0 which was recently obtained by

the authors in Pardo et al. (see arXiv:1507.05225). Our results are given in

terms of the so-called scale functions.

improve the speed of kernel manifold learning algorithms relying on spectral

decomposition. Utilizing recent insights in kernel smoothing and learning with

integral operators, we propose Reduced Set KPCA (RSKPCA), which also suggests

an easy-to-implement method to remove or replace samples with minimal effect on

the empirical operator. A simple data point selection procedure is given to

generate a substitute density for the data, with accuracy that is governed by a

user-tunable parameter . The effect of the approximation on the quality of the

KPCA solution, in terms of spectral and operator errors, can be shown directly

in terms of the density estimate error and as a function of the parameter . We

show in experiments that RSKPCA can improve both training and evaluation time

of KPCA by up to an order of magnitude, and compares favorably to the

widely-used Nystrom and density-weighted Nystrom methods.

**Detection of phase transition in generalized Pólya urn in Information
cascade experiment**

urn in an information cascade experiment. The method is based on the asymptotic

behavior of the correlation $C(t)$ between the first subject’s choice and the

$t+1$-th subject’s choice, the limit value of which, $c\equiv \lim_{t\to

\infty}C(t)$, is the order parameter of the phase transition. To verify the

method, we perform a voting experiment using two-choice questions. An urn X is

chosen at random from two urns A and B, which contain red and blue balls with

different configurations. Subjects sequentially guess whether X is A or B using

information about the prior subjects’ choices and the color of a ball randomly

drawn from X. The color tells the subject which is X with probability $q$. We

set $q\in \{5/9,6/9,7/9\}$ by controlling the configurations of red and blue

balls in A and B. The (average) length of the sequence of the subjects is 63

(53.7) for $q\in \{5/9,6/9,(7/9)\}$. We describe the sequential voting process

by a nonlinear P\’olya urn model. The model suggests the possibility of a phase

transition when $q$ changes. We show that $c>0\,\,\,(=0)$ for

$q=5/9,6/9\,\,\,(7/9)$ and successfully detect the phase transition using the

proposed method.

motivated by problems that arise in large-scale radiological survey and anomaly

detection. In this context, the density functions to be estimated are the

background gamma-ray energy spectra at sites spread across a large geographical

area, such as nuclear production and waste-storage sites, military bases,

medical facilities, university campuses, or the downtown of a city. Several

challenges combine to make this a difficult problem. First, the spectral

density at any given spatial location may have both smooth and non-smooth

features. Second, the spatial correlation in these density functions is neither

stationary nor locally isotropic. Third, the spatial correlation decays at

different length scales at different locations in the support of the underlying

density. Finally, at some spatial locations, there is very little data. We

present a method called multiscale spatial density smoothing that successfully

addresses these challenges. The method is motivated by the same construction

that underlies a P\’olya-tree prior, in that it is based on a recursive dyadic

partition of the underlying probability measure. We also describe an efficient

algorithm for finding a maximum a posteriori (MAP) estimate that leverages

recent advances in convex optimization for non-smooth functions.

We apply multiscale spatial density smoothing to real data collected on the

background gamma-ray spectra at locations across a large university campus. Our

results show that the method exhibits state-of-the-art performance for spatial

smoothing in density estimation, and that it leads to substantial improvements

in power when used in conjunction with existing methods for detecting the kinds

of radiological anomalies that may have important consequences for public

health and safety.

**Hydrodynamical spectral evolution for random matrices**

Gaussian Hermitian matrix and $X^{(0)}$ is non-random or random independent of

$X$, are closely related to Dyson Brownian motion. Previous works have shown

how an infinite hierarchy of equations satisfied by the dynamical correlations

become triangular in the infinite density limit, and give rise to the complex

Burgers equation for the Green’s function of the corresponding one-point

density function. We show how this and analogous partial differential

equations, for chiral, circular and Jacobi versions of Dyson Brownian motion

follow from a macroscopic hydrodynamical description involving the current

density and continuity equation. The method of characteristics gives a

systematic approach to solving the PDEs, and in the chiral case we show how

this efficiently reclaims the characterisation of the global eigenvalue density

for non-central Wishart matrices due to Dozier and Silverstein. Collective

variables provide another approach to deriving the complex Burgers equation in

the Gaussian case, and we show that this approach applies equally as well to

chiral matrices. We relate both the Gaussian and chiral cases to the

asymptotics of matrix integrals.

**Explicit Local Integrals of Motion for the Many-Body Localized State**

characterized by local integrals of motion. Here we introduce an Hilbert space

renormalization scheme that iteratively finds such integrals of motion exactly.

Our method is based on the consecutive action of a similarity transformation

using displacement operators. We show, as a proof of principle, localization in

a $N=12$ and $N=36$ interacting fermion chains with random onsite potentials.

Our scheme of consecutive displacement transformations can be used to study

Many Body Localization in any dimension, as well as disorder-free Hamiltonians.

**Efficient Calibration for Imperfect Computer Models**

using physical observations. Kennedy and O’Hagan (2001) shows that the

calibration method based on Gaussian process models proposed by Kennedy and

O’Hagan (2001) may lead to unreasonable estimate for imperfect computer models.

In this work, we extend their study to calibration problems with stochastic

physical data. We propose a novel method, called the $L_2$ calibration, and

show its semiparametric efficiency. The conventional method of the ordinary

least squares is also studied. Theoretical analysis shows that it is consistent

but not efficient. Numerical examples show that the proposed method outperforms

the existing ones.

**Complete branching rules for Specht modules**

arbitrary diagram admits a (complete) branching rule. This description, given

in terms of the maximal rectangles of the diagram, generalizes all previously

known branching rules for Specht modules, such as those given by Reiner and

Shimozono for northwest diagrams and by the present author for forest diagrams.

**Harnack inequality for non-local Schrödinger operators**

twice differentiable function with all second partial derivatives being

continuous. For $1\leq i,j\leq d$, let $a_{ij} : \mathbb{R}^d \rightarrow \R$

be a differentiable function with all partial derivatives being continuous and

bounded. We shall consider the Schr\”odinger operator associated to

\begin{eqnarray*}

\mathcal{L}f(x) &=& \frac12 \sum_{i=1}^d \sum_{j=1}^d

\frac{\partial}{\partial x_i} \left(a_{ij}(\cdot)

\frac{\partial f}{\partial x_j}\right)(x) +

\int_{\mathbb{R}^d\setminus{\{0\}}} [f(y) – f(x) ]J(x,y)dy.

\end{eqnarray*} where $J: \mathbb{R}^d \times \R^d \rightarrow \R$ is a

symmetric measurable function. Let $q: \mathbb{R}^d \rightarrow \R.$ We specify

assumptions on $a, q,$ and $J$ so that non-negative bounded solutions to

$${\mathcal L}f + qf = 0$$ satisfy a Harnack inequality.

**Learning (Predictive) Risk Scores in the Presence of Censoring due to
Interventions**

patient’s hospital stay to monitor their health status. Tools for integrating

these measurements into severity scores, that accurately track changes in

illness severity, can improve clinicians ability to provide timely

interventions. Existing approaches for creating such scores either 1) rely on

experts to fully specify the severity score, or 2) train a predictive score,

using supervised learning, by regressing against a surrogate marker of severity

such as the presence of downstream adverse events. The first approach does not

extend to diseases where an accurate score cannot be elicited from experts. The

second approach often produces scores that suffer from bias due to

treatment-related censoring (Paxton, 2013). We propose a novel ranking based

framework for disease severity score learning (DSSL). DSSL exploits the

following key observation: while it is challenging for experts to quantify the

disease severity at any given time, it is often easy to compare the disease

severity at two different times. Extending existing ranking algorithms, DSSL

learns a function that maps a vector of patient’s measurements to a scalar

severity score such that the resulting score is temporally smooth and

consistent with the expert’s ranking of pairs of disease states. We apply DSSL

to the problem of learning a sepsis severity score using a large, real-world

dataset. The learned scores significantly outperform state-of-the-art clinical

scores in ranking patient states by severity and in early detection of future

adverse events. We also show that the learned disease severity trajectories are

consistent with clinical expectations of disease evolution. Further, using

simulated datasets, we show that DSSL exhibits better generalization

performance to changes in treatment patterns compared to the above approaches.

**A Social Spider Algorithm for Solving the Non-convex Economic Load
Dispatch Problem**

system control and operation. Although conventional ELD formulation can be

solved using mathematical programming techniques, modern power system

introduces new models of the power units which are non-convex,

non-differentiable, and sometimes non-continuous. In order to solve such

non-convex ELD problems, in this paper we propose a new approach based on the

Social Spider Algorithm (SSA). The classical SSA is modified and enhanced to

adapt to the unique characteristics of ELD problems, e.g., valve-point effects,

multi-fuel operations, prohibited operating zones, and line losses. To

demonstrate the superiority of our proposed approach, five widely-adopted test

systems are employed and the simulation results are compared with the

state-of-the-art algorithms. In addition, the parameter sensitivity is

illustrated by a series of simulations. The simulation results show that SSA

can solve ELD problems effectively and efficiently.

**The Two Ignored Components of Random Variation**

identity and a set of external variation-generating factors. When only a single

source is active, two mutually exclusive extreme scenarios may ensue that

result in the exponential or the normal, the only truly univariate

distributions. All other supposedly univariate random variation observed in

nature is truly bivariate. In this article, we elaborate on this new paradigm

for random variation and develop a general bivariate distribution to reflect

it. It is shown that numerous current univariate distributions are special

cases of an approximation to the new bivariate distribution. We first show that

the exponential and the normal are special cases of a single distribution

represented by a Response Modeling Methodology model. We then develop a general

bivariate distribution commensurate with the new paradigm, its properties are

discussed and its moments developed. An approximating assumption results in a

univariate general distribution that is shown to include as exact special cases

widely used distributions like generalized gamma, log-normal, F, t, and Cauchy.

Compound distributions and their relationship to the new paradigm are

addressed. Empirical observations that comply with predictions derived from the

new paradigm corroborate its scientific validity.

**A Fuss-type family of positive definite sequences**

numbers. We show a sufficient condition for positive definiteness of $a_n(p,t)$

and prove that some of the corresponding probability measures are infinitely

divisible with respect to the additive free convolution.

**Bayesian analysis of multivariate stable distributions using
one-dimensional projections**

distributions. We exploit the representation of Matsui and Takemura (2009) for

univariate projections, and the representation of the distributions in terms of

their spectral measure. We present efficient MCMC schemes to perform the

computations when the spectral measure is approximated discretely or, as we

propose, by a normal distribution. Appropriate latent variables are introduced

to implement MCMC. In relation to the discrete approximation, we propose

efficient computational schemes based on the characteristic function.

**Spectral structure of singular spectrum decomposition for time series**

decomposition of a time series. The singular value decomposition of the

trajectory matrix and the anti-diagonal averaging leads to a time-series

decomposition. In this algorithm, a single free parameter, window length $K$,

is involved which is the FIR filter length for the time series. There are no

generally accepted criterion for the proper choice of the window length $K$.

Moreover, the proper window length depends on the specific problem which we are

interested in. Thus, it is important to monitor the spectral structure of the

SSA decomposition and its window length dependence in detail for the practical

application. In this paper, based on the filtering interpretation of SSA, it is

shown that the decomposition of the power spectrum for the original time series

is possible with the filters constructed from the eigenvectors of the

lagged-covariance matrix. With this, we can obtain insights into the spectral

structure of the SSA decomposition and it helps us for the proper choice of the

window length in the practical application of SSA.

**Perfect Graeco-Latin balanced incomplete block designs and related
designs**

defined and a few series of them have been constructed in Bagchi (2010). These

plans are very closely related to the `mutually orthogonal balanced nested

row-column designs’ of Morgan and Uddin (1996) and many other combinatorial

designs in the literature with different names like `BIBDs for two sets of

treatment’, `Graeco-Latin designs’ and `PERGOLAs’. In fact all of them may be

viewed as POTBs satisfying one or more additional conditions, making them

`optimal’. However, the PERGOLAs are defined to satisfy an additional property,

without which also it is optimal. Interestingly, this additional property is

satisfied by all the hitherto known examples of POTBs, even when their

definitions do not demand it. In this paper we present direct and recursive

constructions of POTBs. In the process we have constructed one design which

seems to be the first example of an `optimal’ two-factor POTB which is not a

PERGOLA (see Theorem \ref {POTB2}).

multidimensional transient symmetric Markov process with a multidimensional

Gaussian random field. This association, known as Dynkin’s isomorphism, has

profoundly influenced the studies of Markov properties of generalized Gaussian

random fields. Extending Dykin’s isomorphism, we study here a particular

generalized Gaussian Markov random field, namely, the de Wijs process that

originated in Georges Matheron’s pioneering work on mining geostatistics and,

following McCullagh (Ann. Statist. 30 (2002) 1225-1310), is now receiving

renewed attention in spatial statistics. This extension of Dynkin’s theory

associates the de Wijs process with the (recurrent) Brownian motion on the two

dimensional plane, grants us further insight into Matheron’s kriging formula

for the de Wijs process and highlights previously unexplored relationships of

the central Markov models in spatial statistics with Markov processes on the

plane.

**Almost Optimal Cover-Free Families**

$n$-bit strings such that: “in any $d:=r+s$ indices we see all patterns of

weight $r$”. CFFs have been of interest for a long time both in discrete

mathematics as part of block design theory, and in theoretical computer science

where they have found a variety of applications, for example, in parametrized

algorithms where they were introduced in the recent breakthrough work of Fomin,

Lokshtanov and Saurabh under the name `lopsided universal sets’.

In this paper we give the first explicit construction of cover-free families

of optimal size up to lower order multiplicative terms, {for any $r$ and $s$}.

In fact, our construction time is almost linear in the size of the family.

Before our work, such a result existed only for $r=d^{o(1)}$. and $r=

\omega(d/(\log\log d\log\log\log d))$. As a sample application, we improve the

running times of parameterized algorithms from the recent work of Gabizon,

Lokshtanov and Pilipczuk.

**Combinatorial properties of Nil-Bohr sets**

apply to a set of positive integers, namely $\mathrm{Nil}_d{-}\mathrm{Bohr}$

and $\mathrm{SG}_k$, as introduced by Host and Kra. We prove that any

$\mathrm{Nil}_d{-}\mathrm{Bohr}_0$ set is necessarily $\mathrm{SG}_k$ where

${k}$ is effectively bounded in terms of $d$. This partially resolves a

conjecture of Host and Kra.

**A genetic algorithm for autonomous navigation in partially observable
domain**

robotics. Although, in general, it may be challenging when an autonomous

vehicle is placed into partially observable domain. In this paper we consider

simplistic environment model and introduce a navigation algorithm based on

Learning Classifier System.

integer $k$ for which there exists a graph $G$ with maximum degree at most $k$

such that any proper edge colouring of $G$ yields a rainbow copy of $H$. In

this paper we prove a general upper bound on degree anti-Ramsey numbers,

determine the precise value of the degree anti-Ramsey number of any forest, and

prove an upper bound on the degree anti-Ramsey numbers of cycles of any length

which is best possible up to a multiplicative factor of $2$. Our proofs involve

a variety of tools, including a classical result of Bollob\’as concerning cross

intersecting families and a topological version of Hall’s Theorem due to

Aharoni, Berger and Meshulam.

**Application of Kullback-Leibler divergence for short-term user interest
detection**

are concentrated mainly on static user preference extraction. This approach

works well as an example for music recommendations when a user behavior tends

to be stable over long period of time, however the most common situation in

e-commerce is different which requires reactive algorithms based on a

short-term user activity analysis. This paper introduces a small mathematical

framework for short-term user interest detection formulated in terms of item

properties and its application for recommender systems enhancing. The framework

is based on the fundamental concept of information theory — Kullback-Leibler

divergence.

**Multiple Extended Target Tracking with Labelled Random Finite Sets**

commonly known as extended targets. These present a challenge for many tracking

algorithms, as they violate one of the key assumptions of the standard

measurement model. In this paper, a new algorithm is proposed for tracking

multiple extended targets in clutter, that is capable of estimating the number

of targets, as well the trajectories of their states, comprising the

kinematics, measurement rates and extents. The proposed technique is based on

modelling the multi-target state as a generalised labelled multi-Bernoulli

(GLMB) random finite set (RFS), within which the extended targets are modelled

using gamma Gaussian inverse Wishart (GGIW) distributions. A cheaper variant of

the algorithm is also proposed, based on the labelled multi-Bernoulli (LMB)

filter. The proposed GLMB/LMB-based algorithms are compared with an extended

target version of the cardinalised probability hypothesis density (CPHD)

filter, and simulation results show that the (G)LMB has improved estimation and

tracking performance.

**Coherent Frameworks for Statistical Inference serving Integrating
Decision Support Systems**

systems, needs to draw on the expertise of a variety of disparate panels of

experts and integrate this information coherently. To achieve this, diverse

supporting probabilistic models need to be networked together, the output of

one model providing the input to the next. In this paper we provide a

technology for designing an integrating decision support system and to enable

the centre to explore and compare the efficiency of different candidate

policies. We develop a formal statistical methodology to underpin this tool. In

particular, we derive sufficient conditions that ensure inference remains

coherent before and after relevant evidence is accommodated into the system.

The methodology is illustrated throughout using examples drawn from two

decision support systems: one designed for nuclear emergency crisis management

and the other to support policy makers in addressing the complex challenges of

food poverty in the UK.

**A Combinatorial Approximation Algorithm for Graph Balancing with Light
Hyper Edges**

\infty\}|C_{\max})$ is a classical problem in the field of machine scheduling.

In a landmark paper in 1990 [7], Lenstra, Shmoys, and Tardos gave a

2-approximation algorithm and proved that the problem cannot be approximated

within 1.5 unless P=NP. The upper and lower bounds of the problem have been

essentially unimproved in the intervening 25 years, despite several remarkable

successful attempts in some special cases of the problem [1,3,10] recently.

In this paper, we consider a special case called graph-balancing with light

hyper edges: jobs with weights in the range of $(\beta W, W]$ can be assigned

to at most two machines, where $4/7\leq \beta < 1$, and $W$ is the largest job

weight, while there is no restriction on the number of machines a job can be

assigned to if its weight is in the range of $(0,\beta W]$. Our main

contribution is a $5/3+\beta/3$ approximation algorithm, thus breaking the

barrier of 2 in this special case. Unlike the several recent works [1,3,10]

that make use of (configuration)-LPs, our algorithm is purely combinatorial.

In addition, we consider another special case where jobs have only two

possible weights $\{w, W\}$, and $w < W$. Jobs with weight $W$ can be assigned

to only two machines, while there is no restriction on the others. For this

special case, we give a 1.5-approximation algorithm, thus matching the general

lower bound (indeed the current 1.5 lower bound is established in an even more

restricted setting). Interestingly, depending on the specific values of $w$ and

$W$, sometimes our algorithm guarantees sub-1.5 approximation ratios.

**Partial Resampling to Approximate Covering Integer Programs**

and which have attracted a long line of research developing (randomized)

approximation algorithms. Srinivasan (2006) gave a rounding algorithm based on

the FKG inequality for systems which are “column-sparse.” This algorithm may

return an integer solution in which the variables get assigned large (integral)

values; Kolliopoulos & Young (2005) modified this algorithm to limit the

solution size, at the cost of a worse approximation ratio. We develop a new

rounding scheme based on the Partial Resampling variant of the Lov\'{a}sz Local

Lemma developed by Harris \& Srinivasan (2013). This achieves an approximation

ratio of $1 + \frac{\ln (\Delta_1+1)}{a} + O(\sqrt{ \frac{\log

(\Delta_1+1)}{a}} )$, where $a$ is the minimum covering constraint and

$\Delta_1$ is the maximum $\ell_1$-norm of any column of the covering matrix

(whose entries are scaled to lie in $[0,1]$); we also show nearly-matching

inapproximability and integrality-gap lower bounds.

Our approach improves asymptotically, in several different ways, over known

results. First, it replaces $\Delta_0$, the maximum number of nonzeroes in any

column (from the result of Srinivasan) by $\Delta_1$ which is always — and can

be much — smaller than $\Delta_0$; this is the first such result in this

context. Second, our algorithm automatically handles multi-criteria programs;

we achieve improved approximation ratios compared to the algorithm of

Srinivasan, and give, for the first time when the number of objective functions

is large, polynomial-time algorithms with good multi-criteria approximations.

We also significantly improve upon the upper-bounds of Kolliopoulos & Young

when the integer variables are required to be within $(1 + \epsilon)$ of some

given upper-bounds, and show nearly-matching inapproximability.

**Requirements for Open-Ended Evolution in Natural and Artificial Systems**

evolutionary systems. Many ideas exist in the biological literature beyond the

basic Darwinian requirements of variation, differential reproduction and

inheritance. I argue that these ideas can be seen as aspects of five

fundamental requirements for open-ended evolution: (1) robustly reproductive

individuals, (2) a medium allowing the possible existence of a practically

unlimited diversity of individuals and interactions, (3) individuals capable of

producing more complex offspring, (4) mutational pathways to other viable

individuals, and (5) drive for continued evolution. I briefly discuss

implications of this view for the design of artificial systems with greater

evolutionary potential.

**Sample-dependent first-passage time distribution in a disordered medium**

random traps is believed to be well-described by the annealed continuous time

random walk (CTRW). We propose an approximate expression for the

first-passage-time (FPT) distribution in a given sample that enables detailed

comparison of the two problems. For a system of finite size, the number and

spatial arrangement of deep traps yield significant sample-to-sample variations

in the FPT statistics. Numerical simulations of a quenched trap model with

power-law sojourn times confirm the existence of two characteristic time scales

and a non-self-averaging FPT distribution, as predicted by theory.

**Posterior contraction rates for deconvolution of Dirichlet-Laplace
mixtures**

Laplace density and a Dirichlet process prior on the mixing distribution. We

derive a contraction rate of the corresponding posterior distribution, both for

the mixing distribution relative to the Wasserstein metric and for the mixed

density relative to the Hellinger and $L_q$ metrics.

**Optimum design via I-divergence for stable estimation in generalized
regression models**

standardly based on the Fisher information matrix (cf. Atkinson et al (2014)

for a recent exposition). The corresponding optimality criteria are related to

the asymptotic properties of maximal likelihood (ML) estimators in such models.

However, in finite sample experiments there could be problems with

identifiability, stability and uniqueness of the ML estimate, which are not

reflected by the information matrices. In P\’azman and Pronzato (2014) is

discussed how to solve some of these estimability issues on the design stage of

an experiment in standard nonlinear regression. Here we want to extend this

design methodology to more general models based on exponential families of

distributions (binomial, Poisson, normal with parametrized variances, etc.).

The main tool for that is the information (or Kullback-Leibler) divergence,

which is closely related to the ML estimation.

**Assessment of petrophysical quantities inspired by joint multifractal
approach**

analyze some petrophysical quantities for characterizing the petroleum

reservoir. These quantities include Gamma emission (GR), sonic transient time

(DT) and Neutron porosity (NPHI) which are collected from four wells of a

reservoir. To quantify mutual interaction of petrophysical quantities, joint

multifractal random walk is implemented. This approach is based on the mutual

multiplicative cascade notion in the multifractal formalism and in this

approach $L_0$ represents a benchmark to describe the nature of

cross-correlation between two series. The analysis of the petrophysical

quantities revealed that GR for all wells has strongly multifractal nature due

to the considerable abundance of large fluctuations in various scales. The

variance of probability distribution function, $\lambda_{\ell}^2$, at scale

$\ell$ and its intercept determine the multifractal properties of the data sets

sourced by probability density function. The value of $\lambda_0 ^2$ for NPHI

data set is less than GR’s, however, DT shows a nearly monofractal behavior,

namely $\lambda_0 ^2\rightarrow 0$, so we find that $\lambda_0^2({\rm

GR})>\lambda_0^2({\rm NPHI})\gg\lambda_0^2({\rm DT})$. While, the value of

Hurst exponents can not discriminate between series GR, NPHI and DT. Joint

analysis of the petrophysical quantities for considered wells demonstrates that

$L_0$ has negative value for GR-NPHI confirming that finding shaly layers is in

competition with finding porous medium while it takes positive value for GR-DT

determining that continuum medium can be detectable by evaluating the

statistical properties of GR and its cross-correlation to DT signal.

**On local search and LP and SDP relaxations for k-Set Packing**

combinatorial optimization problems and knows a lot of applications. It is

equivalent to hypergraph matching and it is strongly related to the maximum

independent set problem. In this thesis we study the k-set packing problem

where given a universe U and a collection C of subsets over U, each of

cardinality k, one needs to find the maximum collection of mutually disjoint

subsets. Local search techniques have proved to be successful in the search for

approximation algorithms, both for the unweighted and the weighted version of

the problem where every subset in C is associated with a weight and the

objective is to maximise the sum of the weights. We make a survey of these

approaches and give some background and intuition behind them. In particular,

we simplify the algebraic proof of the main lemma for the currently best

weighted approximation algorithm of Berman ([Ber00]) into a proof that reveals

more intuition on what is really happening behind the math. The main result is

a new bound of k/3 + 1 + epsilon on the integrality gap for a polynomially

sized LP relaxation for k-set packing by Chan and Lau ([CL10]) and the natural

SDP relaxation [NOTE: see page iii]. We provide detailed proofs of lemmas

needed to prove this new bound and treat some background on related topics like

semidefinite programming and the Lovasz Theta function. Finally we have an

extended discussion in which we suggest some possibilities for future research.

We discuss how the current results from the weighted approximation algorithms

and the LP and SDP relaxations might be improved, the strong relation between

set packing and the independent set problem and the difference between the

weighted and the unweighted version of the problem.

**Characterizations of the spectral radius of nonnegative weakly
irreducible tensors via digraph**

characterizations of the spectral radius of $\mathcal{A}$, by using the digraph

of tensors. As applications, some bounds on the spectral radius of the

adjacency tensor and the signless Laplacian tensor of the $k$-uniform

hypergraphs are shown.

**Unification of Fusion Theories, Rules, Filters, Image Fusion and Target
Tracking Methods (UFT)**

presentations, and scientific grant applications (between 2004-2015) for the

unification of fusion theories, combinations of fusion rules, image fusion

procedures, filter algorithms, and target tracking methods for more accurate

applications to our real world problems – since neither fusion theory nor

fusion rule fully satisfy all needed applications. For each particular

application, one selects the most appropriate fusion space and fusion model,

then the fusion rules, and the algorithms of implementation. He has worked in

the Unification of the Fusion Theories (UFT), which looks like a cooking

recipe, better one could say like a logical chart for a computer programmer,

but one does not see another method to comprise/unify all things. The

unification scenario presented herein, which is now in an incipient form,

should periodically be updated incorporating new discoveries from the fusion

and engineering research.

**$F$ tests for the strip-split plot design**

components and the approximate degrees of freedom for each of the eight

possible mixed models of the strip-split plot design. We present an example to

illustrate the model and compare it to more traditional settings like a

three-way factorial design and a split-split plot model.

Ryj\'{a}\v{c}ek that every locally connected graph is weakly pancyclic.

For a connected locally connected graph $G$ of order at least $3$, our

results are as follows: If $G$ is $(K_1+(K_1\cup K_2))$-free, then $G$ is

weakly pancyclic. If $G$ is $(K_1+(K_1\cup K_2))$-free, then $G$ is fully cycle

extendable if and only if $2\delta(G)\geq n(G)$. If $G$ is $\{

K_1+K_1+\bar{K}_3,K_1+P_4\}$-free or $\{ K_1+K_1+\bar{K}_3,K_1+(K_1\cup

P_3)\}$-free, then $G$ is fully cycle extendable. If $G$ is distinct from

$K_1+K_1+\bar{K}_3$ and $\{ K_1+P_4,K_{1,4},K_2+(K_1\cup K_2)\}$-free, then $G$

is fully cycle extendable.

Furthermore, if $G$ is a connected graph of order at least $3$ such that

$$|N_G(u)\cap N_G(v)\cap N_G(w)|>|N_G(u)\setminus (N_G[v]\cup N_G[w])|$$ for

every induced path $vuw$ of order $3$ in $G$, then $G$ is fully cycle

extendable, which implies that every connected locally Ore or locally Dirac

graph of order at least $3$ is fully cycle extendable.

**Estimating an Activity Driven Hidden Markov Model**

time-dependent $\textit{activity levels}$ that drive transitions and emissions,

and show how to estimate its parameters. Our construction is motivated by the

problem of inferring human mobility on sub-daily time scales from, for example,

mobile phone records.

**Faster Spectral Sparsification of Laplacian and SDDM Matrix Polynomials**

algorithm for computing in nearly linear time a spectral sparsifier

$D-\widehat{M}_{N}$ for any matrix-polynomial of the form \[

D-\widehat{M}_{N}\approx_{\epsilon}D-D\sum_{i=0}^{N}\gamma_{i}(D^{-1} M)^i \]

where $B=D-M\in\mathbb{R}^{n\times n}$ is either Laplacian or SDDM matrix with

$m_B$ non-zero entries, and the vector $\gamma$ is induced by a mixture of

discrete Binomial distributions (MDBD).

Our sparsification algorithm runs in time

$\widetilde{O}(\epsilon^{-2}m_B\log^{3}n\cdot\log^{2}N+\epsilon^{-4}\cdot

nN\cdot\log^{4}n\cdot\log^{5}N)$ and exhibits a significant improvement over

the result of Cheng et al. [CCLPT15] whose running time is

$\widetilde{O}(\epsilon^{-2}\cdot m_B N^2\cdot\mathrm{poly}(\log n))$. Although

their algorithm handles general probability distributions, we show that MDBD

approximate large class of continuous probability distributions. Moreover, we

propose a nearly linear time algorithm that recovers exactly the coefficients

of any convex combination of $N+1$ distinct Binomial distributions (MDBD),

given as input the target probability distribution $\gamma$ that is induced by

a MDBD.

In addition, we show that a simple preprocessing step speeds up the run time

of Spielman and Peng’s [PS14] SDDM Solver from

$\widetilde{O}(m_B\log^{3}n\cdot\log^{2}\kappa_B)$ to

$\widetilde{O}(m_{B}\log^{2}n + n\log^{4}n\cdot\log^{5}\kappa_B)$, where

$\kappa_B$ is the condition number of the input SDDM matrix $B$.

When $B$ is a Laplacian matrix, our result leads to a faster algorithm for

analyzing the long term behaviour of random walks induced by transition

matrices of the form $\sum_{i=0}^{N}\gamma_{i}(D^{-1} A)^i$. We hope that our

sparsification algorithm for SDDM matrices can be useful in the context of

Machine Learning.

the domain of attraction of a stable law with index $\alpha \in (0,1)$. The

famous local renewal theorem of Garsia and Lamperti, also called strong renewal

theorem, is known to hold in complete generality only for $\alpha >

\frac{1}{2}$. Understanding when the strong renewal theorem holds for $\alpha

\le \frac{1}{2}$ is a long-standing problem, with sufficient conditions given

by Williamson, Doney and Chi. In this paper we give a complete solution,

providing explicit conditions that are both necessary and sufficient.

An analogous result has been independently and simultaneously proved by Doney

(arXiv:1507.06790).

**Nonlinear elasticity of disordered fiber networks**

strong nonlinearities. In the case of athermal gels (such as collagen-I) the

nonlinearity has long been associated with a crossover from a bending dominated

to a stretching dominated regime of elasticity. The physics of this crossover

is related to the existence of a central-force isostatic point and to the fact

that for most gels the bending modulus is small. This crossover induces scaling

behavior for the elastic moduli. In particular, for linear elasticity such a

scaling law has been demonstrated [Broedersz et al. Nature Physics, 2011 7,

983]. In this work we generalize the scaling to the nonlinear regime with a

two-parameter scaling law involving three critical exponents. We test the

scaling law numerically for two disordered lattice models, and find a good

scaling collapse for the shear modulus in both the linear and nonlinear

regimes. We compute all the critical exponents for the two lattice models and

discuss the applicability of our results to real systems.

**Strong averaging along foliated Lévy diffusions with heavy tails on
compact leaves**

discontinuous heavy-tailed L\’evy noise, which are invariant on the compact

horizontal leaves of a foliated manifold subject to small transversal random

perturbations. We extend a result for such diffusions with exponential moments

and bounded, deterministic perturbations to diffusions with polynomial moments

of order $p \geq 2$, perturbed by deterministic and stochastic integrals with

unbounded coefficients and polynomial moments. The main argument relies on a

result of the dynamical system for each individual jump increments of the

corresponding canonical Marcus equation. The example of L\’evy rotations on the

unit circle subject to perturbations by a planar Ornstein-Uhlenbeck process is

carried out in detail.

**On Bivariate Exponentiated Extended Weibull Family of Distributions**

bivariate exponentiated extended Weibull distributions. The model introduced

here is of Marshall-Olkin type. This new class of bivariate distributions

contains several bivariate lifetime models. Some mathematical properties of the

new class of distributions are studied. We provide the joint and conditional

density functions, the joint cumulative distribution function and the joint

survival function. Special bivariate distributions are investigated in some

detail. The maximum likelihood estimators are obtained using the EM algorithm.

We illustrate the usefulness of the new class by means of application to two

real data sets.

**Online Censoring for Large-Scale Regressions with Application to
Streaming Big Data**

methods, popular both for its simplicity as well as its broad applicability. On

par with data-intensive applications, the sheer size of linear regression

problems creates an ever growing demand for quick and cost efficient solvers.

Fortunately, a significant percentage of the data accrued can be omitted while

maintaining a certain quality of statistical inference with an affordable

computational budget. The present paper introduces means of identifying and

omitting “less informative” observations in an online and data-adaptive

fashion, built on principles of stochastic approximation and data censoring.

First- and second-order stochastic approximation maximum likelihood-based

algorithms for censored observations are developed for estimating the

regression coefficients. Online algorithms are also put forth to reduce the

overall complexity by adaptively performing censoring along with estimation.

The novel algorithms entail simple closed-form updates, and have provable

(non)asymptotic convergence guarantees. Furthermore, specific rules are

investigated for tuning to desired censoring patterns and levels of

dimensionality reduction. Simulated tests on real and synthetic datasets

corroborate the efficacy of the proposed data-adaptive methods compared to

data-agnostic random projection-based alternatives.

**Yield statistics of interpolated superoscillations**

introduced as a means for possibly making the use of the phenomenon of

superoscillation practical. In this paper we study how good is a

superoscillation that is not optimal. Namely, by how much is the yield

decreased when the signal departs from the optimal one. We consider two

situations. One is the case where the signal strictly obeys the interpolation

requirement and the other is when that requirement is relaxed. In the latter

case the yield can be increased at the expense of deterioration of signal

quality. An important conclusion is that optimizing superoscillations may be

challenging in terms of the precision needed, however, storing and using them

is not at that sensitive. This is of great importance in physical systems where

noise and error are inevitable.