A breakpoint detection error function for segmentation model selection and evaluation

We consider the multiple breakpoint detection problem, which is concerned with detecting the locations of several distinct changes in a one-dimensional noisy data series. We propose the breakpointError, a function that can be used to evaluate estimated breakpoint locations, given the known locations of true breakpoints. We discuss an application of the breakpointError for finding optimal penalties for breakpoint detection in simulated data. Finally, we show how to relax the breakpointError to obtain an annotation error function which can be used more readily in practice on real data. A fast C implementation of an algorithm that computes the breakpointError is available in an R package on R-Forge.
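
To make the idea of scoring guessed breakpoints against known true breakpoints concrete, here is a minimal Python sketch. It is not the breakpointError as defined by the authors, and it is not the R-Forge implementation: the function name `tolerance_breakpoint_error`, the fixed tolerance window `tol`, and the counting rule are all assumptions chosen for illustration. It simply counts true breakpoints that no guess comes close to (false negatives) plus guesses that are either far from every true breakpoint or redundant (false positives).

```python
from bisect import bisect_left

def tolerance_breakpoint_error(true_bps, guess_bps, tol):
    """Hypothetical stand-in for a breakpoint error: false negatives + false positives.

    Assumption: each true breakpoint b owns the window [b - tol, b + tol],
    and a guess is matched to its nearest true breakpoint only.
    """
    true_bps = sorted(true_bps)
    hits = [0] * len(true_bps)  # number of guesses landing in each window
    false_pos = 0
    for g in guess_bps:
        i = bisect_left(true_bps, g)
        # The nearest true breakpoint is one of the two neighbours of g.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(true_bps)]
        nearest = min(candidates, key=lambda j: abs(true_bps[j] - g), default=None)
        if nearest is not None and abs(true_bps[nearest] - g) <= tol:
            hits[nearest] += 1
        else:
            false_pos += 1  # guess is far from every true breakpoint
    false_neg = sum(1 for h in hits if h == 0)       # undetected true breakpoints
    false_pos += sum(h - 1 for h in hits if h > 1)   # redundant guesses in a window
    return false_neg + false_pos

# Two true breakpoints; one stray guess (30) and one redundant guess (52).
print(tolerance_breakpoint_error([10, 50], [11, 30, 49, 52], tol=3))  # prints 2
```

Under this simplified rule, an error of zero means exactly one guess fell near each true breakpoint, which is the kind of criterion one could minimize when comparing penalty values on simulated data.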

Differentially Private Online Learning for Video Recommendation with Social Big Data over Media Cloud

With the rapid growth of multimedia services and the enormous volume of video content offered in online social networks, users have difficulty finding content that matches their interests. Therefore, various personalized recommendation systems have been proposed. However, they ignore that the accelerated proliferation of social media data has led to the big data era, which has greatly impeded the process of video recommendation. In addition, none of them has considered both the privacy of users’ contexts (e.g., social status, ages and hobbies) and video service vendors’ repositories, which are extremely sensitive and of significant commercial value. To handle these problems, we propose a cloud-assisted differentially private video recommendation system based on distributed online learning. In our framework, service vendors are modeled as distributed cooperative learners, recommending videos according to the user’s context, while simultaneously adapting the video-selection strategy based on user-click feedback to maximize total user clicks (reward). Considering the sparsity and heterogeneity of big social media data, we also propose a novel \emph{geometric differentially private} model, which can greatly reduce the performance (recommendation accuracy) loss. Our simulation shows that the proposed algorithms outperform other existing methods and keep a delicate balance between computing accuracy and the level of privacy preservation.

GR2RSS: Publishing Linked Open Commerce Data as RSS and Atom Feeds

The integration of Linked Open Data (LOD) content in Web pages is a challenging and sometimes tedious task for Web developers. At the same time, most software packages for blogs, content management systems (CMS), and shop applications support the consumption of feed formats, namely RSS and Atom. In this technical report, we demonstrate an on-line tool that fetches e-commerce data from a SPARQL endpoint and syndicates the obtained results as RSS or Atom feeds. Our approach combines (1) the popularity and broad tooling support of existing feed formats, (2) the precision of queries against structured data built upon common Web vocabularies like schema.org, GoodRelations, FOAF, VCard, and WGS 84, and (3) the ease of integrating content from a large number of Web sites and other data sources in RDF in general.
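
The syndication step described above can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not the GR2RSS tool itself: the function name `bindings_to_rss`, the query variable names (`name`, `url`, `description`), and the sample data are hypothetical. The input is assumed to follow the SPARQL 1.1 Query Results JSON layout, and the output is a minimal RSS 2.0 document.

```python
import xml.etree.ElementTree as ET

def bindings_to_rss(sparql_json, feed_title, feed_link, feed_description):
    """Turn SPARQL JSON result bindings into a minimal RSS 2.0 string.

    Assumption: the query selected ?name, ?url and (optionally) ?description.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = feed_description
    for row in sparql_json["results"]["bindings"]:
        item = ET.SubElement(channel, "item")
        # Map each (hypothetical) query variable to an RSS item element.
        for var, tag in (("name", "title"), ("url", "link"),
                         ("description", "description")):
            if var in row:  # unbound variables are simply skipped
                ET.SubElement(item, tag).text = row[var]["value"]
    return ET.tostring(rss, encoding="unicode")

# Hand-written sample shaped like a SPARQL endpoint response (not real data).
sample = {
    "head": {"vars": ["name", "url", "description"]},
    "results": {"bindings": [
        {"name": {"type": "literal", "value": "Espresso machine"},
         "url": {"type": "uri", "value": "http://example.org/offers/1"},
         "description": {"type": "literal", "value": "Example offer"}},
        {"name": {"type": "literal", "value": "Coffee grinder"},
         "url": {"type": "uri", "value": "http://example.org/offers/2"}},
    ]},
}
feed_xml = bindings_to_rss(sample, "Example offers", "http://example.org/",
                           "Hypothetical feed built from SPARQL results")
print(feed_xml)
```

In a real deployment the `sample` dictionary would come from an HTTP request to a SPARQL endpoint, and an Atom serializer would follow the same pattern with different element names.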

Learning A Task-Specific Deep Architecture For Clustering

While deep networks have been shown to be highly effective in a wide range of applications, little effort has been spent on studying their potential in clustering. In this paper, we argue that the successful domain expertise of sparse coding in clustering is still valuable, and can be combined with the key ingredients of deep learning. A novel feed-forward architecture, named TAG-LISTA, is constructed from graph-regularized sparse coding. It is then trained with task-specific loss functions from end to end. The inner connections of the proposed network to sparse coding lead to more effective training. Moreover, by introducing auxiliary clustering tasks to the hierarchy of intermediate features, we present DTAG-LISTA and obtain a further performance boost. We report extensive experiments on several benchmark datasets, under a wide variety of settings. The results verify that the proposed model significantly outperforms generic architectures of the same parameter capacity, and also gains remarkable margins over several state-of-the-art methods.

Online Supervised Subspace Tracking

We present a framework for supervised subspace tracking, in which there are two time series x_t and y_t, one being the high-dimensional predictors and the other being the response variables, and the subspace tracking needs to take both sequences into consideration. It extends the classic online subspace tracking work, which can be viewed as tracking of x_t only. Our online sufficient dimensionality reduction (OSDR) is a meta-algorithm that can be applied to various cases including linear regression, logistic regression, multiple linear regression, multinomial logistic regression, support vector machine, the random dot product model and the multi-scale union-of-subspace model. OSDR reduces data dimensionality on the fly with low computational complexity, and it can also handle missing data and dynamic data. OSDR uses an alternating minimization scheme and updates the subspace via gradient descent on the Grassmannian manifold. The subspace update can be performed efficiently by exploiting the fact that the Grassmannian gradient with respect to the subspace is rank-one in many settings (or low-rank in certain cases). The optimization problem for OSDR is non-convex and hard to analyze in general; we provide a convergence analysis of OSDR in a simple linear regression setting. The good performance of OSDR compared with conventional unsupervised subspace tracking is demonstrated via numerical examples on simulated and real data.

Value function approximation via low-rank models

We propose a novel value function approximation technique for Markov decision processes. We consider the problem of compactly representing the state-action value function using a low-rank and sparse matrix model. The problem is to decompose a matrix that encodes the true value function into low-rank and sparse components, and we achieve this using Robust Principal Component Analysis (PCA). Under minimal assumptions, this Robust PCA problem can be solved exactly via the Principal Component Pursuit convex optimization problem. We test the procedure on several examples and demonstrate that our method yields approximations essentially identical to the true function.
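
For reference, the Principal Component Pursuit problem is usually written as the convex program below. The symbols are assumptions introduced here, since the abstract does not fix any notation: $M \in \mathbb{R}^{n_1 \times n_2}$ stands for the matrix encoding the state-action value function, $L$ for its low-rank component, $S$ for its sparse component, and $\lambda > 0$ for a weighting parameter.

```latex
\begin{aligned}
\min_{L,\,S}\quad & \|L\|_{*} + \lambda \,\|S\|_{1} \\
\text{subject to}\quad & L + S = M
\end{aligned}
```

Here $\|L\|_{*}$ is the nuclear norm (the sum of the singular values of $L$) and $\|S\|_{1}$ is the sum of the absolute values of the entries of $S$. A common choice in the Robust PCA literature is $\lambda = 1/\sqrt{\max(n_1, n_2)}$; whether the authors use this value is not stated in the abstract.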

A characterization of L(2, 1)-labeling number for trees with maximum degree 3

A large deviations approach to limit theory for heavy-tailed time series

A new look at duality for the symbiotic branching model

A piecewise deterministic scaling limit of Lifted Metropolis-Hastings in the Curie-Weiss model

A Telescopic Binary Learning Machine for Training Neural Networks

Adaptive Smoothing Algorithms for Nonsmooth Composite Convex Minimization

Adaptive, delayed-acceptance MCMC for targets with expensive likelihoods

ANCOVA: A heteroscedastic global test when there is curvature and two covariates

Approximations of standard equivalence relations and Bernoulli percolation at p_u

Band Depth Clustering for Nonstationary Time Series and Wind Speed Behavior

Bayesian Models for Heterogeneous Personalized Health Data

Bounds and Fixed-Parameter Algorithms for Weighted Improper Coloring (Extended Version)

Brewing Analytics Quality for Cloud Performance

Circuit topology of linear polymers: a statistical mechanical treatment

Defining and estimating causal direct and indirect effects when setting the mediator to specific values is not feasible

Distance measures and evolution of polymer chains in their topological space

Equidistribution of the conormal cycle of random nodal sets

Ergodic Backward Stochastic Difference Equations

Estimation of matrices with row sparsity

Evolving Unipolar Memristor Spiking Neural Networks

Explicit resilient functions matching Ajtai-Linial

Fast Algorithms for the computation of Fourier Extensions of arbitrary length

Fingerprinting-Based Positioning in Distributed Massive MIMO Systems

Generalising separating families of fixed size

Interdisciplinary and physics challenges of Network Theory

LAN property for stochastic differential equations with additive fractional noise and continuous time observation

Large deviations principle for the invariant measures of the 2D stochastic Navier-Stokes equations on a torus

Learning Deep $\ell_0$ Encoders

Metastatic liver tumour segmentation from discriminant Grassmannian manifolds

Multi-Sensor Slope Change Detection

On friendliness between trees

On Minimizing Crossings in Storyline Visualizations

On the critical group of the missing Moore graph

On the joint behaviour of speed and entropy of random walks on groups

On the Lower Bound for the Number of Facets of a k-Neighborly Polytope

On the Structure of nil-Temperley-Lieb Algebras of type A

Oscillations of quenched slowdown asymptotics for ballistic one-dimensional random walk in a random environment

Pure and Hybrid Evolutionary Computing in Global Optimization of Chemical Structures: from Atoms and Molecules to Clusters and Crystals

Pure morphic sequences and their standard forms

Robust Bayesian model selection for heavy-tailed linear regression using finite mixtures

Scalable Computation of Regularized Precision Matrices via Stochastic Optimization

Scalable Task-Based Algorithm for Multiplication of Block-Rank-Sparse Matrices

Schur-positive sets of permutations via products of grid classes

Segmentation of functional-biased series by a Bayesian approach

Sequential Information Guided Sensing

The central limit theorem for a sequence of random processes with space varying long memory

The classification of subfactors with index at most $5 \frac{1}{4}$

Tight Heffter Arrays Exist for all Possible Values: The Research Report

Towards Tight Bounds for the Streaming Set Cover Problem

Transitional annealed adaptive slice sampling for Gaussian process hyper-parameter estimation

Tumor Motion Tracking in Liver Ultrasound Images Using Mean Shift and Active Contour

Using Genetic Distance to Infer the Accuracy of Genomic Prediction

Variance estimation and allocation in the particle filter