A Schur Complement Cheeger Inequality

Cheeger’s inequality shows that any undirected graph G with minimum nonzero normalized Laplacian eigenvalue \lambda_G has a cut with conductance at most O(\sqrt{\lambda_G}). Qualitatively, Cheeger’s inequality says that if the relaxation time of a graph is high, there is a cut that certifies this. However, there is a gap in this relationship, as cuts can have conductance as low as \Theta(\lambda_G). To better approximate the relaxation time of a graph, we consider a more general object. Instead of bounding the mixing time with cuts, we bound it with cuts in graphs obtained by Schur complementing out vertices from the graph G. Combinatorially, these Schur complements describe random walks in G restricted to a subset of its vertices. As a result, all Schur complement cuts have conductance at least \Omega(\lambda_G). We show that unlike with cuts, this inequality is tight up to a constant factor. Specifically, there is a Schur complement cut with conductance at most O(\lambda_G).

A Frequency Scaling based Performance Indicator Framework for Big Data Systems

It is important for big data systems to identify their performance bottleneck. However, the popular indicators such as resource utilizations, are often misleading and incomparable with each other. In this paper, a novel indicator framework which can directly compare the impact of different indicators with each other is proposed to identify and analyze the performance bottleneck efficiently. A methodology which can construct the indicator from the performance change with the CPU frequency scaling is described. Spark is used as an example of a big data system and two typical SQL benchmarks are used as the workloads to evaluate the proposed method. Experimental results show that the proposed method is accurate compared with the resource utilization method and easy to implement compared with some white-box method. Meanwhile, the analysis with our indicators lead to some interesting findings and valuable performance optimization suggestions for big data systems.

Cloud based Real-Time and Low Latency Scientific Event Analysis

Astronomy is well recognized as big data driven science. As the novel observation infrastructures are developed, the sky survey cycles have been shortened from a few days to a few seconds, causing data processing pressure to shift from offline to online. However, existing scientific databases focus on offline analysis of long-term historical data, not real-time and low latency analysis of large-scale newly arriving data. In this paper, a cloud based method is proposed to efficiently analyze scientific events on large-scale newly arriving data. The solution is implemented as a highly efficient system, namely Aserv. A set of compact data store and index structures are proposed to describe the proposed scientific events and a typical analysis pattern is formulized as a set of query operations. Domain aware filter, accuracy aware data partition, highly efficient index and frequently used statistical data designs are four key methods to optimize the performance of Aserv. Experimental results under the typical cloud environment show that the presented optimization mechanism can meet the low latency demand for both large data insertion and scientific event analysis. Aserv can insert 3.5 million rows of data within 3 seconds and perform the heaviest query on 6.7 billion rows of data also within 3 seconds. Furthermore, a performance model is given to help Aserv choose the right cloud resource setup to meet the guaranteed real-time performance requirement.

Beta Distribution Drift Detection for Adaptive Classifiers

With today’s abundant streams of data, the only constant we can rely on is change. For stream classification algorithms, it is necessary to adapt to concept drift. This can be achieved by monitoring the model error, and triggering counter measures as changes occur. In this paper, we propose a drift detection mechanism that fits a beta distribution to the model error, and treats abnormal behavior as drift. It works with any given model, leverages prior knowledge about this model, and allows to set application-specific confidence thresholds. Experiments confirm that it performs well, in particular when drift occurs abruptly.

Efficiently Charting RDF

We propose a visual query language for interactively exploring large-scale knowledge graphs. Starting from an overview, the user explores bar charts through three interactions: class expansion, property expansion, and subject/object expansion. A major challenge faced is performance: a state-of-the-art SPARQL engine may require tens of minutes to compute the multiway join, grouping and counting required to render a bar chart. A promising alternative is to apply approximation through online aggregation, trading precision for performance. However, state-of-the-art online aggregation algorithms such as Wander Join have two limitations for our exploration scenario: (1) a high number of rejected paths slows the convergence of the count estimations, and (2) no unbiased estimator exists for counts under the distinct operator. We thus devise a specialized algorithm for online aggregation that augments Wander Join with exact partial computations to reduce the number of rejected paths encountered, as well as a novel estimator that we prove to be unbiased in the case of the distinct operator. In an experimental study with random interactions exploring two large-scale knowledge graphs, our algorithm shows a clear reduction in error with respect to computation time versus Wander Join.

Dataset Distillation

Model distillation aims to distill the knowledge of a complex model into a simpler one. In this paper, we consider an alternative formulation called {\em dataset distillation}: we keep the model fixed and instead attempt to distill the knowledge from a large training dataset into a small one. The idea is to {\em synthesize} a small number of data points that do not need to come from the correct data distribution, but will, when given to the learning algorithm as training data, approximate the model trained on the original data. For example, we show that it is possible to compress 60,000 MNIST training images into just 10 synthetic {\em distilled images} (one per class) and achieve close to original performance with only a few steps of gradient descent, given a particular fixed network initialization. We evaluate our method in a wide range of initialization settings and with different learning objectives. Experiments on multiple datasets show the advantage of our approach compared to alternative methods in most settings.

Neural Non-Stationary Spectral Kernel

Standard kernels such as Mat\’ern or RBF kernels only encode simple monotonic dependencies within the input space. Spectral mixture kernels have been proposed as general-purpose, flexible kernels for learning and discovering more complicated patterns in the data. Spectral mixture kernels have recently been generalized into non-stationary kernels by replacing the mixture weights, frequency means and variances by input-dependent functions. These functions have also been modelled as Gaussian processes on their own. In this paper we propose modelling the hyperparameter functions with neural networks, and provide an experimental comparison between the stationary spectral mixture and the two non-stationary spectral mixtures. Scalable Gaussian process inference is implemented within the sparse variational framework for all the kernels considered. We show that the neural variant of the kernel is able to achieve the best performance, among alternatives, on several benchmark datasets.

A deconvolution method for reconstruction of data time series from intermittent systems

In this manuscript, we will investigate the deconvolution method for recovering pulse arrival times and amplitudes using synthetic data. For the deconvolution procedure to have hope of recovering amplitudes and arrivals, the average waiting time between events must be at least 10 times the time step.

An Analytical Approach to Improving Time Warping on Multidimensional Time Series

Dynamic time warping (\texttt{DTW}) is one of the most used distance functions to compare time series, e.\,g. in nearest neighbor classifiers. Yet, fast state of the art algorithms only compare 1-dimensional time series efficiently. One of these state of the art algorithms uses a lower bound (\texttt{LB}_\texttt{Keogh}) introduced by E. Keogh to prune \texttt{DTW} computations. We introduce \texttt{LB}_\texttt{Box} as a canonical extension to \texttt{LB}_\texttt{Keogh} on multi-dimensional time series. We evaluate its performance conceptually and experimentally and show that an alternative to \texttt{LB}_\texttt{Box} is necessary for multi-dimensional time series. We also propose a new algorithm for the dog-keeper distance (\texttt{DK}) which is an alternative distance function to \texttt{DTW} and show that it outperforms \texttt{DTW} with \texttt{LB}_\texttt{Box} by more than one order of magnitude on multi-dimensional time series.

Generative Adversarial Network Training is a Continual Learning Problem

Generative Adversarial Networks (GANs) have proven to be a powerful framework for learning to draw samples from complex distributions. However, GANs are also notoriously difficult to train, with mode collapse and oscillations a common problem. We hypothesize that this is at least in part due to the evolution of the generator distribution and the catastrophic forgetting tendency of neural networks, which leads to the discriminator losing the ability to remember synthesized samples from previous instantiations of the generator. Recognizing this, our contributions are twofold. First, we show that GAN training makes for a more interesting and realistic benchmark for continual learning methods evaluation than some of the more canonical datasets. Second, we propose leveraging continual learning techniques to augment the discriminator, preserving its ability to recognize previous generator samples. We show that the resulting methods add only a light amount of computation, involve minimal changes to the model, and result in better overall performance on the examined image and text generation tasks.

Bayesian graph convolutional neural networks for semi-supervised classification

Recently, techniques for applying convolutional neural networks to graph-structured data have emerged. Graph convolutional neural networks (GCNNs) have been used to address node and graph classification and matrix completion. Although the performance has been impressive, the current implementations have limited capability to incorporate uncertainty in the graph structure. Almost all GCNNs process a graph as though it is a ground-truth depiction of the relationship between nodes, but often the graphs employed in applications are themselves derived from noisy data or modelling assumptions. Spurious edges may be included; other edges may be missing between nodes that have very strong relationships. In this paper we adopt a Bayesian approach, viewing the observed graph as a realization from a parametric family of random graphs. We then target inference of the joint posterior of the random graph parameters and the node (or graph) labels. We present the Bayesian GCNN framework and develop an iterative learning procedure for the case of assortative mixed-membership stochastic block models. We present the results of experiments that demonstrate that the Bayesian formulation can provide better performance when there are very few labels available during the training process.

LEASGD: an Efficient and Privacy-Preserving Decentralized Algorithm for Distributed Learning

Distributed learning systems have enabled training large-scale models over large amount of data in significantly shorter time. In this paper, we focus on decentralized distributed deep learning systems and aim to achieve differential privacy with good convergence rate and low communication cost. To achieve this goal, we propose a new learning algorithm LEASGD (Leader-Follower Elastic Averaging Stochastic Gradient Descent), which is driven by a novel Leader-Follower topology and a differential privacy model.We provide a theoretical analysis of the convergence rate and the trade-off between the performance and privacy in the private setting.The experimental results show that LEASGD outperforms state-of-the-art decentralized learning algorithm DPSGD by achieving steadily lower loss within the same iterations and by reducing the communication cost by 30%. In addition, LEASGD spends less differential privacy budget and has higher final accuracy result than DPSGD under private setting.

A Concept-Centered Hypertext Approach to Case-Based Retrieval

The goal of case-based retrieval is to assist physicians in the clinical decision making process, by finding relevant medical literature in large archives. We propose a research that aims at improving the effectiveness of case-based retrieval systems through the use of automatically created document-level semantic networks. The proposed research tackles different aspects of information systems and leverages the recent advancements in information extraction and relational learning to revisit and advance the core ideas of concept-centered hypertext models. We propose a two-step methodology that in the first step addresses the automatic creation of document-level semantic networks, then in the second step it designs methods that exploit such document representations to retrieve relevant cases from medical literature. For the automatic creation of documents’ semantic networks, we design a combination of information extraction techniques and relational learning models. Mining concepts and relations from text, information extraction techniques represent the core of the document-level semantic networks’ building process. On the other hand, relational learning models have the task of enriching the graph with additional connections that have not been detected by information extraction algorithms and strengthening the confidence score of extracted relations. For the retrieval of relevant medical literature, we investigate methods that are capable of comparing the documents’ semantic networks in terms of structure and semantics. The automatic extraction of semantic relations from documents, and their centrality in the creation of the documents’ semantic networks, represent our attempt to go one step further than previous graph-based approaches.

MG-WFBP: Efficient Data Communication for Distributed Synchronous SGD Algorithms

Distributed synchronous stochastic gradient descent has been widely used to train deep neural networks on computer clusters. With the increase of computational power, network communications have become one limiting factor on system scalability. In this paper, we observe that many deep neural networks have a large number of layers with only a small amount of data to be communicated. Based on the fact that merging some short communication tasks into a single one may reduce the overall communication time, we formulate an optimization problem to minimize the training iteration time. We develop an optimal solution named merged-gradient WFBP (MG-WFBP) and implement it in our open-source deep learning platform B-Caffe. Our experimental results on an 8-node GPU cluster with 10GbE interconnect and trace-based simulation results on a 64-node cluster both show that the MG-WFBP algorithm can achieve much better scaling efficiency than existing methods WFBP and SyncEASGD.

A Frank-Wolfe Framework for Efficient and Effective Adversarial Attacks
Optimal Learning for Dynamic Coding in Deadline-Constrained Multi-Channel Networks
From Recognition to Cognition: Visual Commonsense Reasoning
sCAKE: Semantic Connectivity Aware Keyword Extraction
Adaptive Control By Regulation-Triggered Batch Least-Squares Estimation of Non-Observable Parameters
Part-level Car Parsing and Reconstruction from Single Street View
Maximizing Multivariate Information with Error-Correcting Codes
Robust Artificial Intelligence and Robust Human Organizations
CIAN: Cross-Image Affinity Net for Weakly Supervised Semantic Segmentation
Algae Detection Using Computer Vision and Deep Learning
Entropy and drift for word metrics on relatively hyperbolic groups
Data Management in Time-Domain Astronomy: Requirements and Challenges
Research on CRO’s Dilemma In Sapiens Chain: A Game Theory Method
Fair Division with a Secretive Agent
AstroServ: Distributed Database for Serving Large-Scale Full Life-Cycle Astronomical Data
Sampling Techniques for Large-Scale Object Detection from Sparsely Annotated Objects
Object Tracking by Reconstruction with View-Specific Discriminative Correlation Filters
Exploiting Numerical Sparsity for Efficient Learning : Faster Eigenvector Computation and Regression
Efficient non-uniform quantizer for quantized neural network targeting reconfigurable hardware
Affinity Derivation and Graph Merge for Instance Segmentation
Automatic Image Stylization Using Deep Fully Convolutional Networks
An Optimal Space Lower Bound for Approximating MAX-CUT
The synthesis of data from instrumented structures and physics-based models via Gaussian processes
UnDEMoN 2.0: Improved Depth and Ego Motion Estimation through Deep Image Sampling
Optimal switching problems with an infinite set of modes: an approach by randomization and constrained backward SDEs
Chasing the Echo State Property
DSBI: Double-Sided Braille Image Dataset and Algorithm Evaluation for Braille Dots Detection
Are 2D-LSTM really dead for offline text recognition?
Making Agents’ Abilities Explicit
Kernel-based Multi-Task Contextual Bandits in Cellular Network Configuration
Efficient Image Retrieval via Decoupling Diffusion into Online and Offline Processing
Hermitian Laplacians and a Cheeger inequality for the Max-2-Lin problem
Beyond One Glance: Gated Recurrent Architecture for Hand Segmentation
Robust active attacks on social graphs
Non-overlapping Dyck matrices
DeepVisInterests: CNN-Ontology Prediction of Users Interests from Social Images
Large-scale analysis of user exposure to online advertising in Facebook
Single-Agent Policy Tree Search With Guarantees
On the martingale property in the rough Bergomi model
Near Gaussian Multiple Access Channel Capacity Detection and Decoding
Adaptive Edge Process Migration for IoT in Heterogeneous Cloud-Fog-Edge Computing Environment
Deep Geometric Prior for Surface Reconstruction
A Real-Time Remote IDS Testbed for Connected Vehicles
Deep Learned Frame Prediction for Video Compression
Robust Semi-Supervised Learning when Labels are Missing at Random
Predicting the Flu from Instagram
Subsampling (weighted smooth) empirical copula processes
A Bayesian model of acquisition and clearance of bacterial colonization
Lévy noise induced escape in the Morris-Lecar model
MagicVO: End-to-End Monocular Visual Odometry through Deep Bi-directional Recurrent Convolutional Neural Network
Extracting conditionally heteroscedastic components using ICA
One-Shot Item Search with Multimodal Data
The Fact Extraction and VERification (FEVER) Shared Task
Optimal Designs for Second-Order Interactions in Paired Comparison Experiments with Binary Attributes
Noise2Void – Learning Denoising from Single Noisy Images
Modelling Agents Endowed with Social Practices: Static Aspects
GarNet: A Two-stream Network for Fast and Accurate 3D Cloth Draping
Eliminating Exposure Bias and Loss-Evaluation Mismatch in Multiple Object Tracking
An explicit representation and enumeration for negacyclic codes of length $2^kn$ over $\mathbb{Z}_4+u\mathbb{Z}_4$
Systematic Exploration of the High Likelihood Set of Phylogenetic Tree Topologies
Besov class via heat semigroup on Dirichlet spaces II: BV functions and Gaussian heat kernel estimates
An explicit representation and enumeration for self-dual cyclic codes over $\mathbb{F}_{2^m}+u\mathbb{F}_{2^m}$ of length $2^s$
Convexity in scientific collaboration networks
Higher-order approximate confidence intervals
Finding perfect matchings in random regular graphs in linear time
A spatially varying change points model for monitoring glaucoma progression using visual field data
Rotting bandits are no harder than stochastic ones
Performance Analysis of a Low-Interference N-Continuous OFDM Scheme
Dense xUnit Networks
Understanding the Importance of Single Directions via Representative Substitution
Fast Object Detection in Compressed Video
Combining Deep Learning and Qualitative Spatial Reasoning to Learn Complex Structures from Sparse Examples with Noise
Learning State Representations in Complex Systems with Multimodal Data
Measuring Effects of Medication Adherence on Time-Varying Health Outcomes using Bayesian Dynamic Linear Models
Recycling cardiogenic artifacts in impedance pneumography
Ultra-dense Radio Access Networks for Smart Cities: Cloud-RAN, Fog-RAN and ‘cell-free’ Massive MIMO
Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion
Robust Classification of Financial Risk
MobiFace: A Lightweight Deep Learning Face Recognition on Mobile Devices
Automatic Face Aging in Videos via Deep Reinforcement Learning
Bilinear Parameterization For Differentiable Rank-Regularization
Learning to detect dysarthria from raw speech
Classifications of quasitrivial semigroups
Counterexamples to a conjecture of Harris on Hall ratio
Unprocessing Images for Learned Raw Denoising
See before you see: Real-time high speed motion prediction using fast aperture-robust event-driven visual flow
SOC: hunting the underground inside story of the ethereum Social-network Opinion and Comment
Spatial log-Gaussian Cox processes in Hilbert spaces
Reliable uncertainty estimate for antibiotic resistance classification with Stochastic Gradient Langevin Dynamics
Understanding and Improving Kernel Local Descriptors
The Structure of Optimal Private Tests for Simple Hypotheses
Finite-Time Stability of Adaptive Parameter Estimation and Control
Knots in random neural networks
Fairness Under Unawareness: Assessing Disparity When Protected Class Is Unobserved
FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery
The Capacity of Private Information Retrieval from Decentralized Uncoded Caching Databases
Cross-Lingual Approaches to Reference Resolution in Dialogue Systems
Class-Distinct and Class-Mutual Image Generation with GANs
Label-Noise Robust Generative Adversarial Networks
Integrated Object Detection and Tracking with Tracklet-Conditioned Detection
Deformable ConvNets v2: More Deformable, Better Results
Central limit theorems with a rate of convergence for time-dependent intermittent maps