Supervising Feature Influence

Causal influence measures for machine-learnt classifiers shed light on the reasons behind classification, and aid in identifying influential input features and revealing their biases. However, such analyses involve evaluating the classifier using datapoints that may be atypical of its training distribution. Standard methods for training classifiers that minimize empirical risk do not constrain the behavior of the classifier on such datapoints. As a result, training to minimize empirical risk does not distinguish among classifiers that agree on predictions in the training distribution but have wildly different causal influences. We term this problem covariate shift in causal testing and formally characterize conditions under which it arises. As a solution to this problem, we propose a novel active learning algorithm that constrains the influence measures of the trained model. We prove that any two predictors whose errors are close on both the original training distribution and the distribution of atypical points are guaranteed to have causal influences that are also close. Further, we empirically demonstrate with synthetic labelers that our algorithm trains models that (i) have causal influences similar to those of the labeler’s model, and (ii) generalize better to out-of-distribution points while (iii) retaining their accuracy on in-distribution points.


Generalized Laplace Inference in Multiple Change-Points Models

Under the classical long-span asymptotic framework, we develop a class of Generalized Laplace (GL) inference methods for the change-point dates in a linear time series regression model with multiple structural changes analyzed in, e.g., Bai and Perron (1998). The GL estimator is defined by an integration-based rather than optimization-based method and relies on the least-squares criterion function. It is interpreted as a classical (non-Bayesian) estimator and the inference methods proposed retain a frequentist interpretation. Since inference about the change-point dates is a nonstandard statistical problem, the original insight of Laplace to interpret a certain transformation of a least-squares criterion function as a statistical belief over the parameter space provides a better approximation of the uncertainty in the data about the change-points relative to existing methods. Simulations show that the GL estimator is in general more precise than the OLS estimator. On the theoretical side, depending on some input (smoothing) parameter, the class of GL estimators exhibits a dual limiting distribution; namely, the classical shrinkage asymptotic distribution of Bai and Perron (1998), or a Bayes-type asymptotic distribution.


Bag of Recurrence Patterns Representation for Time-Series Classification

Time-Series Classification (TSC) has attracted a lot of attention in pattern recognition, because a wide range of applications from different domains such as finance and health informatics deal with time-series signals. The Bag of Features (BoF) model has achieved great success in the TSC task by summarizing signals according to the frequencies of ‘feature words’ of a data-learned dictionary. This paper proposes embedding the Recurrence Plots (RP), a visualization technique for analysis of dynamic systems, in the BoF model for TSC. While the traditional BoF approach extracts features from 1D signal segments, this paper uses the RP to transform time-series into 2D texture images and then applies the BoF on them. Image representation of time-series enables us to explore different visual descriptors that are not available for 1D signals and to treat the TSC task as a texture recognition problem. Experimental results on the UCI time-series classification archive demonstrate a significant accuracy boost by the proposed Bag of Recurrence patterns (BoR), compared not only to the existing BoF models, but also to the state-of-the-art algorithms.
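The step of turning a 1D series into a 2D image can be illustrated with a minimal sketch of a thresholded recurrence plot. This is not the paper's pipeline: the function name recurrence_plot, the threshold eps, and the choice to compare raw samples without any time-delay embedding are all assumptions made for illustration, and the later BoF stage is not shown.

```python
import numpy as np

def recurrence_plot(series, eps):
    """Binary recurrence matrix: R[i, j] = 1 when |x_i - x_j| <= eps.

    Assumption: raw samples are compared directly (no time-delay embedding),
    and eps is a hypothetical, user-chosen distance threshold.
    """
    x = np.asarray(series, dtype=float)
    # Pairwise absolute differences between all time points.
    dist = np.abs(x[:, None] - x[None, :])
    return (dist <= eps).astype(np.uint8)

# Example: a sine wave sampled at 64 points gives a 64 x 64 texture-like image.
signal = np.sin(np.linspace(0.0, 4.0 * np.pi, 64))
rp = recurrence_plot(signal, eps=0.1)
```

The resulting matrix is symmetric with ones on its main diagonal, and can be handed to any image descriptor in place of the original 1D signal.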


Novel Fourier Quadrature Transforms and Analytic Signal Representations for Nonlinear and Non-stationary Time Series Analysis

The Hilbert transform (HT) and associated Gabor analytic signal (GAS) representation are well-known and widely used mathematical formulations for modeling and analysis of signals in various applications. In this study, like the HT, to obtain the quadrature component of a signal, we propose the novel discrete Fourier cosine quadrature transforms (FCQTs) and discrete Fourier sine quadrature transforms (FSQTs), designated as Fourier quadrature transforms (FQTs). Using these FQTs, we propose sixteen Fourier-Singh analytic signal (FSAS) representations with the following properties: (1) the real part of eight FSAS representations is the original signal and the imaginary part is the FCQT of the real part, (2) the imaginary part of eight FSAS representations is the original signal and the real part is the FSQT of the real part, (3) like the GAS, the Fourier spectrum of all the FSAS representations has only positive frequencies; however, unlike the GAS, the real and imaginary parts of the proposed FSAS representations are not orthogonal to each other. The Fourier decomposition method (FDM) is an adaptive data analysis approach to decompose a signal into a set of a small number of Fourier intrinsic band functions which are AM-FM components. This study also proposes a new formulation of the FDM using the discrete cosine transform (DCT) with the GAS and FSAS representations, and demonstrates its efficacy for improved time-frequency-energy representation and analysis of nonlinear and non-stationary time series.
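For reference, the conventional Gabor analytic signal that the abstract takes as its baseline can be sketched with a discrete Fourier transform: keep the zero-frequency term, double the positive frequencies, and zero the negative ones. This is the standard FFT-based construction, not the proposed FQTs or FSAS representations; the function name analytic_signal and the test signal are assumptions for illustration.

```python
import numpy as np

def analytic_signal(x):
    """Standard FFT-based analytic signal (illustrative baseline, not the FSAS).

    The real part of the result is x; the imaginary part is the discrete
    Hilbert transform of x.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    # One-sided weights: keep DC (and Nyquist for even n), double positive
    # frequencies, zero negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

# A cosine with a whole number of cycles: its quadrature component is a sine.
t = np.arange(256) / 256.0
x = np.cos(2.0 * np.pi * 8.0 * t)
z = analytic_signal(x)
```

Here z.real reproduces the cosine and z.imag is the matching sine, which is the quadrature relationship the abstract refers to when it compares its transforms with the HT.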


Technical Report: On the Usability of Hadoop MapReduce, Apache Spark & Apache Flink for Data Science

Distributed data processing platforms for cloud computing are important tools for large-scale data analytics. Apache Hadoop MapReduce has become the de facto standard in this space, though its programming interface is relatively low-level, requiring many implementation steps even for simple analysis tasks. This has led to the development of advanced dataflow-oriented platforms, most prominently Apache Spark and Apache Flink. Those platforms not only aim to improve performance through improved in-memory processing, but in particular provide built-in high-level data processing functionality, such as filtering and join operators, which should make data analysis tasks easier to develop than with plain Hadoop MapReduce. But is this indeed the case? This paper compares three prominent distributed data processing platforms, Apache Hadoop MapReduce, Apache Spark and Apache Flink, from a usability perspective. We report on the design, execution and results of a usability study with a cohort of master's students, who were learning and working with all three platforms in order to solve different use cases set in a data science context. Our findings show that Spark and Flink are preferred platforms over MapReduce. Among participants, there was no significant difference in perceived preference or development time between Spark and Flink as platforms for batch-oriented big data analysis. This study starts an exploration of the factors that make big data platforms more, or less, effective for users in data science.


Statistical Validity and Consistency of Big Data Analytics: A General Framework

Informatics and technological advancements have triggered the generation of huge volumes of data with varied complexity in their management and analysis. Big Data analytics is the practice of revealing hidden aspects of such data and making inferences from it. Although storage, retrieval and management of Big Data seem possible through efficient algorithm and system development, concern about statistical consistency remains to be addressed in view of its specific characteristics. Since Big Data does not conform to standard analytics, we need proper modification of the existing statistical theory and tools. Here we propose, with illustrations, a general statistical framework and an algorithmic principle for Big Data analytics that ensure statistical accuracy of the conclusions. The proposed framework has the potential to push forward advancement of Big Data analytics in the right direction. The partition-repetition approach proposed here is broad enough to encompass all practical data analytic problems.


Privacy-preserving Sensory Data Recovery

In recent years, a large number of wireless sensor networks of various kinds have been deployed for basic scientific work. Massive data loss is so common that there is a great demand for data recovery. While data recovery methods fulfil the requirement of accuracy, the potential privacy leakage caused by them is a serious concern. Thus the major challenge of sensory data recovery is the issue of effective privacy preservation. Existing algorithms can either accomplish accurate data recovery or solve the privacy issue, yet no single design is able to address these two problems simultaneously. Therefore, in this paper, we propose a novel approach, Privacy-Preserving Compressive Sensing with Multi-Attribute Assistance (PPCS-MAA). It applies the PPCS scheme to sensory data recovery, which can effectively encrypt sensory data without decreasing accuracy, because it maintains the homomorphic obfuscation property for compressive sensing. In addition, multiple environmental attributes from sensory datasets usually have strong correlation, so we design a Multi-Attribute Assistance (MAA) component to leverage this feature for better recovery accuracy. Combining PPCS with MAA, the novel recovery scheme can provide reliable privacy with high accuracy. Firstly, based on two real datasets, IntelLab and GreenOrbs, we reveal the inherited low-rank features as the ground truth and find such multi-attribute correlation. Secondly, we develop a PPCS-MAA algorithm to preserve privacy and optimize the recovery accuracy. Thirdly, the results of real data-driven simulations show that the algorithm outperforms the existing solutions.


Protection against Cloning for Deep Learning

The susceptibility of deep learning to adversarial attack can be understood in the framework of the Renormalisation Group (RG) and the vulnerability of a specific network may be diagnosed provided the weights in each layer are known. An adversary with access to the inputs and outputs could train a second network to clone these weights and, having identified a weakness, use them to compute the perturbation of the input data which exploits it. However, the RG framework also provides a means to poison the outputs of the network imperceptibly, without affecting their legitimate use, so as to prevent such cloning of its weights and thereby foil the generation of adversarial data.


Artificial Intelligence and Robotics

The recent successes of AI have captured the wildest imagination of both the scientific communities and the general public. Robotics and AI amplify human potential, increase productivity and are moving from simple reasoning towards human-like cognitive abilities. Current AI technologies are used in a set of application areas, ranging from healthcare, manufacturing, transport and energy to financial services, banking, advertising, management consulting and government agencies. The global AI market was around 260 billion USD in 2016 and is estimated to exceed 3 trillion by 2024. To understand the impact of AI, it is important to draw lessons from its past successes and failures, and this white paper provides a comprehensive explanation of the evolution of AI, its current status and future directions.


Transport-domain applications of widely used data sources in the smart transportation: A survey

The rapid growth of population and the permanent increase in the number of vehicles engender several issues in transportation systems, which in turn call for an intelligent and cost-effective approach to resolve the problems in an efficient manner. Smart transportation is a framework that leverages the power of Information and Communication Technology for acquisition, management, and mining of traffic-related data sources, which, in this study, are categorized into: 1) traffic flow sensors, 2) video image processors, 3) probe people and vehicles based on Global Positioning Systems (GPS), mobile phone cellular networks, and Bluetooth, 4) location-based social networks, 5) transit data with the focus on smart cards, and 6) environmental data. For each data source, first, the operational mechanism of the technology for capturing the data is succinctly described. Secondly, as the most salient feature of this study, the transport-domain applications of each data source that have been conducted by previous studies are reviewed and classified into the main groups. Thirdly, a number of possible future research directions are provided for all types of data sources. Moreover, in order to alleviate the shortcomings pertaining to each single data source and acquire a better understanding of mobility behavior in transportation systems, data fusion architectures are introduced to fuse the knowledge learned from a set of heterogeneous but complementary data sources. Finally, we briefly mention the current challenges and their corresponding solutions in smart transportation.


Deep Unsupervised Saliency Detection: A Multiple Noisy Labeling Perspective

The success of current deep saliency detection methods heavily depends on the availability of large-scale supervision in the form of per-pixel labeling. Such supervision, while labor-intensive and not always possible, tends to hinder the generalization ability of the learned models. By contrast, traditional unsupervised saliency detection methods based on handcrafted features, even though they have been surpassed by the deep supervised methods, are generally dataset-independent and could be applied in the wild. This raises a natural question: ‘Is it possible to learn saliency maps without using labeled data while improving the generalization ability?’ To this end, we present a novel perspective on unsupervised saliency detection through learning from multiple noisy labelings generated by ‘weak’ and ‘noisy’ unsupervised handcrafted saliency methods. Our end-to-end deep learning framework for unsupervised saliency detection consists of a latent saliency prediction module and a noise modeling module that work collaboratively and are optimized jointly. Explicit noise modeling enables us to deal with noisy saliency maps in a probabilistic way. Extensive experimental results on various benchmarking datasets show that our model not only outperforms all the unsupervised saliency methods by a large margin but also achieves comparable performance with the recent state-of-the-art supervised deep saliency methods.


Automatic Generation of Optimal Reductions of Distributions

A reduction of a source distribution is a collection of smaller sized distributions that are collectively equivalent to the source distribution with respect to the property of decomposability. That is, an arbitrary language is decomposable with respect to the source distribution if and only if it is decomposable with respect to each smaller sized distribution (in the reduction). The notion of reduction of distributions has previously been proposed to improve the complexity of decomposability verification. In this work, we address the problem of generating (optimal) reductions of distributions automatically. A (partial) solution to this problem is provided, which consists of 1) an incremental algorithm for the production of candidate reductions and 2) a reduction validation procedure. In the incremental production stage, backtracking is applied whenever a candidate reduction that cannot be validated is produced. A strengthened substitution-based proof technique is used for reduction validation, while a fixed template of candidate counter examples is used for reduction refutation; put together, they constitute our (partial) solution to the reduction verification problem. In addition, we show that a recursive approach for the generation of (small) reductions is easily supported.


COBRAS: Fast, Iterative, Active Clustering with Pairwise Constraints

Constraint-based clustering algorithms exploit background knowledge to construct clusterings that are aligned with the interests of a particular user. This background knowledge is often obtained by allowing the clustering system to pose pairwise queries to the user: should these two elements be in the same cluster or not? Active clustering methods aim to minimize the number of queries needed to obtain a good clustering by querying the most informative pairs first. Ideally, a user should be able to answer a couple of these queries, inspect the resulting clustering, and repeat these two steps until a satisfactory result is obtained. We present COBRAS, an approach to active clustering with pairwise constraints that is suited for such an interactive clustering process. A core concept in COBRAS is that of a super-instance: a local region in the data in which all instances are assumed to belong to the same cluster. COBRAS constructs such super-instances in a top-down manner to produce high-quality results early on in the clustering process, and keeps refining these super-instances as more pairwise queries are given to get more detailed clusterings later on. We experimentally demonstrate that COBRAS produces good clusterings at fast run times, making it an excellent candidate for the iterative clustering scenario outlined above.


Mining on Manifolds: Metric Learning without Labels

In this work we present a novel unsupervised framework for hard training example mining. The only input to the method is a collection of images relevant to the target application and a meaningful initial representation, provided e.g. by a pre-trained CNN. Positive examples are distant points on a single manifold, while negative examples are nearby points on different manifolds. Both types of examples are revealed by disagreements between Euclidean and manifold similarities. The discovered examples can be used in training with any discriminative loss. The method is applied to unsupervised fine-tuning of pre-trained networks for fine-grained classification and particular object retrieval. Our models are on par with or outperform prior models that are fully or partially supervised.


Universal Sentence Encoder

We present models for encoding sentences into embedding vectors that specifically target transfer learning to other NLP tasks. The models are efficient and result in accurate performance on diverse transfer tasks. Two variants of the encoding models allow for trade-offs between accuracy and compute resources. For both variants, we investigate and report the relationship between model complexity, resource consumption, the availability of transfer task training data, and task performance. Comparisons are made with baselines that use word-level transfer learning via pretrained word embeddings as well as baselines that do not use any transfer learning. We find that transfer learning using sentence embeddings tends to outperform word-level transfer. With transfer learning via sentence embeddings, we observe surprisingly good performance with minimal amounts of supervised training data for a transfer task. We obtain encouraging results on Word Embedding Association Tests (WEAT) targeted at detecting model bias. Our pre-trained sentence encoding models are made freely available for download and on TF Hub.


Unsupervised Textual Grounding: Linking Words to Image Concepts

Textual grounding, i.e., linking words to objects in images, is a challenging but important task for robotics and human-computer interaction. Existing techniques benefit from recent progress in deep learning and generally formulate the task as a supervised learning problem, selecting a bounding box from a set of possible options. To train these deep-net-based approaches, access to a large-scale dataset is required; however, constructing such a dataset is time-consuming and expensive. Therefore, we develop a completely unsupervised mechanism for textual grounding using hypothesis testing as a mechanism to link words to detected image concepts. We demonstrate our approach on the ReferIt Game dataset and the Flickr30k data, outperforming baselines by 7.98% and 6.96% respectively.


Iterative Visual Reasoning Beyond Convolutions

We present a novel framework for iterative visual reasoning. Our framework goes beyond current recognition systems that lack the capability to reason beyond a stack of convolutions. The framework consists of two core modules: a local module that uses spatial memory to store previous beliefs with parallel updates; and a global graph-reasoning module. Our graph module has three components: a) a knowledge graph where we represent classes as nodes and build edges to encode different types of semantic relationships between them; b) a region graph of the current image where regions in the image are nodes and spatial relationships between these regions are edges; c) an assignment graph that assigns regions to classes. Both the local module and the global module roll out iteratively and cross-feed predictions to each other to refine estimates. The final predictions are made by combining the best of both modules with an attention mechanism. We show strong performance over plain ConvNets, e.g., achieving an 8.4% absolute improvement on ADE measured by per-class average precision. Analysis also shows that the framework is resilient to missing regions for reasoning.


A systematic approach to improving the reliability and scale of evidence from health care data
TrackingNet: A Large-Scale Dataset and Benchmark for Object Tracking in the Wild
Nonlocal coupling among oscillators mediated by a slowly diffusing substance
Using symbolic computation to prove nonexistence of distance-regular graphs
Modeling Customer Engagement from Partial Observations
On the equivalence of inexact proximal ALM and ADMM for a class of convex composite programming
Quotients and lifts of symmetric directed graphs
Learning to Become an Expert: Deep Networks Applied To Super-Resolution Microscopy
Analysis of permanence time in emotional states: A case study using educational software
Who Let The Dogs Out? Modeling Dog Behavior From Visual Data
Probabilistic Knowledge Transfer for Deep Representation Learning
Experimental Parity-Induced Thermalization Gap in Disordered Ring Lattices
Defending against Adversarial Images using Basis Functions Transformations
Deep Learning Object Detection Methods for Ecological Camera Trap Data
Rank-Metric Codes and $q$-Polymatroids
Non-Convex Matrix Completion Against a Semi-Random Adversary
Deep Photometric Stereo on a Sunny Day
The Glassy Phase of Optimal Quantum Control
Features for Multi-Target Multi-Camera Tracking and Re-Identification
Memory Warps for Learning Long-Term Online Video Representations
A Survey on Deep Learning Methods for Robot Vision
Human Emotional Facial Expression Recognition
Boolean polynomial threshold functions and random tensors
Learning to Look around Objects for Top-View Representations of Outdoor Scenes
Congestion Pricing in a World of Self-driving vehicles: an Analysis of Different Strategies in Alternative Future Scenarios
Optimal Transport with Controlled Dynamics and Free End Times
Greedy Variance Estimation for the LASSO
Buildings, groups of Lie type, and random walks
Continuous Record Asymptotics for Structural Change Models
Tests for Forecast Instability and Forecast Failure under a Continuous Record Asymptotic Framework
Structural Risk Minimization for $C^{1,1}(\mathbb{R}^d)$ Regression
An Empirical Analysis of Constrained Support Vector Quantile Regression for Nonparametric Probabilistic Forecasting of Wind Power
Effective Capacity Analysis in Ultra-Dense Wireless Networks with Random Interference
Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks
Deep Texture Manifold for Ground Terrain Recognition
Randomized Primal-Dual Algorithms for Semi-Infinite Programming
Motion-Appearance Co-Memory Networks for Video Question Answering
Matrix Product Operators for Sequence to Sequence Learning
Adversarial Binary Coding for Efficient Person Re-identification
Modeling the spatio-temporal dynamics of land use change with recurrent neural networks
One Garnir to rule them all: on Specht modules and the CataLAnKe theorem
Lattice Walk Enumeration
Polynomial-Time Algorithms for Submodular Laplacian Systems
An LP-based hyperparameter optimization model for language modeling
Design of First-Order Optimization Algorithms via Sum-of-Squares Programming
Translational and rotational dynamical heterogeneities in granular systems
B-DCGAN: Evaluation of Binarized DCGAN for FPGA
Learning Free-Form Deformations for 3D Object Reconstruction
Best arm identification in multi-armed bandits with delayed feedback
On The Weak Representation Property in Progressively Enlarged Filtrations with an Application to Exponential Utility Maximization
Towards Unsupervised Automatic Speech Recognition Trained by Unaligned Speech and Text only
Weakly Aggregative Modal Logic: Characterization and Interpolation
Least conflict choosability
Synchronization Dynamics in the Presence of Coupling Delays and Phase Shifts
Context-aware Synthesis for Video Frame Interpolation
A simulation comparison of tournament designs for world men’s handball championships
Data-Driven Sensitivity Indices for Models With Dependent Inputs Using the Polynomial Chaos Expansion
A Review of Literature on Parallel Constraint Solving
A Fixed-Parameter Algorithm for the Max-Cut Problem on Embedded 1-Planar Graphs
Improving accuracy of Winograd convolution for DNNs
A real-time warning system for rear-end collision based on random forest classifier
Best Match Graphs
Hierarchical Sparse Channel Estimation for Massive MIMO
Dihedral angle prediction using generative adversarial networks
Copula Variational Bayes inference via information geometry
Modified SMOTE Using Mutual Information and Different Sorts of Entropies
On Hyperparameter Search in Cluster Ensembles
Energy of the Coulomb gas on the sphere at low temperature
Massive MIMO in Sub-6 GHz and mmWave: Physical, Practical, and Use-Case Differences
Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation
Exploiting Weak Supermodularity for Coalition-Proof Mechanisms
Some bounds on the uniquely restricted matching number
Generalized Bayesian D criterion for single-stratum and multistratum designs
On the restriction problem for discrete paraboloid in lower dimension
Regain Sliding super point from distributed edge routers by GPU
Lévy area of fractional Ornstein-Uhlenbeck process and parameter estimation
Computer-Assisted Text Analysis for Social Science: Topic Models and Beyond
Simplicial $G$-complexes and representation stability of polyhedral products
Stabilisation by noise on the boundary for a Chafee-Infante equation with dynamical boundary conditions
Multivariate second order Poincaré inequalities for Poisson functionals
Pointwise properties of martingales with values in Banach function spaces
3D Consistent Biventricular Myocardial Segmentation Using Deep Learning for Mesh Generation
On the distribution of rank and crank statistic for integer partitions
Webcam-based Eye Gaze Tracking under Natural Head Movement
Renewal theory for extremal Markov sequences of the Kendall type
Learning Deep Models for Face Anti-Spoofing: Binary or Auxiliary Supervision
Routing and Network Coding over a Cyclic Network for Online Video Gaming
A bijection for Euler’s partition theorem in the spirit of Bressoud
Frequent Item-set Mining without Ubiquitous Items
Functional CLT for martingale-like nonstationary dependent structures
Identifying Semantic Divergences in Parallel Text without Annotations
Energy-Efficient Transmission of Hybrid Array with Non-Ideal Power Amplifiers and Circuitry
Deep Reinforcement Learning for Traffic Light Control in Vehicular Networks
The sieving phenomenon for finite groups
Detection of Structural Change in Geographic Regions of Interest by Self Organized Mapping: Las Vegas City and Lake Mead across the Years
Over-the-Air Computation in MIMO Multi-Access Channels: Beamforming and Channel Feedback
Notes on computational-to-statistical gaps: predictions using statistical physics
Conformal Prediction in Learning Under Privileged Information Paradigm with Applications in Drug Discovery
Online Barycenter Estimation of Large Weighted Graphs
Colorless green recurrent networks dream hierarchically
Learning Kinematic Descriptions using SPARE: Simulated and Physical ARticulated Extendable dataset
Stochastic Gradient Hamiltonian Monte Carlo with Variance Reduction for Bayesian Inference
A Convex Reformulation of the Robust Freeway Network Control Problem with Controlled Merging Junctions
Dual graded graphs and Bratteli diagrams of towers of groups
Barren plateaus in quantum neural network training landscapes
A Monotonicity Property of a New Bernstein Type Operator
Energy-level statistics in strongly disordered systems with power-law hopping
Towards Open-Set Identity Preserving Face Synthesis
Two can play this Game: Visual Dialog with Discriminative Question Generation and Answering
MaskRNN: Instance Level Video Object Segmentation
Generative Modeling using the Sliced Wasserstein Distance