Relational Autoencoder for Feature Extraction

Feature extraction becomes increasingly important as data grows high dimensional. The autoencoder, a neural network based feature extraction method, has achieved great success in generating abstract features of high dimensional data. However, it fails to consider the relationships between data samples, which may affect experimental results when using original and new features. In this paper, we propose a Relational Autoencoder model that considers both data features and their relationships. We also extend it to work with other major autoencoder models, including the Sparse Autoencoder, Denoising Autoencoder and Variational Autoencoder. The proposed relational autoencoder models are evaluated on a set of benchmark datasets, and the experimental results show that considering data relationships can generate more robust features, which achieve lower reconstruction loss and, in turn, a lower error rate in subsequent classification compared to the other autoencoder variants.
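As a rough illustration of an objective that accounts for both features and sample relationships, the sketch below adds a relationship term to the usual reconstruction error. The abstract does not give the loss, so everything here is an assumption: the relationship measure (pairwise inner products), the weight `alpha`, and the function names are all hypothetical.

```python
# Hypothetical sketch: reconstruction error plus a relationship-preservation term.
# The inner-product relationship and the alpha weighting are assumptions,
# not taken from the abstract.

def gram(rows):
    """Pairwise inner products between samples (one possible relationship measure)."""
    return [[sum(a * b for a, b in zip(u, v)) for v in rows] for u in rows]

def mse(a, b):
    """Mean squared error between two equally shaped matrices (lists of rows)."""
    total = sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    count = sum(len(r) for r in a)
    return total / count

def relational_loss(x, x_rec, alpha=0.5):
    """Weighted sum of feature reconstruction error and relationship error."""
    return (1 - alpha) * mse(x, x_rec) + alpha * mse(gram(x), gram(x_rec))

x = [[1.0, 0.0], [0.0, 1.0]]
perfect = relational_loss(x, x)                    # identical reconstruction
collapsed = relational_loss(x, [[0.0, 1.0], [0.0, 1.0]])  # both samples mapped to one point
```

A reconstruction that collapses distinct samples onto one point is penalized by both terms, which is the intuition behind adding the relationship term.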

Detection of Adversarial Training Examples in Poisoning Attacks through Anomaly Detection

Machine learning has become an important component for many systems and applications including computer vision, spam filtering, malware and network intrusion detection, among others. Despite the capabilities of machine learning algorithms to extract valuable information from data and produce accurate predictions, it has been shown that these algorithms are vulnerable to attacks. Data poisoning is one of the most relevant security threats against machine learning systems, where attackers can subvert the learning process by injecting malicious samples in the training data. Recent work in adversarial machine learning has shown that the so-called optimal attack strategies can successfully poison linear classifiers, degrading the performance of the system dramatically after compromising a small fraction of the training dataset. In this paper we propose a defence mechanism to mitigate the effect of these optimal poisoning attacks based on outlier detection. We show empirically that the adversarial examples generated by these attack strategies are quite different from genuine points, as no detectability constraints are considered to craft the attack. Hence, they can be detected with an appropriate pre-filtering of the training dataset.
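The pre-filtering idea can be sketched with a very simple per-class outlier rule. The abstract does not say which detector is used, so the rule below (distance to the coordinate-wise median of each class, with a cutoff of `factor` times the median distance) is an assumed stand-in, and `prefilter` and `factor` are hypothetical names.

```python
import math
from statistics import median

def prefilter(points, labels, factor=3.0):
    """Return sorted indices of points kept after per-class outlier removal.

    Assumed rule (not from the abstract): a point is dropped when its distance
    to the coordinate-wise median of its class exceeds factor * median distance.
    """
    keep = []
    for cls in set(labels):
        idx = [i for i, y in enumerate(labels) if y == cls]
        dims = range(len(points[idx[0]]))
        centre = [median(points[i][d] for i in idx) for d in dims]
        dists = {i: math.dist(points[i], centre) for i in idx}
        cutoff = factor * median(dists.values())
        keep.extend(i for i in idx if dists[i] <= cutoff)
    return sorted(keep)

# Toy data: index 4 is a far-away point injected with label 0.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0),
          (3.0, 3.0), (3.1, 3.0), (3.0, 3.1), (3.1, 3.1)]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1]
kept = prefilter(points, labels)
```

The filter works on this toy set only because the injected point lies far from the genuine ones, which mirrors the abstract's observation that such attacks are crafted without detectability constraints.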

PoTrojan: powerful neural-level trojan designs in deep learning models

With the popularity of deep learning (DL), artificial intelligence (AI) has been applied in many areas of human life. Neural networks, or artificial neural networks (NNs), the main technique behind DL, have been extensively studied to facilitate computer vision and natural language recognition. However, the more we rely on information technology, the more vulnerable we are. That is, malicious NNs could pose a huge threat in the so-called coming AI era. In this paper, for the first time in the literature, we propose a novel approach to design and insert powerful neural-level trojans, or PoTrojans, in pre-trained NN models. Most of the time, PoTrojans remain inactive, not affecting the normal functions of their host NN models. PoTrojans can only be triggered under very rare conditions. Once activated, however, PoTrojans can cause the host NN models to malfunction, producing false predictions or classifications, which is a significant threat to human society in the AI era. We explain the principles of PoTrojans and the ease of designing and inserting them in pre-trained deep learning models. PoTrojans do not modify the existing architecture or parameters of the pre-trained models and require no re-training. Hence, the proposed method is very efficient.

Imitation networks: Few-shot learning of neural networks from scratch

In this paper, we propose imitation networks, a simple but effective method for training neural networks with a limited amount of training data. Our approach inherits the idea of knowledge distillation, which transfers knowledge from a deep or wide reference model to a shallow or narrow target model. The proposed method employs this idea to mimic the predictions of reference estimators that are much more robust against overfitting than the network we want to train. Unlike almost all previous work on knowledge distillation, which requires a large amount of labeled training data, the proposed method requires only a small amount of training data. Instead, we introduce pseudo training examples that are optimized as a part of the model parameters. Experimental results on several benchmark datasets demonstrate that the proposed method outperformed all the other baselines, such as naive training of the target model and standard knowledge distillation.
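For readers unfamiliar with knowledge distillation, the generic objective it builds on can be sketched as a cross-entropy between temperature-softened reference and target predictions. This is only the standard distillation loss under assumed names (`distillation_loss`, `temperature`); it does not reproduce the reference estimators or the optimized pseudo training examples described in the abstract.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student's softened output against the teacher's."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))

teacher = [4.0, 1.0, 0.5]
close = distillation_loss([3.5, 1.2, 0.4], teacher)  # student agrees with teacher
far = distillation_loss([0.5, 1.0, 4.0], teacher)    # student prefers the wrong class
```

A student whose predictions track the reference model gets a lower loss, so minimizing this quantity pulls the target model toward the reference predictions.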

System G Distributed Graph Database

Motivated by the need to extract knowledge and value from interconnected data, graph analytics on big data is a very active area of research in both industry and academia. To support graph analytics efficiently, a large number of in-memory graph libraries, graph processing systems and graph databases have emerged. Projects in each of these categories focus on particular aspects such as static versus dynamic graphs, offline versus online processing, small versus large graphs, etc. While there has been much progress in graph processing in the past decades, there is still a need for fast graph processing using a cluster of machines with distributed storage. In this paper, we discuss a novel distributed graph database called System G, designed for efficient graph data storage and processing on modern computing architectures. In particular, we describe a single-node graph database and a runtime and communication layer that allows us to compose a distributed graph database from multiple single-node instances. From various industry requirements, we find that fast insertions and large-volume concurrent queries are critical parts of graph databases, and we optimize our database for such features. We experimentally show the efficiency of System G for storing data and processing graph queries on state-of-the-art platforms.

Deep Private-Feature Extraction

We present and evaluate Deep Private-Feature Extractor (DPFE), a deep model which is trained and evaluated based on information theoretic constraints. Using the selective exchange of information between a user’s device and a service provider, DPFE enables the user to prevent certain sensitive information from being shared with a service provider, while allowing them to extract approved information using their model. We introduce and utilize the log-rank privacy, a novel measure to assess the effectiveness of DPFE in removing sensitive information and compare different models based on their accuracy-privacy tradeoff. We then implement and evaluate the performance of DPFE on smartphones to understand its complexity, resource demands, and efficiency tradeoffs. Our results on benchmark image datasets demonstrate that under moderate resource utilization, DPFE can achieve high accuracy for primary tasks while preserving the privacy of sensitive features.

Adversarial Metric Learning

In the past decades, intensive effort has gone into designing various loss functions and metric forms for the metric learning problem. These improvements have shown promising results when the test data is similar to the training data. However, the trained models often fail to produce reliable distances on ambiguous test pairs due to the distribution bias between the training set and the test set. To address this problem, Adversarial Metric Learning (AML) is proposed in this paper, which automatically generates adversarial pairs to remedy the distribution bias and facilitate robust metric learning. Specifically, AML consists of two adversarial stages, i.e. confusion and distinguishment. In the confusion stage, ambiguous but critical adversarial data pairs are adaptively generated to mislead the learned metric. In the distinguishment stage, a metric is exhaustively learned to distinguish both the adversarial pairs and the original training pairs as well as possible. Thanks to the challenges posed by the confusion stage in this competing process, the AML model is able to capture difficult cases that are not contained in the original training pairs, so the discriminability of AML can be significantly improved. The entire model is formulated as an optimization framework, whose global convergence is theoretically proved. The experimental results on toy data and practical datasets clearly demonstrate the superiority of AML over representative state-of-the-art metric learning methodologies.

Learning Robust Options

Robust reinforcement learning aims to produce policies that have strong guarantees even in the face of environments/transition models whose parameters have strong uncertainty. Existing work uses value-based methods and the usual primitive action setting. In this paper, we propose robust methods for learning temporally abstract actions, in the framework of options. We present a Robust Options Policy Iteration (ROPI) algorithm with convergence guarantees, which learns options that are robust to model uncertainty. We utilize ROPI to learn robust options with the Robust Options Deep Q Network (RO-DQN) that solves multiple tasks and mitigates model misspecification due to model uncertainty. We present experimental results which suggest that policy iteration with linear features may have an inherent form of robustness when using coarse feature representations. In addition, we present experimental results which demonstrate that robustness helps policy iteration implemented on top of deep neural networks to generalize over a much broader range of dynamics than non-robust policy iteration.

Video Event Recognition and Anomaly Detection by Combining Gaussian Process and Hierarchical Dirichlet Process Models

In this paper, we present an unsupervised learning framework for analyzing activities and interactions in surveillance videos. In our framework, three levels of video events are connected by a Hierarchical Dirichlet Process (HDP) model: low-level visual features, simple atomic activities, and multi-agent interactions. Atomic activities are represented as distributions of low-level features, while complicated interactions are represented as distributions of atomic activities. This learning process is unsupervised. Given a training video sequence, low-level visual features are extracted based on optic flow and then clustered into different atomic activities, and video clips are clustered into different interactions. The HDP model automatically decides the number of clusters, i.e. the categories of atomic activities and interactions. Based on the learned atomic activities and interactions, a training dataset is generated to train the Gaussian Process (GP) classifier. The trained GP models are then applied to newly captured video to classify interactions and detect abnormal events in real time. Furthermore, the temporal dependencies between video events learned by HDP-Hidden Markov Models (HMM) are effectively integrated into the GP classifier to enhance the accuracy of the classification in newly captured videos. Our framework couples the benefits of the generative model (HDP) with the discriminant model (GP). We provide detailed experiments showing that our framework achieves favorable performance in real-time video event classification in a crowded traffic scene.

Predictive Neural Networks

Recurrent neural networks are a powerful means to cope with time series. We show that even linearly activated recurrent neural networks can approximate any time-dependent function f(t) given by a number of function values. The approximation can effectively be learned by simply solving a linear equation system; no backpropagation or similar methods are needed. Furthermore, the network size can be reduced by taking only the most relevant components of the network. Thus, in contrast to others, our approach not only learns network weights but also the network architecture. The networks have interesting properties: in the stationary case they end up in ellipse trajectories in the long run, and they allow the prediction of further values and compact representations of functions. We demonstrate this by several experiments, among them multiple superimposed oscillators (MSO) and robotic soccer. Predictive neural networks outperform the previous state-of-the-art for the MSO task with a minimal number of units.
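To make "learned by simply solving a linear equation system" concrete, here is a toy stand-in rather than the paper's construction: a two-term linear recurrence x[t+1] = a*x[t] + b*x[t-1] is fitted to samples of a single oscillator by solving the 2x2 least-squares normal equations. The model form, the function names, and the test signal are all assumptions made for illustration.

```python
import math

def fit_linear_recurrence(series):
    """Least-squares fit of x[t+1] = a*x[t] + b*x[t-1] via the 2x2 normal equations."""
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t in range(1, len(series) - 1):
        u, v, y = series[t], series[t - 1], series[t + 1]
        s11 += u * u
        s12 += u * v
        s22 += v * v
        r1 += u * y
        r2 += v * y
    det = s11 * s22 - s12 * s12          # Cramer's rule on [[s11, s12], [s12, s22]]
    a = (r1 * s22 - s12 * r2) / det
    b = (s11 * r2 - s12 * r1) / det
    return a, b

def forecast(series, a, b, steps):
    """Roll the fitted recurrence forward from the last two observed values."""
    values = list(series[-2:])
    for _ in range(steps):
        values.append(a * values[-1] + b * values[-2])
    return values[2:]

omega = 0.3
observed = [math.sin(omega * t) for t in range(40)]
a, b = fit_linear_recurrence(observed)
predicted = forecast(observed, a, b, 5)
actual = [math.sin(omega * t) for t in range(40, 45)]
```

A sampled sinusoid satisfies this recurrence exactly with a = 2*cos(omega) and b = -1, so no iterative training is needed to recover the dynamics and extrapolate them.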

Learning Localized Spatio-Temporal Models From Streaming Data

We address the problem of predicting spatio-temporal processes with temporal patterns that vary across spatial regions when data is obtained as a stream, that is, when the training dataset is augmented sequentially. Specifically, we develop a localized spatio-temporal covariance model of the process that can capture spatially varying temporal periodicities in the data. We then apply a covariance-fitting methodology to learn the model parameters, which yields a predictor that can be updated sequentially with each new data point. The proposed method is evaluated using both synthetic and real climate data, which demonstrate its ability to accurately predict data missing in spatial regions over time.

Large Scale Constrained Linear Regression Revisited: Faster Algorithms via Preconditioning

In this paper, we revisit the large-scale constrained linear regression problem and propose faster methods based on some recent developments in sketching and optimization. Our algorithms combine (accelerated) mini-batch SGD with a new method called two-step preconditioning to achieve an approximate solution with a time complexity lower than that of the state-of-the-art techniques for the low precision case. Our idea can also be extended to the high precision case, which gives an alternative implementation to the Iterative Hessian Sketch (IHS) method with significantly improved time complexity. Experiments on benchmark and synthetic datasets suggest that our methods indeed outperform existing ones considerably in both the low and high precision cases.

Information Planning for Text Data

Information planning enables faster learning with fewer training examples. It is particularly applicable when training examples are costly to obtain. This work examines the advantages of information planning for text data by focusing on three supervised models: Naive Bayes, supervised LDA and deep neural networks. We show that planning based on entropy and mutual information outperforms a random selection baseline and therefore accelerates learning.
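A minimal sketch of entropy-based selection, assuming the simplest possible setup: the current model has produced a predicted class distribution for each unlabeled example, and the next example to label is the one with the highest predictive entropy. The function names and the toy pool are hypothetical, and the mutual-information criterion mentioned in the abstract is not covered here.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def pick_most_informative(pool_predictions):
    """Index of the unlabeled example whose predicted distribution is most uncertain."""
    return max(range(len(pool_predictions)),
               key=lambda i: entropy(pool_predictions[i]))

# Assumed model outputs for three unlabeled documents (two classes each).
pool = [[0.95, 0.05], [0.55, 0.45], [0.80, 0.20]]
chosen = pick_most_informative(pool)
```

A random selection baseline would pick any of the three with equal probability, whereas the entropy rule always requests a label for the document the model is least sure about.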

ATPboost: Learning Premise Selection in Binary Setting with ATP Feedback

ATPboost is a system for solving sets of large-theory problems by interleaving ATP runs with state-of-the-art machine learning of premise selection from the proofs. Unlike many previous approaches that use a multi-label setting, the learning is implemented as binary classification that estimates the pairwise relevance of (theorem, premise) pairs. For this, ATPboost uses the XGBoost gradient boosting algorithm, which is fast and has state-of-the-art performance on many tasks. Learning in the binary setting, however, requires negative examples, which are nontrivial to obtain because many alternative proofs exist. We discuss and implement several solutions in the context of the ATP/ML feedback loop, and show that ATPboost with such methods significantly outperforms the k-nearest neighbors multilabel classifier.

Deep Hedging
Existence of augmented Lagrange multipliers: reduction to exact penalty functions and localization principle
Leveraging Coding Techniques for Speeding up Distributed Computing
Thompson Sampling for Dynamic Pricing
WorldTree: A Corpus of Explanation Graphs for Elementary Science Questions supporting Multi-Hop Inference
Oversampled Adaptive Sensing
Doppler Spread Estimation in MIMO Frequency-selective Fading Channels
Learning Latent Representations in Neural Networks for Clustering through Pseudo Supervision and Graph-based Activity Regularization
Comparison of data-driven uncertainty quantification methods for a carbon dioxide storage benchmark scenario
Generating Realistic Geology Conditioned on Physical Measurements with Generative Adversarial Networks
Mining Open Government Data Used in Scientific Research
Combining Satellite Imagery and Numerical Model Simulation to Estimate Ambient Air Pollution: An Ensemble Averaging Approach
Hole Filling with Multiple Reference Views in DIBR View Synthesis
Automatic segmenting teeth in X-ray images: Trends, a novel data set, benchmarking and future perspectives
A Note on Intervals in the Hales-Jewett Theorem
Embedding graphs in Euclidean space
The structure of state transition graphs in hysteresis models with return point memory. I. General Theory
Tracking Noisy Targets: A Review of Recent Object Tracking Approaches
Convolutional Hashing for Automated Scene Matching
Learning to Match
Optimized Bacteria are Environmental Prediction Engines
Zero Forcing in Claw-Free Cubic Graphs
Zero-Resource Neural Machine Translation with Multi-Agent Communication Game
Robust and Sparse Regression in GLM by Stochastic Optimization
Serre’s Condition, Balanced Simplicial Complexes, and Higher Nerves
Batch Kalman Normalization: Towards Training Deep Neural Networks with Micro-Batches
Small-Gain Stability Analysis of Hyperbolic-Parabolic PDE Loops
Neighborhood Change, One Pint at a Time: The Impact of Local Characteristics on Craft Breweries
A Minimum Message Length Criterion for Robust Linear Regression
Augmenting Librispeech with French Translations: A Multimodal Corpus for Direct Speech Translation Evaluation
Neural Dynamic Programming for Musical Self Similarity
Mode Selection and Spectrum Partition for D2D Inband Communications: A Physical Layer Security Perspective
On the Spectral Efficiency of Noncooperative Uplink Massive MIMO Systems
Heterogeneous and Multidimensional Clairvoyant Dynamic Bin Packing for Virtual Machine Placement
Boosting Image Forgery Detection using Resampling Detection and Copy-move analysis
Web-Based Implementation of Travelling Salesperson Problem Using Genetic Algorithm
Running Distributed and Dynamic IoT Choreographies
Distributed Spanner Approximation
URLNet: Learning a URL Representation with Deep Learning for Malicious URL Detection
A Unified Approach for Multi-step Temporal-Difference Learning with Eligibility Traces in Reinforcement Learning
In a One-Bit Rush: Low-Latency Wireless Spectrum Monitoring with Binary Sensor Arrays
Self-Bounded Prediction Suffix Tree via Approximate String Matching
Noise-Induced Limitations to the Scalability of Distributed Integral Control
An extended version of a Branch-Price-and-Cut Procedure for the Discrete Ordered Median Problem
Tracking all members of a honey bee colony over their lifetime
Limits on Sparse Data Acquisition: RIC Analysis of Finite Gaussian Matrices
Natural Language Inference over Interaction Space: ICLR 2018 Reproducibility Report
Curve Registered Coupled Low Rank Factorization
Drift Theory in Continuous Search Spaces: Expected Hitting Time of the (1+1)-ES with 1/5 Success Rule
Quantitative aspects of acyclicity
Deep clustering of longitudinal data
Passive tracer in non-Markovian, Gaussian velocity field
Balancing Two-Player Stochastic Games with Soft Q-Learning
Self-stabilizing processes based on random signs
The $b$-bibranching Problem: TDI System, Packing, and Discrete Convexity
Full-Frame Scene Coordinate Regression for Image-Based Localization
Recurrent Neural Network-Based Semantic Variational Autoencoder for Sequence-to-Sequence Learning
Using Discretization for Extending the Set of Predictive Features
Projecting UK Mortality using Bayesian Generalised Additive Models
RSDNet: Learning to Predict Remaining Surgery Duration from Laparoscopic Videos Without Manual Annotations
Piecewise Flat Embedding for Image Segmentation
Exponential mixing for a class of dissipative PDEs with bounded degenerate noise
Multiple Target Tracking by Learning Feature Representation and Distance Metric Jointly
Triplet-based Deep Similarity Learning for Person Re-Identification
A universal-algebraic proof of the complexity dichotomy for Monotone Monadic SNP
Optimal data fitting: a moment approach
Analysis of Summatory Functions of Regular Sequences: Transducer and Pascal’s Rhombus
Efficient Neural Architecture Search via Parameters Sharing
Unsupervised Deep Domain Adaptation for Pedestrian Detection
Enhancing Performance of Random Caching in Large-Scale Heterogeneous Wireless Networks with Random Discontinuous Transmission
Lower tail of the KPZ equation
Augmented Reality needle ablation guidance tool for Irreversible Electroporation in the pancreas
Slice Sampling Particle Belief Propagation
Temporally Object-based Video Co-Segmentation
Universality of phonon transport in surface-roughness dominated nanowires
On sequences covering all rainbow $k$-progressions
Sharpness for Inhomogeneous Percolation on Quasi-Transitive Graphs
Optimal time-complexity speed planning for robot manipulators
Bayesian inference for bivariate ranks
Multiple points of operator semistable Lévy processes
BKT universality class in the presence of correlated disorder
Parallelizing Workload Execution in Embedded and High-Performance Heterogeneous Systems
Nature vs. Nurture: The Role of Environmental Resources in Evolutionary Deep Intelligence
Predicting Audio Advertisement Quality
Opacity of nondeterministic transition systems: A (bi)simulation relation approach
Replica Approach for Minimal Investment Risk with Cost
Shapes Characterization on Address Event Representation Using Histograms of Oriented Events and an Extended LBP Approach
Device-to-Device Communications in the Millimeter Wave Band: A Novel Distributed Mechanism
Black-box Variational Inference for Stochastic Differential Equations
Automatic Passenger Counting: Introducing the t-Test Induced Equivalence Test
Long-Term-Unemployed hirings: Should targeted or untargeted policies be preferred?
A Two-Stage Method for Text Line Detection in Historical Documents
Scaling limits of the Schelling model
Acceleration and global convergence of a first-order primal–dual method for nonconvex problems
Directed polymers in heavy-tail random environment and Entropy-controlled Last Passage Percolation
Deep Learning for Malicious Flow Detection
Minimum weight codewords in dual Algebraic-Geometric codes from the Giulietti-Korchmáros curve
Generative ScatterNet Hybrid Deep Learning (G-SHDL) Network with Structural Priors for Semantic Image Segmentation
Necessary Optimality Conditions for Continuous-Time Optimization Problems with Equality and Inequality Constraints
Zero-sum Generalized Schur Numbers
Make the Minority Great Again: First-Order Regret Bound for Contextual Bandits
Zero-sum Analogues of van der Waerden’s Theorem on Arithmetic Progressions
Adding transmitters dramatically boosts coded-caching gains for finite file sizes
Not-So-CLEVR: Visual Relations Strain Feedforward Neural Networks
On domination perfect graphs