• Deep Hedging
• Existence of augmented Lagrange multipliers: reduction to exact penalty functions and localization principle
• Leveraging Coding Techniques for Speeding up Distributed Computing
• Thompson Sampling for Dynamic Pricing
• WorldTree: A Corpus of Explanation Graphs for Elementary Science Questions supporting Multi-Hop Inference
• Oversampled Adaptive Sensing
• Doppler Spread Estimation in MIMO Frequency-selective Fading Channels
• Learning Latent Representations in Neural Networks for Clustering through Pseudo Supervision and Graph-based Activity Regularization
• Comparison of data-driven uncertainty quantification methods for a carbon dioxide storage benchmark scenario
• Generating Realistic Geology Conditioned on Physical Measurements with Generative Adversarial Networks
• Mining Open Government Data Used in Scientific Research
• Combining Satellite Imagery and Numerical Model Simulation to Estimate Ambient Air Pollution: An Ensemble Averaging Approach
• Hole Filling with Multiple Reference Views in DIBR View Synthesis
• Automatic segmenting teeth in X-ray images: Trends, a novel data set, benchmarking and future perspectives
• A Note on Intervals in the Hales-Jewett Theorem
• Embedding graphs in Euclidean space
• The structure of state transition graphs in hysteresis models with return point memory. I. General Theory
• Tracking Noisy Targets: A Review of Recent Object Tracking Approaches
• Convolutional Hashing for Automated Scene Matching
• Learning to Match
• Optimized Bacteria are Environmental Prediction Engines
• Zero Forcing in Claw-Free Cubic Graphs
• Zero-Resource Neural Machine Translation with Multi-Agent Communication Game
• Robust and Sparse Regression in GLM by Stochastic Optimization
• Serre’s Condition, Balanced Simplicial Complexes, and Higher Nerves
• Batch Kalman Normalization: Towards Training Deep Neural Networks with Micro-Batches
• Small-Gain Stability Analysis of Hyperbolic-Parabolic PDE Loops
• Neighborhood Change, One Pint at a Time: The Impact of Local Characteristics on Craft Breweries
• A Minimum Message Length Criterion for Robust Linear Regression
• Augmenting Librispeech with French Translations: A Multimodal Corpus for Direct Speech Translation Evaluation
• Neural Dynamic Programming for Musical Self Similarity
• Mode Selection and Spectrum Partition for D2D Inband Communications: A Physical Layer Security Perspective
• On the Spectral Efficiency of Noncooperative Uplink Massive MIMO Systems
• Heterogeneous and Multidimensional Clairvoyant Dynamic Bin Packing for Virtual Machine Placement
• Boosting Image Forgery Detection using Resampling Detection and Copy-move analysis
• Web-Based Implementation of Travelling Salesperson Problem Using Genetic Algorithm
• Running Distributed and Dynamic IoT Choreographies
• Distributed Spanner Approximation
• URLNet: Learning a URL Representation with Deep Learning for Malicious URL Detection
• A Unified Approach for Multi-step Temporal-Difference Learning with Eligibility Traces in Reinforcement Learning
• In a One-Bit Rush: Low-Latency Wireless Spectrum Monitoring with Binary Sensor Arrays
• Self-Bounded Prediction Suffix Tree via Approximate String Matching
• Noise-Induced Limitations to the Scalability of Distributed Integral Control
• An extended version of a Branch-Price-and-Cut Procedure for the Discrete Ordered Median Problem
• Tracking all members of a honey bee colony over their lifetime
• Limits on Sparse Data Acquisition: RIC Analysis of Finite Gaussian Matrices
• Natural Language Inference over Interaction Space: ICLR 2018 Reproducibility Report
• Curve Registered Coupled Low Rank Factorization
• Drift Theory in Continuous Search Spaces: Expected Hitting Time of the (1+1)-ES with 1/5 Success Rule
• Quantitative aspects of acyclicity
• Deep clustering of longitudinal data
• Passive tracer in non-Markovian, Gaussian velocity field
• Balancing Two-Player Stochastic Games with Soft Q-Learning
• Self-stabilizing processes based on random signs
• The $b$-bibranching Problem: TDI System, Packing, and Discrete Convexity
• Full-Frame Scene Coordinate Regression for Image-Based Localization
• Recurrent Neural Network-Based Semantic Variational Autoencoder for Sequence-to-Sequence Learning
• Using Discretization for Extending the Set of Predictive Features
• Projecting UK Mortality using Bayesian Generalised Additive Models
• RSDNet: Learning to Predict Remaining Surgery Duration from Laparoscopic Videos Without Manual Annotations
• Piecewise Flat Embedding for Image Segmentation
• Exponential mixing for a class of dissipative PDEs with bounded degenerate noise
• Multiple Target Tracking by Learning Feature Representation and Distance Metric Jointly
• Triplet-based Deep Similarity Learning for Person Re-Identification
• A universal-algebraic proof of the complexity dichotomy for Monotone Monadic SNP
• Optimal data fitting: a moment approach
• Analysis of Summatory Functions of Regular Sequences: Transducer and Pascal’s Rhombus
• Efficient Neural Architecture Search via Parameters Sharing
• Unsupervised Deep Domain Adaptation for Pedestrian Detection
• Enhancing Performance of Random Caching in Large-Scale Heterogeneous Wireless Networks with Random Discontinuous Transmission
• Lower tail of the KPZ equation
• Augmented Reality needle ablation guidance tool for Irreversible Electroporation in the pancreas
• Slice Sampling Particle Belief Propagation
• Temporally Object-based Video Co-Segmentation
• Universality of phonon transport in surface-roughness dominated nanowires
• On sequences covering all rainbow $k$-progressions
• Sharpness for Inhomogeneous Percolation on Quasi-Transitive Graphs
• Optimal time-complexity speed planning for robot manipulators
• Bayesian inference for bivariate ranks
• Multiple points of operator semistable Lévy processes
• BKT universality class in the presence of correlated disorder
• Parallelizing Workload Execution in Embedded and High-Performance Heterogeneous Systems
• Nature vs. Nurture: The Role of Environmental Resources in Evolutionary Deep Intelligence
• Predicting Audio Advertisement Quality
• Opacity of nondeterministic transition systems: A (bi)simulation relation approach
• Replica Approach for Minimal Investment Risk with Cost
• Shapes Characterization on Address Event Representation Using Histograms of Oriented Events and an Extended LBP Approach
• Device-to-Device Communications in the Millimeter Wave Band: A Novel Distributed Mechanism
• Black-box Variational Inference for Stochastic Differential Equations
• Automatic Passenger Counting: Introducing the t-Test Induced Equivalence Test
• Long-Term-Unemployed hirings: Should targeted or untargeted policies be preferred?
• A Two-Stage Method for Text Line Detection in Historical Documents
• Scaling limits of the Schelling model
• Acceleration and global convergence of a first-order primal–dual method for nonconvex problems
• Directed polymers in heavy-tail random environment and Entropy-controlled Last Passage Percolation
• Deep Learning for Malicious Flow Detection
• Minimum weight codewords in dual Algebraic-Geometric codes from the Giulietti-Korchmáros curve
• Generative ScatterNet Hybrid Deep Learning (G-SHDL) Network with Structural Priors for Semantic Image Segmentation
• Necessary Optimality Conditions for Continuous-Time Optimization Problems with Equality and Inequality Constraints
• Zero-sum Generalized Schur Numbers
• Make the Minority Great Again: First-Order Regret Bound for Contextual Bandits
• Zero-sum Analogues of van der Waerden’s Theorem on Arithmetic Progressions
• Adding transmitters dramatically boosts coded-caching gains for finite file sizes
• Not-So-CLEVR: Visual Relations Strain Feedforward Neural Networks
• On domination perfect graphs
Feature extraction becomes increasingly important as data grows high dimensional. The autoencoder, a neural-network-based feature extraction method, has achieved great success in generating abstract features of high dimensional data. However, it fails to consider the relationships between data samples, which may affect experimental results when using original and new features. In this paper, we propose a Relation Autoencoder model considering both data features and their relationships. We also extend it to work with other major autoencoder models including Sparse Autoencoder, Denoising Autoencoder and Variational Autoencoder. The proposed relational autoencoder models are evaluated on a set of benchmark datasets and the experimental results show that considering data relationships can generate more robust features which achieve lower construction loss and then lower error rate in further classification compared to the other variants of autoencoders.
Machine learning has become an important component for many systems and applications including computer vision, spam filtering, malware and network intrusion detection, among others. Despite the capabilities of machine learning algorithms to extract valuable information from data and produce accurate predictions, it has been shown that these algorithms are vulnerable to attacks. Data poisoning is one of the most relevant security threats against machine learning systems, where attackers can subvert the learning process by injecting malicious samples in the training data. Recent work in adversarial machine learning has shown that the so-called optimal attack strategies can successfully poison linear classifiers, degrading the performance of the system dramatically after compromising a small fraction of the training dataset. In this paper we propose a defence mechanism to mitigate the effect of these optimal poisoning attacks based on outlier detection. We show empirically that the adversarial examples generated by these attack strategies are quite different from genuine points, as no detectability constraints are considered to craft the attack. Hence, they can be detected with an appropriate pre-filtering of the training dataset.
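The pre-filtering idea can be illustrated with a minimal sketch. The abstract does not say which outlier detector is used, so the distance-based rule below is an assumption made purely for illustration: each class gets a robust centre (the coordinate-wise median), and a sample is dropped if it lies farther from that centre than a multiple of the class's median distance. The names `prefilter` and `factor` are hypothetical.

```python
import math
from collections import defaultdict


def median(values):
    """Median of a non-empty sequence of numbers."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return 0.5 * (ordered[mid - 1] + ordered[mid])


def prefilter(points, labels, factor=3.0):
    """Return the sorted indices of samples kept after per-class outlier removal.

    Assumed rule (not taken from the paper): a sample is kept when its
    Euclidean distance to the coordinate-wise median of its class is at
    most `factor` times the median of those distances within the class.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    kept = []
    for indices in by_class.values():
        dim = len(points[indices[0]])
        # Robust class centre: median of each coordinate.
        centre = [median([points[i][d] for i in indices]) for d in range(dim)]
        dists = {i: math.dist(points[i], centre) for i in indices}
        threshold = factor * median(list(dists.values()))
        kept.extend(i for i in indices if dists[i] <= threshold)
    return sorted(kept)
```

A classifier would then be trained only on the samples whose indices are returned. The threshold `factor` trades off false alarms against missed poisoning points and would need tuning on real data.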
With the popularity of deep learning (DL), artificial intelligence (AI) has been applied in many areas of human life. Neural networks, or artificial neural networks (NNs), the main technique behind DL, have been extensively studied to facilitate computer vision and natural language recognition. However, the more we rely on information technology, the more vulnerable we are. That is, malicious NNs could pose a huge threat in the so-called coming AI era. In this paper, for the first time in the literature, we propose a novel approach to design and insert powerful neural-level trojans, or PoTrojans, in pre-trained NN models. Most of the time, PoTrojans remain inactive, not affecting the normal functions of their host NN models. PoTrojans can only be triggered under very rare conditions. Once activated, however, the PoTrojans can cause the host NN models to malfunction, either falsely predicting or classifying, which is a significant threat to human society in the AI era. We explain the principles of PoTrojans and the ease of designing and inserting them in pre-trained deep learning models. PoTrojans do not modify the existing architecture or parameters of the pre-trained models and require no re-training. Hence, the proposed method is very efficient.
In this paper, we propose imitation networks, a simple but effective method for training neural networks with a limited amount of training data. Our approach inherits the idea of knowledge distillation that transfers knowledge from a deep or wide reference model to a shallow or narrow target model. The proposed method employs this idea to mimic predictions of reference estimators that are much more robust against overfitting than the network we want to train. Different from almost all the previous work for knowledge distillation that requires a large amount of labeled training data, the proposed method requires only a small amount of training data. Instead, we introduce pseudo training examples that are optimized as a part of model parameters. Experimental results for several benchmark datasets demonstrate that the proposed method outperformed all the other baselines, such as naive training of the target model and standard knowledge distillation.
Motivated by the need to extract knowledge and value from interconnected data, graph analytics on big data is a very active area of research in both industry and academia. To support graph analytics efficiently, a large number of in-memory graph libraries, graph processing systems and graph databases have emerged. Projects in each of these categories focus on particular aspects such as static versus dynamic graphs, offline versus online processing, small versus large graphs, etc. While there has been much progress in graph processing in the past decades, there is still a need for fast graph processing using a cluster of machines with distributed storage. In this paper, we discuss a novel distributed graph database called System G designed for efficient graph data storage and processing on modern computing architectures. In particular, we describe a single-node graph database and a runtime and communication layer that allows us to compose a distributed graph database from multiple single-node instances. From various industry requirements, we find that fast insertions and large-volume concurrent queries are critical parts of graph databases, and we optimize our database for such features. We experimentally show the efficiency of System G for storing data and processing graph queries on state-of-the-art platforms.
We present and evaluate Deep Private-Feature Extractor (DPFE), a deep model which is trained and evaluated based on information theoretic constraints. Using the selective exchange of information between a user’s device and a service provider, DPFE enables the user to prevent certain sensitive information from being shared with a service provider, while allowing them to extract approved information using their model. We introduce and utilize the log-rank privacy, a novel measure to assess the effectiveness of DPFE in removing sensitive information and compare different models based on their accuracy-privacy tradeoff. We then implement and evaluate the performance of DPFE on smartphones to understand its complexity, resource demands, and efficiency tradeoffs. Our results on benchmark image datasets demonstrate that under moderate resource utilization, DPFE can achieve high accuracy for primary tasks while preserving the privacy of sensitive features.
In the past decades, intensive effort has been put into designing various loss functions and metric forms for the metric learning problem. These improvements have shown promising results when the test data is similar to the training data. However, the trained models often fail to produce reliable distances on ambiguous test pairs due to the distribution bias between the training set and the test set. To address this problem, Adversarial Metric Learning (AML) is proposed in this paper, which automatically generates adversarial pairs to remedy the distribution bias and facilitate robust metric learning. Specifically, AML consists of two adversarial stages, i.e. confusion and distinguishment. In the confusion stage, ambiguous but critical adversarial data pairs are adaptively generated to mislead the learned metric. In the distinguishment stage, a metric is exhaustively learned to distinguish both the adversarial pairs and the original training pairs as well as possible. Thanks to the challenges posed by the confusion stage in this competing process, the AML model is able to grasp plentiful difficult knowledge that is not contained in the original training pairs, so the discriminability of AML can be significantly improved. The entire model is formulated as an optimization framework, whose global convergence is theoretically proved. The experimental results on toy data and practical datasets clearly demonstrate the superiority of AML over representative state-of-the-art metric learning methodologies.
Robust reinforcement learning aims to produce policies that have strong guarantees even in the face of environments/transition models whose parameters have strong uncertainty. Existing work uses value-based methods and the usual primitive action setting. In this paper, we propose robust methods for learning temporally abstract actions, in the framework of options. We present a Robust Options Policy Iteration (ROPI) algorithm with convergence guarantees, which learns options that are robust to model uncertainty. We utilize ROPI to learn robust options with the Robust Options Deep Q Network (RO-DQN) that solves multiple tasks and mitigates model misspecification due to model uncertainty. We present experimental results which suggest that policy iteration with linear features may have an inherent form of robustness when using coarse feature representations. In addition, we present experimental results which demonstrate that robustness helps policy iteration implemented on top of deep neural networks to generalize over a much broader range of dynamics than non-robust policy iteration.
In this paper, we present an unsupervised learning framework for analyzing activities and interactions in surveillance videos. In our framework, three levels of video events are connected by a Hierarchical Dirichlet Process (HDP) model: low-level visual features, simple atomic activities, and multi-agent interactions. Atomic activities are represented as distributions of low-level features, while complicated interactions are represented as distributions of atomic activities. This learning process is unsupervised. Given a training video sequence, low-level visual features are extracted based on optic flow and then clustered into different atomic activities, and video clips are clustered into different interactions. The HDP model automatically decides the number of clusters, i.e. the categories of atomic activities and interactions. Based on the learned atomic activities and interactions, a training dataset is generated to train the Gaussian Process (GP) classifier. The trained GP models are then applied to newly captured video to classify interactions and detect abnormal events in real time. Furthermore, the temporal dependencies between video events learned by HDP-Hidden Markov Models (HMM) are effectively integrated into the GP classifier to enhance the accuracy of the classification in newly captured videos. Our framework couples the benefits of the generative model (HDP) with the discriminant model (GP). We provide detailed experiments showing that our framework enjoys favorable performance in real-time video event classification in a crowded traffic scene.
Recurrent neural networks are a powerful means to cope with time series. We show that already linearly activated recurrent neural networks can approximate any time-dependent function f(t) given by a number of function values. The approximation can effectively be learned by simply solving a linear equation system; no backpropagation or similar methods are needed. Furthermore the network size can be reduced by taking only the most relevant components of the network. Thus, in contrast to others, our approach not only learns network weights but also the network architecture. The networks have interesting properties: In the stationary case they end up in ellipse trajectories in the long run, and they allow the prediction of further values and compact representations of functions. We demonstrate this by several experiments, among them multiple superimposed oscillators (MSO) and robotic soccer. Predictive neural networks outperform the previous state-of-the-art for the MSO task with a minimal number of units.
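The claim that a linear recurrence can be learned by solving a linear system, with no backpropagation, can be illustrated on the smallest possible case. The sketch below is not the paper's architecture: it assumes a two-term recurrence x[t+1] = a·x[t] + b·x[t-1] and fits a and b by least squares through the 2×2 normal equations. For a sampled cosine with angular step ω, the exact coefficients are a = 2·cos(ω) and b = -1. The function names are hypothetical.

```python
import math


def fit_linear_recurrence(xs):
    """Least-squares fit of x[t+1] = a * x[t] + b * x[t-1].

    Solves the 2x2 normal equations with Cramer's rule; no iterative
    training is involved.
    """
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t in range(1, len(xs) - 1):
        u, v, y = xs[t], xs[t - 1], xs[t + 1]
        s11 += u * u
        s12 += u * v
        s22 += v * v
        r1 += u * y
        r2 += v * y
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("samples do not determine a unique two-term recurrence")
    a = (r1 * s22 - r2 * s12) / det
    b = (s11 * r2 - s12 * r1) / det
    return a, b


def forecast(xs, a, b, steps):
    """Extend the series by `steps` values using the fitted recurrence."""
    series = list(xs)
    for _ in range(steps):
        series.append(a * series[-1] + b * series[-2])
    return series[len(xs):]
```

Superimposed oscillators would need a longer recurrence (two terms per frequency), which turns the 2×2 system into a larger linear system of the same kind.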
We address the problem of predicting spatio-temporal processes with temporal patterns that vary across spatial regions, when data is obtained as a stream, that is, when the training dataset is augmented sequentially. Specifically, we develop a localized spatio-temporal covariance model of the process that can capture spatially varying temporal periodicities in the data. We then apply a covariance-fitting methodology to learn the model parameters, which yields a predictor that can be updated sequentially with each new data point. The proposed method is evaluated using both synthetic and real climate data, which demonstrate its ability to accurately predict data missing in spatial regions over time.
In this paper, we revisit the large-scale constrained linear regression problem and propose faster methods based on some recent developments in sketching and optimization. Our algorithms combine (accelerated) mini-batch SGD with a new method called two-step preconditioning to achieve an approximate solution with a time complexity lower than that of the state-of-the-art techniques for the low precision case. Our idea can also be extended to the high precision case, which gives an alternative implementation to the Iterative Hessian Sketch (IHS) method with significantly improved time complexity. Experiments on benchmark and synthetic datasets suggest that our methods indeed outperform existing ones considerably in both the low and high precision cases.
Information planning enables faster learning with fewer training examples. It is particularly applicable when training examples are costly to obtain. This work examines the advantages of information planning for text data by focusing on three supervised models: Naive Bayes, supervised LDA and deep neural networks. We show that planning based on entropy and mutual information outperforms a random selection baseline and therefore accelerates learning.
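Entropy-based selection can be sketched in a few lines. The abstract does not specify the exact planning procedure, so the following is only an assumed, generic form: given a model's predicted class probabilities for each unlabeled example, pick the examples whose predictive distribution has the highest Shannon entropy. The names `entropy` and `select_most_uncertain` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, k=1):
    """Return indices of the k pool examples with the highest predictive entropy.

    `pool_probs[i]` is the model's class-probability vector for unlabeled
    example i. Ties keep their original pool order.
    """
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]
```

In a learning loop, the selected examples would be labeled, added to the training set, and the model retrained before the next selection round. A random baseline would instead draw the `k` indices uniformly from the pool.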
ATPboost is a system for solving sets of large-theory problems by interleaving ATP runs with state-of-the-art machine learning of premise selection from the proofs. Unlike many previous approaches that use a multi-label setting, the learning is implemented as binary classification that estimates the pairwise relevance of (theorem, premise) pairs. For this, ATPboost uses the XGBoost gradient boosting algorithm, which is fast and has state-of-the-art performance on many tasks. Learning in the binary setting, however, requires negative examples, which is nontrivial due to many alternative proofs. We discuss and implement several solutions in the context of the ATP/ML feedback loop, and show that ATPboost with such methods significantly outperforms the k-nearest neighbors multilabel classifier.