Causal Learning and Explanation of Deep Neural Networks via Autoencoded Activations

Deep neural networks are complex and opaque. As they enter application in a variety of important and safety-critical domains, users seek methods to explain their output predictions. We develop an approach to explaining deep neural networks by constructing causal models on salient concepts contained in a CNN. We develop methods to extract salient concepts throughout a target network by using autoencoders trained to extract human-understandable representations of network activations. We then build a Bayesian causal model using these extracted concepts as variables in order to explain image classification. Finally, we use this causal model to identify and visualize features with significant causal influence on final classification.

Representation Learning for Resource Usage Prediction

Creating a model of a computer system that can be used for tasks such as predicting future resource usage and detecting anomalies is a challenging problem. Most current systems rely on heuristics and overly simplistic assumptions about the workloads and system statistics. These heuristics are typically a one-size-fits-all solution so as to be applicable in a wide range of applications and systems environments. With this paper, we present our ongoing work of integrating systems telemetry, ranging from standard resource usage statistics to kernel and library calls of applications, into a machine learning model. Intuitively, such an ML model approximates, at any point in time, the state of a system and allows us to solve tasks such as resource usage prediction and anomaly detection. To achieve this goal, we leverage readily available information that does not require any changes to the applications run on the system. We train recurrent neural networks to learn a model of the system under consideration. As a proof of concept, we train models specifically to predict future resource usage of running applications.

Extending Causal Consistency to any Object Defined by a Sequential Specification

This paper presents a simple generalization of causal consistency suited to any object defined by a sequential specification. As causality is captured by a partial order on the set of operations issued by the processes on shared objects (concurrent operations are not ordered), it follows that causal consistency allows different processes to have different views of each object history.

Deep Temporal Clustering: Fully Unsupervised Learning of Time-Domain Features

Unsupervised learning of time series data, also known as temporal clustering, is a challenging problem in machine learning. Here we propose a novel algorithm, Deep Temporal Clustering (DTC), to naturally integrate dimensionality reduction and temporal clustering into a single end-to-end learning framework, fully unsupervised. The algorithm utilizes an autoencoder for temporal dimensionality reduction and a novel temporal clustering layer for cluster assignment. Then it jointly optimizes the clustering objective and the dimensionality reduction objective. Based on requirement and application, the temporal clustering layer can be customized with any temporal similarity metric. Several similarity metrics and state-of-the-art algorithms are considered and compared. To gain insight into temporal features that the network has learned for its clustering, we apply a visualization method that generates a region of interest heatmap for the time series. The viability of the algorithm is demonstrated using time series data from diverse domains, ranging from earthquakes to spacecraft sensor data. In each case, we show that the proposed algorithm outperforms traditional methods. The superior performance is attributed to the fully integrated temporal dimensionality reduction and clustering criterion.

Copula-based Partial Correlation Screening: a Joint and Robust Approach

Screening for ultrahigh dimensional features may encounter complicated issues such as outlying observations, heteroscedasticity or heavy-tailed distributions, multi-collinearity and confounding effects. Standard correlation-based marginal screening methods may be a weak solution to these issues. We contribute a novel robust joint screener to safeguard against outliers and distribution mis-specification for both the response variable and the covariates, and to account for external variables at the screening step. Specifically, we introduce a copula-based partial correlation (CPC) screener. We show that the empirical process of the estimated CPC converges weakly to a Gaussian process and establish the sure screening property for the CPC screener under very mild technical conditions, which require no moment condition and are weaker than those of existing alternatives in the literature. Moreover, our approach allows for a diverging number of conditional variables from the theoretical point of view. Extensive simulation studies and two data applications are included to illustrate our proposal.

Robust Pre-Processing: A Robust Defense Method Against Adversary Attack

Deep learning algorithms and networks are vulnerable to perturbed inputs, known as adversarial attacks. Many defense methodologies have been investigated to defend against such attacks. In this work, we propose a novel methodology to defend against the existing powerful attack model. Such attack models have achieved record success against the MNIST dataset, forcing a network to misclassify all of its inputs. In contrast, our proposed defense method, robust pre-processing, achieves the best accuracy among the current state-of-the-art defenses. It consists of a Tanh (hyperbolic tangent) function, smoothing and batch normalization applied to the input data, which make it more robust against the adversarial attack. Robust pre-processing improves the white-box attack accuracy on MNIST from 94.3% to 98.7%. Even with increasing defense, when other defenses completely fail, robust pre-processing remains one of the strongest ever reported. Another strength of our defense is that it eliminates the need for adversarial training, as it can significantly increase the MNIST accuracy without adversarial training as well. This makes it a more generalized defense method with almost half the training overhead and much-improved accuracy. Robust pre-processing can also increase the inference accuracy in the face of the powerful attack on the CIFAR-10 and SVHN datasets without sacrificing much clean-data accuracy.

Bayesian Modeling via Goodness-of-fit

The two key issues of modern Bayesian statistics are: (i) establishing a principled approach for distilling a statistical prior that is consistent with the given data from an initial believable scientific prior; and (ii) developing a Bayes-frequentist consolidated data analysis workflow that is more effective than either of the two separately. In this paper, we propose the idea of ‘Bayes via goodness of fit’ as a framework for exploring these fundamental questions, in a way that is general enough to embrace almost all of the familiar probability models. Several illustrative examples show the benefit of this new point of view as a practical data analysis tool. The relationship with other Bayesian cultures is also discussed.

Goal-Oriented Chatbot Dialog Management Bootstrapping with Transfer Learning

Goal-Oriented (GO) Dialogue Systems, colloquially known as goal-oriented chatbots, help users achieve a predefined goal (e.g. book a movie ticket) within a closed domain. A first step is to understand the user’s goal by using natural language understanding techniques. Once the goal is known, the bot must manage a dialogue to achieve that goal, which is conducted with respect to a learnt policy. The success of the dialogue system depends on the quality of the policy, which is in turn reliant on the availability of high-quality training data for the policy learning method, for instance Deep Reinforcement Learning. Due to the domain specificity, the amount of available data is typically too low to allow the training of good dialogue policies. In this paper we introduce a transfer learning method to mitigate the effects of the low in-domain data availability. Our transfer-learning-based approach improves the bot’s success rate by 20% in relative terms for distant domains, and we more than double it for close domains, compared to the model without transfer learning. Moreover, the transfer learning chatbots learn the policy up to 5 to 10 times faster. Finally, as the transfer learning approach is complementary to additional processing such as warm-starting, we show that their joint application gives the best outcomes.

Adaptive Memory Networks

We present Adaptive Memory Networks (AMN) that process input-question pairs to dynamically construct a network architecture optimized for lower inference times for Question Answering (QA) tasks. AMN processes the input story to extract entities and stores them in memory banks. Starting from a single bank, as the number of input entities increases, AMN learns to create new banks as the entropy in a single bank becomes too high. Hence, after processing an input-question(s) pair, the resulting network represents a hierarchical structure where entities are stored in different banks, distanced by question relevance. At inference, one or a few banks are used, creating a tradeoff between accuracy and performance. AMN is enabled by dynamic networks that allow input-dependent network creation and efficiency in dynamic mini-batching, as well as our novel bank controller that allows learning discrete decision making with high accuracy. In our results, we demonstrate that AMN learns to create variable-depth networks depending on task complexity and reduces inference times for QA tasks.

Dimension Reduction via Gaussian Ridge Functions

Ridge functions have recently emerged as a powerful set of ideas for subspace-based dimension reduction. In this paper we begin by drawing parallels between ridge subspaces, sufficient dimension reduction and active subspaces, contrasting techniques rooted in statistical regression with those rooted in approximation theory. This sets the stage for our new algorithm that approximates what we call a Gaussian ridge function (the posterior mean of a Gaussian process on a dimension-reducing subspace), suitable for both regression and approximation problems. To compute this subspace we develop an iterative algorithm that optimizes over the Grassmann manifold, followed by an optimization of the hyperparameters of the Gaussian process. We demonstrate the utility of the algorithm on an analytical function, where we obtain near-exact ridge recovery, and a turbomachinery case study, where we compare the efficacy of our approach with four well-known sufficient dimension reduction methods: MAVE, SIR, SAVE, CR. The comparisons motivate the use of the posterior variance as a heuristic for identifying the suitability of a dimension-reducing subspace.

Complex Network Classification with Convolutional Neural Network

Classifying large-scale networks into several categories and distinguishing them according to their fine structures is of great importance, with several applications in real life. However, most studies of complex networks focus on properties of a single network but seldom on classification, clustering, and comparison between different networks, in which the network is treated as a whole. Due to the non-Euclidean properties of the data, conventional methods can hardly be applied to networks directly. In this paper, we propose a novel framework, the complex network classifier (CNC), which integrates network embedding and a convolutional neural network to tackle the problem of network classification. By training the classifiers on synthetic complex network data and real international trade network data, we show that CNC can not only classify networks with high accuracy and robustness but also extract the features of the networks automatically.

Visual Interpretability for Deep Learning: a Survey

This paper reviews recent studies in emerging directions of understanding neural-network representations and learning neural networks with interpretable/disentangled middle-layer representations. Although deep neural networks have exhibited superior performance in various tasks, interpretability remains an Achilles’ heel of deep neural networks. At present, deep neural networks obtain high discrimination power at the cost of low interpretability of their black-box representations. We believe that high model interpretability may help people to break several bottlenecks of deep learning, e.g., learning from very few annotations, learning via human-computer communications at the semantic level, and semantically debugging network representations. In this paper, we focus on convolutional neural networks (CNNs), and we revisit the visualization of CNN representations, methods of diagnosing representations of pre-trained CNNs, approaches for disentangling pre-trained CNN representations, learning of CNNs with disentangled representations, and middle-to-end learning based on model interpretability. Finally, we discuss prospective trends of explainable artificial intelligence.

Incremental Classifier Learning with Generative Adversarial Networks

In this paper, we address the incremental classifier learning problem, which suffers from catastrophic forgetting. The main reason for catastrophic forgetting is that the past data are not available during learning. Typical approaches keep some exemplars for the past classes and use distillation regularization to retain the classification capability on the past classes and balance the past and new classes. However, there are four main problems with these approaches. First, the loss function is not efficient for classification. Second, there is an imbalance problem between the past and new classes. Third, the size of pre-decided exemplars is usually limited and they might not be distinguishable from unseen new classes. Fourth, the exemplars may not be allowed to be kept for a long time due to privacy regulations. To address these problems, we propose (a) a new loss function to combine the cross-entropy loss and distillation loss, (b) a simple way to estimate and remove the imbalance between the old and new classes, and (c) using Generative Adversarial Networks (GANs) to generate historical data and select representative exemplars during generation. We believe that the data generated by GANs have far fewer privacy issues than real images because GANs do not directly copy any real image patches. We evaluate the proposed method on the CIFAR-100, Flower-102, and MS-Celeb-1M-Base datasets, and extensive experiments demonstrate the effectiveness of our method.

A Survey on Acceleration of Deep Convolutional Neural Networks

Deep Neural Networks have achieved remarkable progress during the past few years and are currently the fundamental tools of many intelligent systems. At the same time, the computational complexity and resource consumption of these networks are also continuously increasing. This poses a significant challenge to the deployment of such networks, especially for real-time applications or on resource-limited devices. Thus, network acceleration has become a hot topic within the deep learning community. As for hardware implementation of deep neural networks, a batch of accelerators based on FPGA/ASIC have been proposed in recent years. In this paper, we provide a comprehensive survey of the recent advances in network acceleration, compression and accelerator design from both the algorithm and hardware sides. Specifically, we provide thorough analysis for each of the following topics: network pruning, low-rank approximation, network quantization, teacher-student networks, compact network design and hardware accelerators. Finally, we discuss and introduce a few possible future directions.

Plan Explanations as Model Reconciliation — An Empirical Study

Recent work in explanation generation for decision making agents has looked at how unexplained behavior of autonomous systems can be understood in terms of differences in the model of the system and the human’s understanding of the same, and how the explanation process as a result of this mismatch can be then seen as a process of reconciliation of these models. Existing algorithms in such settings, while having been built on contrastive, selective and social properties of explanations as studied extensively in the psychology literature, have not, to the best of our knowledge, been evaluated in settings with actual humans in the loop. As such, the applicability of such explanations to human-AI and human-robot interactions remains suspect. In this paper, we set out to evaluate these explanation generation algorithms in a series of studies in a mock search and rescue scenario with an internal semi-autonomous robot and an external human commander. We demonstrate to what extent the properties of these algorithms hold as they are evaluated by humans, and how the dynamics of trust between the human and the robot evolve during the process of these interactions.

Hierarchical Adversarially Learned Inference

We propose a novel hierarchical generative model with a simple Markovian structure and a corresponding inference model. Both the generative and inference models are trained using the adversarial learning paradigm. We demonstrate that the hierarchical structure supports the learning of progressively more abstract representations as well as providing semantically meaningful reconstructions with different levels of fidelity. Furthermore, we show that minimizing the Jensen-Shannon divergence between the generative and inference network is enough to minimize the reconstruction error. The resulting semantically meaningful hierarchical latent structure discovery is exemplified on the CelebA dataset. There, we show that the features learned by our model in an unsupervised way outperform the best handcrafted features. Furthermore, the extracted features remain competitive when compared to several recent deep supervised approaches on an attribute prediction task on CelebA. Finally, we leverage the model’s inference network to achieve state-of-the-art performance on a semi-supervised variant of the MNIST digit classification task.
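For readers unfamiliar with the divergence named in this abstract, the following is a minimal sketch of the Jensen-Shannon divergence between two discrete distributions. It is only an illustration of the quantity itself: the paper works with joint distributions defined by neural networks and estimates the divergence adversarially, which this sketch does not attempt. The function names and the example vectors are hypothetical.

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions; terms with p[i] == 0 contribute 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m = 0.5 * (p + q)  # m > 0 wherever p > 0 or q > 0, so the KL terms are finite
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical example distributions over three outcomes.
p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]
jsd_pq = js_divergence(p, q)
jsd_qp = js_divergence(q, p)
jsd_same = js_divergence(p, p)
jsd_disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])
```

The divergence is symmetric, vanishes when the two distributions coincide, and is bounded above by log 2 (in nats), which is attained for distributions with disjoint support.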

Software Engineers vs. Machine Learning Algorithms: An Empirical Study Assessing Performance and Reuse Tasks

Several papers have recently reported on applying machine learning (ML) to the automation of software engineering (SE) tasks, such as project management, modeling and development. However, there appear to be no approaches comparing how software engineers fare against machine-learning algorithms as applied to specific software development tasks. Such a comparison is essential to gain insight into which tasks are better performed by humans and which by machine learning, and how cooperative work or human-in-the-loop processes can be implemented more effectively. In this paper, we present an empirical study that compares how software engineers and machine-learning algorithms perform and reuse tasks. The empirical study involves the synthesis of the control structure of an autonomous streetlight application. Our approach consists of four steps. First, we solved the problem using machine learning to determine specific performance and reuse tasks. Second, we asked software engineers with different domain knowledge levels to provide a solution to the same tasks. Third, we compared how software engineers fare against machine-learning algorithms when accomplishing the performance and reuse tasks based on criteria such as energy consumption and safety. Finally, we analyzed the results to understand which tasks are better performed by either humans or algorithms so that they can work together more effectively. Such an understanding and the resulting human-in-the-loop approaches, which take into account the strengths and weaknesses of humans and machine-learning algorithms, are fundamental not only to provide a basis for cooperative work in support of software engineering, but also in other areas.

Learning Compact Neural Networks with Regularization

We study the impact of regularization for learning neural networks. Our goal is speeding up training, improving generalization performance, and training compact models that are cost-efficient. Our results apply to weight-sharing (e.g. convolutional), sparsity (i.e. pruning), and low-rank constraints, among others. We first introduce the covering dimension of the constraint set and provide a Rademacher complexity bound providing insights on generalization properties. Then, we propose and analyze regularized gradient descent algorithms for learning shallow networks. We show that the problem becomes well conditioned and local linear convergence occurs once the amount of data exceeds the covering dimension (e.g. the number of nonzero weights). Finally, we provide insights on layerwise training of deep models by studying a random activation model. Our results show how regularization can be beneficial to overcome overparametrization.
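As a rough illustration of gradient descent under a sparsity constraint, the sketch below runs projected gradient descent with hard thresholding on a linear least-squares problem. This is a stand-in, not the paper's algorithm: a linear model replaces the shallow network, and the projection, step size, data and all names (`hard_threshold`, `sparse_projected_gd`) are assumptions made for the example.

```python
import numpy as np

def hard_threshold(w, k):
    """Project w onto the set of vectors with at most k nonzero entries."""
    out = np.zeros_like(w)
    keep = np.argsort(np.abs(w))[-k:]  # indices of the k largest magnitudes
    out[keep] = w[keep]
    return out

def loss(X, y, w):
    """Mean squared-error loss 0.5 * ||Xw - y||^2 / n."""
    r = X @ w - y
    return 0.5 * float(r @ r) / len(y)

def sparse_projected_gd(X, y, k, steps=200):
    """Gradient step with step size 1/L, then projection onto k-sparse vectors."""
    n, d = X.shape
    L = np.linalg.norm(X, 2) ** 2 / n  # largest eigenvalue of the Hessian X^T X / n
    w = np.zeros(d)
    history = [loss(X, y, w)]
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / n
        w = hard_threshold(w - grad / L, k)
        history.append(loss(X, y, w))
    return w, history

# Synthetic, noise-free data generated from a hypothetical 3-sparse weight vector.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
w_true = np.zeros(20)
w_true[[2, 7, 11]] = [3.0, -2.0, 1.5]
y = X @ w_true
w_hat, history = sparse_projected_gd(X, y, k=3)
```

With step size 1/L each iterate minimizes a quadratic upper bound on the loss over the sparse set, so the recorded loss never increases. Whether the true support is recovered depends on the data and is not guaranteed by this sketch.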

Zero-Shot Kernel Learning

In this paper, we address an open problem of zero-shot learning. Its principle is based on learning a mapping that associates feature vectors extracted from e.g. images and attribute vectors that describe objects and/or scenes of interest. In turn, this allows classifying unseen object classes and/or scenes by matching feature vectors via the mapping to a newly defined attribute vector describing a new class. Due to the importance of such a learning task, there exist many methods that learn semantic, probabilistic, linear or piece-wise linear mappings. In contrast, we apply well-established kernel methods to learn a non-linear mapping between the feature and attribute spaces. We propose an easy learning objective inspired by the Linear Discriminant Analysis, Kernel-Target Alignment and Kernel Polarization methods that promotes incoherence. We evaluate the performance of our algorithm on the Polynomial as well as shift-invariant Gaussian and Cauchy kernels. Despite the simplicity of our approach, we obtain state-of-the-art results on several zero-shot learning datasets and benchmarks, including the recent AWA2 dataset.

Coordinated Exploration in Concurrent Reinforcement Learning

We consider a team of reinforcement learning agents that concurrently learn to operate in a common environment. We identify three properties – adaptivity, commitment, and diversity – which are necessary for efficient coordinated exploration and demonstrate that straightforward extensions to single-agent optimistic and posterior sampling approaches fail to satisfy them. As an alternative, we propose seed sampling, which extends posterior sampling in a manner that meets these requirements. Simulation results investigate how per-agent regret decreases as the number of agents grows, establishing substantial advantages of seed sampling over alternative exploration schemes.

To understand deep learning we need to understand kernel learning

Generalization performance of classifiers in deep learning has recently become a subject of intense study. Heavily over-parametrized deep models tend to fit training data exactly. Despite this overfitting, they perform well on test data, a phenomenon not yet fully understood. The first point of our paper is that strong performance of overfitted classifiers is not a unique feature of deep learning. Using real-world and synthetic datasets, we establish that kernel classifiers trained to have zero classification error (overfitting) or even zero regression error (interpolation) perform very well on test data. We proceed to prove lower bounds on the norm of overfitted solutions for smooth kernels, showing that they increase nearly exponentially with the data size. Since the available generalization bounds depend polynomially on the norm of the solution, this implies that the existing generalization bounds diverge as data increases. We also show experimentally that (non-smooth) Laplacian kernels easily fit random labels using a version of SGD, a finding that parallels results reported for ReLU neural networks. In contrast, fitting noisy data requires many more epochs for smooth Gaussian kernels. The observation that the performance of overfitted Laplacian and Gaussian classifiers on test data is quite similar suggests that generalization is tied to the properties of the kernel function rather than the optimization process. We see that some key phenomena of deep learning are manifested similarly in kernel methods in the overfitted regime. We argue that progress on understanding deep learning will be difficult until more analytically tractable ‘shallow’ kernel methods are better understood. The combination of the experimental and theoretical results presented in this paper indicates a need for a new theoretical basis for understanding classical kernel methods.
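The interpolation regime described in this abstract can be reproduced in miniature. The sketch below fits a Laplacian-kernel interpolant to purely random labels and checks that the training error is zero. Note the swap: the abstract reports fitting with a version of SGD, whereas this sketch solves the kernel linear system directly. The bandwidth, data sizes and function names are assumptions chosen for the example.

```python
import numpy as np

def laplacian_kernel(A, B, bandwidth=1.0):
    """K[i, j] = exp(-||A[i] - B[j]||_2 / bandwidth)."""
    diffs = A[:, None, :] - B[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    return np.exp(-dists / bandwidth)

def fit_interpolant(X, y, bandwidth=1.0):
    """Solve K alpha = y so that the kernel predictor matches y on the training set."""
    K = laplacian_kernel(X, X, bandwidth)
    return np.linalg.solve(K, y)

def predict(X_train, alpha, X_new, bandwidth=1.0):
    """Evaluate f(x) = sum_i alpha_i * k(x, x_i) at the rows of X_new."""
    return laplacian_kernel(X_new, X_train, bandwidth) @ alpha

# Random inputs with labels drawn independently of the inputs.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = rng.choice([-1.0, 1.0], size=50)

alpha = fit_interpolant(X, y)
train_pred = predict(X, alpha, X)
train_error = float(np.mean(np.sign(train_pred) != y))
```

Because the Laplacian kernel is positive definite, the kernel matrix is invertible for distinct training points, so an exact fit exists even when the labels carry no signal. The sketch says nothing about test performance, which is the subject of the paper's experiments.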

The Matrix Calculus You Need For Deep Learning

This paper is an attempt to explain all the matrix calculus you need in order to understand the training of deep neural networks. We assume no math knowledge beyond what you learned in calculus 1, and provide links to help you refresh the necessary math where needed. Note that you do not need to understand this material before you start learning to train and use deep learning in practice; rather, this material is for those who are already familiar with the basics of neural networks, and wish to deepen their understanding of the underlying math. Don’t worry if you get stuck at some point along the way—just go back and reread the previous section, and try writing down and working through some examples. And if you’re still stuck, we’re happy to answer your questions in the Theory category at Note: There is a reference section at the end of the paper summarizing all the key matrix calculus rules and terminology discussed here.

IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures

In this work we aim to solve a large collection of tasks using a single reinforcement learning agent with a single set of parameters. A key challenge is to handle the increased amount of data and extended training time, which is already a problem in single task learning. We have developed a new distributed agent IMPALA (Importance-Weighted Actor Learner Architecture) that can scale to thousands of machines and achieve a throughput rate of 250,000 frames per second. We achieve stable learning at high throughput by combining decoupled acting and learning with a novel off-policy correction method called V-trace, which was critical for achieving learning stability. We demonstrate the effectiveness of IMPALA for multi-task reinforcement learning on DMLab-30 (a set of 30 tasks from the DeepMind Lab environment (Beattie et al., 2016)) and Atari-57 (all available Atari games in Arcade Learning Environment (Bellemare et al., 2013a)). Our results show that IMPALA is able to achieve better performance than previous agents, use less data and crucially exhibits positive transfer between tasks as a result of its multi-task approach.
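The abstract names V-trace without defining it. As a hedged sketch based on the definition in the IMPALA paper (not on anything stated in this abstract), the targets satisfy the recursion v_s - V(x_s) = delta_s + gamma * c_s * (v_{s+1} - V(x_{s+1})), with delta_s = rho_s * (r_s + gamma * V(x_{s+1}) - V(x_s)) and truncated importance weights rho_s = min(rho_bar, ratio_s), c_s = min(c_bar, ratio_s). The function name, argument names and example numbers below are hypothetical.

```python
import numpy as np

def vtrace_targets(rewards, values, bootstrap_value, ratios,
                   gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Compute V-trace value targets for one trajectory of length n.

    rewards[s], values[s] = V(x_s) and ratios[s] = pi(a_s|x_s) / mu(a_s|x_s)
    for s = 0..n-1; bootstrap_value = V(x_n).
    """
    rewards = np.asarray(rewards, dtype=float)
    values = np.asarray(values, dtype=float)
    ratios = np.asarray(ratios, dtype=float)
    rhos = np.minimum(rho_bar, ratios)  # truncated weights for the TD errors
    cs = np.minimum(c_bar, ratios)      # truncated weights for the trace
    next_values = np.append(values[1:], bootstrap_value)
    deltas = rhos * (rewards + gamma * next_values - values)
    targets = np.zeros_like(values)
    acc = 0.0  # holds v_{s+1} - V(x_{s+1}); zero past the end of the trajectory
    for s in reversed(range(len(rewards))):
        acc = deltas[s] + gamma * cs[s] * acc
        targets[s] = values[s] + acc
    return targets

# On-policy check: with all ratios equal to 1 the targets are n-step returns.
rewards = [1.0, 0.0, 2.0]
values = [0.5, -0.2, 0.3]
on_policy = vtrace_targets(rewards, values, bootstrap_value=1.0,
                           ratios=[1.0, 1.0, 1.0], gamma=0.9)
```

When the behaviour and target policies coincide, every ratio is 1 and the recursion collapses to the ordinary n-step bootstrapped return, which gives a simple sanity check for an implementation.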

Bulk Fermi arc of disordered Dirac fermions in two dimensions
Sensitivity Sampling Over Dynamic Geometric Data Streams with Applications to $k$-Clustering
Twists and Twistability
APPLE Picker: Automatic Particle Picking, a Low-Effort Cryo-EM Framework
Learning random-walk label propagation for weakly-supervised semantic segmentation
On a fractional version of Haemers’ bound
Completing the Structural Analysis of the 2×4 Permutation Classes
Krasovskii-Subbotin approach to mean field type differential games
A New Registration Approach for Dynamic Analysis of Calcium Signals in Organs
Practical Bayesian Modeling and Inference for Massive Spatial Datasets On Modest Computing Environments
Approximating power by weights
Predictive Management of Electric Vehicles in a Community Microgrid
Learning Semantic Segmentation with Diverse Supervision
Zero-adjusted Birnbaum-Saunders regression model
Analysis of Fast Alternating Minimization for Structured Dictionary Learning
Toward Optimal Coupon Allocation in Social Networks: An Approximate Submodular Optimization Approach
Scalable Lévy Process Priors for Spectral Kernel Learning
Persistent Homology and the Upper Box Dimension
Scalable Preprocessing of High Volume Bird Acoustic Data
Decentralized Control of Stochastically Switched Linear System with Unreliable Communication
ExpNet: Landmark-Free, Deep, 3D Facial Expressions
Modeling polypharmacy side effects with graph convolutional networks
Asymptotic behavior of lifetime sums for random simplicial complex processes
Generating Redundant Features with Unsupervised Multi-Tree Genetic Programming
On the Predictive Risk in Misspecified Quantile Regression
Goethals–Seidel difference families with symmetric or skew base blocks
Unlabelled Sensing: A Sparse Bayesian Learning Approach
Interpretable Deep Convolutional Neural Networks via Meta-learning
Detecting zones and threat on 3D body in security airports using deep learning machine
An Instability in Variational Inference for Topic Models
A reversal phenomenon in estimation based on multiple samples from the Poisson–Dirichlet distribution
Optimization of the porous material described by the Biot model
Mixed-Resolution Image Representation and Compression with Convolutional Neural Networks
Discriminants of classical quasi-orthogonal polynomials, with combinatorial and number-theoretic applications
Hardening Deep Neural Networks via Adversarial Model Cascades
An LMI Approach to Stability Analysis of Coupled Parabolic Systems
A priori Error Estimates for Space-Time Finite Element Discretization of Parabolic Time-Optimal Control Problems
On tails of symmetric and totally asymmetric $α$-stable distributions
Sparse control of Hegselmann-Krause models: Black hole and declustering
Increased accuracy of planning tools for optimization of dynamic multileaf collimator delivery of radiotherapy through reformulated objective functions
When can $l_p$-norm objective functions be minimized via graph cuts?
The footprint of atmospheric turbulence in power grid frequency measurements
Activity-conditioned continuous human pose estimation for performance analysis of athletes using the example of swimming
Counting Environments and Closures
Expansion of Multiple Stratonovich Stochastic Integrals of Fifth Multiplicity, Based on Generalized Multiple Fourier Series
The boundary of random planar maps via looptrees
On the complexity of the outer-connected bondage and the outer-connected reinforcement problems
Maximum determinant positive definite Toeplitz completions
A continuous time tug-of-war game for parabolic $p(x,t)$-Laplace type equations
Green function for gradient perturbation of unimodal Lévy processes in the real line
On The Complexity of the Cayley Semigroup Membership Problem
Dual Memory Neural Computer for Asynchronous Two-view Sequential Learning
Convolutional neural network-based regression for depth prediction in digital holography
Deep Learning for Genomics: A Concise Overview
A novel approach to estimate the Cox model with temporal covariates and its application to medical cost data
Handwritten Isolated Bangla Compound Character Recognition: a new benchmark using a novel deep learning approach
Stochastic Kriging for Inadequate Simulation Models
A Simple Object that Spans the Whole Consensus Hierarchy
A version of the Loebl-Komlós-Sós conjecture for skewed trees
A Generative Model for Natural Sounds Based on Latent Force Modelling
How do Humans Understand Explanations from Machine Learning Systems? An Evaluation of the Human-Interpretability of Explanation
Privacy of Information Sharing Schemes in a Cloud-based Multi-sensor Estimation Problem
Size-aware Sharding For Improving Tail Latencies in In-memory Key-value Stores
From Clustering Supersequences to Entropy Minimizing Subsequences for Single and Double Deletions
Quantitative Fine-Grained Human Evaluation of Machine Translation Systems: a Case Study on English to Croatian
Asymptotic behavior for an additive functional of two independent self-similar Gaussian processes
Improved Runtime Bounds for the Univariate Marginal Distribution Algorithm via Anti-Concentration
Measuring Spark on AWS: A Case Study on Mining Scientific Publications with Annotation Query
The two-time distribution in geometric last-passage percolation
A unified approach to ruin probabilities with delays for spectrally negative Lévy processes
Explaining First Impressions: Modeling, Recognizing, and Explaining Apparent Personality from Videos
Short-term Memory of Deep RNN
On computational issues for stability analysis of LPV systems using parameter dependent Lyapunov functions and LMIs
Optimal probabilistic polynomial time compression and the Slepian-Wolf theorem: tighter version and simple proofs
Deep Convolutional Neural Networks for Breast Cancer Histology Image Analysis
Signal Processing for MIMO-NOMA: Present and Future Challenges
Submodularity-inspired Data Selection for Goal-oriented Chatbot Training based on Sentence Embeddings
Learning Attribute Representation for Human Activity Recognition
Refining the Central Limit Theorem Approximation via Extreme Value Theory
Order matters: Distributional properties of speech to young children bootstraps learning of semantic representations
No Modes left behind: Capturing the data distribution effectively using GANs
Green Stability Assumption: Unsupervised Learning for Statistics-Based Illumination Estimation
Some Ulam’s reconstruction problems for quantum states
Brownian motion in attenuated or renormalized inverse-square Poisson potential
Bayes Calculations from Quantile Implied Likelihood
Least squares estimation for path-distribution dependent stochastic differential equations
VIBNN: Hardware Acceleration of Bayesian Neural Networks
CoDiNA: an R Package for Co-expression Differential Network Analysis in n Dimensions
Preserved Structure Across Vector Space Representations
Voting patterns in 2016: Exploration using multilevel regression and poststratification (MRP) on pre-election polls
Intriguing Properties of Randomly Weighted Networks: Generalizing While Learning Next to Nothing
Stirling Numbers in Braid Matroid Kazhdan-Lusztig Polynomials
Parameter and Uncertainty Estimation for Dynamical Systems Using Surrogate Stochastic Processes
Projections of the Aldous chain on binary trees: Intertwining and consistency
Is Self-Interference in Full-Duplex Communications a Foe or a Friend?
Bayesian Renewables Scenario Generation via Deep Generative Networks
Flip Graphs, Yoke Graphs and Diameter
Load-Balanced Fractional Repetition Codes
On taking advantage of multiple requests in error correcting codes
Study of SIC and RLS Channel Estimation for Large-Scale Antenna Systems with 1-Bit ADCs
A Protection Method in Active Distribution Grids with High Penetration of Renewable Energy Sources
Proportional Representation in Approval-based Committee Voting and Beyond
Interplay between cost and benefits triggers nontrivial vaccination uptake
A Model for Learned Bloom Filters and Related Structures
Lattices with exponentially large kissing numbers
To Numerical Modeling With Strong Orders 1.5 and 2.0 of Convergence for Multidimensional Dynamical Systems With Random Disturbances
Densely Connected Bidirectional LSTM with Applications to Sentence Classification
Joint Binary Neural Network for Multi-label Learning with Applications to Emotion Classification
Left-Center-Right Separated Neural Network for Aspect-based Sentiment Analysis with Rotatory Attention
Wireless MapReduce Distributed Computing
Representations of quadratic combinatorial optimization problems: A case study using the quadratic set covering problem
Learning Parametric Closed-Loop Policies for Markov Potential Games
Weak order in averaging principle for two-time-scale stochastic partial differential equations
Build a Compact Binary Neural Network through Bit-level Sensitivity and Data Pruning
The Power Allocation Game on A Dynamic Network: Equilibrium Selection
GeniePath: Graph Neural Networks with Adaptive Receptive Paths
AFT*: Integrating Active Learning and Transfer Learning to Reduce Annotation Efforts
Delay Analysis of Random Scheduling and Round Robin in Small Cell Networks
Typicality Matching for Pairs of Correlated Graphs
Multi-attention Recurrent Network for Human Communication Comprehension
Multimodal Sentiment Analysis with Word-Level Fusion and Reinforcement Learning
On the Minimax Misclassification Ratio of Hypergraph Community Detection
Memory Fusion Network for Multi-view Sequential Learning
On OTFS Modulation for High-Doppler Fading Channels
Mixed Precision Training of Convolutional Neural Networks using Integer Operations
Deep Learning Framework for Multi-class Breast Cancer Histology Image Classification
Incorporating Literals into Knowledge Graph Embeddings
Memory-Augmented Neural Networks for Predictive Process Analytics
Unlearning and Seyab’s theorem: a dialogue about updating probability
Learning the Synthesizability of Dynamic Texture Samples
Content based Weighted Consensus Summarization
Ensembling Neural Networks for Digital Pathology Images Classification and Segmentation
Resset: A Recurrent Model for Sequence of Sets with Applications to Electronic Medical Records
Scheduling and Checkpointing optimization algorithm for Byzantine fault tolerance in Cloud Clusters
A note on the folklore of free independence
Combinatorial proofs for identities related with generalizations of the mock theta functions $\omega(q)$ and $\nu(q)$
Coding Theory: the unit-derived methodology
Numerical verification of the microscopic time reversibility of Newton’s equations of motion: Fighting exponential divergence
Unveiling Relationships Between Crime and Property in England and Wales Via Density Scale-Adjusted Metrics and Network Tools
Parametric Presburger Arithmetic: Complexity of Counting and Quantifier Elimination
Proceedings First Workshop on Architectures, Languages and Paradigms for IoT
Pose Flow: Efficient Online Pose Tracking
The Ramsey and the ordering property for classes of lattices and semilattices
Adaptive Representation Selection in Contextual Bandit with Unlabeled History
Parameter estimation for the mean reversion parameter for the mixed Ornstein-Uhlenbeck process
Modeling Text with Graph Convolutional Network for Cross-Modal Information Retrieval
On the algebraic and arithmetic structure of the monoid of product-one sequences
A new integer valued AR(1) process with Poisson-Lindley innovations
Assessing Prediction Error at Interpolation and Extrapolation Points
nflWAR: A Reproducible Method for Offensive Player Evaluation in Football
A Software Package for Rigorously Calculating Optical Plasma Spectra and Automatically Retrieving Plasma Properties
Equitable partitions of Latin-square graphs
Image Posterization Using Fuzzy Logic and Bilateral Filter
A Graph Theoretic Approach for Training Overhead Reduction in FDD Massive MIMO Systems
An Area and Energy Efficient Design of Domain-Wall Memory-Based Deep Convolutional Neural Networks using Stochastic Computing
Randomization Tests that Condition on Non-Categorical Covariate Balance
DeepType: Multilingual Entity Linking by Neural Type System Evolution
Lyapunov Design for Event-Triggered Exponential Stabilization
Fairness and Accountability Design Needs for Algorithmic Support in High-Stakes Public Sector Decision-Making
JobPruner: A Machine Learning Assistant for Exploring Parameter Spaces in HPC Applications
On Markovian random networks
Multi-task Learning for Continuous Control
Distance Metrics for Gamma Distributions
Smooth centrally symmetric polytopes in dimension 3 are IDP
Blind Joint MIMO Channel Estimation and Decoding
Using Poisson Binomial GLMs to Reveal Voter Preferences
Uncertainty Quantification of the time averaging of a Statistics Computed from Numerical Simulation of Turbulent Flow
Power Allocation Strategy of Maximizing Secrecy Rate for Secure Directional Modulation
How to Characterize the Worst-Case Performance of Algorithms for Nonconvex Optimization
Out-of-Core and Distributed Algorithms for Dense Subtensor Mining
A Highly Accelerated Parallel Multi-GPU based Reconstruction Algorithm for Generating Accurate Relative Stopping Powers
Weighted Hamming Metric Structures
Pair-Linking for Collective Entity Disambiguation: Two Could Be Better Than All
Mean-variance portfolio selection and variance hedging with random coefficients: closed-loop equilibrium strategy
Equilibrium controls in time inconsistent stochastic linear quadratic problems
Uniqueness of equilibrium strategies in dynamic mean-variance problems with random coefficients
General maximum principles for optimal control problems of stochastic Volterra integral equations
Characterizations of equilibrium controls in time inconsistent mean-field stochastic linear quadratic problems. I
About chromatic uniqueness of some complete tripartite graphs
INLA goes extreme: Bayesian tail regression for the estimation of high spatio-temporal quantiles
Sense-and-Predict: Harnessing Spatial Interference Correlation for Opportunistic Access in Cognitive Radio Networks
Some sharp results on the generalized Turán numbers
Museum Exhibit Identification Challenge for Domain Adaptation and Beyond
HPC Curriculum and Associated Resources in the Academic Context
Multicoloring of Graphs to Secure a Secret
End2You — The Imperial Toolkit for Multimodal Profiling by End-to-End Learning
Object Detection and Sorting by Using a Global Texture-Shape 3D Feature Descriptor
Searching for Representative Modes on Hypergraphs for Robust Geometric Model Fitting
The genealogy of an exactly solvable Ornstein-Uhlenbeck type branching process with selection
Repeat-Accumulate Signal Codes
Uncoded Caching and Cross-level Coded Delivery for Non-uniform File Popularity
Simultaneous Selection of Multiple Important Single Nucleotide Polymorphisms in Familial Genome Wide Association Studies Data
Human Action Adverb Recognition: ADHA Dataset and A Three-Stream Hybrid Model
Uniqueness in law for stable-like processes of variable order
Testing to distinguish measures on metric spaces
Privacy-Aware Smart Metering: Progress and Challenges
Industrial Symbiotic Relations as Cooperative Games
Parameter estimators of random intersection graphs with thinned communities
Tunneling Neural Perception and Logic Reasoning through Abductive Learning
A Scheme-Driven Approach to Learning Programs from Input/Output Equations
On coset leader graphs of structured linear codes
Personalized Machine Learning for Robot Perception of Affect and Engagement in Autism Therapy
Different regimes of Purcell Effect in Disordered Photonic Crystals
Smooth $q$-Gram, and Its Applications to Detection of Overlaps among Long, Error-Prone Sequencing Reads
Heuristic Feature Selection for Clickbait Detection
The Anatomy of Leadership in Collective Behaviour
A characterisation of the Gaussian free field
A Sharp Bound on the s-Energy
Toward a Theory of Markov Influence Systems and their Renormalization
On Quadratic Embedding Constants of Star Product Graphs
Balanced diagonals in frequency squares
Efficient Video Object Segmentation via Network Modulation
Image Synthesis in Multi-Contrast MRI with Conditional Generative Adversarial Networks
Variational Principles for Optimal Control of Left-Invariant Multi-Agent Systems with Asymmetric Formation Constraints
MIMO with Energy Recycling
Tracking Multiple Moving Objects Using Unscented Kalman Filtering Techniques
Face Destylization
Listening to the cohomology of graphs
Counting and Uniform Sampling from Markov Equivalent DAGs
Enhancing Multi-Class Classification of Random Forest using Random Vector Functional Neural Network and Oblique Decision Surfaces
Semantic projection: recovering human knowledge of multiple, distinct object features from word embeddings
Fast Approximations for Metric-TSP via Linear Programming
On singular value distribution of large dimensional data matrices whose columns have different correlations
Chemical-protein relation extraction with ensembles of SVM, CNN, and RNN models
Phase retrieval with background information
q-Analogues of two ‘divergent’ Ramanujan-type supercongruences
Newtonized Orthogonal Matching Pursuit for Line Spectrum Estimation with Multiple Measurement Vectors
ClassSim: Similarity between Classes Defined by Misclassification Ratios of Trained Classifiers
Accurate brain extraction using Active Shape Model and Convolutional Neural Networks
Stochastic control and non-equilibrium thermodynamics: fundamental limits
Face recognition for monitoring operator shift in railways
Dream Formulations and Deep Neural Networks: Humanistic Themes in the Iconology of the Machine-Learned Image
Strong calmness of perturbed KKT system for a class of conic programming with degenerate solutions
Task-Aware Compressed Sensing with Generative Adversarial Networks
Data Augmentation of Railway Images for Track Inspection
Deep Learning-based Channel Estimation for Beamspace mmWave Massive MIMO Systems
Rate-Energy Region in Wireless Information and Power Transfer: New Receiver Architecture and Practical Modulation
Star Edge Coloring of the Cartesian Product of Graphs
Comparison of computer systems and ranking criteria for automatic melanoma detection in dermoscopic images
Truthful mechanisms for ownership transfer with expert advice
Competitive Online Algorithms for Resource Allocation over the Positive Semidefinite Cone
Gosig: Scalable Byzantine Consensus on Adversarial Wide Area Network for Blockchains
Study of Realistic Antenna Patterns in 5G mmWave Cellular Scenarios
Information Assisted Dictionary Learning for fMRI data analysis
Shortest $k$-Disjoint Paths via Determinants
Continuous-Domain Solutions of Linear Inverse Problems with Tikhonov vs. Generalized TV Regularization
DP-GAN: Diversity-Promoting Generative Adversarial Network for Generating Informative and Diversified Text
Lie Transform Based Polynomial Neural Networks for Dynamical Systems Simulation and Identification
A Bilevel Approach for Parameter Learning in Inverse Problems
Congruences for the Coefficients of the Powers of the Euler Product
Online Compact Convexified Factorization Machine
Randomness and isometries in echo state networks and compressed sensing
Counting inversions and descents of random elements in finite Coxeter groups
Hölder continuity of the steepest descent direction for multiobjective optimization
Exponential functions of finite posets and the number of extensions with a fixed set of minimal points
Adversarial Vulnerability of Neural Networks Increases With Input Dimension
Re-thinking non-inferiority: a practical trial design for optimising treatment duration
A Method for Restoring the Training Set Distribution in an Image Classifier
Road Segmentation in SAR Satellite Images with Deep Fully-Convolutional Neural Networks
Reducing CMSO Model Checking to Highly Connected Graphs
Diverse Beam Search for Increased Novelty in Abstractive Summarization
Image restoration with generalized Gaussian mixture model patch priors
A $q$-analogue of Euler’s formula $\zeta(2)=\pi^2/6$
An efficient counting method for the colored triad census
The Sea Exploration Problem: Data-driven Orienteering on a Continuous Surface
Explicit Inductive Bias for Transfer Learning with Convolutional Networks
Shoulder Physiotherapy Exercise Recognition: Machine Learning the Inertial Signals from a Smartwatch
An extreme function which is nonnegative and discontinuous everywhere
Exploring Spatial Context for 3D Semantic Segmentation of Point Clouds
Linear Convergence of the Primal-Dual Gradient Method for Convex-Concave Saddle Point Problems without Strong Convexity
On $q$-analogues of a Zeilberger-type series for $\pi^2$
Deterministic Regular Expressions With Back-References
Anti-van der Waerden numbers on Graphs
Covariance Matrix Estimation for Massive MIMO
3D non-rigid registration using color: Color Coherent Point Drift
Guided Policy Exploration for Markov Decision Processes using an Uncertainty-Based Value-of-Information Criterion
Heat content estimates for the fractional Schrödinger operator $\F+\ind$
Background subtraction using the factored 3-way restricted Boltzmann machines
Multilayer Network Modeling of Integrated Biological Systems
Abstractly Interpreting Argumentation Frameworks for Sharpening Extensions
Supporting UAV Cellular Communications through Massive MIMO
Optimal consensus control of the Cucker-Smale model
Switching and partially switching the hypercube while maintaining perfect state transfer
Real-time Prediction of Intermediate-Horizon Automotive Collision Risk
Exceedance-based nonlinear regression of tail dependence
Solution for a Bipartite Euclidean TSP in one dimension
Regularized Evolution for Image Classifier Architecture Search
Can One Escape Red Chains? Regular Path Queries Determinacy is Undecidable
One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning
Random walk on the randomly-oriented Manhattan lattice
Age-Minimal Online Policies for Energy Harvesting Sensors with Random Battery Recharges