Explainable AI: Beware of Inmates Running the Asylum Or: How I Learnt to Stop Worrying and Love the Social and Behavioural Sciences

In his seminal book ‘The Inmates are Running the Asylum: Why High-Tech Products Drive Us Crazy And How To Restore The Sanity’ [2004, Sams Indianapolis, IN, USA], Alan Cooper argues that a major reason why software is often poorly designed (from a user perspective) is that programmers are in charge of design decisions, rather than interaction designers. As a result, programmers design software for themselves, rather than for their target audience; a phenomenon he refers to as the ‘inmates running the asylum’. This paper argues that explainable AI risks a similar fate. While the re-emergence of explainable AI is positive, this paper argues most of us as AI researchers are building explanatory agents for ourselves, rather than for the intended users. But explainable AI is more likely to succeed if researchers and practitioners understand, adopt, implement, and improve models from the vast and valuable bodies of research in philosophy, psychology, and cognitive science; and if evaluation of these models is focused more on people than on technology. From a light scan of literature, we demonstrate that there is considerable scope to infuse more results from the social and behavioural sciences into explainable AI, and present some key results from these fields that are relevant to explainable AI.

Causal inference in the context of an error prone exposure: air pollution and mortality

We propose a new approach for estimating causal effects when the exposure is measured with error and confounding adjustment is performed via a generalized propensity score (GPS). Using validation data, we propose a regression calibration (RC)-based adjustment for a continuous error-prone exposure combined with GPS to adjust for confounding (RC-GPS). The outcome analysis is conducted after transforming the corrected continuous exposure into a categorical exposure. We consider confounding adjustment in the context of GPS subclassification, inverse probability treatment weighting (IPTW) and matching. In simulations with varying degrees of exposure error and confounding bias, RC-GPS eliminates bias from exposure error and confounding compared to standard approaches that rely on the error-prone exposure. We applied RC-GPS to a rich data platform to estimate the causal effect of long-term exposure to fine particles ($\mathrm{PM}_{2.5}$) on mortality in New England for the period from 2000 to 2012. The main study consists of 2,202 zip codes covered by 217,660 1 km $\times$ 1 km grid cells with yearly mortality rates, yearly $\mathrm{PM}_{2.5}$ averages estimated from a spatio-temporal model (error-prone exposure) and several potential confounders. The internal validation study includes a subset of 83 1 km $\times$ 1 km grid cells within 75 zip codes from the main study with error-free yearly $\mathrm{PM}_{2.5}$ exposures obtained from monitor stations. Under assumptions of non-interference and weak unconfoundedness, using matching we found that exposure to moderate levels of $\mathrm{PM}_{2.5}$ ($8 < \mathrm{PM}_{2.5} \leq 10\ \mu\mathrm{g/m^3}$) causes a 2.8% (95% CI: 0.6%, 3.6%) increase in all-cause mortality compared to low exposure ($\mathrm{PM}_{2.5} \leq 8\ \mu\mathrm{g/m^3}$).

Learning Independent Causal Mechanisms

Independent causal mechanisms are a central concept in the study of causality with implications for machine learning tasks. In this work we develop an algorithm to recover a set of (inverse) independent mechanisms relating a distribution transformed by the mechanisms to a reference distribution. The approach is fully unsupervised and based on a set of experts that compete for data to specialize and extract the mechanisms. We test and analyze the proposed method on a series of experiments based on image transformations. Each expert successfully maps a subset of the transformed data to the original domain, and the learned mechanisms generalize to other domains. We discuss implications for domain transfer and links to recent trends in generative modeling.

Characterizing and Computing Causes for Query Answers in Databases from Database Repairs and Repair Programs

A correspondence between database tuples as causes for query answers in databases and tuple-based repairs of inconsistent databases with respect to denial constraints has already been established. In this work, answer-set programs that specify repairs of databases are used as a basis for solving computational and reasoning problems about causes. Here, causes are also introduced at the attribute level by appealing to a both null-based and attribute-based repair semantics. The corresponding repair programs are presented, and they are used as a basis for computation and reasoning about attribute-level causes.

Where Classification Fails, Interpretation Rises

An intriguing property of deep neural networks is their inherent vulnerability to adversarial inputs, which significantly hinders their application in security-critical domains. Most existing detection methods attempt to use carefully engineered patterns to distinguish adversarial inputs from their genuine counterparts; however, these patterns can often be circumvented by adaptive adversaries. In this work, we take a completely different route by leveraging the definition of adversarial inputs: while deceptive to deep neural networks, they are barely discernible to human vision. Building upon recent advances in interpretable models, we construct a new detection framework that contrasts an input’s interpretation against its classification. We validate the efficacy of this framework through extensive experiments using benchmark datasets and attacks. We believe that this work opens a new direction for designing adversarial input detection methods.

Progressive Neural Architecture Search

We propose a method for learning CNN structures that is more efficient than previous approaches: instead of using reinforcement learning (RL) or genetic algorithms (GA), we use a sequential model-based optimization (SMBO) strategy, in which we search for architectures in order of increasing complexity, while simultaneously learning a surrogate function to guide the search, similar to A* search. On the CIFAR-10 dataset, our method finds a CNN structure with the same classification accuracy (3.41% error rate) as the RL method of Zoph et al. (2017), but 2 times faster (in terms of number of models evaluated). It also outperforms the GA method of Liu et al. (2017), which finds a model with worse performance (3.63% error rate), and takes 5 times longer. Finally we show that the model we learned on CIFAR also works well at the task of ImageNet classification. In particular, we match the state-of-the-art performance of 82.9% top-1 and 96.1% top-5 accuracy.

MAgent: A Many-Agent Reinforcement Learning Platform for Artificial Collective Intelligence

We introduce MAgent, a platform to support research and development of many-agent reinforcement learning. Unlike previous research platforms on single or multi-agent reinforcement learning, MAgent focuses on supporting the tasks and the applications that require hundreds to millions of agents. Within the interactions among a population of agents, it enables not only the study of learning algorithms for agents’ optimal policies, but more importantly, the observation and understanding of individual agents’ behaviors and social phenomena emerging from the AI society, including communication languages, leadership, and altruism. MAgent is highly scalable and can host up to one million agents on a single GPU server. MAgent also provides flexible configurations for AI researchers to design their customized environments and agents. In this demo, we present three environments designed on MAgent and show emerged collective intelligence by learning from scratch.

Towards Robust Neural Networks via Random Self-ensemble

Recent studies have revealed the vulnerability of deep neural networks: a small adversarial perturbation that is imperceptible to humans can easily make a well-trained deep neural network misclassify. This makes it unsafe to apply neural networks in security-critical applications. In this paper, we propose a new defensive algorithm called Random Self-Ensemble (RSE) by combining two important concepts: randomness and ensemble. To protect a targeted model, RSE adds random noise layers to the neural network to defend against state-of-the-art gradient-based attacks, and ensembles the predictions over random noise to stabilize the performance. We show that our algorithm is equivalent to ensembling an infinite number of noisy models $f_\epsilon$ without any additional memory overhead, and that the proposed training procedure based on noisy stochastic gradient descent can ensure the ensemble model has good predictive capability. Our algorithm significantly outperforms previous defense techniques on real datasets. For instance, on CIFAR-10 with a VGG network (which has 92% accuracy without any attack), under the state-of-the-art C&W attack within a certain distortion tolerance, the accuracy of the unprotected model drops to less than 10% and the best previous defense technique has 48% accuracy, while our method still has 86% prediction accuracy under the same level of attack. Finally, our method is simple and easy to integrate into any neural network.

GANGs: Generative Adversarial Network Games

Generative Adversarial Networks (GANs) have become one of the most successful frameworks for unsupervised generative modeling. As GANs are difficult to train, much research has focused on this problem. However, very little of this research has directly exploited game-theoretic techniques. We introduce Generative Adversarial Network Games (GANGs), which explicitly model a finite zero-sum game between a generator (G) and classifier (C) that use mixed strategies. The size of these games precludes exact solution methods; therefore, we define resource-bounded best responses (RBBRs), and a resource-bounded Nash Equilibrium (RB-NE) as a pair of mixed strategies such that neither G nor C can find a better RBBR. The RB-NE solution concept is richer than the notion of ‘local Nash equilibria’ in that it captures not only failures of escaping local optima of gradient descent, but applies to any approximate best response computations, including methods with random restarts. To validate our approach, we solve GANGs with the Parallel Nash Memory algorithm, which provably monotonically converges to an RB-NE. We compare our results to standard GAN setups, and demonstrate that our method deals well with typical GAN problems such as mode collapse, partial mode coverage and forgetting.

GAGAN: Geometry-Aware Generative Adversarial Networks

Deep generative models learned through adversarial training have become increasingly popular for their ability to generate naturalistic image textures. However, apart from the visual texture, the visual appearance of objects is significantly affected by their shape geometry, information which is not taken into account by existing generative models. This paper introduces the Geometry-Aware Generative Adversarial Network (GAGAN) for incorporating geometric information into the image generation process. Specifically, in GAGAN the generator samples latent variables from the probability space of a statistical shape model. By mapping the output of the generator to a canonical coordinate frame through a differentiable geometric transformation, we enforce the geometry of the objects and add an implicit connection from the prior to the generated object. Experimental results on face generation indicate that the GAGAN can generate realistic images of faces with arbitrary facial attributes such as facial expression, pose, and morphology, that are of better quality compared to current GAN-based methods. Finally, our method can be easily incorporated into and improve the quality of the images generated by any existing GAN architecture.

Cascade R-CNN: Delving into High Quality Object Detection

In object detection, an intersection over union (IoU) threshold is required to define positives and negatives. An object detector trained with a low IoU threshold, e.g. 0.5, usually produces noisy detections. However, detection performance tends to degrade as the IoU threshold increases. Two main factors are responsible for this: 1) overfitting during training, due to exponentially vanishing positive samples, and 2) inference-time mismatch between the IoUs for which the detector is optimal and those of the input hypotheses. A multi-stage object detection architecture, the Cascade R-CNN, is proposed to address these problems. It consists of a sequence of detectors trained with increasing IoU thresholds, to be sequentially more selective against close false positives. The detectors are trained stage by stage, leveraging the observation that the output of a detector is a good distribution for training the next higher quality detector. The resampling of progressively improved hypotheses guarantees that all detectors have a positive set of examples of equivalent size, reducing the overfitting problem. The same cascade procedure is applied at inference, enabling a closer match between the hypotheses and the detector quality of each stage. A simple implementation of the Cascade R-CNN is shown to surpass all single-model object detectors on the challenging COCO dataset. Experiments also show that the Cascade R-CNN is widely applicable across detector architectures, achieving consistent gains independently of the baseline detector strength. The code will be made available at https://…/cascade-rcnn.
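
The role of the IoU threshold described above can be illustrated with a minimal sketch. It only shows how positives are defined and how they thin out as the threshold rises; it is not the Cascade R-CNN itself. The function names `iou` and `label_proposals`, the `(x1, y1, x2, y2)` box format, and the toy boxes are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_proposals(proposals, ground_truth, threshold):
    """A proposal is positive if its best IoU with any ground-truth box reaches the threshold."""
    return [max(iou(p, g) for g in ground_truth) >= threshold for p in proposals]


# Hypothetical toy data: one ground-truth box and three increasingly shifted proposals.
ground_truth = [(0, 0, 10, 10)]
proposals = [(0, 0, 10, 10), (2, 0, 12, 10), (5, 0, 15, 10)]

for threshold in (0.5, 0.6, 0.7):
    labels = label_proposals(proposals, ground_truth, threshold)
    print(threshold, labels, "positives:", sum(labels))
```

With these toy boxes the number of positives drops once the threshold passes the IoU of the second proposal, which is the small-scale version of the vanishing-positives effect the abstract mentions.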

SHINE: Signed Heterogeneous Information Network Embedding for Sentiment Link Prediction

In online social networks people often express attitudes towards others, which forms massive sentiment links among users. Predicting the sign of sentiment links is a fundamental task in many areas such as personal advertising and public opinion analysis. Previous works mainly focus on textual sentiment classification. However, text information can only disclose the ‘tip of the iceberg’ about users’ true opinions, most of which are unobserved but implied by other sources of information such as social relations and users’ profiles. To address this problem, in this paper we investigate how to predict possibly existing sentiment links in the presence of heterogeneous information. First, due to the lack of explicit sentiment links in mainstream social networks, we establish a labeled heterogeneous sentiment dataset which consists of users’ sentiment relations, social relations and profile knowledge, obtained by an entity-level sentiment extraction method. Then we propose a novel and flexible end-to-end Signed Heterogeneous Information Network Embedding (SHINE) framework to extract users’ latent representations from heterogeneous networks and predict the sign of unobserved sentiment links. SHINE utilizes multiple deep autoencoders to map each user into a low-dimension feature space while preserving the network structure. We demonstrate the superiority of SHINE over state-of-the-art baselines on link prediction and node recommendation in two real-world datasets. The experimental results also prove the efficacy of SHINE in the cold-start scenario.

Comment: A brief survey of the current state of play for Bayesian computation in data science at Big-Data scale

We wish to contribute to the discussion of ‘Comparing Consensus Monte Carlo Strategies for Distributed Bayesian Computation’ by offering our views on the current best methods for Bayesian computation, both at big-data scale and with smaller data sets, as summarized in Table 1. This table is certainly an over-simplification of a highly complicated area of research in constant (present and likely future) flux, but we believe that constructing summaries of this type is worthwhile despite their drawbacks, if only to facilitate further discussion.

SERKET: An Architecture For Connecting Stochastic Models to Realize a Large-Scale Cognitive Model

To realize human-like robot intelligence, a large-scale cognitive architecture is required for robots to understand the environment through a variety of sensors with which they are equipped. In this paper, we propose a novel framework named Serket that enables the construction of a large-scale generative model and its inference easily by connecting sub-modules to allow the robots to acquire various capabilities through interaction with their environments and others. We consider that large-scale cognitive models can be constructed by connecting smaller fundamental models hierarchically while maintaining their programmatic independence. Moreover, connected modules are dependent on each other, and parameters are required to be optimized as a whole. Conventionally, the equations for parameter estimation have to be derived and implemented depending on the models. However, it becomes harder to derive and implement those of a larger scale model. To solve these problems, in this paper, we propose a method for parameter estimation by communicating the minimal parameters between various modules while maintaining their programmatic independence. Therefore, Serket makes it easy to construct large-scale models and estimate their parameters via the connection of modules. Experimental results demonstrated that the model can be constructed by connecting modules, the parameters can be optimized as a whole, and they are comparable with the original models that we have proposed.

FSSD: Feature Fusion Single Shot Multibox Detector

SSD (Single Shot Multibox Detector) is one of the best object detection algorithms with both high accuracy and fast speed. However, SSD’s feature pyramid detection method makes it hard to fuse the features from different scales. In this paper, we propose FSSD (Feature Fusion Single Shot Multibox Detector), an enhanced SSD with a novel and lightweight feature fusion module which can improve the performance significantly over SSD with just a small drop in speed. In the feature fusion module, features from different layers with different scales are concatenated together, followed by some down-sampling blocks to generate a new feature pyramid, which is fed to multibox detectors to predict the final detection results. On the Pascal VOC 2007 test, our network can achieve 82.7 mAP (mean average precision) at the speed of 65.8 FPS (frames per second) with the input size $300 \times 300$ using a single Nvidia 1080Ti GPU. In addition, our result on COCO is also better than the conventional SSD by a large margin. Our FSSD outperforms many state-of-the-art object detection algorithms in both accuracy and speed. Code will be made publicly available.

Drift Analysis

Drift analysis is one of the major tools for analysing evolutionary algorithms and nature-inspired search heuristics. In this chapter we give an introduction to drift analysis and give some examples of how to use it for the analysis of evolutionary algorithms.

Temporal Attention augmented Bilinear Network for Financial Time-Series Data Analysis

Financial time-series forecasting has long been a challenging problem because of the inherently noisy and stochastic nature of the market. In High-Frequency Trading (HFT), forecasting for trading purposes is an even more challenging task since an automated inference system is required to be both accurate and fast. In this paper, we propose a neural network layer architecture that incorporates the idea of bilinear projection as well as an attention mechanism that enables the layer to detect and focus on crucial temporal information. The resulting network is highly interpretable, given its ability to highlight the importance and contribution of each temporal instance, thus allowing further analysis on the time instances of interest. Our experiments in a large-scale Limit Order Book (LOB) dataset show that a two-hidden-layer network utilizing our proposed layer outperforms by a large margin all existing state-of-the-art results coming from much deeper architectures while requiring far fewer computations.

Feature Generating Networks for Zero-Shot Learning

Suffering from the extreme training data imbalance between seen and unseen classes, most existing state-of-the-art approaches fail to achieve satisfactory results for the challenging generalized zero-shot learning task. To circumvent the need for labeled examples of unseen classes, we propose a novel generative adversarial network (GAN) that synthesizes CNN features conditioned on class-level semantic information, offering a shortcut directly from a semantic descriptor of a class to a class-conditional feature distribution. Our proposed approach, pairing a Wasserstein GAN with a classification loss, is able to generate sufficiently discriminative CNN features to train softmax classifiers or any multimodal embedding method. Our experimental results demonstrate a significant boost in accuracy over the state of the art on five challenging datasets (CUB, FLO, SUN, AWA and ImageNet) in both the zero-shot learning and generalized zero-shot learning settings.

Adaptive Quantization for Deep Neural Network

In recent years Deep Neural Networks (DNNs) have been rapidly developed in various applications, together with increasingly complex architectures. The performance gain of these DNNs generally comes with high computational costs and large memory consumption, which may not be affordable for mobile platforms. Deep model quantization can be used for reducing the computation and memory costs of DNNs, and deploying complex DNNs on mobile equipment. In this work, we propose an optimization framework for deep model quantization. First, we propose a measurement to estimate the effect of parameter quantization errors in individual layers on the overall model prediction accuracy. Then, we propose an optimization process based on this measurement for finding the optimal quantization bit-width for each layer. This is the first work that theoretically analyses the relationship between parameter quantization errors of individual layers and model accuracy. Our new quantization algorithm outperforms previous quantization optimization methods, and achieves a 20-40% higher compression rate compared to equal bit-width quantization at the same model prediction accuracy.
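
To make the notion of a bit-width-dependent parameter quantization error concrete, here is a minimal sketch of plain uniform quantization of one layer's weights. It is an illustration under stated assumptions, not the paper's measurement or its per-layer bit-width optimization; `quantize_uniform`, `mean_squared_error`, and the toy weights are hypothetical.

```python
def quantize_uniform(weights, bits):
    """Round each weight to the nearest of 2**bits evenly spaced levels in [min, max]."""
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return list(weights)  # A constant layer needs no rounding.
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + round((w - lo) / step) * step for w in weights]


def mean_squared_error(original, quantized):
    """Average squared difference between original and quantized weights."""
    return sum((a - b) ** 2 for a, b in zip(original, quantized)) / len(original)


# Hypothetical weights for a single layer.
layer_weights = [-1.0, -0.3, 0.0, 0.2, 0.7, 1.0]

errors = {}
for bits in (2, 4, 8):
    quantized = quantize_uniform(layer_weights, bits)
    errors[bits] = mean_squared_error(layer_weights, quantized)
    print(bits, "bits -> quantization MSE", errors[bits])
```

Each extra bit roughly halves the rounding step, so the error shrinks quickly with bit-width. An adaptive scheme of the kind the abstract describes would spend bits where this error hurts accuracy most, rather than using the same width everywhere.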

Natural Langevin Dynamics for Neural Networks

One way to avoid overfitting in machine learning is to use model parameters distributed according to a Bayesian posterior given the data, rather than the maximum likelihood estimator. Stochastic gradient Langevin dynamics (SGLD) is one algorithm to approximate such Bayesian posteriors for large models and datasets. SGLD is a standard stochastic gradient descent to which is added a controlled amount of noise, specifically scaled so that the parameter converges in law to the posterior distribution [WT11, TTV16]. The posterior predictive distribution can be approximated by an ensemble of samples from the trajectory. Choice of the variance of the noise is known to impact the practical behavior of SGLD: for instance, noise should be smaller for sensitive parameter directions. Theoretically, it has been suggested to use the inverse Fisher information matrix of the model as the variance of the noise, since it is also the variance of the Bayesian posterior [PT13, AKW12, GC11]. But the Fisher matrix is costly to compute for large-dimensional models. Here we use the easily computed Fisher matrix approximations for deep neural networks from [MO16, Oll15]. The resulting natural Langevin dynamics combines the advantages of Amari’s natural gradient descent and Fisher-preconditioned Langevin dynamics for large neural networks. Small-scale experiments on MNIST show that Fisher matrix preconditioning brings SGLD close to dropout as a regularizing technique.
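
The basic SGLD update described above (a stochastic gradient step plus Gaussian noise whose variance equals the step size) can be sketched on a toy problem. This is plain, unpreconditioned SGLD, not the Fisher-preconditioned natural Langevin dynamics of the paper. The model (Gaussian data with unit variance and a Gaussian prior on the mean), the function name `sgld_gaussian_mean`, and all hyperparameters are assumptions chosen so that the exact posterior mean is available for comparison.

```python
import random


def sgld_gaussian_mean(data, step=1e-3, batch_size=10, n_steps=5000,
                       burn_in=500, prior_var=100.0, seed=0):
    """Sample the mean theta of unit-variance Gaussian data under a N(0, prior_var) prior."""
    rng = random.Random(seed)
    n = len(data)
    theta = 0.0
    samples = []
    for t in range(n_steps):
        batch = rng.sample(data, batch_size)
        # Minibatch estimate of the gradient of the log posterior.
        grad_prior = -theta / prior_var
        grad_lik = (n / batch_size) * sum(x - theta for x in batch)
        # Langevin step: half the step size times the gradient, plus N(0, step) noise.
        noise = rng.gauss(0.0, step ** 0.5)
        theta += 0.5 * step * (grad_prior + grad_lik) + noise
        if t >= burn_in:
            samples.append(theta)
    return samples


# Hypothetical synthetic data with true mean 2.0.
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(100)]

samples = sgld_gaussian_mean(data)
sample_mean = sum(samples) / len(samples)
# Closed-form posterior mean for this conjugate model, used only as a reference.
exact_mean = sum(data) / (len(data) + 1.0 / 100.0)
print("SGLD estimate:", sample_mean, "exact posterior mean:", exact_mean)
```

In this one-dimensional setting a single scalar step size is enough. The preconditioning discussed in the abstract matters when different parameter directions have very different sensitivities.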

Statistical Inference for Incomplete Ranking Data: The Case of Rank-Dependent Coarsening

We consider the problem of statistical inference for ranking data, specifically rank aggregation, under the assumption that samples are incomplete in the sense of not comprising all choice alternatives. In contrast to most existing methods, we explicitly model the process of turning a full ranking into an incomplete one, which we call the coarsening process. To this end, we propose the concept of rank-dependent coarsening, which assumes that incomplete rankings are produced by projecting a full ranking to a random subset of ranks. For a concrete instantiation of our model, in which full rankings are drawn from a Plackett-Luce distribution and observations take the form of pairwise preferences, we study the performance of various rank aggregation methods. In addition to predictive accuracy in the finite sample setting, we address the theoretical question of consistency, by which we mean the ability to recover a target ranking when the sample size goes to infinity, despite a potential bias in the observations caused by the (unknown) coarsening.
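
The data-generating process described above can be sketched as follows: draw a full ranking from a Plackett-Luce model, then coarsen it by keeping only the items at a randomly chosen pair of ranks. This is a minimal illustration, assuming hypothetical item weights and a hypothetical distribution over rank pairs; the names `sample_plackett_luce` and `coarsen_to_pair` are not from the paper.

```python
import random


def sample_plackett_luce(weights, rng):
    """Draw a full ranking: repeatedly pick the next item with probability proportional to its weight."""
    remaining = list(range(len(weights)))
    ranking = []
    while remaining:
        chosen = rng.choices(remaining, weights=[weights[i] for i in remaining])[0]
        ranking.append(chosen)
        remaining.remove(chosen)
    return ranking


def coarsen_to_pair(ranking, rank_pair_probs, rng):
    """Rank-dependent coarsening: select a pair of ranks, then reveal the items sitting at them."""
    pairs = list(rank_pair_probs)
    upper, lower = rng.choices(pairs, weights=[rank_pair_probs[p] for p in pairs])[0]
    return ranking[upper], ranking[lower]  # (preferred item, less preferred item)


rng = random.Random(0)
item_weights = [4.0, 2.0, 1.0, 1.0]  # Hypothetical Plackett-Luce skill parameters.
# Hypothetical coarsening: the selection depends only on rank positions (0 = top).
rank_pair_probs = {(0, 1): 0.5, (0, 3): 0.3, (2, 3): 0.2}

full_ranking = sample_plackett_luce(item_weights, rng)
winner, loser = coarsen_to_pair(full_ranking, rank_pair_probs, rng)
print("full ranking:", full_ranking, "observed preference:", winner, ">", loser)
```

The key point is that the selection depends on positions in the ranking, not on item identities, so some item pairs can be systematically over- or under-observed. That is the source of the potential bias the abstract refers to.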

Clustering Stable Instances of Euclidean k-means

The Euclidean k-means problem is arguably the most widely-studied clustering problem in machine learning. While the k-means objective is NP-hard in the worst-case, practitioners have enjoyed remarkable success in applying heuristics like Lloyd’s algorithm for this problem. To address this disconnect, we study the following question: what properties of real-world instances will enable us to design efficient algorithms and prove guarantees for finding the optimal clustering? We consider a natural notion called additive perturbation stability that we believe captures many practical instances. Stable instances have unique optimal k-means solutions that do not change even when each point is perturbed a little (in Euclidean distance). This captures the property that the k-means optimal solution should be tolerant to measurement errors and uncertainty in the points. We design efficient algorithms that provably recover the optimal clustering for instances that are additive perturbation stable. When the instance has some additional separation, we show an efficient algorithm with provable guarantees that is also robust to outliers. We complement these results by studying the amount of stability in real datasets and demonstrating that our algorithm performs well on these benchmark datasets.
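
For reference, the Lloyd heuristic mentioned above alternates between assigning points to the nearest center and recomputing each center as the mean of its points. The sketch below runs it on a hypothetical well-separated instance and then on a slightly perturbed copy, as an informal illustration of the idea that a stable instance keeps its clustering under small perturbations. It is not the paper's algorithm, and the data, initial centers, and perturbation size are assumptions.

```python
import random


def nearest(point, centers):
    """Index of the center closest to the point in squared Euclidean distance."""
    return min(range(len(centers)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(point, centers[j])))


def lloyd(points, centers, n_iter=50):
    """Lloyd's heuristic for k-means; returns final centers and per-point labels."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(n_iter):
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)  # Keep an empty cluster's center where it was.
        if new_centers == centers:
            break  # Assignments can no longer change.
        centers = new_centers
    return centers, labels


# Hypothetical instance with two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
initial_centers = [(0, 0), (1, 0)]
centers, labels = lloyd(points, initial_centers)

# Move every coordinate by at most 0.2 and cluster again from the same start.
rng = random.Random(0)
perturbed = [(x + rng.uniform(-0.2, 0.2), y + rng.uniform(-0.2, 0.2)) for x, y in points]
_, perturbed_labels = lloyd(perturbed, initial_centers)

print("labels:", labels)
print("labels after perturbation:", perturbed_labels)
```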

Learning phase transitions from dynamics
Subject Selection on a Riemannian Manifold for Unsupervised Cross-subject Seizure Detection
Nonlinear Feynman-Kac formulae for SPDEs with space-time noise
Image to Image Translation for Domain Adaptation
Intelligent EHRs: Predicting Procedure Codes From Diagnosis Codes
A Pliable Lasso
The magnitude of the minimal displacement vector for compositions and convex combinations of firmly nonexpansive mappings
Visual Features for Context-Aware Speech Recognition
On ‘A Homogeneous Interior-Point Algorithm for Non-Symmetric Convex Conic Optimization’
The diachromatic number of digraphs
Propagating Uncertainty in Multi-Stage Bayesian Convolutional Neural Networks with Application to Pulmonary Nodule Detection
Optimization of Imperative Programs in a Relational Database
Prediction-Constrained Topic Models for Antidepressant Recommendation
$A$-Hypergeometric Modules and Gauss–Manin Systems
Statistical Design of Chaotic Waveforms with Enhanced Targeting Capabilities
Learning Neural Markers of Schizophrenia Disorder Using Recurrent Neural Networks
Multi-Content GAN for Few-Shot Font Style Transfer
An Elementary Analysis of the Probability that a Binomial Random Variable Exceeds its Expectation
Bayesian Semi-nonnegative Tri-matrix Factorization to Identify Pathways Associated with Cancer Types
Towards understanding feedback from supermassive black holes using convolutional neural networks
Adaptive Sampled Softmax with Kernel Based Sampling
A Theoretical Study of Process Dependence for Standard Two-Process Serial Models and Standard Two-Process Parallel Models
Survival-Supervised Topic Modeling with Anchor Words: Characterizing Pancreatitis Outcomes
Optimal Control of Load Shedding in Smart Grids
High Reliability and Low Latency for Vehicular Networks: Challenges and Solutions
Central limit theorem for the variable bandwidth kernel density estimators
Splenomegaly Segmentation using Global Convolutional Kernels and Conditional Generative Adversarial Networks
Improved Stability of Whole Brain Surface Parcellation with Multi-Atlas Segmentation
Calibrating a Stochastic Agent Based Model Using Quantile-based Emulation
A Two-Stage Allocation Scheme for Delay-Sensitive Services in Dense Vehicular Networks
Using Programmable Graphene Channels as Weights in Spin-Diffusive Neuromorphic Computing
An Enhanced LMMSE Channel Estimation under High Speed Railway Scenarios
A global feature extraction model for the effective computer aided diagnosis of mild cognitive impairment using structural MRI images
Recurrent Neural Network Language Models for Open Vocabulary Event-Level Cyber Anomaly Detection
Scalable Sparse Cox’s Regression for Large-Scale Survival Data via Broken Adaptive Ridge
Anesthesiologist-level forecasting of hypoxemia with only SpO2 data using deep learning
Unique effects of non-Gaussian diffusion in static disordered media
Supervised Hashing based on Energy Minimization
Lecture video indexing using boosted margin maximizing neural networks
Interactive Reinforcement Learning for Object Grounding via Self-Talking
Tracking the Best Expert in Non-stationary Stochastic Environments
Online Reinforcement Learning in Stochastic Games
Fruit recognition from images using deep learning
Gigantic random simplicial complexes
Taming Adversarial Domain Transfer with Structural Constraints for Image Enhancement
Price-Based Distributed Offloading for Mobile-Edge Computing with Computation Capacity Constraints
An Inverse Problem Study: Credit Risk Ratings as a Determinant of Corporate Governance and Capital Structure in Emerging Markets: Evidence from Chinese Listed Companies
Toward Reliable and Rapid Elasticity for Streaming Dataflows on Clouds
Improving Visually Grounded Sentence Representations with Self-Attention
Regular Bipartite Lattices with Large Values of Theta_2,2,2/C_4
Non-ergodic delocalized phase in Anderson model on Bethe lattice and regular graph
Adaptive Group Testing Algorithms to Estimate the Number of Defectives
Recurrent Neural Networks for Semantic Instance Segmentation
DR-Net: Transmission Steered Single Image Dehazing Network with Weakly Supervised Refinement
Distributed Topology Design for Network Coding Deployed Large-scale Sensor Networks
The exact minimum number of triangles in graphs of given order and size
PFAx: Predictable Feature Analysis to Perform Control
Network Coding Based Evolutionary Network Formation for Dynamic Wireless Networks
Compressed Video Action Recognition
Learning Sparse Adversarial Dictionaries For Multi-Class Audio Classification
Short-term Mortality Prediction for Elderly Patients Using Medicare Claims Data
From knowledge-based to data-driven modeling of fuzzy rule-based systems: A critical reflection
Coded Caching in a Multi-Server System with Random Topology
Non-branching tree-decompositions
Representation and Reinforcement Learning for Personalized Glycemic Control in Septic Patients
An Asymptotically Optimal Algorithm for Communicating Multiplayer Multi-Armed Bandit Problems
Nearly Optimal Scheduling of Wireless Ad Hoc Networks in Polynomial Time
Mix-and-Match Tuning for Self-Supervised Semantic Segmentation
Bayesian prior elicitation and selection for extreme values
Digraph Polynomials for Counting Cycles and Paths
An Introduction to Adjoints and Output Error Estimation in Computational Fluid Dynamics
Some extremal ratios of the distance and subtree problems in binary trees
Improving Network Robustness against Adversarial Attacks with Compact Convolution
Efficient Beam Alignment in Millimeter Wave Systems Using Contextual Bandits
Diffusion Adaptation Framework for Compressive Sensing Reconstruction
Low-Rank Tensor Completion by Truncated Nuclear Norm Regularization
Study of the Sparse Superposition Codes and the Generalized Approximate Message Passing Decoder for the Communication over Binary Symmetric and Z Channels
Simulated Annealing Algorithm for Graph Coloring
Iterative Collaborative Filtering for Sparse Matrix Estimation
The local geometry of testing in ellipses: Tight control via localized Kolmogorov widths
Evaluation of Alzheimer’s Disease by Analysis of MR Images using Multilayer Perceptrons and Kohonen SOM Classifiers as an Alternative to the ADC Maps
Spatial PixelCNN: Generating Images from Patches
Convolutional Phase Retrieval via Gradient Descent
Automatic Recognition of Coal and Gangue based on Convolution Neural Network
Feature Agglomeration Networks for Single Stage Face Detection
Conic-sector-based analysis and control synthesis for linear parameter varying systems
Sentiment Classification using Images and Label Embeddings
Joint Topic-Semantic-aware Social Recommendation for Online Voting
Incorporating External Knowledge to Answer Open-Domain Visual Questions with Dynamic Memory Networks
Towards Qualitative Advancement of Underwater Machine Vision with Generative Adversarial Networks
Toric Codes and Lattice Ideals
ALLSAT compressed with wildcards. Part 4: An invitation for C-programmers
Universal discrete-time reservoir computers with stochastic inputs and linear readouts using non-homogeneous state-affine systems
A branch and bound algorithm for the robust parallel machine scheduling
Filtration of the gravitational frequency shift in the radio links communication with Earth’s satellite
Arbitrarily Varying Wiretap Channel with State Sequence Known or Unknown at the Receiver
Randomized incomplete $U$-statistics in high dimensions
Gradient Descent Learns One-hidden-layer CNN: Don’t be Afraid of Spurious Local Minima
Approachability with Constraints
Reconstruction of Electrical Impedance Tomography Using Fish School Search, Non-Blind Search, and Genetic Algorithm
Multimodal Visual Concept Learning with Weakly Supervised Techniques
Alternating and Variable Boundary Control for the Wave Equation
Sparse principal component analysis and its $l_1$-relaxation
Polystore Mathematics of Relational Algebra
Distinguishing critical graphs
The Complexity of Satisfiability in Non-Iterated and Iterated Probabilistic Logics
An exact lower bound on the misclassification probability
Semi-Global Stereo Matching with Surface Orientation Priors
Entanglement and secret-key-agreement capacities of bipartite quantum interactions and read-only memory devices
Tensor Train Neighborhood Preserving Embedding
Lecture notes on Liouville theory and the DOZZ formula
A Generalized Turán Problem and its Applications
A Polyhedral Proof of a Wreath Product Identity
Visual Explanation by High-Level Abduction: On Answer-Set Programming Driven Reasoning about Moving Objects
Large-scale analysis of disease pathways in the human interactome
Always Lurking: Understanding and Mitigating Bias in Online Human Trafficking Detection
Circular genome rearrangement models: applying representation theory to evolutionary distance calculations
On the Geometry of Nash and Correlated Equilibria with Cumulative Prospect Theoretic Preferences
Random walks of infinite moment on free semigroups
Exponential Lower Bounds on the Generalized Erdős-Ginzburg-Ziv Constant
A Deep Learning Approach to Drone Monitoring
Raw Waveform-based Audio Classification Using Sample-level CNN Architectures
Gaussian Process Regression for Arctic Coastal Erosion Forecasting
The Saukas-Song Selection Algorithm and Coarse Grained Parallel Sorting
Learning Object Detectors from Scratch with Gated Recurrent Feature Pyramids
Regularity of Edge Ideals and Their Powers
Data Dropout in Arbitrary Basis for Deep Network Regularization
Tracy-Widom limit for Kendall’s tau
Localised sequential state estimation for advection dominated flows with non-Gaussian uncertainty description
Proceedings of the Fifth Workshop on Proof eXchange for Theorem Proving
Composition-aided Sketch-realistic Portrait Generation
On the Effect of Shadowing Correlation on Wireless Network Performance
Joint User Scheduling and Power optimization in Full-Duplex Cells with Successive Interference Cancellation
Deep Learning Can Reverse Photon Migration for Diffuse Optical Tomography
Discrete Entropy Power Inequalities via Sperner Theory
Boolean function analysis meets stochastic optimization: An approximation scheme for stochastic knapsack
Learning Reduced-Resolution and Super-Resolution Networks in Synch
Central limit theorem for linear spectral statistics of deformed Wigner matrices
Algebraic Soft Decoding Algorithms for Reed-Solomon Codes Using Module
(Gap/S)ETH Hardness of SVP
V2X Content Distribution Using Batched Network Code
Hierarchical Actor-Critic
Interpolation inequality at one time point for parabolic equations with time-independent coefficients and applications
Composite Quantization
Leaf Identification Using a Deep Convolutional Neural Network
Convex and Lipschitz function approximations for Markov decision processes
Face Translation between Images and Videos using Identity-aware CycleGAN
Design of Polar Codes with Single and Multi-Carrier Modulation on Impulsive Noise Channels using Density Evolution
Inertial Proximal Incremental Aggregated Gradient Method
End-to-End Relation Extraction using Markov Logic Networks
Mining Supervisor Evaluation and Peer Feedback in Performance Appraisals
Multi-coloured jigsaw percolation on random graphs
Fast and stable multivariate kernel density estimation by fast sum updating
NEURAghe: Exploiting CPU-FPGA Synergies for Efficient and Flexible CNN Inference Acceleration on Zynq SoCs
Learning to detect chest radiographs containing lung nodules using visual attention networks
Chord Generation from Symbolic Melody Using BLSTM Networks
On data recovery with restraints on the spectrum range
Energy-relaxed Wasserstein GANs (EnergyWGAN): Towards More Stable and High Resolution Image Generation
Rooted Tree Maps
A duality principle for non-convex variational problems applied to a Ginzburg-Landau Type equation
NEON+: Accelerated Gradient Methods for Extracting Negative Curvature for Non-Convex Optimization
Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization
A Continuous Family of Marked Poset Polytopes
Vprop: Variational Inference using RMSprop
Learning Deep Correspondence through Prior and Posterior Feature Constancy
Reclaiming memory for lock-free data structures: there has to be a better way
CNN based Learning using Reflection and Retinex Models for Intrinsic Image Decomposition
GANerated Hands for Real-time 3D Hand Tracking from Monocular RGB
Considering Slow Manifold Based Model Reduction for Multiscale Chemical Optimal Control Problems
A Second-Order Approach to Complex Event Recognition
Connecting Pixels to Privacy and Utility: Automatic Redaction of Private Information in Images
A Generalized Motion Pattern and FCN based approach for retinal fluid detection and segmentation
Coordinated Charging and Discharging Strategies for Plug-in Electric Bus Fast Charging Station with Energy Storage System
Robust 3D Action Recognition through Sampling Local Appearances and Global Distributions
Optimizing Electric Taxi Charging System: A Data-Driven Approach from Transport Energy Supply Chain Perspective
Inferring agent objectives at different scales of a complex adaptive system
Distributed Computing Made Secure: A New Cycle Cover Theorem
Stochastic Maximum Likelihood Optimization via Hypernetworks
Multipoint secant and interpolation methods with nonmonotone line search for solving systems of nonlinear equations
An Upper Bound on the GKS Game via Max Bipartite Matching
The Maximal Positively Invariant Set: Polynomial Setting
Consensus tracking in multi agent system with nonlinear and non identical dynamics via event driven sliding modes
Episodic memory for continual model learning
The Game of Blocking Pebbles
A Data-Centric View on Computational Complexity: P $\not =$ NP
A Dual Framework for Low-rank Tensor Completion
Why my photos look sideways or upside down? Detecting Canonical Orientation of Images using Convolutional Neural Networks
Particle Computation: Complexity, Algorithms, and Logic
tHoops: A Multi-Aspect Analytical Framework for Spatio-Temporal Basketball Data Using Tensor Decomposition
Multi-oriented, sourced and directed graph complexes and quasi-isomorphisms between them
Refining the Two-Dimensional Signed Small Ball Inequality
The Case for Learned Index Structures
Speeding Up BigClam Implementation on SNAP
An Encoder-Decoder Model for ICD-10 Coding of Death Certificates
Iterative Deep Learning for Network Topology Extraction
Exact controllability for string with attached masses
On Capacity-Achieving Distributions Over Complex AWGN Channels Under Nonlinear Power Constraints and their Applications to SWIPT
Sub-clustering in decomposable graphs and size-varying junction trees
Minimal Input Selection for Robust Control
On the Real-time Vehicle Placement Problem
Learning by Asking Questions
Bisecting binomial coefficients (II)
Exponential Generalised Network Descriptors
On Out-of-Band Emissions of Quantized Precoding in Massive MU-MIMO-OFDM
The algebraic geometry of Kazhdan-Lusztig-Stanley polynomials
An Equivalence of Fully Connected Layer and Convolutional Layer
Upper Tail Large Deviations in First Passage Percolation