Deep learning for inferring cause of data anomalies

Daily operation of a large-scale experiment is a resource-consuming task, particularly from the perspective of routine data quality monitoring. Typically, data comes from different sub-detectors, and the global quality of the data depends on the combined performance of each of them. In this paper, the problem of identifying channels in which anomalies occurred is considered. We introduce a generic deep learning model and prove that, under reasonable assumptions, the model learns to identify ‘channels’ which are affected by an anomaly. Such a model could be used to cross-check and assist data quality managers and to identify good channels in anomalous data samples. The main novelty of the method is that the model does not require ground truth labels for each channel; only a global flag is used. This effectively distinguishes the model from classical classification methods. Applied to CMS data collected in 2010, this approach demonstrates its ability to decompose an anomaly by separate channels.


Deep Reinforcement Learning for Multi-Resource Multi-Machine Job Scheduling

Minimizing job scheduling time is a fundamental issue in data center networks that has been extensively studied in recent years. The incoming jobs require different CPU and memory units, and span different numbers of time slots. The traditional solution is to design efficient heuristic algorithms with performance guarantees under certain assumptions. In this paper, we improve a recently proposed job scheduling algorithm using deep reinforcement learning and extend it to multiple server clusters. Our study reveals that deep reinforcement learning methods have the potential to outperform traditional resource allocation algorithms in a variety of complicated environments.


Adversarial Attacks Beyond the Image Space

Generating adversarial examples is an intriguing problem and an important way of understanding the working mechanism of deep neural networks. Recently, it has attracted a lot of attention in the computer vision community. Most existing approaches generate perturbations in image space, i.e., each pixel can be modified independently. However, it remains unclear whether these adversarial examples are authentic, in the sense that they correspond to actual changes in physical properties. This paper aims at exploring this topic in the contexts of object classification and visual question answering. The baselines are set to be several state-of-the-art deep neural networks which receive 2D input images. We augment these networks with a differentiable 3D rendering layer in front, so that a 3D scene (in physical space) is rendered into a 2D image (in image space), and then mapped to a prediction (in output space). There are two ways of attacking the physical parameters, direct and indirect. The former back-propagates the gradients of error signals from output space to physical space directly, while the latter first constructs an adversary in image space, and then attempts to find the best solution in physical space that is rendered into this image. An important finding is that attacking physical space is much more difficult, as the direct method, compared with that used in image space, produces a much lower success rate and requires heavier perturbations to be added. On the other hand, the indirect method does not work out, suggesting that adversaries generated in image space are inauthentic. By interpreting them in physical space, most of these adversaries can be filtered out, showing promise in defending against adversaries.


Verifying Neural Networks with Mixed Integer Programming

Neural networks have demonstrated considerable success in a wide variety of real-world problems. However, the presence of adversarial examples – slightly perturbed inputs that are misclassified with high confidence – limits our ability to guarantee performance for these networks in safety-critical applications. We demonstrate that, for networks that are piecewise affine (for example, deep networks with ReLU and maxpool units), proving no adversarial example exists – or finding the closest example if one does exist – can be naturally formulated as solving a mixed integer program. Solves for a fully-connected MNIST classifier with three hidden layers can be completed an order of magnitude faster than those of the best existing approach. To address the concern that adversarial examples are irrelevant because pixel-wise attacks are unlikely to happen in natural images, we search for adversaries over a natural class of perturbations written as convolutions with an adversarial blurring kernel. When searching over blurred images, we find that as opposed to pixelwise attacks, some misclassifications are impossible. Even more interestingly, a small fraction of input images are provably robust to blurs: every blurred version of the input is classified with the same, correct label.
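To make the piecewise-affine point concrete, here is one standard way a single ReLU unit y = max(0, x) can be written with mixed-integer linear constraints. This is a textbook big-M style sketch, not necessarily the exact formulation used in the paper. The symbols are assumptions introduced for illustration: l and u are known bounds on the pre-activation with l < 0 < u, and a is a binary indicator of whether the unit is active.

```latex
\begin{aligned}
y &\ge 0, \qquad y \ge x, \\
y &\le x - l\,(1 - a), \\
y &\le u\,a, \\
a &\in \{0, 1\}, \qquad l \le x \le u .
\end{aligned}
```

With a = 1 the constraints force y = x and x ≥ 0; with a = 0 they force y = 0 and x ≤ 0. In both cases y = max(0, x), so stacking such constraints layer by layer turns a question about the network's output into a mixed integer program.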


The Promise and Peril of Human Evaluation for Model Interpretability

Transparency, user trust, and human comprehension are popular ethical motivations for interpretable machine learning. In support of these goals, researchers evaluate model explanation performance using humans and real world applications. This alone presents a challenge in many areas of artificial intelligence. In this position paper, we propose a distinction between descriptive and persuasive explanations. We discuss reasoning suggesting that functional interpretability may be correlated with cognitive function and user preferences. If this is indeed the case, evaluation and optimization using functional metrics could perpetuate implicit cognitive bias in explanations that threaten transparency. Finally, we propose two potential research directions to disambiguate cognitive function and explanation models, retaining control over the tradeoff between accuracy and interpretability.


Variable selection with genetic algorithms using repeated cross-validation of PLS regression models as fitness measure

Genetic algorithms are a widely used method in chemometrics for extracting variable subsets with high prediction power. Most fitness measures used by these genetic algorithms are based on the ordinary least-squares fit of the resulting model to the entire data or a subset thereof. Due to multicollinearity, partial least squares regression is often more appropriate, but rarely considered in genetic algorithms due to the additional cost for estimating the optimal number of components. We introduce two novel fitness measures for genetic algorithms, explicitly designed to estimate the internal prediction performance of partial least squares regression models built from the variable subsets. Both measures estimate the optimal number of components using cross-validation and subsequently estimate the prediction performance by predicting the response of observations not included in model-fitting. This is repeated multiple times to estimate the measures’ variations due to different random splits. Moreover, one measure was optimized for speed and more accurate estimation of the prediction performance for observations not included during variable selection. This leads to variable subsets with high internal and external prediction power. Results on high-dimensional chemical-analytical data show that the variable subsets acquired by this approach have competitive internal prediction power and superior external prediction power compared to variable subsets extracted with other fitness measures.


Learning to Organize Knowledge with N-Gram Machines

Deep neural networks (DNNs) have had great success on NLP tasks such as language modeling, machine translation and certain question answering (QA) tasks. However, the success is limited at more knowledge intensive tasks such as QA from a big corpus. Existing end-to-end deep QA models (Miller et al., 2016; Weston et al., 2014) need to read the entire text after observing the question, and therefore their complexity in answering a question is linear in the text size. This is prohibitive for practical tasks such as QA from Wikipedia, a novel, or the Web. We propose to solve this scalability issue by using symbolic meaning representations, which can be indexed and retrieved efficiently with complexity that is independent of the text size. More specifically, we use sequence-to-sequence models to encode knowledge symbolically and generate programs to answer questions from the encoded knowledge. We apply our approach, called the N-Gram Machine (NGM), to the bAbI tasks (Weston et al., 2015) and a special version of them (‘life-long bAbI’) which has stories of up to 10 million sentences. Our experiments show that NGM can solve both of these tasks accurately and efficiently. Unlike fully differentiable memory models, NGM’s time complexity and answering quality are not affected by the story length. The whole system of NGM is trained end-to-end with REINFORCE (Williams, 1992). To avoid high variance in gradient estimation, which is typical in discrete latent variable models, we use beam search instead of sampling. To tackle the exponentially large search space, we use a stabilized auto-encoding objective and a structure tweak procedure to iteratively reduce and refine the search space.


Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning

Deep reinforcement learning algorithms can learn complex behavioral skills, but real-world application of these methods requires a large amount of experience to be collected by the agent. In practical settings, such as robotics, this involves repeatedly attempting a task, resetting the environment between each attempt. However, not all tasks are easily or automatically reversible. In practice, this learning process requires extensive human intervention. In this work, we propose an autonomous method for safe and efficient reinforcement learning that simultaneously learns a forward and reset policy, with the reset policy resetting the environment for a subsequent attempt. By learning a value function for the reset policy, we can automatically determine when the forward policy is about to enter a non-reversible state, providing for uncertainty-aware safety aborts. Our experiments illustrate that proper use of the reset policy can greatly reduce the number of manual resets required to learn a task, can reduce the number of unsafe actions that lead to non-reversible states, and can automatically induce a curriculum.
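The early-abort idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `guarded_action`, the use of several value estimators with a pessimistic minimum, and the threshold `min_reset_value` are all hypothetical choices made here to show how a reset value function could gate the forward policy.

```python
from typing import Any, Callable, Sequence, Tuple


def guarded_action(
    state: Any,
    proposed_action: Any,
    reset_value_estimates: Sequence[Callable[[Any, Any], float]],
    reset_policy: Callable[[Any], Any],
    min_reset_value: float,
) -> Tuple[Any, bool]:
    """Return (action_to_execute, aborted).

    Each estimator scores how well the reset policy is expected to do if the
    forward policy's proposed action is taken in `state`. Taking the minimum
    over several estimators is an assumed, pessimistic way to account for
    uncertainty in those scores.
    """
    if not reset_value_estimates:
        raise ValueError("need at least one reset value estimator")
    pessimistic = min(q(state, proposed_action) for q in reset_value_estimates)
    if pessimistic < min_reset_value:
        # Resetting looks unlikely to succeed after this action:
        # abort the forward attempt and hand control to the reset policy.
        return reset_policy(state), True
    return proposed_action, False
```

A caller would invoke `guarded_action` at every step of the forward episode and switch to running the reset policy as soon as the second return value is true.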


Tree-Structured Boosting: Connections Between Gradient Boosted Stumps and Full Decision Trees

Additive models, such as those produced by gradient boosting, and full interaction models, such as classification and regression trees (CART), are widely used algorithms that have been investigated largely in isolation. We show that these models exist along a spectrum, revealing never-before-known connections between these two approaches. This paper introduces a novel technique called tree-structured boosting for creating a single decision tree, and shows that this method can produce models equivalent to CART or gradient boosted stumps at the extremes by varying a single parameter. Although tree-structured boosting is designed primarily to provide both the model interpretability and predictive performance needed for high-stakes applications like medicine, it also can produce decision trees represented by hybrid models between CART and boosted stumps that can outperform either of these approaches.


FluidNets: Fast & Simple Resource-Constrained Structure Learning of Deep Networks

We present FluidNets, an approach to automate the design of neural network structures. FluidNets iteratively shrinks and expands a network, shrinking via a resource-weighted sparsifying regularizer on activations and expanding via a uniform multiplicative factor on all layers. In contrast to previous approaches, our method is scalable to large networks, adaptable to specific resource constraints (e.g. the number of floating-point operations per inference), and capable of increasing the network’s performance. When applied to standard network architectures on a wide variety of datasets, our approach discovers novel structures in each domain, obtaining higher performance while respecting the resource constraint.
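The expansion step (a uniform multiplicative factor on all layers, chosen to respect a resource constraint) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the cost model `dense_flops` treats the network as a chain of fully connected layers, and the helper names and the bisection search are hypothetical, not taken from the paper. The shrinking step, which relies on a trained regularizer, is not shown.

```python
import math
from typing import List


def dense_flops(widths: List[int]) -> int:
    """Assumed cost model: multiply-adds of a chain of dense layers."""
    return sum(a * b for a, b in zip(widths, widths[1:]))


def scale_widths(widths: List[int], factor: float) -> List[int]:
    """Apply one multiplicative factor to every layer, keeping width >= 1."""
    return [max(1, math.floor(w * factor)) for w in widths]


def expand_to_budget(widths: List[int], budget: int,
                     max_factor: float = 64.0, steps: int = 50) -> List[int]:
    """Find (by bisection) the largest uniform factor that fits the budget."""
    if dense_flops(scale_widths(widths, 0.0)) > budget:
        raise ValueError("budget too small even for width-1 layers")
    if dense_flops(scale_widths(widths, max_factor)) <= budget:
        return scale_widths(widths, max_factor)
    lo, hi = 0.0, max_factor  # invariant: lo fits the budget, hi does not
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if dense_flops(scale_widths(widths, mid)) <= budget:
            lo = mid
        else:
            hi = mid
    return scale_widths(widths, lo)
```

The search works because the assumed cost never decreases as the factor grows, so the set of factors that fit the budget is an interval starting at zero.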


Deep Gaussian Mixture Models

Deep learning is a hierarchical inference method formed by multiple successive layers of learning, able to describe complex relationships more efficiently. In this work, Deep Gaussian Mixture Models are introduced and discussed. A Deep Gaussian Mixture Model (DGMM) is a network of multiple layers of latent variables, where, at each layer, the variables follow a mixture of Gaussian distributions. Thus, the deep mixture model consists of a set of nested mixtures of linear models, which globally provide a nonlinear model able to describe the data in a very flexible way. In order to avoid overparameterized solutions, dimension reduction by factor models can be applied at each layer of the architecture, thus resulting in deep mixtures of factor analysers.
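As a rough illustration of "nested mixtures of linear models", the sketch below draws one scalar sample from a layered model of this kind. It is a toy, one-dimensional reading of the description above, not the authors' code: the function `sample_dgmm` and the dictionary keys `weight`, `offset`, `loading` and `noise_sd` are hypothetical names chosen here, and the real model works with vectors and loading matrices.

```python
import random
from typing import Dict, List


def sample_dgmm(layers: List[List[Dict[str, float]]], rng: random.Random) -> float:
    """Draw one scalar observation from a toy deep Gaussian mixture.

    `layers[0]` is the layer closest to the data and `layers[-1]` the deepest.
    Each layer is a list of components; a component is a linear-Gaussian map
    with a mixing weight.
    """
    z = rng.gauss(0.0, 1.0)  # deepest latent variable: standard normal
    for components in reversed(layers):
        weights = [c["weight"] for c in components]
        # Pick one mixture component for this layer.
        c = rng.choices(components, weights=weights, k=1)[0]
        # Linear model in the latent variable from the layer below, plus noise.
        z = c["offset"] + c["loading"] * z + rng.gauss(0.0, c["noise_sd"])
    return z
```

Because a different component can be picked at every layer, a model with k1 components in one layer and k2 in the next behaves like a mixture with k1 × k2 paths while only storing k1 + k2 sets of parameters.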


Interleaver Design for Deep Neural Networks

We propose a class of interleavers for a novel deep neural network (DNN) architecture that uses algorithmically pre-determined, structured sparsity to significantly lower memory and computational requirements, and speed up training. The interleavers guarantee clash-free memory accesses to eliminate idle operational cycles, optimize spread and dispersion to improve network performance, and are designed to ease the complexity of memory address computations in hardware. We present a design algorithm with mathematical proofs for these properties. We also explore interleaver variations and analyze the behavior of neural networks as a function of interleaver metrics.


Decentralized High-Dimensional Bayesian Optimization with Factor Graphs

This paper presents a novel decentralized high-dimensional Bayesian optimization (DEC-HBO) algorithm that, in contrast to existing HBO algorithms, can exploit the interdependent effects of various input components on the output of the unknown objective function f for boosting the BO performance and still preserve scalability in the number of input dimensions without requiring prior knowledge or the existence of a low (effective) dimension of the input space. To realize this, we propose a sparse yet rich factor graph representation of f to be exploited for designing an acquisition function that can be similarly represented by a sparse factor graph and hence be efficiently optimized in a decentralized manner using distributed message passing. Despite richly characterizing the interdependent effects of the input components on the output of f with a factor graph, DEC-HBO can still guarantee no-regret performance asymptotically. Empirical evaluation on synthetic and real-world experiments (e.g., sparse Gaussian process model with 1811 hyperparameters) shows that DEC-HBO outperforms the state-of-the-art HBO algorithms.


Prior-aware Dual Decomposition: Document-specific Topic Inference for Spectral Topic Models

Spectral topic modeling algorithms operate on matrices/tensors of word co-occurrence statistics to learn topic-specific word distributions. This approach removes the dependence on the original documents and produces substantial gains in efficiency and provable topic inference, but at a cost: the model can no longer provide information about the topic composition of individual documents. Recently, Thresholded Linear Inverse (TLI) was proposed to map the observed words of each document back to its topic composition. However, its linear characteristics limit the inference quality because it does not consider the important prior information over topics. In this paper, we evaluate the Simple Probabilistic Inverse (SPI) method and a novel Prior-aware Dual Decomposition (PADD) that is capable of learning document-specific topic compositions in parallel. Experiments show that PADD successfully leverages topic correlations as a prior, notably outperforming TLI and learning quality topic compositions comparable to Gibbs sampling on various data.


Structured Stein Variational Inference for Continuous Graphical Models

We propose a novel distributed inference algorithm for continuous graphical models by extending Stein variational gradient descent (SVGD) to leverage the Markov dependency structure of the distribution of interest. The idea is to use a set of local kernel functions over the Markov blanket of each node, which alleviates the problem of the curse of high dimensionality and simultaneously yields a distributed algorithm for decentralized inference tasks. We justify our method with theoretical analysis and show that the use of local kernels can be viewed as a new type of localized approximation that matches the target distribution on the conditional distributions of each node over its Markov blanket. Our empirical results demonstrate that our method outperforms a variety of baselines including standard MCMC and particle message passing methods.


Classification with Costly Features using Deep Reinforcement Learning

We study a classification problem where each feature can be acquired for a cost and the goal is to optimize the trade-off between classification precision and the total feature cost. We frame the problem as a sequential decision-making problem, where we classify one sample in each episode. At each step, an agent can use values of acquired features to decide whether to purchase another one or whether to classify the sample. We use vanilla Double Deep Q-learning, a standard reinforcement learning technique, to find a classification policy. We show that this generic approach outperforms Adapt-Gbrt, currently the best-performing algorithm developed specifically for classification with costly features.
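The sequential formulation above can be sketched as a tiny episode object. This is only an illustration under stated assumptions: the class name `CostlyFeatureEpisode`, its methods, the `trade_off` weight, and the reward values (a negative scaled cost for buying a feature, 0 for a correct classification, -1 for an incorrect one) are choices made here and are not given in the abstract. The Q-learning agent itself is not shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CostlyFeatureEpisode:
    """One episode classifies one sample (hypothetical interface)."""
    features: List[float]      # true feature values, hidden until bought
    label: int                 # true class of the sample
    costs: List[float]         # price of each feature
    trade_off: float = 1.0     # assumed weight of cost against accuracy
    acquired: List[bool] = field(init=False)
    done: bool = field(default=False, init=False)

    def __post_init__(self) -> None:
        self.acquired = [False] * len(self.features)

    def observe(self) -> List[Optional[float]]:
        """State seen by the agent: bought values, None for the rest."""
        return [v if a else None for v, a in zip(self.features, self.acquired)]

    def buy(self, index: int) -> float:
        """Acquire one more feature; the reward is its negative scaled cost."""
        if self.done:
            raise RuntimeError("episode already finished")
        if self.acquired[index]:
            raise ValueError("feature already acquired")
        self.acquired[index] = True
        return -self.trade_off * self.costs[index]

    def classify(self, predicted: int) -> float:
        """Terminal action: 0 reward if correct, -1 otherwise (assumed)."""
        if self.done:
            raise RuntimeError("episode already finished")
        self.done = True
        return 0.0 if predicted == self.label else -1.0
```

Summing the rewards over an episode gives the quantity the agent trades off: total weighted feature cost against the penalty for a wrong label.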


Deep Approximately Orthogonal Nonnegative Matrix Factorization for Clustering

Nonnegative Matrix Factorization (NMF) is a widely used technique for data representation. Inspired by the expressive power of deep learning, several NMF variants equipped with deep architectures have been proposed. However, these methods mostly use only nonnegativity while ignoring task-specific features of data. In this paper, we propose a novel deep approximately orthogonal nonnegative matrix factorization method where both nonnegativity and orthogonality are imposed with the aim of performing hierarchical clustering using different levels of abstraction of the data. Experiments on two face image datasets showed that the proposed method achieved better clustering performance than other deep matrix factorization methods and state-of-the-art single layer NMF variants.


Bidirectional Conditional Generative Adversarial Networks

Conditional variants of Generative Adversarial Networks (GANs), known as cGANs, are generative models that can produce data samples (x) conditioned on both latent variables (z) and known auxiliary information (c). Another GAN variant, Bidirectional GAN (BiGAN) is a recently developed framework for learning the inverse mapping from x to z through an encoder trained simultaneously with the generator and the discriminator of an unconditional GAN. We propose the Bidirectional Conditional GAN (BCGAN), which combines cGANs and BiGANs into a single framework with an encoder that learns inverse mappings from x to both z and c, trained simultaneously with the conditional generator and discriminator in an end-to-end setting. We present crucial techniques for training BCGANs, which incorporate an extrinsic factor loss along with an associated dynamically-tuned importance weight. As compared to other encoder-based GANs, BCGANs not only encode c more accurately but also utilize z and c more effectively and in a more disentangled way to generate data samples.


Better Agnostic Clustering Via Relaxed Tensor Norms

We develop a new family of convex relaxations for k-means clustering based on sum-of-squares norms, a relaxation of the injective tensor norm that is efficiently computable using the Sum-of-Squares algorithm. We give an algorithm based on this relaxation that recovers a faithful approximation to the true means in the given data whenever the low-degree moments of the points in each cluster have bounded sum-of-squares norms. We then prove a sharp upper bound on the sum-of-squares norms for moment tensors of any distribution that satisfies the Poincaré inequality. The Poincaré inequality is a central inequality in probability theory, and a large class of distributions satisfy it including Gaussians, product distributions, strongly log-concave distributions, and any sum or uniformly continuous transformation of such distributions. As an immediate corollary, for any $\gamma > 0$, we obtain an efficient algorithm for learning the means of a mixture of k arbitrary Poincaré distributions in $\mathbb{R}^d$ in time $d^{O(1/\gamma)}$ so long as the means have separation $\Omega(k^{\gamma})$. This in particular yields an algorithm for learning Gaussian mixtures with separation $\Omega(k^{\gamma})$, thus partially resolving an open problem of Regev and Vijayaraghavan (2017). Our algorithm works even in the outlier-robust setting where an $\epsilon$ fraction of arbitrary outliers are added to the data, as long as the fraction of outliers is smaller than the smallest cluster. We, therefore, obtain results in the strong agnostic setting where, in addition to not knowing the distribution family, the data itself may be arbitrarily corrupted.


Recovering Lexicographic Triangulations
Fusing Bird View LIDAR Point Cloud and Front View Camera Image for Deep Object Detection
Learning Discriminative Affine Regions via Discriminability
Maximum-norm a posteriori error estimates for an optimal control problem
Manifold learning with bi-stochastic kernels
Integrating Disparate Sources of Experts for Robust Image Denoising
Techniques for proving Asynchronous Convergence results for Markov Chain Monte Carlo methods
Quarnet inference rules for level-1 networks
3D object classification and retrieval with Spherical CNNs
Phonological (un)certainty weights lexical activation
Information Gathering with Peers: Submodular Optimization with Peer-Prediction Constraints
Principal Manifolds of Middles: A Framework and Estimation Procedure Using Mixture Densities
Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks
Deep supervised learning using local errors
Improving particle filter performance with a generalized random field model of observation errors
Backward induction in presence of cycles
Generation and Consolidation of Recollections for Efficient Deep Lifelong Learning
Addressing Expensive Multi-objective Games with Postponed Preference Articulation via Memetic Co-evolution
Image Registration of Very Large Images via Genetic Programming
A Two-Phase Genetic Algorithm for Image Registration
Genetic Algorithm-Based Solver for Very Large Multiple Jigsaw Puzzles of Unknown Dimensions and Piece Orientation
An Automatic Solver for Very Large Jigsaw Puzzles Using Genetic Algorithms
A Generalized Genetic Algorithm-Based Solver for Very Large Jigsaw Puzzles of Complex Types
A Genetic Algorithm-Based Solver for Very Large Jigsaw Puzzles
Approximate Gradient Coding via Sparse Random Graphs
Separable discrete functions: recognition and sufficient conditions
Game Theoretic Analysis of Auction Mechanisms Modeled by Constrained Optimization Problems
Excitation Backprop for RNNs
Machine Learning Approaches for Traffic Volume Forecasting: A Case Study of the Moroccan Highway Network
Exact alignment recovery for correlated Erdos Renyi graphs
A primal-dual algorithm with optimal stepsizes and its application in decentralized consensus optimization
Measuring Territorial Control in Civil Wars Using Hidden Markov Models: A Data Informatics-Based Approach
Learning Aggregated Transmission Propagation Networks for Haze Removal and Beyond
MinimalRNN: Toward More Interpretable and Trainable Recurrent Neural Networks
Enumeration of Some Closed Knight Paths
Co-attending Free-form Regions and Detections with Multi-modal Multiplicative Feature Embedding for Visual Question Answering
Prediction Scores as a Window into Classifier Behavior
Short proofs for generalizations of the Lovász Local Lemma: Shearer’s condition and cluster expansion
Scalable Relaxations of Sparse Packing Constraints: Optimal Biocontrol in Predator-Prey Network
Reduction of total-cost and average-cost MDPs with weakly continuous transition probabilities to discounted MDPs
Fast Monte Carlo Markov chains for Bayesian shrinkage models with random effects
A Color Quantization Optimization Approach for Image Representation Learning
Household poverty classification in data-scarce environments: a machine learning approach
Convex Set of Doubly Substochastic Matrices
Acquiring Common Sense Spatial Knowledge through Implicit Spatial Templates
A novel Topological Model for Nonlinear Analysis and Prediction for Observations with Recurring Patterns
Low-dimensional Embeddings for Interpretable Anchor-based Topic Inference
Continuous-state branching processes with competition: Duality and Reflection at Infinity
Transferable Semi-supervised Semantic Segmentation
Random Access in Massive MIMO by Exploiting Timing Offsets and Excess Antennas
Proximal Gradient Method with Extrapolation and Line Search for a Class of Nonconvex and Nonsmooth Problems
Neural Network Reinforcement Learning for Audio-Visual Gaze Control in Human-Robot Interaction
Genetic Algorithms for Mentor-Assisted Evaluation Function Optimization
Simulating Human Grandmasters: Evolution and Coevolution of Evaluation Functions
Expert-Driven Genetic Algorithms for Simulating Evaluation Functions
Evaluating Roles of Central Users in Online Communication Networks: A Case Study of #PanamaLeaks
Local Clustering Coefficient of Spatial Preferential Attachment Model
DLTK: State of the Art Reference Implementations for Deep Learning on Medical Images
Style Transfer in Text: Exploration and Evaluation
From Common to Special: When Multi-Attribute Learning Meets Personalized Opinions
Bio-Inspired Local Information-Based Control for Probabilistic Swarm Distribution Guidance
Anonymous Hedonic Game for Task Allocation in a Large-Scale Multiple Agent System
Automatically Extracting Action Graphs from Materials Science Synthesis Procedures
Learning Dynamics and the Co-Evolution of Competing Sexual Species
Fission-fusion dynamics and group-size dependent composition in heterogeneous populations
Fully Dynamic Almost-Maximal Matching: Breaking the Polynomial Barrier for Worst-Case Time Bounds
Learning to select computations
Is China Entering WTO or shijie maoyi zuzhi–a Corpus Study of English Acronyms in Chinese Newspapers
Inversion of Tchebychev-Tchernov inequality
Single-Shot Refinement Neural Network for Object Detection
The Cultural Evolution of National Constitutions
On the second largest Laplacian eigenvalue of graph
Collective gradient sensing in fish schools
Optimal Stopping for Interval Estimation in Bernoulli Trials
Joint User Scheduling and Beam Selection Optimization for Beam-Based Massive MIMO Downlinks
Gazing into the Abyss: Real-time Gaze Estimation
Shifted tableaux crystals
Superlinear Lower Bounds for Distributed Subgraph Detection
Run, skeleton, run: skeletal model in a physics-based simulation
The Bayes Lepski’s Method and Credible Bands through Volume of Tubular Neighborhoods
Computational Results for Extensive-Form Adversarial Team Games
Average-case Approximation Ratio of Scheduling without Payments
Macdonald-positive specializations of the algebra of symmetric functions: Proof of the Kerov conjecture
Robust Synthetic Control
Node Profiles of Symmetric Digital Search Trees
An extension to the theory of controlled Lagrangians using the Helmholtz conditions
A novel total variation model based on kernel functions and its application
Approximating geodesics via random points
A systematic framework to discover pattern for web spam classification
BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning
The Strength of Multi-row Aggregation Cuts for Sign-pattern Integer Programs
Cyclone: High Availability for Persistent Key Value Stores
Intelligent Word Embeddings of Free-Text Radiology Reports
Unsupervised Domain Adaptation for Semantic Segmentation with GANs
How much is my car worth? A methodology for predicting used cars prices using Random Forest
MIT Autonomous Vehicle Technology Study: Large-Scale Deep Learning Based Analysis of Driver Behavior and Interaction with Automation
Enhanced Group Sparse Beamforming for Green Cloud-RAN: A Random Matrix Approach
Sequential Randomized Matrix Factorization for Gaussian Processes: Efficient Predictions and Hyper-parameter Optimization
Kill Two Birds with One Stone: Weakly-Supervised Neural Network for Image Annotation and Tag Refinement
A note on quadratic approximations of logistic log-likelihoods
Convergence Analysis of the Dynamics of a Special Kind of Two-Layered Neural Networks with $\ell_1$ and $\ell_2$ Regularization
Probabilistic approach to quantum separation effect for Feynman-Kac semigroup
Coherence-based Time Series Clustering for Brain Connectivity Visualization
A Discourse-Level Named Entity Recognition and Relation Extraction Dataset for Chinese Literature Text
MicroExpNet: An Extremely Small and Fast Model For Expression Recognition From Frontal Face Images
A note on Hadamard fractional differential equations with varying coefficients and their applications in probability
Incorporating Syntactic Uncertainty in Neural Machine Translation using a Forest-to-Sequence Model
Zero Dynamics for Port-Hamiltonian Systems
Extremal graphs with respect to the total-eccentricity index
Image-Image Domain Adaptation with Preserved Self-Similarity and Domain-Dissimilarity for Person Re-identification
Mixed-integer linear representability, disjunctions, and Chvatal functions — modeling implications
Universal Cycles of Restricted Words
Normal Representations of Hyperplane Arrangements Over a Field with $1-ad$ Structure and Convex Positive Bijections
Two-level schemes for the advection equation
A Coordinate-wise Optimization Algorithm for Sparse Inverse Covariance Selection
An Improved Oscillating-Error Classifier with Branching
A Classifying Variational Autoencoder with Application to Polyphonic Music Generation
An Approximating Control Design for Optimal Mixing by Stokes Flows
A New Form of Williamson’s Product Theorem
Morphisms of open games
DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks
Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space
The destiny of constant structure discrete time closed semantic systems
Node Balanced Steady States: Unifying and Generalizing Complex and Detailed Balanced Steady States
On convergence rate for an infinite-channel queuing system with Poisson input flow
Does mitigating ML’s disparate impact require disparate treatment?
Estimation Considerations in Contextual Bandits
Equiangular tight frames that contain regular simplices
Second-Order Variational Analysis of Parametric Constraint and Variational Systems
Superexponential estimates and weighted lower bounds for the square function
Compression-Based Regularization with an Application to Multi-Task Learning
Probabilistic and Combinatorial Interpretations of the Bernoulli Symbol
Eigenvectors distribution and quantum unique ergodicity for deformed Wigner matrices
A Double Parametric Bootstrap Test for Topic Models
A note on quasi-convex functions
The invariant measure and the flow associated to the $Φ^4_3$-quantum field model
Modeling Epistemological Principles for Bias Mitigation in AI Systems: An Illustration in Hiring Decisions
Deletion-Robust Submodular Maximization at Scale
On the Stability of a N-class Aloha Network
Hello Edge: Keyword Spotting on Microcontrollers
CleanNet: Transfer Learning for Scalable Image Classifier Training with Label Noise
Critique of Barbosa’s ‘P != NP Proof’
Robust Non-line-of-sight Imaging with Single Photon Detectors
Schlegel Diagram and Optimizable Immediate Snapshot Protocol
Nonparametric Double Robustness
Optimal binary linear locally repairable codes with disjoint repair groups
On the Global Fluctuations of Block Gaussian Matrices
Spectral-Spatial Feature Extraction and Classification by ANN Supervised with Center Loss in Hyperspectral Imagery
On $e$-positivity and $e$-unimodality of chromatic quasisymmetric functions
Interactive, Intelligent Tutoring for Auxiliary Constructions in Geometry Proofs
Let Features Decide for Themselves: Feature Mask Network for Person Re-identification
Dynamic Neural Program Embedding for Program Repair
Parameter Reference Loss for Unsupervised Domain Adaptation
On the Feasibility of Interference Alignment in Compounded MIMO Broadcast Channels with Antenna Correlation and Mixed User Classes
Polyhedral parametrizations of canonical bases & cluster duality
Non-reversible, tuning- and rejection-free Markov chain Monte Carlo via iterated random functions
Is prioritized sweeping the better episodic control?
On a stochastic Hardy-Littlewood-Sobolev inequality with application to Strichartz estimates for the white noise dispersion
Block-Cyclic Stochastic Coordinate Descent for Deep Neural Networks
Softening and Yielding of Soft Glassy Materials
Method to Design UF-OFDM Filter and its Analysis
A new class of tests for multinormality with i.i.d. and Garch data based on the empirical moment generating function
End-to-end Trained CNN Encode-Decoder Networks for Image Steganography
List-Decodable Robust Mean Estimation and Learning Mixtures of Spherical Gaussians
Maximizing Non-monotone/Non-submodular Functions by Multi-objective Evolutionary Algorithms
Lefschetz and Lower Bound theorems for Minkowski sums
Model Extraction Warning in MLaaS Paradigm
Generalized Dual Dynamic Programming for Infinite Horizon Problems in Continuous State and Action Spaces
Linear-Complexity Relaxed Word Mover’s Distance with GPU Acceleration
Finite Time Analysis of Optimal Adaptive Policies for Linear-Quadratic Systems
Stochastic metamorphosis with template uncertainties
Statistics of the Voronoi cell perimeter in large bi-pointed maps
Tracking in Aerial Hyperspectral Videos using Deep Kernelized Correlation Filters
MegDet: A Large Mini-Batch Object Detector
Optical Character Recognition (OCR) for Telugu: Database, Algorithm and Application
Face Attention Network: An effective Face Detector for the Occluded Faces
Finite Horizon Robustness Analysis of LTV Systems Using Integral Quadratic Constraints
On the optimality of the uniform random strategy
Light-Head R-CNN: In Defense of Two-Stage Object Detector
Fast BTG-Forest-Based Hierarchical Sub-sentential Alignment
Evaluating the Performance of eMTC and NB-IoT for Smart City Applications
A Separation Between Run-Length SLPs and LZ77
Positive semi-definite embedding for dimensionality reduction and out-of-sample extensions
Facets, Tiers and Gems: Ontology Patterns for Hypernormalisation
Speech recognition for medical conversations
Backscatter Communications for the Internet of Things: A Stochastic Geometry Approach
Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments
Quantum Query Algorithms are Completely Bounded Forms
Non-exchangeable random partition models for microclustering
When Fourth Moments Are Enough
Learning Steerable Filters for Rotation Equivariant CNNs
Bitmap Filter: Speeding up Exact Set Similarity Joins with Bitwise Operations
Optimization-Based Autonomous Racing of 1:43 Scale RC Cars
Zero-shot Learning via Shared-Reconstruction-Graph Pursuit
Solution of network localization problem with noisy distances and its convergence
Performance of In-band Transmission of System Information in Massive MIMO Systems
Cooperative Games With Bounded Dependency Degree
Detection of Tooth caries in Bitewing Radiographs using Deep Learning
A Note on Helffer-Sjöstrand Representation for A Ginzburg-Landau Process
Cascaded Pyramid Network for Multi-Person Pose Estimation
Proof Complexity Meets Algebra
On DNA Codes using the Ring Z4 + wZ4
Bayesian Active Edge Evaluation on Expensive Graphs
Robust Decentralized Secondary Frequency Control in Power Systems: Merits and Trade-Offs
Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks
Community detection with spiking neural networks for neuromorphic hardware
Pixel-wise object tracking
Wasserstein and Kolmogorov error bounds for variance-gamma approximation via Stein’s method I
Spectral distribution of the free Jacobi process, revisited
Adaptive M-QAM for Indoor Wireless Environments: Rate & Power Adaptation
How morphological development can guide evolution
V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map
Non-Contextual Modeling of Sarcasm using a Neural Network Benchmark
Disentangling Factors of Variation by Mixing Them
Robust Seed Mask Generation for Interactive Image Segmentation
Outliers in the spectrum for products of independent random matrices
Informed proposals for local MCMC in discrete spaces
Modular Continual Learning in a Unified Visual Environment
Joint Object Category and 3D Pose Estimation from 2D Images
Action Recognition with Coarse-to-Fine Deep Feature Integration and Asynchronous Fusion
A local graph rewiring algorithm for sampling spanning trees
Relaxed Oracles for Semi-Supervised Clustering
On Convergence of Epanechnikov Mean Shift
On tight cycles in hypergraphs
A generalised framework for detailed classification of swimming paths inside the Morris Water Maze
Subcritical multitype branching process in random environment
Mixture Models, Robustness, and Sum of Squares Proofs
Families of nested graphs with compatible symmetric-group actions
Matrix Factorization for Nonparametric Multi-source Localization Exploiting Unimodal Properties
SquishedNets: Squishing SqueezeNet further for edge device scenarios via deep evolutionary synthesis
Glitch Classification and Clustering for LIGO with Deep Transfer Learning