Deep learning for inferring cause of data anomalies
Daily operation of a large-scale experiment is a resource-consuming task, particularly from the perspective of routine data quality monitoring. Typically, data comes from different sub-detectors and the global quality of data depends on the combinatorial performance of each of them. In this paper, the problem of identifying channels in which anomalies occurred is considered. We introduce a generic deep learning model and prove that, under reasonable assumptions, the model learns to identify ‘channels’ which are affected by an anomaly. Such a model could be used to cross-check and assist data quality managers and to identify good channels in anomalous data samples. The main novelty of the method is that the model does not require ground truth labels for each channel; only a global flag is used. This effectively distinguishes the model from classical classification methods. Applied to CMS data collected in the year 2010, this approach proves its ability to decompose anomalies by separate channels.
Deep Reinforcement Learning for Multi-Resource Multi-Machine Job Scheduling
Minimizing job scheduling time is a fundamental issue in data center networks that has been extensively studied in recent years. The incoming jobs require different CPU and memory units, and span different numbers of time slots. The traditional solution is to design efficient heuristic algorithms with performance guarantees under certain assumptions. In this paper, we improve a recently proposed job scheduling algorithm using deep reinforcement learning and extend it to multiple server clusters. Our study reveals that deep reinforcement learning methods have the potential to outperform traditional resource allocation algorithms in a variety of complicated environments.
Adversarial Attacks Beyond the Image Space
Generating adversarial examples is an intriguing problem and an important way of understanding the working mechanism of deep neural networks. Recently, it has attracted a lot of attention in the computer vision community. Most existing approaches generate perturbations in image space, i.e., each pixel can be modified independently. However, it remains unclear whether these adversarial examples are authentic, in the sense that they correspond to actual changes in physical properties. This paper aims at exploring this topic in the contexts of object classification and visual question answering. The baselines are set to be several state-of-the-art deep neural networks which receive 2D input images. We augment these networks with a differentiable 3D rendering layer in front, so that a 3D scene (in physical space) is rendered into a 2D image (in image space), and then mapped to a prediction (in output space). There are two ways of attacking the physical parameters, direct and indirect. The former back-propagates the gradients of error signals from output space to physical space directly, while the latter first constructs an adversary in image space, and then attempts to find the best solution in physical space that is rendered into this image. An important finding is that attacking physical space is much more difficult: the direct method, compared with that used in image space, produces a much lower success rate and requires heavier perturbations to be added. On the other hand, the indirect method does not work out, suggesting that adversaries generated in image space are inauthentic. By interpreting them in physical space, most of these adversaries can be filtered out, showing promise in defending against adversaries.
Verifying Neural Networks with Mixed Integer Programming
Neural networks have demonstrated considerable success in a wide variety of real-world problems. However, the presence of adversarial examples – slightly perturbed inputs that are misclassified with high confidence – limits our ability to guarantee performance for these networks in safety-critical applications. We demonstrate that, for networks that are piecewise affine (for example, deep networks with ReLU and maxpool units), proving no adversarial example exists – or finding the closest example if one does exist – can be naturally formulated as solving a mixed integer program. Solves for a fully-connected MNIST classifier with three hidden layers can be completed an order of magnitude faster than those of the best existing approach. To address the concern that adversarial examples are irrelevant because pixel-wise attacks are unlikely to happen in natural images, we search for adversaries over a natural class of perturbations written as convolutions with an adversarial blurring kernel. When searching over blurred images, we find that as opposed to pixelwise attacks, some misclassifications are impossible. Even more interestingly, a small fraction of input images are provably robust to blurs: every blurred version of the input is classified with the same, correct label.
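To make the mixed integer formulation concrete, here is a sketch of how a single ReLU unit $y = \max(0, x)$ can be written with linear constraints and one binary variable. This is the standard big-M style encoding, not necessarily the exact formulation used in the paper; the pre-activation bounds $l \le x \le u$ with $l < 0 < u$ and the indicator $a$ are assumptions introduced for illustration.

```latex
% Assumed: known bounds l <= x <= u with l < 0 < u; a is a binary indicator.
\begin{aligned}
y &\ge 0, \\
y &\ge x, \\
y &\le u\,a, \\
y &\le x - l\,(1 - a), \\
a &\in \{0, 1\}.
\end{aligned}
```

With $a = 1$ the constraints force $y = x$ (and hence $x \ge 0$); with $a = 0$ they force $y = 0$ (and hence $x \le 0$), so the feasible set is exactly the graph of the ReLU on $[l, u]$.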
The Promise and Peril of Human Evaluation for Model Interpretability
Transparency, user trust, and human comprehension are popular ethical motivations for interpretable machine learning. In support of these goals, researchers evaluate model explanation performance using humans and real world applications. This alone presents a challenge in many areas of artificial intelligence. In this position paper, we propose a distinction between descriptive and persuasive explanations. We discuss reasoning suggesting that functional interpretability may be correlated with cognitive function and user preferences. If this is indeed the case, evaluation and optimization using functional metrics could perpetuate implicit cognitive bias in explanations that threaten transparency. Finally, we propose two potential research directions to disambiguate cognitive function and explanation models, retaining control over the tradeoff between accuracy and interpretability.
Variable selection with genetic algorithms using repeated cross-validation of PLS regression models as fitness measure
Genetic algorithms are a widely used method in chemometrics for extracting variable subsets with high prediction power. Most fitness measures used by these genetic algorithms are based on the ordinary least-squares fit of the resulting model to the entire data or a subset thereof. Due to multicollinearity, partial least squares regression is often more appropriate, but rarely considered in genetic algorithms due to the additional cost for estimating the optimal number of components. We introduce two novel fitness measures for genetic algorithms, explicitly designed to estimate the internal prediction performance of partial least squares regression models built from the variable subsets. Both measures estimate the optimal number of components using cross-validation and subsequently estimate the prediction performance by predicting the response of observations not included in model-fitting. This is repeated multiple times to estimate the measures’ variations due to different random splits. Moreover, one measure was optimized for speed and more accurate estimation of the prediction performance for observations not included during variable selection. This leads to variable subsets with high internal and external prediction power. Results on high-dimensional chemical-analytical data show that the variable subsets acquired by this approach have competitive internal prediction power and superior external prediction power compared to variable subsets extracted with other fitness measures.
Learning to Organize Knowledge with N-Gram Machines
Deep neural networks (DNNs) have had great success on NLP tasks such as language modeling, machine translation and certain question answering (QA) tasks. However, the success is limited on more knowledge-intensive tasks such as QA from a big corpus. Existing end-to-end deep QA models (Miller et al., 2016; Weston et al., 2014) need to read the entire text after observing the question, and therefore their complexity in responding to a question is linear in the text size. This is prohibitive for practical tasks such as QA from Wikipedia, a novel, or the Web. We propose to solve this scalability issue by using symbolic meaning representations, which can be indexed and retrieved efficiently with complexity that is independent of the text size. More specifically, we use sequence-to-sequence models to encode knowledge symbolically and generate programs to answer questions from the encoded knowledge. We apply our approach, called the N-Gram Machine (NGM), to the bAbI tasks (Weston et al., 2015) and a special version of them (‘life-long bAbI’) which has stories of up to 10 million sentences. Our experiments show that NGM can solve both of these tasks accurately and efficiently. Unlike fully differentiable memory models, NGM’s time complexity and answering quality are not affected by the story length. The whole system of NGM is trained end-to-end with REINFORCE (Williams, 1992). To avoid high variance in gradient estimation, which is typical in discrete latent variable models, we use beam search instead of sampling. To tackle the exponentially large search space, we use a stabilized auto-encoding objective and a structure tweak procedure to iteratively reduce and refine the search space.
Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning
Deep reinforcement learning algorithms can learn complex behavioral skills, but real-world application of these methods requires a large amount of experience to be collected by the agent. In practical settings, such as robotics, this involves repeatedly attempting a task, resetting the environment between each attempt. However, not all tasks are easily or automatically reversible. In practice, this learning process requires extensive human intervention. In this work, we propose an autonomous method for safe and efficient reinforcement learning that simultaneously learns a forward and reset policy, with the reset policy resetting the environment for a subsequent attempt. By learning a value function for the reset policy, we can automatically determine when the forward policy is about to enter a non-reversible state, providing for uncertainty-aware safety aborts. Our experiments illustrate that proper use of the reset policy can greatly reduce the number of manual resets required to learn a task, can reduce the number of unsafe actions that lead to non-reversible states, and can automatically induce a curriculum.
Tree-Structured Boosting: Connections Between Gradient Boosted Stumps and Full Decision Trees
Additive models, such as those produced by gradient boosting, and full interaction models, such as classification and regression trees (CART), are widely used algorithms that have been investigated largely in isolation. We show that these models exist along a spectrum, revealing never-before-known connections between these two approaches. This paper introduces a novel technique called tree-structured boosting for creating a single decision tree, and shows that this method can produce models equivalent to CART or gradient boosted stumps at the extremes by varying a single parameter. Although tree-structured boosting is designed primarily to provide both the model interpretability and predictive performance needed for high-stakes applications like medicine, it also can produce decision trees represented by hybrid models between CART and boosted stumps that can outperform either of these approaches.
FluidNets: Fast & Simple Resource-Constrained Structure Learning of Deep Networks
We present FluidNets, an approach to automate the design of neural network structures. FluidNets iteratively shrinks and expands a network, shrinking via a resource-weighted sparsifying regularizer on activations and expanding via a uniform multiplicative factor on all layers. In contrast to previous approaches, our method is scalable to large networks, adaptable to specific resource constraints (e.g. the number of floating-point operations per inference), and capable of increasing the network’s performance. When applied to standard network architectures on a wide variety of datasets, our approach discovers novel structures in each domain, obtaining higher performance while respecting the resource constraint.
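The expansion step can be illustrated with a small sketch. It assumes, purely for illustration, a fully connected chain whose cost is the sum of products of adjacent layer widths, and it picks the largest uniform multiplier that keeps this cost within a budget. The names `flops` and `expand_uniform`, the cost model, and the bisection search are hypothetical and are not taken from the paper.

```python
def flops(widths):
    """Multiply-accumulate count of a dense chain (assumed cost model)."""
    return sum(a * b for a, b in zip(widths, widths[1:]))


def expand_uniform(hidden, n_in, n_out, budget, iters=60):
    """Find the largest uniform width multiplier that respects the budget.

    `hidden` holds the hidden-layer widths left after the shrinking phase;
    input and output sizes are kept fixed.
    """
    if not hidden:
        raise ValueError("need at least one hidden layer to expand")

    def scaled(omega):
        # Every hidden layer is scaled by the same factor, at least width 1.
        return [n_in] + [max(1, int(omega * h)) for h in hidden] + [n_out]

    if flops(scaled(0.0)) > budget:
        raise ValueError("budget too small even for the narrowest network")

    lo, hi = 0.0, 1.0
    while flops(scaled(hi)) <= budget:  # grow until the budget is exceeded
        hi *= 2.0
    for _ in range(iters):  # bisection: cost is non-decreasing in omega
        mid = (lo + hi) / 2.0
        if flops(scaled(mid)) <= budget:
            lo = mid
        else:
            hi = mid
    return lo, scaled(lo)


omega, widths = expand_uniform([10, 5], n_in=4, n_out=2, budget=400)
print(omega, widths, flops(widths))
```

Alternating a sparsifying shrink phase with this kind of uniform expansion is what lets the relative layer sizes change while the total cost stays near the target.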
Deep Gaussian Mixture Models
Deep learning is a hierarchical inference method formed by subsequent multiple layers of learning able to more efficiently describe complex relationships. In this work, Deep Gaussian Mixture Models are introduced and discussed. A Deep Gaussian Mixture model (DGMM) is a network of multiple layers of latent variables, where, at each layer, the variables follow a mixture of Gaussian distributions. Thus, the deep mixture model consists of a set of nested mixtures of linear models, which globally provide a nonlinear model able to describe the data in a very flexible way. In order to avoid overparameterized solutions, dimension reduction by factor models can be applied at each layer of the architecture thus resulting in deep mixtures of factor analysers.
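The nested structure described above can be sketched as a sampling procedure. The example below is a one-dimensional toy version under stated assumptions: each layer picks one mixture component and applies a linear map plus Gaussian noise to the variable from the layer below. The function name `sample_dgmm` and the `(weight, eta, lam, sigma)` parameter layout are hypothetical, chosen only for illustration.

```python
import random


def sample_dgmm(layers, rng):
    """Draw one observation from a 1-D deep Gaussian mixture (toy sketch).

    `layers` is ordered from the deepest layer to the observed one. Each
    layer is a list of components (weight, eta, lam, sigma), meaning
    z_out = eta + lam * z_in + N(0, sigma^2) with probability `weight`.
    """
    z = rng.gauss(0.0, 1.0)  # deepest latent variable is standard normal
    path = []
    for components in layers:
        weights = [w for w, _, _, _ in components]
        idx = rng.choices(range(len(components)), weights=weights)[0]
        _, eta, lam, sigma = components[idx]
        z = eta + lam * z + rng.gauss(0.0, sigma)  # linear model plus noise
        path.append(idx)
    return z, path


rng = random.Random(0)
layers = [
    [(0.5, -2.0, 1.0, 0.3), (0.5, 2.0, 0.5, 0.3)],  # deeper layer
    [(0.7, 0.0, 1.0, 0.1), (0.3, 5.0, -1.0, 0.1)],  # observed layer
]
x, path = sample_dgmm(layers, rng)
print(x, path)
```

Each path through the layers corresponds to one Gaussian component of the overall model, so two layers with two components each already give a four-component mixture with shared parameters.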
Interleaver Design for Deep Neural Networks
We propose a class of interleavers for a novel deep neural network (DNN) architecture that uses algorithmically pre-determined, structured sparsity to significantly lower memory and computational requirements, and speed up training. The interleavers guarantee clash-free memory accesses to eliminate idle operational cycles, optimize spread and dispersion to improve network performance, and are designed to ease the complexity of memory address computations in hardware. We present a design algorithm with mathematical proofs for these properties. We also explore interleaver variations and analyze the behavior of neural networks as a function of interleaver metrics.
Decentralized High-Dimensional Bayesian Optimization with Factor Graphs
This paper presents a novel decentralized high-dimensional Bayesian optimization (DEC-HBO) algorithm that, in contrast to existing HBO algorithms, can exploit the interdependent effects of various input components on the output of the unknown objective function f for boosting the BO performance and still preserve scalability in the number of input dimensions without requiring prior knowledge or the existence of a low (effective) dimension of the input space. To realize this, we propose a sparse yet rich factor graph representation of f to be exploited for designing an acquisition function that can be similarly represented by a sparse factor graph and hence be efficiently optimized in a decentralized manner using distributed message passing. Despite richly characterizing the interdependent effects of the input components on the output of f with a factor graph, DEC-HBO can still guarantee no-regret performance asymptotically. Empirical evaluation on synthetic and real-world experiments (e.g., sparse Gaussian process model with 1811 hyperparameters) shows that DEC-HBO outperforms the state-of-the-art HBO algorithms.
Prior-aware Dual Decomposition: Document-specific Topic Inference for Spectral Topic Models
Spectral topic modeling algorithms operate on matrices/tensors of word co-occurrence statistics to learn topic-specific word distributions. This approach removes the dependence on the original documents and produces substantial gains in efficiency and provable topic inference, but at a cost: the model can no longer provide information about the topic composition of individual documents. Recently, Thresholded Linear Inverse (TLI) was proposed to map the observed words of each document back to its topic composition. However, its linear characteristics limit the inference quality, as it does not consider the important prior information over topics. In this paper, we evaluate the Simple Probabilistic Inverse (SPI) method and the novel Prior-aware Dual Decomposition (PADD), which is capable of learning document-specific topic compositions in parallel. Experiments show that PADD successfully leverages topic correlations as a prior, notably outperforming TLI and learning quality topic compositions comparable to Gibbs sampling on various data.
Structured Stein Variational Inference for Continuous Graphical Models
We propose a novel distributed inference algorithm for continuous graphical models by extending Stein variational gradient descent (SVGD) to leverage the Markov dependency structure of the distribution of interest. The idea is to use a set of local kernel functions over the Markov blanket of each node, which alleviates the problem of the curse of high dimensionality and simultaneously yields a distributed algorithm for decentralized inference tasks. We justify our method with theoretical analysis and show that the use of local kernels can be viewed as a new type of localized approximation that matches the target distribution on the conditional distributions of each node over its Markov blanket. Our empirical results demonstrate that our method outperforms a variety of baselines including standard MCMC and particle message passing methods.
Classification with Costly Features using Deep Reinforcement Learning
We study a classification problem where each feature can be acquired for a cost and the goal is to optimize the trade-off between classification precision and the total feature cost. We frame the problem as a sequential decision-making problem, where we classify one sample in each episode. At each step, an agent can use values of acquired features to decide whether to purchase another one or whether to classify the sample. We use vanilla Double Deep Q-learning, a standard reinforcement learning technique, to find a classification policy. We show that this generic approach outperforms Adapt-Gbrt, currently the best-performing algorithm developed specifically for classification with costly features.
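The sequential framing can be sketched as a one-sample episode. The class below is a minimal illustration, not the authors' environment: the name `CostlyFeatureEpisode`, the trade-off parameter `lam`, and the reward values (minus `lam` times the feature cost for a purchase, 0 for a correct classification, -1 for a wrong one) are assumptions made for this example.

```python
class CostlyFeatureEpisode:
    """One episode: classify a single sample while paying for features."""

    def __init__(self, features, label, costs, lam):
        self.features = features          # true feature values, hidden at start
        self.label = label                # true class of the sample
        self.costs = costs                # price of each feature
        self.lam = lam                    # assumed cost/accuracy trade-off
        self.acquired = [False] * len(features)
        self.done = False

    def observation(self):
        # Features not yet purchased are masked with None.
        return [x if seen else None
                for x, seen in zip(self.features, self.acquired)]

    def acquire(self, i):
        """Buy feature i; returns (observation, reward)."""
        if self.done or self.acquired[i]:
            raise ValueError("invalid action")
        self.acquired[i] = True
        return self.observation(), -self.lam * self.costs[i]

    def classify(self, predicted):
        """Terminal action; returns the reward and ends the episode."""
        if self.done:
            raise ValueError("episode already finished")
        self.done = True
        return 0.0 if predicted == self.label else -1.0


episode = CostlyFeatureEpisode([0.5, 1.5], label=1, costs=[1.0, 2.0], lam=0.1)
obs, reward = episode.acquire(1)
print(obs, reward, episode.classify(1))
```

A Q-learning agent would take the masked observation as its state and choose among the remaining purchase actions and the class labels, so the learned policy decides per sample how many features are worth buying.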
Deep Approximately Orthogonal Nonnegative Matrix Factorization for Clustering
Nonnegative Matrix Factorization (NMF) is a widely used technique for data representation. Inspired by the expressive power of deep learning, several NMF variants equipped with deep architectures have been proposed. However, these methods mostly use only nonnegativity while ignoring task-specific features of data. In this paper, we propose a novel deep approximately orthogonal nonnegative matrix factorization method where both nonnegativity and orthogonality are imposed with the aim of performing hierarchical clustering by using different levels of abstraction of data. Experiments on two face image datasets showed that the proposed method achieved better clustering performance than other deep matrix factorization methods and state-of-the-art single layer NMF variants.
Bidirectional Conditional Generative Adversarial Networks
Conditional variants of Generative Adversarial Networks (GANs), known as cGANs, are generative models that can produce data samples ($x$) conditioned on both latent variables ($z$) and known auxiliary information ($c$). Another GAN variant, Bidirectional GAN (BiGAN), is a recently developed framework for learning the inverse mapping from $x$ to $z$ through an encoder trained simultaneously with the generator and the discriminator of an unconditional GAN. We propose the Bidirectional Conditional GAN (BCGAN), which combines cGANs and BiGANs into a single framework with an encoder that learns inverse mappings from $x$ to both $z$ and $c$, trained simultaneously with the conditional generator and discriminator in an end-to-end setting. We present crucial techniques for training BCGANs, which incorporate an extrinsic factor loss along with an associated dynamically-tuned importance weight. As compared to other encoder-based GANs, BCGANs not only encode $c$ more accurately but also utilize $z$ and $c$ more effectively and in a more disentangled way to generate data samples.
Better Agnostic Clustering Via Relaxed Tensor Norms
We develop a new family of convex relaxations for $k$-means clustering based on sum-of-squares norms, a relaxation of the injective tensor norm that is efficiently computable using the Sum-of-Squares algorithm. We give an algorithm based on this relaxation that recovers a faithful approximation to the true means in the given data whenever the low-degree moments of the points in each cluster have bounded sum-of-squares norms. We then prove a sharp upper bound on the sum-of-squares norms for moment tensors of any distribution that satisfies the Poincaré inequality. The Poincaré inequality is a central inequality in probability theory, and a large class of distributions satisfy it including Gaussians, product distributions, strongly log-concave distributions, and any sum or uniformly continuous transformation of such distributions. As an immediate corollary, for any $\gamma > 0$, we obtain an efficient algorithm for learning the means of a mixture of $k$ arbitrary Poincaré distributions in $\mathbb{R}^d$ in time $d^{O(1/\gamma)}$ so long as the means have separation $\Omega(k^{\gamma})$. This in particular yields an algorithm for learning Gaussian mixtures with separation $\Omega(k^{\gamma})$, thus partially resolving an open problem of Regev and Vijayaraghavan. Our algorithm works even in the outlier-robust setting where an $\epsilon$ fraction of arbitrary outliers are added to the data, as long as the fraction of outliers is smaller than the smallest cluster. We, therefore, obtain results in the strong agnostic setting where, in addition to not knowing the distribution family, the data itself may be arbitrarily corrupted.
• Recovering Lexicographic Triangulations
• Fusing Bird View LIDAR Point Cloud and Front View Camera Image for Deep Object Detection
• Learning Discriminative Affine Regions via Discriminability
• Maximum-norm a posteriori error estimates for an optimal control problem
• Manifold learning with bi-stochastic kernels
• Integrating Disparate Sources of Experts for Robust Image Denoising
• Techniques for proving Asynchronous Convergence results for Markov Chain Monte Carlo methods
• Quarnet inference rules for level-1 networks
• 3D object classification and retrieval with Spherical CNNs
• Phonological (un)certainty weights lexical activation
• Information Gathering with Peers: Submodular Optimization with Peer-Prediction Constraints
• Principal Manifolds of Middles: A Framework and Estimation Procedure Using Mixture Densities
• Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks
• Deep supervised learning using local errors
• Improving particle filter performance with a generalized random field model of observation errors
• Backward induction in presence of cycles
• Generation and Consolidation of Recollections for Efficient Deep Lifelong Learning
• Addressing Expensive Multi-objective Games with Postponed Preference Articulation via Memetic Co-evolution
• Image Registration of Very Large Images via Genetic Programming
• A Two-Phase Genetic Algorithm for Image Registration
• Genetic Algorithm-Based Solver for Very Large Multiple Jigsaw Puzzles of Unknown Dimensions and Piece Orientation
• An Automatic Solver for Very Large Jigsaw Puzzles Using Genetic Algorithms
• A Generalized Genetic Algorithm-Based Solver for Very Large Jigsaw Puzzles of Complex Types
• A Genetic Algorithm-Based Solver for Very Large Jigsaw Puzzles
• Approximate Gradient Coding via Sparse Random Graphs
• Separable discrete functions: recognition and sufficient conditions
• Game Theoretic Analysis of Auction Mechanisms Modeled by Constrained Optimization Problems
• Excitation Backprop for RNNs
• Machine Learning Approaches for Traffic Volume Forecasting: A Case Study of the Moroccan Highway Network
• Exact alignment recovery for correlated Erdos Renyi graphs
• A primal-dual algorithm with optimal stepsizes and its application in decentralized consensus optimization
• Measuring Territorial Control in Civil Wars Using Hidden Markov Models: A Data Informatics-Based Approach
• Learning Aggregated Transmission Propagation Networks for Haze Removal and Beyond
• MinimalRNN: Toward More Interpretable and Trainable Recurrent Neural Networks
• Enumeration of Some Closed Knight Paths
• Co-attending Free-form Regions and Detections with Multi-modal Multiplicative Feature Embedding for Visual Question Answering
• Prediction Scores as a Window into Classifier Behavior
• Short proofs for generalizations of the Lovász Local Lemma: Shearer’s condition and cluster expansion
• Scalable Relaxations of Sparse Packing Constraints: Optimal Biocontrol in Predator-Prey Network
• Reduction of total-cost and average-cost MDPs with weakly continuous transition probabilities to discounted MDPs
• Fast Monte Carlo Markov chains for Bayesian shrinkage models with random effects
• A Color Quantization Optimization Approach for Image Representation Learning
• Household poverty classification in data-scarce environments: a machine learning approach
• Convex Set of Doubly Substochastic Matrices
• Acquiring Common Sense Spatial Knowledge through Implicit Spatial Templates
• A novel Topological Model for Nonlinear Analysis and Prediction for Observations with Recurring Patterns
• Low-dimensional Embeddings for Interpretable Anchor-based Topic Inference
• Continuous-state branching processes with competition: Duality and Reflection at Infinity
• Transferable Semi-supervised Semantic Segmentation
• Random Access in Massive MIMO by Exploiting Timing Offsets and Excess Antennas
• Proximal Gradient Method with Extrapolation and Line Search for a Class of Nonconvex and Nonsmooth Problems
• Neural Network Reinforcement Learning for Audio-Visual Gaze Control in Human-Robot Interaction
• Genetic Algorithms for Mentor-Assisted Evaluation Function Optimization
• Simulating Human Grandmasters: Evolution and Coevolution of Evaluation Functions
• Expert-Driven Genetic Algorithms for Simulating Evaluation Functions
• Evaluating Roles of Central Users in Online Communication Networks: A Case Study of #PanamaLeaks
• Local Clustering Coefficient of Spatial Preferential Attachment Model
• DLTK: State of the Art Reference Implementations for Deep Learning on Medical Images
• Style Transfer in Text: Exploration and Evaluation
• From Common to Special: When Multi-Attribute Learning Meets Personalized Opinions
• Bio-Inspired Local Information-Based Control for Probabilistic Swarm Distribution Guidance
• Anonymous Hedonic Game for Task Allocation in a Large-Scale Multiple Agent System
• Automatically Extracting Action Graphs from Materials Science Synthesis Procedures
• Learning Dynamics and the Co-Evolution of Competing Sexual Species
• Fission-fusion dynamics and group-size dependent composition in heterogeneous populations
• Fully Dynamic Almost-Maximal Matching: Breaking the Polynomial Barrier for Worst-Case Time Bounds
• Learning to select computations
• Is China Entering WTO or shijie maoyi zuzhi–a Corpus Study of English Acronyms in Chinese Newspapers
• Inversion of Tchebychev-Tchernov inequality
• Single-Shot Refinement Neural Network for Object Detection
• The Cultural Evolution of National Constitutions
• On the second largest Laplacian eigenvalue of graph
• Collective gradient sensing in fish schools
• Optimal Stopping for Interval Estimation in Bernoulli Trials
• Joint User Scheduling and Beam Selection Optimization for Beam-Based Massive MIMO Downlinks
• Gazing into the Abyss: Real-time Gaze Estimation
• Shifted tableaux crystals
• Superlinear Lower Bounds for Distributed Subgraph Detection
• Run, skeleton, run: skeletal model in a physics-based simulation
• The Bayes Lepski’s Method and Credible Bands through Volume of Tubular Neighborhoods
• Computational Results for Extensive-Form Adversarial Team Games
• Average-case Approximation Ratio of Scheduling without Payments
• Macdonald-positive specializations of the algebra of symmetric functions: Proof of the Kerov conjecture
• Robust Synthetic Control
• Node Profiles of Symmetric Digital Search Trees
• An extension to the theory of controlled Lagrangians using the Helmholtz conditions
• A novel total variation model based on kernel functions and its application
• Approximating geodesics via random points
• A systematic framework to discover pattern for web spam classification
• BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning
• The Strength of Multi-row Aggregation Cuts for Sign-pattern Integer Programs
• Cyclone: High Availability for Persistent Key Value Stores
• Intelligent Word Embeddings of Free-Text Radiology Reports
• Unsupervised Domain Adaptation for Semantic Segmentation with GANs
• How much is my car worth? A methodology for predicting used cars prices using Random Forest
• MIT Autonomous Vehicle Technology Study: Large-Scale Deep Learning Based Analysis of Driver Behavior and Interaction with Automation
• Enhanced Group Sparse Beamforming for Green Cloud-RAN: A Random Matrix Approach
• Sequential Randomized Matrix Factorization for Gaussian Processes: Efficient Predictions and Hyper-parameter Optimization
• Kill Two Birds with One Stone: Weakly-Supervised Neural Network for Image Annotation and Tag Refinement
• A note on quadratic approximations of logistic log-likelihoods
• Convergence Analysis of the Dynamics of a Special Kind of Two-Layered Neural Networks with $\ell_1$ and $\ell_2$ Regularization
• Probabilistic approach to quantum separation effect for Feynman-Kac semigroup
• Coherence-based Time Series Clustering for Brain Connectivity Visualization
• A Discourse-Level Named Entity Recognition and Relation Extraction Dataset for Chinese Literature Text
• MicroExpNet: An Extremely Small and Fast Model For Expression Recognition From Frontal Face Images
• A note on Hadamard fractional differential equations with varying coefficients and their applications in probability
• Incorporating Syntactic Uncertainty in Neural Machine Translation using a Forest-to-Sequence Model
• Zero Dynamics for Port-Hamiltonian Systems
• Extremal graphs with respect to the total-eccentricity index
• Image-Image Domain Adaptation with Preserved Self-Similarity and Domain-Dissimilarity for Person Re-identification
• Mixed-integer linear representability, disjunctions, and Chvatal functions — modeling implications
• Universal Cycles of Restricted Words
• Normal Representations of Hyperplane Arrangements Over a Field with $1-ad$ Structure and Convex Positive Bijections
• Two-level schemes for the advection equation
• A Coordinate-wise Optimization Algorithm for Sparse Inverse Covariance Selection
• An Improved Oscillating-Error Classifier with Branching
• A Classifying Variational Autoencoder with Application to Polyphonic Music Generation
• An Approximating Control Design for Optimal Mixing by Stokes Flows
• A New Form of Williamson’s Product Theorem
• Morphisms of open games
• DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks
• Diverse and Accurate Image Description Using a Variational Auto-Encoder with an Additive Gaussian Encoding Space
• The destiny of constant structure discrete time closed semantic systems
• Node Balanced Steady States: Unifying and Generalizing Complex and Detailed Balanced Steady States
• On convergence rate for an infinite-channel queuing system with Poisson input flow
• Does mitigating ML’s disparate impact require disparate treatment?
• Estimation Considerations in Contextual Bandits
• Equiangular tight frames that contain regular simplices
• Second-Order Variational Analysis of Parametric Constraint and Variational Systems
• Superexponential estimates and weighted lower bounds for the square function
• Compression-Based Regularization with an Application to Multi-Task Learning
• Probabilistic and Combinatorial Interpretations of the Bernoulli Symbol
• Eigenvectors distribution and quantum unique ergodicity for deformed Wigner matrices
• A Double Parametric Bootstrap Test for Topic Models
• A note on quasi-convex functions
• The invariant measure and the flow associated to the $Φ^4_3$-quantum field model
• Modeling Epistemological Principles for Bias Mitigation in AI Systems: An Illustration in Hiring Decisions
• Deletion-Robust Submodular Maximization at Scale
• On the Stability of a N-class Aloha Network
• Hello Edge: Keyword Spotting on Microcontrollers
• CleanNet: Transfer Learning for Scalable Image Classifier Training with Label Noise
• Critique of Barbosa’s ‘P != NP Proof’
• Robust Non-line-of-sight Imaging with Single Photon Detectors
• Schlegel Diagram and Optimizable Immediate Snapshot Protocol
• Nonparametric Double Robustness
• Optimal binary linear locally repairable codes with disjoint repair groups
• On the Global Fluctuations of Block Gaussian Matrices
• Spectral-Spatial Feature Extraction and Classification by ANN Supervised with Center Loss in Hyperspectral Imagery
• On $e$-positivity and $e$-unimodality of chromatic quasisymmetric functions
• Interactive, Intelligent Tutoring for Auxiliary Constructions in Geometry Proofs
• Let Features Decide for Themselves: Feature Mask Network for Person Re-identification
• Dynamic Neural Program Embedding for Program Repair
• Parameter Reference Loss for Unsupervised Domain Adaptation
• On the Feasibility of Interference Alignment in Compounded MIMO Broadcast Channels with Antenna Correlation and Mixed User Classes
• Polyhedral parametrizations of canonical bases & cluster duality
• Non-reversible, tuning- and rejection-free Markov chain Monte Carlo via iterated random functions
• Is prioritized sweeping the better episodic control?
• On a stochastic Hardy-Littlewood-Sobolev inequality with application to Strichartz estimates for the white noise dispersion
• Block-Cyclic Stochastic Coordinate Descent for Deep Neural Networks
• Softening and Yielding of Soft Glassy Materials
• Method to Design UF-OFDM Filter and its Analysis
• A new class of tests for multinormality with i.i.d. and Garch data based on the empirical moment generating function
• End-to-end Trained CNN Encode-Decoder Networks for Image Steganography
• List-Decodable Robust Mean Estimation and Learning Mixtures of Spherical Gaussians
• Maximizing Non-monotone/Non-submodular Functions by Multi-objective Evolutionary Algorithms
• Lefschetz and Lower Bound theorems for Minkowski sums
• Model Extraction Warning in MLaaS Paradigm
• Generalized Dual Dynamic Programming for Infinite Horizon Problems in Continuous State and Action Spaces
• Linear-Complexity Relaxed Word Mover’s Distance with GPU Acceleration
• Finite Time Analysis of Optimal Adaptive Policies for Linear-Quadratic Systems
• Stochastic metamorphosis with template uncertainties
• Statistics of the Voronoi cell perimeter in large bi-pointed maps
• Tracking in Aerial Hyperspectral Videos using Deep Kernelized Correlation Filters
• MegDet: A Large Mini-Batch Object Detector
• Optical Character Recognition (OCR) for Telugu: Database, Algorithm and Application
• Face Attention Network: An effective Face Detector for the Occluded Faces
• Finite Horizon Robustness Analysis of LTV Systems Using Integral Quadratic Constraints
• On the optimality of the uniform random strategy
• Light-Head R-CNN: In Defense of Two-Stage Object Detector
• Fast BTG-Forest-Based Hierarchical Sub-sentential Alignment
• Evaluating the Performance of eMTC and NB-IoT for Smart City Applications
• A Separation Between Run-Length SLPs and LZ77
• Positive semi-definite embedding for dimensionality reduction and out-of-sample extensions
• Facets, Tiers and Gems: Ontology Patterns for Hypernormalisation
• Speech recognition for medical conversations
• Backscatter Communications for the Internet of Things: A Stochastic Geometry Approach
• Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments
• Quantum Query Algorithms are Completely Bounded Forms
• Non-exchangeable random partition models for microclustering
• When Fourth Moments Are Enough
• Learning Steerable Filters for Rotation Equivariant CNNs
• Bitmap Filter: Speeding up Exact Set Similarity Joins with Bitwise Operations
• Optimization-Based Autonomous Racing of 1:43 Scale RC Cars
• Zero-shot Learning via Shared-Reconstruction-Graph Pursuit
• Solution of network localization problem with noisy distances and its convergence
• Performance of In-band Transmission of System Information in Massive MIMO Systems
• Cooperative Games With Bounded Dependency Degree
• Detection of Tooth caries in Bitewing Radiographs using Deep Learning
• A Note on Helffer-Sjöstrand Representation for A Ginzburg-Landau Process
• Cascaded Pyramid Network for Multi-Person Pose Estimation
• Proof Complexity Meets Algebra
• On DNA Codes using the Ring Z4 + wZ4
• Bayesian Active Edge Evaluation on Expensive Graphs
• Robust Decentralized Secondary Frequency Control in Power Systems: Merits and Trade-Offs
• Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks
• Community detection with spiking neural networks for neuromorphic hardware
• Pixel-wise object tracking
• Wasserstein and Kolmogorov error bounds for variance-gamma approximation via Stein’s method I
• Spectral distribution of the free Jacobi process, revisited
• Adaptive M-QAM for Indoor Wireless Environments : Rate & Power Adaptation
• How morphological development can guide evolution
• V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map
• Non-Contextual Modeling of Sarcasm using a Neural Network Benchmark
• Disentangling Factors of Variation by Mixing Them
• Robust Seed Mask Generation for Interactive Image Segmentation
• Outliers in the spectrum for products of independent random matrices
• Informed proposals for local MCMC in discrete spaces
• Modular Continual Learning in a Unified Visual Environment
• Joint Object Category and 3D Pose Estimation from 2D Images
• Action Recognition with Coarse-to-Fine Deep Feature Integration and Asynchronous Fusion
• A local graph rewiring algorithm for sampling spanning trees
• Relaxed Oracles for Semi-Supervised Clustering
• On Convergence of Epanechnikov Mean Shift
• On tight cycles in hypergraphs
• A generalised framework for detailed classification of swimming paths inside the Morris Water Maze
• Subcritical multitype branching process in random environment
• Mixture Models, Robustness, and Sum of Squares Proofs
• Families of nested graphs with compatible symmetric-group actions
• Matrix Factorization for Nonparametric Multi-source Localization Exploiting Unimodal Properties
• SquishedNets: Squishing SqueezeNet further for edge device scenarios via deep evolutionary synthesis
• Glitch Classification and Clustering for LIGO with Deep Transfer Learning