• Thermal conductivity in 1d: disorder-induced transition from anomalous to normal scaling
• Temporally-Biased Sampling for Online Model Management
• tempoGAN: A Temporally Coherent, Volumetric GAN for Super-resolution Fluid Flow
• Object-based reasoning in VQA
• A Generalized Circuit for the Hamiltonian Dynamics Through the Truncated Series
• Model selection in sparse high-dimensional vine copula models with application to portfolio risk
• A Corpus for Modeling Word Importance in Spoken Dialogue Transcripts
• Distributed Model Construction in Radio Interferometric Calibration
• Deep Learning based Retinal OCT Segmentation
• Denoising Arterial Spin Labeling Cerebral Blood Flow Images Using Deep Learning
• Quantum Coarse-Graining, Symmetries and Reducibility of Dynamics
• Diffeomorphic registration of discrete geometric distributions
• Multicritical point on the de Almeida-Thouless line in spin glasses in $d>6$ dimensions
• Bounded Policy Synthesis for POMDPs with Safe-Reachability Objectives
• Evaluating approaches for supervised semantic labeling
• FEAST Eigensolver for Nonlinear Eigenvalue Problems
• Reparametrization of COM-Poisson Regression Models with Applications in the Analysis of Experimental Data
• Communication-Efficient Search for an Approximate Closest Lattice Point
• Earthmover Resilience and Testing in Ordered Structures
• Matrix Completion for Low-Observability Voltage Estimation
• Automatically Leveraging MapReduce Frameworks for Data-Intensive Applications
• Predicting Rapid Fire Growth (Flashover) Using Conditional Generative Adversarial Networks
• Algorithms for the Construction of Incoherent Frames Under Various Design Constraints
• The Intriguing Properties of Model Explanations
• A distributed-memory approximation algorithm for maximum weight perfect bipartite matching
• Personalized Survival Prediction with Contextual Explanation Networks
• Subgraph counts for dense random graphs with specified degrees
• Spatiotemporal intermittency and localized dynamic fluctuations upon approaching the glass transition
• Model-assisted inference for treatment effects using regularized calibrated estimation with high-dimensional data
• Learning to Emulate an Expert Projective Cone Scheduler
• Object Detection in Videos by Short and Long Range Object Linking
• On the global stability of departure time user equilibrium: A Lyapunov approach
• Robustness of classification ability of spiking neural networks
• Weighted Community Detection and Data Clustering Using Message Passing
• Mixture Proportion Estimation for Positive–Unlabeled Learning via Classifier Dimension Reduction
• Antenna Selection for Large-Scale MIMO Systems with Low-Resolution ADCs
• Open3D: A Modern Library for 3D Data Processing
• Over-representation of Extreme Events in Decision-Making: A Rational Metacognitive Account
• Sparsity in Max-Plus Algebra and Systems
• Cross-type Biomedical Named Entity Recognition with Deep Multi-Task Learning
• Structured Memory based Deep Model to Detect as well as Characterize Novel Inputs
• Accelerating recurrent neural network language model based online speech recognition system
• New characterizations of freeness for hyperplane arrangements
• Fast Power system security analysis with Guided Dropout
• An infinite family of subcubic graphs with unbounded packing chromatic number
• Boundary effect in competition processes
• Comparison of robustness of statistical procedures for network structure analysis
• Estimation of conditional extreme risk measures from heavy-tailed elliptical random vectors
• Contribution of the Extreme Term in the Sum of Samples with Regularly Varying Tail
• SIR Coverage Analysis in Cellular Networks with Temporal Traffic: A Stochastic Geometry Approach
• Variational and viscosity operators for the evolutive Hamilton-Jacobi equation
• Bayesian inverse problems with unknown operators
• Pilot study for the COST Action ‘Reassembling the Republic of Letters’: language-driven network analysis of letters from the Hartlib’s Papers
• Ito’s Formula for Gaussian Processes with Stochastic Discontinuities
• Approximate ground states of the random-field Potts model from graph cuts
• Properties of additive functionals of Brownian motion with resetting
• A Dynamic Process Interpretation of the Sparse ERGM Reference Model
• Input / Output Stability of a Damped String Equation coupled with Ordinary Differential System
• E2E-MLT – an Unconstrained End-to-End Method for Multi-Language Scene Text
• Analytical modeling and analysis of interleaving on correlated wireless channels
• Diagnose like a Radiologist: Attention Guided Convolutional Neural Network for Thorax Disease Classification
• The Necklace Process: A Generating Function Approach
• PEYMA: A Tagged Corpus for Persian Named Entities
• Fast Binary Compressive Sensing via $\ell_0$ Gradient Descent
• Large Deviations in Renewal Theory and Renewal Models of Statistical Mechanics
• Playing with universality classes of Barkhausen avalanches
• Nonparametric Bayesian volatility estimation
• Modeling Influence with Semantics in Social Networks: a Survey
• Secure and Robust Identification via Classical-Quantum Channels
• Social Event Scheduling
• Preparation of Improved Turkish DataSet for Sentiment Analysis in Social Media
• Extensions of Erdős-Gallai Theorem and Luo’s Theorem with Applications
• Operator Product Expansion in Liouville Field Theory and Seiberg type transitions in log-correlated Random Energy Models
• Standard modules, Jones-Wenzl projectors, and the valenced Temperley-Lieb algebra
• Benjamini-Schramm convergence of random planar maps
• Cardiac Arrhythmia Detection from ECG Combining Convolutional and Long Short-Term Memory Networks
• An Iterative Spanning Forest Framework for Superpixel Segmentation
• Analysis and optimal control of an intracellular delayed HIV model with CTL immune response
• Features, Projections, and Representation Change for Generalized Planning
• Uplink and Downlink Transceiver Design for OFDM with Index Modulation in Multi-user Networks
• Rigorous Restricted Isometry Property of Low-Dimensional Subspaces
• Deep Adversarial Attention Alignment for Unsupervised Domain Adaptation: the Benefit of Target Expectation Maximization
• Spectrum of SYK model
• A Machine Learning Approach to Quantitative Prosopography
• Modelling structure and predicting dynamics of discussion threads in online boards
• Performance of Media-based Modulation in Multi-user Networks
• Creative Exploration Using Topic Based Bisociative Networks
• An SPDE Model for Systemic Risk with Endogenous Contagion
• Asymptotic Analysis for Low-Resolution Massive MIMO Systems with MMSE Receiver
• Indistinguishable binomial decision tree of 3-SAT: Proof of class P is a proper subset of class NP
• TransRev: Modeling Reviews as Translations from Users to Items
• Graph limits of random unlabelled $k$-trees
• SegDenseNet: Iris Segmentation for Pre and Post Cataract Surgery
• Error estimates for spectral convergence of the graph Laplacian on random geometric graphs towards the Laplace–Beltrami operator
• Surprise in Elections
• Video-based Sign Language Recognition without Temporal Segmentation
• Greedy Morse matchings and discrete smoothness
• An Incremental Path-Following Splitting Method for Linearly Constrained Nonconvex Nonsmooth Programs
• Image Captioning at Will: A Versatile Scheme for Effectively Injecting Sentiments into Image Descriptions
• Universality for zeros of random polynomials
• Information Measures for Microphone Arrays
• Spherical CNNs
• Long scale Ollivier-Ricci curvature of graphs
• Random Access Communication for Wireless Control Systems with Energy Harvesting Sensors
Artificial neural networks have powerful inference ability, but they are still criticized for their lack of interpretability and their need for large datasets. This paper proposes the Rule-embedded Neural Network (ReNN) to overcome these shortcomings. ReNN first makes local-based inferences to detect local patterns, and then uses rules based on domain knowledge about the local patterns to generate a rule-modulated map. After that, ReNN makes global-based inferences that synthesize the local patterns and the rule-modulated map. To solve the optimization problem caused by rules, we use a two-stage optimization strategy to train the ReNN model. By introducing rules into ReNN, we can strengthen traditional neural networks with long-term dependencies that are difficult to learn from a limited empirical dataset, thus improving inference accuracy. The complexity of the neural networks can be reduced since long-term dependencies are not modeled with neural connections, and thus the amount of data needed to optimize the neural networks can be reduced. Moreover, inferences from ReNN can be analyzed in terms of both local patterns and rules, and thus have better interpretability. In this paper, ReNN is validated on a time-series detection problem.
This paper considers the problem of testing if a sequence of means $(\mu_t)$ of a non-stationary time series is stable in the sense that the difference of the means $\mu_1$ and $\mu_t$ between the initial time and any other time $t$ is smaller than a given level $c$, that is, $|\mu_1 - \mu_t| \le c$ for all $t$. A test for hypotheses of this type is developed using a bias-corrected monotone rearranged local linear estimator, and asymptotic normality of the corresponding test statistic is established. As the asymptotic variance depends on the location and order of the critical roots of the equation $|\mu_1 - \mu_t| = c$, a new bootstrap procedure is proposed to obtain critical values and its consistency is established. As a consequence we are able to quantitatively describe relevant deviations of a non-stationary sequence from its initial value. The results are illustrated by means of a simulation study and by analyzing data examples.
Anomaly detection is the practice of identifying items or events that do not conform to an expected behavior or do not correlate with other items in a dataset. It has previously been applied to areas such as intrusion detection, system health monitoring, and fraud detection in credit card transactions. In this paper, we describe a new method for detecting anomalous behavior over network performance data, gathered by perfSONAR, using two machine learning algorithms: Boosted Decision Trees (BDT) and Simple Feedforward Neural Network. The effectiveness of each algorithm was evaluated and compared. Both have shown sufficient performance and sensitivity.
Most work in the deep learning systems community has focused on faster inference, but arriving at a trained model requires lengthy experiments. Accelerating training lets developers iterate faster and come up with better models. DNN training is often seen as a compute-bound problem, best done in a single large compute node with many GPUs. As DNNs get bigger, training requires going distributed. Distributed deep neural network (DDNN) training constitutes an important workload on the cloud. Larger DNN models and faster compute engines shift the training performance bottleneck from computation to communication. Our experiments show that existing DNN training frameworks do not scale in a typical cloud environment due to insufficient bandwidth and inefficient parameter server software stacks. We propose PHub, a high performance parameter server (PS) software design that provides an optimized network stack and a streamlined gradient processing pipeline to benefit common PS setups, and PBox, a balanced, scalable central PS hardware that fully utilizes PHub capabilities. We show that in a typical cloud environment, PHub can achieve up to 3.8x speedup over state-of-the-art designs when training ImageNet. We discuss future directions of integrating PHub with programmable switches for in-network aggregation during training, leveraging the datacenter network topology to reduce bandwidth usage and localize data movement.
In analyzing big data for finite population inference, it is critical to adjust for the selection bias in the big data. In this paper, we propose two methods of reducing the selection bias associated with the big data sample. The first method uses a version of inverse sampling by incorporating auxiliary information from external sources, and the second one borrows the idea of data integration by combining the big data sample with an independent probability sample. Two simulation studies show that the proposed methods are unbiased and have better coverage rates than their alternatives. In addition, the proposed methods are easy to implement in practice.
Recurrent models for sequences have recently been successful at many tasks, especially language modeling and machine translation. Nevertheless, it remains challenging to extract good representations from these models. For instance, even though language has a clear hierarchical structure going from characters through words to sentences, it is not apparent in current language models. We propose to improve the representation in sequence models by augmenting current approaches with an autoencoder that is forced to compress the sequence through an intermediate discrete latent space. In order to propagate gradients through this discrete representation we introduce an improved semantic hashing technique. We show that this technique performs well on a newly proposed quantitative efficiency measure. We also analyze latent codes produced by the model, showing how they correspond to words and phrases. Finally, we present an application of the autoencoder-augmented model to generating diverse translations.
The fundamental task of general density estimation has been of keen interest to machine learning. Recent advances in density estimation have either: a) proposed a flexible model to estimate the conditional factors of the chain rule, $p(x_i \mid x_{i-1}, \ldots)$; or b) used flexible, non-linear transformations of variables of a simple base distribution. Instead, this work jointly leverages transformations of variables and autoregressive conditional models, and proposes novel methods for both. We provide a deeper understanding of our methods, showing a considerable improvement through a comprehensive study over both real world and synthetic data. Moreover, we illustrate the use of our models in outlier detection and image modeling tasks.
Effective collaboration between humans and AI-based systems requires effective modeling of the human in the loop, both in terms of the mental state as well as the physical capabilities of the latter. However, these models can also open up pathways for manipulating and exploiting the human in the hopes of achieving some greater good, especially when the intent or values of the AI and the human are not aligned or when they have an asymmetrical relationship with respect to knowledge or computation power. In fact, such behavior does not necessarily require any malicious intent but can rather be borne out of cooperative scenarios. It is also beyond simple misinterpretation of intents, as in the case of value alignment problems, and thus can be effectively engineered if desired. Such techniques already exist and pose several unresolved ethical and moral questions with regards to the design of autonomy. In this paper, we illustrate some of these issues in a teaming scenario and investigate how they are perceived by participants in a thought experiment.
This paper reviews the state of the art of semantic change computation, an emerging research field in computational linguistics, and proposes a framework that summarizes the literature by identifying and expounding five essential components in the field: diachronic corpus, diachronic word sense characterization, change modelling, evaluation data and data visualization. Despite the potential of the field, the review shows that current studies are mainly focused on testing hypotheses proposed in theoretical linguistics and that several core issues remain to be solved: the need for diachronic corpora of languages other than English, the need for comprehensive evaluation data, the comparison and construction of approaches to diachronic word sense characterization and change modelling, and further exploration of data visualization techniques for hypothesis justification.
Relation detection plays a crucial role in Knowledge Base Question Answering (KBQA) because of the high variance of relation expression in the question. Traditional deep learning methods follow an encoding-comparing paradigm, where the question and the candidate relation are represented as vectors to compare their semantic similarity. The max- or average-pooling operation, which compresses the sequence of words into fixed-dimensional vectors, becomes the information bottleneck. In this paper, we propose to learn attention-based word-level interactions between questions and relations to alleviate the bottleneck issue. As in the traditional models, the question and relation are first represented as sequences of vectors. Then, instead of merging the sequence into a single vector with a pooling operation, soft alignments between words from the question and the relation are learned. The aligned words are subsequently compared with a convolutional neural network (CNN) and the comparison results are finally merged. By performing the comparison on low-level representations, the attention-based word-level interaction model (ABWIM) relieves the information loss caused by merging the sequence into a fixed-dimensional vector before the comparison. The experimental results of relation detection on both SimpleQuestions and WebQuestions datasets show that ABWIM achieves state-of-the-art accuracy, demonstrating its effectiveness.
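The soft-alignment step described in this abstract can be illustrated with a minimal sketch. It assumes dot-product scoring and lets each relation word attend over the question words; both choices, along with the names `soft_align`, `question`, and `relation`, are hypothetical and not taken from the paper. The CNN comparison and the final merging step are omitted.

```python
import math

def soft_align(question, relation):
    """For each relation word vector, return (attention weights over the
    question words, attention-weighted average of the question vectors)."""
    aligned = []
    for r in relation:
        # Assumed scoring function: plain dot product between word vectors.
        scores = [sum(a * b for a, b in zip(q, r)) for q in question]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        weights = [e / total for e in exps]
        context = [
            sum(w * q[d] for w, q in zip(weights, question))
            for d in range(len(r))
        ]
        aligned.append((weights, context))
    return aligned

# Toy 2-dimensional "embeddings" for three question words and two relation words.
question = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
relation = [[2.0, 0.0], [0.0, 2.0]]
result = soft_align(question, relation)
```

Each relation word keeps its own weighted view of the question, so no single fixed-size vector has to summarize the whole sequence before the comparison.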
Clustering is inherently ill-posed: there often exist multiple valid clusterings of a single dataset, and without any additional information a clustering system has no way of knowing which clustering it should produce. This motivates the use of constraints in clustering, as they allow users to communicate their interests to the clustering system. Active constraint-based clustering algorithms select the most useful constraints to query, aiming to produce a good clustering using as few constraints as possible. We propose COBRA, an active method that first over-clusters the data by running K-means with a $K$ that is intended to be too large, and subsequently merges the resulting small clusters into larger ones based on pairwise constraints. In its merging step, COBRA is able to keep the number of pairwise queries low by maximally exploiting constraint transitivity and entailment. We experimentally show that COBRA outperforms the state of the art in terms of clustering quality and runtime, without requiring the number of clusters in advance.
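The following is a simplified sketch of how a merging step can exploit transitivity and entailment to save queries. It is not the authors' implementation: the K-means over-clustering is skipped, the small clusters are assumed to be given as one scalar representative each (`reps`), pairs are queried closest-first (an assumption), and `oracle` is a hypothetical stand-in for the user answering must-link or cannot-link.

```python
from itertools import combinations

def merge_with_constraints(reps, oracle):
    """Merge small clusters using pairwise queries.

    reps   -- one representative value (a float here) per small cluster
    oracle -- oracle(i, j) is True for must-link, False for cannot-link
    Returns (labels, number_of_queries).
    """
    n = len(reps)
    parent = list(range(n))                  # union-find forest
    cannot = {i: set() for i in range(n)}    # root -> roots it cannot join

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]    # path halving
            i = parent[i]
        return i

    queries = 0
    pairs = sorted(combinations(range(n), 2),
                   key=lambda p: abs(reps[p[0]] - reps[p[1]]))
    for i, j in pairs:
        a, b = find(i), find(j)
        if a == b or b in cannot[a]:
            continue                         # answer is already entailed
        queries += 1
        if oracle(i, j):
            parent[b] = a                    # must-link: merge b into a
            for c in cannot.pop(b):          # b's cannot-links now apply to a
                cannot[c].discard(b)
                cannot[c].add(a)
                cannot[a].add(c)
        else:
            cannot[a].add(b)
            cannot[b].add(a)
    roots = sorted({find(i) for i in range(n)})
    labels = [roots.index(find(i)) for i in range(n)]
    return labels, queries

reps = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]
truth = [0, 0, 0, 1, 1, 2]                   # hidden grouping used by the toy oracle
labels, queries = merge_with_constraints(reps, lambda i, j: truth[i] == truth[j])
```

With six small clusters there are 15 possible pairs, but most answers follow from earlier ones, so the oracle is consulted far fewer times.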
Traditional text detection methods mostly focus on quadrangle text. In this study we propose a novel method named sliding line point regression (SLPR) to detect arbitrary-shape text in natural scenes. SLPR regresses multiple points on the edge of a text line and then uses these points to sketch the outline of the text. The proposed SLPR can be adapted to many object detection architectures such as Faster R-CNN and R-FCN. Specifically, we first generate the smallest rectangular box containing the text with a region proposal network (RPN), then isometrically regress the points on the edge of the text by using vertically and horizontally sliding lines. To make full use of information and reduce redundancy, we calculate the x-coordinate or y-coordinate of each target point from the rectangular box position, and regress only the remaining y-coordinate or x-coordinate. Accordingly, we not only reduce the number of parameters of the system, but also constrain the points so that they generate a more regular polygon. Our approach achieved competitive results on the traditional ICDAR2015 Incidental Scene Text benchmark and the curved text detection dataset CTW1500.
Population diversity is crucial in evolutionary algorithms to enable global exploration and to avoid poor performance due to premature convergence. This book chapter reviews runtime analyses that have shown benefits of population diversity, either through explicit diversity mechanisms or through naturally emerging diversity. These works show that the benefits of diversity are manifold: diversity is important for global exploration and the ability to find several global optima. Diversity enhances crossover and enables crossover to be more effective than mutation. Diversity can be crucial in dynamic optimization, when the problem landscape changes over time. And, finally, it facilitates search for the whole Pareto front in evolutionary multiobjective optimization. The presented analyses rigorously quantify the performance of evolutionary algorithms in the light of population diversity, laying the foundation for a rigorous understanding of how search dynamics are affected by the presence or absence of population diversity and the introduction of diversity mechanisms.
We study the incremental learning problem for the classification task, a key component in developing life-long learning systems. The main challenges while learning in an incremental manner are to preserve and update the knowledge of the model. In this work, we propose a generalization of Path Integral (Zenke et al., 2017) and EWC (Kirkpatrick et al., 2016) with a theoretically grounded KL-divergence based perspective. We show that, to preserve and update the knowledge, regularizing the model’s likelihood distribution is more intuitive and provides better insights to the problem. To do so, we use KL-divergence as a measure of distance which is equivalent to computing distance in a Riemannian manifold induced by the Fisher information matrix. Furthermore, to enhance the learning flexibility, the regularization is weighted by a parameter importance score that is calculated along the entire training trajectory. Contrary to forgetting, as the algorithm progresses, the regularized loss makes the network intransigent, resulting in its inability to discriminate new tasks from the old ones. We show that this problem of intransigence can be addressed by storing a small subset of representative samples from previous datasets. In addition, in order to evaluate the performance of an incremental learning algorithm, we introduce two novel metrics to evaluate forgetting and intransigence. Experimental evaluation on incremental versions of the MNIST and CIFAR-100 classification datasets shows that our approach outperforms existing state-of-the-art baselines in all the evaluation metrics.
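The link between KL-divergence and the Fisher information can be checked numerically on a toy case. The sketch below is only an illustration of the standard local identity $KL(p_\theta \| p_{\theta+\delta}) \approx \tfrac{1}{2} F(\theta)\,\delta^2$ for a one-parameter Bernoulli model in its logit parameter. It is not the paper's regularizer, and the values of `theta` and `delta` are arbitrary assumptions.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def kl_bernoulli(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

theta, delta = 0.3, 0.01           # logit parameter and a small perturbation
p = sigmoid(theta)
q = sigmoid(theta + delta)

fisher = p * (1 - p)               # Fisher information of a Bernoulli in its logit
exact = kl_bernoulli(p, q)         # true KL divergence
quadratic = 0.5 * fisher * delta ** 2   # second-order (Fisher) approximation
rel_err = abs(exact - quadratic) / exact
```

For small `delta` the two quantities agree closely, which is why a Fisher-weighted quadratic penalty on parameter changes can be read as an approximate KL penalty on the model's output distribution.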
We present a novel algorithm, called Links, designed to perform online clustering on unit vectors in a high-dimensional Euclidean space. The algorithm is appropriate when it is necessary to cluster data efficiently as it streams in, and is to be contrasted with traditional batch clustering algorithms that have access to all data at once. For example, Links has been successfully applied to embedding vectors generated from face images or voice recordings for the purpose of recognizing people, thereby providing real-time identification during video or audio capture.
The Continued Logarithm Algorithm (CL for short), introduced by Gosper in 1978, computes the gcd of two integers; it seems very efficient, as it only performs shifts and subtractions. Shallit studied its worst-case complexity in 2016 and showed it to be linear. Here we perform the average-case analysis of the algorithm: we study its main parameters (number of iterations, total number of shifts) and obtain precise asymptotics for their mean values. Our ‘dynamical’ analysis involves the dynamical system underlying the algorithm, which produces continued fraction expansions whose quotients are powers of 2. Even though this CL system has already been studied by Chan (around 2005), the presence of powers of 2 in the quotients ingrains into the central parameters a dyadic flavour that cannot be grasped solely by studying the CL system. We thus introduce a dyadic component and deal with a two-component system. With this new mixed system at hand, we then provide a complete average-case analysis of the CL algorithm, with explicit constants.
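A gcd routine built from such shift-and-subtract steps might look like the sketch below. It assumes the step maps a pair $(p, q)$ with $p \ge q$ to $(2^a q,\; p - 2^a q)$, where $a$ is the largest exponent with $2^a q \le p$; the exact formulation analysed in the paper may differ. The handling of powers of two (removing common factors of two first, then dropping the spurious ones at the end) and the name `cl_gcd` are my own assumptions.

```python
def cl_gcd(p, q):
    """Return (gcd, number_of_steps) for positive integers p and q,
    using only shifts, comparisons and subtractions."""
    if p <= 0 or q <= 0:
        raise ValueError("inputs must be positive integers")
    # Pull out the common power of two so that the remaining gcd is odd.
    shift = 0
    while (p | q) & 1 == 0:
        p >>= 1
        q >>= 1
        shift += 1
    if p < q:
        p, q = q, p
    steps = 0
    while q != 0:
        # Largest a with (q << a) <= p, found from the bit lengths.
        a = p.bit_length() - q.bit_length()
        if (q << a) > p:
            a -= 1
        # One step: the odd part of gcd(p, q) is preserved.
        p, q = q << a, p - (q << a)
        steps += 1
    # The steps may have introduced extra factors of two; strip them.
    while p & 1 == 0:
        p >>= 1
    return p << shift, steps

gcd_value, step_count = cl_gcd(1071, 462)
```

Each step keeps the odd part of the gcd unchanged, which is why removing the common factors of two beforehand and the leftover ones afterwards recovers the full gcd.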