Bayesian Dynamic Modeling and Monitoring of Network Flows

In the context of a motivating study of dynamic network flow data on a large-scale e-commerce web site, we develop Bayesian models for on-line/sequential analysis for monitoring and adapting to changes reflected in node-node traffic. For large-scale networks, we customize core Bayesian time series analysis methods using dynamic generalized linear models (DGLMs) integrated into the multivariate network context using the concept of decouple/recouple recently introduced in multivariate time series. This enables flexible dynamic modeling of flows on large-scale networks and exploitation of partial parallelization of analysis while critically maintaining coherence with an over-arching multivariate dynamic flow model. Development is anchored in a case-study on internet data, with flows of visitors to a commercial news web site defining a long time series of node-node counts on over 56,000 node pairs. Characterizing inherent stochasticity in traffic patterns, understanding node-node interactions, adapting to dynamic changes in flows and allowing for sensitive monitoring to flag anomalies are central questions. The methodology of dynamic network DGLMs will be of interest and utility in broad ranges of dynamic network flow studies.

Anomaly Classification in Distribution Networks Using a Quotient Gradient System

The classification of anomalies or sudden changes in power networks versus normal abrupt changes or switching actions is essential to take appropriate maintenance actions that guarantee the quality of power delivery. This issue has increased in importance and has become more complicated with the proliferation of volatile resources that introduce variability, uncertainty, and intermittency in circuit behavior that can be observed as variations in voltage and current phasors. This makes diagnostics applications more challenging. This paper proposes using a quotient gradient system (QGS) to train a two-stage partially recurrent neural network to improve the anomaly classification rate in power distribution networks using high-fidelity data from micro-phasor measurement units (PMUs). QGS is a systematic approach to finding solutions of constraint satisfaction problems. We transform the PMU data from the power network into a constraint satisfaction problem and use QGS to train a neural network by solving the resulting optimization problem. Simulation results show that the proposed supervised classification method can reliably distinguish between different anomalies in power distribution networks. Comparison with other neural network classifiers shows that QGS-trained networks provide significantly better classification. Sensitivity analysis is performed concerning the number of PMUs, reporting rates, noise level and early versus late data stream fusion frameworks.

A Gentle Introduction to Supervised Machine Learning

This tutorial is based on the lecture notes for the courses ‘Machine Learning: Basic Principles’ and ‘Artificial Intelligence’, which I have taught during fall 2017 and spring 2018 at Aalto University. The aim is to provide an accessible introduction to some of the main concepts and methods within supervised machine learning. Most of the current systems which are considered as (artificially) intelligent are based on some form of supervised machine learning. After discussing the main building blocks of a formal machine learning problem, some of the most popular algorithmic design patterns for machine learning methods are presented.

Laconic Deep Learning Computing

We motivate a method for transparently identifying ineffectual computations in unmodified Deep Learning models without affecting accuracy. Specifically, we show that if we decompose multiplications down to the bit level, the amount of work performed during inference for image classification models can be consistently reduced by two orders of magnitude. In the best case studied, a sparse variant of AlexNet, this approach can ideally reduce computation work by more than 500x. We present Laconic, a hardware accelerator that implements this approach to improve execution time and energy efficiency for inference with Deep Learning Networks. Laconic judiciously gives up some of the work reduction potential to yield a low-cost, simple, and energy efficient design that outperforms other state-of-the-art accelerators. For example, a Laconic configuration that uses a weight memory interface with just 128 wires outperforms a conventional accelerator with a 2K-wire weight memory interface by 2.3x on average while being 2.13x more energy efficient on average. A Laconic configuration that uses a 1K-wire weight memory interface outperforms the 2K-wire conventional accelerator by 15.4x and is 1.95x more energy efficient. Laconic does not require but rewards advances in model design such as a reduction in precision, the use of alternate numeric representations that reduce the number of bits that are ‘1’, or an increase in weight or activation sparsity.
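
The bit-level idea can be illustrated with a small Python sketch. It is not the Laconic hardware design: it only assumes unsigned integers and shows that a product can be written as one shifted term per pair of set bits, so zero bits contribute no work. The helper names `set_bits` and `bit_level_product` are hypothetical.

```python
def set_bits(x: int) -> list[int]:
    """Return the positions of the bits that are 1 in a non-negative integer."""
    return [i for i in range(x.bit_length()) if (x >> i) & 1]


def bit_level_product(a: int, w: int) -> tuple[int, int]:
    """Multiply a and w by summing one term per pair of set bits.

    Returns the product and the number of single-bit terms that were needed.
    Assumes a and w are non-negative (an illustrative simplification).
    """
    pairs = [(i, j) for i in set_bits(a) for j in set_bits(w)]
    total = sum(1 << (i + j) for i, j in pairs)
    return total, len(pairs)


if __name__ == "__main__":
    product, terms = bit_level_product(5, 3)  # 0b101 * 0b011
    print(product, terms)
```

Under these assumptions, a bit-parallel multiplier for 8-bit operands processes 64 bit pairs regardless of the values, whereas the count returned here depends only on how many bits are set, which is why lower precision and sparsity reduce the work.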

Incremental Learning Framework Using Cloud Computing

A high volume of data can be perceived as either a challenge or an opportunity. Deep learning architectures demand a high volume of data to back-propagate effectively and train the weights without bias. At the same time, a large volume of data demands a higher-capacity machine on which training can run seamlessly. Budding data scientists, along with many research professionals, face frequent disconnections from cloud computing frameworks (when working without a dedicated connection) because they use a free subscription to the platform. Similar issues arise when working on a local computer, which may sometimes run out of resources or power, so that the researcher has to start training the models all over again. In this paper, we intend to provide a way to resolve this issue and to train the neural network progressively, even after frequent disconnections or resource outages, without losing much of the progress.

AttriGuard: A Practical Defense Against Attribute Inference Attacks via Adversarial Machine Learning

Users in various web and mobile applications are vulnerable to attribute inference attacks, in which an attacker leverages a machine learning classifier to infer a target user’s private attributes (e.g., location, sexual orientation, political view) from its public data (e.g., rating scores, page likes). Existing defenses leverage game theory or heuristics based on correlations between the public data and attributes. These defenses are not practical. Specifically, game-theoretic defenses require solving intractable optimization problems, while correlation-based defenses incur large utility loss of users’ public data. In this paper, we present AttriGuard, a practical defense against attribute inference attacks. AttriGuard is computationally tractable and has small utility loss. Our AttriGuard works in two phases. Suppose we aim to protect a user’s private attribute. In Phase I, for each value of the attribute, we find a minimum noise such that if we add the noise to the user’s public data, then the attacker’s classifier is very likely to infer the attribute value for the user. We find the minimum noise via adapting existing evasion attacks in adversarial machine learning. In Phase II, we sample one attribute value according to a certain probability distribution and add the corresponding noise found in Phase I to the user’s public data. We formulate finding the probability distribution as solving a constrained convex optimization problem. We extensively evaluate AttriGuard and compare it with existing methods using a real-world dataset. Our results show that AttriGuard substantially outperforms existing methods. Our work is the first one that shows evasion attacks can be used as defensive techniques for privacy protection.

Building Near-Real-Time Processing Pipelines with the Spark-MPI Platform

Advances in detectors and computational technologies provide new opportunities for applied research and the fundamental sciences. Concurrently, dramatic increases in the three Vs (Volume, Velocity, and Variety) of experimental data and the scale of computational tasks produced the demand for new real-time processing systems at experimental facilities. Recently, this demand was addressed by the Spark-MPI approach connecting the Spark data-intensive platform with the MPI high-performance framework. In contrast with existing data management and analytics systems, Spark introduced a new middleware based on resilient distributed datasets (RDDs), which decoupled various data sources from high-level processing algorithms. The RDD middleware significantly advanced the scope of data-intensive applications, spreading from SQL queries to machine learning to graph processing. Spark-MPI further extended the Spark ecosystem with the MPI applications using the Process Management Interface. The paper explores this integrated platform within the context of online ptychographic and tomographic reconstruction pipelines.

Transfer Learning from Adult to Children for Speech Recognition: Evaluation, Analysis and Recommendations

Children’s speech recognition is challenging mainly due to the inherent high variability in children’s physical and articulatory characteristics and expressions. This variability manifests in both acoustic constructs and linguistic usage due to the rapidly changing developmental stage in children’s life. Part of the challenge is due to the lack of large amounts of available children’s speech data for efficient modeling. This work attempts to address the key challenges using transfer learning from adult models to children’s models in a Deep Neural Network (DNN) framework for the children’s Automatic Speech Recognition (ASR) task, evaluating on multiple children’s speech corpora with a large vocabulary. The paper presents a systematic and extensive analysis of the proposed transfer learning technique considering the key factors affecting children’s speech recognition from prior literature. Evaluations are presented on (i) comparisons of earlier GMM-HMM and the newer DNN models, (ii) effectiveness of standard adaptation techniques versus transfer learning, (iii) various adaptation configurations in tackling the variabilities present in children’s speech, in terms of (a) acoustic spectral variability, and (b) pronunciation variability and linguistic constraints. Our analysis spans (i) number of DNN model parameters (for adaptation), (ii) amount of adaptation data, (iii) ages of children, (iv) age-dependent versus age-independent adaptation. Finally, we provide recommendations on (i) the favorable strategies over the various analyzed parameters mentioned above, and (ii) potential future research directions and relevant challenges/problems persisting in DNN-based ASR for children’s speech.

Metatrace: Online Step-size Tuning by Meta-gradient Descent for Reinforcement Learning Control

Reinforcement learning (RL) has had many successes in both ‘deep’ and ‘shallow’ settings. In both cases, significant hyperparameter tuning is often required to achieve good performance. Furthermore, when nonlinear function approximation is used, non-stationarity in the state representation can lead to learning instability. A variety of techniques exist to combat this, most notably large experience replay buffers or the use of multiple parallel actors. These techniques come at the cost of moving away from the online RL problem as it is traditionally formulated (i.e., a single agent learning online without maintaining a large database of training examples). Meta-learning can potentially help with both these issues by tuning hyperparameters online and allowing the algorithm to more robustly adjust to non-stationarity in a problem. This paper applies meta-gradient descent to derive a set of step-size tuning algorithms specifically for online RL control with eligibility traces. Our novel technique, Metatrace, makes use of an eligibility trace analogous to methods like TD(λ). We explore tuning both a single scalar step-size and a separate step-size for each learned parameter. We evaluate Metatrace first for control with linear function approximation in the classic mountain car problem and then in a noisy, non-stationary version. Finally, we apply Metatrace for control with nonlinear function approximation in 5 games in the Arcade Learning Environment where we explore how it impacts learning speed and robustness to initial step-size choice. Results show that the meta-step-size parameter of Metatrace is easy to set, Metatrace can speed learning, and Metatrace can allow an RL algorithm to deal with non-stationarity in the learning task.

Randomized Smoothing SVRG for Large-scale Nonsmooth Convex Optimization

In this paper, we consider the problem of minimizing the average of a large number of nonsmooth and convex functions. Such problems often arise in machine learning as empirical risk minimization, but are computationally very challenging. We develop and analyze a new algorithm that achieves a robust linear convergence rate, and both its time complexity and gradient complexity are superior to those of state-of-the-art nonsmooth algorithms and subgradient-based schemes. Moreover, our algorithm requires neither extra error bound conditions on the objective function nor the common strong-convexity condition. We show that our algorithm has wide applications in optimization and machine learning problems, and demonstrate experimentally that it performs well on a large-scale ranking problem.

ContextNet: Exploring Context and Detail for Semantic Segmentation in Real-time

Modern deep learning architectures produce highly accurate results on many challenging semantic segmentation datasets. State-of-the-art methods are, however, not directly transferable to real-time applications or embedded devices, since naive adaptation of such systems to reduce computational cost (speed, memory and energy) causes a significant drop in accuracy. We propose ContextNet, a new deep neural network architecture which builds on factorized convolution, network compression and pyramid representations to produce competitive semantic segmentation in real-time with low memory requirements. ContextNet combines a deep branch at low resolution that captures global context information efficiently with a shallow branch that focuses on high-resolution segmentation details. We analyze our network in a thorough ablation study and present results on the Cityscapes dataset, achieving 66.1% accuracy at 18.2 frames per second at full (1024×2048) resolution.

Textual Membership Queries

Human labeling of textual data can be very time-consuming and expensive, yet it is critical for the success of an automatic text classification system. In order to minimize human labeling efforts, we propose a novel active learning (AL) solution that does not rely on existing sources of unlabeled data. It uses a small amount of labeled data as the core set for the synthesis of useful membership queries (MQs), which are unlabeled instances synthesized by an algorithm for human labeling. Our solution uses modification operators, functions from the instance space to the instance space that change the input to some extent. We apply the operators on the core set, thus creating a set of new membership queries. Using this framework, we look at the instance space as a search space and apply search algorithms in order to create desirable MQs. We implement this framework in the textual domain. The implementation uses resources such as WordNet and Word2vec to replace text fragments from a given sentence with semantically related ones. We test our framework on several text classification tasks and show improved classifier performance as more MQs are labeled and incorporated into the training set. To the best of our knowledge, this is the first work on membership queries in the textual domain.
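
A minimal sketch of the operator-and-search idea, under loud assumptions: a tiny hand-written synonym table stands in for WordNet or Word2vec, the operator replaces a single word, and the search is a plain breadth-first expansion. The names `SYNONYMS`, `replace_word` and `synthesize_queries` are hypothetical and are not taken from the paper.

```python
# Hypothetical stand-in for a lexical resource such as WordNet or Word2vec.
SYNONYMS = {"good": ["great", "fine"], "movie": ["film"]}


def replace_word(sentence: str, table: dict = SYNONYMS) -> list[str]:
    """Modification operator: all sentences that differ by one word replacement."""
    words = sentence.split()
    results = []
    for i, word in enumerate(words):
        for alternative in table.get(word, []):
            results.append(" ".join(words[:i] + [alternative] + words[i + 1:]))
    return results


def synthesize_queries(core_set: list[str], depth: int = 2) -> list[str]:
    """Breadth-first search over the instance space, starting from the core set.

    Returns new, unlabeled sentences (membership queries) in discovery order.
    """
    seen = set(core_set)
    frontier = list(core_set)
    queries = []
    for _ in range(depth):
        next_frontier = []
        for sentence in frontier:
            for candidate in replace_word(sentence):
                if candidate not in seen:
                    seen.add(candidate)
                    queries.append(candidate)
                    next_frontier.append(candidate)
        frontier = next_frontier
    return queries


if __name__ == "__main__":
    for query in synthesize_queries(["a good movie"]):
        print(query)
```

In a full system each synthesized sentence would be sent to a human for labeling and then added to the training set; the sketch stops at candidate generation and does not rank candidates by how useful they would be to the classifier.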

TutorialBank: A Manually-Collected Corpus for Prerequisite Chains, Survey Extraction and Resource Recommendation

The field of Natural Language Processing (NLP) is growing rapidly, with new research published daily along with an abundance of tutorials, codebases and other online resources. In order to learn this dynamic field or stay up-to-date on the latest research, students as well as educators and researchers must constantly sift through multiple sources to find valuable, relevant information. To address this situation, we introduce TutorialBank, a new, publicly available dataset which aims to facilitate NLP education and research. We have manually collected and categorized over 6,300 resources on NLP as well as the related fields of Artificial Intelligence (AI), Machine Learning (ML) and Information Retrieval (IR). Our dataset is notably the largest manually-picked corpus of resources intended for NLP education that is not limited to academic papers. Additionally, we have created both a search engine and a command-line tool for the resources and have annotated the corpus to include lists of research topics, relevant resources for each topic, prerequisite relations among topics, and relevant sub-parts of individual resources, among other annotations. We are releasing the dataset and present several avenues for further research.

AdvEntuRe: Adversarial Training for Textual Entailment with Knowledge-Guided Examples

We consider the problem of learning textual entailment models with limited supervision (5K-10K training examples), and present two complementary approaches for it. First, we propose knowledge-guided adversarial example generators for incorporating large lexical resources in entailment models via only a handful of rule templates. Second, to make the entailment model – a discriminator – more robust, we propose the first GAN-style approach for training it using a natural language example generator that iteratively adjusts based on the discriminator’s performance. We demonstrate effectiveness using two entailment datasets, where the proposed methods increase accuracy by 4.7% on SciTail and by 2.8% on a 1% training sub-sample of SNLI. Notably, even a single hand-written rule, negate, improves the accuracy on the negation examples in SNLI by 6.1%.

Pool-Based Sequential Active Learning for Regression

Active learning is a machine learning approach for reducing the data labeling effort. Given a pool of unlabeled samples, it tries to select the most useful ones to label so that a model built from them can achieve the best possible performance. This paper focuses on pool-based sequential active learning for regression (ALR). We first propose three essential criteria that an ALR approach should consider in selecting the most useful unlabeled samples: informativeness, representativeness, and diversity, and compare four existing ALR approaches against them. We then propose a new ALR approach using passive sampling, which considers both the representativeness and the diversity in both the initialization and subsequent iterations. Remarkably, this approach can also be integrated with other existing ALR approaches in the literature to further improve the performance. Extensive experiments on 11 UCI, CMU StatLib, and UFL Media Core datasets from various domains verified the effectiveness of our proposed ALR approaches.

Towards Autonomous Reinforcement Learning: Automatic Setting of Hyper-parameters using Bayesian Optimization

With the increase of machine learning usage by industries and scientific communities in a variety of tasks such as text mining, image recognition and self-driving cars, automatic setting of hyper-parameters in learning algorithms is a key factor for achieving satisfactory performance regardless of user expertise in the inner workings of the techniques and methodologies. In particular, for a reinforcement learning algorithm, the efficiency of an agent learning a control policy in an uncertain environment is heavily dependent on the hyper-parameters used to balance exploration with exploitation. In this work, we propose an autonomous learning framework that integrates Bayesian optimization with Gaussian process regression to optimize the hyper-parameters of a reinforcement learning algorithm. We also present a bandits-based approach to achieve a balance between computational costs and decreasing uncertainty about the Q-values. A gridworld example is used to highlight how hyper-parameter configurations of a learning algorithm (SARSA) are iteratively improved based on two performance functions.

Generating Rescheduling Knowledge using Reinforcement Learning in a Cognitive Architecture

In order to reach higher degrees of flexibility, adaptability and autonomy in manufacturing systems, it is essential to develop new rescheduling methodologies which resort to cognitive capabilities, similar to those found in human beings. Artificial cognition is important for designing planning and control systems that generate and represent knowledge about heuristics for repair-based scheduling. Rescheduling knowledge in the form of decision rules is used to deal with unforeseen events and disturbances reactively in real time, and take advantage of the ability to act interactively with the user to counteract the effects of disruptions. In this work, to achieve the aforementioned goals, a novel approach to generate rescheduling knowledge in the form of dynamic first-order logical rules is proposed. The proposed approach is based on the integration of reinforcement learning with artificial cognitive capabilities involving perception and reasoning/learning skills embedded in the Soar cognitive architecture. An industrial example is discussed showing that the approach enables the scheduling system to assess its operational range in an autonomic way, and to acquire experience through intensive simulation while performing repair tasks.

Born Again Neural Networks

Knowledge distillation (KD) consists of transferring knowledge from one machine learning model (the teacher) to another (the student). Commonly, the teacher is a high-capacity model with formidable performance, while the student is more compact. By transferring knowledge, one hopes to benefit from the student’s compactness. We study KD from a new perspective: rather than compressing models, we train students parameterized identically to their teachers. Surprisingly, these Born-Again Networks (BANs) outperform their teachers significantly, both on computer vision and language modeling tasks. Our experiments with BANs based on DenseNets demonstrate state-of-the-art performance on the CIFAR-10 (3.5%) and CIFAR-100 (15.5%) datasets, by validation error. Additional experiments explore two distillation objectives: (i) Confidence-Weighted by Teacher Max (CWTM) and (ii) Dark Knowledge with Permuted Predictions (DKPP). Both methods elucidate the essential components of KD, demonstrating a role of the teacher outputs on both predicted and non-predicted classes. We present experiments with students of various capacities, focusing on the under-explored case where students overpower teachers. Our experiments show significant advantages from transferring knowledge between DenseNets and ResNets in either direction.
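
For readers unfamiliar with distillation, a generic distillation objective can be sketched in a few lines of Python. This is the standard soft-target cross-entropy between teacher and student output distributions, written with the standard library only; it is an illustrative assumption and not necessarily the exact loss used for BANs, and it does not implement CWTM or DKPP. The function names and the `temperature` parameter are hypothetical.

```python
import math


def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]


def distillation_loss(student_logits: list[float],
                      teacher_logits: list[float],
                      temperature: float = 1.0) -> float:
    """Cross-entropy of the student's distribution against the teacher's.

    The teacher's full output distribution (not just its top class) is the target,
    so probabilities on non-predicted classes also shape the student.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.0]
    print(distillation_loss([2.0, 1.0, 0.0], teacher))  # student agrees with teacher
    print(distillation_loss([0.0, 1.0, 2.0], teacher))  # student disagrees
```

The loss is smallest when the student reproduces the teacher's distribution. In the born-again setting described above, the student would have the same architecture as the teacher rather than a smaller one.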

GAN Q-learning

Distributional reinforcement learning (distributional RL) has seen empirical success in complex Markov Decision Processes (MDPs) in the setting of nonlinear function approximation. However, there are many different ways in which one can leverage the distributional approach to reinforcement learning. In this paper, we propose GAN Q-learning, a novel distributional RL method based on generative adversarial networks (GANs) and analyze its performance in simple tabular environments, as well as OpenAI Gym. We empirically show that our algorithm leverages the flexibility and blackbox approach of deep learning models while providing a viable alternative to other state-of-the-art methods.

Extendable Neural Matrix Completion

Matrix completion is one of the key problems in signal processing and machine learning, with applications ranging from image processing and data gathering to classification and recommender systems. Recently, deep neural networks have been proposed as latent factor models for matrix completion and have achieved state-of-the-art performance. Nevertheless, a major problem with existing neural-network-based models is their limited capabilities to extend to samples unavailable at the training stage. In this paper, we propose a deep two-branch neural network model for matrix completion. The proposed model not only inherits the predictive power of neural networks, but is also capable of extending to partially observed samples outside the training set, without the need of retraining or fine-tuning. Experimental studies on popular movie rating datasets prove the effectiveness of our model compared to the state of the art, in terms of both accuracy and extendability.

Doing the impossible: Why neural networks can be trained at all

As deep neural networks grow in size, from thousands to millions to billions of weights, the performance of those networks becomes limited by our ability to accurately train them. A common naive question arises: if we have a system with billions of degrees of freedom, don’t we also need billions of samples to train it? Of course, the success of deep learning indicates that reliable models can be learned with reasonable amounts of data. Similar questions arise in protein folding, spin glasses and biological neural networks. With effectively infinite potential folding/spin/wiring configurations, how does the system find the precise arrangement that leads to useful and robust results? Simple sampling of the possible configurations until an optimal one is reached is not a viable option even if one waited for the age of the universe. On the contrary, there appears to be a mechanism in the above phenomena that forces them to achieve configurations that live on a low-dimensional manifold, avoiding the curse of dimensionality. In the current work we use the concept of mutual information between successive layers of a deep neural network to elucidate this mechanism and suggest possible ways of exploiting it to accelerate training. We show that adding structure to the neural network that enforces higher mutual information between layers speeds training and leads to more accurate results. High mutual information between layers implies that the effective number of free parameters is exponentially smaller than the raw number of tunable weights.

Low-pass Recurrent Neural Networks – A memory architecture for longer-term correlation discovery

Reinforcement learning (RL) agents performing complex tasks must be able to remember observations and actions across sizable time intervals. This is especially true during the initial learning stages, when exploratory behaviour can increase the delay between specific actions and their effects. Many new or popular approaches for learning these distant correlations employ backpropagation through time (BPTT), but this technique requires storing observation traces long enough to span the interval between cause and effect. Besides memory demands, learning dynamics like vanishing gradients and slow convergence due to infrequent weight updates can reduce BPTT’s practicality; meanwhile, although online recurrent network learning is a developing topic, most approaches are not efficient enough to use as replacements. We propose a simple, effective memory strategy that can extend the window over which BPTT can learn without requiring longer traces. We explore this approach empirically on a few tasks and discuss its implications.

CARL: Content-Aware Representation Learning for Heterogeneous Networks

Heterogeneous networks present a challenge of heterogeneity not only in the types of nodes and relations, but also in the attributes and content associated with the nodes. While recent works have looked at representation learning on homogeneous and heterogeneous networks, there is no work that has collectively addressed the following challenges: (a) the heterogeneous structural information of the network consisting of multiple types of nodes and relations; (b) the unstructured semantic content (e.g., text) associated with nodes; and (c) online updates due to incoming new nodes in a growing network. We address these challenges by developing a Content-Aware Representation Learning model (CARL). CARL performs joint optimization of heterogeneous SkipGram and deep semantic encoding for capturing both heterogeneous structural closeness and unstructured semantic relations among all nodes, as a function of node content, that exist in the network. Furthermore, an additional online update module is proposed for efficiently learning representations of incoming nodes. Extensive experiments demonstrate that CARL outperforms state-of-the-art baselines in various heterogeneous network mining tasks, such as link prediction, document retrieval, node recommendation and relevance search. We also demonstrate the effectiveness of CARL’s online update module through a category visualization study.

Algorithms and Complexity of Range Clustering

We introduce a novel criterion in clustering that seeks clusters with limited range of values associated with each cluster’s elements. In clustering or classification the objective is to partition a set of objects into subsets, called clusters or classes, consisting of similar objects so that different clusters are as dissimilar as possible. We propose a number of objective functions that employ the range of the clusters as part of the objective function. Several of the proposed objectives mimic objectives based on sums of similarities. These objective functions are motivated by image segmentation problems, where the diameter, or range of values associated with objects in each cluster, should be small. It is demonstrated that range-based problems are in general easier, in terms of their complexity, than the analogous similarity-sum problems. Several of the problems we present could therefore be viable alternatives to existing clustering problems which are NP-hard, offering the advantage of efficient algorithms.

Copulas for Streaming Data

Empirical copula functions can be used to model the dependence structure of multivariate data. This paper adapts the Greenwald and Khanna algorithm in order to provide a space-efficient approximation to the empirical copula function of a bivariate stream of data. A succinct, space-efficient summary of values seen in the stream up to a certain time is maintained and can be queried at any point to return an approximation to the empirical copula function with guaranteed error bounds. This paper then gives an example of a class of higher dimensional copulas that can be computed from a product of these bivariate copula approximations. The computational benefits and the approximation error of this algorithm are theoretically and numerically assessed.
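
As a point of reference for what is being approximated, the exact (non-streaming) bivariate empirical copula can be computed from ranks as sketched below. This is a batch computation that stores every observation and is not the Greenwald and Khanna summary from the paper; it assumes no ties in either coordinate, and the names `ranks` and `empirical_copula` are hypothetical.

```python
def ranks(values: list[float]) -> list[int]:
    """Rank of each value (1 = smallest). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def empirical_copula(xs: list[float], ys: list[float]):
    """Return C(u, v): the fraction of points whose scaled ranks are <= (u, v)."""
    n = len(xs)
    rank_x, rank_y = ranks(xs), ranks(ys)

    def copula(u: float, v: float) -> float:
        hits = sum(1 for rx, ry in zip(rank_x, rank_y) if rx <= u * n and ry <= v * n)
        return hits / n

    return copula


if __name__ == "__main__":
    increasing = empirical_copula([1, 2, 3, 4], [10, 20, 30, 40])
    decreasing = empirical_copula([1, 2, 3, 4], [40, 30, 20, 10])
    print(increasing(0.5, 0.5), decreasing(0.5, 0.5))
```

Each query here scans all n stored points. The streaming setting in the abstract replaces this full store with a compact summary and accepts a bounded approximation error in exchange.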

A liability allocation game
Essential formulae for restricted maximum likelihood and its derivatives associated with the linear mixed models
Breaking the Scalability Barrier of Causal Broadcast for Large and Dynamic Systems
Research Curation on Knowledge Management
Energy-Efficient Downlink Power Control in mmWave Cell-Free and User-Centric Massive MIMO
Bivariate Discrete Exponentiated Weibull Distribution: Properties and Applications
Spectral representation of quasi-infinitely divisible processes
Construction of Forward Performance Processes in Stochastic Factor Models and an Extension of Widder’s Theorem
Functional Decomposition: A new method for search and limit setting
A volumetric deep Convolutional Neural Network for simulation of dark matter halo catalogues
Sentiment Composition of Words with Opposing Polarities
Distributed Minimum Vertex Coloring and Maximum Independent Set in Chordal Graphs
DFINITY Technology Overview Series, Consensus System
Networked Model Predictive Control Using a Wavelet Neural Network
Shaping dynamical folding and misfolding pathways in mechanical metamaterials
NRC-Canada at SMM4H Shared Task: Classifying Tweets Mentioning Adverse Drug Reactions and Medication Intake
Classification of Protein Crystallization X-Ray Images Using Major Convolutional Neural Network Architectures
Learning-induced categorical perception in a neural network model
Neural Factor Graph Models for Cross-lingual Morphological Tagging
The Steiner $k$-Wiener index of graphs with given minimum degree
Revisiting Dilated Convolution: A Simple Approach for Weakly- and Semi- Supervised Semantic Segmentation
Using the Best Linear Approximation With Varying Excitation Signals for Nonlinear System Characterization
Domain Adapted Word Embeddings for Improved Sentiment Classification
Fast Rates of ERM and Stochastic Approximation: Adaptive to Error Bound Conditions
Using Statistical and Semantic Models for Multi-Document Summarization
Activated dynamics: an intermediate model between REM and p-spin
TensOrMachine: Probabilistic Boolean Tensor Decomposition
Robust Comparison of Kernel Densities on Spherical Domains
Majority & Stabilization in Population Protocols
The Domain Transform Solver
Robust and Scalable Models of Microbiome Dynamics
Distances to Lattice Points in Knapsack Polyhedra
Joint Flow: Temporal Flow Fields for Multi Person Tracking
A Local Stochastic Algorithm for Separation in Heterogeneous Self-Organizing Particle Systems
Finitary isomorphisms of Poisson point processes
Double Embeddings and CNN-based Sequence Labeling for Aspect Extraction
Stochastic Approximation EM for Logistic Regression with Missing Values
Confidence Modeling for Neural Semantic Parsing
Unsupervised Learning for Fast Probabilistic Diffeomorphic Registration
Sample Truncation for Scenario Approach to Closed-loop Chance Constrained Trajectory Optimization for Linear Systems
3-uniform hypergraphs: modular decomposition and realization by tournaments
Peres-Style Recursive Algorithms
Twitter User Geolocation using Deep Multiview Learning
Breaking Transferability of Adversarial Samples with Randomness
On the $q$-derivative and $q$-series expansions
Agnostic tests can control the type I and type II errors simultaneously
Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context
Strong Converse using Change of Measure Arguments
Admissibility of diagonal state-delayed systems with a one-dimensional input space
Constrained-CNN losses for weakly supervised segmentation
Image-derived generative modeling of pseudo-macromolecular structures – towards the statistical assessment of Electron CryoTomography template matching
Direction-aware Spatial Context Features for Shadow Detection and Removal
HOC-Tree: A Novel Index for efficient Spatio-temporal Range Search
The Assouad spectrum of random self-affine carpets
Low cost quantum circuits for classically intractable instances of the Hamiltonian dynamics simulation problem
Design of Order-of-Addition Experiments
Learning to Ask Good Questions: Ranking Clarification Questions using Neural Expected Value of Perfect Information
An Inner SOCP Approximate Algorithm for Robust Adaptive Beamforming for General-Rank Signal Model
Towards Distributed Clouds
Backpropagating through Structured Argmax using a SPIGOT
Examining a hate speech corpus for hate speech detection and popularity prediction
Randomization Approaches for Reducing PAPR with Partial Transmit Sequences and Semidefinite Relaxation
I Have Seen Enough: A Teacher Student Network for Video Classification Using Fewer Frames
Optimal switching sequence for switched linear systems
Kernel and wavelet density estimators on manifolds and more general metric spaces
Huge Automatically Extracted Training Sets for Multilingual Word Sense Disambiguation
Adversarial Task Transfer from Preference
BDD100K: A Diverse Driving Video Database with Scalable Annotation Tooling
Gaussian Mixture Latent Vector Grammars
New Embedded Representations and Evaluation Protocols for Inferring Transitive Relations
TED-LIUM 3: twice as much data and corpus repartition for experiments on speaker adaptation
A New Method for Epileptic Seizure Classification in EEG Using Adapted Wavelet Packets
Setting Reserve Requirements to Approximate the Efficiency of the Stochastic Dispatch
The temporal explorer who returns to the base
Exploring object-centric and scene-centric CNN features and their complementarity for human rights violations recognition in images
Unsupervised Semantic Frame Induction using Triclustering
Do Outliers Ruin Collaboration
Hindering reaction attacks by using monomial codes in the McEliece cryptosystem
Hamiltonian cycles and subsets of discounted occupational measures
An Indexing for Quadratic Residues Modulo $N$ and a Non-uniform Efficient Decoding Algorithm
A Dynamic Analysis of Nash Equilibria in Search Models with Fiat Money
Offline EEG-Based Driver Drowsiness Estimation Using Enhanced Batch-Mode Active Learning (EBMAL) for Regression
Agreement Rate Initialized Maximum Likelihood Estimator for Ensemble Classifier Aggregation and Its Application in Brain-Computer Interface
Counting zero-dimensional subschemes in higher dimensions
A Cognitive Approach to Real-time Rescheduling using SOAR-RL
Multifractal analysis of financial markets
A Simple and Effective Model-Based Variable Importance Measure
Predictive Uncertainty in Large Scale Classification using Dropout – Stochastic Gradient Hamiltonian Monte Carlo
Lift expectations of random sets
An Analog of Matrix Tree Theorem for Signless Laplacians
New Distributed Algorithms in Almost Mixing Time via Transformations from Parallel Algorithms
The Wisdom of the Network: How Adaptive Networks Promote Collective Intelligence
Security-Enhanced SC-FDMA Transmissions Using Temporal Artificial-Noise and Secret-Key Aided Schemes
Learning to Find Eye Region Landmarks for Remote Gaze Estimation in Unconstrained Settings
On the Optimality of Treating Interference as Noise for Interfering Multiple Access Channels
Almost Global Problems in the LOCAL Model
Convolutional CRFs for Semantic Segmentation
Fair Leader Election for Rational Agents in Asynchronous Rings and Networks
Persistent Non-Blocking Binary Search Trees Supporting Wait-Free Range Queries
Deploying Jupyter Notebooks at scale on XSEDE for Science Gateways and workshops
Nonlinear Metric Learning through Geodesic Polylinear Interpolation (ML-GPI)
Near-Optimal Policies for Dynamic Multinomial Logit Assortment Selection Models
Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling
Scene-Aware Audio for 360° Videos
Coarse-to-Fine Decoding for Neural Semantic Parsing
Lock-Free Search Data Structures: Throughput Modelling with Poisson Processes
Canonical tensor model through data analysis — Dimensions, topologies, and geometries —
On local antimagic chromatic number of graphs with cut-vertices
Exact asymptotic formulae of the stationary distribution of a discrete-time 2d-QBD process: an example and additional proofs
Zero-Shot Dialog Generation with Cross-Domain Latent Actions
Curriculum Adversarial Training
Networked Microgrids for Improving Economics and Resiliency
Triangular Architecture for Rare Language Translation
Optimal Allocation of Series FACTS Devices Under High Penetration of Wind Power Within a Market Environment
Fast and Scalable Group Mutual Exclusion
Closed-form expression for finite predictor coefficients of vector ARMA processes
An attention-based Bi-GRU-CapsNet model for hypernymy detection between compound entities
Enhanced Signal Recovery via Sparsity Inducing Image Priors
Spatial Uncertainty Sampling for End-to-End Control
Exact size counting in uniform population protocols in nearly logarithmic time
Hierarchical Neural Story Generation
Approximations of Mappings
Building Language Models for Text with Named Entities
Deterministic Blind Radio Networks
An Almost Tight RMR Lower Bound for Abortable Test-And-Set
Randomized Communication Without Network Knowledge
Learning to Ask Questions in Open-domain Conversational Systems with Typed Decoders
An interface-unfitted finite element method for elliptic interface optimal control problem
Covariance Pooling For Facial Expression Recognition
A Global, Continuous, and Exponentially Convergent Observer for Attitude and Gyro Bias
SPAIDS and OAMS Models in Wireless Ad Hoc Networks
Strategy-Proof Incentives for Predictions
Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization
Bag-of-Words as Target for Neural Machine Translation
Kolmogorov-Sinai entropy and dissipation in driven classical Hamiltonian systems
On-the-fly Table Generation
UnibucKernel Reloaded: First Place in Arabic Dialect Identification for the Second Year in a Row
On the combinatorics of circular codes
On local antimagic chromatic number of cycle-related join graphs
Regularity Properties of the Stochastic Flow of a Skew Fractional Brownian Motion
Cartesian Magicness of 3-Dimensional Boards
The estimate of $\chi^2$ distance between binomial and generalized binomial distributions
Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering
Distributional stability and deterministic equilibrium selection under heterogeneous evolutionary dynamics
Heterogeneity and aggregation in evolutionary dynamics: a general framework without aggregability
Gains in evolutionary dynamics: a unifying approach to stability for contractive games and ESS
A Bayesian semiparametric framework for causal inference in high-dimensional data
LMNet: Real-time Multiclass Object Detection on CPU using 3D LiDARs
Comprehensive Supersense Disambiguation of English Prepositions and Possessives
On the Practical Computational Power of Finite Precision RNNs for Language Recognition
About some extended Erlang-Sevast’yanov queueing system and its convergence rate (English and Russian versions)
Community Detection by Information Flow Simulation
Almost $\mathcal{R}$-trivial monoids are almost never Ramsey
An Enhanced MPPT Method based on ANN-assisted Sequential Monte Carlo and Quickest Change Detection
Fast Multidimensional Asymptotic and Approximate Consensus
Emergence and Evolution of Hierarchical Structure in Complex Systems
A Tempt to Unify Heterogeneous Driving Databases using Traffic Primitives
Lehmer Transform and its Theoretical Properties
Dyna: A Method of Momentum for Stochastic Optimization
The Global Optimization Geometry of Shallow Linear Neural Networks
On the Continuity of Center-Outward Distribution and Quantile Functions
DeLS-3D: Deep Localization and Segmentation with a 3D Semantic Map
Learning Rich Features for Image Manipulation Detection
Ramsey theory without pigeonhole principle and the adversarial Ramsey principle
Compressive sensing on diverse STEM scans: real-time feedback, low-dose and dynamic range
Accelerating Message Passing for MAP with Benders Decomposition
Mean field limits for non-Markovian interacting particles: convergence to equilibrium, GENERIC formalism, asymptotic limits and phase transitions
A matroid extension result
Multi-Agent Path Finding with Deadlines: Preliminary Results
A Matrix Representation of the Multiple Vehicle Routing Problem for Pickup and Delivery
On the $k$-partition dimension of graphs
Learning Temporal Strategic Relationships using Generative Adversarial Imitation Learning
Optimal Human Navigation in Steep Terrain: a Hamilton-Jacobi-Bellman Approach
Unifying and Merging Well-trained Deep Neural Networks for Inference Stage
Multiple Access Computational Offloading: Communication Resource Allocation in the Two-User Case (Extended Version)
Index Set Fourier Series Features for Approximating Multi-dimensional Periodic Kernels
Multiple Antenna Aided NOMA in UAV Networks: A Stochastic Geometry Approach
Word learning and the acquisition of syntactic–semantic overhypotheses
Enumerating sparse uniform hypergraphs with given degree sequence and forbidden edges
Utilizing Probase in Open Directory Project-based Text Classification
Discourse Coherence in the Wild: A Dataset, Evaluation and Methods
How can the score test be consistent
Double-Spending Risk Quantification in Private, Consortium and Public Ethereum Blockchains
Collaborative Item Embedding Model for Implicit Feedback Data
Square-free graphs with no six-vertex induced path
Integrating Hypertension Phenotype and Genotype with Hybrid Non-negative Matrix Factorization
Deep Decision Trees for Discriminative Dictionary Learning with Adversarial Multi-Agent Trajectories
Detecting Adversarial Samples for Deep Neural Networks through Mutation Testing
Last-mile shared delivery: A discrete sequential packing approach
Large deviations in a constrained two-dimensional stochastic process with one-dimensional KPZ fluctuations
An efficient and robust method for analyzing population pharmacokinetic data in genome-wide pharmacogenomic studies: a generalized estimating equation approach
An upper bound on the smallest singular value of a square random matrix
Learning Dual Convolutional Neural Networks for Low-Level Vision
A One-Class Decision Tree Based on Kernel Density Estimation
Cutoff for product replacement on finite groups
Multi-view Common Component Discriminant Analysis for Cross-view Classification
Asymptotic behavior of Betti numbers of random geometric complexes
A stochastic SIR model on a graph with epidemiological and population dynamics occurring over the same time scale
A Deep Learning Approach with an Attention Mechanism for Automatic Sleep Stage Classification
Recursive Discrete-Time Models for Continuous-Time Systems Under Band-Limited Assumptions
A Debt Management Problem with Currency Devaluation
A duality formula and a particle Gibbs sampler for continuous time Feynman-Kac measures on path spaces
Improved Reconciliation With Polar Codes In Quantum Key Distribution
Triclustering of Gene Expression Microarray data using Evolutionary Approach
Consistency of Variational Bayes Inference for Estimation and Model Selection in Mixtures
Token-level and sequence-level loss smoothing for RNN language models
EP-based turbo detection for MIMO receivers and large-scale systems
The Finite Sample Performance of Treatment Effects Estimators based on the Lasso
A Two-stage Approach to Estimate CFO and Channel with One-bit ADCs
KL-UCB-switch: optimal regret bounds for stochastic bandits from both a distribution-dependent and a distribution-free viewpoints
Discrete dividend payments in continuous time
Monte Carlo for high-dimensional degenerated Semi Linear and Full Non Linear PDEs
Constructing Narrative Event Evolutionary Graph for Script Event Prediction
User Blocking Considered Harmful An Attacker-controllable Side Channel to Identify Social Accounts
Unsupervised Intuitive Physics from Visual Observations
Parser Training with Heterogeneous Treebanks
Hyperspectral Data Analysis in R: the hsdar Package
The Spot the Difference corpus: a multi-modal corpus of spontaneous task oriented spoken interactions
Prophets and Secretaries with Overbooking
Bianet: A Parallel News Corpus in Turkish, Kurdish and English
Distributing Complexity: A New Approach to Antenna Selection for Distributed Massive MIMO
Hu-Fu: Hardware and Software Collaborative Attack Framework against Neural Networks
Sparse Convolutional Beamforming for Ultrasound Imaging
Structure and glass-forming ability of simulated Ni-Zr alloys
The duration of an $SIR$ epidemic on a configuration model
Exploiting the Value of the Center-dark Channel Prior for Salient Object Detection
Model selection with lasso-zero: adding straw to the haystack to better find needles
Note on Reverse Pinsker Inequalities
Gracefully Degrading Gathering in Dynamic Rings
Phase field models for two-dimensional branched transportation problems
A Twitter Tale of Three Hurricanes: Harvey, Irma, and Maria
On the 2-Vertex Fault Hamiltonicity for Graphs satisfying Ore’s Theorem
Domain Adaptation with Adversarial Training and Graph Embeddings
Early Scheduling in Parallel State Machine Replication
A population protocol for exact majority with $O(\log^{5/3} n)$ stabilization time and asymptotically optimal number of states
FastLORS: Joint Modeling for eQTL Mapping in R
Unpaired Sentiment-to-Sentiment Translation: A Cycled Reinforcement Learning Approach
Generative Adversarial Forests for Better Conditioned Adversarial Learning
Structural Dependence of Chemical Durability in Modified Aluminoborate Glasses
Cyclic permutations avoiding pairs of patterns of length three
Fork and Join Queueing Networks with Heavy Tails: Scaling Dimension and Throughput Limit
BioPhysical Modeling, Characterization and Optimization of Electro-Quasistatic Human Body Communication
A Dynamic Oracle for Linear-Time 2-Planar Dependency Parsing
Linear bounds on nowhere-zero group irregularity strength and nowhere-zero group sum chromatic number of graphs
Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable
Scaling advantages of all-to-all connectivity in physical annealers: the Coherent Ising Machine vs. D-Wave 2000Q
RETURNN as a Generic Flexible Neural Toolkit with Application to Translation and Speech Recognition
Assembling Omnitigs using Hidden-Order de Bruijn Graphs
Maximizing Expected Impact in an Agent Reputation Network — Technical Report
Bayesian forecasting of many count-valued time series
Decomposition of quantitative Gaifman graphs as a data analysis tool
Effects of Word Embeddings on Neural Network-based Pitch Accent Detection
Cross-intersecting subfamilies of levels of hereditary families
Nonautonomous Dynamics of Acute Cell Injury
Blockchain to Improve Security and Knowledge in Inter-Agent Communication and Collaboration over a Restrict Domains of the Internet Infrastructure
Monotonous subsequences, the Robinson-Schensted correspondence and the descent process of some central measures on the symmetric group
PAPR Analysis for Dual-Polarization FBMC
Normal Similarity Network for Generative Modelling
Iterative Bounded Distance Decoding of Product Codes with Scaled Reliability
Unsupervised Abstractive Meeting Summarization with Multi-Sentence Compression and Budgeted Submodular Maximization
On Neighbourhood Zagreb index of product graphs
On the Nature of Localization in Ti doped Si
A 3D Parallel Algorithm for QR Decomposition
Biased partitions of $\mathbb{Z}^n$
AMR Parsing as Graph Prediction with Latent Alignment
A Cost-Effective Framework for Preference Elicitation and Aggregation
Note on the geodesic Monte Carlo
Strong Skolem Starters
Functions with large additive energy supported on a Hamming Sphere
Transforming graph states using single-qubit operations
How to transform graph states using single-qubit operations: computational complexity and algorithms
Cycle-Dehaze: Enhanced CycleGAN for Single Image Dehazing
An application of the theory of FI-algebras to graph configuration spaces
Fuss-Schröder Paths and Rooted Plane Forests