SHOPPER: A Probabilistic Model of Consumer Choice with Substitutes and Complements

We develop SHOPPER, a sequential probabilistic model of market baskets. SHOPPER uses interpretable components to model the forces that drive how a customer chooses products; in particular, we designed SHOPPER to capture how items interact with other items. We develop an efficient posterior inference algorithm to estimate these forces from large-scale data, and we analyze a large dataset from a major chain grocery store. We are interested in answering counterfactual queries about changes in prices. We found that SHOPPER provides accurate predictions even under price interventions, and that it helps identify complementary and substitutable pairs of products.

Interpretable Vector AutoRegressions with Exogenous Time Series

The Vector AutoRegressive (VAR) model is fundamental to the study of multivariate time series. Although VAR models are intensively investigated by many researchers, practitioners often show more interest in analyzing VARX models that incorporate the impact of unmodeled exogenous variables (X) into the VAR. However, since the parameter space grows quadratically with the number of time series, estimation quickly becomes challenging. While several proposals have been made to sparsely estimate large VAR models, the estimation of large VARX models is under-explored. Moreover, typically these sparse proposals involve a lasso-type penalty and do not incorporate lag selection into the estimation procedure. As a consequence, the resulting models may be difficult to interpret. In this paper, we propose a lag-based hierarchically sparse estimator, called ‘HVARX’, for large VARX models. We illustrate the usefulness of HVARX on a cross-category management marketing application. Our results show how it provides a highly interpretable model, and improves out-of-sample forecast accuracy compared to a lasso-type approach.
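
For reference, a common way to write a VARX model is sketched below. The notation is an assumption made here for illustration and is not taken from the abstract: $y_t$ collects the $k$ endogenous series, $x_t$ the $m$ exogenous series, $p$ and $s$ are the maximal lags, and $u_t$ is the error term.

```latex
\[
  y_t = \nu + \sum_{\ell=1}^{p} \Phi_\ell \, y_{t-\ell}
            + \sum_{j=1}^{s} B_j \, x_{t-j} + u_t ,
  \qquad \Phi_\ell \in \mathbb{R}^{k \times k}, \quad B_j \in \mathbb{R}^{k \times m}.
\]
```

Under this notation the coefficient matrices hold $k^2 p + k m s$ entries, so the autoregressive part grows quadratically in $k$, which is the growth the abstract refers to.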

Integrating User and Agent Models: A Deep Task-Oriented Dialogue System

Task-oriented dialogue systems can efficiently serve a large number of customers and relieve people of tedious work. However, existing task-oriented dialogue systems depend on handcrafted actions and states or extra semantic labels, which sometimes degrades user experience despite the intensive human intervention. Moreover, current user simulators have limited expressive ability, so deep reinforcement Seq2Seq models have to rely on self-play and only work in some special cases. To address these problems, we propose a uSer and Agent Model IntegrAtion (SAMIA) framework, inspired by the observation that the roles of the user and agent models are asymmetric. First, the SAMIA framework models the user model as a Seq2Seq learning problem instead of ranking or designing rules. The resulting user model is then used as leverage to train the agent model by deep reinforcement learning. In the test phase, the output of the agent model is filtered by the user model to enhance stability and robustness. Experiments on a real-world coffee ordering dataset verify the effectiveness of the proposed SAMIA framework.

Online Deep Learning: Learning Deep Neural Networks on the Fly

Deep Neural Networks (DNNs) are typically trained by backpropagation in a batch learning setting, which requires the entire training data to be made available prior to the learning task. This is not scalable for many real-world scenarios where new data arrives sequentially as a stream. We aim to address an open challenge of ‘Online Deep Learning’ (ODL) for learning DNNs on the fly in an online setting. Unlike traditional online learning, which often optimizes a convex objective function with respect to a shallow model (e.g., a linear/kernel-based hypothesis), ODL is significantly more challenging since the optimization of the DNN objective function is non-convex, and regular backpropagation does not work well in practice, especially in online learning settings. In this paper, we present a new online deep learning framework that attempts to tackle these challenges by learning DNN models of adaptive depth from a sequence of training data in an online learning setting. In particular, we propose a novel Hedge Backpropagation (HBP) method for effectively updating the parameters of a DNN online, and validate the efficacy of our method on large-scale data sets, including both stationary and concept drifting scenarios.
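
The abstract does not spell out how HBP works. Its name suggests the classic Hedge (multiplicative weights) algorithm, so the sketch below shows only that generic update over a set of experts, not the authors' method. The function name `hedge_update` and the default `beta` are assumptions chosen for illustration.

```python
def hedge_update(weights, losses, beta=0.9):
    """One generic Hedge step: w_i <- w_i * beta**loss_i, then renormalise.

    weights: current non-negative expert weights (assumed to sum to 1)
    losses:  per-expert losses for this round, assumed to lie in [0, 1]
    beta:    discount in (0, 1); smaller values punish losses more strongly
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie in (0, 1)")
    # Experts with larger loss are shrunk by a larger factor.
    scaled = [w * beta ** loss for w, loss in zip(weights, losses)]
    total = sum(scaled)
    # Renormalise so the weights remain a probability distribution.
    return [s / total for s in scaled]
```

Calling `hedge_update([0.5, 0.5], [0.0, 1.0], beta=0.5)` shifts weight towards the first expert, which incurred no loss in that round.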

Quantized Memory-Augmented Neural Networks

Memory-augmented neural networks (MANNs) refer to a class of neural network models equipped with external memory (such as neural Turing machines and memory networks). These neural networks outperform conventional recurrent neural networks (RNNs) in terms of learning long-term dependency, allowing them to solve intriguing AI tasks that would otherwise be hard to address. This paper concerns the problem of quantizing MANNs. Quantization is known to be effective when we deploy deep models on embedded systems with limited resources. Furthermore, quantization can substantially reduce the energy consumption of the inference procedure. These benefits justify recent developments of quantized multilayer perceptrons, convolutional networks, and RNNs. However, no prior work has reported the successful quantization of MANNs. The in-depth analysis presented here reveals various challenges that do not appear in the quantization of the other networks. Without addressing them properly, quantized MANNs would normally suffer from excessive quantization error, which leads to degraded performance. In this paper, we identify memory addressing (specifically, content-based addressing) as the main reason for the performance degradation and propose a robust quantization method for MANNs to address the challenge. In our experiments, we achieved a computation-energy gain of 22x with 8-bit fixed-point and binary quantization compared to the floating-point implementation. Measured on the bAbI dataset, the resulting model, named the quantized MANN (Q-MANN), improved the error rate by 46% and 30% with 8-bit fixed-point and binary quantization, respectively, compared to the MANN quantized using conventional techniques.
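
To make "8-bit fixed-point quantization" concrete, here is a minimal sketch of generic signed fixed-point rounding with saturation. It is not the Q-MANN method, which the abstract does not detail. The function name and the split into 4 fractional bits are assumptions for illustration.

```python
def quantize_fixed_point(x, total_bits=8, frac_bits=4):
    """Round x to a signed fixed-point grid and return the dequantised value.

    total_bits: word length including the sign bit (assumed two's complement)
    frac_bits:  number of fractional bits, so the grid step is 2**-frac_bits
    """
    scale = 1 << frac_bits
    q_min = -(1 << (total_bits - 1))       # most negative integer code
    q_max = (1 << (total_bits - 1)) - 1    # most positive integer code
    q = round(x * scale)                   # Python rounds ties to even
    q = max(q_min, min(q_max, q))          # saturate instead of wrapping
    return q / scale
```

With these defaults the representable range is [-8.0, 7.9375] in steps of 0.0625, so values outside that range saturate and values inside it carry a rounding error of at most half a step.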

Joint Sentiment/Topic Modeling on Text Data Using Boosted Restricted Boltzmann Machine

With the development of the Internet and the Web, social media such as web blogs have become an immense source of text data. Processing these data makes it possible to discover practical information about different topics and individuals' opinions, and to gain a thorough understanding of society. Models that automatically extract subjective information from documents would therefore be efficient and helpful. Topic modeling and sentiment analysis are among the most widely discussed topics in natural language processing and text mining. In this paper, we propose a new structure for joint sentiment-topic modeling based on the Restricted Boltzmann Machine (RBM), a type of neural network. By modifying the structure of the RBM and appending a layer that corresponds to the sentiment of the text data, we obtain a generative structure for joint sentiment-topic modeling based on neural networks. The proposed method is supervised and trained by the Contrastive Divergence algorithm. The newly attached layer has a multinomial probability distribution and can be used for sentiment classification of text data or any other supervised application. In the experiments, the proposed model is compared with existing models as a generative model, in sentiment classification, and in information retrieval, and the results demonstrate the efficiency of the method.

Estimation of Cusp Location of Stochastic Processes: a Survey

We present a review of some recent results on estimation of the location parameter for several models of observations with a cusp-type singularity at the change point. We suppose that cusp-type models fit the real phenomena usually described by change-point models better. The list of models includes Gaussian, inhomogeneous Poisson, ergodic diffusion processes, time series and the classical case of i.i.d. observations. We describe the properties of the maximum likelihood and Bayes estimators under some asymptotic assumptions. The asymptotic efficiency of the estimators is discussed as well, and the results of some numerical simulations are presented. We provide some heuristic arguments which demonstrate the convergence of log-likelihood ratios in the models under consideration to the fractional Brownian motion.

YEDDA: A Lightweight Collaborative Text Span Annotation Tool

In this paper, we introduce YEDDA, a lightweight but efficient open-source tool for text span annotation. YEDDA provides a systematic solution for text span annotation, ranging from collaborative user annotation to administrator evaluation and analysis. It overcomes the low efficiency of traditional text annotation tools by annotating entities through both command line and shortcut keys, which are configurable with custom labels. YEDDA also gives intelligent recommendations by training a predictive model using the up-to-date annotated text. An administrator client is developed to evaluate the annotation quality of multiple annotators and generate a detailed comparison report for each annotator pair. YEDDA is developed based on Tkinter and is compatible with all major operating systems.

Robotic Tactile Perception of Object Properties: A Review

Touch sensing can help robots understand their surrounding environment, and in particular the objects they interact with. To this end, roboticists have, in the last few decades, developed several tactile sensing solutions, extensively reported in the literature. Research into interpreting the conveyed tactile information has also started to attract increasing attention in recent years. However, a comprehensive study on this topic is yet to be reported. In an effort to collect and summarize the major scientific achievements in the area, this survey extensively reviews current trends in robot tactile perception of object properties. Available tactile sensing technologies are briefly presented before an extensive review on tactile recognition of object properties. The object properties that are targeted by this review are shape, surface material and object pose. The role of touch sensing in combination with other sensing sources is also discussed. In this review, open issues are identified and future directions for applying tactile sensing in different tasks are suggested.

LSTM Networks for Data-Aware Remaining Time Prediction of Business Process Instances

Predicting the completion time of business process instances would be a very helpful aid when managing processes under service level agreement constraints. The ability to know in advance the trend of running process instances would allow business managers to react in time, in order to prevent delays or undesirable situations. However, making such accurate forecasts is not easy: many factors may influence the time required to complete a process instance. In this paper, we propose an approach based on deep Recurrent Neural Networks (specifically LSTMs) that is able to exploit arbitrary information associated with single events, in order to produce an as-accurate-as-possible prediction of the completion time of running instances. Experiments on real-world datasets confirm the quality of our proposal.

GPflowOpt: A Bayesian Optimization Library using TensorFlow

A novel Python framework for Bayesian optimization known as GPflowOpt is introduced. The package is based on the popular GPflow library for Gaussian processes, leveraging the benefits of TensorFlow including automatic differentiation, parallelization and GPU computations for Bayesian optimization. Design goals focus on a framework that is easy to extend with custom acquisition functions and models. The framework is thoroughly tested and well documented, and provides scalability. The current released version of GPflowOpt includes some standard single-objective acquisition functions, the state-of-the-art max-value entropy search, as well as a Bayesian multi-objective approach. Finally, it permits easy use of custom modeling strategies implemented in GPflow.

Towards the Use of Deep Reinforcement Learning with Global Policy For Query-based Extractive Summarisation

Supervised approaches for text summarisation suffer from the problem of mismatch between the target labels/scores of individual sentences and the evaluation score of the final summary. Reinforcement learning can solve this problem by providing a learning mechanism that uses the score of the final summary as a guide to determine the decisions made at the time of selection of each sentence. In this paper we present a proof-of-concept approach that applies a policy-gradient algorithm to learn a stochastic policy using an undiscounted reward. The method has been applied to a policy consisting of a simple neural network and simple features. The resulting deep reinforcement learning system is able to learn a global policy and obtain encouraging results.
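
A rough sketch of the kind of update a policy-gradient method with an undiscounted final reward performs is given below. It uses a plain REINFORCE estimate with a logistic policy as a stand-in for the paper's network, so the policy form, the function names and the feature layout are all assumptions and not the authors' implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def reinforce_gradient(theta, features, actions, reward):
    """REINFORCE estimate reward * sum_t grad log pi(a_t | x_t) for one episode.

    Hypothetical setup: a logistic policy selects sentence t with
    probability sigmoid(theta . x_t); actions[t] is 1 if it was selected.
    reward is the score of the final summary, applied without discounting.
    """
    grad = [0.0] * len(theta)
    for x, a in zip(features, actions):
        p = sigmoid(sum(w * xi for w, xi in zip(theta, x)))
        # For a Bernoulli(p) policy, d/dtheta log pi(a | x) = (a - p) * x.
        for i, xi in enumerate(x):
            grad[i] += (a - p) * xi
    # One end-of-episode reward weights every selection decision equally.
    return [reward * g for g in grad]
```

A gradient-ascent step would then add a small multiple of this vector to `theta`, so decisions made in episodes with a high summary score become more likely.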

Neural-Symbolic Learning and Reasoning: A Survey and Interpretation

The study and understanding of human behaviour is relevant to computer science, artificial intelligence, neural computation, cognitive science, philosophy, psychology, and several other areas. Presupposing cognition as the basis of behaviour, among the most prominent tools in the modelling of behaviour are computational-logic systems, connectionist models of cognition, and models of uncertainty. Recent studies in cognitive science, artificial intelligence, and psychology have produced a number of cognitive models of reasoning, learning, and language that are underpinned by computation. In addition, efforts in computer science research have led to the development of cognitive computational systems integrating machine learning and automated reasoning. Such systems have shown promise in a range of applications, including computational biology, fault diagnosis, training and assessment in simulators, and software verification. This joint survey reviews the personal ideas and views of several researchers on neural-symbolic learning and reasoning. The article is organised in three parts: First, we frame the scope and goals of neural-symbolic computation and look at its theoretical foundations. We then describe the realisations of neural-symbolic computation, systems, and applications. Finally, we present the challenges facing the area and avenues for further research.

Attend and Diagnose: Clinical Time Series Analysis using Attention Models

With widespread adoption of electronic health records, there is an increased emphasis on predictive models that can effectively deal with clinical time-series data. Powered by Recurrent Neural Network (RNN) architectures with Long Short-Term Memory (LSTM) units, deep neural networks have achieved state-of-the-art results in several clinical prediction tasks. Despite the success of RNNs, their sequential nature prohibits parallelized computing, making them inefficient, particularly when processing long sequences. Recently, architectures which are based solely on attention mechanisms have shown remarkable success in transduction tasks in NLP, while being computationally superior. In this paper, for the first time, we utilize attention models for clinical time-series modeling, thereby dispensing with recurrence entirely. We develop the SAnD (Simply Attend and Diagnose) architecture, which employs a masked self-attention mechanism, and uses positional encoding and dense interpolation strategies for incorporating temporal order. Furthermore, we develop a multi-task variant of SAnD to jointly infer models with multiple diagnosis tasks. Using the recent MIMIC-III benchmark datasets, we demonstrate that the proposed approach achieves state-of-the-art performance in all tasks, outperforming LSTM models and classical baselines with hand-engineered features.

D-SLATS: Distributed Simultaneous Localization and Time Synchronization

Over the last decade, we have witnessed a surge of Internet of Things (IoT) devices, and with that a greater need to choreograph their actions across both time and space. Although these two problems, namely time synchronization and localization, have many aspects in common, they are traditionally treated separately or combined in centralized approaches that result in an inefficient use of resources, or in solutions that are not scalable in terms of the number of IoT devices. Therefore, we propose D-SLATS, a framework comprised of three different and independent algorithms to jointly solve time synchronization and localization problems in a distributed fashion. The first two algorithms are based mainly on the distributed Extended Kalman Filter (EKF), whereas the third one uses optimization techniques. No fusion center is required, and the devices only communicate with their neighbors. The proposed methods are evaluated on a custom ultra-wideband communication testbed and a quadrotor, representing a network of both static and mobile nodes. Our algorithms achieve up to three microseconds time synchronization accuracy and 30 cm localization error.

Finite Sample Differentially Private Confidence Intervals

We study the problem of estimating finite sample confidence intervals of the mean of a normal population under the constraint of differential privacy. We consider both the known and unknown variance cases and construct differentially private algorithms to estimate confidence intervals. Crucially, our algorithms guarantee a finite sample coverage, as opposed to an asymptotic coverage. Unlike most previous differentially private algorithms, we do not require the domain of the samples to be bounded. We also prove lower bounds on the expected size of any differentially private confidence set, showing that our parameters are optimal up to polylogarithmic factors.

Lurking Variable Detection via Dimensional Analysis

Lurking variables represent hidden information, and preclude a full understanding of phenomena of interest. Detection is usually based on serendipity — visual detection of unexplained, systematic variation. However, these approaches are doomed to fail if the lurking variables do not vary. In this article, we address these challenges by introducing formal hypothesis tests for the presence of lurking variables, based on Dimensional Analysis. These procedures utilize a modified form of the Buckingham Pi theorem to provide structure for a suitable null hypothesis. We present analytic tools for reasoning about lurking variables in physical phenomena, construct procedures to handle cases of increasing complexity, and present examples of their application to engineering problems. The results of this work enable algorithm-driven lurking variable detection, complementing a traditionally inspection-based approach.

Bayesian Paragraph Vectors

Word2vec (Mikolov et al., 2013) has proven to be successful in natural language processing by capturing the semantic relationships between different words. Built on top of single-word embeddings, paragraph vectors (Le and Mikolov, 2014) find fixed-length representations for pieces of text with arbitrary lengths, such as documents, paragraphs, and sentences. In this work, we propose a novel interpretation for neural-network-based paragraph vectors by developing an unsupervised generative model whose maximum likelihood solution corresponds to traditional paragraph vectors. This probabilistic formulation allows us to go beyond point estimates of parameters and to perform Bayesian posterior inference. We find that the entropy of paragraph vectors decreases with the length of documents, and that information about posterior uncertainty improves performance in supervised learning tasks such as sentiment analysis and paraphrase detection.

Estimating the Entropy Rate of Finite Markov Chains with Application to Behavior Studies

Predictability of behavior has emerged as an important characteristic in many fields including biology, medicine, and marketing. Behavior can be recorded as a sequence of actions performed by an individual over a given time period. This sequence of actions can often be modeled as a stationary time-homogeneous Markov chain and the predictability of the individual’s behavior can be quantified by the entropy rate of the process. This paper provides a comprehensive investigation of three estimators of the entropy rate of finite Markov processes and a bootstrap procedure for providing standard errors. The first two methods directly estimate the entropy rate through estimates of the transition matrix and stationary distribution of the process; the methods differ in the technique used to estimate the stationary distribution. The third method is related to the sliding-window Lempel-Ziv (SWLZ) compression algorithm. The first two methods achieve consistent estimates of the true entropy rate for reasonably short observed sequences, but are limited by requiring a priori specification of the order of the process. The method based on the SWLZ algorithm does not require specifying the order of the process and is optimal in the limit of an infinite sequence, but is biased for short sequences. When used together, the methods can provide a clear picture of the entropy rate of an individual’s behavior.
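
A plug-in estimate of the kind described for the first two methods can be sketched as follows. The sketch assumes a first-order chain with states coded as integers 0..n-1 and obtains the stationary distribution by power iteration, which presumes the estimated chain is irreducible and aperiodic. The function names are illustrative, and the paper's exact estimators may differ.

```python
import math


def estimate_transition_matrix(seq, n_states):
    """Row-normalised first-order transition counts from an observed sequence."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1
    P = []
    for row in counts:
        total = sum(row)
        # States never left in the data get a uniform row (an arbitrary choice
        # made here so that the matrix stays stochastic).
        P.append([c / total for c in row] if total else [1.0 / n_states] * n_states)
    return P


def stationary_distribution(P, iters=1000):
    """Power iteration: repeatedly apply pi <- pi P, starting from uniform."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def entropy_rate(P, pi):
    """H = -sum_i pi_i sum_j P_ij log2 P_ij, in bits per step."""
    return -sum(
        pi[i] * p * math.log2(p)
        for i, row in enumerate(P)
        for p in row
        if p > 0
    )
```

A perfectly predictable sequence such as 0, 1, 0, 1, ... yields an entropy rate of zero, while a chain whose rows are all uniform over two states yields one bit per step.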

Exploiting ConvNet Diversity for Flooding Identification
On Colorful Bin Packing Games
Deep Neural Networks for Physics Analysis on low-level whole-detector data at the LHC
Gaussian Mean Fields Lattice Gas
Geometry-constrained Degrees of Freedom Analysis for Imaging Systems: Monostatic and Multistatic
The stratified micro-randomized trial design: sample size considerations for testing nested causal effects of time-varying treatments
Fast matrix-free evaluation of discontinuous Galerkin finite element operators
Efficient-UCBV: An Almost Optimal Algorithm using Variance Estimates
Action Centered Contextual Bandits
The Wiener polarity index of benzenoid systems and nanotubes
The Lifted Matrix-Space Model for Semantic Composition
The generalized front-door criterion for estimation of indirect causal effects of a confounded treatment
Debiasing the Debiased Lasso with Bootstrap
Realizations and Factorizations of Positive Definite Kernels
Roots of random functions
Swarming in domains with boundaries: approximation and regularization by nonlinear diffusion
Alternating minimization for dictionary learning with random initialization
Learning and Real-time Classification of Hand-written Digits With Spiking Neural Networks
A Provable Approach for Double-Sparse Coding
Small-loss bounds for online learning with partial information
Stochastic Deep Learning in Memristive Networks
Optimal portfolio with insider information on the stochastic interest rate
Geometric Ergodicity in a Weighted Sobolev Space
Poverty Prediction with Public Landsat 7 Satellite Imagery and Machine Learning
Traffic Analysis with Deep Learning
A Latent Space Model for Cognitive Social Structures Data
Unsupervised Learning of Geometry with Edge-aware Depth-Normal Consistency
Bayesian Gaussian models for interpolating large-dimensional data at misaligned areal units
A primal-dual smoothing gap reduction framework for strongly convex-generally concave saddle point problems
Breast density classification with deep convolutional neural networks
Communicative Capital for Prosthetic Agents
Egocentric Hand Detection Via Dynamic Region Growing
Self-Supervised Intrinsic Image Decomposition
On an error-correcting code problem
Covert Communications with A Full-Duplex Receiver over Wireless Fading Channels
Document Context Neural Machine Translation with Memory Networks
Reinforcement Learning of Speech Recognition System Based on Policy Gradient and Hypothesis Selection
A Fully Convolutional Tri-branch Network (FCTN) for Domain Adaptation
Time-dependent spatially varying graphical models, with application to brain fMRI data analysis
A Complete Semidefinite Algorithm for Detecting Copositive Matrices and Tensors
Learning under $p$-Tampering Attacks
Synchronization of Kuramoto Oscillators via Cutset Projections
Robust Multi-Objective Portfolio Optimization Using Bertsimas Method
Frieze patterns over integers and other subsets of the complex numbers
Granular materials flow like complex fluids
Saliency Prediction for Mobile User Interfaces
On a Class of Singular Stochastic Control Problems for Reflected Diffusions
Efficient Simulation for Portfolio Credit Risk in Normal Mixture Copula Models
Chance-constrained optimization with tight confidence bounds
Lattice embeddings between types of fuzzy sets. Closed-valued fuzzy sets
Neural Skill Transfer from Supervised Language Tasks to Reading Comprehension
A Novel Bayesian Multiple Testing Approach to Deregulated miRNA Discovery Harnessing Positional Clustering
On the p-th mean S-asymptotically omega periodic solution for some Stochastic Evolution Equation driven by Q-Brownian motion
Cloud vs Edge Computing for Mobile Services: Delay-aware Decision Making to Minimize Energy Consumption
Long directed rainbow cycles and rainbow spanning trees
A Theoretical Analysis of Sparse Recovery Stability of Dantzig Selector and LASSO
Z2Z4-Additive Cyclic Codes: Kernel and Rank
Centralized Coded Caching of Correlated Contents
Tracking Multiple Vehicles Using a Variational Radar Model
Object Referring in Visual Scene with Spoken Language
Existence of Small Separators Depends on Geometry for Geometric Inhomogeneous Random Graphs
Learning with Options that Terminate Off-Policy
Cooperative control of multi-agent systems to locate source of an odor
Inexactness of the Hydro-Thermal Coordination Semidefinite Relaxation
On Time-of-Arrival Estimation in NB-IoT Systems
Modeling Asymmetric Relationships from Symmetric Networks
Uniform asymptotic stability of switched nonlinear time-varying systems and detectability of reduced limiting control systems
Packing coloring of Sierpiński-type graphs
Size bounds and query plans for relational joins
Robust Clustering with Subpopulation-specific Deviations
Clustering with Local Restrictions
Completely inapproximable monotone and antimonotone parameterized problems
Hamming distance completeness and sparse matrix multiplication
In-Depth Exploration of Single-Snapshot Lossy Compression Techniques for N-Body Simulations
Interpolation and Extrapolation of Toeplitz Matrices via Optimal Mass Transport
Arrhythmia Classification from the Abductive Interpretation of Short Single-Lead ECG Records
On the hardness of losing weight
Group Connectivity: $\mathbb Z_4$ v. $\mathbb Z_2^2$
LDPC-Based Code Hopping for Gaussian Wiretap Channel With Limited Feedback
Parallelogram polyominoes, partially labelled Dyck paths, and the Delta conjecture
A Stochastic Generator of Global Monthly Wind Energy with Tukey $g$-and-$h$ Autoregressive Processes
Accelerated Method for Stochastic Composition Optimization with Nonsmooth Regularization
CARLA: An Open Urban Driving Simulator
Dynamic Analysis of Executables to Detect and Characterize Malware
Manipulative Elicitation — A New Attack on Elections with Incomplete Preferences
Breaking the Softmax Bottleneck: A High-Rank RNN Language Model
EddyNet: A Deep Neural Network For Pixel-Wise Classification of Oceanic Eddies
StreetX: Spatio-Temporal Access Control Model for Data
Testing for observation-dependent regime switching in mixture autoregressive models
Energy Efficiency and Asymptotic Performance Evaluation of Beamforming Structures in Doubly Massive MIMO mmWave Systems