Cold-Start Aware User and Product Attention for Sentiment Classification

The use of user/product information in sentiment analysis is important, especially for cold-start users/products, whose number of reviews is very limited. However, current models do not deal with the cold-start problem, which is typical of review websites. In this paper, we present the Hybrid Contextualized Sentiment Classifier (HCSC), which contains two modules: (1) a fast word encoder that returns word vectors embedded with short- and long-range dependency features; and (2) Cold-Start Aware Attention (CSAA), an attention mechanism that considers the existence of the cold-start problem when attentively pooling the encoded word vectors. HCSC introduces shared vectors that are constructed from similar users/products and are used when the original distinct vectors do not have sufficient information (i.e., cold-start). This is decided by a frequency-guided selective gate vector. Our experiments show that, in terms of RMSE, HCSC performs significantly better than previous models on well-known datasets, despite having less complexity, and thus can be trained much faster. More importantly, our model performs significantly better than previous models when the training data is sparse and has cold-start problems.
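
The gating idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the gate's formula, so the saturating function `1 - exp(-count / scale)`, the `scale` parameter, and the use of a single scalar gate (rather than a gate vector) are all assumptions made for illustration; the function names are hypothetical.

```python
import math

def frequency_gate(review_count, scale=5.0):
    """Hypothetical gate in [0, 1): 0 with no reviews, approaching 1 with many.

    Assumption: the abstract only says the gate is frequency-guided;
    this particular saturating form is illustrative.
    """
    return 1.0 - math.exp(-review_count / scale)

def blend_vectors(distinct, shared, review_count, scale=5.0):
    """Mix an entity's own (distinct) vector with a shared vector from similar entities."""
    g = frequency_gate(review_count, scale)
    return [g * d + (1.0 - g) * s for d, s in zip(distinct, shared)]

distinct = [1.0, 0.0]  # toy vector learned from the user's own reviews
shared = [0.0, 1.0]    # toy vector built from similar users

cold = blend_vectors(distinct, shared, review_count=0)   # relies on the shared vector
warm = blend_vectors(distinct, shared, review_count=50)  # relies on the distinct vector
print(cold)
print(warm)
```

With no reviews the blend falls back entirely on the shared vector; as the review count grows, the distinct vector dominates.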

Entity Commonsense Representation for Neural Abstractive Summarization

A major proportion of a text summary includes important entities found in the original text. These entities build up the topic of the summary. Moreover, they hold commonsense information once they are linked to a knowledge base. Based on these observations, this paper investigates the usage of linked entities to guide the decoder of a neural text summarizer to generate concise and better summaries. To this end, we leverage an off-the-shelf entity linking system (ELS) to extract linked entities and propose Entity2Topic (E2T), a module easily attachable to a sequence-to-sequence model that transforms a list of entities into a vector representation of the topic of the summary. Currently available ELSs are still not sufficiently effective, possibly introducing unresolved ambiguities and irrelevant entities. We resolve the imperfections of the ELS by (a) encoding entities with selective disambiguation, and (b) pooling entity vectors using firm attention. By applying E2T to a simple sequence-to-sequence model with an attention mechanism as the base model, we see significant performance improvements of at least 2 ROUGE points on the Gigaword (sentence to title) and CNN (long document to multi-sentence highlights) summarization datasets.

Regression with Functional Errors-in-Predictors: A Generalized Method-of-Moments Approach

Functional regression is an important topic in functional data analysis. Traditionally, one often assumes that samples of the functional predictor are independent realizations of an underlying stochastic process, and are observed over a grid of points contaminated by independent and identically distributed measurement errors. In practice, however, the dynamical dependence across different curves may exist and the parametric assumption on the measurement error covariance structure could be unrealistic. In this paper, we consider functional linear regression with serially dependent functional predictors, when the contamination of predictors by the measurement error is ‘genuinely functional’ with fully nonparametric covariance structure. Inspired by the fact that the autocovariance operator of observed functional predictors automatically filters out the impact from the unobservable measurement error, we propose a novel autocovariance-based generalized method-of-moments estimate of the slope parameter. The asymptotic properties of the resulting estimators under different functional scenarios are established. We also demonstrate that our proposed method significantly outperforms possible competitors through intensive simulation studies. Finally, the proposed method is applied to a public financial dataset, revealing some interesting findings.

Low-rank geometric mean metric learning

We propose a low-rank approach to learning a Mahalanobis metric from data. Inspired by the recent geometric mean metric learning (GMML) algorithm, we propose a low-rank variant of the algorithm. This allows us to jointly learn a low-dimensional subspace where the data reside and the Mahalanobis metric that appropriately fits the data. Our results show that we compete effectively with GMML at lower ranks.
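
As a rough illustration of what a low-rank Mahalanobis metric computes, the sketch below evaluates the squared distance (x - y)^T M (x - y) with M = L L^T for a d x r factor L. It does not implement GMML or the low-rank learning procedure; the factor `L`, the toy data, and the function names are assumptions chosen for the example.

```python
def project(L, v):
    """Map a d-dimensional vector to r dimensions with a d x r factor L (computes L^T v)."""
    r = len(L[0])
    return [sum(L[i][k] * v[i] for i in range(len(v))) for k in range(r)]

def low_rank_mahalanobis_sq(L, x, y):
    """Squared distance (x - y)^T M (x - y) with M = L L^T, of rank at most r."""
    diff = [a - b for a, b in zip(x, y)]
    z = project(L, diff)
    return sum(c * c for c in z)

# Hypothetical 3 x 2 factor: the metric lives in a 2-dimensional subspace
# and ignores the third coordinate entirely.
L = [[1.0, 0.0],
     [0.0, 2.0],
     [0.0, 0.0]]
x = [1.0, 1.0, 5.0]
y = [0.0, 0.0, -3.0]
dist_sq = low_rank_mahalanobis_sq(L, x, y)
print(dist_sq)  # 5.0
```

Working with the factor L rather than the full d x d matrix M means each distance costs O(d r) instead of O(d^2).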

Selfless Sequential Learning

Sequential learning studies the problem of learning tasks in a sequence with access restricted to the data of the current task. In the setting with a fixed model capacity, the learning process should not be selfish: it should account for later tasks to be added and therefore aim at utilizing a minimum number of neurons, leaving enough capacity for future needs. We explore different regularization strategies and activation functions that could lead to less interference between the different tasks. We show that learning a sparse representation is more beneficial for sequential learning than encouraging parameter sparsity regardless of their corresponding neurons. In particular, we propose a novel regularizer that encourages representation sparsity by means of neural inhibition. It results in few active neurons, which in turn leaves more free neurons to be utilized by upcoming tasks. We combine our regularizer with state-of-the-art lifelong learning methods that penalize changes to important previously learned parts of the network. We show that increased sparsity translates into a performance improvement on the different tasks that are learned in a sequence.
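
One plausible way to penalize co-active neurons is sketched below. The abstract does not state the regularizer's formula, so the pairwise-product penalty, the Gaussian weighting by index distance, and the `sigma` parameter are assumptions used only to illustrate the idea of inhibition-based representation sparsity.

```python
import math

def inhibition_penalty(activations, sigma=1.0):
    """Sum of weighted products of distinct neuron activations in one layer.

    Assumption: neurons that are close in index inhibit each other more
    strongly (Gaussian weight); this is an illustrative form only.
    """
    total = 0.0
    n = len(activations)
    for i in range(n):
        for j in range(n):
            if i != j:
                weight = math.exp(-((i - j) ** 2) / (2.0 * sigma ** 2))
                total += weight * activations[i] * activations[j]
    return total

dense = [1.0, 1.0, 1.0, 1.0]   # every neuron is active
sparse = [2.0, 0.0, 0.0, 0.0]  # a single active neuron

dense_penalty = inhibition_penalty(dense)
sparse_penalty = inhibition_penalty(sparse)
print(dense_penalty, sparse_penalty)
```

Adding such a term to the training loss pushes each example toward using few neurons, whereas a weight-level penalty such as L1 on parameters does not directly constrain how many neurons fire.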

Configurable Markov Decision Processes

In many real-world problems, it is possible to configure, to a limited extent, some environmental parameters to improve the performance of a learning agent. In this paper, we propose a novel framework, Configurable Markov Decision Processes (Conf-MDPs), to model this new type of interaction with the environment. Furthermore, we provide a new learning algorithm, Safe Policy-Model Iteration (SPMI), to jointly and adaptively optimize the policy and the environment configuration. After introducing our approach and deriving some theoretical results, we present an experimental evaluation on two illustrative problems to show the benefits of environment configurability for the performance of the learned policy.

Hierarchical interpretations for neural network predictions

Deep neural networks (DNNs) have achieved impressive predictive performance due to their ability to learn complex, non-linear relationships between variables. However, the inability to effectively visualize these relationships has led to DNNs being characterized as black boxes and consequently limited their applications. To ameliorate this problem, we introduce the use of hierarchical interpretations to explain DNN predictions through our proposed method, agglomerative contextual decomposition (ACD). Given a prediction from a trained DNN, ACD produces a hierarchical clustering of the input features, along with the contribution of each cluster to the final prediction. This hierarchy is optimized to identify clusters of features that the DNN learned are predictive. Using examples from Stanford Sentiment Treebank and ImageNet, we show that ACD is effective at diagnosing incorrect predictions and identifying dataset bias. Through human experiments, we demonstrate that ACD enables users both to identify the more accurate of two DNNs and to better trust a DNN’s outputs. We also find that ACD’s hierarchy is largely robust to adversarial perturbations, implying that it captures fundamental aspects of the input and ignores spurious noise.

What About Applied Fairness?

Machine learning practitioners are often ambivalent about the ethical aspects of their products. We believe anything that gets us from that current state to one in which our systems are achieving some degree of fairness is an improvement that should be welcomed. This is true even when that progress does not get us 100% of the way to the goal of ‘complete’ fairness or perfectly align with our personal belief on which measure of fairness is used. Some measure of fairness being built would still put us in a better position than the status quo. Impediments to getting fairness and ethical concerns applied in real applications, whether they are abstruse philosophical debates or technical overhead such as the introduction of ever more hyper-parameters, should be avoided. In this paper we further elaborate on our argument for this viewpoint and its importance.

Understanding the Meaning of Understanding

Can we train a machine to detect if another machine has understood a concept? In principle, this is possible by conducting tests on the subject of that concept. However, we want this procedure to be done while avoiding direct questions. In other words, we would like to isolate the absolute meaning of an abstract idea by putting it into a class of equivalence, hence without adopting straight definitions or showing how this idea ‘works’ in practice. We discuss the metaphysical implications hidden in the above question, with the aim of providing a plausible reference framework.

Beyond Bags of Words: Inferring Systemic Nets

Textual analytics based on representations of documents as bags of words have been reasonably successful. However, analysis that requires deeper insight into language, into author properties, or into the contexts in which documents were created requires a richer representation. Systemic nets are one such representation. They have not been extensively used because they required human effort to construct. We show that systemic nets can be algorithmically inferred from corpora, that the resulting nets are plausible, and that they can provide practical benefits for knowledge discovery problems. This opens up a new class of practical analysis techniques for textual analytics.

Enabling End-To-End Machine Learning Replicability: A Case Study in Educational Data Mining

The use of machine learning techniques has expanded in education research, driven by the rich data from digital learning environments and institutional data warehouses. However, replication of machine learned models in the domain of the learning sciences is particularly challenging due to a confluence of experimental, methodological, and data barriers. We discuss the challenges of end-to-end machine learning replication in this context, and present an open-source software toolkit, the MOOC Replication Framework (MORF), to address them. We demonstrate the use of MORF by conducting a replication at scale, and provide a complete executable container, with unique DOIs documenting the configurations of each individual trial, for replication or future extension at https://…/fy2015-replication. This work demonstrates an approach to end-to-end machine learning replication which is relevant to any domain with large, complex or multi-format, privacy-protected data with a consistent schema.

Constrained existence problem for weak subgame perfect equilibria with omega-regular Boolean objectives
Status maximization as a source of fairness in a networked dictator game
Distributed Hypothesis Testing based on Unequal-Error Protection Codes
Correlation Tracking via Robust Region Proposals
EL-GAN: Embedding Loss Driven Generative Adversarial Networks for Lane Detection
Fast Decoding of Low Density Lattice Codes
Bounds and algorithms for $k$-truss
Improved Density-Based Spatio–Textual Clustering on Social Media
SemAxis: A Lightweight Framework to Characterize Domain-Specific Word Semantics Beyond Sentiment
Translations as Additional Contexts for Sentence Classification
The Exact Equivalence of Distance and Kernel Methods for Hypothesis Testing
Humor Detection in English-Hindi Code-Mixed Social Media Content: Corpus and Baseline System
NetScore: Towards Universal Metrics for Large-scale Performance Analysis of Deep Neural Networks for Practical Usage
ReConvNet: Video Object Segmentation with Spatio-Temporal Features Modulation
Dense Light Field Reconstruction From Sparse Sampling Using Residual Network
Neural Stethoscopes: Unifying Analytic, Auxiliary and Adversarial Network Probing
Statistical Aspects of Wasserstein Distances
Aspect Sentiment Model for Micro Reviews
On the ranking of Test match batsmen
Inference in Deep Gaussian Processes using Stochastic Gradient Hamiltonian Monte Carlo
Nearly Zero-Shot Learning for Semantic Decoding in Spoken Dialogue Systems
Hamiltonian cycles in planar cubic graphs with facial 2-factors, and a new partial solution of Barnette’s Conjecture
Morphological and Language-Agnostic Word Segmentation for NMT
Simultaneous Sensor and Actuator Selection/Placement through Output Feedback Control
Automatic Language Identification for Romance Languages using Stop Words and Diacritics
Copycat CNN: Stealing Knowledge by Persuading Confession with Random Non-Labeled Data
Efficient Active Learning for Image Classification and Segmentation using a Sample Selection and Conditional Generative Adversarial Network
Stabilization with a Specified External Gain for Linear MIMO Systems and Its Applications to Control of Networked Systems
The genus of the Erdős-Rényi random graph and the fragile genus property
An Input-Delay Event-Triggered Control Design for Nonlinear Systems
New Look at Finite Single Server Queue with Poisson Input and Semi-Markov Service Times
Learning Cross-lingual Distributed Logical Representations for Semantic Parsing
Semi-fractional diffusion equations
Analysis of the Effect of Unexpected Outliers in the Classification of Spectroscopy Data
Deep Generative Models in the Real-World: An Open Challenge from Medical Imaging
The committee machine: Computational to statistical gaps in learning a two-layers neural network
Asymptotic maximal order statistic for SIR in $κ-μ$ shadowed fading
Scalable load balancing in networked systems: A survey of recent advances
Stochastic Gradient Descent with Exponential Convergence Rates of Expected Classification Errors
ServeNet: A Deep Neural Network for Web Service Classification
Transfer Learning for Context-Aware Question Matching in Information-seeking Conversations in E-commerce
Approximation and duality problems of refracted processes
Urdu Word Segmentation using Conditional Random Fields (CRFs)
Improving precipitation forecast using extreme quantile regression
Maximum weight spectrum codes with reduced length
Sequential Bayesian inference for spatio-temporal models of temperature and humidity data
Ranking Recovery from Limited Comparisons using Low-Rank Matrix Completion
Learning Dynamics of Linear Denoising Autoencoders
1-bit Localization Scheme for Radar using Dithered Quantized Compressed Sensing
On the Perceptron’s Compression
Simple model of fractal networks formed by self-organized critical dynamics
Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Theory of Estimation-of-Distribution Algorithms
Parameter Learning and Change Detection Using a Particle Filter With Accelerated Adaptation
PCAS: Pruning Channels with Attention Statistics
A bijection between permutation matrices and descending plane partitions without special parts, which respects the quadruplet of statistics considered by Behrend, Di Francesco and Zinn–Justin
On the heavy-tail behavior of the distributionally robust newsvendor
Single Image Reflection Separation with Perceptual Losses
Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition
Fire SSD: Wide Fire Modules based Single Shot Detector on Edge Device
Financial Forecasting and Analysis for Low-Wage Workers
View-volume Network for Semantic Scene Completion from a Single Depth Image
A Game Theoretic Approach to Learning and Dynamics in Information Retrieval
Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning
Deep Multi-Output Forecasting: Learning to Accurately Predict Blood Glucose Trajectories
Finding GEMS: Multi-Scale Dictionaries for High-Dimensional Graph Signals
Scalable Neural Network Compression and Pruning Using Hard Clustering and L1 Regularization
Connecting descent and peak polynomials
Assessing the Accuracy of a Wrist Motion Tracking Method for Counting Bites across Demographic and Food Variables
Elastically Collective Nonlinear Langevin Equation Theory of Dynamics in Glass-Forming Liquids: Transient Localization, Thermodynamic Mapping and Cooperativity
Cut-edges and regular factors in regular graphs of odd degree
Convex Class Model on Symmetric Positive Definite Manifolds
From Trailers to Storylines: An Efficient Way to Learn from Movies
Normal approximation for sums of discrete $U$-statistics – application to Kolmogorov bounds in random subgraph counting
On 2-representation infinite algebras arising from dimer models
Identifying the Fake Base Station: A Location Based Approach
Base Station Cooperation in Millimeter Wave Cellular Networks: Performance Enhancement of Cell-Edge Users
Infinite-dimensional bilinear and stochastic balanced truncation
SCSP: Spectral Clustering Filter Pruning with Soft Self-adaption Manners
On the convergence of stationary solutions in the Smoluchowski-Kramers approximation of infinite dimensional systems
Rate-Splitting Robustness in Multi-Pair Massive MIMO Relay Systems
Exchangeable random partitions from max-infinitely-divisible distributions
Deep Reinforcement Learning for Dynamic Urban Transportation Problems
Positive Grassmannian and polyhedral subdivisions
Bounds on sizes of caps in $AG(n,q)$ via the Croot-Lev-Pach polynomial method
A Graphical Interactive Debugger for Distributed Systems
Shape Features Extraction Using a Partial Differential Equation
Apuntes de Redes Neuronales Artificiales
Pattern Dependence Detection using n-TARP Clustering
Hessian spectrum at the global minimum of high-dimensional random landscapes
Automatic formation of the structure of abstract machines in hierarchical reinforcement learning with state clustering
Asymptotic distribution of least square estimators for linear models with dependent errors
Bounds on the localization number
A Flexible Convolutional Solver with Application to Photorealistic Style Transfer
How Predictable is Your State? Leveraging Lexical and Contextual Information for Predicting Legislative Floor Action at the State Level
An unbiased approach to compressed sensing
Limiting Behaviors of High Dimensional Stochastic Spin Ensemble
Full Bayesian Modeling for fMRI Group Analysis
Augmented Lagrangian-Based Decomposition Methods with Non-Ergodic Optimal Rates
Kuramoto model for excitation-inhibition-based oscillations
A theory of maximum likelihood for weighted infection graphs
Benchmarks for Image Classification and Other High-dimensional Pattern Recognition Problems
Large monochromatic components in multicolored bipartite graphs
Online Self-supervised Scene Segmentation for Micro Aerial Vehicles
Statistical Significance of CP Violation in Long Baseline Neutrino Experiments
Analysis of Search Stratagem Utilisation
SMHD: A Large-Scale Resource for Exploring Online Language Usage for Multiple Mental Health Conditions
Finding your Lookalike: Measuring Face Similarity Rather than Face Identity
Cover-Encodings of Fitness Landscapes
Reduced words for clans
Quasi-tight Framelets with Directionality or High Vanishing Moments Derived from Arbitrary Refinable Functions
The $e$-vector of a simplicial complex
Manifold Mixup: Encouraging Meaningful On-Manifold Interpolation as a Regularizer
End-to-End Parkinson Disease Diagnosis using Brain MR-Images by 3D-CNN
A latent spatial factor approach for synthesizing opioid associated deaths and treatment admissions in Ohio counties
Identifying Recurring Patterns with Deep Neural Networks for Natural Image Denoising
Shape correspondences from learnt template-based parametrization
Human Activity Recognition Based on Wearable Sensor Data: A Standardization of the State-of-the-Art
Leading Coefficients and the Multiplicity of Known Roots
Decentralized Ergodic Control: Distribution-Driven Sensing and Exploration for Multi-Agent Systems
Bringing replication and reproduction together with generalisability in NLP: Three reproduction studies for Target Dependent Sentiment Analysis
Line Search Methods for Convex-Composite Optimization
Impostor Networks for Fast Fine-Grained Recognition
Weak Closed-Loop Solvability of Stochastic Linear-Quadratic Optimal Control Problems
An Evaluation of Neural Machine Translation Models on Historical Spelling Normalization
On the regularity of join-meet ideals of modular lattices
Automatic counting of fission tracks in apatite and muscovite using image processing
Fully Convolutional Network for Automatic Road Extraction from Satellite Imagery
Cactus Graphs and Graphs Complement Conjecture
Distributed Constrained Nonconvex Optimization: the Asynchronous Method of Multipliers
A Retrospective Analysis of the Fake News Challenge Stance Detection Task
Extracting Parallel Sentences with Bidirectional Recurrent Neural Networks to Improve Machine Translation
Generating Sentences Using a Dynamic Canvas
Martingales and Super-martingales Relative to a Convex Set of Equivalent Measures
fMRI Semantic Category Decoding using Linguistic Encoding of Word Embeddings
Impact of atmospheric impairments on mmWave based outdoor communication
Maintenance of Smart Buildings using Fault Trees
A Unified Framework for Generalizable Style Transfer: Style and Content Separation
Are My EHRs Private Enough? Event-level Privacy Protection
Detecting Statistically Significant Communities
Weighted Tanimoto Coefficient for 3D Molecule Structure Similarity Measurement
A Profit Optimization Approach Based on the Use of Pumped-Hydro Energy Storage Unit and Dynamic Pricing