Foundations of Sequence-to-Sequence Modeling for Time Series

The availability of large amounts of time series data, paired with the performance of deep-learning algorithms on a broad class of problems, has recently led to significant interest in the use of sequence-to-sequence models for time series forecasting. We provide the first theoretical analysis of this time series forecasting framework. We include a comparison of sequence-to-sequence modeling to classical time series models, and as such our theory can serve as a quantitative guide for practitioners choosing between different modeling methodologies.

Structural Breaks in Time Series

This chapter covers methodological issues related to estimation, testing and computation for models involving structural changes. Our aim is to review developments as they relate to econometric applications based on linear models. Substantial advances have been made to cover models at a level of generality that allows a host of interesting practical applications. These include models with general stationary regressors and errors that can exhibit temporal dependence and heteroskedasticity, models with trending variables and possible unit roots, and cointegrated models, among others. Advances have been made pertaining to computational aspects of constructing estimates, their limit distributions, tests for structural changes, and methods to determine the number of changes present. A variety of topics are covered. The first part summarizes and updates developments described in an earlier review, Perron (2006), with the exposition closely following that of Perron (2008). Additions are included for recent developments: testing for common breaks, models with endogenous regressors (emphasizing that simply using least-squares is preferable over instrumental variables methods), quantile regressions, methods based on Lasso, panel data models, testing for changes in forecast accuracy, factor models and methods of inference based on a continuous record asymptotic framework. Our focus is on the so-called off-line methods whereby one wants to retrospectively test for breaks in a given sample of data and form confidence intervals about the break dates. The aim is to provide readers with an overview of methods that are of direct usefulness in practice as opposed to issues that are mostly of theoretical interest.

Towards a universal neural network encoder for time series

We study the use of a time series encoder to learn representations that are useful on data set types on which it has not been trained. The encoder is formed of a convolutional neural network whose temporal output is summarized by a convolutional attention mechanism. This way, we obtain a compact, fixed-length representation from longer, variable-length time series. We evaluate the performance of the proposed approach on a well-known time series classification benchmark, considering full adaptation, partial adaptation, and no adaptation of the encoder to the new data type. Results show that such strategies are competitive with the state-of-the-art, often outperforming conceptually-matching approaches. Besides accuracy scores, the ease of adaptation and the efficiency of pre-trained encoders make them an appealing option for the processing of scarcely- or non-labeled time series.
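The step from a variable-length temporal output to a fixed-length representation can be illustrated with a minimal attention-pooling sketch. This is a simplification under stated assumptions, not the encoder described in the abstract: the function name `attention_pool` and the linear scoring vector `score_weights` are hypothetical, and a plain linear score stands in for the convolutional attention mechanism.

```python
import math

def attention_pool(features, score_weights):
    """Collapse a variable-length sequence of feature vectors into one vector.

    features: list of T vectors, each of length D (e.g. per-time-step outputs).
    score_weights: length-D vector used to score each time step. This is a
    hypothetical stand-in for learned attention parameters.
    """
    # One scalar relevance score per time step.
    scores = [sum(w * x for w, x in zip(score_weights, step)) for step in features]
    # Numerically stable softmax over the time axis.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The weighted average over time has length D regardless of T.
    dim = len(features[0])
    return [sum(a * step[d] for a, step in zip(weights, features)) for d in range(dim)]

# Two sequences of different lengths map to vectors of the same size.
short_series = [[1.0, 0.0], [0.0, 1.0]]
long_series = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8], [0.9, 0.1]]
w = [0.3, -0.1]
pooled_short = attention_pool(short_series, w)
pooled_long = attention_pool(long_series, w)
print(len(pooled_short), len(pooled_long))
```

Because the softmax weights sum to one over time, the output size depends only on the feature dimension, which is what allows sequences of different lengths to share one downstream classifier.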

Statistical Analysis on E-Commerce Reviews, with Sentiment Classification using Bidirectional Recurrent Neural Network (RNN)

Understanding customer sentiments is of paramount importance in marketing strategies today. Not only will it give companies an insight as to how customers perceive their products and/or services, but it will also give them an idea of how to improve their offers. This paper attempts to understand the correlation of different variables in customer reviews on a women's clothing e-commerce site, and to classify each review according to whether it recommends the reviewed product and whether it expresses positive, negative, or neutral sentiment. To achieve these goals, we employed univariate and multivariate analyses on dataset features except for review titles and review texts, and we implemented a bidirectional recurrent neural network (RNN) with long short-term memory (LSTM) units for recommendation and sentiment classification. Results have shown that a recommendation is a strong indicator of a positive sentiment score, and vice-versa. On the other hand, ratings in product reviews are fuzzy indicators of sentiment scores. We also found that the bidirectional LSTM was able to reach an F1-score of 0.88 for recommendation classification, and 0.93 for sentiment classification.

Learning to Teach

Teaching plays a very important role in our society, by spreading human knowledge and educating our next generations. A good teacher will select appropriate teaching materials, impart suitable methodologies, and set up targeted examinations, according to the learning behaviors of the students. In the field of artificial intelligence, however, the role of teaching has not been fully explored, and most attention is paid to machine learning. In this paper, we argue that equal attention, if not more, should be paid to teaching, and furthermore, an optimization framework (instead of heuristics) should be used to obtain good teaching strategies. We call this approach "learning to teach". In the approach, two intelligent agents interact with each other: a student model (which corresponds to the learner in traditional machine learning algorithms), and a teacher model (which determines the appropriate data, loss function, and hypothesis space to facilitate the training of the student model). The teacher model leverages the feedback from the student model to optimize its own teaching strategies by means of reinforcement learning, so as to achieve teacher-student co-evolution. To demonstrate the practical value of our proposed approach, we take the training of deep neural networks (DNN) as an example, and show that by using the learning to teach techniques, we are able to use much less training data and fewer iterations to achieve almost the same accuracy for different kinds of DNN models (e.g., multi-layer perceptron, convolutional neural networks and recurrent neural networks) under various machine learning tasks (e.g., image classification and text understanding).

OK Google, What Is Your Ontology? Or: Exploring Freebase Classification to Understand Google’s Knowledge Graph

This paper reconstructs the Freebase data dumps to understand the underlying ontology behind Google’s semantic search feature. The Freebase knowledge base was a major Semantic Web and linked data technology that was acquired by Google in 2010 to support the Google Knowledge Graph, the backend for Google search results that include structured answers to queries instead of a series of links to external resources. Since its shutdown in 2016, Freebase has been contained in a data dump of 1.9 billion Resource Description Framework (RDF) triples. A recomposition of the Freebase ontology will be analyzed in relation to concepts and insights from the literature on classification by Bowker and Star. This paper will explore how the Freebase ontology is shaped by many of the forces that also shape classification systems through a deep dive into the ontology and a small correlational study. These findings will provide a glimpse into the proprietary blackbox Knowledge Graph and what is meant by Google’s mission to “organize the world’s information and make it universally accessible and useful”.

Loss-Calibrated Approximate Inference in Bayesian Neural Networks

Current approaches in approximate inference for Bayesian neural networks minimise the Kullback-Leibler divergence to approximate the true posterior over the weights. However, this approximation is without knowledge of the final application, and therefore cannot guarantee optimal predictions for a given task. To make more suitable task-specific approximations, we introduce a new loss-calibrated evidence lower bound for Bayesian neural networks in the context of supervised learning, informed by Bayesian decision theory. By introducing a lower bound that depends on a utility function, we ensure that our approximation achieves higher utility than traditional methods for applications that have asymmetric utility functions. Furthermore, in using dropout inference, we highlight that our new objective is identical to that of standard dropout neural networks, with an additional utility-dependent penalty term. We demonstrate our new loss-calibrated model with an illustrative medical example and a restricted model capacity experiment, and highlight failure modes of the comparable weighted cross entropy approach. Lastly, we demonstrate the scalability of our method to real world applications with per-pixel semantic segmentation on an autonomous driving data set.
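The Bayesian decision theory step that motivates a utility-dependent objective can be sketched as follows. This is only an illustration of choosing the action with the highest expected utility under a fixed predictive distribution; it is not the loss-calibrated lower bound itself. The function `best_action`, the class labels, and the utility values are hypothetical choices made for the example.

```python
def best_action(class_probs, utility):
    """Return the action with the highest expected utility.

    class_probs: dict mapping true class -> predictive probability.
    utility: dict mapping (true_class, action) -> utility of taking that
    action when true_class is correct. All values here are illustrative.
    """
    actions = sorted({action for (_, action) in utility})

    def expected_utility(action):
        # Average the utility of this action over the predictive distribution.
        return sum(p * utility[(c, action)] for c, p in class_probs.items())

    return max(actions, key=expected_utility)

probs = {"healthy": 0.8, "sick": 0.2}

# Symmetric utility: every correct decision is worth 1, every error 0.
symmetric = {
    ("healthy", "healthy"): 1.0, ("healthy", "sick"): 0.0,
    ("sick", "healthy"): 0.0, ("sick", "sick"): 1.0,
}
# Asymmetric utility: missing a sick patient is penalised heavily.
asymmetric = dict(symmetric)
asymmetric[("sick", "healthy")] = -10.0

print(best_action(probs, symmetric))   # healthy
print(best_action(probs, asymmetric))  # sick
```

With the same predictive probabilities, the preferred decision changes once the utility becomes asymmetric, which is why an approximation fitted without knowledge of the utility may be poorly suited to the final task.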

Labelling as an unsupervised learning problem

Unravelling hidden patterns in datasets is a classical problem with many potential applications. In this paper, we present a challenge whose objective is to discover nonlinear relationships in a noisy cloud of points. If a set of points satisfies a nonlinear relationship that is unlikely to be due to randomness, we will label the set with this relationship. Since points can satisfy one, many or no such nonlinear relationships, a cloud of points will typically have one, multiple or no labels at all. This introduces the labelling problem that will be studied in this paper. The objective of this paper is to develop a framework for the labelling problem. We introduce a precise notion of a label, and we propose an algorithm to discover such labels in a given dataset, which is then tested on synthetic datasets. We also analyse, using tools from random matrix theory, the problem of discovering false labels in the dataset.

From Word to Sense Embeddings: A Survey on Vector Representations of Meaning

Over the past years, distributed representations have proven effective and flexible keepers of prior knowledge to be integrated into downstream applications. This survey is focused on semantic representation of meaning. We start from the theoretical background behind word vector space models and highlight one of its main limitations: the meaning conflation deficiency. Then, we explain how this deficiency can be addressed through a transition from word level to the more fine-grained level of word senses (in its broader acceptation) as a method for modelling unambiguous lexical meaning. We present a comprehensive overview of the wide range of techniques in the two main branches of sense representation, i.e., unsupervised and knowledge-based. Finally, this survey covers the main evaluation procedures and an analysis of five important aspects: interpretability, sense granularity, adaptability to different domains, compositionality and integration into downstream applications.

Inference Attacks Against Collaborative Learning

Collaborative machine learning and related techniques such as distributed and federated learning allow multiple participants, each with his own training dataset, to build a joint model. Participants train local models and periodically exchange model parameters or gradient updates computed during the training. We demonstrate that the training data used by participants in collaborative learning is vulnerable to inference attacks. First, we show that an adversarial participant can infer the presence of exact data points in others’ training data (i.e., membership inference). Then, we demonstrate that the adversary can infer properties that hold only for a subset of the training data and are independent of the properties that the joint model aims to capture. We evaluate the efficacy of our attacks on a variety of tasks, datasets, and learning configurations, and conclude with a discussion of possible defenses.

BayesLands: A Bayesian inference approach for parameter uncertainty quantification in Badlands
Low Rank Tensor Completion for Multiway Visual Data
An EEG pre-processing technique for the fast recognition of motor imagery movements
Adversarial Contrastive Estimation
Improving GAN Training via Binarized Representation Entropy (BRE) Regularization
Three tree priors and five datasets: A study of the effect of tree priors in Indo-European phylogenetics
SWIPT Signalling over Complex AWGN Channels with Two Nonlinear Energy Harvester Models
Total, asymmetric and frequency connectedness between oil and forex markets
Efficient Explicit Time Stepping of High Order Discontinuous Galerkin Schemes for Waves
End-to-End Polyphonic Sound Event Detection Using Convolutional Recurrent Neural Networks with Learned Time-Frequency Representation Input
MPI+X: task-based parallelization and dynamic load balance of finite element assembly
Automatic Article Commenting: the Task and Dataset
Optimality of the Maximum Likelihood estimator in Astrometry
The Dataset Nutrition Label: A Framework To Drive Higher Data Quality Standards
Robust-to-Dynamics Optimization
On the number of integer points in translated and expanded polyhedra
Self-Stabilizing Task Allocation In Spite of Noise
Quenched Survival of Bernoulli Percolation on Galton-Watson Trees
Fast and Accurate Tumor Segmentation of Histology Images using Persistent Homology and Deep Convolutional Features
Global existence for a free boundary problem of Fisher-KPP type
Mitigating the Risk of Voltage Collapse using Statistical Measures from PMU Data
Description of a Tracking Metric Inspired by KL-divergence
Auxetic networks with no re-entrant polygons
Incorporating Subword Information into Matrix Factorization Word Embeddings
Long Short-Term Memory as a Dynamically Computed Element-wise Weighted Sum
Greedy Sensor Placement with Cost Constraints
Neural Cache: Bit-Serial In-Cache Acceleration of Deep Neural Networks
Parameter estimation for high dimensional change point regression models without grid search
Creative Invention Benchmark
Decentralized Collaborative Knowledge Management using Git
On the Construction of Substitutes
ARES: Adaptive, Reconfigurable, Erasure coded, atomic Storage
Strengthening strong immersions with Kempe chains
Zero-error Function Computation on a Directed Acyclic Network
Sequence Aggregation Rules for Anomaly Detection in Computer Network Traffic
Cores, shell indices and the degeneracy of a graph limit
Graph Neural Networks for Learning Robot Team Coordination
On the Graver basis of block-structured integer programming
Comparing Covariate Prioritization via Matching to Machine Learning Methods for Causal Inference using Five Empirical Applications
Estimation Methods for Cluster Randomized Trials with Noncompliance: A Study of A Biometric Smartcard Payment System in India
Neural Machine Translation Decoding with Terminology Constraints
Capacities, removable sets and $L^p$-uniqueness on Wiener spaces
Numerical Linear Algebra in the Sliding Window Model
Discourse-Aware Neural Rewards for Coherent Text Generation
Optimal Power Flow with Disjoint Prohibited Zones: New Formulation and Solutions
The Evolution of Popularity and Images of Characters in Marvel Cinematic Universe Fanfictions
Deep Reinforcement Learning for Optimal Control of Space Heating
Threshold functions for patterns in random subsets of finite vector spaces
k-Space Deep Learning for Accelerated MRI
$M_2$-Ranks of overpartitions modulo $6$ and $10$
Continuously Tunable Dual-mode Bandstop Filter
SlugNERDS: A Named Entity Recognition Tool for Open Domain Dialogue Systems
Deep Learning of Geometric Constellation Shaping including Fiber Nonlinearities
Bifurcations in the Kuramoto model on graphs
MIMO radar waveform design with practical constraints: A low-complexity approach
Dust concentration vision measurement based on moment of inertia in gray level-rank co-occurrence matrix
Reliable and Secure Multishot Network Coding using Linearized Reed-Solomon Codes
hyperdoc2vec: Distributed Representations of Hypertext Documents
WikiPassageQA: A Benchmark Collection for Research on Non-factoid Answer Passage Retrieval
Learning Domain-Sensitive and Sentiment-Aware Word Embeddings
On the Universality of the Logistic Loss Function
Polyhedral-based Methods for Mixed-Integer SOCP in Tree Breeding
Modeling and Evaluation of Synchronous Stochastic Gradient Descent in Distributed Deep Learning on Multiple GPUs
UAV-Enabled Wireless Power Transfer with Directional Antenna: A Two-User Case
Training Classifiers with Natural Language Explanations
Compressed Wideband Spectrum Sensing: Concept, Challenges and Enablers
On asymptotic normality in estimation after a group sequential trial
Fundamental Limits of Database Alignment
Towards Inference-Oriented Reading Comprehension: ParallelQA
A comparable study of modeling units for end-to-end Mandarin speech recognition
Haplotype-aware graph indexes
Hybrid semi-Markov CRF for Neural Sequence Labeling
Wald Statistics in high-dimensional PCA
Dwarfs on Accelerators: Enhancing OpenCL Benchmarking for Heterogeneous Computing Architectures
Human Capital in Software Engineering: A Systematic Mapping of Reconceptualized Human Aspect Studies
Characterizations of Solution Sets of Fréchet Differentiable Problems with Quasiconvex Objective Function
On Arbitrarily Long Periodic Orbits of Evolutionary Games on Graphs
Call Me by Your Name: Epistemic Logic with Assignments and Non-rigid Names
Sparse System Identification in Pairs of FIR and TM Bases
Avatar-Net: Multi-scale Zero-shot Style Transfer by Feature Decoration
Signature Catalan Combinatorics
ETH-Hardness of Approximating 2-CSPs and Directed Steiner Network
Deep Covariance Descriptors for Facial Expression Recognition
Scaling Nakamoto Consensus to Thousands of Transactions per Second
Obligation and Prohibition Extraction Using Hierarchical RNNs
Learning Robust Search Strategies Using a Bandit-Based Approach
Structure-from-Motion using Dense CNN Features with Keypoint Relocalization
Feedback stabilization of double pendulum: Application to the crane systems with time-varying rope length
Modified Skellam, Poisson and Gaussian distributions in semi-open systems at charge-like conservation law
Effect of dilution in asymmetric recurrent neural networks
Scaling associative classification for very large datasets
A Generalized Xgamma Generator Family of Distributions
Dealing with sequences in the RGBDT space
Improv Chat: Second Response Generation for Chatbot
Topological phase transition in the quasiperiodic disordered Su-Schrieffer-Heeger chain
TADPOLE Challenge: Prediction of Longitudinal Evolution in Alzheimer’s Disease
Uncertainty relations and information loss for spin-1/2 measurements
Ensemble Soft-Margin Softmax Loss for Image Classification
Unbiased and Consistent Nested Sampling via Sequential Monte Carlo
Resource-Bounded Kolmogorov Complexity Provides an Obstacle to Soficness of Multidimensional Shifts
Linear Convergence Rates for Extrapolated Fixed Point Algorithms
Differentiating resting brain states using ordinal symbolic analysis
Nitsche-XFEM for optimal control problems governed by elliptic PDEs with interfaces
A Lagrangean Relaxation Algorithm for the Simple Plant Location Problem with Preferences
WISER: A Semantic Approach for Expert Finding in Academia based on Entity Linking
The Hilbert transform and orthogonal martingales in Banach spaces
Study of constraint and impact of a nuisance parameter in maximum likelihood method
Monotone Learning with Rectifier Networks
Ring Exploration with Myopic Luminous Robots
Potential function, ladder variables and absorption probabilities of a recurrent random walk on $\mathbb{Z}$ with infinite variance
Active User Detection of Uplink Grant-Free SCMA in Frequency Selective Channel
Automatic Academic Paper Rating Based on Modularized Hierarchical Convolutional Neural Network
A Heuristic Algorithm for Traffic Light Synchronization Based on the MAXBAND Model
ABMOF: A Novel Optical Flow Algorithm for Dynamic Vision Sensors
Global Encoding for Abstractive Summarization
Order out of Chaos: Proving Linearizability Using Local Views
Multi-View Semantic Labeling of 3D Point Clouds for Automated Plant Phenotyping
Dense and Diverse Capsule Networks: Making the Capsules Learn Better
CloudLaunch: Discover and Deploy Cloud Applications
A Unified Knowledge Representation and Context-aware Recommender System in Internet of Things
A mixture autoregressive model based on Student’s $t$-distribution
Exploiting Location Information to Enhance Throughput in Downlink V2I Systems
Automatic Estimation of Simultaneous Interpreter Performance
Supervising Nyström Methods via Negative Margin Support Vector Selection
The number of independent sets in an irregular graph
Query for Architecture, Click through Military: Comparing the Roles of Search and Navigation on Wikipedia
Deep Nets: What have they ever done for Vision?
Towards an Unequivocal Representation of Actions
Spatially Inhomogeneous Evolutionary Games
Electric-field Inputs for Molecular Quantum-dot Cellular Automata Circuits
Regularizing Output Distribution of Abstractive Chinese Social Media Text Summarization for Improved Semantic Consistency
Scaling limit of the Stein variational gradient descent part I: the mean field regime
A temporal factorization at the maximum for spectrally negative positive self-similar Markov processes
Packing and domination parameters in digraphs
On the supremum of products of symmetric stable processes
End-to-End Reinforcement Learning for Automatic Taxonomy Induction
Sufficient Statistics for Unobserved Heterogeneity in Structural Dynamic Logit Models
Classification of Household Materials via Spectroscopy
Reconfiguration of Satisfying Assignments and Subset Sums: Easy to Find, Hard to Connect
Spin-related phenomena in two-dimensional hopping regime in magnetic field
Asymptotic results for Representation Theory
Energy Complexity of Distance Computation in Multi-hop Networks
Hybrid CMOS-CNFET based NP dynamic Carry Look Ahead Adder
Learning to Estimate 3D Human Pose and Shape from a Single Color Image
Ordinal Depth Supervision for 3D Human Pose Estimation
Fighting Fake News: Image Splice Detection via Learned Self-Consistency
Arbitrary Style Transfer with Deep Feature Reshuffle
The Capacity of Private Information Retrieval from Uncoded Storage Constrained Databases