Deep Generative Networks For Sequence Prediction

This thesis investigates unsupervised time series representation learning for sequence prediction problems, i.e. generating nice-looking input samples given a previous history, for high dimensional input sequences by decoupling the static input representation from the recurrent sequence representation. We introduce three models based on Generative Stochastic Networks (GSN) for unsupervised sequence learning and prediction. Experimental results for these three models are presented on pixels of sequential handwritten digit (MNIST) data, videos of low-resolution bouncing balls, and motion capture data. The main contribution of this thesis is to provide evidence that GSNs are a viable framework to learn useful representations of complex sequential input data, and to suggest a new framework for deep generative models to learn complex sequences by decoupling static input representations from dynamic time dependency representations.

Short Term Electric Load Forecast with Artificial Neural Networks

This paper addresses short-term electric load forecasting using feedforward and Elman recurrent neural networks. The case studies were developed using measured electrical energy consumption data from the Banat area. Thirty-five different structures were considered for both the feedforward and the recurrent networks. Each structure was trained many times, and the best solution was selected. Short-term load forecasting is essential for effective energy consumption management in an open market environment.

Improving Long-Horizon Forecasts with Expectation-Biased LSTM Networks

State-of-the-art forecasting methods using Recurrent Neural Networks (RNN) based on Long Short-Term Memory (LSTM) cells have shown exceptional performance targeting short-horizon forecasts, e.g., given a set of predictor features, forecast a target value for the next few time steps in the future. However, in many applications, the performance of these methods decays as the forecasting horizon extends beyond these few time steps. This paper aims to explore the challenges of long-horizon forecasting using LSTM networks. Here, we illustrate the long-horizon forecasting problem in datasets from neuroscience and energy supply management. We then propose expectation-biasing, an approach motivated by the literature of Dynamic Belief Networks, as a solution to improve long-horizon forecasting using LSTMs. We propose two LSTM architectures along with two methods for expectation biasing that significantly outperform standard practice.

Unlearn What You Have Learned: Adaptive Crowd Teaching with Exponentially Decayed Memory Learners

With the increasing demand for large amounts of labeled data, crowdsourcing has been used in many large-scale data mining applications. However, most existing works in crowdsourcing mainly focus on label inference and incentive design. In this paper, we address a different problem of adaptive crowd teaching, which is a sub-area of machine teaching in the context of crowdsourcing. Compared with machines, human beings are extremely good at learning a specific target concept (e.g., classifying the images into given categories) and they can also easily transfer the learned concepts into similar learning tasks. Therefore, a more effective way of utilizing crowdsourcing is by supervising the crowd to label in the form of teaching. In order to perform the teaching and expertise estimation simultaneously, we propose an adaptive teaching framework named JEDI to construct the personalized optimal teaching set for the crowdsourcing workers. In JEDI teaching, the teacher assumes that each learner has an exponentially decayed memory. Furthermore, it ensures comprehensiveness in the learning process by carefully balancing teaching diversity and learner’s accurate learning in terms of teaching usefulness. Finally, we validate the effectiveness and efficacy of JEDI teaching in comparison with the state-of-the-art techniques on multiple data sets with both synthetic learners and real crowdsourcing workers.

Deep Probabilistic Programming Languages: A Qualitative Study

Deep probabilistic programming languages try to combine the advantages of deep learning with those of probabilistic programming languages. If successful, this would be a big step forward in machine learning and programming languages. Unfortunately, as of now, this new crop of languages is hard to use and understand. This paper addresses this problem directly by explaining deep probabilistic programming languages and indirectly by characterizing their current strengths and weaknesses.

Deep Multimodal Subspace Clustering Networks

We present convolutional neural network (CNN) based approaches for unsupervised multimodal subspace clustering. The proposed framework consists of three main stages – multimodal encoder, self-expressive layer, and multimodal decoder. The encoder takes multimodal data as input and fuses them to a latent space representation. We investigate early, late and intermediate fusion techniques and propose three different encoders corresponding to them for spatial fusion. The self-expressive layers and multimodal decoders are essentially the same for different spatial fusion-based approaches. In addition to various spatial fusion-based methods, an affinity fusion-based network is also proposed in which the self-expressiveness layer corresponding to different modalities is enforced to be the same. Extensive experiments on three datasets show that the proposed methods significantly outperform the state-of-the-art multimodal subspace clustering methods.

Fast Weight Long Short-Term Memory

Associative memory using fast weights is a short-term memory mechanism that substantially improves the memory capacity and time scale of recurrent neural networks (RNNs). As recent studies introduced fast weights only to regular RNNs, it is unknown whether fast weight memory is beneficial to gated RNNs. In this work, we report a significant synergy between long short-term memory (LSTM) networks and fast weight associative memories. We show that this combination, in learning associative retrieval tasks, results in much faster training and lower test error, a performance boost most prominent at high memory task difficulties.

Understanding Convolutional Neural Network Training with Information Theory

Using information theoretic concepts to understand and explore the inner organization of deep neural networks (DNNs) remains a big challenge. Recently, the concept of an information plane began to shed light on the analysis of multilayer perceptrons (MLPs). We provided an in-depth insight into stacked autoencoders (SAEs) using a novel matrix-based Rényi’s α-entropy functional, enabling for the first time the analysis of the dynamics of learning using information flow in real-world scenarios involving complex network architectures and large data. Despite the great potential of these past works, there are several open questions when it comes to applying information theoretic concepts to understand convolutional neural networks (CNNs). These include, for instance, the accurate estimation of information quantities among multiple variables, and the many different training methodologies. By extending the novel matrix-based Rényi’s α-entropy functional to a multivariate scenario, this paper presents a systematic method to analyze CNN training using information theory. Our results validate two fundamental data processing inequalities in CNNs, and also have direct impacts on previous work concerning the training and design of CNNs.

Successive Convexification: A Superlinearly Convergent Algorithm for Non-convex Optimal Control Problems

This paper presents the SCvx algorithm, a successive convexification algorithm designed to solve non-convex optimal control problems with global convergence and superlinear convergence-rate guarantees. The proposed algorithm handles nonlinear dynamics and non-convex state and control constraints by linearizing them about the solution of the previous iterate, and solving the resulting convex subproblem to obtain a solution for the current iterate. Additionally, the algorithm incorporates several safe-guarding techniques into each convex subproblem, employing virtual controls and virtual buffer zones to avoid artificial infeasibility, and a trust region to avoid artificial unboundedness. The procedure is repeated in succession, thus turning a difficult non-convex optimal control problem into a sequence of numerically tractable convex subproblems. Using fast and reliable Interior Point Method (IPM) solvers, the convex subproblems can be computed quickly, making the SCvx algorithm well suited for real-time applications. Analysis is presented to show that the algorithm converges both globally and superlinearly, guaranteeing the local optimality of the original problem. The superlinear convergence is obtained by exploiting the structure of optimal control problems, showcasing the superior convergence rate that can be obtained by leveraging specific problem properties when compared to generic nonlinear programming methods. Numerical simulations are performed for an illustrative non-convex quad-rotor motion planning example problem, and corresponding results obtained using a Sequential Quadratic Programming (SQP) solver are provided for comparison. Our results show that the convergence rate of the SCvx algorithm is indeed superlinear, and surpasses that of the SQP-based method by converging in less than half the number of iterations.
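
The linearize, solve, and adjust loop described above can be illustrated on a toy scalar problem. The sketch below is not the SCvx algorithm itself: it assumes a made-up problem (minimize x subject to x^2 - 4 = 0), replaces the virtual control with an exact penalty of hypothetical weight `penalty`, and solves the one-dimensional convex subproblem in closed form rather than with an interior point solver. The function name `scvx_toy` and the ratio thresholds 0.1 and 0.7 are illustrative choices, not values from the paper.

```python
def scvx_toy(x0, penalty=10.0, radius=1.0, tol=1e-9, max_iter=50):
    """Toy successive-linearization loop with a trust region (illustrative)."""

    def g(x):          # non-convex equality constraint g(x) = 0
        return x * x - 4.0

    def dg(x):         # derivative of the constraint
        return 2.0 * x

    def cost(x):       # penalized non-convex cost
        return x + penalty * abs(g(x))

    x = x0
    for _ in range(max_iter):
        gx, slope = g(x), dg(x)

        def model(d):  # convex model: constraint linearized about x
            return x + d + penalty * abs(gx + slope * d)

        # A convex piecewise-linear function on [-radius, radius] attains
        # its minimum at an endpoint or at its single kink.
        candidates = [-radius, radius]
        if slope != 0.0:
            kink = -gx / slope
            if abs(kink) <= radius:
                candidates.append(kink)
        step = min(candidates, key=model)

        predicted = cost(x) - model(step)   # reduction promised by the model
        if predicted <= tol:
            break                           # no further progress possible
        actual = cost(x) - cost(x + step)   # reduction actually achieved
        ratio = actual / predicted
        if ratio < 0.1:
            radius *= 0.5                   # poor model: reject and shrink
            continue
        x += step                           # accept the step
        if ratio > 0.7:
            radius *= 2.0                   # good model: grow trust region
    return x


print(scvx_toy(3.0))
print(scvx_toy(-3.0))
```

Starting from 3.0 the iterates settle near x = 2, and starting from -3.0 they settle near x = -2. Both points satisfy the constraint, but only the second is the global minimizer, which mirrors the local (not global) optimality the abstract refers to.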

State-Space Abstractions for Probabilistic Inference: A Systematic Review

Tasks such as social network analysis, human behavior recognition, or modeling biochemical reactions can be solved elegantly by using the probabilistic inference framework. However, standard probabilistic inference algorithms work at a propositional level, and thus cannot capture the symmetries and redundancies that are present in these tasks. Algorithms that exploit those symmetries have been devised in different research fields, for example by the lifted inference, multiple object tracking, and modeling and simulation communities. The common idea, which we call state space abstraction, is to perform inference over compact representations of sets of symmetric states. Although they are concerned with a similar topic, the relationship between these approaches has not been investigated systematically. This survey provides the following contributions. We perform a systematic literature review to outline the state of the art in probabilistic inference methods exploiting symmetries. From an initial set of more than 4,000 papers, we identify 116 relevant papers. Furthermore, we provide new high-level categories that classify the approaches, based on the problem classes the different approaches can solve. Researchers from different fields that are confronted with a state space explosion problem in a probabilistic system can use this classification to identify possible solutions. Finally, based on this conceptualization, we identify potentials for future research, as some relevant application domains are not addressed by current approaches.

Exact Distributed Training: Random Forest with Billions of Examples

We introduce an exact distributed algorithm to train Random Forest models as well as other decision forest models without relying on approximating best split search. We explain the proposed algorithm and compare it to related approaches for various complexity measures (time, RAM, disk, and network complexity analysis). We report its running performance on artificial and real-world datasets of up to 18 billion examples. This figure is several orders of magnitude larger than datasets tackled in the existing literature. Finally, we empirically show that Random Forest benefits from being trained on more data, even in the case of already gigantic datasets. Given a dataset with 17.3B examples with 82 features (3 numerical, other categorical with high arity), our implementation trains a tree in 22h.

CoNet: Collaborative Cross Networks for Cross-Domain Recommendation

The cross-domain recommendation technique is an effective way of alleviating the data sparsity in recommender systems by leveraging the knowledge from relevant domains. Transfer learning is a class of algorithms underlying these techniques. In this paper, we propose a novel transfer learning approach for cross-domain recommendation by using neural networks as the base model. We assume that hidden layers in two base networks are connected by cross mappings, leading to the collaborative cross networks (CoNet). CoNet enables dual knowledge transfer across domains by introducing cross connections from one base network to another and vice versa. CoNet is achieved in multi-layer feedforward networks by adding dual connections and joint loss functions, which can be trained efficiently by back-propagation. The proposed model is evaluated on two real-world datasets and it outperforms baseline models by relative improvements of 3.56% in MRR and 8.94% in NDCG, respectively.
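
The idea of cross connections between the hidden layers of two base networks can be sketched as a single forward step. This is a minimal illustration under assumptions, not the CoNet implementation: the function name `cross_layer`, the ReLU activation, and the use of two separate cross matrices (`cross_ab`, `cross_ba`) are all hypothetical choices, since the abstract only states that connections run in both directions.

```python
def matvec(matrix, vector):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    """Elementwise rectified linear activation."""
    return [max(0.0, a) for a in vector]


def cross_layer(h_a, h_b, w_a, w_b, cross_ab, cross_ba):
    """One hidden layer of two base networks joined by cross connections.

    w_a and w_b are the ordinary within-network weights. cross_ba carries
    the hidden state of network B into network A, and cross_ab carries the
    hidden state of network A into network B.
    """
    pre_a = [p + q for p, q in zip(matvec(w_a, h_a), matvec(cross_ba, h_b))]
    pre_b = [p + q for p, q in zip(matvec(w_b, h_b), matvec(cross_ab, h_a))]
    return relu(pre_a), relu(pre_b)


identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

# With zero cross weights the two networks are independent.
print(cross_layer([1.0, 2.0], [3.0, 4.0], identity, identity, zeros, zeros))
# With a cross connection from B to A, network A also sees B's hidden state.
print(cross_layer([1.0, 2.0], [3.0, 4.0], identity, identity, zeros, identity))
```

Setting the cross matrices to zero recovers two independent feedforward layers, which makes it easy to see what the cross connections add.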

Validating Bayesian Inference Algorithms with Simulation-Based Calibration

Verifying the correctness of Bayesian computation is challenging. This is especially true for complex models that are common in practice, as these require sophisticated model implementations and algorithms. In this paper we introduce simulation-based calibration (SBC), a general procedure for validating inferences from Bayesian algorithms capable of generating posterior samples. This procedure not only identifies inaccurate computation and inconsistencies in model implementations but also provides graphical summaries that can indicate the nature of the problems that arise. We argue that SBC is a critical part of a robust Bayesian workflow, as well as being a useful tool for those developing computational algorithms and statistical software.
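
A calibration check of this kind can be sketched with rank statistics: draw a parameter from the prior, simulate data from it, draw posterior samples, and record the rank of the true parameter among them. If the sampler is correct, the ranks are uniformly distributed. The sketch below assumes a conjugate normal model so that exact posterior draws stand in for the algorithm under test; the function names and default settings are hypothetical and are not taken from the paper.

```python
import random


def sbc_ranks(num_sims=1000, num_draws=19, n_obs=5, seed=0):
    """Rank statistics for a conjugate normal model (illustrative only).

    Prior: mu ~ Normal(0, 1). Likelihood: y_i ~ Normal(mu, 1).
    The exact posterior is Normal(sum(y) / (n + 1), 1 / (n + 1)).
    """
    rng = random.Random(seed)
    ranks = []
    for _ in range(num_sims):
        mu_true = rng.gauss(0.0, 1.0)                          # prior draw
        ys = [rng.gauss(mu_true, 1.0) for _ in range(n_obs)]   # simulated data
        post_mean = sum(ys) / (n_obs + 1)
        post_sd = (1.0 / (n_obs + 1)) ** 0.5
        # Stand-in for the algorithm under test: exact posterior draws.
        draws = [rng.gauss(post_mean, post_sd) for _ in range(num_draws)]
        # Rank of the true value among the posterior draws (0..num_draws).
        ranks.append(sum(d < mu_true for d in draws))
    return ranks


def rank_histogram(ranks, num_draws):
    """Count how often each rank 0..num_draws occurs."""
    counts = [0] * (num_draws + 1)
    for r in ranks:
        counts[r] += 1
    return counts


counts = rank_histogram(sbc_ranks(), 19)
print(counts)
```

With a correct sampler the histogram should look roughly flat. Replacing `post_sd` with a deliberately wrong value is a quick way to see how a miscalibrated posterior distorts the rank histogram.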

Entropic Spectral Learning in Large Scale Networks

We present a novel algorithm for learning the spectral density of large scale networks using stochastic trace estimation and the method of maximum entropy. The complexity of the algorithm is linear in the number of non-zero elements of the matrix, offering a computational advantage over other algorithms. We apply our algorithm to the problem of community detection in large networks. We show state-of-the-art performance on both synthetic and real datasets.
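
The stochastic trace estimation step can be sketched with a Hutchinson-style estimator, which approximates the spectral moments tr(A^k) / n using only sparse matrix-vector products, so each product costs time linear in the number of non-zero entries. This is a sketch under assumptions: the sparse format (one dictionary per row), the Rademacher probe vectors, and the function names are illustrative choices, and the maximum entropy reconstruction of the density from these moments is not shown.

```python
import random


def matvec(rows, x):
    """Multiply a sparse matrix (list of {column: value} dicts) by a vector."""
    return [sum(v * x[j] for j, v in row.items()) for row in rows]


def estimate_moments(rows, num_moments, num_probes=200, seed=0):
    """Hutchinson estimates of tr(A^k) / n for k = 1..num_moments."""
    rng = random.Random(seed)
    n = len(rows)
    totals = [0.0] * num_moments
    for _ in range(num_probes):
        # Rademacher probe vector with independent +1 / -1 entries.
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        v = z
        for k in range(num_moments):
            v = matvec(rows, v)                       # v = A^(k+1) z
            totals[k] += sum(zi * vi for zi, vi in zip(z, v))
    return [t / (num_probes * n) for t in totals]


# Adjacency matrix of a 6-node cycle graph, stored row by row.
cycle = [{(i - 1) % 6: 1.0, (i + 1) % 6: 1.0} for i in range(6)]
print(estimate_moments(cycle, 2))
```

For the cycle graph the exact values are tr(A) / n = 0 and tr(A^2) / n = 2, so the printed estimates should be close to those. For a diagonal matrix the estimator is exact with Rademacher probes, since every squared probe entry equals one.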

Distribution-based Prediction of the Degree of Grammaticalization for German Prepositions
Using Convex Optimization of Autocorrelation with Constrained Support and Windowing for Improved Phase Retrieval Accuracy
Diagnostic Tests for Nested Sampling Calculations
DPRed: Making Typical Activation Values Matter In Deep Learning Computing
Encoding Longer-term Contextual Multi-modal Information in a Predictive Coding Model
Efficient Soft-Output Gauss-Seidel Data Detector for Massive MIMO Systems
On the coupling of Model Predictive Control and Robust Kalman Filtering
Efficient Channel Estimator with Angle-Division Multiple Access
On indefinite sums weighted by periodic sequences
The Vlasov-Navier-Stokes equations as a mean field limit
Deep Object Co-Segmentation
Terrain RL Simulator
Contextualised Browsing in a Digital Library’s Living Lab
Local Search is a PTAS for Feedback Vertex Set in Minor-free Graphs
The emergent integrated network structure of scientific research
Delete, Retrieve, Generate: A Simple Approach to Sentiment and Style Transfer
Vision Based Dynamic Offside Line Marker for Soccer Games
Personalized neural language models for real-world query auto completion
Are FPGAs Suitable for Edge Computing?
Detecting Linguistic Characteristics of Alzheimer’s Dementia by Interpreting Neural Models
The Fundamental Solution to the p-Laplacian in a class of Hörmander Vector Fields
Multi-Reward Reinforced Summarization with Saliency and Entailment
Efficient Search of Compact QC-LDPC and SC-LDPC Convolutional Codes with Large Girth
On Learning Intrinsic Rewards for Policy Gradient Methods
An Adaptive Clipping Approach for Proximal Policy Optimization
Mage: Online Interference-Aware Scheduling in Multi-Scale Heterogeneous Systems
Optimal Carbon Taxes for Emissions Targets in the Electricity Sector
Objective Bayesian Inference for Repairable System Subject to Competing Risks
A Galerkin Isogeometric Method for Karhunen-Loeve Approximation of Random Fields
Communication-Aware Scheduling of Serial Tasks for Dispersed Computing
Bayesian parameter estimation for relativistic heavy-ion collisions
Robust Machine Comprehension Models via Adversarial Training
On coprime percolation, the visibility graphon, and the local limit of the GCD profile
A Generalized Cover’s Problem
Simplex Queues for Hot-Data Download
Multivariate Gaussian Process Regression for Multiscale Data Assimilation and Uncertainty Reduction
Minimax rate of testing in sparse linear regression
A Capacity-Price Game for Uncertain Renewables Resources
Two-Player Games for Efficient Non-Convex Constrained Optimization
Numerical Integration in Multiple Dimensions with Designed Quadrature
Learning how to be robust: Deep polynomial regression
Zero-shot Learning with Complementary Attributes
Improving Character-based Decoding Using Target-Side Morphological Information for Neural Machine Translation
UCNN: Exploiting Computational Reuse in Deep Neural Networks via Weight Repetition
Structure from Recurrent Motion: From Rigidity to Recurrency
Dialogue Learning with Human Teaching and Feedback in End-to-End Trainable Task-Oriented Dialogue Systems
Faster Evaluation of Subtraction Games
Training Deep Networks with Synthetic Data: Bridging the Reality Gap by Domain Randomization
Diachronic Usage Relatedness (DURel): A Framework for the Annotation of Lexical Semantic Change
Online Non-Additive Path Learning under Full and Partial Information
Method to solve quantum few-body problems with artificial neural networks
The 1D Schrödinger equation with a spacetime white noise: the average wave function
Falling Things: A Synthetic Dataset for 3D Object Detection and Pose Estimation
Aspect Level Sentiment Classification with Attention-over-Attention Neural Networks
Improving information centrality of a node in complex networks by adding edges
Average Age-of-Information Minimization in UAV-assisted IoT Networks
Homogenization of Periodic Linear Nonlocal Partial Differential Equations
SFace: An Efficient Network for Face Detection in Large Scale Variations
A Mean Field View of the Landscape of Two-Layers Neural Networks
Combating the Control Signal Spoofing Attack in UAV Systems
The Erdős-Sós Conjecture for Spiders
A Communication-Efficient Random-Walk Algorithm for Decentralized Optimization
The weak order on Weyl posets
Fundamental domains for rhombic lattices with dihedral symmetry of order 8
Semi-Supervised Co-Analysis of 3D Shape Styles from Projected Lines
Free to move or trapped in your group: Mathematical modeling of information overload and coordination in crowded populations
Estimation of the extreme value index in a censorship framework: asymptotic and finite sample behaviour
On bounds on bend number of classes of split and cocomparability graphs
$N$-detachable pairs in 3-connected matroids III: the theorem
Fast Channel Estimation for Millimetre Wave Wireless Systems Using Overlapped Beam Patterns
Coexistence of URLLC and eMBB services in the C-RAN Uplink: An Information-Theoretic Study
Ruin probabilities for two collaborating insurance companies
Independent Distributions on a Multi-Branching AND-OR Tree of Height 2
PHD-GIFs: Personalized Highlight Detection for Automatic GIF Creation
Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation
End-to-end Graph-based TAG Parsing with Neural Networks
Geographical Scheduling for Multicast Precoding in Multi-Beam Satellite Systems
Visualizing the Feature Importance for Black Box Models
A Solution for Large-scale Multi-object Tracking
Squarefree divisor complexes of certain numerical semigroup elements
Variational Disparity Estimation Framework for Plenoptic Image
DEA-based benchmarking for performance evaluation in pay-for-performance incentive plans
Experiments with Universal CEFR Classification
An Economic-Based Analysis of RANKING for Online Bipartite Matching
Rooted complete minors in line graphs with a Kempe coloring
Superframes, A Temporal Video Segmentation
Modular Verification of Vehicle Platooning with Respect to Decisions, Space and Time
Consensus Community Detection in Multilayer Networks using Parameter-free Graph Pruning
Numerical semigroups with a fixed number of gaps of second kind
Deep Face Recognition: A Survey
NTUA-SLP at SemEval-2018 Task 2: Predicting Emojis using RNNs with Context-aware Attention
NTUA-SLP at SemEval-2018 Task 1: Predicting Affective Content in Tweets with Deep Attentive RNNs and Transfer Learning
NTUA-SLP at SemEval-2018 Task 3: Tracking Ironic Tweets using Ensembles of Word and Character Level Attentive RNNs
E- and R-optimality of block designs for treatment-control comparisons
Active Learning for Breast Cancer Identification
Bayesian Metabolic Flux Analysis reveals intracellular flux couplings
The Graph Exploration Problem with Advice
Understanding Individual Neuron Importance Using Information Theory
Temporal Unknown Incremental Clustering (TUIC) Model for Analysis of Traffic Surveillance Videos
A Robot to Shape your Natural Plant: The Machine Learning Approach to Model and Control Bio-Hybrid Systems
High order synaptic learning in neuro-mimicking resistive memories
Platonic solids, Archimedean solids and semi-equivelar maps on the sphere
Is a Finite Intersection of Balls Covered by a Finite Union of Balls in Euclidean Spaces?
Liveness Detection Using Implicit 3D Features
Index Codes for Interlinked Cycle Structures with Outer Cycles
Alquist: The Alexa Prize Socialbot
Impact of Non-orthogonal Multiple Access on the Offloading of Mobile Edge Computing
Promotion on oscillating and alternating tableaux and rotation of matchings and permutations
Are ResNets Provably Better than Linear Predictors?
An efficient open-source implementation to compute the Jacobian matrix for the Newton-Raphson power flow algorithm
Forecasting the presence and intensity of hostility on Instagram using linguistic and social features
Simulation-based Adversarial Test Generation for Autonomous Vehicles with Machine Learning Components
A General Account of Argumentation with Preferences
A Parallel/Distributed Algorithmic Framework for Mining All Quantitative Association Rules
Stopping Redundancy Hierarchy Beyond the Minimum Distance
Unspeech: Unsupervised Speech Context Embeddings
Quantifying the visual concreteness of words and topics in multimodal datasets
A lower bound on the number of homotopy types of simplicial complexes on $n$ vertices
A local approach to the Erdős-Sós conjecture
A note on number triangles that are almost their own production matrix
A Min.Max Algorithm for Spline Based Modeling of Violent Crime Rates in USA
Solving the Exponential Growth of Symbolic Regression Trees in Geometric Semantic Genetic Programming
On Abelian Longest Common Factor with and without RLE
ECG arrhythmia classification using a 2-D convolutional neural network
Automated detection of vulnerable plaque in intravascular ultrasound images
Active choice of teachers, learning strategies and goals for a socially guided intrinsic motivation learner
Automated diagnosis of pneumothorax using an ensemble of convolutional neural networks with multi-sized chest radiography images
Dissecting the NVIDIA Volta GPU Architecture via Microbenchmarking
HD-Index: Pushing the Scalability-Accuracy Boundary for Approximate kNN Search in High-Dimensional Spaces
Unveiling the Power of Deep Tracking
Delayed Blockchain Protocols