Hierarchical Conflict Propagation: Sequence Learning in a Recurrent Deep Neural Network

Recurrent neural networks (RNNs) are capable of learning to encode and exploit activation history over an arbitrary timescale. In practice, however, state-of-the-art gradient-descent-based training methods are known to have difficulty learning long-term dependencies. Here, we describe a novel training method that involves concurrent parallel cloned networks, each sharing the same weights, each trained at a different stimulus phase, and each maintaining an independent activation history. Training proceeds by recursively performing batch updates over the parallel clones as activation history is progressively increased. This allows conflicts to propagate hierarchically from short-term contexts towards longer-term contexts until they are resolved. We illustrate the parallel clones method and hierarchical conflict propagation with a character-level deep RNN tasked with memorizing a paragraph of Moby Dick (by Herman Melville).


PCA Method for Automated Detection of Mispronounced Words

This paper presents a method for detecting mispronunciations with the aim of improving Computer Assisted Language Learning (CALL) tools used by foreign language learners. The algorithm is based on Principal Component Analysis (PCA). It is hierarchical, with each successive step refining the estimate to classify the test word as either mispronounced or correct. Preprocessing before detection, such as normalization and time-scale modification, is implemented to guarantee uniformity of the feature vectors input to the detection system. The performance obtained with various features, including spectrograms and Mel-Frequency Cepstral Coefficients (MFCCs), is compared and evaluated. Best results were obtained using MFCCs, achieving up to 99% accuracy in word verification and 93% in native/non-native classification. Compared with Hidden Markov Models (HMMs), which are used pervasively in recognition applications, this particular approach is computationally efficient and effective when training data is limited.


Deep Spiking Networks

We introduce the Spiking Multi-Layer Perceptron (SMLP). The SMLP is a spiking version of a conventional Multi-Layer Perceptron with rectified-linear units. Our architecture is event-based, meaning that neurons in the network communicate by sending ‘events’ to downstream neurons, and that the state of each neuron is only updated when it receives an event. We show that the SMLP behaves identically, during both prediction and training, to a conventional deep network of rectified-linear units in the limiting case where we run the spiking network for a long time. We apply this architecture to a conventional classification problem (MNIST) and achieve performance very close to that of a conventional MLP with the same architecture. Our network is a natural architecture for learning based on streaming event-based data, and has potential applications in robotic systems, which require low power and low response latency.
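
The limiting-case claim above can be illustrated with a toy example. The sketch below is not the SMLP from the paper; it is a minimal, assumed model of a single integrate-and-fire unit driven by a constant input, and the names `spike_rate`, `relu`, `steps`, and `threshold` are hypothetical. It shows only the general idea that the average spike output of such a unit approaches a rectified-linear response as the run gets longer.

```python
def relu(x):
    """Conventional rectified-linear activation."""
    return max(0.0, x)


def spike_rate(x, steps=1000, threshold=1.0):
    """Average output per step of a toy integrate-and-fire unit.

    Assumption (not from the paper): the unit receives the same input x
    at every step, accumulates it in a membrane potential, and emits one
    event each time the potential reaches the threshold.
    """
    potential = 0.0
    spikes = 0
    for _ in range(steps):
        potential += x
        # Emit as many events as the accumulated potential allows.
        while potential >= threshold:
            potential -= threshold
            spikes += 1
    # Each event carries `threshold` units of signal downstream.
    return spikes * threshold / steps


if __name__ == "__main__":
    for x in (-0.5, 0.3, 2.5):
        print(x, relu(x), spike_rate(x))
```

A negative input never drives the potential to the threshold, so the unit stays silent, while a positive input produces events at a rate that differs from the input by at most about `threshold / steps`. Running for more steps tightens the match, which is the intuition behind the long-run equivalence stated in the abstract.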


Virtualizing Deep Neural Networks for Memory-Efficient Neural Network Design

Auto-JacoBin: Auto-encoder Jacobian Binary Hashing

A q-Analog of Foulkes’ conjecture

On forbidden induced subgraphs for unit disk graphs

Learning to Abstain from Binary Prediction

Efficient Bayesian Inference for Multivariate Factor Stochastic Volatility Models

Capacitated Kinetic Clustering in Mobile Networks by Optimal Transportation Theory

Harnessing disordered quantum dynamics for machine learning

Stability of stochastic differential equations with respect to time-changed Brownian motions

Streaming Verification of Graph Properties

An Exponential Separation Between Randomized and Deterministic Complexity in the LOCAL Model

Ratios and Cauchy Distribution

Search by Ideal Candidates: Next Generation of Talent Search at LinkedIn

DeepSpark: Spark-Based Deep Learning Supporting Asynchronous Updates and Caffe Compatibility

Scalable and Sustainable Deep Learning via Randomized Hashing

Floating up of the zero-energy Landau level in monolayer epitaxial graphene

Category theoretic analysis of single-photon decision maker

Learning and Free Energy in Expectation Consistent Approximate Inference

Architectural Complexity Measures of Recurrent Neural Networks

Distance spectral radius of uniform hypergraphs

Multimodal Emotion Recognition Using Multimodal Deep Learning

Stable Marriage with Covering Constraints: A Complete Computational Trichotomy

Truncation of Haar random matrices in $\mathrm{GL}_N(\mathbb{Z}_m)$

A Stein deficit for the logarithmic Sobolev inequality

Criticality and Energy Landscapes in Spin Glasses

The comb representation of compact ultrametric spaces

The Independent Domination Polynomial

Theoretical Analysis of the $k$-Means Algorithm – A Survey

Construction of Gene and Species Trees from Sequence Data incl. Orthologs, Paralogs, and Xenologs

Multiple testing of local maxima for detection of peaks on the (celestial) sphere

Balanced Allocation: Patience is not a Virtue

A note on Maximum Likelihood Estimation for cubic and quartic canonical toric del Pezzo Surfaces

Dynamic Erdős-Rényi random graph with forbidden degree

Enhancing Genetic Algorithms using Multi Mutations

Differentials on Graph Complexes III – Deleting a Vertex

Hole probability for nodal sets of the cut-off Gaussian Free Field

Bounded Rational Decision-Making in Feedforward Neural Networks

Stochastic Functional Differential Equations and Feynman-Kac Formula

Thermodynamic formalism and $k$-bonacci substitutions

The number of roots of full support

Large-Scale Detection of Non-Technical Losses in Imbalanced Data Sets

Cortical Computation via Iterative Constructions

A flexible multivariate random effects proportional odds model with application to adverse effects during radiation therapy

Certified Universal Gathering in $R^2$ for Oblivious Mobile Robots

Approximation Complexity of Max-Cut on Power Law Graphs

Graph Isomorphism for unit square graphs

Deterministic versus stochastic aspects of superexponential population growth models

Limit Theorems Associated With The Pitman-Yor Process

Exact Weighted Minwise Hashing in Constant Time

Multivariate Hawkes Processes for Large-scale Inference

Shape-aware Surface Reconstruction from Sparse Data

Quenched invariance principles for the random conductance model on a random graph with degenerate ergodic weights

Simple Bayesian Algorithms for Best Arm Identification

Objective Bayesian Analysis for the Lomax Distribution

Epidemic Processes over Adaptive State-Dependent Networks

Alpaka – An Abstraction Library for Parallel Kernel Acceleration

Tight Bounds for Distributed Graph Computations