Concept-Oriented Deep Learning

Concepts are the foundation of human deep learning, understanding, and knowledge integration and transfer. We propose concept-oriented deep learning (CODL) which extends (machine) deep learning with concept representations and conceptual understanding capability. CODL addresses some of the major limitations of deep learning: interpretability, transferability, contextual adaptation, and requirement for lots of labeled training data. We discuss the major aspects of CODL including concept graph, concept representations, concept exemplars, and concept representation learning systems supporting incremental and continual learning.

GuideR: a guided separate-and-conquer rule learning in classification, regression, and survival settings

This article presents GuideR, a user-guided rule induction algorithm that overcomes the largest limitation of existing methods: the inability to introduce the user’s preferences or domain knowledge into the rule learning process. Automatic selection of attributes and attribute ranges often yields rules that contain no interesting information. We propose an induction algorithm that takes the user’s requirements into account. Our method uses the sequential covering approach and is suitable for classification, regression, and survival analysis problems. The effectiveness of the algorithm in all these tasks has been verified experimentally, confirming guided rule induction to be a powerful data analysis tool.
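The sequential covering (separate-and-conquer) idea can be illustrated with a minimal sketch for the classification case. This is not GuideR itself: the abstract does not specify its rule quality measures or guidance mechanisms, so the sketch assumes equality conditions, precision as the quality measure, and a user-supplied allow-list of attributes (`allowed`) as a stand-in for user guidance. All names and the toy data are hypothetical.

```python
def covers(rule, row):
    """A rule is a list of (attribute, value) conditions joined by AND."""
    return all(row[attr] == value for attr, value in rule)

def grow_rule(rows, target, allowed):
    """Greedily add the condition with the highest precision for `target`."""
    rule, covered = [], rows
    while not rule or any(r["label"] != target for r in covered):
        used = {attr for attr, _ in rule}
        # Candidate conditions come only from attributes the user allows.
        candidates = {(a, r[a]) for r in covered if r["label"] == target
                      for a in allowed if a not in used}
        if not candidates:
            break
        def precision(cond):
            hit = [r for r in covered if r[cond[0]] == cond[1]]
            return sum(r["label"] == target for r in hit) / len(hit)
        best = max(sorted(candidates), key=precision)  # sorted for determinism
        rule.append(best)
        covered = [r for r in covered if covers([best], r)]
    return rule

def sequential_covering(rows, target, allowed):
    """Learn rules one at a time, removing the examples each rule covers."""
    rules, remaining = [], list(rows)
    while any(r["label"] == target for r in remaining):
        rule = grow_rule(remaining, target, allowed)
        if not rule:
            break
        rules.append(rule)
        remaining = [r for r in remaining if not covers(rule, r)]
    return rules

# Hypothetical toy data set.
rows = [
    {"outlook": "sunny", "windy": "no", "label": "play"},
    {"outlook": "sunny", "windy": "yes", "label": "play"},
    {"outlook": "rainy", "windy": "no", "label": "play"},
    {"outlook": "rainy", "windy": "yes", "label": "stay"},
]
print(sequential_covering(rows, "play", ["outlook", "windy"]))
print(sequential_covering(rows, "play", ["windy"]))
```

Restricting `allowed` to `["windy"]` changes which rules are induced, which is the kind of effect user guidance is meant to have; with too few allowed attributes the last rule may cover examples of other classes.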

A Primer on Causal Analysis

We provide a conceptual map to navigate causal analysis problems. Focusing on the case of discrete random variables, we consider the case of causal effect estimation from observational data. The presented approaches apply also to continuous variables, but the issue of estimation becomes more complex. We then introduce the four schools of thought for causal analysis


In many applications, the interdependencies among a set of N time series \{ x_{nk}, k>0 \}_{n=1}^{N} are well captured by a graph or network G. The network itself may change over time as well (i.e., as G_k). We expect the network changes to be at a much slower rate than that of the time series. This paper introduces eigennetworks, networks that are building blocks to compose the actual networks G_k capturing the dependencies among the time series. These eigennetworks can be estimated by first learning the time series of graphs G_k from the data, followed by a Principal Network Analysis procedure. Algorithms for learning both the original time series of graphs and the eigennetworks are presented and discussed. Experiments on simulated and real time series data demonstrate the performance of the learning and the interpretation of the eigennetworks.

LSTM Benchmarks for Deep Learning Frameworks

This study provides benchmarks for different implementations of LSTM units across the deep learning frameworks PyTorch, TensorFlow, Lasagne and Keras. The comparison includes cuDNN LSTMs, fused LSTM variants and less optimized, but more flexible LSTM implementations. The benchmarks reflect two typical scenarios for automatic speech recognition, notably continuous speech recognition and isolated digit recognition. These scenarios cover input sequences of fixed and variable length as well as the loss functions CTC and cross entropy. Additionally, a comparison between four different PyTorch versions is included. The code is available online at https://…/rnn_benchmarks.

Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark

The deep learning community has proposed optimizations spanning hardware, software, and learning theory to improve the computational performance of deep learning workloads. While some of these optimizations perform the same operations faster (e.g., switching from an NVIDIA K80 to P100), many modify the semantics of the training procedure (e.g., large minibatch training, reduced precision), which can impact a model’s generalization ability. Due to a lack of standard evaluation criteria that consider these trade-offs, it has become increasingly difficult to compare these different advances. To address this shortcoming, DAWNBENCH and the upcoming MLPERF benchmarks use time-to-accuracy as the primary metric for evaluation, with the accuracy threshold set close to state-of-the-art and measured on a held-out dataset not used in training; the goal is to train to this accuracy threshold as fast as possible. In DAWNBENCH, the winning entries improved time-to-accuracy on ImageNet by two orders of magnitude over the seed entries. Despite this progress, it is unclear how sensitive time-to-accuracy is to the chosen threshold as well as the variance between independent training runs, and how well models optimized for time-to-accuracy generalize. In this paper, we provide evidence to suggest that time-to-accuracy has a low coefficient of variation and that the models tuned for it generalize nearly as well as pre-trained models. We additionally analyze the winning entries to understand the source of these speedups, and give recommendations for future benchmarking efforts.
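To make the metric concrete, here is a minimal sketch of computing time-to-accuracy from per-epoch logs and its coefficient of variation (sample standard deviation divided by the mean) across independent runs. The log format, the threshold, and the numbers are hypothetical and are not taken from DAWNBench.

```python
import statistics

def time_to_accuracy(log, threshold):
    """Return the first elapsed time at which held-out accuracy reaches
    `threshold`, or None if the run never gets there."""
    for elapsed_seconds, accuracy in log:
        if accuracy >= threshold:
            return elapsed_seconds
    return None

def coefficient_of_variation(values):
    """Sample standard deviation relative to the mean."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical logs: (elapsed_seconds, validation_accuracy) per epoch, three runs.
runs = [
    [(100, 0.80), (200, 0.91), (300, 0.93), (400, 0.94)],
    [(100, 0.79), (200, 0.90), (300, 0.94), (400, 0.95)],
    [(110, 0.81), (220, 0.92), (330, 0.93), (440, 0.94)],
]
tta = [time_to_accuracy(run, threshold=0.93) for run in runs]
cv = coefficient_of_variation(tta)
print(tta, round(cv, 4))
```

A low value of `cv` across runs with different random seeds would indicate that a single run gives a fairly stable estimate of time-to-accuracy.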

BindsNET: A machine learning-oriented spiking neural networks library in Python

The development of spiking neural network simulation software is a critical component enabling the modeling of neural systems and the development of biologically inspired algorithms. Existing software frameworks support a wide range of neural functionality, software abstraction levels, and hardware devices, yet are typically not suitable for rapid prototyping or application to problems in the domain of machine learning. In this paper, we describe a new Python package for the simulation of spiking neural networks, specifically geared towards machine learning and reinforcement learning. Our software, called BindsNET, enables rapid building and simulation of spiking networks and features user-friendly, concise syntax. BindsNET is built on top of the PyTorch deep neural networks library, enabling fast CPU and GPU computation for large spiking networks. The BindsNET framework can be adjusted to meet the needs of other existing computing and hardware environments, e.g., TensorFlow. We also provide an interface into the OpenAI gym library, allowing for training and evaluation of spiking networks on reinforcement learning problems. We argue that this package facilitates the use of spiking networks for large-scale machine learning experimentation, and show some simple examples of how we envision BindsNET can be used in practice. BindsNET code is available at https://…/bindsnet

Alchemist: An Apache Spark MPI Interface

The Apache Spark framework for distributed computation is popular in the data analytics community due to its ease of use, but its MapReduce-style programming model can incur significant overheads when performing computations that do not map directly onto this model. One way to mitigate these costs is to off-load computations onto MPI codes. In recent work, we introduced Alchemist, a system for the analysis of large-scale data sets. Alchemist calls MPI-based libraries from within Spark applications, and it has minimal coding, communication, and memory overheads. In particular, Alchemist allows users to retain the productivity benefits of working within the Spark software ecosystem without sacrificing performance efficiency in linear algebra, machine learning, and other related computations. In this paper, we discuss the motivation behind the development of Alchemist, and we provide a detailed overview of its design and usage. We also demonstrate the efficiency of our approach on medium-to-large data sets, using some standard linear algebra operations, namely matrix multiplication and the truncated singular value decomposition of a dense matrix, and we compare the performance of Spark with that of Spark+Alchemist. These computations are run on the NERSC supercomputer Cori Phase 1, a Cray XC40.

Accelerating CNN inference on FPGAs: A Survey

Convolutional Neural Networks (CNNs) are currently adopted to solve an ever greater number of problems, ranging from speech recognition to image classification and segmentation. The large amount of processing required by CNNs calls for dedicated and tailored hardware support methods. Moreover, CNN workloads have a streaming nature, well suited to reconfigurable hardware architectures such as FPGAs. The amount and diversity of research on the subject of CNN FPGA acceleration within the last 3 years demonstrates the tremendous industrial and academic interest. This paper presents the state of the art of CNN inference accelerators on FPGAs. The computational workloads, their parallelism and the involved memory accesses are analyzed. At the level of neurons, optimizations of the convolutional and fully connected layers are explained and the performances of the different methods compared. At the network level, approximate computing and datapath optimization methods are covered and state-of-the-art approaches compared. The methods and tools investigated in this survey represent the recent trends in FPGA CNN inference accelerators and will fuel the future advances on efficient hardware deep learning.

Study of Automatic GPU Offloading Technology for Open IoT

Relational recurrent neural networks

Memory-based neural networks model temporal data by leveraging an ability to remember information for long periods. It is unclear, however, whether they also have an ability to perform complex relational reasoning with the information they remember. Here, we first confirm our intuitions that standard memory architectures may struggle at tasks that heavily involve an understanding of the ways in which entities are connected, i.e., tasks involving relational reasoning. We then address these deficits by using a new memory module, a Relational Memory Core (RMC), which employs multi-head dot product attention to allow memories to interact. Finally, we test the RMC on a suite of tasks that may profit from more capable relational reasoning across sequential information, and show large gains in RL domains (e.g. Mini PacMan), program evaluation, and language modeling, achieving state-of-the-art results on the WikiText-103, Project Gutenberg, and GigaWord datasets.

Perturbative Neural Networks

Convolutional neural networks are witnessing wide adoption in computer vision systems with numerous applications across a range of visual recognition tasks. Much of this progress is fueled through advances in convolutional neural network architectures and learning algorithms even as the basic premise of a convolutional layer has remained unchanged. In this paper, we seek to revisit the convolutional layer that has been the workhorse of state-of-the-art visual recognition models. We introduce a very simple, yet effective, module called a perturbation layer as an alternative to a convolutional layer. The perturbation layer does away with convolution in the traditional sense and instead computes its response as a weighted linear combination of non-linearly activated additive noise perturbed inputs. We demonstrate both analytically and empirically that this perturbation layer can be an effective replacement for a standard convolutional layer. Empirically, deep neural networks with perturbation layers, called Perturbative Neural Networks (PNNs), in lieu of convolutional layers perform comparably with standard CNNs on a range of visual datasets (MNIST, CIFAR-10, PASCAL VOC, and ImageNet) with fewer parameters.
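The perturbation layer described above, "a weighted linear combination of non-linearly activated additive noise perturbed inputs", can be sketched in a few lines. The sketch assumes ReLU as the non-linearity, a flat single-channel input, and noise masks that are drawn once and then kept fixed while only the combination weights would be learned; the abstract states none of these details, and all function names are hypothetical.

```python
import random

def relu(value):
    return max(0.0, value)

def make_noise_masks(num_masks, size, scale=0.1, seed=0):
    """Draw the additive noise masks once (assumed fixed afterwards)."""
    rng = random.Random(seed)
    return [[rng.uniform(-scale, scale) for _ in range(size)]
            for _ in range(num_masks)]

def perturbation_layer(x, masks, weights):
    """Compute sum_i weights[i] * relu(x + masks[i]), element-wise over x."""
    out = [0.0] * len(x)
    for mask, weight in zip(masks, weights):
        for idx, (x_val, noise) in enumerate(zip(x, mask)):
            out[idx] += weight * relu(x_val + noise)
    return out

x = [1.0, -2.0, 0.5]
masks = make_noise_masks(num_masks=4, size=len(x))
weights = [0.25, 0.25, 0.25, 0.25]  # would be learned in a real network
print(perturbation_layer(x, masks, weights))
```

Note that no spatial neighbourhood is read here: each output element depends only on the corresponding input element, which is what distinguishes this sketch from a convolution.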

Mix&Match – Agent Curricula for Reinforcement Learning

We introduce Mix&Match (M&M), a training framework designed to facilitate rapid and effective learning in RL agents, especially those that would be too slow or too challenging to train otherwise. The key innovation is a procedure that allows us to automatically form a curriculum over agents. Through such a curriculum we can progressively train more complex agents by, effectively, bootstrapping from solutions found by simpler agents. In contradistinction to typical curriculum learning approaches, we do not gradually modify the tasks or environments presented, but instead use a process to gradually alter how the policy is represented internally. We show the broad applicability of our method by demonstrating significant performance gains in three different experimental setups: (1) We train an agent able to control more than 700 actions in a challenging 3D first-person task; using our method to progress through an action-space curriculum we achieve both faster training and better final performance than one obtains using traditional methods. (2) We further show that M&M can be used successfully to progress through a curriculum of architectural variants defining an agent’s internal state. (3) Finally, we illustrate how a variant of our method can be used to improve agent performance in a multitask setting.

EasyConvPooling: Random Pooling with Easy Convolution for Accelerating Training and Testing

Convolution operations dominate the overall execution time of Convolutional Neural Networks (CNNs). This paper proposes an easy yet efficient technique for both Convolutional Neural Network training and testing. The conventional convolution and pooling operations are replaced by Easy Convolution and Random Pooling (ECP). In ECP, we randomly select one pixel out of four and only conduct convolution operations of the selected pixel. As a result, only a quarter of the conventional convolution computations are needed. Experiments demonstrate that the proposed EasyConvPooling can achieve 1.45x speedup on training time and 1.64x on testing time. What’s more, a speedup of 5.09x on pure Easy Convolution operations is obtained compared to conventional convolution operations.
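A minimal sketch of the idea follows, under one reading of the abstract: "one pixel out of four" is taken to mean one randomly chosen position inside each 2x2 pooling window of the convolution output, and the convolution is evaluated only there. The function names are hypothetical, and the "convolution" is the unflipped cross-correlation commonly used in CNNs.

```python
import random

def conv_at(image, kernel, row, col):
    """Evaluate a k x k cross-correlation at a single output position."""
    k = len(kernel)
    return sum(image[row + i][col + j] * kernel[i][j]
               for i in range(k) for j in range(k))

def easy_conv_pool(image, kernel, rng):
    """For each 2x2 window of the 'valid' convolution output, pick one
    position at random and compute the convolution only at that position."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    pooled = []
    for row in range(0, out_h - 1, 2):
        pooled_row = []
        for col in range(0, out_w - 1, 2):
            d_row, d_col = rng.randrange(2), rng.randrange(2)
            pooled_row.append(conv_at(image, kernel, row + d_row, col + d_col))
        pooled.append(pooled_row)
    return pooled

image = [[1] * 6 for _ in range(6)]
kernel = [[1] * 3 for _ in range(3)]
print(easy_conv_pool(image, kernel, random.Random(0)))
```

For the 6x6 input and 3x3 kernel above, a conventional convolution followed by 2x2 pooling evaluates 16 output positions, whereas this sketch evaluates 4, which is where the quarter-of-the-computation claim comes from.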

Human-like generalization in a machine through predicate learning

Humans readily generalize, applying prior knowledge to novel situations and stimuli. Advances in machine learning and artificial intelligence have begun to approximate and even surpass human performance, but machine systems reliably struggle to generalize information to untrained situations. We describe a neural network model that is trained to play one video game (Breakout) and demonstrates one-shot generalization to a new game (Pong). The model generalizes by learning representations that are functionally and formally symbolic from training data, without feedback, and without requiring that structured representations be specified a priori. The model uses unsupervised comparison to discover which characteristics of the input are invariant, and to learn relational predicates; it then applies these predicates to arguments in a symbolic fashion, using oscillatory regularities in network firing to dynamically bind predicates to arguments. We argue that models of human cognition must account for far-reaching and flexible generalization, and that in order to do so, models must be able to discover symbolic representations from unstructured data, a process we call predicate learning. Only then can models begin to adequately explain where human-like representations come from, why human cognition is the way it is, and why it continues to differ from machine intelligence in crucial ways.

Competing Prediction Algorithms

Prediction is a well-studied machine learning task, and prediction algorithms are core ingredients in online products and services. Despite their centrality in the competition between online companies who offer prediction-based products, the strategic use of prediction algorithms remains unexplored. The goal of this paper is to examine strategic use of prediction algorithms. We introduce a novel game-theoretic setting that is based on the PAC learning framework, where each player (i.e., a competing prediction algorithm) seeks to maximize the sum of points for which it produces an accurate prediction and the others do not. We show that algorithms aiming at generalization may deliberately mispredict some points to perform better than others in expectation. We analyze the empirical game, i.e. the game induced on a given sample, prove that it always possesses a pure Nash equilibrium, and show that every better-response learning process converges. Moreover, our learning-theoretic analysis suggests that players can, with high probability, learn an approximate pure Nash equilibrium for the whole population using a small number of samples.
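The payoff described above can be sketched directly: a player scores a point only where its prediction is accurate and every other player's is not. In this sketch "accurate" is simplified to an exact label match, and the hypothesis class is assumed to contain only the two constant predictors; both are assumptions made for illustration, as are the names and the data.

```python
def payoffs(predictions, labels):
    """predictions[p][i] is player p's prediction for point i.
    A player earns a point only if it is the sole correct predictor."""
    scores = [0] * len(predictions)
    for idx, truth in enumerate(labels):
        correct = [p for p, preds in enumerate(predictions)
                   if preds[idx] == truth]
        if len(correct) == 1:
            scores[correct[0]] += 1
    return scores

labels = [1, 1, 1, 0]
always_one = [1, 1, 1, 1]   # accuracy 3/4
always_zero = [0, 0, 0, 0]  # accuracy 1/4

# If both players use the more accurate predictor, neither scores.
print(payoffs([always_one, always_one], labels))
# Switching to the less accurate predictor earns the first player a point.
print(payoffs([always_zero, always_one], labels))
```

The second call shows why a player may prefer a predictor with lower accuracy: against an opponent playing `always_one`, the best response within this restricted class is `always_zero`.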

Generative Reversible Networks

Generative models with an encoding component such as autoencoders currently receive great interest. However, training of autoencoders is typically complicated by the need for training of a separate encoder and decoder model that have to be enforced to be reciprocal to each other. Here, we propose to use the by-design reversible neural networks (RevNets) as a new class of generative models. We investigate the generative performance of RevNets on the CelebA dataset, showing that generative RevNets can indeed generate coherent faces with similar quality as Variational Autoencoders. This first attempt to use RevNets as a generative model still slightly underperformed relative to recent advanced generative models using an autoencoder component on CelebA, but this gap may diminish with further optimization of the training setup of generative RevNets. In addition to the experiments on CelebA, we show a proof-of-principle experiment on the MNIST dataset suggesting that adversary-free trained RevNets can discover meaningful dimensions without pre-specifying the number of latent dimensions of the sampling distribution. In summary, this study shows that RevNets enable generative applications with an encoding component while overcoming the need of training separate encoder and decoder models.

A Visual Quality Index for Fuzzy C-Means

Cluster analysis is widely used in the areas of machine learning and data mining. Fuzzy clustering is a method in which a data point can belong to more than one cluster. Fuzzy clustering helps obtain flexible clusters, as needed in such applications as text categorization. The performance of a clustering algorithm critically depends on the number of clusters, and estimating the optimal number of clusters is a challenging task. Quality indices help estimate the optimal number of clusters. However, there is no quality index that can obtain an accurate number of clusters for different datasets. Hence, in this paper, we propose a new cluster quality index associated with a visual, graph-based solution that helps choose the optimal number of clusters in fuzzy partitions. Moreover, we validate our theoretical results through extensive comparison experiments against state-of-the-art quality indices on a variety of numerical real-world and artificial datasets.

ClusterNet : Semi-Supervised Clustering using Neural Networks

Clustering using neural networks has recently demonstrated promising performance in machine learning and computer vision applications. However, the performance of current approaches is limited either by unsupervised learning or by their dependence on a large set of labeled data samples. In this paper, we propose ClusterNet, which uses pairwise semantic constraints from very few labeled data samples (< 5% of total data) and exploits the abundant unlabeled data to drive the clustering approach. We define a new loss function that uses pairwise semantic similarity between objects combined with constrained k-means clustering to efficiently utilize both labeled and unlabeled data in the same framework. The proposed network uses a convolutional autoencoder to learn a latent representation that groups data into k specified clusters, while also learning the cluster centers simultaneously. We evaluate ClusterNet on several datasets and compare its performance against state-of-the-art deep clustering approaches.

Combining Multiple Algorithms in Classifier Ensembles using Generalized Mixture Functions

Classifier ensembles are pattern recognition structures composed of a set of classification algorithms (members), organized in a parallel way, and a combination method with the aim of increasing the classification accuracy of a classification system. In this study, we investigate the application of generalized mixture (GM) functions as a new approach for providing an efficient combination procedure for these systems through the use of dynamic weights in the combination process. Therefore, we present three GM functions to be applied as a combination method. The main advantage of these functions is that they can define dynamic weights at the member outputs, making the combination process more efficient. In order to evaluate the feasibility of the proposed approach, an empirical analysis is conducted, applying classifier ensembles to 25 different classification data sets. In this analysis, we compare the use of the proposed approaches to ensembles using traditional combination methods as well as the state-of-the-art ensemble methods. Our findings indicate gains in terms of performance when comparing the proposed approaches to the traditional ones as well as comparable results with the state-of-the-art methods.

An Explainable Adversarial Robustness Metric for Deep Learning Neural Networks

Deep Neural Networks (DNNs) have greatly advanced the field of computer vision by achieving state-of-the-art performance in various vision tasks. These results are not limited to the field of vision but can also be seen in speech recognition and machine translation tasks. Recently, DNNs have been found to fail badly when tested with samples that are crafted by making imperceptible changes to the original input images. This causes a gap between the validation and adversarial performance of a DNN. An effective and generalizable robustness metric for evaluating the performance of a DNN on these adversarial inputs is still missing from the literature. In this paper, we propose Noise Sensitivity Score (NSS), a metric that quantifies the performance of a DNN on a specific input under different forms of fix-directional attacks. A mathematical explanation is provided to give a deeper understanding of the proposed metric. By leveraging the NSS, we also propose a skewness-based dataset robustness metric for evaluating a DNN’s adversarial performance on a given dataset. Extensive experiments using widely used state-of-the-art architectures along with popular classification datasets, such as MNIST, CIFAR-10, CIFAR-100, and ImageNet, are used to validate the effectiveness and generalization of our proposed metrics. Instead of simply measuring a DNN’s adversarial robustness in the input domain, as in previous works, the proposed NSS is built on a mathematical understanding of the adversarial attack and gives a more explicit explanation of the robustness.

Understanding Regularized Spectral Clustering via Graph Conductance

This paper uses the relationship between graph conductance and spectral clustering to study (i) the failures of spectral clustering and (ii) the benefits of regularization. The explanation is simple. Sparse and stochastic graphs create a lot of small trees that are connected to the core of the graph by only one edge. Graph conductance is sensitive to these noisy ‘dangling sets’. Spectral clustering inherits this sensitivity. The second part of the paper starts from a previously proposed form of regularized spectral clustering and shows that it is related to the graph conductance on a ‘regularized graph’. We call the conductance on the regularized graph CoreCut. Based upon previous arguments that relate graph conductance to spectral clustering (e.g. Cheeger inequality), minimizing CoreCut relaxes to regularized spectral clustering. Simple inspection of CoreCut reveals why it is less sensitive to small cuts in the graph. Together, these results show that unbalanced partitions from spectral clustering can be understood as overfitting to noise in the periphery of a sparse and stochastic graph. Regularization fixes this overfitting. In addition to this statistical benefit, these results also demonstrate how regularization can improve the computational speed of spectral clustering. We provide simulations and data examples to illustrate these results.
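The sensitivity of conductance to dangling sets can be checked on a small example. The sketch below uses the standard definition, cut(S) / min(vol(S), vol(S^c)) with vol the sum of degrees, on a hypothetical unweighted graph: a 4-node clique with a 2-node path hanging off it by a single edge. It does not implement CoreCut, whose exact form is not given in the abstract.

```python
from fractions import Fraction

def conductance(edges, subset):
    """Conductance of a non-empty proper node subset of an undirected graph."""
    subset = set(subset)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        # Each edge endpoint contributes one unit of degree to its side.
        for node in (u, v):
            if node in subset:
                vol_in += 1
            else:
                vol_out += 1
        if (u in subset) != (v in subset):
            cut += 1
    return Fraction(cut, min(vol_in, vol_out))

# Core: clique on nodes 0..3. Dangling set: path 4-5 attached to node 0.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (0, 4), (4, 5)]

dangling = conductance(edges, {4, 5})
inside_core = conductance(edges, {0, 1})
print(dangling, inside_core)
```

The dangling set attains the smaller conductance, so a conductance-minimizing partition peels off the two peripheral nodes instead of finding any structure in the core, which is the unbalanced behaviour the abstract attributes to unregularized spectral clustering.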

CFCM: Segmentation via Coarse to Fine Context Memory

Recent neural-network-based architectures for image segmentation make extensive usage of feature forwarding mechanisms to integrate information from multiple scales. Although yielding good results, even deeper architectures and alternative methods for feature fusion at different resolutions have been scarcely investigated for medical applications. In this work we propose to implement segmentation via an encoder-decoder architecture which differs from any other previously published method since (i) it employs a very deep architecture based on residual learning and (ii) combines features via a convolutional Long Short Term Memory (LSTM), instead of concatenation or summation. The intuition is that the memory mechanism implemented by LSTMs can better integrate features from different scales through a coarse-to-fine strategy; hence the name Coarse-to-Fine Context Memory (CFCM). We demonstrate the remarkable advantages of this approach on two datasets: the Montgomery county lung segmentation dataset, and the EndoVis 2015 challenge dataset for surgical instrument segmentation.

Factorized Adversarial Networks for Unsupervised Domain Adaptation

In this paper, we propose Factorized Adversarial Networks (FAN) to solve unsupervised domain adaptation problems for image classification tasks. Our networks map the data distribution into a latent feature space, which is factorized into a domain-specific subspace that contains domain-specific characteristics and a task-specific subspace that retains category information, for both source and target domains, respectively. Unsupervised domain adaptation is achieved by adversarial training to minimize the discrepancy between the distributions of two task-specific subspaces from source and target domains. We demonstrate that the proposed approach outperforms state-of-the-art methods on multiple benchmark datasets used in the literature for unsupervised domain adaptation. Furthermore, we collect two real-world tagging datasets that are much larger than existing benchmark datasets, and get significant improvement upon baselines, proving the practical value of our approach.

Quantum singular value transformation and beyond: exponential improvements for quantum matrix arithmetics
Geometry and algorithms for upper triangular tropical matrix identities
Multiplicative chaos and the characteristic polynomial of the CUE: the $L^1$-phase
Relational Deep Reinforcement Learning
Eliciting Binary Performance Metrics
The Effect of Planning Shape on Dyna-style Planning in High-dimensional State Spaces
Integrating Flexible Normalization into Mid-Level Representations of Deep Convolutional Neural Networks
The poset of graphs ordered by induced containment
AdaGrad stepsizes: Sharp convergence over nonconvex landscapes, from any initialization
Videos as Space-Time Region Graphs
Tree Path Majority Data Structures
On MIMO Channel Capacity with Output Quantization Constraints
Survey and Taxonomy of Lossless Graph Compression and Space-Efficient Graph Representations
Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate
Sequential Attend, Infer, Repeat: Generative Modelling of Moving Objects
Contextual Slot Carryover for Disparate Schemas
Cycle-Consistent Adversarial Learning as Approximate Bayesian Inference
Evidential Deep Learning to Quantify Classification Uncertainty
Graph Saliency Maps through Spectral Convolutional Networks: Application to Sex Classification with Brain Connectivity
Predictive Accuracy of Markers or Risk Scores for Interval Censored Survival Data
Monte Carlo Convolution for Learning on Non-Uniformly Sampled Point Clouds
Estimating Shortest Path Length Distributions via Random Walk Sampling
Neural-Kernelized Conditional Density Estimation
A Machine Learning Framework for Stock Selection
Adapting Neural Text Classification for Improved Software Categorization
Disease Prediction using Graph Convolutional Networks: Application to Autism Spectrum Disorder and Alzheimer’s Disease
Design of remote control software of near infrared Sky Brightness Monitor in Antarctica
Luminoso at SemEval-2018 Task 10: Distinguishing Attributes Using Text Corpora and Relational Knowledge
Machine Learning for Yield Curve Feature Extraction: Application to Illiquid Corporate Bonds (Preliminary Draft)
Long-time behaviour and phase transitions for the McKean–Vlasov equation on the torus
Level-Based Analysis of the Population-Based Incremental Learning Algorithm
Graph topology inference based on sparsifying transform learning
On iterated product sets with shifts II
A Quantitative Analysis of Possible Futures of Autonomous Transport
Understanding Meanings in Multilingual Customer Feedback
Optimal control of a commercial building’s thermostatic load for off-peak demand response
Many-body localization in Fock-space: a local perspective
A Projection Method for Metric-Constrained Optimization
Practical Deep Stereo (PDS): Toward applications-friendly deep stereo matching
Clique-factors in sparse pseudorandom graphs
Recurrent Convolutional Fusion for RGB-D Object Recognition
Simplicity of the automorphism groups of some binary homogeneous structures determined by triangle constraints
On Latent Distributions Without Finite Mean in Generative Models
On the global convergence of an inexact quasi-Newton conditional gradient method for constrained nonlinear systems
Fast Dynamic Programming on Graph Decompositions
Singing voice phoneme segmentation by hierarchically inferring syllable and phoneme onset positions
Hierarchical Graph Clustering using Node Pair Sampling
RG Smoothing Algorithm Which Makes Data Compression
Native Directly Follows Operator
Deep Gaussian Processes with Convolutional Kernels
Adaptive twisting sliding mode control for quadrotor unmanned aerial vehicles
Real-time Lane Marker Detection Using Template Matching with RGB-D Camera
Explaining Away Syntactic Structure in Semantic Document Representations
BOCK : Bayesian Optimization with Cylindrical Kernels
Power-law cross-correlations: Issues, solutions and future challenges
merlin – a unified modelling framework for data analysis and methods development in Stata
Energy-efficient localised rollback after failures via data flow analysis
On layer-level control of DNN training and its impact on generalization
On the Energy Efficiency of MIMO Hybrid Beamforming for Millimeter Wave Systems with Nonlinear Power Amplifiers
Accelerated Randomized Coordinate Descent Methods for Stochastic Optimization and Online Learning
Predicting the temporal activity patterns of new venues
Brain synchronizability, a false friend
A penalty criterion for score forecasting in soccer
Stochastic Gradient Descent with Hyperbolic-Tangent Decay
Discovering and Removing Exogenous State Variables and Rewards for Reinforcement Learning
Dynamic Programming Optimization in Line of Sight Networks
Adaptive Importance Learning for Improving Lightweight Image Super-resolution Network
Strong and weak convergence rates of finite element method for stochastic partial differential equation with non-globally Lipschitz coefficients
Multi-sensor data fusion based on a generalised belief divergence measure
Combining covariance tapering and lasso driven low rank decomposition for the kriging of large spatial datasets
Combining Multiple Optimised FPGA-based Pulsar Search Modules Using OpenCL
Small gaps of circular $β$-ensemble
Dynamical aspects of generalized Schrödinger problem via Otto calculus - A heuristic point of view
Mixed Effect Composite RNN-GP: A Personalized and Reliable Prediction Model for Healthcare
TS-Net: Combining modality specific and common features for multimodal patch matching
Zonal congestion management mixing large battery storage systems and generation curtailment
Deep Mixture of Experts via Shallow Embedding
The universal approximation power of finite-width deep ReLU networks
A physical approach to dissipation-induced instabilities
Leolani: a reference machine with a theory of mind for social communication
Product formulas for certain skew tableaux
Construction of all-in-focus images assisted by depth sensing
Multi-Task Active Learning for Neural Semantic Role Labeling on Low Resource Conversational Corpus
Hyperparameter Learning for Bilevel Nonsmooth Optimization
How Do Source-side Monolingual Word Embeddings Impact Neural Machine Translation
Two general classes of integral inequalities including weight functions
Accounting for Uncertainty About Past Values In Probabilistic Projections of the Total Fertility Rate for All Countries
Bearing fault diagnosis based on domain adaptation using transferable features under different working conditions
Attention Based Fully Convolutional Network for Speech Emotion Recognition
Graph Compression Using Pattern Matching Techniques
Boredom-driven curious learning by Homeo-Heterostatic Value Gradients
Information Aggregation via Dynamic Routing for Sequence Encoding
Decomposability and time consistency of risk averse multistage programs
Deep Image Compression via End-to-End Learning
Dynamic optimal contract under parameter uncertainty with risk averse agent and principal
Leave-out estimation of variance components
Newton-Kantorovitch method for decoupled forward-backward stochastic differential equations
Near-Optimal Time and Sample Complexities for Solving Discounted Markov Decision Process with a Generative Model
Design of 32-channel TDC Based on Single FPGA for μSR Spectrometer at CSNS
Berry-Esseen bound for the Parameter Estimation of Fractional Ornstein-Uhlenbeck Processes
Forecasting Crime with Deep Learning
3D Human Pose Estimation with 2D Marginal Heatmaps
JTAV: Jointly Learning Social Media Content Representation by Fusing Textual, Acoustic, and Visual Features
SoPhie: An Attentive GAN for Predicting Paths Compliant to Social and Physical Constraints
Sampling and Super-resolution of Sparse Signals Beyond the Fourier Domain
PAC-learning in the presence of evasion adversaries
Labeling Algorithm and Compact Routing Scheme for a Small World Network Model
Informative Gene Selection for Microarray Classification via Adaptive Elastic Net with Conditional Mutual Information
Dynamic Function-on-Scalars Regression
The Value of Information in Retrospect
A Consistent Variance Estimator for 2SLS When Instruments Identify Different LATEs
Calibration for computer experiments with binary responses
Asymptotic Refinements of a Misspecification-Robust Bootstrap for Generalized Method of Moments Estimators
Querying Complex Networks in Vector Space
Analysis and Design of 8-Bit CMOS Priority Encoders
On Computing the Multiplicity of Short Cycles in Bipartite Graphs Using the Degree Distribution and the Spectrum of the Graph
A Uniform-in-$P$ Edgeworth Expansion under Weak Cramér Conditions
Composite Marginal Likelihood Methods for Random Utility Models
Unsteady PDE-constrained optimization with spectral elements using PETSc and TAO
Permanent Magnet Synchronous Motors are Globally Asymptotically Stabilizable with PI Current Control
A Fast Algorithm for Maximum Likelihood Estimation of Mixture Proportions Using Sequential Quadratic Programming
Learning Scene Flow in 3D Point Clouds
A Bayesian Penalized Hidden Markov Model for Ant Interactions
On estimation and inference in latent structure random graphs
Learning to track on-the-fly using a particle filter with annealed-weighted QPSO modeled after a singular Dirac delta potential
The shape of a memorised random walk
An analog magnon adder for all-magnonic neurons
New And Surprising Ways to Be Mean. Adversarial NPCs with Coupled Empowerment Minimisation
Invertibility of adjacency matrices for random d-regular directed graphs
A General Approach to Multi-Armed Bandits Under Risk Criteria
Strong Pseudo Transitivity and Intersection Graphs
Improving rewards in overloaded real-time systems
Mode-Coupling Theory of the Glass Transition: A Primer
Adversarial Reinforcement Learning Framework for Benchmarking Collision Avoidance Mechanisms in Autonomous Vehicles
The Data-Driven Schroedinger Bridge
Playing Atari with Six Neurons
Adversarial Domain Adaptation for Classification of Prostate Histopathology Whole-Slide Images
gprHOG: Several Simple Improvements to the Histogram of Oriented Gradients Feature for Threat Detection in Ground-Penetrating Radar
Importance Sampling Policy Evaluation with an Estimated Behavior Policy
Extracting relevant structures from self-determination theory questionnaires via Information Bottleneck method
Data-Driven Participation Factors for Nonlinear Systems Based on Koopman Mode Decomposition
Asymmetry Helps: Improved Private Information Retrieval Protocols for Distributed Storage
Design of optimal illumination patterns in single-pixel imaging using image dictionaries
Computing the Spatial Probability of Inclusion inside Partial Contours for Computer Vision Applications
Backdrop: Stochastic Backpropagation
Lifting tropical self intersections
The Impact of Supervision and Incentive Process in Explaining Wage Profile and Variance
Precise Runtime Analysis for Plateaus
Absolute Orientation for Word Embedding Alignment
Post model-fitting exploration via a ‘Next-Door’ analysis
Adaptive Critical Value for Constrained Likelihood Ratio Testing
Hypergraph encoding GDD and their linear representations
Past Visions of Artificial Futures: One Hundred and Fifty Years under the Spectre of Evolving Machines
A dedicated codec for compression of Gravitational Waves Sound
Cube Padding for Weakly-Supervised Saliency Prediction in 360° Videos
Data-driven Localization and Estimation of Disturbance in the Interconnected Power System
Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach
Continuous vs. Discontinuous Transitions in the Generalized Kuramoto Model: The Strong Effect of Dimensionality
Y-Net: Joint Segmentation and Classification for Diagnosis of Breast Biopsy Images
Towards the Practical Application of Near-Term Quantum Computers in Quantum Chemistry Simulations: A Problem Decomposition Approach
MOSES: A Streaming Algorithm for Linear Dimensionality Reduction
On sunlet graphs connected to a specific map on $\{1,2,\dots,p-1\}$
Two-particle spectral function for disordered s-wave superconductors: local maps and collective modes
Towards Understanding Acceleration Tradeoff between Momentum and Asynchrony in Nonconvex Stochastic Optimization
Gradient-based Filter Design for the Dual-tree Wavelet Transform
Evaluation of matrix factorisation approaches for muscle synergy extraction
Muscle Activity Analysis using Higher-Order Tensor Models: Application to Shared Muscle Synergy Identification
A FORTRAN Package for Efficient Multi-Accuracy Computations of the Faddeyeva Function and Related Functions of Complex Arguments
A New Wireless Communication Paradigm through Software-controlled Metasurfaces
Environment induced Symmetry Breaking of the Oscillation-Death State
A Possibility Distribution Based Multi-Criteria Decision Algorithm for Resilient Supplier Selection Problems
Inter-Satellite Communication System based on Visible Light
Internal Model from Observations for Reward Shaping
Transient temperature calculation method for complex fluid-solid heat transfer problems with scattering boundary conditions
Equivalence Between Wasserstein and Value-Aware Model-based Reinforcement Learning
OpenTag: Open Attribute Value Extraction from Product Profiles
Natural Language Generation for Electronic Health Records
An empirical characterization of community structures in complex networks using a bivariate map of quality metrics
Understanding diseases as increased heterogeneity: a complex network computational framework
Document Chunking and Learning Objective Generation for Instruction Design
Numerical Integration as an Initial Value Problem
Load Restoration Methodology Considering Renewable Energies and Combined Heat and Power Systems
Time-varying Rotational Inverted Pendulum Control using Fuzzy Approach
Compressed Sensing ECG using Restricted Boltzmann Machines
Adaptive System Identification Using LMS Algorithm Integrated with Evolutionary Computation
Comparing Alternatives to Measure the Impact of DDoS Attack Announcements on Target Stock Prices
Stochastic and Coarse-Grained Two-Dimensional Modeling of Directional Particle Movement
Safe Driving Capacity of Autonomous Vehicles
Evaluating Impact of Human Errors on the Availability of Data Storage Systems