Model-Driven Artificial Intelligence for Online Network Optimization

Future 5G wireless networks will rely on agile and automated network management, where the usage of diverse resources must be jointly optimized with surgical accuracy. A number of key wireless network functionalities (e.g., traffic steering, energy savings) give rise to hard optimization problems. What is more, high spatio-temporal traffic variability, coupled with the need to satisfy strict per-slice/service SLAs in modern networks, suggests that these problems must be constantly (re-)solved to maintain close-to-optimal performance. To this end, in this paper we propose the framework of Online Network Optimization (ONO), which seeks to maintain both agile and efficient control over time, using an arsenal of data-driven, adaptive, and AI-based techniques. Since the mathematical tools and the studied regimes vary widely among these methodologies, a theoretical comparison is often out of reach. Therefore, the important question ‘what is the right ONO technique?’ remains open to date. In this paper, we discuss the pros and cons of each technique and further attempt a direct quantitative comparison for a specific use case, using real data. Our results suggest that carefully combining the insights of problem modeling with state-of-the-art AI techniques provides significant advantages at reasonable complexity.

Progressive Evaluation of Queries over Untagged Data

Modern information systems often collect raw data in the form of text, images, video, and sensor readings. Such data needs to be further interpreted/enriched prior to being analyzed. Enrichment is often a result of automated machine learning and/or signal processing techniques that associate appropriate but uncertain tags with the data. Traditionally, with the notable exception of a few systems, enrichment is considered to be a separate pre-processing step performed independently prior to data analysis. Such an approach is becoming increasingly infeasible since modern data capture technologies enable creation of very large data collections for which it is computationally difficult/impossible and ultimately not beneficial to derive all tags as a preprocessing step. Hence, approaches that perform tagging at query/analysis time on the data of interest need to be considered. This paper explores the problem of joint tagging and query processing. In particular, the paper considers a scenario where tagging can be performed using several techniques that differ in cost and accuracy and develops a progressive approach to answering queries (SPJ queries with a restricted version of join) that enriches the right data to the right degree so as to maximize the quality of the query results. The experimental results show that the proposed approach performs significantly better than the baseline approaches.

Towards Adversarial Configurations for Software Product Lines

Ensuring that all supposedly valid configurations of a software product line (SPL) lead to well-formed and acceptable products is challenging since it is most of the time impractical to enumerate and test all individual products of an SPL. Machine learning classifiers have been recently used to predict the acceptability of products associated with unseen configurations. For some configurations, a tiny change in their feature values can make them pass from acceptable to non-acceptable regarding users’ requirements and vice-versa. In this paper, we introduce the idea of leveraging these specific configurations and their positions in the feature space to improve the classifier and therefore the engineering of an SPL. Starting from a variability model, we propose to use Adversarial Machine Learning techniques to create new, adversarial configurations out of already known configurations by modifying their feature values. Using an industrial video generator we show how adversarial configurations can improve not only the classifier, but also the variability model, the variability implementation, and the testing oracle.

Modeling Cognitive Processes in Social Tagging to Improve Tag Recommendations

With the emergence of Web 2.0, tag recommenders have become important tools, which aim to support users in finding descriptive tags for their bookmarked resources. Although current algorithms provide good results in terms of tag prediction accuracy, they are often designed in a data-driven way and thus, lack a thorough understanding of the cognitive processes that play a role when people assign tags to resources. This thesis aims at modeling these cognitive dynamics in social tagging in order to improve tag recommendations and to better understand the underlying processes. As a first attempt in this direction, we have implemented an interplay between individual micro-level (e.g., categorizing resources or temporal dynamics) and collective macro-level (e.g., imitating other users’ tags) processes in the form of a novel tag recommender algorithm. The preliminary results for datasets gathered from BibSonomy, CiteULike and Delicious show that our proposed approach can outperform current state-of-the-art algorithms, such as Collaborative Filtering, FolkRank or Pairwise Interaction Tensor Factorization. We conclude that recommender systems can be improved by incorporating related principles of human cognition.

Optimal Control Via Neural Networks: A Convex Approach

Control of complex systems involves both system identification and controller design. Deep neural networks have proven to be successful in many identification tasks, such as classification, prediction, and end-to-end system modeling. However, from the controller design perspective, these networks are difficult to work with because they are typically nonlinear and nonconvex. Therefore many systems are still optimized and controlled based on simple linear models despite their poor identification performance. In this paper we address this problem by explicitly constructing deep neural networks that are convex with respect to their inputs. We show that these input convex networks can be trained to obtain accurate models of complex physical systems. In particular, we design input convex recurrent neural networks to capture temporal behavior of dynamical systems. Then optimal controllers based on these networks can be designed by solving convex optimization problems. Results on both toy models and real-world image denoising and building energy optimization problems demonstrate the modeling accuracy and control efficiency of the proposed approach.
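
The construction idea can be illustrated with a minimal sketch, assuming one common recipe for input convexity: hidden-to-hidden and output weights are kept non-negative, the activation (ReLU) is convex and non-decreasing, and the input may enter every layer through unconstrained weights. The class name `InputConvexNet`, the layer sizes, and the random initialization are hypothetical and are not taken from the paper; the network here is untrained and only demonstrates the convexity property.

```python
import random

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def relu(v):
    """Element-wise ReLU: convex and non-decreasing."""
    return [max(0.0, x) for x in v]

def rand_matrix(rng, rows, cols, low, high):
    return [[rng.uniform(low, high) for _ in range(cols)] for _ in range(rows)]

class InputConvexNet:
    """Hypothetical two-hidden-layer scalar network, convex in its input x.

    z1  = relu(A0 x + b0)
    z2  = relu(Wz z1 + A1 x + b1)   with Wz >= 0 (element-wise)
    out = w_out . z2                with w_out >= 0
    """

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Weights acting directly on the input may have any sign.
        self.A0 = rand_matrix(rng, n_hidden, n_in, -1.0, 1.0)
        self.b0 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
        self.A1 = rand_matrix(rng, n_hidden, n_in, -1.0, 1.0)
        self.b1 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
        # Weights acting on hidden activations are constrained to be >= 0.
        self.Wz = rand_matrix(rng, n_hidden, n_hidden, 0.0, 1.0)
        self.w_out = [rng.uniform(0.0, 1.0) for _ in range(n_hidden)]

    def __call__(self, x):
        z1 = relu([a + b for a, b in zip(matvec(self.A0, x), self.b0)])
        pre = [p + q + b for p, q, b in
               zip(matvec(self.Wz, z1), matvec(self.A1, x), self.b1)]
        z2 = relu(pre)
        return sum(w * z for w, z in zip(self.w_out, z2))
```

Because each layer is a non-negative combination of convex functions plus an affine term, passed through a convex non-decreasing activation, the output satisfies f(t x + (1 - t) y) <= t f(x) + (1 - t) f(y) for any inputs x, y and any t in [0, 1]. In a trained model the sign constraints would have to be maintained during optimization, for example by clipping after each update.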

Predictive Performance Modeling for Distributed Computing using Black-Box Monitoring and Machine Learning

In many domains, the previous decade was characterized by increasing data volumes and growing complexity of computational workloads, creating new demands for highly data-parallel computing in distributed systems. Effective operation of these systems is challenging when facing uncertainties about the performance of jobs and tasks under varying resource configurations, e.g., for scheduling and resource allocation. We survey predictive performance modeling (PPM) approaches to estimate performance metrics such as execution duration, required memory or wait times of future jobs and tasks based on past performance observations. We focus on non-intrusive methods, i.e., methods that can be applied to any workload without modification, since the workload is usually a black-box from the perspective of the systems managing the computational infrastructure. We classify and compare sources of performance variation, predicted performance metrics, required training data, use cases, and the underlying prediction techniques. We conclude by identifying several open problems and pressing research needs in the field.

Building your Cross-Platform Application with RHEEM

Today, organizations typically perform tedious and costly tasks to juggle their code and data across different data processing platforms. Addressing this pain and achieving automatic cross-platform data processing is quite challenging because it requires considerable expertise in all the available data processing platforms. In this report, we present Rheem, a general-purpose cross-platform data processing system that relieves users of the pain of finding the most efficient data processing platform for a given task. It also splits a task into subtasks and assigns each subtask to a specific platform to minimize the overall cost (e.g., runtime or monetary cost). To offer cross-platform functionality, it features (i) a robust interface to easily compose data analytic tasks; (ii) a novel cost-based optimizer able to find the most efficient platform in almost all cases; and (iii) an executor to efficiently orchestrate tasks over different platforms. As a result, it allows users to focus on the business logic of their applications rather than on the mechanics of how to compose and execute them. Rheem is released under an open source license.

Teaching Meaningful Explanations

The adoption of machine learning in high-stakes applications such as healthcare and law has lagged in part because predictions are not accompanied by explanations comprehensible to the domain user, who often holds ultimate responsibility for decisions and outcomes. In this paper, we propose an approach to generate such explanations in which training data is augmented to include, in addition to features and labels, explanations elicited from domain users. A joint model is then learned to produce both labels and explanations from the input features. This simple idea ensures that explanations are tailored to the complexity expectations and domain knowledge of the consumer. Evaluation spans multiple modeling techniques on a simple game dataset, an image dataset, and a chemical odor dataset, showing that our approach is generalizable across domains and algorithms. Results demonstrate that meaningful explanations can be reliably taught to machine learning algorithms, and in some cases, improve modeling accuracy.

One-at-a-time: A Meta-Learning Recommender-System for Recommendation-Algorithm Selection on Micro Level

In this proposal we present the idea of a ‘macro recommender system’ and a ‘micro recommender system’. Both can be considered recommender systems for recommendation algorithms. A macro recommender system recommends the best performing recommendation algorithm to an organization that wants to build a recommender system. This way, an organization does not need to test many algorithms over long periods to find the best one for their particular platform. A micro recommender system recommends the best performing recommendation algorithm for each individual recommendation request. This proposal is based on the premise that there is no single-best algorithm for all users, items, and contexts. For instance, a micro recommender system might recommend one algorithm when recommendations for an elderly male user in the evening should be created. When recommendations for a young female user in the morning should be given, the micro recommender system might recommend a different algorithm.

Marian: Cost-effective High-Quality Neural Machine Translation in C++

This paper describes the submissions of the ‘Marian’ team to the WNMT 2018 shared task. We investigate combinations of teacher-student training, low-precision matrix products, auto-tuning and other methods to optimize the Transformer model on GPU and CPU. By further integrating these methods with the new averaging attention networks, a recently introduced faster Transformer variant, we create a number of high-quality, high-performance models on the GPU and CPU, dominating the Pareto frontier for this shared task.

MPDCompress – Matrix Permutation Decomposition Algorithm for Deep Neural Network Compression

Deep neural networks (DNNs) have become the state-of-the-art technique for machine learning tasks in various applications. However, due to their size and the computational complexity, large DNNs are not readily deployable on edge devices in real-time. To manage complexity and accelerate computation, network compression techniques based on pruning and quantization have been proposed and shown to be effective in reducing network size. However, such network compression can result in irregular matrix structures that are mismatched with modern hardware-accelerated platforms, such as graphics processing units (GPUs) designed to perform the DNN matrix multiplications in a structured (block-based) way. We propose MPDCompress, a DNN compression algorithm based on matrix permutation decomposition via random mask generation. In-training application of the masks molds the synaptic weight connection matrix to a sub-graph separation format. Aided by the random permutations, a hardware-desirable block matrix is generated, allowing for a more efficient implementation and compression of the network. To show versatility, we empirically verify MPDCompress on several network models, compression rates, and image datasets. On the LeNet 300-100 model (MNIST dataset), Deep MNIST, and CIFAR10, we achieve 10 X network compression with less than 1% accuracy loss compared to non-compressed accuracy performance. On AlexNet for the full ImageNet ILSVRC-2012 dataset, we achieve 8 X network compression with less than 1% accuracy loss, with top-5 and top-1 accuracies of 79.6% and 56.4%, respectively. Finally, we observe that the algorithm can offer inference speedups across various hardware platforms, with 4 X faster operation achieved on several mobile GPUs.
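
The relationship between a random mask and a block matrix can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the mask is a block-diagonal pattern scrambled by random row and column permutations, so that undoing the permutations recovers the block structure. All function names are hypothetical.

```python
import random

def block_diagonal_mask(n_rows, n_cols, n_blocks):
    """0/1 mask with n_blocks equal blocks along the diagonal."""
    assert n_rows % n_blocks == 0 and n_cols % n_blocks == 0
    rb, cb = n_rows // n_blocks, n_cols // n_blocks
    return [[1 if i // rb == j // cb else 0 for j in range(n_cols)]
            for i in range(n_rows)]

def permute(mask, row_perm, col_perm):
    """Return the matrix whose entry (i, j) is mask[row_perm[i]][col_perm[j]]."""
    return [[mask[r][c] for c in col_perm] for r in row_perm]

def invert(perm):
    """Inverse permutation: invert(perm)[perm[i]] == i."""
    inv = [0] * len(perm)
    for i, p in enumerate(perm):
        inv[p] = i
    return inv

def random_permuted_mask(n_rows, n_cols, n_blocks, seed=0):
    """Scramble a block-diagonal mask with random row/column permutations."""
    rng = random.Random(seed)
    row_perm = list(range(n_rows))
    col_perm = list(range(n_cols))
    rng.shuffle(row_perm)
    rng.shuffle(col_perm)
    base = block_diagonal_mask(n_rows, n_cols, n_blocks)
    return permute(base, row_perm, col_perm), row_perm, col_perm
```

In this sketch the mask keeps a fraction 1/n_blocks of the connections. Applying `permute` with the inverse permutations to the scrambled mask returns the block-diagonal pattern, which is the kind of regular structure that block-based matrix multiplication hardware can exploit.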

Towards Understanding the Role of Over-Parametrization in Generalization of Neural Networks

Despite existing work on ensuring generalization of neural networks in terms of scale sensitive complexity measures, such as norms, margin and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization. In this work we suggest a novel complexity measure based on unit-wise capacities resulting in a tighter generalization bound for two layer ReLU networks. Our capacity bound correlates with the behavior of test error with increasing network sizes, and could potentially explain the improvement in generalization with over-parametrization. We further present a matching lower bound for the Rademacher complexity that improves over previous capacity lower bounds for neural networks.

Regularized Kernel and Neural Sobolev Descent: Dynamic MMD Transport

We introduce Regularized Kernel and Neural Sobolev Descent for transporting a source distribution to a target distribution along smooth paths of minimum kinetic energy (defined by the Sobolev discrepancy), related to dynamic optimal transport. In the kernel version, we give a simple algorithm to perform the descent along gradients of the Sobolev critic, and show that it converges asymptotically to the target distribution in the MMD sense. In the neural version, we parametrize the Sobolev critic with a neural network with input gradient norm constrained in expectation. We show in theory and experiments that regularization has an important role in favoring smooth transitions between distributions, avoiding large discrete jumps. Our analysis could provide a new perspective on the impact of critic updates (early stopping) on the paths to equilibrium in the GAN setting.

Counterstrike: Defending Deep Learning Architectures Against Adversarial Samples by Langevin Dynamics with Supervised Denoising Autoencoder

Adversarial attacks on deep learning models have been demonstrated to be imperceptible to a human, while decreasing the model performance considerably. Attempts to provide invariance against such attacks have denoised adversarial samples to only send cleaned samples to the classifier. In a similar spirit, this paper proposes a novel, effective strategy that relaxes adversarial samples onto the underlying manifold of the (unknown) target class distribution. Specifically, given an off-manifold adversarial example, our Metropolis-adjusted Langevin algorithm (Mala), guided through a supervised denoising autoencoder network (sDAE), drives the adversarial samples towards high-density regions of the data-generating distribution. In a nutshell, the adversarial example is transformed back from off-manifold onto the data manifold for which the learning model was originally trained and where it can perform well and robustly. Experiments on various benchmark datasets show that our novel Malade method exhibits a high robustness against blackbox and whitebox attacks and outperforms state-of-the-art defense algorithms.

The Dynamics of Learning: A Random Matrix Approach

Understanding the learning dynamics of neural networks is one of the key issues for the improvement of optimization algorithms as well as for the theoretical comprehension of why deep neural nets work so well today. In this paper, we introduce a random matrix-based framework to analyze the learning dynamics of a single-layer linear network on a binary classification problem, for data of simultaneously large dimension and size, trained by gradient descent. Our results provide rich insights into common questions in neural nets, such as overfitting, early stopping and the initialization of training, thereby opening the door for future studies of more elaborate structures and models appearing in today’s neural networks.

Grow and Prune Compact, Fast, and Accurate LSTMs

Long short-term memory (LSTM) has been widely used for sequential data modeling. Researchers have increased LSTM depth by stacking LSTM cells to improve performance. This incurs model redundancy, increases run-time delay, and makes the LSTMs more prone to overfitting. To address these problems, we propose a hidden-layer LSTM (H-LSTM) that adds hidden layers to LSTM’s original one level non-linear control gates. H-LSTM increases accuracy while employing fewer external stacked layers, thus reducing the number of parameters and run-time latency significantly. We employ grow-and-prune (GP) training to iteratively adjust the hidden layers through gradient-based growth and magnitude-based pruning of connections. This learns both the weights and the compact architecture of H-LSTM control gates. We have GP-trained H-LSTMs for image captioning and speech recognition applications. For the NeuralTalk architecture on the MSCOCO dataset, our three models reduce the number of parameters by 38.7x [floating-point operations (FLOPs) by 45.5x], run-time latency by 4.5x, and improve the CIDEr score by 2.6. For the DeepSpeech2 architecture on the AN4 dataset, our two models reduce the number of parameters by 19.4x (FLOPs by 23.5x), run-time latency by 15.7%, and the word error rate from 12.9% to 8.7%. Thus, GP-trained H-LSTMs can be seen to be compact, fast, and accurate.
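
The two operations named in the abstract, gradient-based growth and magnitude-based pruning, can be sketched on a flat list of connections. This is a minimal illustration under assumptions: connections are tracked with a 0/1 mask, pruning removes the active connections with the smallest absolute weight, and growth activates the dormant connections with the largest absolute gradient. The function names and the selection rules are hypothetical simplifications, not the paper's training procedure.

```python
def prune_by_magnitude(weights, mask, fraction):
    """Deactivate the given fraction of active connections with smallest |weight|."""
    active = [i for i, m in enumerate(mask) if m]
    n_prune = int(len(active) * fraction)
    to_prune = sorted(active, key=lambda i: abs(weights[i]))[:n_prune]
    new_mask = list(mask)
    for i in to_prune:
        new_mask[i] = 0
    return new_mask

def grow_by_gradient(grads, mask, n_grow):
    """Activate the n_grow dormant connections with largest |gradient|."""
    dormant = [i for i, m in enumerate(mask) if not m]
    to_grow = sorted(dormant, key=lambda i: abs(grads[i]), reverse=True)[:n_grow]
    new_mask = list(mask)
    for i in to_grow:
        new_mask[i] = 1
    return new_mask
```

In an iterative scheme, these two steps would alternate with ordinary weight training, with the mask applied to the weights after every update.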

To Trust Or Not To Trust A Classifier

Knowing when a classifier’s prediction can be trusted is useful in many applications and critical for safely using AI. While the bulk of the effort in machine learning research has been towards improving classifier performance, understanding when a classifier’s predictions should and should not be trusted has received far less attention. The standard approach is to use the classifier’s discriminant or confidence score; however, we show there exists a considerably more effective alternative. We propose a new score, called the trust score, which measures the agreement between the classifier and a modified nearest-neighbor classifier on the testing example. We show empirically that high (low) trust scores produce surprisingly high precision at identifying correctly (incorrectly) classified examples, consistently outperforming the classifier’s confidence score as well as many other baselines. Further, under some mild distributional assumptions, we show that if the trust score for an example is high (low), the classifier will likely agree (disagree) with the Bayes-optimal classifier. Our guarantees consist of non-asymptotic rates of statistical consistency under various nonparametric settings and build on recent developments in topological data analysis.
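
A simplified version of such a score can be sketched as follows. The concrete form is an assumption, since the abstract does not spell it out: the score is taken to be the ratio between the distance from the test example to the nearest training point of any class other than the predicted one and the distance to the nearest training point of the predicted class. Any filtering of training points before the nearest-neighbor search is omitted. The function names and the toy data layout (`train_by_class`, a dict from label to a list of points) are hypothetical.

```python
import math

def nearest_distance(x, points):
    """Euclidean distance from x to the closest point in a non-empty list."""
    return min(math.dist(x, p) for p in points)

def trust_score(x, predicted_label, train_by_class):
    """Ratio of distance to the nearest other class over distance to the predicted class.

    Values well above 1 mean the nearest-neighbor view agrees with the
    classifier; values below 1 mean some other class is closer.
    """
    d_pred = nearest_distance(x, train_by_class[predicted_label])
    d_other = min(
        nearest_distance(x, pts)
        for label, pts in train_by_class.items()
        if label != predicted_label
    )
    return d_other / (d_pred + 1e-12)  # small constant avoids division by zero
```

For example, with two well-separated classes, a test point lying next to class 0 gets a score above 1 if the classifier predicted class 0 and a score below 1 if it predicted class 1.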

Collaborative Learning for Deep Neural Networks

We introduce collaborative learning in which multiple classifier heads of the same network are simultaneously trained on the same training data to improve generalization and robustness to label noise with no extra inference cost. It acquires the strengths from auxiliary training, multi-task learning and knowledge distillation. There are two important mechanisms involved in collaborative learning. First, the consensus of multiple views from different classifier heads on the same example provides supplementary information as well as regularization to each classifier, thereby improving generalization. Second, intermediate-level representation (ILR) sharing with backpropagation rescaling aggregates the gradient flows from all heads, which not only reduces training computational complexity, but also facilitates supervision to the shared layers. The empirical results on CIFAR and ImageNet datasets demonstrate that deep neural networks learned as a group in a collaborative way significantly reduce the generalization error and increase the robustness to label noise.

Learn to Combine Modalities in Multimodal Deep Learning

Combining complementary information from multiple modalities is intuitively appealing for improving the performance of learning-based approaches. However, it is challenging to fully leverage different modalities due to practical challenges such as varying levels of noise and conflicts between modalities. Existing methods do not adopt a joint approach to capturing synergies between the modalities while simultaneously filtering noise and resolving conflicts on a per sample basis. In this work we propose a novel deep neural network based technique that multiplicatively combines information from different source modalities. Thus the model training process automatically focuses on information from more reliable modalities while reducing emphasis on the less reliable modalities. Furthermore, we propose an extension that multiplicatively combines not only the single-source modalities, but a set of mixtured source modalities to better capture cross-modal signal correlations. We demonstrate the effectiveness of our proposed technique by presenting empirical results on three multimodal classification tasks from different domains. The results show consistent accuracy improvements on all three tasks.

Sapphire: Querying RDF Data Made Simple

RDF data in the linked open data (LOD) cloud is very valuable for many different applications. In order to unlock the full value of this data, users should be able to issue complex queries on the RDF datasets in the LOD cloud. SPARQL can express such complex queries, but constructing SPARQL queries can be a challenge to users since it requires knowing the structure and vocabulary of the datasets being queried. In this paper, we introduce Sapphire, a tool that helps users write syntactically and semantically correct SPARQL queries without prior knowledge of the queried datasets. Sapphire interactively helps the user while typing the query by providing auto-complete suggestions based on the queried data. After a query is issued, Sapphire provides suggestions on ways to change the query to better match the needs of the user. We evaluated Sapphire based on performance experiments and a user study and showed it to be superior to competing approaches.

Rethinking Knowledge Graph Propagation for Zero-Shot Learning

The potential of graph convolutional neural networks for the task of zero-shot learning has been demonstrated recently. These models are highly sample efficient as related concepts in the graph structure share statistical strength allowing generalization to new classes when faced with a lack of data. However, knowledge from distant nodes can get diluted when propagating through intermediate nodes, because current approaches to zero-shot learning use graph propagation schemes that perform Laplacian smoothing at each layer. We show that extensive smoothing does not help the task of regressing classifier weights in zero-shot learning. In order to still incorporate information from distant nodes and utilize the graph structure, we propose an Attentive Dense Graph Propagation Module (ADGPM). ADGPM allows us to exploit the hierarchical graph structure of the knowledge graph through additional connections. These connections are added based on a node’s relationship to its ancestors and descendants and an attention scheme is further used to weigh their contribution depending on the distance to the node. Finally, we illustrate that finetuning of the feature representation after training the ADGPM leads to considerable improvements. Our method achieves competitive results, outperforming previous zero-shot learning approaches.

A Novel Multi-clustering Method for Hierarchical Clusterings, Based on Boosting

Bagging and boosting have been shown to be the best methods of building multiple classifiers in classification combination problems. In the area of ‘flat clustering’ problems, it is also recognized that multi-clustering methods based on boosting provide clusterings of an improved quality. In this paper, we introduce a novel multi-clustering method for ‘hierarchical clusterings’ based on boosting theory, which creates a more stable hierarchical clustering of a dataset. The proposed algorithm includes a boosting iteration in which a bootstrap of samples is created by weighted random sampling of elements from the original dataset. A hierarchical clustering algorithm is then applied to the selected subsample to build a dendrogram which describes the hierarchy. Finally, the dissimilarity description matrices of the multiple dendrograms are combined into a consensus one, using a hierarchical-clustering-combination approach. Experiments on popular real datasets show that the boosted method provides superior-quality solutions compared to standard hierarchical clustering methods.

Supervised Policy Update

We propose a new sample-efficient methodology, called Supervised Policy Update (SPU), for deep reinforcement learning. Starting with data generated by the current policy, SPU optimizes over the proximal policy space to find a non-parameterized policy. It then solves a supervised regression problem to convert the non-parameterized policy to a parameterized policy, from which it draws new samples. There is significant flexibility in setting the labels in the supervised regression problem, with different settings corresponding to different underlying optimization problems. We develop a methodology for finding an optimal policy in the non-parameterized policy space, and show how Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO) can be addressed by this methodology. In terms of sample efficiency, our experiments show SPU can outperform PPO for simulated robotic locomotion tasks.

Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition

The design of a reward function often poses a major practical challenge to real-world applications of reinforcement learning. Approaches such as inverse reinforcement learning attempt to overcome this challenge, but require expert demonstrations, which can be difficult or expensive to obtain in practice. We propose variational inverse control with events (VICE), which generalizes inverse reinforcement learning methods to cases where full demonstrations are not needed, such as when only samples of desired goal states are available. Our method is grounded in an alternative perspective on control and reinforcement learning, where an agent’s goal is to maximize the probability that one or more events will happen at some point in the future, rather than maximizing cumulative rewards. We demonstrate the effectiveness of our methods on continuous control tasks, with a focus on high-dimensional observations like images where rewards are hard or even impossible to specify.

LSTMs Exploit Linguistic Attributes of Data

While recurrent neural networks have found success in a variety of natural language processing applications, they are general models of sequential data. We investigate how the properties of natural language data affect an LSTM’s ability to learn a nonlinguistic task: recalling elements from its input. We find that models trained on natural language data are able to recall tokens from much longer sequences than models trained on non-language sequential data. Furthermore, we show that the LSTM learns to solve the memorization task by explicitly using a subset of its neurons to count timesteps in the input. We hypothesize that the patterns and structure in natural language data enable LSTMs to learn by providing approximate ways of reducing loss, but understanding the effect of different training data on the learnability of LSTMs remains an open question.

High Dimensional Robust Sparse Regression

We provide a novel — and to the best of our knowledge, the first — algorithm for high dimensional sparse regression with corruptions in explanatory and/or response variables. Our algorithm recovers the true sparse parameters in the presence of a constant fraction of arbitrary corruptions. Our main contribution is a robust variant of Iterative Hard Thresholding. Using this, we provide accurate estimators with sub-linear sample complexity. Our algorithm consists of a novel randomized outlier removal technique for robust sparse mean estimation that may be of interest in its own right: it is orderwise more efficient computationally than existing algorithms, and succeeds with high probability, thus making it suitable for general use in iterative algorithms. We demonstrate the effectiveness on large-scale sparse regression problems with arbitrary corruptions.
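
For orientation, the standard (non-robust) Iterative Hard Thresholding iteration that the paper builds on can be sketched as below. This is the plain textbook scheme, x <- H_k(x + step * A^T (y - A x)), where H_k keeps the k largest-magnitude entries; it does not include the paper's robust modifications or its outlier removal step. The function names, the step size, and the iteration count are illustrative assumptions.

```python
def hard_threshold(v, k):
    """Keep the k entries of v with largest absolute value; zero the rest."""
    keep = set(sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)[:k])
    return [x if i in keep else 0.0 for i, x in enumerate(v)]

def iht(A, y, k, step=1.0, n_iter=50):
    """Plain iterative hard thresholding for y ~ A x with x k-sparse.

    A is a list of rows; step should be small enough for the gradient
    step to be stable (e.g. at most 1 / largest eigenvalue of A^T A).
    """
    n_rows, n_cols = len(A), len(A[0])
    x = [0.0] * n_cols
    for _ in range(n_iter):
        # Residual r = y - A x
        residual = [yi - sum(a * xj for a, xj in zip(row, x))
                    for row, yi in zip(A, y)]
        # Gradient direction A^T r
        grad = [sum(A[i][j] * residual[i] for i in range(n_rows))
                for j in range(n_cols)]
        # Gradient step followed by projection onto k-sparse vectors
        x = hard_threshold([xj + step * g for xj, g in zip(x, grad)], k)
    return x
```

With corrupted rows in A or y, the gradient A^T r in this plain version can be dominated by outliers, which is the failure mode a robust variant has to address.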

On Consensus-Optimality Trade-offs in Collaborative Deep Learning
Critical and minimal connectivity of power graphs of finite groups
Amnestic Forgery: an Ontology of Conceptual Metaphors
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
Comparative analysis of the structures and outcomes of geophysical flow models and modeling assumptions using uncertainty quantification
Context-aware Cascade Attention-based RNN for Video Emotion Recognition
Social Signals in the Ethereum Trading Network
Automorphism groups of designs with $λ=1$
PID2018 Benchmark Challenge: learning feedforward control
CuisineNet: Food Attributes Classification using Multi-scale Convolution Network
Automatic, fast and robust characterization of noise distributions for diffusion MRI
Code-Switching Language Modeling using Syntax-Aware Multi-Task Learning
A Robust and Effective Approach Towards Accurate Metastasis Detection and pN-stage Classification in Breast Cancer
A four vertex theorem for frieze patterns
Stochastic Deep Compressive Sensing for the Reconstruction of Diffusion Tensor Cardiac MRI
Bilingual Character Representation for Efficiently Addressing Out-of-Vocabulary Words in Code-Switching Named Entity Recognition
On Theorem 6 in ‘Relative Entropy and the Multivariable Multidimensional Moment Problem’ [Mar 2006 1052-1066]
Connectedness of the Cross-Join Graph of de Bruijn Sequences
The Aldous chain on cladograms in the diffusion limit
Unwinding the model manifold: choosing similarity measures to remove local minima in sloppy dynamical systems
Graph Sparsification, Spectral Sketches, and Faster Resistance Computation, via Short Cycle Decompositions
Robust Place Categorization with Deep Domain Generalization
End-to-end named entity extraction from speech
Predicting County Level Corn Yields Using Deep Long Short Term Memory Models
Polynomial Factorization Is Simple and Helpful — More So Than It Seems to Be
BUNDLEP: Prioritizing Conflict Free Regions in Multi-Threaded Programs to Improve Cache Reuse — Extended Results and Technical Report
Optimal dividends with partial information and stopping of a degenerate reflecting diffusion
Identifying and Understanding User Reactions to Deceptive and Trusted Social News Sources
On short expressions for cosets of permutation subgroups
Privacy Aware Offloading of Deep Neural Networks
On $q$-ratio CMSV for sparse recovery
Generalizing to Unseen Domains via Adversarial Data Augmentation
Optimal Placement of Baseband Functions for Energy Harvesting Virtual Small Cells
Beam Discovery Using Linear Block Codes for Millimeter Wave Communication Networks
Why Is My Classifier Discriminatory?
Reference-free Calibration in Sensor Networks
l0-norm Based Centers Selection for Failure Tolerant RBF Networks
Automatic generation of object shapes with desired functionalities
MolGAN: An implicit generative model for small molecular graphs
Two-stage Method for Millimeter Wave Channel Estimation
Automatic Large-Scale Data Acquisition via Crowdsourcing for Crosswalk Classification: A Deep Learning Approach
Short-term Load Forecasting with Deep Residual Networks
Adjacency and Tensor Representation in General Hypergraphs. Part 2: Multisets, Hb-graphs and Related e-adjacency Tensors
Fast L1-Minimization Algorithm for Sparse Approximation Based on an Improved LPNN-LCA framework
Two-stage Method for the Reconstruction of a Low-Rank Matrix
A Lagrangian Dual Based Approach to Sparse Linear Programming
Well-posedness of Stochastic 3D Leray-$α$ Model with Fractional Dissipation
Character-Level Models versus Morphology in Semantic Role Labeling
Matrix-free multigrid block-preconditioners for higher order Discontinuous Galerkin discretisations
Learning to Generate Facial Depth Maps
Square-free Groebner degenerations
Anonymous Walk Embeddings
Multiple Manifolds Metric Learning with Application to Image Set Classification
On the Spectrum of Random Features Maps of High Dimensional Data
Iterative Antenna Selection for Secrecy Enhancement in Massive MIMO Wiretap Channels
Propagating Confidences through CNNs for Sparse Data Regression
Needle Tip Force Estimation using an OCT Fiber and a Fused convGRU-CNN Architecture
Quantitative approach to multifractality induced by correlations and broad distribution of data
Who Learns Better Bayesian Network Structures: Constraint-Based, Score-based or Hybrid Algorithms?
Estimation of seasonal long-memory parameters
Q-Graph: Preserving Query Locality in Multi-Query Graph Processing
Capacity bounds for bandlimited Gaussian channels with peak-to-average-power-ratio constraint
Differential Properties of Sinkhorn Approximation for Learning with Wasserstein Distance
RLS Recovery with Asymmetric Penalty: Fundamental Limits and Algorithmic Approaches
Theoretical Bounds on MAP Estimation in Distributed Sensing Networks
Multi-Message Private Information Retrieval with Private Side Information
Orientable arithmetic matroids
DATA:SEARCH’18 — Searching Data on the Web
Energy-Efficient Caching for Scalable Videos in Heterogeneous Networks
The One-Shot Crowdfunding Game
Multidimensional free-mobility equilibrium: Tiebout revisited
A Corpus of English-Hindi Code-Mixed Tweets for Sarcasm Detection
An English-Hindi Code-Mixed Corpus: Stance Annotation and Baseline System
Using Inter-Sentence Diverse Beam Search to Reduce Redundancy in Visual Storytelling
Space-Efficient DFS and Applications: Simpler, Leaner, Faster
Foresee: Attentive Future Projections of Chaotic Road Environments with Online Training
Resilience Control of DC Shipboard Power Systems
Invariance pressure of control sets
RUN: Residual U-Net for Computer-Aided Detection of Pulmonary Nodules without Candidate Selection
ADAGIO: Interactive Experimentation with Adversarial Attack and Defense for Audio
Neural Joking Machine: Humorous image captioning
An Information-Theoretic Analysis of Thompson Sampling for Large Action Spaces
Learning multiple non-mutually-exclusive tasks for improved classification of inherently ordered labels
A Radial Basis Function based Optimization Algorithm with Regular Simplex set geometry in Ellipsoidal Trust-Regions
Anaphora and Coreference Resolution: A Review
Generic CP-Supported CMSA for Binary Integer Linear Programs
Visual Referring Expression Recognition: What Do Systems Actually Learn?
Enabling Pedestrian Safety using Computer Vision Techniques: A Case Study of the 2018 Uber Inc. Self-driving Car Crash
The VIREO KIS at VBS 2018
Stochastic Zeroth-order Optimization via Variance Reduction method
A Markov Chain Model for the Cure Rate of Non-Performing Loans
New Bounds for the Signless Laplacian Spread
CRRN: Multi-Scale Guided Concurrent Reflection Removal Network
Long short-term memory networks in memristor crossbars
Accelerating Large-Scale Data Analysis by Offloading to High-Performance Computing Libraries using Alchemist
Automated proof synthesis for propositional logic with deep neural networks
Quantum correlations and entanglement in a Kitaev-type spin chain
Infinite Arms Bandit: Optimality via Confidence Bounds
Tight Regret Bounds for Bayesian Optimization in One Dimension
A Fine-to-Coarse Convolutional Neural Network for 3D Human Action Recognition
Multi-function Convolutional Neural Networks for Improving Image Classification Performance
Hyperspectral Imaging Technology and Transfer Learning Utilized in Identification Haploid Maize Seeds
Critical Exponent of the Anderson Transition using Massively Parallel Supercomputing
Detecting Data Leakage from Databases on Android Apps with Concept Drift
Cellular Controlled Cooperative Unmanned Aerial Vehicle Networks with Sense-and-Send Protocol
Object Detection using Domain Randomization and Generative Adversarial Refinement of Synthetic Images
Efficient Sequential and Parallel Algorithms for Estimating Higher Order Spectra
Planning, Inference and Pragmatics in Sequential Language Games
Autonomous Vehicles that Interact with Pedestrians: A Survey of Theory and Practice
Critical point for infinite cycles in a random loop model on trees
AutoZOOM: Autoencoder-based Zeroth Order Optimization Method for Attacking Black-box Neural Networks
Fast Incremental von Neumann Graph Entropy Computation: Theory, Algorithm, and Applications
‘Press Space to Fire’: Automatic Video Game Tutorial Generation
A Geometric Property of Relative Entropy and the Universal Threshold Phenomenon for Binary-Input Channels with Noisy State Information at the Encoder
Adversarial Learning of Task-Oriented Neural Dialog Models
Optimal Testing in the Experiment-rich Regime
Multi-turn Dialogue Response Generation in an Adversarial Learning Framework
On seeking efficient Pareto optimal points in multi-player minimum cost flow problems with application to transportation systems
Unsupervised Text Style Transfer using Language Models as Discriminators
Sublinear decoding schemes for non-adaptive group testing with inhibitors
Bayesian Estimations for Diagonalizable Bilinear SPDEs
Semantic Road Layout Understanding by Generative Adversarial Inpainting
Superpixel-enhanced Pairwise Conditional Random Field for Semantic Segmentation
A study on prefixes of $c_2$ invariants
Inexact Stochastic Mirror Descent for two-stage nonlinear stochastic programs
HeadOn: Real-time Reenactment of Human Portrait Videos
Characterizing Energy Efficiency of Wireless Transmission for Green Internet of Things: A Data-Oriented Approach
Fairness and Sum-Rate Maximization via Joint Channel and Power Allocation in Uplink SCMA Networks
The Age of Updates in a Simple Relay Network
Bottom-up approach to torus bifurcation in neuron models
Deep Mesh Projectors for Inverse Problems
Deep Video Portraits
Depth and nonlinearity induce implicit exploration for RL
Active and Adaptive Sequential learning
Deep Semantic Architecture with discriminative feature visualization for neuroimage analysis
Regularization of time-varying covariance matrices using linear stochastic systems
Regularization of covariance matrices on Riemannian manifolds using linear systems
Coded Computation Against Distributed Straggling Decoders for Gaussian Channels in C-RAN
A projected primal-dual splitting for solving constrained monotone inclusions
Can DNNs Learn to Lipread Full Sentences?
On Visibility Problems with an Infinite Discrete Set of Obstacles
Simulation of particle systems interacting through hitting times
Why Botnets Work: Distributed Brute-Force Attacks Need No Synchronization
A Unified Particle-Optimization Framework for Scalable Bayesian Sampling
A doubly stochastic enhancement of the Failure Forecast Method using a noisy mean-reverting process
Sign matrix polytopes from Young tableaux
Optimal Bidding, Allocation and Budget Spending for a Demand Side Platform Under Many Auction Types
Classifying Rotationally-Closed Languages Having Greedy Universal Cycles
K-Beam Subgradient Descent for Minimax Optimization
Diagnosing Glaucoma Progression with Visual Field Data Using a Spatiotemporal Boundary Detection Method
Continuity Of Pontryagin Extremals With Respect To Delays In Nonlinear Optimal Control
Entropy-controlled Last-Passage Percolation
A law of large numbers for the range of rotor walks on periodic trees
Long Short-Term Memory Networks for CSI300 Volatility Prediction with Baidu Search Volume
On a sufficient condition for a Fano manifold to be covered by rational $N$-folds
Duopoly Investment Problems with Minimally Bounded Adjustment Costs
Algebraic Expression of Spatial and Temporal Pattern
Deep Learning for Topological Invariants
Biologically Motivated Algorithms for Propagating Local Target Representations
Splitting source code identifiers using Bidirectional LSTM Recurrent Neural Network
Dynamic Advisor-Based Ensemble (dynABE): Case Study in Stock Trend Prediction of a Major Critical Metal Producer