Accelerating Human-in-the-loop Machine Learning: Challenges and Opportunities

Development of machine learning (ML) workflows is a tedious process of iterative experimentation: developers repeatedly make changes to workflows until the desired accuracy is attained. We describe our vision for a ‘human-in-the-loop’ ML system that accelerates this process: by intelligently tracking changes and intermediate results over time, such a system can enable rapid iteration, quick responsive feedback, introspection and debugging, and background execution and automation. We finally describe Helix, our preliminary attempt at such a system that has already led to speedups of up to 10x on typical iterative workflows against competing systems.

Predicting Future Machine Failure from Machine State Using Logistic Regression

Accurately predicting machine failures in advance can decrease maintenance cost and help allocate maintenance resources more efficiently. Logistic regression was applied to predict machine state 24 hours in the future given the current machine state.

High Dimensional Time Series Generators

Multidimensional time series are sequences of real valued vectors. They occur in different areas, for example handwritten characters, GPS tracking, and gestures of modern virtual reality motion controllers. Within these areas, a common task is to search for similar time series. Dynamic Time Warping (DTW) is a common distance function to compare two time series. The Edit Distance with Real Penalty (ERP) and the Dog Keeper Distance (Fréchet) are two more distance functions on time series. Their behaviour has been analyzed on 1-dimensional time series. However, it is not easy to evaluate their behaviour in relation to growing dimensionality. For this reason we propose two new data synthesizers generating multidimensional time series. The first synthesizer extends the well known cylinder-bell-funnel (CBF) dataset to multidimensional time series. Here, each time series has an arbitrary type (cylinder, bell, or funnel) in each dimension, thus for d-dimensional time series there are 3^d different classes. The second synthesizer (RAM) creates time series with ideas adapted from Brownian motion, which is a common model of movement in physics. Finally, we evaluate the applicability of a 1-nearest neighbor classifier using DTW on datasets generated by our synthesizers.
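As a rough illustration of the distance function mentioned above, the following is a minimal sketch of textbook DTW on multidimensional series. It assumes a Euclidean local cost between vectors and no warping window; the function name `dtw_distance` is hypothetical and this is not the authors' implementation.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW between two sequences of equal-dimension vectors.

    Assumption: the local cost is the Euclidean distance between vectors.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

Because the dynamic program fills an n-by-m table, the running time is quadratic in the series length and linear in the dimension d, so growing dimensionality affects only the cost of each local distance.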

Graph-based Selective Outlier Ensembles

An ensemble technique is characterized by the mechanism that generates the components and by the mechanism that combines them. A common way to achieve the consensus is to enable each component to equally participate in the aggregation process. A problem with this approach is that poor components are likely to negatively affect the quality of the consensus result. To address this issue, alternatives have been explored in the literature to build selective classifier and cluster ensembles, where only a subset of the components contributes to the computation of the consensus. Of the family of ensemble methods, outlier ensembles are the least studied. Only recently, the selection problem for outlier ensembles has been discussed. In this work we define a new graph-based class of ranking selection methods. A method in this class is characterized by two main steps: (1) Mapping the rankings onto a graph structure; and (2) Mining the resulting graph to identify a subset of rankings. We define a specific instance of the graph-based ranking selection class. Specifically, we map the problem of selecting ensemble components onto a mining problem in a graph. An extensive evaluation was conducted on a variety of heterogeneous data and methods. Our empirical results show that our approach outperforms state-of-the-art selective outlier ensemble techniques.

CytonRL: an Efficient Reinforcement Learning Open-source Toolkit Implemented in C++

This paper presents an open-source reinforcement learning toolkit named CytonRL (https://…/cytonRL). The toolkit implements four recent advanced deep Q-learning algorithms from scratch using C++ and NVIDIA’s GPU-accelerated libraries. The code is simple and elegant, owing to an open-source general-purpose neural network library named CytonLib. Benchmarks show that the toolkit achieves competitive performance on the popular Atari game of Breakout.

BigDL: A Distributed Deep Learning Framework for Big Data

In this paper, we present BigDL, a distributed deep learning framework for Big Data platforms and workflows. It is implemented on top of Apache Spark, and allows users to write their deep learning applications as standard Spark programs (running directly on large-scale big data clusters in a distributed fashion). It provides an expressive, ‘data-analytics integrated’ deep learning programming model, so that users can easily build the end-to-end analytics + AI pipelines under a unified programming paradigm; by implementing an AllReduce like operation using existing primitives in Spark (e.g., shuffle, broadcast, and in-memory data persistence), it also provides a highly efficient ‘parameter server’ style architecture, so as to achieve highly scalable, data-parallel distributed training. Since its initial open source release, BigDL users have built many analytics and deep learning applications (e.g., object detection, sequence-to-sequence generation, neural recommendations, fraud detection, etc.) on Spark.

M-PACT: Michigan Platform for Activity Classification in Tensorflow

Action classification is a widely known and popular task that offers an approach towards video understanding. The absence of an easy-to-use platform containing state-of-the-art (SOTA) models presents an issue for the community. Given that individual research code is not written with an end user in mind and in certain cases code is not released, even for published articles, the importance of a common unified platform capable of delivering results while removing the burden of developing an entire system cannot be overstated. To try and overcome these issues, we develop a TensorFlow-based unified platform to abstract away unnecessary overheads in terms of an end-to-end pipeline setup in order to allow the user to quickly and easily prototype action classification models. With the use of a consistent coding style across different models and seamless data flow between various submodules, the platform lends itself to the quick generation of results on a wide range of SOTA methods across a variety of datasets. All of these features are made possible through the use of fully pre-defined training and testing blocks built on top of a small but powerful set of modular functions that handle asynchronous data loading, model initializations, metric calculations, saving and loading of checkpoints, and logging of results. The platform is geared towards easily creating models, with the minimum requirement being the definition of a network architecture and preprocessing steps from a large custom selection of layers and preprocessing functions. M-PACT currently houses four SOTA activity classification models: I3D, C3D, ResNet50+LSTM and TSN. In terms of classification performance, ResNet50+LSTM achieves 43.86% on HMDB51, while C3D and TSN achieve 93.66% and 85.25% on UCF101, respectively.

Chronos: A Unifying Optimization Framework for Speculative Execution of Deadline-critical MapReduce Jobs

Meeting desired application deadlines in cloud processing systems such as MapReduce is crucial as the nature of cloud applications is becoming increasingly mission-critical and deadline-sensitive. It has been shown that the execution times of MapReduce jobs are often adversely impacted by a few slow tasks, known as stragglers, which result in high latency and deadline violations. While a number of strategies have been developed in existing work to mitigate stragglers by launching speculative or clone task attempts, none of them provides a quantitative framework that optimizes the speculative execution for offering guaranteed Service Level Agreements (SLAs) to meet application deadlines. In this paper, we bring several speculative scheduling strategies together under a unifying optimization framework, called Chronos, which defines a new metric, Probability of Completion before Deadlines (PoCD), to measure the probability that MapReduce jobs meet their desired deadlines. We systematically analyze PoCD for popular strategies including Clone, Speculative-Restart, and Speculative-Resume, and quantify their PoCD in closed-form. The result illuminates an important tradeoff between PoCD and the cost of speculative execution, measured by the total (virtual) machine time required under different strategies. We propose an optimization problem to jointly optimize PoCD and execution cost in different strategies, and develop an algorithmic solution that is guaranteed to be optimal. Chronos is prototyped on Hadoop MapReduce and evaluated against three baseline strategies using both experiments and trace-driven simulations, achieving 50% net utility increase with up to 80% PoCD and 88% cost improvements.

Rafiki: Machine Learning as an Analytics Service System

Big data analytics has been gaining massive momentum in the last few years. Applying machine learning models to big data has become an implicit requirement or an expectation for most analysis tasks, especially in high-stakes applications. Typical applications include sentiment analysis of reviews for analyzing on-line products, image classification in food logging applications for monitoring users’ daily intake, and stock movement prediction. Extending traditional database systems to support the above analysis is intriguing but challenging. First, it is almost impossible to implement all machine learning models in the database engines. Second, expert knowledge is required to optimize the training and inference procedures in terms of efficiency and effectiveness, which imposes a heavy burden on the system users. In this paper, we develop and present a system, called Rafiki, to provide the training and inference service of machine learning models, and facilitate complex analytics on top of cloud platforms. Rafiki provides distributed hyper-parameter tuning for the training service, and online ensemble modeling for the inference service which trades off between latency and accuracy. Experimental results confirm the efficiency, effectiveness, scalability and usability of Rafiki.

Fast Flux Detection via Data Mining on Passive DNS Traffic
Walk-Steered Convolution for Graph Classification

Graph classification is a fundamental but challenging problem due to the non-Euclidean property of graphs. In this work, we jointly leverage the powerful representation ability of random walks and the success of the standard convolutional neural network (CNN) to propose a random walk based convolutional network, called walk-steered convolution (WSC). Unlike existing graph CNNs with deterministic neighbor searching, we randomly sample multi-scale walk fields by using random walks, which is more flexible with respect to the scalability of graphs. To encode the walk field at each scale, which consists of several walk paths, we characterize the directions of the walk field by multiple Gaussian models so as to better mirror standard CNNs on images. Each Gaussian implicitly defines a direction, and together they encode the spatial layout of walks after the gradient is projected onto the space of Gaussian parameters. Further, a graph coarsening layer using dynamical clustering is stacked upon the Gaussian encoding to capture high-level semantics of the graph. Comprehensive evaluations on several public datasets demonstrate the superiority of our proposed graph learning method over other state-of-the-art methods for graph classification.

Application of the Ranking Relative Principal Component Attributes Network Model (REL-PCANet) for the Inclusive Development Index Estimation

In 2018, a new metric of countries’ economic performance, the Inclusive Development Index (IDI), composed of 12 indicators, was presented at the World Economic Forum in Davos. The new metric implies that countries might need to carry out structural reforms to improve both economic expansion and social inclusion performance. It is therefore vital for the IDI calculation method to have a strong statistical and mathematical basis, so that results are accurate and transparent for public purposes. In the current work, we propose a novel approach for IDI estimation: the Ranking Relative Principal Component Attributes Network Model (REL-PCANet). The model is based on RELARM and RankNet principles and combines elements of PCA, techniques applied in image recognition, and learning-to-rank mechanisms. We also define a new approach for estimating the target probability matrix to reflect dynamic changes in countries’ inclusive development. The empirical study showed that REL-PCANet ensures reliable and robust scores and rankings, and it is thus recommended for practical implementation.

Compressibility and Generalization in Large-Scale Deep Learning

Modern neural networks are highly overparameterized, with capacity to substantially overfit to training data. Nevertheless, these networks often generalize well in practice. It has also been observed that trained networks can often be ‘compressed’ to much smaller representations. The purpose of this paper is to connect these two empirical observations. Our main technical result is a generalization bound for compressed networks based on the compressed size. Combined with off-the-shelf compression algorithms, the bound leads to state of the art generalization guarantees; in particular, we provide the first non-vacuous generalization guarantees for realistic architectures applied to the ImageNet classification problem. As additional evidence connecting compression and generalization, we show that compressibility of models that tend to overfit is limited: We establish an absolute limit on expected compressibility as a function of expected generalization error, where the expectations are over the random choice of training examples. The bounds are complemented by empirical results that show an increase in overfitting implies an increase in the number of bits required to describe a trained network.

Confidence intervals for the area under the receiver operating characteristic curve in the presence of ignorable missing data

Receiver operating characteristic (ROC) curves are widely used as a measure of accuracy of diagnostic tests and can be summarized using the area under the ROC curve (AUC). Often, it is useful to construct confidence intervals for the AUC; however, since there are a number of different proposed methods to measure the variance of the AUC, there are many different resulting methods for constructing these intervals. In this manuscript, we compare different methods of constructing Wald-type confidence intervals in the presence of missing data where the missingness mechanism is ignorable. We find that constructing confidence intervals using multiple imputation (MI) based on logistic regression (LR) gives the most robust coverage probability and the choice of CI method is less important. However, when the missingness rate is less severe (e.g. less than 70%), we recommend using Newcombe’s Wald method for constructing confidence intervals along with multiple imputation using predictive mean matching (PMM).
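To make the notion of a Wald-type interval concrete, here is a minimal sketch for complete data only. It assumes the Mann-Whitney estimate of the AUC and, as a stand-in variance estimator, the Hanley-McNeil approximation; this is not Newcombe's method, and it does not include the imputation step the abstract discusses. The function name `auc_wald_ci` is hypothetical.

```python
import math


def auc_wald_ci(pos_scores, neg_scores, z=1.96):
    """Return (auc, lower, upper) for a Wald-type interval: auc +/- z * SE.

    Assumption: the variance comes from the Hanley-McNeil approximation,
    chosen here only for illustration.
    """
    n1, n2 = len(pos_scores), len(neg_scores)
    # Mann-Whitney estimate: share of (positive, negative) pairs ranked
    # correctly, with ties counted as one half.
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos_scores
        for q in neg_scores
    )
    auc = wins / (n1 * n2)
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    var = (
        auc * (1.0 - auc)
        + (n1 - 1) * (q1 - auc * auc)
        + (n2 - 1) * (q2 - auc * auc)
    ) / (n1 * n2)
    half_width = z * math.sqrt(max(var, 0.0))
    # Clip to [0, 1], since a plain Wald interval can leave the valid range.
    return auc, max(0.0, auc - half_width), min(1.0, auc + half_width)
```

Swapping in a different variance estimator changes only the `var` line, which is why many interval methods can share the same Wald form.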

Neural Models for Reasoning over Multiple Mentions using Coreference

Many problems in NLP require aggregating information from multiple mentions of the same entity which may be far apart in the text. Existing Recurrent Neural Network (RNN) layers are biased towards short-term dependencies and hence not suited to such tasks. We present a recurrent layer which is instead biased towards coreferent dependencies. The layer uses coreference annotations extracted from an external system to connect entity mentions belonging to the same cluster. Incorporating this layer into a state-of-the-art reading comprehension model improves performance on three datasets — Wikihop, LAMBADA and the bAbi AI tasks — with large gains when training data is scarce.

A Boosting Framework of Factorization Machine

Recently, Factorization Machines (FM) have become more and more popular for recommendation systems, due to their effectiveness in finding informative interactions between features. Usually, the weights for the interactions are learnt as a low rank weight matrix, which is formulated as an inner product of two low rank matrices. This low rank can help improve the generalization ability of Factorization Machines. However, choosing the rank properly usually requires running the algorithm many times with different ranks, which is clearly inefficient for some large-scale datasets. To alleviate this issue, we propose an Adaptive Boosting framework of Factorization Machines (AdaFM), which can adaptively search for proper ranks for different datasets without re-training. Instead of using a fixed rank for FM, the proposed algorithm gradually increases its rank according to its performance, using a boosting strategy, until the performance stops improving. To verify the performance of our proposed framework, we conduct an extensive set of experiments on many real-world datasets. Encouraging empirical results show that the proposed algorithms are generally more effective than other state-of-the-art Factorization Machines.
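The role of the rank can be seen in a minimal sketch of the standard second-order FM prediction, where the interaction weight of features i and j is the inner product of two rank-k factor vectors. The sketch shows only the base model with a fixed rank, not the boosting procedure of AdaFM; the function and parameter names (`fm_predict`, `w0`, `w`, `V`) are hypothetical.

```python
def fm_predict(x, w0, w, V):
    """Second-order Factorization Machine prediction.

    x  : list of n feature values
    w0 : global bias
    w  : list of n linear weights
    V  : n rows of k factors; the weight of the pair (i, j) is <V[i], V[j]>
    """
    n, k = len(x), len(V[0])
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        # Identity: sum_{i<j} v_if v_jf x_i x_j
        #         = 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise
```

The rank k is the width of `V`, so the cost of one prediction grows linearly in k, and trying several ranks by retraining from scratch multiplies the training cost accordingly.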

Reinforced Co-Training

Co-training is a popular semi-supervised learning framework to utilize a large amount of unlabeled data in addition to a small labeled set. Co-training methods exploit predicted labels on the unlabeled data and select samples based on prediction confidence to augment the training. However, the selection of samples in existing co-training methods is based on a predetermined policy, which ignores the sampling bias between the unlabeled and the labeled subsets, and fails to explore the data space. In this paper, we propose a novel method, Reinforced Co-Training, to select high-quality unlabeled samples to better co-train on. More specifically, our approach uses Q-learning to learn a data selection policy with a small labeled dataset, and then exploits this policy to train the co-training classifiers automatically. Experimental results on clickbait detection and generic text classification tasks demonstrate that our proposed method can obtain more accurate text classification results.

Cross-Domain Adversarial Auto-Encoder

In this paper, we propose the Cross-Domain Adversarial Auto-Encoder (CDAAE) to address the problem of cross-domain image inference, generation and transformation. We make the assumption that images from different domains share the same latent code space for content, while having separate latent code spaces for style. The proposed framework can map cross-domain data to a latent code vector consisting of a content part and a style part. The latent code vector is matched with a prior distribution so that we can generate meaningful samples from any part of the prior space. Consequently, given a sample of one domain, our framework can generate various samples of the other domain with the same content as the input. This makes the proposed framework different from the current work on cross-domain transformation. Besides, the proposed framework can be trained with both labeled and unlabeled data, which makes it also suitable for domain adaptation. Experimental results on the SVHN, MNIST and CASIA data sets show that the proposed framework achieves visually appealing performance on the image generation task. We also demonstrate that the proposed method achieves superior results for domain adaptation. The code for our experiments is available at https://…/CDAAE.

Feature Propagation on Graph: A New Perspective to Graph Representation Learning

We study feature propagation on graphs, an inference process involved in graph representation learning tasks. It spreads the features over the whole graph up to the t-th order, thus expanding the features of the end nodes. The process has been successfully adopted in graph embedding and graph neural networks; however, few works have studied the convergence of feature propagation. Without convergence guarantees, it may lead to unexpected numerical overflows and task failures. In this paper, we first define the concept of feature propagation on graphs formally, and then study its convergence conditions to equilibrium states. We further link feature propagation to several established approaches such as node2vec and structure2vec. At the end of this paper, we extend existing approaches from representing nodes to representing edges (edge2vec) and demonstrate its application to fraudulent transaction detection in a real-world scenario. Experiments show that it is quite competitive.
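The convergence concern can be illustrated with a minimal sketch. It assumes one simple propagation rule, H_{t+1} = X + alpha * P * H_t with P the row-normalised adjacency matrix, which is an illustrative choice and not the paper's formal definition. Under this assumed rule the iteration reaches an equilibrium when alpha < 1 and can grow without bound otherwise. The name `propagate` and its parameters are hypothetical.

```python
def propagate(adj, feats, alpha=0.5, tol=1e-10, max_iter=1000):
    """Iterate H <- X + alpha * P @ H until the update is smaller than tol.

    adj   : n x n adjacency matrix (lists of 0/1 or weights)
    feats : n x d initial node features X
    Returns (H, number_of_steps); raises RuntimeError if no equilibrium
    is reached within max_iter steps.
    """
    n, dim = len(adj), len(feats[0])
    # Row-normalise so that each node averages over its neighbours;
    # isolated nodes keep a divisor of 1 to avoid division by zero.
    deg = [sum(row) or 1 for row in adj]
    h = [row[:] for row in feats]
    for step in range(1, max_iter + 1):
        new_h = []
        for i in range(n):
            row = []
            for d in range(dim):
                agg = sum(adj[i][j] * h[j][d] for j in range(n)) / deg[i]
                row.append(feats[i][d] + alpha * agg)
            new_h.append(row)
        delta = max(
            abs(new_h[i][d] - h[i][d]) for i in range(n) for d in range(dim)
        )
        h = new_h
        if delta < tol:
            return h, step
    raise RuntimeError("feature propagation did not converge")
```

Checking the size of each update, rather than running a fixed number of steps, is one way to surface a divergent configuration before it overflows.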

A Support Tensor Train Machine

There has been growing interest in extending traditional vector-based machine learning techniques to their tensor forms. An example is the support tensor machine (STM) that utilizes a rank-one tensor to capture the data structure, thereby alleviating the overfitting and curse of dimensionality problems in the conventional support vector machine (SVM). However, the expressive power of a rank-one tensor is restrictive for many real-world data. To overcome this limitation, we introduce a support tensor train machine (STTM) by replacing the rank-one tensor in an STM with a tensor train. Experiments validate and confirm the superiority of an STTM over the SVM and STM.

VC-Dimension Based Generalization Bounds for Relational Learning

In many applications of relational learning, the available data can be seen as a sample from a larger relational structure (e.g. we may be given a small fragment from some social network). In this paper we are particularly concerned with scenarios in which we can assume that (i) the domain elements appearing in the given sample have been uniformly sampled without replacement from the (unknown) full domain and (ii) the sample is complete for these domain elements (i.e. it is the full substructure induced by these elements). Within this setting, we study bounds on the error of sufficient statistics of relational models that are estimated on the available data. As our main result, we prove a bound based on a variant of the Vapnik-Chervonenkis dimension which is suitable for relational data.

MetaBags: Bagged Meta-Decision Trees for Regression

Ensembles are popular methods for solving practical supervised learning problems. They reduce the risk of having underperforming models in production-grade software. Although critical, methods for learning heterogeneous regression ensembles have not been proposed at large scale, whereas in classical ML literature, stacking, cascading and voting are mostly restricted to classification problems. Regression poses distinct learning challenges that may result in poor performance, even when using well established homogeneous ensemble schemas such as bagging or boosting. In this paper, we introduce MetaBags, a novel, practically useful stacking framework for regression. MetaBags is a meta-learning algorithm that learns a set of meta-decision trees designed to select one base model (i.e. expert) for each query, and focuses on inductive bias reduction. A set of meta-decision trees are learned using different types of meta-features, specially created for this purpose – to then be bagged at meta-level. This procedure is designed to learn a model with a fair bias-variance trade-off, and its improvement over base model performance is correlated with the prediction diversity of different experts on specific input space subregions. The proposed method and meta-features are designed in such a way that they enable good predictive performance even in subregions of space which are not adequately represented in the available training data. An exhaustive empirical testing of the method was performed, evaluating both generalization error and scalability of the approach on synthetic, open and real-world application datasets. The obtained results show that our method significantly outperforms existing state-of-the-art approaches.

On Improving Deep Reinforcement Learning for POMDPs

Deep Reinforcement Learning (RL) recently emerged as one of the most competitive approaches for learning in sequential decision making problems with fully observable environments, e.g., computer Go. However, very little work has been done in deep RL to handle partially observable environments. We propose a new architecture called Action-specific Deep Recurrent Q-Network (ADRQN) to enhance learning performance in partially observable domains. Actions are encoded by a fully connected layer and coupled with a convolutional observation to form an action-observation pair. The time series of action-observation pairs are then integrated by an LSTM layer that learns latent states based on which a fully connected layer computes Q-values as in conventional Deep Q-Networks (DQNs). We demonstrate the effectiveness of our new architecture in several partially observable domains, including flickering Atari games.

Clustering Analysis on Locally Asymptotically Self-similar Processes
Distributed Approximate Newton Algorithms and Weight Design for Constrained Optimization
Weighted Low-Rank Approximation of Matrices and Background Modeling
Dual CNN Models for Unsupervised Monocular Depth Estimation
Joint Recursive Beam and Channel Tracking for 2-dimensional Phased Antenna Arrays
Optimal chemotherapy and immunotherapy schedules for a cancer-obesity model with Caputo time fractional derivative
Analysis of Extremely Obese Individuals Using Deep Learning Stacked Autoencoders and Genome-Wide Genetic Data
Spin transport in long-range interacting one-dimensional chain
Universal Dependency Parsing for Hindi-English Code-switching
Egocentric 6-DoF Tracking of Small Handheld Objects
Limits of multiplicative inhomogeneous random graphs and Lévy trees
Uniform Substitution for Differential Game Logic
Evaluating Massive MIMO Precoding based on 3D-Channel Measurements with a Spider Antenna
Area Rate Evaluation based on Spatial Clustering of massive MIMO Channel Measurements
Subcarrier-Interlaced FDD for Faster-than-TDD Channel Tracking in Massive MIMO Systems
Densely Connected High Order Residual Network for Single Frame Image Super Resolution
Tree Morphology for Phenotyping from Semantics-Based Mapping in Orchard Environments
An information-theoretic on-line update principle for perception-action coupling
Persistence probability of a random polynomial arising from evolutionary game theory
Trace class Markov chains for the Normal-Gamma Bayesian shrinkage model
Heuristic Approaches for Goal Recognition in Incomplete Domain Models
Improving Implicit Discourse Relation Classification by Modeling Inter-dependencies of Discourse Units in a Paragraph
A stochastic second-order generalized estimating equations approach for estimating intraclass correlation coefficient in the presence of informative missing data
UCBoost: A Boosting Approach to Tame Complexity and Optimality for Stochastic Bandits
Towards Robust Monitoring of Stealthy Diffusion
Unimodal Polynomials and Lattice Walk Enumeration with Experimental Mathematics
Learning a Deep Listwise Context Model for Ranking Refinement
Unbiased Learning to Rank with Unbiased Propensity Estimation
Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task
Segmentation of both Diseased and Healthy Skin from Clinical Photographs in a Primary Care Setting
Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation
One-point boundaries of ends of clusters in percolation in $\mathbb H^d$
Distribution Estimation in Discounted MDPs via a Transformation
Erdős-Szekeres On-Line
k-Maximum Subarrays for Small k: Divide-and-Conquer made simpler
Can Neural Machine Translation be Improved with User Feedback?
Structured Recovery with Heavy-tailed Measurements: A Thresholding Procedure and Optimal Rates
All Technologies Work Together for Good: A Glance to Future Mobile Networks
MaxGain: Regularisation of Neural Networks by Constraining Activation Magnitudes
Two-dimensional Brownian random interlacements
Source-channel separation for two-way interactive communication with fidelity criteria
A Deeper Look into Dependency-Based Word Embeddings
Optimal mean squared error bandwidth for spectral variance estimators in MCMC simulations
Undecidability of approximating the capacity of time-invariant Markoff channel with feedback, and non-existence of linear finite-letter conditional mutual information characterizations for this channel assuming Schanuel’s conjecture
Controlling the Charging of Electric Vehicles with Neural Networks
A Univariate Bound of Area Under ROC
A Fusion Framework for Camouflaged Moving Foreground Detection in the Wavelet Domain
Equivalence between spectral properties of graphs with and without loops
Learning Joint Semantic Parsers from Disjoint Data
Structuring Wikipedia Articles with Section Recommendations
A New Decidable Class of Tuple Generating Dependencies: The Triangularly-Guarded Class
Sharper bounds and structural results for minimally nonlinear 0-1 matrices
Asymptotic Achievable Rate of Two-Dimensional Constraint Codes based on Column by Column Encoding
Joint Quantizer Optimization based on Neural Quantizer for Sum-Product Decoder
The Subfield Codes of Hyperoval and Conic codes
Monte Carlo Syntax Marginals for Exploring and Using Dependency Parses
Geometry-aware Deep Network for Single-Image Novel View Synthesis
On extremal cacti with respect to the edge revised Szeged index
God Save the Queen
Improving Temporal Relation Extraction with a Globally Acquired Statistical Resource
Regret Bounds for Model-Free Linear Quadratic Control
DoubleFusion: Real-time Capture of Human Performances with Inner Body Shapes from a Single Depth Sensor
Fortification of Neural Morphological Segmentation Models for Polysynthetic Minimal-Resource Languages
Learning to Color from Language
ListOps: A Diagnostic Dataset for Latent Tree Learning
$N$-detachable pairs in 3-connected matroids II: life in $X$
Pixels, voxels, and views: A study of shape representations for single view 3D object shape prediction
On Barotropic Mechanisms of Uncertainty Propagation in Estimation of Drake Passage Transport
On incidence choosability of cubic graphs
A combinatorial model for $\nabla m_μ$
Real-Time Rotation-Invariant Face Detection with Progressive Calibration Networks
Constructions of maximum few-distance sets in Euclidean spaces
A Concatenated Residual Network for Image Deblurring
Co-occurrence Feature Learning from Skeleton Data for Action Recognition and Detection with Hierarchical Aggregation
Adversarial Example Generation with Syntactically Controlled Paraphrase Networks
Improving Deep Binary Embedding Networks by Order-aware Reweighting of Triplets
Deep Learning on Operational Facility Data Related to Large-Scale Distributed Area Scientific Workflows
Analytical Approach for Active Distribution Network Restoration Including Optimal Voltage Regulation
Patterns in random permutations avoiding some sets of multiple patterns
Central Limit Theorems for Diophantine approximants
Automatic Construction of Parallel Portfolios via Explicit Instance Grouping
Sparse Unsupervised Capsules Generalize Better
Parametric Models for Mutual Kernel Matrix Completion
Real-Time Algorithm for Globally Optimal Impulsive Control of Linear Time-Variant Systems
Max-linear models on infinite graphs generated by Bernoulli bond percolation
Progress on the adjacent vertex distinguishing edge colouring conjecture
Human Motion Capture Using a Drone
The TUM VI Benchmark for Evaluating Visual-Inertial Odometry
Automatic Assessment of Artistic Quality of Photos
Pooling or Sampling: Collective Dynamics for Electrical Flow Estimation
Fast and Accurate Tensor Completion with Tensor Trains: A System Identification Approach
Multiple sets exponential concentration and higher order eigenvalues
SeerNet at SemEval-2018 Task 1: Domain Adaptation for Affect in Tweets
Self-Conjugate-Reciprocal Irreducible Monic Factors of $x^n-1$ over Finite Fields and Their Applications
A General Formula for the Stationary Distribution of the Age of Information and Its Application to Single-Server Queues
Quenched convergence and strong local equilibrium for asymmetric zero-range process with site disorder
Classicalization Clearly: Quantum Transition into States of Maximal Memory Storage Capacity
Skew divided difference operators in the Nichols algebra associated to a finite Coxeter group
Packing the Boolean lattice with copies of a poset
Regular expansion for the characteristic exponent of a product of $2 \times 2$ random matrices
Parity Games with Weights
On the Simplicity of Eigenvalues of Two Nonhomogeneous Euler-Bernoulli Beams Connected by a Point Mass
Memetic Algorithms Beat Evolutionary Algorithms on the Class of Hurdle Problems
Sampling of graph signals via randomized local aggregations
Benford or not Benford: a systematic but not always well-founded use of an elegant law in experimental fields
Probabilistic entailment and iterated conditionals
Investigating Backtranslation in Neural Machine Translation
Reaching Distributed Equilibrium with Limited ID Space
LCMR: Local and Centralized Memories for Collaborative Filtering with Unstructured Text
IGCV$2$: Interleaved Structured Sparse Convolutional Neural Networks
Effective Filtering for Multiscale Stochastic Dynamical Systems in Hilbert Spaces
Simple Baselines for Human Pose Estimation and Tracking
DetNet: A Backbone network for Object Detection
Learning Sparse Latent Representations with the Deep Copula Information Bottleneck
Hierarchical correlation reconstruction with missing data
Asymptotic behavior of large Gaussian correlated Wishart matrices
A Comparison of Machine Learning Algorithms for the Surveillance of Autism Spectrum Disorder
An Efficient SIMD Implementation of Pseudo-Verlet Lists for Neighbour Interactions in Particle-Based Codes
A Saliency-based Convolutional Neural Network for Table and Chart Detection in Digitized Documents
Vortex Pooling: Improving Context Representation in Semantic Segmentation
3D positive lattice walks and spherical triangles
PM-GANs: Discriminative Representation Learning for Action Recognition Using Partial-modalities
Temporal Coherent and Graph Optimized Manifold Ranking for Visual Tracking
Synthetic data generation for Indic handwritten text recognition
Random walk on the Poincaré disk induced by a group of Möbius transformations
Automated vehicle’s behavior decision making using deep reinforcement learning and high-fidelity simulation environment
Pattern Avoidance of Generalized Permutations
Structured networks and coarse-grained descriptions: a dynamical perspective
Balanced shellings and moves on balanced manifolds
Network Signatures from Image Representation of Adjacency Matrices: Deep/Transfer Learning for Subgraph Classification
PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image
A General Duality Principle for Non-Convex Variational Optimization
Bayesian model-data synthesis with an application to global Glacio-Isostatic Adjustment
Efficient Solvers for Sparse Subspace Clustering
Optimization Strategies for Real-Time Control of an Autonomous Melting Probe
PredRNN++: Towards A Resolution of the Deep-in-Time Dilemma in Spatiotemporal Predictive Learning
Three-Dimensional GPU-Accelerated Active Contours for Automated Localization of Cells in Large Images
Leveraging Statistical Multi-Agent Online Planning with Emergent Value Function Approximation
Learning Awareness Models
Robust Kalman Filtering: Asymptotic Analysis of the Least Favorable Model
When and Why are Pre-trained Word Embeddings Useful for Neural Machine Translation?
Classifying Antimicrobial and Multifunctional Peptides with Bayesian Network Models
Formal Duality in Finite Abelian Groups
Training a Binary Weight Object Detector by Knowledge Transfer for Autonomous Driving
Similarity between Learning Outcomes from Course Objectives using Semantic Analysis, Bloom's taxonomy and Corpus statistics
On $f$-Divergences: Integral Representations, Local Behavior, and Inequalities
New bounds on the distance Laplacian and distance signless Laplacian spectral radii
Generalized Hypergraph Coloring
Not-so-supervised: a survey of semi-supervised, multi-instance, and transfer learning in medical image analysis
An Exponential Speedup in Parallel Running Time for Submodular Maximization without Loss in Approximation
Time- and space-optimal algorithms for the many-visits TSP
DGPose: Disentangled Semi-supervised Deep Generative Models for Human Body Analysis
Piecewise Linearization of Quadratic Branch Flow Limits by Irregular Polygon
Short proofs in extrema of spectrally one sided Lévy processes
The Sphere Packing Bound For Memoryless Channels
Im2Avatar: Colorful 3D Reconstruction from a Single Image
Hyperbolic quantum color codes
Data-based Distributionally Robust Stochastic Optimal Power Flow, Part II: Case studies
Bootstrapping Generators from Noisy Data
Occurrence of anomalous diffusion and non-local response in highly-scattering acoustic periodic media
Data-based Distributionally Robust Stochastic Optimal Power Flow, Part I: Methodologies
Computationally Efficient Day-Ahead OPF using Post-Optimal Analysis with Renewable and Load Uncertainties