Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN++

Manually labeling datasets with object masks is extremely time consuming. In this work, we follow the idea of Polygon-RNN to produce polygonal annotations of objects interactively using humans-in-the-loop. We introduce several important improvements to the model: 1) we design a new CNN encoder architecture, 2) show how to effectively train the model with Reinforcement Learning, and 3) significantly increase the output resolution using a Graph Neural Network, allowing the model to accurately annotate high-resolution objects in images. Extensive evaluation on the Cityscapes dataset shows that our model, which we refer to as Polygon-RNN++, significantly outperforms the original model in both automatic (10% absolute and 16% relative improvement in mean IoU) and interactive modes (requiring 50% fewer clicks by annotators). We further analyze the cross-domain scenario in which our model is trained on one dataset, and used out of the box on datasets from varying domains. The results show that Polygon-RNN++ exhibits powerful generalization capabilities, achieving significant improvements over existing pixel-wise methods. Using simple online fine-tuning we further achieve a high reduction in annotation time for new datasets, moving a step closer towards an interactive annotation tool to be used in practice.

HAMLET: Interpretable Human And Machine co-LEarning Technique

Efficient label acquisition processes are key to obtaining robust classifiers. However, data labeling is often challenging and subject to high levels of label noise. This can arise even when classification targets are well defined, if instances to be labeled are more difficult than the prototypes used to define the class, leading to disagreements among the expert community. Here, we enable efficient training of deep neural networks. From low-confidence labels, we iteratively improve their quality by simultaneous learning of machines and experts. We call it Human And Machine co-LEarning Technique (HAMLET). Throughout the process, experts become more consistent, while the algorithm provides them with explainable feedback for confirmation. HAMLET uses a neural embedding function and a memory module filled with diverse reference embeddings from different classes. Its output includes classification labels and highly relevant reference embeddings as explanation. As an application, we use HAMLET on continuous electroencephalography (cEEG) data from brain monitoring in the intensive care unit (ICU). Although cEEG monitoring yields large volumes of data, labeling costs and difficulty make it hard to build a classifier. Additionally, while experts agree on the labels of clear-cut examples of cEEG patterns, labeling much real-world cEEG data can be extremely challenging. Thus, a large minority of sequences might be mislabeled. HAMLET shows a significant performance gain over deep learning and other baselines, increasing accuracy from 7.03% to 68.75% on challenging inputs. Besides improved performance, clinical experts confirmed the interpretability of the reference embeddings in helping to explain HAMLET's classification results.

MOrdReD: Memory-based Ordinal Regression Deep Neural Networks for Time Series Forecasting

Time series forecasting is ubiquitous in the modern world. Applications range from health care to astronomy, and include climate modelling, financial trading and monitoring of critical engineering equipment. To offer value over this range of activities we must have models that not only provide accurate forecasts but that also quantify and adjust their uncertainty over time. Furthermore, such models must allow for multimodal, non-Gaussian behaviour that arises regularly in applied settings. In this work, we propose a novel, end-to-end deep learning method for time series forecasting. Crucially, our model allows the principled assessment of predictive uncertainty as well as providing rich information regarding multiple modes of future data values. Our approach not only provides an excellent predictive forecast, shadowing true future values, but also allows us to infer valuable information, such as the predictive distribution of the occurrence of critical events of interest, accurately and reliably even over long time horizons. We find the method outperforms other state-of-the-art algorithms, such as Gaussian Processes.

Locality-Sensitive Hashing for Earthquake Detection: A Case Study Scaling Data-Driven Science

In this work, we report on a novel application of Locality Sensitive Hashing (LSH) to seismic data at scale. Based on the high waveform similarity between reoccurring earthquakes, our application identifies potential earthquakes by searching for similar time series segments via LSH. However, a straightforward implementation of this LSH-enabled application has difficulty scaling beyond 3 months of continuous time series data measured at a single seismic station. As a case study of a data-driven science workflow, we illustrate how domain knowledge can be incorporated into the workload to improve both the efficiency and result quality. We describe several end-to-end optimizations of the analysis pipeline from pre-processing to post-processing, which allow the application to scale to time series data measured at multiple seismic stations. Our optimizations enable an over 100x speed up in the end-to-end analysis pipeline. This improved scalability enabled seismologists to perform seismic analysis on more than ten years of continuous time series data from over ten seismic stations, and has directly enabled the discovery of 597 new earthquakes near the Diablo Canyon nuclear power plant in California and 6123 new earthquakes in New Zealand.
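
The similarity search described above can be illustrated with a small sketch. It assumes random-hyperplane LSH over fixed-length waveform windows, which is a generic LSH variant chosen for brevity and not necessarily the hashing scheme used in the paper; the function names and parameters (`candidate_pairs`, `num_tables`, `num_bits`) are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations


def make_hyperplanes(num_bits, dim, rng):
    """Draw num_bits random Gaussian hyperplanes of dimension dim."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(window, hyperplanes):
    """One bit per hyperplane: which side of the plane the window falls on."""
    return tuple(
        1 if sum(w * h for w, h in zip(window, plane)) >= 0 else 0
        for plane in hyperplanes
    )


def candidate_pairs(windows, num_tables=8, num_bits=6, seed=0):
    """Return index pairs (i, j), i < j, that share a bucket in any hash table.

    Assumption: all windows have the same length. Windows pointing in
    similar directions collide with high probability; dissimilar ones rarely do.
    """
    rng = random.Random(seed)
    dim = len(windows[0])
    pairs = set()
    for _ in range(num_tables):
        planes = make_hyperplanes(num_bits, dim, rng)
        buckets = defaultdict(list)
        for idx, window in enumerate(windows):
            buckets[signature(window, planes)].append(idx)
        for members in buckets.values():
            pairs.update(combinations(members, 2))
    return pairs
```

Only the pairs returned by `candidate_pairs` need an exact similarity check, which is what lets this kind of search avoid comparing every window against every other window.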

Computational Power and the Social Impact of Artificial Intelligence

Machine learning is a computational process. To that end, it is inextricably tied to computational power – the tangible material of chips and semiconductors that the algorithms of machine intelligence operate on. Most obviously, computational power and computing architectures shape the speed of training and inference in machine learning, and therefore influence the rate of progress in the technology. But these relationships are more nuanced than that: hardware shapes the methods used by researchers and engineers in the design and development of machine learning models. Characteristics such as the power consumption of chips also define where and how machine learning can be used in the real world. Despite this, many analyses of the social impact of the current wave of progress in AI have not substantively brought the dimension of hardware into their accounts. While a common trope in both the popular press and scholarly literature is to highlight the massive increase in computational power that has enabled the recent breakthroughs in machine learning, the analysis frequently goes no further than this observation around magnitude. This paper aims to dig more deeply into the relationship between computational power and the development of machine learning. Specifically, it examines how changes in computing architectures, machine learning methodologies, and supply chains might influence the future of AI. In doing so, it seeks to trace a set of specific relationships between this underlying hardware layer and the broader social impacts and risks around AI.

Datasheets for Datasets

Currently there is no standard way to identify how a dataset was created, and what characteristics, motivations, and potential skews it represents. To begin to address this issue, we propose the concept of a datasheet for datasets, a short document to accompany public datasets, commercial APIs, and pretrained models. The goal of this proposal is to enable better communication between dataset creators and users, and help the AI community move toward greater transparency and accountability. By analogy, in computer hardware, it has become industry standard to accompany everything from the simplest components (e.g., resistors), to the most complex microprocessor chips, with datasheets detailing standard operating characteristics, test results, recommended usage, and other information. We outline some of the questions a datasheet for datasets should answer. These questions focus on when, where, and how the training data was gathered, its recommended use cases, and, in the case of human-centric datasets, information regarding the subjects’ demographics and consent as applicable. We develop prototypes of datasheets for two well-known datasets: Labeled Faces in the Wild and the Pang & Lee Polarity Dataset.

Handling Adversarial Concept Drift in Streaming Data

Classifiers operating in a dynamic, real-world environment are vulnerable to adversarial activity, which causes the data distribution to change over time. These changes are traditionally referred to as concept drift, and several approaches have been developed in the literature to deal with the problem of drift handling and detection. However, most concept drift handling techniques approach it as a domain-independent task, to make them applicable to a wide gamut of reactive systems. These techniques were developed from an adversary-agnostic perspective: they naively assume that drift is a benign change, which can be fixed by updating the model. However, this is not the case when an active adversary is trying to evade the deployed classification system. In such an environment, the properties of concept drift are unique, as the drift is intended to degrade the system and at the same time designed to avoid detection by traditional concept drift detection techniques. This special category of drift is termed adversarial drift, and this paper analyzes its characteristics and impact in a streaming environment. A novel framework for dealing with adversarial concept drift is proposed, called the Predict-Detect streaming framework. Experimental evaluation of the framework on generated adversarial drifting data streams demonstrates that it provides a reliable unsupervised indication of drift and recovers from drifts swiftly. While traditional partially labeled concept drift detection methodologies fail to detect adversarial drifts, the proposed framework is able to detect such drifts and operates with <6% labeled data, on average. Also, the framework provides benefits for active learning over imbalanced data streams, by innately providing for feature space honeypots, where minority class adversarial samples may be captured.

Clipping free attacks against artificial neural networks

In recent years, a remarkable breakthrough has been made in the AI domain thanks to artificial deep neural networks, which have achieved great success in many machine learning tasks in computer vision, natural language processing, speech recognition, malware detection and so on. However, they are highly vulnerable to easily crafted adversarial examples. Many investigations have pointed out this fact, and different approaches have been proposed to generate attacks while adding a limited perturbation to the original data. The most robust known method so far is the so-called C&W attack [1]. Nonetheless, a countermeasure known as feature squeezing coupled with ensemble defense showed that most of these attacks can be destroyed [6]. In this paper, we present a new method we call Centered Initial Attack (CIA) whose advantage is twofold: first, it ensures by construction that the maximum perturbation is smaller than a threshold fixed beforehand, without the clipping process that degrades the quality of attacks. Second, it is robust against recently introduced defenses such as feature squeezing, JPEG encoding and even a voting ensemble of defenses. While its application is not limited to images, we illustrate this using five of the current best classifiers on the ImageNet dataset, two of which are adversarially retrained on purpose to be robust against attacks. With a fixed maximum perturbation of only 1.5% on any pixel, around 80% of (targeted) attacks fool the voting ensemble defense, and nearly 100% do so when the perturbation is only 6%. While this shows how difficult it is to defend against CIA attacks, the last section of the paper gives some guidelines to limit their impact.

Calibrated Prediction Intervals for Neural Network Regressors

Ongoing developments in neural network models are continually advancing the state-of-the-art in terms of system accuracy. However, the predicted labels should not be regarded as the only core output; also important is a well calibrated estimate of the prediction uncertainty. Such estimates and their calibration are critical in relation to robust handling of out-of-distribution events not observed in training data. Despite their obvious aforementioned advantage in relation to accuracy, contemporary neural networks can, generally, be regarded as poorly calibrated and as such do not produce reliable output probability estimates. Further, while post-processing calibration solutions can be found in the relevant literature, these tend to be for systems performing classification. In this regard, we herein present a method for acquiring calibrated prediction intervals for neural network regressors by posing the regression task as a multi-class classification problem and applying one of three proposed calibration methods on the classifiers’ output. Testing our method on two exemplar tasks – speaker age prediction and signal-to-noise ratio estimation – indicates both the suitability of the classification-based regression models and that post-processing by our proposed empirical calibration or temperature scaling methods yields well calibrated prediction intervals. The code for computing calibrated prediction intervals is publicly available.
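
To make the classification-based formulation concrete, the following sketch shows how binned class probabilities can be turned into a prediction interval, with temperature scaling applied to the logits first. It is an illustration under assumptions, not the authors' released code: the bin edges, the `coverage` parameter and the function names are hypothetical, and the temperature is taken as already fitted on held-out data.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a (pre-fitted, assumed) temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def prediction_interval(probs, bin_edges, coverage=0.9):
    """Central interval read off the cumulative distribution over bins.

    probs[i] is the probability that the target lies in
    [bin_edges[i], bin_edges[i + 1]); len(bin_edges) == len(probs) + 1.
    """
    lower_q = (1.0 - coverage) / 2.0
    upper_q = 1.0 - lower_q
    cumulative = 0.0
    lower = None
    upper = bin_edges[-1]  # fallback if rounding keeps us just below upper_q
    for i, p in enumerate(probs):
        cumulative += p
        if lower is None and cumulative >= lower_q:
            lower = bin_edges[i]
        if cumulative >= upper_q:
            upper = bin_edges[i + 1]
            break
    return lower, upper
```

A temperature above 1 flattens the distribution and therefore widens the interval, which is the direction of correction needed when a network is overconfident.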

Algorithm Configuration: Learning policies for the quick termination of poor performers

One way to speed up the algorithm configuration task is to use short runs instead of long runs as much as possible, but without discarding the configurations that eventually do well on the long runs. We consider the problem of selecting the top performing configurations of the Conditional Markov Chain Search (CMCS), a general algorithm schema that includes, for example, VNS. We investigate how the structure of performance on short tests links with that on long tests, showing that significant differences arise between test domains. We propose a ‘performance envelope’ method to exploit the links; it learns when runs should be terminated, and automatically adapts to the domain.

DRACO: Robust Distributed Training via Redundant Gradients

Distributed model training is vulnerable to worst-case system failures and adversarial compute nodes, i.e., nodes that use malicious updates to corrupt the global model stored at a parameter server (PS). To tolerate node failures and adversarial attacks, recent work suggests using variants of the geometric median to aggregate distributed updates at the PS, in place of bulk averaging. Although median-based update rules are robust to adversarial nodes, their computational cost can be prohibitive in large-scale settings and their convergence guarantees often require relatively strong assumptions. In this work, we present DRACO, a scalable framework for robust distributed training that uses ideas from coding theory. In DRACO, each compute node evaluates redundant gradients that are then used by the parameter server to eliminate the effects of adversarial updates. We present problem-independent robustness guarantees for DRACO and show that the model it produces is identical to the one trained in the adversary-free setup. We provide extensive experiments on real datasets and distributed setups across a variety of large-scale models, where we show that DRACO is several times to orders of magnitude faster than median-based approaches.
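
The idea of using redundant gradients to cancel adversarial updates can be sketched with the simplest possible code, a repetition scheme decoded by majority vote. This is an assumption made for illustration: the abstract does not specify the encoding, and the names `majority_vote` and `robust_aggregate` are hypothetical.

```python
from collections import Counter


def majority_vote(replicas):
    """Return the gradient reported by a strict majority of replicas.

    Assumption: honest nodes compute bit-identical gradients for the same
    batch, so exact equality is a valid test of agreement.
    """
    counts = Counter(tuple(r) for r in replicas)
    value, votes = counts.most_common(1)[0]
    if votes * 2 <= len(replicas):
        raise ValueError("no strict majority; too many corrupted replicas")
    return list(value)


def robust_aggregate(groups):
    """Average the majority-decoded gradients.

    groups[g] holds the gradients reported by all nodes that were assigned
    batch g. With 2s + 1 replicas per batch, up to s adversarial nodes per
    group are tolerated.
    """
    decoded = [majority_vote(group) for group in groups]
    dim = len(decoded[0])
    return [sum(grad[i] for grad in decoded) / len(decoded) for i in range(dim)]
```

Because the decoded gradients are exactly the honest ones, the averaged update matches what an adversary-free run would produce, which is the property the abstract highlights.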

Efficient parametrization of multi-domain deep neural networks

A practical limitation of deep neural networks is their high degree of specialization to a single task and visual domain. Recently, inspired by the successes of transfer learning, several authors have proposed to learn instead universal, fixed feature extractors that, used as the first stage of any deep network, work well for several tasks and domains simultaneously. Nevertheless, such universal features are still somewhat inferior to specialized networks. To overcome this limitation, in this paper we propose to consider instead universal parametric families of neural networks, which still contain specialized problem-specific models, but differing only by a small number of parameters. We study different designs for such parametrizations, including series and parallel residual adapters, joint adapter compression, and parameter allocations, and empirically identify the ones that yield the highest compression. We show that, in order to maximize performance, it is necessary to adapt both shallow and deep layers of a deep network, but the required changes are very small. We also show that these universal parametrizations are very effective for transfer learning, where they outperform traditional fine-tuning techniques.

A Provably Correct Algorithm for Deep Learning that Actually Works

We describe a layer-by-layer algorithm for training deep convolutional networks, where each step involves gradient updates for a two layer network followed by a simple clustering algorithm. Our algorithm stems from a deep generative model that generates images level by level, where lower resolution images correspond to latent semantic classes. We analyze the convergence rate of our algorithm assuming that the data is indeed generated according to this model (as well as additional assumptions). While we do not pretend to claim that the assumptions are realistic for natural images, we do believe that they capture some true properties of real data. Furthermore, we show that our algorithm actually works in practice (on the CIFAR dataset), achieving results in the same ballpark as those of vanilla convolutional neural networks trained by stochastic gradient descent. Finally, our proof techniques may be of independent interest.

Collaborative Filtering with Topic and Social Latent Factors Incorporating Implicit Feedback

Recommender systems (RSs) provide an effective way of alleviating the information overload problem by selecting personalized items for different users. Latent factor based collaborative filtering (CF) has become a popular approach for RSs due to its accuracy and scalability. Recently, online social networks and user-generated content have provided diverse sources for recommendation beyond ratings. Although social matrix factorization (Social MF) and topic matrix factorization (Topic MF) successfully exploit social relations and item reviews, respectively, both of them ignore some useful information. In this paper, we investigate effective data fusion by combining the aforementioned approaches. First, we propose a novel model, MR3, to jointly model three sources of information (i.e., ratings, item reviews, and social relations) effectively for rating prediction by aligning the latent factors and hidden topics. Second, we incorporate the implicit feedback from ratings into the proposed model to enhance its capability and to demonstrate its flexibility. We achieve more accurate rating prediction on real-life datasets over various state-of-the-art methods. Furthermore, we measure the contribution from each of the three data sources and the impact of implicit feedback from ratings, followed by the sensitivity analysis of hyperparameters. Empirical studies demonstrate the effectiveness and efficacy of our proposed model and its extension.

Why Comparing Single Performance Scores Does Not Allow to Draw Conclusions About Machine Learning Approaches

Developing state-of-the-art approaches for specific tasks is a major driving force in our research community. Depending on the prestige of the task, publishing such an approach can bring a lot of visibility. The question arises: how reliable are our evaluation methodologies for comparing approaches? One common methodology to identify the state-of-the-art is to partition data into a train, a development and a test set. Researchers can train and tune their approach on some part of the dataset and then select the model that worked best on the development set for a final evaluation on unseen test data. Test scores from different approaches are compared, and performance differences are tested for statistical significance. In this publication, we show that there is a high risk that a statistical significance in this type of evaluation is not due to a superior learning approach. Instead, there is a high risk that the difference is due to chance. For example, for the CoNLL 2003 NER dataset we observed type I errors (false positives) in up to 26% of the cases with a threshold of p < 0.05, i.e., falsely concluding a statistically significant difference between two identical approaches. We prove that this evaluation setup is unsuitable to compare learning approaches. We formalize alternative evaluation setups based on score distributions.

Evaluation of Session-based Recommendation Algorithms

Recommender systems help users find relevant items of interest, for example on e-commerce or media streaming sites. Most academic research is concerned with approaches that personalize the recommendations according to long-term user profiles. In many real-world applications, however, such long-term profiles often do not exist and recommendations therefore have to be made solely based on the observed behavior of a user during an ongoing session. Given the high practical relevance of the problem, an increased interest in it can be observed in recent years, leading to a number of proposals for session-based recommendation algorithms that typically aim to predict the user’s immediate next actions. In this work, we present the results of an in-depth performance comparison of a number of such algorithms, using a variety of datasets and evaluation measures. Our comparison includes the most recent approaches based on recurrent neural networks like GRU4REC, factorized Markov model approaches such as FISM or Fossil, as well as simpler methods based, e.g., on nearest neighbor schemes. Our experiments reveal that algorithms of this latter class, despite their sometimes almost trivial nature, often perform equally well or significantly better than today’s more complex approaches based on deep neural networks. Our results therefore suggest that there is substantial room for improvement regarding the development of more sophisticated session-based recommendation algorithms.
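
As an illustration of how simple a nearest-neighbor scheme for session-based recommendation can be, the sketch below scores candidate items by the similarity of the past sessions that contain them. The Jaccard similarity, the tie-breaking rules and the names `recommend`, `k` and `top_n` are assumptions for this example and are not taken from the paper.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity between two item sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(current_session, past_sessions, k=2, top_n=3):
    """Recommend items from the k past sessions most similar to the current one."""
    current = set(current_session)
    # Rank past sessions by similarity; ties are broken by session index.
    ranked_sessions = sorted(
        ((jaccard(current, set(s)), idx) for idx, s in enumerate(past_sessions)),
        key=lambda pair: (-pair[0], pair[1]),
    )
    scores = defaultdict(float)
    for similarity, idx in ranked_sessions[:k]:
        if similarity == 0.0:
            continue  # sessions with no overlap contribute nothing
        for item in set(past_sessions[idx]) - current:
            scores[item] += similarity
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]
```

No training is involved: the method only needs the stored sessions and a similarity function, which is what makes its competitiveness with neural approaches notable.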

BAGAN: Data Augmentation with Balancing GAN

Image classification datasets are often imbalanced, a characteristic that negatively affects the accuracy of deep learning classifiers. In this work we propose balancing GANs (BAGANs) as an augmentation tool to restore balance in imbalanced datasets. This is challenging because the few minority-class images may not be enough to train a GAN. We overcome this issue by including during training all available images of majority and minority classes. The generative model learns useful features from majority classes and uses these to generate images for minority classes. We apply class-conditioning in the latent space to drive the generation process towards a target class. Additionally, we couple GANs with autoencoding techniques to reduce the risk of collapsing toward the generation of few foolish examples. We compare the proposed methodology with state-of-the-art GANs and demonstrate that BAGAN generates images of superior quality when trained with an imbalanced dataset.

Predicting the Future with Transformational States

An intelligent observer looks at the world and sees not only what is, but what is moving and what can be moved. In other words, the observer sees how the present state of the world can transform in the future. We propose a model that predicts future images by learning to represent the present state and its transformation given only a sequence of images. To do so, we introduce an architecture with a latent state composed of two components designed to capture (i) the present image state and (ii) the transformation between present and future states, respectively. We couple this latent state with a recurrent neural network (RNN) core that predicts future frames by transforming past states into future states by applying the accumulated state transformation with a learned operator. We describe how this model can be integrated into an encoder-decoder convolutional neural network (CNN) architecture that uses weighted residual connections to integrate representations of the past with representations of the future. Qualitatively, our approach generates image sequences that are stable and capture realistic motion over multiple predicted frames, without requiring adversarial training. Quantitatively, our method achieves prediction results comparable to state-of-the-art results on standard image prediction benchmarks (Moving MNIST, KTH, and UCF101).

Transferable Joint Attribute-Identity Deep Learning for Unsupervised Person Re-Identification

Most existing person re-identification (re-id) methods require supervised model learning from a separate large set of pairwise labelled training data for every single camera pair. This significantly limits their scalability and usability in real-world large scale deployments with the need for performing re-id across many camera views. To address this scalability problem, we develop a novel deep learning method for transferring the labelled information of an existing dataset to a new unseen (unlabelled) target domain for person re-id without any supervised learning in the target domain. Specifically, we introduce a Transferable Joint Attribute-Identity Deep Learning (TJ-AIDL) method for simultaneously learning an attribute-semantic and identity-discriminative feature representation space transferable to any new (unseen) target domain for re-id tasks without the need for collecting new labelled training data from the target domain (i.e. unsupervised learning in the target domain). Extensive comparative evaluations validate the superiority of this new TJ-AIDL model for unsupervised person re-id over a wide range of state-of-the-art methods on four challenging benchmarks including VIPeR, PRID, Market-1501, and DukeMTMC-ReID.

Diagonalwise Refactorization: An Efficient Training Method for Depthwise Convolutions

Depthwise convolutions provide significant performance benefits owing to the reduction in both parameters and mult-adds. However, training depthwise convolution layers with GPUs is slow in current deep learning frameworks because their implementations cannot fully utilize the GPU capacity. To address this problem, in this paper we present an efficient method (called diagonalwise refactorization) for accelerating the training of depthwise convolution layers. Our key idea is to rearrange the weight vectors of a depthwise convolution into a large diagonal weight matrix so as to convert the depthwise convolution into one single standard convolution, which is well supported by the cuDNN library that is highly optimized for GPU computations. We have implemented our training method in five popular deep learning frameworks. Evaluation results show that our proposed method gains 15.4× training speedup on Darknet, 8.4× on Caffe, 5.4× on PyTorch, 3.5× on MXNet, and 1.4× on TensorFlow, compared to their original implementations of depthwise convolutions.
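
The key idea, that a depthwise convolution equals a standard convolution whose weight tensor is zero everywhere except on the channel diagonal, can be checked on a toy 1D case. This is a sketch for intuition only, written in plain Python with hypothetical function names; it says nothing about how the method is laid out on the GPU.

```python
def depthwise_conv1d(x, kernels):
    """Valid 1D depthwise convolution: channel c is filtered only by kernels[c]."""
    out = []
    for channel, kernel in zip(x, kernels):
        k = len(kernel)
        out.append([
            sum(channel[t + j] * kernel[j] for j in range(k))
            for t in range(len(channel) - k + 1)
        ])
    return out


def standard_conv1d(x, weight):
    """Valid 1D standard convolution; weight[o][i] maps input channel i to output o."""
    k = len(weight[0][0])
    length = len(x[0]) - k + 1
    return [
        [
            sum(x[i][t + j] * weight[o][i][j]
                for i in range(len(x)) for j in range(k))
            for t in range(length)
        ]
        for o in range(len(weight))
    ]


def diagonalize(kernels):
    """Place each depthwise kernel on the channel diagonal, zeros elsewhere."""
    channels, k = len(kernels), len(kernels[0])
    return [
        [list(kernels[o]) if i == o else [0.0] * k for i in range(channels)]
        for o in range(channels)
    ]
```

The diagonal form does redundant multiplications by zero, so any speedup comes from mapping onto a heavily optimized standard-convolution routine, not from doing less arithmetic.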

Tensor graph convolutional neural network

In this paper, we propose a novel tensor graph convolutional neural network (TGCNN) to conduct convolution on factorizable graphs, focusing on two types of problems: sequential dynamic graphs and cross-attribute graphs. In particular, we propose a graph preserving layer to memorize salient nodes of the factorized subgraphs, i.e. cross graph convolution and graph pooling. For cross graph convolution, a parameterized Kronecker sum operation is proposed to generate a conjunctive adjacency matrix characterizing the relationship between every pair of nodes across two subgraphs. With this operation, general graph convolution may be performed efficiently, followed by the composition of small matrices, which reduces the high memory and computational burden. By encapsulating sequence graphs in a recursive learning scheme, the dynamics of graphs can be efficiently encoded along with their spatial layout. To validate the proposed TGCNN, experiments are conducted on skeleton action datasets as well as a matrix completion dataset. The experimental results demonstrate that our method achieves performance competitive with state-of-the-art methods.
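
For readers unfamiliar with the Kronecker sum, the sketch below computes the plain (unparameterized) version for two small adjacency matrices, using the standard definition A ⊕ B = A ⊗ I + I ⊗ B. The parameterization used in the paper is not reproduced here, and the helper names are hypothetical.

```python
def identity(n):
    """n x n identity matrix as nested lists."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [
        [a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
        for i in range(len(a))
        for k in range(len(b))
    ]


def kronecker_sum(a, b):
    """A (+) B = A (x) I_m + I_n (x) B for square A (n x n) and B (m x m)."""
    left = kron(a, identity(len(b)))
    right = kron(identity(len(a)), b)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(left, right)]
```

For two 2-node path graphs the result is the adjacency matrix of a 4-cycle, i.e. node (i, k) is linked to (j, k) when i and j are adjacent in the first graph and to (i, l) when k and l are adjacent in the second.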

DeepJDOT: Deep Joint distribution optimal transport for unsupervised domain adaptation

In computer vision, one is often confronted with problems of domain shifts, which occur when one applies a classifier trained on a source dataset to target data sharing similar characteristics (e.g. same classes), but also different latent data structures (e.g. different acquisition conditions). In such a situation, the model will perform poorly on the new data, since the classifier is specialized to recognize visual cues specific to the source domain. In this work we explore a solution, named DeepJDOT, to tackle this problem: through a measure of discrepancy on joint deep representations/labels based on optimal transport, we not only learn new data representations aligned between the source and target domain, but also simultaneously preserve the discriminative information used by the classifier. We applied DeepJDOT to a series of visual recognition tasks, where it compares favorably against state-of-the-art deep domain adaptation methods.

Bayesian Gradient Descent: Online Variational Bayes Learning with Increased Robustness to Catastrophic Forgetting and Weight Pruning

We suggest a novel approach for the estimation of the posterior distribution of the weights of a neural network, using an online version of the variational Bayes method. Having a confidence measure of the weights allows us to combat several shortcomings of neural networks, such as their parameter redundancy, and their notorious vulnerability to the change of input distribution (‘catastrophic forgetting’). Specifically, we show that this approach helps alleviate the catastrophic forgetting phenomenon – even without the knowledge of when the tasks are switched. Furthermore, it improves the robustness of the network to weight pruning – even without re-training.

Stein Points

An important task in computational statistics and machine learning is to approximate a posterior distribution p(x) with an empirical measure supported on a set of representative points \{x_i\}_{i=1}^n. This paper focuses on methods where the selection of points is essentially deterministic, with an emphasis on achieving accurate approximation when n is small. To this end, we present ‘Stein Points’. The idea is to exploit either a greedy or a conditional gradient method to iteratively minimise a kernel Stein discrepancy between the empirical measure and p(x). Our empirical results demonstrate that Stein Points enable accurate approximation of the posterior at modest computational cost. In addition, theoretical results are provided to establish convergence of the method.

Wavelet spectral testing: application to nonstationary circadian rhythms
Cluster analysis of stocks using price movements of high frequency data from National Stock Exchange
A Low-Resolution ADC Module Assisted Hybrid Beamforming Architecture for mmWave Communications
On the Approximation Ratio of Greedy Parsings
Fréchet ChemblNet Distance: A metric for generative models for molecules
Self-Attentional Acoustic Models
Universal Compressed Text Indexing
On the Tits cone of a Weyl groupoid
Ordinary lines in space
On the utility of Metropolis-Hastings with asymmetric acceptance ratio
Deep Representation for Patient Visits from Electronic Health Records
Connectionist Recommendation in the Wild
Revisiting First-Order Convex Optimization Over Linear Spaces
A General Path-Based Representation for Predicting Program Properties
Local verification of global proofs
A Linear Independence Theorem involving Complete Graphs and Polynomials
SIG-DB: leveraging homomorphic encryption to Securely Interrogate privately held Genomic DataBases
On Regularized Losses for Weakly-supervised CNN Segmentation
Colouring set families without monochromatic k-chains
Long short-term memory and Learning-to-learn in networks of spiking neurons
Dushnik-Miller dimension of TD-Delaunay complexes
Discrete Morse theory for the collapsibility of supremum sections
Schramm-Loewner evolution with Lie superalgebra symmetry
Finite horizon risk-sensitive continuous-time Markov decision processes with unbounded transition and cost rates
Efficient Image Dataset Classification Difficulty Estimation for Predicting Deep-Learning Accuracy
Rule-based Autoregressive Moving Average Models for Forecasting Load on Special Days: A Case Study for France
Similarity based hierarchical clustering of physiological parameters for the identification of health states – a feasibility study
One-Shot Segmentation in Clutter
Threshold Progressions in a Variety of Covering and Packing Contexts
Comparison of angular spread for 6 and 60 GHz based on 3GPP standard
Evaluation of angular dispersion for various propagation environments in emerging 5G systems
Path loss model modification for various gains and directions of antennas
Modeling power angle spectrum and antenna pattern directions in multipath propagation environment
Correlation properties of signal at mobile receiver for different propagation environments
Backbone decomposition of multitype superprocesses
Learning the Multiple Traveling Salesmen Problem with Permutation Invariant Pooling Networks
BER measurements in the evaluation of operation correctness of VSAT modem traffic interfaces
A Tutte-like polynomial for rooted trees and specific posets
Metric Learning with Dynamically Generated Pairwise Constraints for Ear Recognition
Sparse Recovery over Graph Incidence Matrices: Polynomial Time Guarantees and Location Dependent Performance
On the Limitation of Local Intrinsic Dimensionality for Characterizing the Subspaces of Adversarial Examples
On the multipacking number of grid graphs
Unsupervised Separation of Transliterable and Native Words for Malayalam
A multilayer backpropagation saliency detection algorithm and its applications
On negative association of some finite point processes on general state spaces, extended version
A quantitative fourth moment theorem in free probability theory
On the Intrinsic Dimensionality of Face Representation
Extra Space during Initialization of Succinct Data Structures and of Dynamical Initializable Arrays
On shrinking horizon move-blocking predictive control
I/O Logic in HOL — First Steps
Systoles, Special Lagrangians, and Bridgeland stability conditions
Strict monotonicity of $p_c$ under covering maps
Flow From Motion: A Deep Learning Approach
Design optimisation and post-trial analysis in group sequential stepped-wedge cluster randomised trials
The $\{1,s\}$-weighted Davenport constant in $\mathbb Z_n$ and an application in an inverse problem
Robust principal components for irregularly spaced longitudinal data
On the Runtime Analysis of the Clearing Diversity-Preserving Mechanism
Random Nodal Lengths and Wiener Chaos
Parameterized Intractability of Even Set and Shortest Vector Problem from Gap-ETH
On the Importance of Stereo for Accurate Depth Estimation: An Efficient Semi-Supervised Deep Neural Network Approach
CliCR: A Dataset of Clinical Case Reports for Machine Reading Comprehension
3D Human Pose Estimation in the Wild by Adversarial Learning
Resilient Active Information Gathering with Mobile Robots
Convolutional Attribute Embedding and Cross-Domain Representations for Domain Transfer Learning
A Scalable Empirical Bayes Approach to Variable Selection in Generalized Linear Models
DJAM: distributed Jacobi asynchronous method for learning personal models
English verb regularization in books and tweets
Non-power-law universality in one-dimensional quasicrystals
Runtime Analysis of Probabilistic Crowding and Restricted Tournament Selection for Bimodal Optimisation
Bridging Many-Body Quantum Physics and Deep Learning via Tensor Networks
Enhancing confidence in the detection of gravitational waves from compact binaries via Bayesian model comparison
Highly entangled tensors
On Chatbots Exhibiting Goal-Directed Autonomy in Dynamic Environments
A Common Framework for Natural Gradient and Taylor based Optimisation using Manifold Theory
Min-Max Tours for Task Allocation to Heterogeneous Agents
Women also Snowboard: Overcoming Bias in Captioning Models
Demystifying Core Ranking in Pinterest Image Search
Polynomial graph invariants and the KP hierarchy
On approximations for the distribution of the time of first level crossing
Generating Talking Face Landmarks from Speech
Deep learning as a tool for neural data analysis: speech classification and cross-frequency coupling in human sensorimotor cortex
Spectral feature mapping with mimic loss for robust speech recognition
A disciplined approach to neural network hyper-parameters: Part 1 — learning rate, batch size, momentum, and weight decay
Low-Shot Learning for the Semantic Segmentation of Remote Sensing Imagery
Cox Regression Model Under Dependent Truncation
Heat Kernel analysis of Syntactic Structures
Kauffman cellular automata on quasicrystal topology
Empirical Analysis of Foundational Distinctions in the Web of Data
Neural Baby Talk
Adaptive nonparametric estimation for compound Poisson processes robust to the discrete-observation scheme
Weakly nonlinear analysis for car-following model with consideration of cooperation and time delays
Attributes as Operators
Generative Design in Minecraft (GDMC), Settlement Generation Competition
WebSeg: Learning Semantic Segmentation from Web Searches
Three Birds One Stone: A Unified Framework for Salient Object Segmentation, Edge Detection and Skeleton Extraction
A Decision Tree Approach to Predicting Recidivism in Domestic Violence
Accelerating Empowerment Computation with UCT Tree Search
Towards Human-Machine Cooperation: Self-supervised Sample Mining for Object Detection
Bypassing Feature Squeezing by Increasing Adversary Strength
Rate-distortion functions of non-stationary Markoff chains and their block-independent approximations
A Web Scraping Methodology for Bypassing Twitter API Restrictions
JSweep: A Patch-centric Data-driven Approach for Parallel Sweeps on Large-scale Meshes
k-ary Spanning Trees Contained in Tournaments
Diversity Regularized Spatiotemporal Attention for Video-based Person Re-identification
MLE-induced Likelihood for Markov Random Fields
Cloud-based MPC with Encrypted Data
Multi-Scale Structure-Aware Network for Human Pose Estimation
On Fairness of Systemic Risk Measures
On Difference of SOS Decompositions and Difference of SOS Convex Decompositions for Polynomials
Mittens: An Extension of GloVe for Learning Domain-Specialized Representations
Compassionately Conservative Balanced Cuts for Image Segmentation
Free hyperplane arrangements over arbitrary fields
A Divide-and-Conquer Approach to Compressed Sensing MRI
On the successive passage times of certain one-dimensional diffusions
Network Science approach to Modelling Emergence and Topological Robustness of Supply Networks: A Review and Perspective
Reconfigurable Antenna Multiple Access for 5G mmWave Systems
A statistical mechanics approach to de-biasing and uncertainty estimation in LASSO for random measurements
Entropy Controlled Non-Stationarity for Improving Performance of Independent Learners in Anonymous MARL Settings
Worst-Case Optimal Join Algorithms: Techniques, Results, and Open Problems
Image Semantic Transformation: Faster, Lighter and Stronger
Image-based deep learning for classification of noise transients in gravitational wave detectors
Dual Attention Matching Network for Context-Aware Feature Sequence based Person Re-Identification
Directional Modulation: A Secure Solution to 5G and Beyond Mobile Networks
Weakly Consistent Extensions of Lower Previsions
Iteration-complexity of first-order augmented Lagrangian methods for convex conic programming
Complex-Valued Restricted Boltzmann Machine for Direct Speech Parameterization from Complex Spectra
Periodic Fourier representation of boolean functions
A Faster FPTAS for the Subset-Sums Ratio Problem
Exact eigenvalue assignment of linear scalar systems with single delay using Lambert W function
Quantum speedup in stoquastic adiabatic quantum computation
Learning Synergies between Pushing and Grasping with Self-supervised Deep Reinforcement Learning
An Efficient Method to Transform SAT problems to Binary Integer Linear Programming Problem
Classification of external Zonotopal algebras
Reinforcement Learning for Fair Dynamic Pricing
A Probit Network Model with Arbitrary Dependence
Inferring network connectivity from event timing patterns
Physical foundations of biological complexity
Hiding in the Crowd: A Massively Distributed Algorithm for Private Averaging with Malicious Adversaries
New contributions to the study of stochastic processes of the class $(Σ)$
Minimal Linear Codes over Finite Fields
Applications of Artificial Intelligence to Network Security
Modelling and simulating Lenski’s long-term evolution experiment
Local densities for a class of degenerate diffusions
Cross-validation in high-dimensional spaces: a lifeline for least-squares models and multi-class LDA
On Dispersable Book Embeddings
Approximate Bayesian Computation for Finite Mixture Models
Recent Developments from Attribute Profiles for Remote Sensing Image Classification
Learning Depth from Single Images with Deep Neural Network Embedding Focal Length
A Game-Theoretic Approach to Information-Flow Control via Protocol Composition
A joint model for multiple dynamic processes and clinical endpoints: application to Alzheimer’s disease
Kinetic Compressive Sensing
Fast Parametric Learning with Activation Memorization
Congruences for Apéry-like numbers
A New Argument for p<0.005
On the critical exponents of the yielding transition of amorphous solids
A Framework for Evaluating 6-DOF Object Trackers
A sequent calculus for a semi-associative law
Emergence of Cooperation in the thermodynamic limit
Dicke Phase Transition in a Disordered Emitter-Graphene Plasmon System
Point Convolutional Neural Networks by Extension Operators
A New Target-specific Object Proposal Generation Method for Visual Tracking
Random Polyhedral Scenes: An Image Generator for Active Vision System Experiments
A Fast Face Detection Method via Convolutional Neural Network
Event-based Dynamic Face Detection and Tracking Based on Activity
Gradient Algorithms for Complex Non-Gaussian Independent Component/Vector Extraction
Building state-of-the-art distant speech recognition using the CHiME-4 challenge with a setup of speech enhancement baseline
Learning distributions of shape trajectories from longitudinal datasets: a hierarchical model on a manifold of diffeomorphisms
Quantifying the weight of fingerprint evidence using an ROC-based Approximate Bayesian Computation algorithm
World Models
Blinded and unblinded sample size re-estimation in crossover trials balanced for period
The algebra of predicting agents
Investigating Generative Adversarial Networks based Speech Dereverberation for Robust Speech Recognition
You are your Metadata: Identification and Obfuscation of Social Media Users using Metadata Information
Anderson transition for elastic waves in three dimensions
Comprehending Real Numbers: Development of Bengali Real Number Speech Corpus
New concavity and convexity results for symmetric polynomials and their ratios
Empirical Evaluation of Speaker Adaptation on DNN based Acoustic Model
Learning to Branch
Learning Driving Models with a Surround-View Camera System and a Route Planner
A Ramsey theorem for biased graphs
Mean Reflected Stochastic Differential Equations with Jumps
Quasi-solution of linear inverse problems in non-reflexive Banach spaces
Distributed Adaptive Sampling for Kernel Matrix Approximation
Derivative-Free Optimization of Noisy Functions via Quasi-Newton Methods
Almost sure, L_1- and L_2-growth behavior of supercritical multi-type continuous state and continuous time branching processes with immigration
Generalized vector space partitions
An Optimal Algorithm for Computing the Visibility Area of a Polygon from a Point Using Constant-Memory
HDM-Net: Monocular Non-Rigid 3D Reconstruction with Learned Deformation Model
Theory of combustion in disordered media