SLAQ: Quality-Driven Scheduling for Distributed Machine Learning

Training machine learning (ML) models with large datasets can incur significant resource contention on shared clusters. This training typically involves many iterations that continually improve the quality of the model. Yet in exploratory settings, better models can be obtained faster by directing resources to jobs with the most potential for improvement. We describe SLAQ, a cluster scheduling system for approximate ML training jobs that aims to maximize the overall job quality. When allocating cluster resources, SLAQ explores the quality-runtime trade-offs across multiple jobs to maximize system-wide quality improvement. To do so, SLAQ leverages the iterative nature of ML training algorithms, by collecting quality and resource usage information from concurrent jobs, and then generating highly-tailored quality-improvement predictions for future iterations. Experiments show that SLAQ achieves an average quality improvement of up to 73% and an average delay reduction of up to 44% on a large set of ML training jobs, compared to resource fairness schedulers.
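
The allocation idea in this abstract can be illustrated with a minimal sketch. It is not SLAQ's actual algorithm, which the abstract does not specify: the `allocate_cores` function, the per-job predictor interface, and the toy loss curves are all hypothetical, and a simple greedy rule is assumed in which each core goes to the job with the largest predicted loss reduction.

```python
import heapq


def allocate_cores(predicted_loss, total_cores):
    """Greedily give each core to the job with the largest predicted gain.

    predicted_loss maps a job name to a function cores -> predicted loss
    after the next scheduling epoch (a hypothetical predictor interface).
    """
    allocation = {job: 0 for job in predicted_loss}
    heap = []  # entries are (-marginal_gain, job); heapq is a min-heap
    for job, predict in predicted_loss.items():
        heapq.heappush(heap, (-(predict(0) - predict(1)), job))
    for _ in range(total_cores):
        if not heap:
            break
        neg_gain, job = heapq.heappop(heap)
        if -neg_gain <= 0:
            break  # the best remaining gain is zero: stop handing out cores
        allocation[job] += 1
        n = allocation[job]
        predict = predicted_loss[job]
        heapq.heappush(heap, (-(predict(n) - predict(n + 1)), job))
    return allocation


# Toy loss curves (assumed): a young job that still improves quickly and
# a nearly converged job that barely benefits from extra cores.
jobs = {
    "young": lambda c: 10.0 / (1 + c),
    "converged": lambda c: 1.0 - 0.05 * min(c, 2),
}
allocation = allocate_cores(jobs, total_cores=4)
print(allocation)
```

A greedy rule like this is only guaranteed to maximize total predicted improvement when each job's gains diminish as it receives more cores; a fairness-based scheduler would instead split the cores evenly regardless of predicted gain.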

Learning Confidence for Out-of-Distribution Detection in Neural Networks

Modern neural networks are very powerful predictive models, but they are often incapable of recognizing when their predictions may be wrong. Closely related to this is the task of out-of-distribution detection, where a network must determine whether or not an input is outside of the set on which it is expected to safely perform. To jointly address these issues, we propose a method of learning confidence estimates for neural networks that is simple to implement and produces intuitively interpretable outputs. We demonstrate that on the task of out-of-distribution detection, our technique surpasses recently proposed techniques which construct confidence based on the network’s output distribution, without requiring any additional labels or access to out-of-distribution examples. Additionally, we address the problem of calibrating out-of-distribution detectors, where we demonstrate that misclassified in-distribution examples can be used as a proxy for out-of-distribution examples.

SimplE Embedding for Link Prediction in Knowledge Graphs

The aim of knowledge graphs is to gather knowledge about the world and provide a structured representation of this knowledge. Current knowledge graphs are far from complete. To address the incompleteness of the knowledge graphs, link prediction approaches have been developed which make probabilistic predictions about new links in a knowledge graph given the existing links. Tensor factorization approaches have proven promising for such link prediction problems. In this paper, we develop a simple tensor factorization model called SimplE, through a slight modification of the Polyadic Decomposition model from 1927. The complexity of SimplE grows linearly with the size of embeddings. The embeddings learned through SimplE are interpretable, and certain types of expert knowledge in terms of logical rules can be incorporated into these embeddings through weight tying. We prove SimplE is fully-expressive and derive a bound on the size of its embeddings for full expressivity. We show empirically that, despite its simplicity, SimplE outperforms several state-of-the-art tensor factorization techniques.
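
As a rough illustration of why the cost grows linearly with embedding size, here is a minimal sketch of a SimplE-style score. The abstract does not give the formula; the sketch assumes the score is the average of two CP-style trilinear products, with each entity holding a head vector `h` and a tail vector `t` and each relation holding a vector `v` and an inverse vector `v_inv`. All names and numbers are illustrative.

```python
def trilinear(a, b, c):
    """Sum of element-wise triple products: one pass over the embedding."""
    return sum(x * y * z for x, y, z in zip(a, b, c))


def simple_score(head, rel, tail):
    """Assumed SimplE-style score for the triple (head, rel, tail).

    The forward term uses the head entity's h vector and the tail entity's
    t vector; the backward term swaps the roles and uses the inverse
    relation vector. Cost is linear in the embedding dimension.
    """
    forward = trilinear(head["h"], rel["v"], tail["t"])
    backward = trilinear(tail["h"], rel["v_inv"], head["t"])
    return 0.5 * (forward + backward)


# Hypothetical two-dimensional embeddings.
alice = {"h": [1.0, 2.0], "t": [0.5, 0.0]}
bob = {"h": [0.0, 1.0], "t": [1.0, 1.0]}
likes = {"v": [1.0, 0.5], "v_inv": [2.0, 3.0]}

score = simple_score(alice, likes, bob)
print(score)
```

Tying `v` and `v_inv` to each other is one way the weight-tying mentioned in the abstract could encode a rule; for example, setting them equal would make the assumed score symmetric in head and tail.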

GILBO: One Metric to Measure Them All

We propose a simple, tractable lower bound on the mutual information contained in the joint generative density of any latent variable generative model: the GILBO (Generative Information Lower BOund). It offers a data independent measure of the complexity of the learned latent variable description, giving the log of the effective description length. It is well-defined for both VAEs and GANs. We compute the GILBO for 800 GANs and VAEs trained on MNIST and discuss the results.

Statistical Inference for Online Learning and Stochastic Approximation via Hierarchical Incremental Gradient Descent

Stochastic gradient descent (SGD) is an immensely popular approach for online learning in settings where data arrives in a stream or data sizes are very large. However, despite an ever-increasing volume of work on SGD, much less is known about the statistical inferential properties of SGD-based predictions. Taking a fully inferential viewpoint, this paper introduces a novel procedure termed HiGrad to conduct statistical inference for online learning, without incurring additional computational cost compared with SGD. The HiGrad procedure begins by performing SGD updates for a while and then splits the single thread into several threads, and this procedure hierarchically operates in this fashion along each thread. With predictions provided by multiple threads in place, a t-based confidence interval is constructed by decorrelating predictions using covariance structures given by the Ruppert–Polyak averaging scheme. Under certain regularity conditions, the HiGrad confidence interval is shown to attain asymptotically exact coverage probability. Finally, the performance of HiGrad is evaluated through extensive simulation studies and a real data example. An R package higrad has been developed to implement the method.

Uncertainty Estimation via Stochastic Batch Normalization

In this work, we investigate the Batch Normalization technique and propose a probabilistic interpretation of it. We propose a probabilistic model and show that Batch Normalization maximizes the lower bound of its marginalized log-likelihood. Then, according to the new probabilistic model, we design an algorithm which acts consistently during train and test. However, inference becomes computationally inefficient. To reduce memory and computational cost, we propose Stochastic Batch Normalization, an efficient approximation of the proper inference procedure. This method provides us with a scalable uncertainty estimation technique. We demonstrate the performance of Stochastic Batch Normalization on popular architectures (including deep convolutional architectures: VGG-like and ResNets) for MNIST and CIFAR-10 datasets.

Field-Programmable Deep Neural Network (DNN) Learning and Inference accelerator: a concept

An accelerator is a specialized integrated circuit designed to perform specific computations faster than a CPU or GPU could perform them. A Field-Programmable DNN learning and inference accelerator (FProg-DNN) using hybrid systolic and non-systolic techniques, distributed information-control, and a deep pipelined structure is proposed, and its microarchitecture and operation are presented here. Reconfigurability accommodates diverse DNN designs and allows different numbers of workers to be assigned to different layers as a function of the relative difference in computational load among layers. The computational delay per layer is made roughly the same along the pipelined accelerator structure. VGG-16 and the recently proposed Inception Modules are used to show the flexibility of the FProg-DNN reconfigurability. Special structures were also added for a combination of convolution layer, map coincidence, and feedback for state-of-the-art learning with a small set of examples, which is the focus of a companion paper by the author (Franca-Neto, 2018). The accelerator described is able to reconfigure from (1) allocating all of a DNN’s computations to a single worker, at one extreme of sub-optimal performance, to (2) optimally allocating workers per layer according to the computational load in each DNN layer to be realized. Due to the pipelined architecture, more than 50x speedup is achieved relative to GPUs or TPUs. This speedup is a consequence of hiding the delay in transporting activation outputs from one layer to the next in a DNN behind the computations in the receiving layer. This FProg-DNN concept has been simulated and validated at the behavioral-functional level.

Geometry-Based Data Generation

Many generative models attempt to replicate the density of their input data. However, this approach is often undesirable, since data density is highly affected by sampling biases, noise, and artifacts. We propose a method called SUGAR (Synthesis Using Geometrically Aligned Random-walks) that uses a diffusion process to learn a manifold geometry from the data. Then, it generates new points evenly along the manifold by pulling randomly generated points into its intrinsic structure using a diffusion kernel. SUGAR equalizes the density along the manifold by selectively generating points in sparse areas of the manifold. We demonstrate how the approach corrects sampling biases and artifacts, while also revealing intrinsic patterns (e.g. progression) and relations in the data. The method is applicable for correcting missing data, finding hypothetical data points, and learning relationships between data features.

Context-Specific Validation of Data-Driven Models

With an increasing use of data-driven models to control robotic systems, it has become important to develop a methodology for validating such models before they can be deployed to design a controller for the actual system. Specifically, it must be ensured that the controller designed for an abstract or learned model would perform as expected on the actual physical system. We propose a context-specific validation framework to quantify the quality of a learned model based on a distance metric between the closed-loop actual system and the learned model. We then propose an active sampling scheme to compute a probabilistic upper bound on this distance in a sample-efficient manner. The proposed framework validates the learned model against only those behaviors of the system that are relevant for the purpose for which we intend to use this model, and does not require any a priori knowledge of the system dynamics. Several simulations illustrate the practicality of the proposed framework for validating the models of real-world systems.

Graph2Seq: Scalable Learning Dynamics for Graphs

Neural networks have been shown to be an effective tool for learning algorithms over graph-structured data. However, graph representation techniques, which convert graphs to real-valued vectors for use with neural networks, are still in their infancy. Recent works have proposed several approaches (e.g., graph convolutional networks), but these methods have difficulty scaling and generalizing to graphs with different sizes and shapes. We present Graph2Seq, a new technique that represents graphs as an infinite time-series. By not limiting the representation to a fixed dimension, Graph2Seq scales naturally to graphs of arbitrary sizes and shapes. Graph2Seq is also reversible, allowing full recovery of the graph structure from the sequence. By analyzing a formal computational model for graph representation, we show that an unbounded sequence is necessary for scalability. Our experimental results with Graph2Seq show strong generalization and new state-of-the-art performance on a variety of graph combinatorial optimization problems.

D2KE: From Distance to Kernel and Embedding

For many machine learning problem settings, particularly with structured inputs such as sequences or sets of objects, a distance measure between inputs can be specified more naturally than a feature representation. However, most standard machine learning models are designed for inputs with a vector feature representation. In this work, we consider the estimation of a function $f:\mathcal{X} \rightarrow \mathbb{R}$ based solely on a dissimilarity measure $d:\mathcal{X}\times\mathcal{X} \rightarrow \mathbb{R}$ between inputs. In particular, we propose a general framework to derive a family of positive definite kernels from a given dissimilarity measure, which subsumes the widely-used representative-set method as a special case, and relates to the well-known distance substitution kernel in a limiting case. We show that functions in the corresponding Reproducing Kernel Hilbert Space (RKHS) are Lipschitz-continuous w.r.t. the given distance metric. We provide a tractable algorithm to estimate a function from this RKHS, and show that it enjoys better generalizability than Nearest-Neighbor estimates. Our approach draws from the literature of Random Features, but instead of deriving feature maps from an existing kernel, we construct novel kernels from a random feature map that we specify given the distance measure. We conduct classification experiments with such disparate domains as strings, time series, and sets of vectors, where our proposed framework compares favorably to existing distance-based learning methods such as k-nearest-neighbors, distance-substitution kernels, pseudo-Euclidean embedding, and the representative-set method.
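
To make the "kernel from a random feature map" idea concrete, here is a minimal sketch for strings under edit distance. The abstract does not state the feature map, so the exponential form `exp(-gamma * d(x, w))`, the anchor strings, and the value of `gamma` are illustrative assumptions. The point is only that an explicit feature map yields a positive semi-definite kernel by construction, since the kernel is an inner product of features.

```python
import math


def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def feature_map(x, anchors, distance, gamma=1.0):
    """Assumed feature map: one coordinate per anchor object w."""
    scale = 1.0 / math.sqrt(len(anchors))
    return [scale * math.exp(-gamma * distance(x, w)) for w in anchors]


def kernel(x, y, anchors, distance, gamma=1.0):
    """Kernel value as the inner product of the two feature vectors."""
    fx = feature_map(x, anchors, distance, gamma)
    fy = feature_map(y, anchors, distance, gamma)
    return sum(p * q for p, q in zip(fx, fy))


# Hypothetical anchors; in practice they would be drawn at random.
anchors = ["cat", "dog"]
k_same = kernel("cat", "cat", anchors, edit_distance)
k_near = kernel("cat", "cot", anchors, edit_distance)
k_far = kernel("cat", "zzzzzz", anchors, edit_distance)
print(k_same, k_near, k_far)
```

Once inputs are mapped to fixed-length feature vectors like this, any standard vector-based learner can be trained on them, which is the practical appeal over nearest-neighbor methods that use the distance directly.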

M4CD: A Robust Change Detection Method for Intelligent Visual Surveillance

In this paper, we propose a robust change detection method for intelligent visual surveillance. This method, named M4CD, includes three major steps. Firstly, a sample-based background model that integrates color and texture cues is built and updated over time. Secondly, multiple heterogeneous features (including brightness variation, chromaticity variation, and texture variation) are extracted by comparing the input frame with the background model, and a multi-source learning strategy is designed to estimate online the probability distributions for both foreground and background. The three features are approximately conditionally independent, making multi-source learning feasible. Pixel-wise foreground posteriors are then estimated with Bayes’ rule. Finally, Markov random field (MRF) optimization and heuristic post-processing techniques are used sequentially to improve accuracy. In particular, a two-layer MRF model is constructed to represent pixel-based and superpixel-based contextual constraints compactly. Experimental results on the CDnet dataset indicate that M4CD is robust under complex environments and ranks among the top methods.
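
The posterior step described above can be sketched as a worked example. Assuming the three features are conditionally independent given the pixel's class, the class likelihood factorizes into a product, and Bayes' rule gives the foreground posterior. The function name, the prior of 0.5, and the likelihood values below are hypothetical; the abstract does not specify them.

```python
def foreground_posterior(fg_likelihoods, bg_likelihoods, prior_fg=0.5):
    """Posterior P(foreground | features) for one pixel.

    fg_likelihoods[i] is p(feature_i | foreground) and bg_likelihoods[i]
    is p(feature_i | background). Conditional independence is assumed,
    so each class likelihood is the product of its per-feature terms.
    """
    fg = prior_fg
    for p in fg_likelihoods:
        fg *= p
    bg = 1.0 - prior_fg
    for p in bg_likelihoods:
        bg *= p
    total = fg + bg
    if total == 0.0:
        return prior_fg  # no evidence either way: fall back to the prior
    return fg / total


# Hypothetical likelihoods for brightness, chromaticity, and texture.
posterior = foreground_posterior([0.8, 0.5, 0.5], [0.2, 0.5, 0.5])
print(posterior)
```

In this toy case only the brightness feature is informative, so the posterior equals 0.8 / (0.8 + 0.2) up to rounding; the other two features cancel out of the ratio.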

Convolutional Neural Networks over Control Flow Graphs for Software Defect Prediction

Defects in software components are unavoidable and lead not only to wasted time and money but also to many serious consequences. To build predictive models, previous studies focus on manually extracting features or using tree representations of programs, and exploiting different machine learning algorithms. However, the performance of the models is not high since the existing features and tree structures often fail to capture the semantics of programs. To explore program semantics more deeply, this paper proposes to leverage precise graphs representing program execution flows, and deep neural networks for automatically learning defect features. Firstly, control flow graphs are constructed from the assembly instructions obtained by compiling source code; we thereafter apply multi-view multi-layer directed graph-based convolutional neural networks (DGCNNs) to learn semantic features. The experiments on four real-world datasets show that our method significantly outperforms the baselines including several other deep learning approaches.

Robust Continuous Co-Clustering

Clustering consists of grouping together samples given their similar properties. The problem of simultaneously modeling groups of samples and features is known as Co-Clustering. This paper introduces ROCCO, a Robust Continuous Co-Clustering algorithm. ROCCO is a scalable, hyperparameter-free, easy and ready to use algorithm to address Co-Clustering problems in practice over massive cross-domain datasets. It operates by learning a graph-based two-sided representation of the input matrix. The underlying proposed optimization problem is non-convex, which assures a flexible pool of solutions. Moreover, we prove that it can be solved with a near linear time complexity on the input size. An exhaustive large-scale experimental testbed conducted with both synthetic and real-world datasets demonstrates ROCCO’s properties in practice: (i) state-of-the-art performance in cross-domain real-world problems including Biomedicine and Text Mining; (ii) very low sensitivity to hyperparameter settings; (iii) robustness to noise; and (iv) linear empirical scalability in practice. These results highlight ROCCO as a powerful general-purpose co-clustering algorithm for cross-domain practitioners, regardless of their technical background.

Benchmarking Framework for Performance-Evaluation of Causal Inference Analysis

Causal inference analysis is the estimation of the effects of actions on outcomes. In the context of healthcare data this means estimating the outcome of counter-factual treatments (i.e. including treatments that were not observed) on a patient’s outcome. Compared to classic machine learning methods, evaluation and validation of causal inference analysis is more challenging because ground truth data of counter-factual outcome can never be obtained in any real-world scenario. Here, we present a comprehensive framework for benchmarking algorithms that estimate causal effect. The framework includes unlabeled data for prediction, labeled data for validation, and code for automatic evaluation of algorithm predictions using both established and novel metrics. The data is based on real-world covariates, and the treatment assignments and outcomes are based on simulations, which provides the basis for validation. In this framework we address two questions: one of scaling, and the other of data-censoring. The framework is available as open source code at https://…-Causal-Inference-Benchmarking-Framework.

Classification of Scientific Papers With Big Data Technologies

Data sizes that cannot be processed by conventional data storage and analysis systems are called Big Data. The term also refers to new technologies developed to store, process and analyze large amounts of data. Automatic information retrieval about the contents of a large number of documents produced by different sources, identifying research fields and topics, extraction of the document abstracts, or discovering patterns are some of the topics that have been studied in the field of big data. In this study, the Naive Bayes classification algorithm is run on a data set consisting of scientific articles to automatically determine the classes to which these documents belong. We have developed an efficient system that can analyze Turkish scientific documents with a distributed document classification algorithm run on Cloud Computing infrastructure. The Apache Mahout library is used in the study. The servers required for classifying and clustering distributed documents are

Crowd ideation of supervised learning problems

Crowdsourcing is an important avenue for collecting machine learning data, but crowdsourcing can go beyond simple data collection by employing the creativity and wisdom of crowd workers. Yet crowd participants are unlikely to be experts in statistics or predictive modeling, and it is not clear how well non-experts can contribute creatively to the process of machine learning. Here we study an end-to-end crowdsourcing algorithm where groups of non-expert workers propose supervised learning problems, rank and categorize those problems, and then provide data to train predictive models on those problems. Problem proposal includes and extends feature engineering because workers propose the entire problem, not only the input features but also the target variable. We show that workers without machine learning experience can collectively construct useful datasets and that predictive models can be learned on these datasets. In our experiments, the problems proposed by workers covered a broad range of topics, from politics and current events to problems capturing health behavior, demographics, and more. Workers also favored questions showing positively correlated relationships, which has interesting implications given many supervised learning methods perform as well with strong negative correlations. Proper instructions are crucial for non-experts, so we also conducted a randomized trial to understand how different instructions may influence the types of problems proposed by workers. In general, shifting the focus of machine learning tasks from designing and training individual predictive models to problem proposal allows crowdsourcers to design requirements for problems of interest and then guide workers towards contributing to the most suitable problems.

On the Blindspots of Convolutional Networks

Deep convolutional networks have been the state-of-the-art approach for a wide variety of tasks over the last few years. Their successes have, in many cases, turned them into the default model in quite a few domains. In this work we will demonstrate that convolutional networks have limitations that may, in some cases, hinder them from learning properties of the data which are easily recognizable by traditional, less demanding models. To this end, we present a series of competitive analysis studies on image recognition and text analysis tasks, for which convolutional networks are known to provide state-of-the-art results. In our studies, we inject a truth-revealing signal, indiscernible for the network, thus hitting time and again the network’s blind spots. The signal does not impair the network’s existing performance, but it does provide an opportunity for a significant performance boost by models that can capture it. The various forms of the carefully designed signals shed light on the strengths and weaknesses of convolutional networks, which may provide insights both for theoreticians who study the power of deep architectures and for practitioners who are considering applying convolutional networks to the task at hand.

Understanding the Role of Adaptivity in Machine Teaching: The Case of Version Space Learners

In real-world applications of education and human teaching, an effective teacher chooses the next example intelligently based on the learner’s current state. However, most of the existing works in algorithmic machine teaching focus on the batch setting, where adaptivity plays no role. In this paper, we study the case of teaching consistent, version space learners in an interactive setting—at any time step, the teacher provides an example, the learner performs an update, and the teacher observes the learner’s new state. We highlight that adaptivity does not speed up the teaching process when considering existing models of version space learners, such as the ‘worst-case’ model (the learner picks the next hypothesis randomly from the version space) and ‘preference-based’ model (the learner picks hypothesis according to some global preference). Inspired by human teaching, we propose a new model where the learner picks hypothesis according to some local preference defined by the current hypothesis. We show that our model exhibits several desirable properties, e.g., adaptivity plays a key role, and the learner’s transitions over hypotheses are smooth/interpretable. We develop efficient teaching algorithms for our model, and demonstrate our results via simulations as well as user studies.

Generating Plans that Predict Themselves

Collaboration requires coordination, and we coordinate by anticipating our teammates’ future actions and adapting to their plan. In some cases, our teammates’ actions early on can give us a clear idea of what the remainder of their plan is, i.e. what action sequence we should expect. In others, they might leave us less confident, or even lead us to the wrong conclusion. Our goal is for robot actions to fall in the first category: we want to enable robots to select their actions in such a way that human collaborators can easily use them to correctly anticipate what will follow. While previous work has focused on finding initial plans that convey a set goal, here we focus on finding two portions of a plan such that the initial portion conveys the final one. We introduce t-predictability: a measure that quantifies the accuracy and confidence with which human observers can predict the remaining robot plan from the overall task goal and the observed initial t actions in the plan. We contribute a method for generating t-predictable plans: we search for a full plan that accomplishes the task, but in which the first t actions make it as easy as possible to infer the remaining ones. The result is often different from the most efficient plan, in which the initial actions might leave a lot of ambiguity as to how the task will be completed. Through an online experiment and an in-person user study with physical robots, we find that our approach outperforms a traditional efficiency-based planner in objective and subjective collaboration metrics.

Story Generation and Aviation Incident Representation
Evolved Policy Gradients
Identify Susceptible Locations in Medical Records via Adversarial Attacks on Deep Predictive Models
Leveraging the Exact Likelihood of Deep Latent Variables Models
Prediction of next career moves from scientific profiles
Challenging Images For Minds and Machines
Network Estimation from Point Process Data
A note on randomly scaled scale-decorated Poisson point processes
Sources of Variance in Two-Photon Microscopy Neuroimaging
State Space Gaussian Processes with Non-Gaussian Likelihood
Clustering and Semi-Supervised Classification for Clickstream Data via Mixture Models
Persistence Codebooks for Topological Data Analysis
A theoretical guideline for designing an effective adaptive particle swarm
Bases of the quantum matrix bialgebra and induced sign characters of the Hecke algebra
Distribution-free Junta Testing
Local Descent For Temporal Logic Falsification of Cyber-Physical Systems (Extended Technical Report)
Learning via social awareness: improving sketch representations with facial feedback
Satellite Image Forgery Detection and Localization Using GAN and One-Class Classifier
Efficient Discovery of Variable-length Time Series Motifs with Large Length Range in Million Scale Time Series
Distributionally Robust Mean-Variance Portfolio Selection with Wasserstein Distances
Probabilistic Warnings in National Security Crises: Pearl Harbor Revisited
The false positive risk: a proposal concerning what to do about p-values
Understanding Membership Inferences on Well-Generalized Learning Models
Computer-Aided Knee Joint Magnetic Resonance Image Segmentation – A Survey
Molecular Structure Extraction From Documents Using Deep Learning
Ultrahigh-dimensional Robust and Efficient Sparse Regression using Non-Concave Penalized Density Power Divergence
Compressive Sensing with Low Precision Data Representation: Radio Astronomy and Beyond
Conditional Density Estimation with Bayesian Normalising Flows
Linear-Time Algorithm for Learning Large-Scale Sparse Graphical Models
Web-Scale Responsive Visual Search at Bing
Prophit: Causal inverse classification for multiple continuously valued treatment policies
DVAE++: Discrete Variational Autoencoders with Overlapping Transformations
Stability of circulant graphs
Beamforming with Multiple One-Bit Wireless Transceivers
Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks
Bias Correction Estimation for Continuous-Time Asset Return Model with Jumps
Gallai-Ramsey numbers for books
Energy Spatio-Temporal Pattern Prediction for Electric Vehicle Networks
MemeSequencer: Sparse Matching for Embedding Image Macros
Isolating Sources of Disentanglement in Variational Autoencoders
$\mathcal{CIRFE}$: A Distributed Random Fields Estimator
Edge Attention-based Multi-Relational Graph Convolutional Networks
Attack RMSE Leaderboard: An Introduction and Case Study
ForkBase: An Efficient Storage Engine for Blockchain and Forkable Applications
Optimal Fairness-Aware Time and Power Allocation in Wireless Powered Communication Networks
Multiterminal Secret Key Agreement at Asymptotically Zero Discussion Rate
Vertex nomination: The canonical sampling and the extended spectral nomination schemes
Disjoint Multi-task Learning between Heterogeneous Human-centric Tasks
Destination Choice Game: A Spatial Interaction Theory on Human Mobility
DESlib: A Dynamic ensemble selection library in Python
Median Shapes
A Framework for Input-Output Analysis of Wall-Bounded Shear Flows
Paraphrasing Complex Network: Network Compression via Factor Transfer
Singularly perturbed forward-backward stochastic differential equations: application to the optimal control of bilinear systems
Adaptive importance sampling with forward-backward stochastic differential equations
On ranks of polynomials
PlayeRank: Multi-dimensional and role-aware rating of soccer player performance
The Depoissonisation quintet: Rice-Poisson-Mellin-Newton-Laplace
American Options in the Hobson-Rogers Model
Using Longitudinal Targeted Maximum Likelihood Estimation in Complex Settings with Dynamic Interventions
The step Sidorenko property and non-norming edge-transitive graphs
SIR epidemics and vaccination on random graphs with clustering
Distributional Term Set Expansion
Parameter estimation for discretely-observed linear birth-and-death processes
Multilevel nested simulation for efficient risk estimation
Recursive Chaining of Reversible Image-to-image Translators For Face Aging
Directed cycles have the edge-Erdős-Pósa property
Not to Cry Wolf: Distantly Supervised Multitask Learning in Critical Care
Tackling Multilabel Imbalance through Label Decoupling and Data Resampling Hybridization
Dealing with Difficult Minority Labels in Imbalanced Mutilabel Data Sets
Nonnegative PARAFAC2: a flexible coupling approach
Network structure inhibits information cascades in heavy-tailed social networks
A note on packing of uniform hypergraphs
GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms
Understanding Book Popularity on Goodreads
Contingent derivatives and regularization for noncoercive inverse problems
Analysis of Large Urn Models with Local Mean-Field Interactions
On Quasi-Infinitely Divisible Distributions with a Point Mass
Limit Theorems for the Alloy-type Random Energy Model
Min-Max-Min Robustness for Combinatorial Problems with Budgeted Uncertainty
L4: Practical loss-based stepsize adaptation for deep learning
Fast dynamics perspective on the breakdown of the Stokes-Einstein law in fragile glassformers
Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the ‘Speaking Rosetta’ JSALT 2017 Workshop
The Multiscale Bowler-Hat Transform for Vessel Enhancement in 3D Biomedical Images
DiCE: The Infinitely Differentiable Monte-Carlo Estimator
An adaptive procedure for Fourier estimators: illustration to deconvolution and decompounding
Co-training for Extraction of Adverse Drug Reaction Mentions from Tweets
Geometric probabilities for a cluster of needles and a lattice of rectangles
Multi-Task Learning for Extraction of Adverse Drug Reaction Mentions from Tweets
Deep Learning and Data Assimilation for Real-Time Production Prediction in Natural Gas Wells
Morphologic for knowledge dynamics: revision, fusion, abduction
Maximum Total Correntropy Diffusion Adaptation over Networks with Noisy Links
Channel Reconstruction-Based Hybrid Precoding for Millimeter Wave Multi-User MIMO Systems
Some central limit theorems for random walks associated with hypergeometric functions of type BC
Stepwise Transmit Antenna Selection in Downlink Massive Multiuser MIMO
A Convection-Diffusion Model for Gang Territoriality
Toward Deeper Understanding of Nonconvex Stochastic Optimization with Momentum using Diffusion Approximations
Algebraically grid-like graphs have large tree-width
$W^{1,p}$ regularity of solutions to Kolmogorov equation with Gilbarg-Serrin matrix
Bounds on the norm of Wigner-type random matrices
Sampling Superquadric Point Clouds with Normals
Bayesian Meta-Analysis of Multiple Continuous Treatments: An Application to Antipsychotic Drugs
Security Analysis and Enhancement of Model Compressed Deep Learning Systems under Adversarial Attacks
Generative Models for Spear Phishing Posts on Social Media
Perfect shuffling by lazy swaps
Fully Convolutional Network Ensembles for White Matter Hyperintensities Segmentation in MR Images
Enabling Interactive Mobile Simulations Through Distributed Reduced Models
Sum Secrecy Rate Maximization in a Multi-Carrier MIMO Wiretap Channel with Full-Duplex Jamming
Counting subgraphs in fftp graphs with symmetry
Learning Privacy Preserving Encodings through Adversarial Training
Inference for Heavy-Tailed Max-Renewal Processes
Who Killed Albert Einstein? From Open Data to Murder Mystery Games
Stochastic Darboux transformations for quasi-birth-and-death processes and urn models
Necessary and Sufficient Null Space Condition for Nuclear Norm Minimization in Low-Rank Matrix Recovery
Robust Target Localization Based on Squared Range Iterative Reweighted Least Squares
Efficient Exact Paths For Dyck and semi-Dyck Labeled Path Reachability
Interference Cancellation and Iterative Detection for Orthogonal Time Frequency Space Modulation
Distributionally Robust Submodular Maximization
Differentially Private Empirical Risk Minimization Revisited: Faster and More General
Upgrading nodes in tree-shaped hub location
Kardar-Parisi-Zhang Universality in First-Passage Percolation: the Role of Geodesic Degeneracy
Dynamic Sensor Selection for Reliable Spectrum Sensing via E-Optimal Criterion
On macroscopic holes in some supercritical strongly dependent percolation models
Permutation polynomials over $\mathbb{F}_{q^2}$ from rational functions