Collapsing-Fast-Large-Almost-Matching-Exactly: A Matching Method for Causal Inference

We aim to create the highest possible quality of treatment-control matches for categorical data in the potential outcomes framework. Matching methods are heavily used in the social sciences due to their interpretability, but most past matching methods do not pass basic sanity checks: they fail when irrelevant variables are introduced. Past methods also tend to be either computationally slow or to produce poor matches. The method proposed in this work matches units on a weighted Hamming distance, taking into account the relative importance of the covariates; the algorithm aims to match units on as many relevant variables as possible. To do this, the algorithm creates a hierarchy of covariate combinations on which to match (similar to downward closure), in the process solving an optimization problem for each unit in order to construct the optimal matches. The algorithm uses a single dynamic program to solve all of the optimization problems simultaneously. Notable advantages of our method over existing matching procedures are its high-quality matches, versatility in handling different data distributions that may have irrelevant variables, and ability to handle missing data by matching on as many available covariates as possible.
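
The weighted Hamming distance mentioned above can be illustrated with a minimal sketch. The function names and the brute-force search are assumptions made for illustration only; the abstract describes a dynamic program over covariate combinations, which this sketch does not attempt to reproduce.

```python
def weighted_hamming(unit_a, unit_b, weights):
    """Sum the weights of the covariates on which two units disagree."""
    if not (len(unit_a) == len(unit_b) == len(weights)):
        raise ValueError("units and weights must have the same length")
    return sum(w for a, b, w in zip(unit_a, unit_b, weights) if a != b)


def closest_controls(treated, controls, weights):
    """Brute-force search (illustrative only): return the control units
    at minimum weighted Hamming distance from one treated unit."""
    distances = [weighted_hamming(treated, c, weights) for c in controls]
    best = min(distances)
    return [c for c, d in zip(controls, distances) if d == best]


# Hypothetical data: three categorical covariates, the first weighted most.
weights = [3.0, 1.0, 0.5]
treated = (1, 0, 2)
controls = [(1, 1, 2), (0, 0, 2), (1, 0, 1)]
matches = closest_controls(treated, controls, weights)
```

A larger weight makes a mismatch on that covariate more costly, so the selected controls agree with the treated unit on the covariates judged most important.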

Flexible Collaborative Estimation of the Average Causal Effect of a Treatment using the Outcome-Highly-Adaptive Lasso

Many estimators of the average causal effect of an intervention require estimation of the propensity score, the outcome regression, or both. For these estimators, we must carefully consider how to estimate the relevant regressions. It is often beneficial to utilize flexible techniques such as semiparametric regression or machine learning. However, optimal estimation of the regression function does not necessarily lead to optimal estimation of the average causal effect. Therefore, it is important to consider criteria for evaluating regression estimators and selecting hyper-parameters. A recent proposal addressed these issues via the outcome-adaptive lasso, a penalized regression technique for estimating the propensity score. We build on this proposal and offer a method that is simultaneously more flexible and more efficient than the previous proposal. We propose the outcome-highly-adaptive LASSO, a semi-parametric regression estimator designed to down-weight regions of the confounder space that do not contribute variation to the outcome regression. We show that tuning this method using collaborative targeted learning leads to superior finite-sample performance relative to competing estimators.

MultiFIT: Multivariate Multiscale Framework for Independence Tests

We present a framework for testing independence between two random vectors that is scalable to massive data. Taking a ‘divide-and-conquer’ approach, we break down the nonparametric multivariate test of independence into simple univariate independence tests on a collection of 2×2 contingency tables, constructed by sequentially discretizing the original sample space at a cascade of scales from coarse to fine. This transforms a complex nonparametric testing problem, which traditionally requires quadratic computational complexity with respect to the sample size, into a multiple testing problem that can be addressed with a computational complexity that scales almost linearly with the sample size. We further consider the scenario when the dimensionality of the two random vectors also grows large, in which case the curse of dimensionality arises in the proposed framework through an explosion in the number of univariate tests to be completed. To overcome this difficulty, we propose a data-adaptive version of our method that completes only a fraction of the univariate tests, namely those judged more likely to contain evidence for dependency based on the spatial characteristics of the dependency structure in the data. We provide an inference recipe based on multiple testing adjustment that guarantees inferential validity by properly controlling the family-wise error rate. Through an extensive simulation study, we demonstrate the tremendous computational advantage of the algorithm in comparison to existing approaches while achieving desirable statistical power. We also illustrate how our method can be used for learning the nature of the underlying dependency in addition to hypothesis testing. We demonstrate the use of our method by analyzing a data set from flow cytometry.

Kernel-based Outlier Detection using the Inverse Christoffel Function

Outlier detection methods have become increasingly relevant in recent years due to increased security concerns and because of their wide application in different fields. Recently, Pauwels and Lasserre (2016) noticed that the sublevel sets of the inverse Christoffel function accurately depict the shape of a cloud of data using a sum-of-squares polynomial and can be used to perform outlier detection. In this work, we propose a kernelized variant of the inverse Christoffel function that makes it computationally tractable for data sets with a large number of features. We compare our approach to current methods on 15 different data sets and achieve the best average area under the precision recall curve (AUPRC) score, the best average rank and the lowest root mean square deviation.

Evaluating and Characterizing Incremental Learning from Non-Stationary Data

Incremental learning from non-stationary data poses special challenges to the field of machine learning. Although new algorithms have been developed for this, assessment of results and comparison of behaviors are still open problems, mainly because evaluation metrics, adapted from more traditional tasks, can be ineffective in this context. Overall, there is a lack of common testing practices. This paper thus presents a testbed for incremental non-stationary learning algorithms, based on specially designed synthetic datasets. Also, test results are reported for some well-known algorithms to show that the proposed methodology is effective at characterizing their strengths and weaknesses. It is expected that this methodology will provide a common basis for evaluating future contributions in the field.

The Reduced PC-Algorithm: Improved Causal Structure Learning in Large Random Networks

We consider the task of estimating a high-dimensional directed acyclic graph, given observations from a linear structural equation model with arbitrary noise distribution. By exploiting properties of common random graphs, we develop a new algorithm that requires conditioning only on small sets of variables. The proposed algorithm, which is essentially a modified version of the PC-Algorithm, offers significant gains in both computational complexity and estimation accuracy. In particular, it results in more efficient and accurate estimation in large networks containing hub nodes, which are common in biological systems. We prove the consistency of the proposed algorithm, and show that it also requires a less stringent faithfulness assumption than the PC-Algorithm. Simulations in low and high-dimensional settings are used to illustrate these findings. An application to gene expression data suggests that the proposed algorithm can identify a greater number of clinically relevant genes than current methods.

The choice to define competing risk events as censoring events and implications for causal inference

In failure-time settings, a competing risk event is any event that makes it impossible for the event of interest to occur. Different analytical methods are available for estimating the effect of a treatment on a failure event of interest that is subject to competing events. The choice of method depends on whether or not competing events are defined as censoring events. Though such definition has key implications for the causal interpretation of a given estimate, explicit consideration of those implications has been rare in the statistical literature. As a result, confusion exists as to how to choose amongst available methods for analyzing data with competing events and how to interpret effect estimates. This confusion can be alleviated by understanding that the choice to define a competing event as a censoring event or not corresponds to a choice between different causal estimands. In this paper, we describe the assumptions required to identify those causal estimands and provide a mapping between such estimands and standard terminology from the statistical literature, in particular the terms ‘subdistribution function’, ‘subdistribution hazard’ and ‘cause-specific hazard’. We show that when the censoring process depends on measured time-varying risk factors, conventional statistical methods for competing events are not valid and alternative methods derived from Robins’s g-formula may recover the causal estimand of interest.

Sparse Principal Component based High-Dimensional Mediation Analysis

Causal mediation analysis aims to quantify the intermediate effect of a mediator on the causal pathway from treatment to outcome. With multiple mediators, which are potentially causally dependent, the number of possible decompositions of pathway effects grows exponentially with the number of mediators. Huang and Pan (2016) introduced a principal component analysis (PCA) based approach to address this challenge, in which the transformed mediators are conditionally independent given the orthogonality of the PCs. However, the transformed mediator PCs, which are linear combinations of original mediators, are difficult to interpret. In this study, we propose a sparse high-dimensional mediation analysis approach by adapting the sparse PCA method introduced by Zou and others (2006) to the mediation setting. We apply the approach to a task-based functional magnetic resonance imaging study, and show that our proposed method is able to detect biologically meaningful results related to the identified mediator.

Data Mining definition services in Cloud Computing with Linked Data

In recent years Cloud Computing service providers have been adding Data Mining (DM) services to their catalogs. Several syntactic and semantic proposals have been presented to address the problem of the definition and description of services in Cloud Computing in a comprehensive way. Considering that each provider defines its own service logic for DM, we find that by using semantic languages and following the linked data proposal it is possible to design a specification for the exchange of data mining services, achieving a high degree of interoperability. In this paper we propose a schema for the complete definition of DM Cloud Computing services, considering key aspects such as pricing, interfaces and experimentation workflow, among others. Our proposal leverages Linked Data, and we validate its usefulness by defining several DM services as complete Cloud Computing services.

A New High Performance and Scalable SVD algorithm on Distributed Memory Systems

This paper introduces a high performance implementation of the Zolo-SVD algorithm on distributed memory systems, which is based on the polar decomposition (PD) algorithm via Zolotarev’s function (Zolo-PD), originally proposed by Nakatsukasa and Freund [SIAM Review, 2016]. Our implementation relies heavily on the routines of ScaLAPACK and is therefore portable. Compared with other PD algorithms such as the QR-based dynamically weighted Halley method (QDWH-PD), Zolo-PD is naturally parallelizable and has better scalability, though it performs more floating-point operations. When using many processes, Zolo-PD is usually 1.20 times faster than the QDWH-PD algorithm, and Zolo-SVD can be about two times faster than the ScaLAPACK routine PDGESVD. These numerical experiments are performed on the Tianhe-2 supercomputer, one of the fastest supercomputers in the world, and the tested matrices include some sparse matrices from particular applications and some randomly generated dense matrices with different dimensions. Our QDWH-SVD and Zolo-SVD implementations are freely available at https://…/Zolo-SVD.

Learning kernels that adapt to GPU

In recent years machine learning methods that nearly interpolate the data have achieved remarkable success. In many settings achieving near-zero training error leads to excellent test results. In this work we show how the mathematical and conceptual simplicity of interpolation can be harnessed to construct a framework for very efficient, scalable and accurate kernel machines. Our main innovation is in constructing kernel machines that output solutions mathematically equivalent to those obtained using standard kernels, yet are capable of fully utilizing the available computing power of a parallel computational resource, such as a GPU. Such utilization is key to strong performance since much of the computational resource capability is wasted by the standard iterative methods. The computational resource and data adaptivity of our learned kernels is based on theoretical convergence bounds. The resulting algorithm, which we call EigenPro 2.0, is accurate, principled and very fast. For example, using a single GPU, training on ImageNet with 1.3×10^6 data points and 1000 labels takes under an hour, while smaller datasets, such as MNIST, take seconds. Moreover, as the parameters are chosen analytically, based on the theory, little tuning beyond selecting the kernel and kernel parameter is needed, further facilitating the practical use of these methods.

BinGAN: Learning Compact Binary Descriptors with a Regularized GAN

In this paper, we propose a novel regularization method for Generative Adversarial Networks, which allows the model to learn discriminative yet compact binary representations of image patches (image descriptors). We employ the dimensionality reduction that takes place in the intermediate layers of the discriminator network and train a binarized low-dimensional representation of the penultimate layer to mimic the distribution of the higher-dimensional preceding layers. To achieve this, we introduce two loss terms that aim at: (i) reducing the correlation between the dimensions of the binarized low-dimensional representation of the penultimate layer (i.e., maximizing joint entropy) and (ii) propagating the relations between the dimensions in the high-dimensional space to the low-dimensional space. We evaluate the resulting binary image descriptors on two challenging applications, image matching and retrieval, and achieve state-of-the-art results.

Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks

Adaptive gradient methods, which adopt historical gradient information to automatically adjust the learning rate, have been observed to generalize worse than stochastic gradient descent (SGD) with momentum in training deep neural networks. How to close the generalization gap of adaptive gradient methods therefore remains an open problem. In this work, we show that adaptive gradient methods such as Adam and Amsgrad are sometimes ‘over adapted’. We design a new algorithm, called Partially adaptive momentum estimation method (Padam), which unifies Adam/Amsgrad with SGD to achieve the best of both worlds. Experiments on standard benchmarks show that Padam can maintain a convergence rate as fast as that of Adam/Amsgrad while generalizing as well as SGD in training deep neural networks. These results suggest that practitioners can pick up adaptive gradient methods once again for faster training of deep neural networks.
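
To make the idea of partial adaptation concrete, here is a minimal scalar sketch. It assumes, beyond what the abstract states, that the interpolation between Amsgrad and SGD with momentum is controlled by an exponent `p` on the second-moment estimate, so that `p = 0.5` gives an Amsgrad-style step and `p` near 0 gives a momentum-SGD-style step. The function name, default values, and the `eps` term are illustrative choices, not taken from the source.

```python
def padam_step(theta, grad, state, lr=0.1, beta1=0.9, beta2=0.999,
               p=0.125, eps=1e-8):
    """One partially adaptive update on a scalar parameter (sketch)."""
    # Exponential moving averages of the gradient and squared gradient.
    m = beta1 * state["m"] + (1 - beta1) * grad
    v = beta2 * state["v"] + (1 - beta2) * grad * grad
    # Amsgrad-style running maximum of the second-moment estimate.
    v_hat = max(state["v_hat"], v)
    # Assumed partial adaptation: exponent p instead of a square root.
    # eps is added here only for numerical safety.
    new_theta = theta - lr * m / ((v_hat + eps) ** p)
    return new_theta, {"m": m, "v": v, "v_hat": v_hat}


# Toy usage: minimize f(x) = x^2, whose gradient is 2x.
theta = 3.0
state = {"m": 0.0, "v": 0.0, "v_hat": 0.0}
for _ in range(200):
    theta, state = padam_step(theta, 2.0 * theta, state)
```

Exposing `p` as a single knob lets the same code move between the fully adaptive and non-adaptive regimes, which is the unification the abstract describes.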

Optimal Subsampling Algorithms for Big Data Generalized Linear Models

To quickly approximate the maximum likelihood estimator with massive data, Wang et al. (JASA, 2017) proposed an Optimal Subsampling Method under the A-optimality Criterion (OSMAC) for logistic regression. This paper extends the scope of the OSMAC framework to include generalized linear models with canonical link functions. The consistency and asymptotic normality of the estimator from a general subsampling algorithm are established, and optimal subsampling probabilities under the A- and L-optimality criteria are derived. Furthermore, using a Frobenius norm matrix concentration inequality, finite sample properties of the subsample estimator based on optimal subsampling probabilities are derived. Since the optimal subsampling probabilities depend on the full data estimate, an adaptive two-step algorithm is developed. Asymptotic normality and optimality of the estimator from this adaptive algorithm are established. The proposed methods are illustrated and evaluated through numerical experiments on simulated and real datasets.

Comparison-Based Random Forests

Assume we are given a set of items from a general metric space, but we neither have access to the representation of the data nor to the distances between data points. Instead, suppose that we can actively choose a triplet of items (A,B,C) and ask an oracle whether item A is closer to item B or to item C. In this paper, we propose a novel random forest algorithm for regression and classification that relies only on such triplet comparisons. In the theory part of this paper, we establish sufficient conditions for the consistency of such a forest. In a set of comprehensive experiments, we then demonstrate that the proposed random forest is efficient both for classification and regression. In particular, it is even competitive with other methods that have direct access to the metric representation of the data.
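
The triplet-comparison setting can be sketched as follows. The oracle hides the distance and only answers whether item A is closer to B or to C; a tree node can then partition its items using two pivots. Choosing the pivots uniformly at random is an assumption made for this illustration, since the abstract does not say how the proposed forest selects them, and the function names are hypothetical.

```python
import random


def make_triplet_oracle(distance):
    """Wrap a hidden distance so callers only ever see triplet answers."""
    def closer_to_first(a, b, c):
        # True if item a is at least as close to b as it is to c.
        return distance(a, b) <= distance(a, c)
    return closer_to_first


def comparison_split(items, oracle, rng):
    """Split items by which of two randomly chosen pivots each is closer to."""
    pivot_b, pivot_c = rng.sample(items, 2)
    left = [x for x in items if oracle(x, pivot_b, pivot_c)]
    right = [x for x in items if not oracle(x, pivot_b, pivot_c)]
    return left, right


# Toy usage: points on a line; the splitter never sees coordinates directly.
items = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
oracle = make_triplet_oracle(lambda x, y: abs(x - y))
left, right = comparison_split(items, oracle, random.Random(0))
```

Applying such a split recursively yields a tree built purely from comparisons; with distinct items, each pivot always lands on its own side, so neither side is empty.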

Nonparametric Topic Modeling with Neural Inference

This work focuses on combining nonparametric topic models with Auto-Encoding Variational Bayes (AEVB). Specifically, we first propose iTM-VAE, where the topics are treated as trainable parameters and the document-specific topic proportions are obtained by a stick-breaking construction. The inference of iTM-VAE is modeled by neural networks such that it can be computed in a simple feed-forward manner. We also describe how to introduce a hyper-prior into iTM-VAE so as to model the uncertainty of the prior parameter. Actually, the hyper-prior technique is quite general and we show that it can be applied to other AEVB based models to alleviate the collapse-to-prior problem elegantly. Moreover, we also propose HiTM-VAE, where the document-specific topic distributions are generated in a hierarchical manner. HiTM-VAE is even more flexible and can generate topic distributions with better variability. Experimental results on 20News and Reuters RCV1-V2 datasets show that the proposed models outperform the state-of-the-art baselines significantly. The advantages of the hyper-prior technique and the hierarchical model construction are also confirmed by experiments.
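
The stick-breaking construction mentioned above turns a sequence of fractions in [0, 1] into topic proportions: each fraction takes that share of whatever is left of a unit-length stick. The sketch below shows only this generic arithmetic; how iTM-VAE produces the fractions is not described in the abstract, so the inputs here are hypothetical.

```python
def stick_breaking(fractions):
    """Convert break fractions v_k into proportions
    pi_k = v_k * prod_{j<k} (1 - v_j), plus the unassigned remainder."""
    remaining = 1.0
    proportions = []
    for v in fractions:
        if not 0.0 <= v <= 1.0:
            raise ValueError("each fraction must lie in [0, 1]")
        proportions.append(v * remaining)  # take a share of what is left
        remaining *= 1.0 - v               # shrink the stick
    return proportions, remaining


# Breaking off half of the remaining stick three times.
proportions, leftover = stick_breaking([0.5, 0.5, 0.5])
```

The proportions and the leftover always sum to one, which is why a truncated stick-breaking sequence can represent a topic distribution without fixing the number of topics in advance.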

Incremental Sparse Bayesian Ordinal Regression

Ordinal Regression (OR) aims to model the ordering information between different data categories, which is a crucial topic in multi-label learning. An important class of approaches to OR models the problem as a linear combination of basis functions that map features to a high dimensional non-linear space. However, most of the basis function-based algorithms are time consuming. We propose an incremental sparse Bayesian approach to OR tasks and introduce an algorithm to sequentially learn the relevant basis functions in the ordinal scenario. Our method, called Incremental Sparse Bayesian Ordinal Regression (ISBOR), automatically optimizes the hyper-parameters via the type-II maximum likelihood method. By exploiting fast marginal likelihood optimization, ISBOR can avoid inverting large matrices, which is the main bottleneck in applying basis function-based algorithms to OR tasks on large-scale datasets. We show that ISBOR can make accurate predictions with parsimonious basis functions while offering automatic estimates of the prediction uncertainty. Extensive experiments on synthetic and real-world datasets demonstrate the efficiency and effectiveness of ISBOR compared to other basis function-based OR approaches.

HitNet: a neural network with capsules embedded in a Hit-or-Miss layer, extended with hybrid data augmentation and ghost capsules

Neural networks designed for the task of classification have become a commodity in recent years. Many works target the development of better networks, which results in increasingly complex architectures with more layers, multiple sub-networks, or even the combination of multiple classifiers. In this paper, we show how to redesign a simple network to reach excellent performance, better than the results reproduced with CapsNet on several datasets, by replacing a layer with a Hit-or-Miss layer. This layer contains activated vectors, called capsules, that we train to hit or miss a central capsule by tailoring a specific centripetal loss function. We also show how our network, named HitNet, is capable of synthesizing a representative sample of the images of a given class by including a reconstruction network. This makes it possible to develop a data augmentation step combining information from the data space and the feature space, resulting in a hybrid data augmentation process. In addition, we introduce the possibility for HitNet to adopt an alternative to the true target when needed by using the new concept of ghost capsules, which is used here to detect potentially mislabeled images in the training data.

Self-Attentive Neural Collaborative Filtering

The dominant, state-of-the-art collaborative filtering (CF) methods today mainly comprise neural models. In these models, deep neural networks, e.g., multi-layered perceptrons (MLP), are often used to model nonlinear relationships between user and item representations. As opposed to shallow models (e.g., factorization-based models), deep models generally provide a greater extent of expressiveness, albeit at the expense of impaired/restricted information flow. Consequently, the performance of most neural CF models plateaus at 3-4 layers, with performance stagnating or even degrading when increasing the model depth. As such, the question of how to train really deep networks in the context of CF remains open. To this end, this paper proposes a new technique that enables training neural CF models all the way up to 20 layers and beyond. Our proposed approach utilizes a new hierarchical self-attention mechanism that learns introspective intra-feature similarity across all the hidden layers of a standard MLP model. All in all, our proposed architecture, SA-NCF (Self-Attentive Neural Collaborative Filtering), is a densely connected self-matching model that can be trained up to 24 layers without plateauing, achieving wide performance margins against its competitors. On several popular benchmark datasets, our proposed architecture achieves an absolute improvement of up to 23%-58% and a 1.3x to 2.8x improvement in terms of nDCG@10 and Hit Ratio (HR@10) scores over several strong neural CF baselines.
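
For readers unfamiliar with the two reported metrics, the sketch below computes HR@10 and nDCG@10 for a single user. It assumes a leave-one-out protocol with exactly one held-out relevant item per user, which is common in neural CF evaluation but is not stated in the abstract; the function names are illustrative.

```python
import math


def hit_ratio_at_k(ranked_items, held_out_item, k=10):
    """1.0 if the held-out item appears in the top-k list, else 0.0."""
    return 1.0 if held_out_item in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, held_out_item, k=10):
    """With one relevant item the ideal DCG is 1, so nDCG reduces to
    1 / log2(position + 2) for a 0-based position within the top k."""
    top_k = ranked_items[:k]
    if held_out_item not in top_k:
        return 0.0
    position = top_k.index(held_out_item)
    return 1.0 / math.log2(position + 2)


# Hypothetical ranking produced by a recommender for one user.
ranking = ["item_a", "item_b", "item_c"]
hr = hit_ratio_at_k(ranking, "item_b")
ndcg = ndcg_at_k(ranking, "item_b")
```

HR@10 only records whether the item was retrieved, while nDCG@10 also rewards placing it near the top of the list; scores are usually averaged over all users.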

Post-Lasso Inference for High-Dimensional Regression

Among the most popular variable selection procedures in high-dimensional regression, Lasso provides a solution path to rank the variables and determines a cut-off position on the path to select variables and estimate coefficients. In this paper, we consider variable selection from a new perspective motivated by the frequently occurring phenomenon that relevant variables are not completely distinguishable from noise variables on the solution path. We propose to characterize the positions of the first noise variable and the last relevant variable on the path. We then develop a new variable selection procedure to control over-selection of the noise variables ranking after the last relevant variable, and, at the same time, retain a high proportion of relevant variables ranking before the first noise variable. Our procedure utilizes the recently developed covariance test statistic and Q statistic in post-selection inference. In numerical examples, our method compares favorably with other existing methods in selection accuracy and the ability to interpret its results.

DynMat, a network that can learn after learning

To survive in the dynamically-evolving world, we accumulate knowledge and improve our skills based on experience. In the process, gaining new knowledge does not disrupt our vigilance to external stimuli. In other words, our learning process is ‘accumulative’ and ‘online’ without interruption. However, despite their recent success, artificial neural networks (ANNs) must be trained offline, and they suffer catastrophic interference between old and new learning, indicating that ANNs’ conventional learning algorithms may not be suitable for building intelligent agents comparable to our brain. In this study, we propose a novel neural network architecture (DynMat) consisting of dual learning systems, inspired by the complementary learning system (CLS) theory, which suggests that the brain relies on short- and long-term learning systems to learn continuously. Our experiments show that 1) DynMat can learn a new class without catastrophic interference and 2) it does not strictly require offline training.

Handling Cold-Start Collaborative Filtering with Reinforcement Learning

A major challenge in recommender systems is handling new users, who are also called cold-start users. In this paper, we propose a novel approach for learning an optimal series of questions with which to interview cold-start users for movie recommender systems. We propose learning interview questions using Deep Q Networks to create user profiles to make better recommendations to cold-start users. While our proposed system is trained using a movie recommender system, our Deep Q Network model should generalize across various types of recommender systems.

Supervised Fuzzy Partitioning

Centroid-based methods including k-means and fuzzy c-means (FCM) are known as effective and easy-to-implement approaches to clustering in many areas of application. However, these algorithms cannot be directly applied to supervised tasks. We propose a generative model extending centroid-based clustering approaches to be applicable to classification and regression problems. Given an arbitrary loss function, our approach, termed supervised fuzzy partitioning (SFP), incorporates label information into its objective function through a surrogate term penalizing the risk. We also fuzzify the partition and assign weights to features alongside entropy-based regularization terms, enabling the method to capture more complex data structure, to identify significant features, and to yield better performance on high-dimensional data. An iterative algorithm based on a block coordinate descent (BCD) scheme is formulated to efficiently find a local optimizer. The results show that the SFP performance in classification and supervised dimensionality reduction on synthetic and real-world datasets is competitive with state-of-the-art algorithms such as random forest and SVM. Our method has a major advantage over such methods in that it not only leads to a flexible model but also uses the loss function in the training phase without compromising computational efficiency.

Detecting Dead Weights and Units in Neural Networks

Deep Neural Networks are highly over-parameterized and the size of the neural networks can be reduced significantly after training without any decrease in performance. One can clearly see this phenomenon in a wide range of architectures trained for various problems. Weight/channel pruning, distillation, quantization and matrix factorization are some of the main methods one can use to remove the redundancy and obtain smaller and faster models. This work starts with a short informative chapter, where we motivate the pruning idea and provide the necessary notation. In the second chapter, we compare various saliency scores in the context of parameter pruning. Using the insights obtained from this comparison and stating the problems it brings, we motivate why pruning units instead of individual parameters might be a better idea. We propose a set of definitions to quantify and analyze units that neither learn nor create any useful information. We propose an efficient way of detecting dead units and use it to select which units to prune. We get a 5x model size reduction through unit-wise pruning on MNIST.

An Online Prediction Algorithm for Reinforcement Learning with Linear Function Approximation using Cross Entropy Method

In this paper, we provide two new stable online algorithms for the problem of prediction in reinforcement learning, i.e., estimating the value function of a model-free Markov reward process using the linear function approximation architecture and with memory and computation costs scaling quadratically in the size of the feature set. The algorithms employ the multi-timescale stochastic approximation variant of the very popular cross entropy (CE) optimization method, which is a model-based search method to find the global optimum of a real-valued function. A proof of convergence of the algorithms using the ODE method is provided. We supplement our theoretical results with experimental comparisons. The algorithms achieve good performance fairly consistently on many RL benchmark problems with regard to computational efficiency, accuracy and stability.
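
As background for the cross entropy method referred to above, here is a minimal sketch of its basic batch form for maximizing a one-dimensional function: sample candidates from a Gaussian, keep the best few, and refit the Gaussian to them. This is not the multi-timescale stochastic approximation variant the abstract describes, and the function name and default parameters are illustrative assumptions.

```python
import random
import statistics


def cross_entropy_maximize(objective, mean=0.0, std=5.0, samples=50,
                           elite=10, iterations=30, seed=0):
    """Basic batch cross-entropy search over a scalar variable (sketch)."""
    rng = random.Random(seed)
    for _ in range(iterations):
        # Sample candidates from the current Gaussian search distribution.
        candidates = [rng.gauss(mean, std) for _ in range(samples)]
        # Keep the elite candidates with the highest objective values.
        candidates.sort(key=objective, reverse=True)
        top = candidates[:elite]
        # Refit the Gaussian to the elite set; the small constant keeps
        # the standard deviation strictly positive.
        mean = statistics.mean(top)
        std = statistics.pstdev(top) + 1e-6
    return mean


# Toy usage: the maximizer of -(x - 2)^2 is x = 2.
best_x = cross_entropy_maximize(lambda x: -(x - 2.0) ** 2)
```

Because the method only needs objective values at sampled points, it can search functions without gradient information, which is one reason it is attractive as a building block.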

Polynomial Regression As an Alternative to Neural Nets

Despite the success of neural networks (NNs), there is still a concern among many over their ‘black box’ nature. Why do they work? Here we present a simple analytic argument that NNs are in fact essentially polynomial regression models. This view will have various implications for NNs, e.g. providing an explanation for why convergence problems arise in NNs, and it gives rough guidance on avoiding overfitting. In addition, we use this phenomenon to predict and confirm a multicollinearity property of NNs not previously reported in the literature. Most importantly, given this loose correspondence, one may choose to routinely use polynomial models instead of NNs, thus avoiding some major problems of the latter, such as having to set many tuning parameters and dealing with convergence issues. We present a number of empirical results; in each case, the accuracy of the polynomial approach matches or exceeds that of NN approaches. A many-featured, open-source software package, polyreg, is available.

Implicit Policy for Reinforcement Learning

We introduce Implicit Policy, a general class of expressive policies that can flexibly represent complex action distributions in reinforcement learning, with efficient algorithms to compute entropy regularized policy gradients. We empirically show that, despite its simplicity in implementation, entropy regularization combined with a rich policy class can attain desirable properties displayed under the maximum entropy reinforcement learning framework, such as robustness and multi-modality.

Exponents of primitive symmetric companion matrices
PAC-Bayes bounds for stable algorithms with instance-dependent priors
Fast and Accurate Multiclass Inference for MI-BCIs Using Large Multiscale Temporal and Spectral Features
Automated Bridge Component Recognition using Video Data
‘What’s ur type?’ Contextualized Classification of User Types in Marijuana-related Communications using Compositional Multiview Embedding
Temporal coherence-based self-supervised learning for laparoscopic workflow analysis
Eigenvectors of non normal random matrices
Quantile Regression of Latent Longitudinal Trajectory Features
Deep Spatiotemporal Representation of the Face for Automatic Pain Intensity Estimation
Towards an efficient deep learning model for musical onset detection
The Algebraic Connectivity of a Graph and its Complement
Kid-Net: Convolution Networks for Kidney Vessels Segmentation from CT-Volumes
Online Absolute Ranking with Partial Information: A Bipartite Graph Matching Approach
Modularity Matters: Learning Invariant Relational Reasoning Tasks
Robust model selection between population growth and multiple merger coalescents
Sustainable Inventory with Robust Periodic-Affine Policies and Application to Medical Supply Chains
Unsupervised Word Segmentation from Speech with Attention
AccaSim: a Customizable Workload Management Simulator for Job Dispatching Research in HPC Systems
Zip Trees
Numerical Evaluation of Elliptic Functions, Elliptic Integrals and Modular Forms
Towards Manipulability of Interactive Lagrangian Systems
Assessing robustness of radiomic features by image perturbation
SMOGS: Social Network Metrics of Game Success
Asymmetric Hopfield neural network and twisted tetrahedron equation
Towards multi-instrument drum transcription
Where to Go Next: A Spatio-temporal LSTM model for Next POI Recommendation
Critical Ising model on random triangulations of the disk: enumeration and local limits
The Origin and the Resolution of Nonuniqueness in Linear Rational Expectations
Contraction Analysis of Geodesically Convex Optimization
Feedback pinning control of collective behaviors aroused by epidemic spread on complex networks
On a kinetic Elo rating model for players with dynamical strength
Adaptive transmission for radar arrays using Weiss-Weinstein bounds
Gradient Descent-based D-optimal Design for the Least-Squares Polynomial Approximation
Dense People Counting Using IR-UWB Radar with a Hybrid Feature Extraction Method
Cardinality Leap for Open-Ended Evolution: Theoretical Consideration and Demonstration by ‘Hash Chemistry’
On Enhancing Speech Emotion Recognition using Generative Adversarial Networks
Banach Wasserstein GAN
A Flow Formulation for Horizontal Coordinate Assignment with Prescribed Width
Generic Existence of Unique Lagrange Multipliers in AC Optimal Power Flow
On Multi-resident Activity Recognition in Ambient Smart-Homes
A generalized Turán problem in random graphs
Optimal Resource Allocation in Full-Duplex Ambient Backscatter Communication Networks for Wireless-Powered IoT
How complex is a random picture?
Uncertainty in multitask learning: joint representations for probabilistic MR-only radiotherapy planning
Deep Recurrent Neural Network for Multi-target Filtering
Dynamic Programming for Finite Ensembles of Nanomagnetic Particles
Spectral Functions of One-Dimensional Systems with Correlated Disorder
Mining frequent items in unstructured P2P networks
Resetting Disturbance Observers with application in Compensation of bounded nonlinearities like Hysteresis in Piezo-Actuators
VEBO: A Vertex- and Edge-Balanced Ordering Heuristic to Load Balance Parallel Graph Processing
RenderNet: A deep convolutional network for differentiable rendering from 3D shapes
Distributed learning with compressed gradients
SubGram: Extending Skip-gram Word Representation with Substrings
Learning from Outside the Viability Kernel: Why we Should Build Robots that can Fall with Grace
Detour-saturated graphs of small girths
A cautionary tale on using imputation methods for inference in matched pairs design
A Simple Reservoir Model of Working Memory with Real Values
Partitioning Compute Units in CNN Acceleration for Statistical Memory Traffic Shaping
Modeling Musical Taste Evolution with Recurrent Neural Networks
Segmentation of Photovoltaic Module Cells in Electroluminescence Images
A Frequency Domain Bootstrap for General Stationary Processes
GRPF: Global Complex Roots and Poles Finding Algorithm Based on Phase Analysis
Stability of Conditional Sequential Monte Carlo
The Information Autoencoding Family: A Lagrangian Perspective on Latent Variable Generative Models
Semi-tied Units for Efficient Gating in LSTM and Highway Networks
Minimal time impulse control of the heat equation
An Ensemble of Transfer, Semi-supervised and Supervised Learning Methods for Pathological Heart Sound Classification
A unified strategy for implementing curiosity and empowerment driven reinforcement learning
Deforming Autoencoders: Unsupervised Disentangling of Shape and Appearance
Conditional Affordance Learning for Driving in Urban Environments
Optimal Infinite Horizon Decentralized Networked Controllers with Unreliable Communication
Detecting Zero-day Controller Hijacking Attacks on the Power-Grid with Enhanced Deep Learning
Moment-based Bayesian Poisson Mixtures for inferring unobserved units
Attack Detection and Isolation for Discrete-Time Nonlinear Systems
Study of BEM-Type Channel Estimation Techniques for 5G Multicarrier Systems
Co-training Embeddings of Knowledge Graphs and Entity Descriptions for Cross-lingual Entity Alignment
Variable Importance Assessments and Backward Variable Selection for High-Dimensional Data
The RBO Dataset of Articulated Objects and Interactions
Learning Policy Representations in Multiagent Systems
Experimental Design of a Prescribed Burn Instrumentation
Fast Convex Pruning of Deep Neural Networks
Utilizing Provenance in Reusable Research Objects
Hölder Error Bounds and Hölder Calmness with Applications to Convex Semi-Infinite Optimization
Conway-Coxeter friezes and mutation: a survey
Predicting Switching Graph Labelings with Cluster Specialists
Compressed Sensing with Deep Image Prior and Learned Regularization
Combinatorial manifolds are Hamiltonian
The component structure of dense random subgraphs of the hypercube
Elements of Finite Order in the Riordan Group
Subspace Embedding and Linear Regression with Orlicz Norm
On Sketching the $q$ to $p$ norms
ZICS: an application for calculating the stationary probability distribution of stochastic reaction networks
Property Testing for Differential Privacy
Efficient Beam Alignment for mmWave Single-Carrier Systems with Hybrid MIMO Transceivers
A Novel Hybrid Machine Learning Model for Auto-Classification of Retinal Diseases
Learning to Evaluate Image Captioning
Greedy and Local Ratio Algorithms in the MapReduce Model
Comparison of DCO-OFDM and M-PAM for LED-Based Communication Systems
High-speed Tracking with Multi-kernel Correlation Filters
Refined enumeration of vertices among all rooted ordered $d$-trees
Pulsing dynamics in randomly wired complex cellular automata
Feature Learning and Classification in Neuroimaging: Predicting Cognitive Impairment from Magnetic Resonance Imaging
Measuring Semantic Coherence of a Conversation
Gated Path Planning Networks
An Improved Text Sentiment Classification Model Using TF-IDF and Next Word Negation
High Speed Kernelized Correlation Filters without Boundary Effect
On APF Test for Poisson Process with Shift and Scale Parameters
Geometric mean extension for data sets with zeros
Poisson Source Localization on the Plane. Cusp Case
MedGAN: Medical Image Translation using GANs
Robust Trajectory and Transmit Power Design for Secure UAV Communications
Task-Relevant Object Discovery and Categorization for Playing First-person Shooter Games
A sharp symmetrized form of Talagrand’s transport-entropy inequality for the Gaussian measure
Multi-variable LSTM neural network for autoregressive exogenous model
On Cusp Location Estimation for Perturbed Dynamical Systems
Poisson Source Localization on the Plane. Smooth Case
Poisson Source Localization on the Plane. Change-Point Case
Warming trend in cold season of the Yangtze River Delta and its correlation with Siberian high
Method of Moments Estimators and Multi-step MLE for Poisson Processes
Nonparametric Empirical Bayes Simultaneous Estimation for Multiple Variances
Geodesic Convex Optimization: Differentiation on Manifolds, Geodesics, and Convexity
Multimodal Grounding for Language Processing
Age Dependent Hawkes Process
On dual stable Grothendieck polynomials and their sums
How Could Polyhedral Theory Harness Deep Learning?
Exact information propagation through fully-connected feed forward neural networks
Effect of Climate and Geography on worldwide fine resolution economic activity
Incorporating Chinese Characters of Words for Lexical Sememe Prediction
Universal Nonlinear Disordered Wave Packet Subdiffusion: 12 Decades
Extending Recurrent Neural Aligner for Streaming End-to-End Speech Recognition in Mandarin
Improving Network Availability of Ultra-Reliable and Low-Latency Communications with Multi-Connectivity
On the Q-linear convergence of forward-backward splitting method and uniqueness of optimal solution to Lasso
The Structure and Evolution of an Offline Peer-to-Peer Financial Network
A Gradient Tree Boosting based Approach to Rumor Detecting on Sina Weibo
Approximate Submodular Functions and Performance Guarantees
Comparative survey of visual object classifiers
Laplacian Smoothing Gradient Descent
ON $(\triangle, 1)$-GRAPHS
Biased Embeddings from Wild Data: Measuring, Understanding and Removing
Finding Short Synchronizing Words for Prefix Codes
Deformable Generator Network: Unsupervised Disentanglement of Appearance and Geometry
A nonparametric spatial test to identify factors that shape a microbiome
Right for the Right Reason: Training Agnostic Networks
Detecting intrusions in control systems: a rule of thumb, its justification and illustrations
Medium Access Control in Wireless Network-on-Chip: A Context Analysis
Average-Case Lower Bounds and Satisfiability Algorithms for Small Threshold Circuits
A group-theoretical approach to conditionally free cumulants
NOMA-based Energy-Efficient Wireless Powered Communications
Sensitivity-driven adaptive construction of reduced-space surrogates
Latent Convolutional Models
Generalizations of the Durand-Kerner method
Integrating Deliberation and Voting in Participatory Drafting of Legislation
General tax structures for a Lévy insurance risk process under the Cramér condition
On the Discrepancy Normed Space of Event Sequences for Threshold-based Sampling
Stable Prediction across Unknown Environments
Cooperative colorings of trees and of bipartite graphs
On Strategyproof Conference Peer Review
Evaluation of sentence embeddings in downstream and linguistic probing tasks
Efficient Crowdsourcing via Proxy Voting
PATRICIA bridges
Real-time Prediction of Segmentation Quality
Information Aging through Queues: A Mutual Information Perspective
PeerReview4All: Fair and Accurate Reviewer Assignment in Peer Review
Near-optimal mean estimators with respect to general norms
Binary Classification in Unstructured Space With Hypergraph Case-Based Reasoning
Adaptive estimating function inference for non-stationary determinantal point processes
Nonsmooth Aggregative Games with Coupling Constraints and Infinitely Many Classes of Players
Multimodal Sentiment Analysis using Hierarchical Fusion with Context Modeling
Observer-Based Controller Design for Systems on Manifolds in Euclidean Space
Advice Complexity of Priority Algorithms
Joint Input-Label Embedding for Neural Text Classification
Sharp Analytical Capacity Upper Bounds for Sticky and Related Channels
Characterization of cycle obstruction sets for improper coloring planar graphs
Offline Extraction of Indic Regional Language from Natural Scene Image using Text Segmentation and Deep Convolutional Sequence
Meta-learning: searching in the model space
TrQuery: An Embedding-based Framework for Recommanding SPARQL Queries
Study of Semi-supervised Approaches to Improving English-Mandarin Code-Switching Speech Recognition
A Combinatorial Method for Computing Characteristic Polynomials of Starlike Hypergraphs
Part-Aware Fine-grained Object Categorization using Weakly Supervised Part Detection Network
Show, Attend and Translate: Unsupervised Image Translation with Self-Regularization and Attention
Wavelet regression: An approach for undertaking multi-time scale analyses of hydro-climate relationships
Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning
Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents
EdgeChain: An Edge-IoT Framework and Prototype Based on Blockchain and Smart Contracts
The Neural Painter: Multi-Turn Image Generation
Fast Distance Sensitivity Oracle for Multiple Failures
Power-Temperature Stability and Safety Analysis for Multiprocessor Systems
Semi-supervised Inference for Explained Variance in High-dimensional Linear Regression and Its Applications
Component SPD Matrices: A lower-dimensional discriminative data descriptor for image set classification
Riemannian kernel based Nyström method for approximate infinite-dimensional covariance descriptors with application to image set classification
Learning Factorized Multimodal Representations
On the Complexity of Detecting Convexity over a Box
Semantic Video Segmentation: A Review on Recent Approaches
Sufficiency of Deterministic Policies for Atomless Discounted and Uniformly Absorbing MDPs with Multiple Criteria
A Micro-Scale Mobile-Enabled Implantable Medical Sensor
BaRC: Backward Reachability Curriculum for Robotic Reinforcement Learning
Object Level Visual Reasoning in Videos
Cramér-type Large deviation and non-uniform central limit theorems in high dimensions
Efficient Data Perturbation for Privacy Preserving and Accurate Data Stream Mining
Teaching computational reproducibility for neuroimaging
Morse Theory and an Impossibility Theorem for Graph Clustering
Straggler-Resilient and Communication-Efficient Distributed Iterative Linear Solver
Tight Bound of Incremental Cover Trees for Dynamic Diversification
Impact of Channel Models on the End-to-End Performance of mmWave Cellular Networks
On the Relationship between Data Efficiency and Error for Uncertainty Sampling
Fairness Under Composition
Generalized Dynamic Programming Principle and Sparse Mean-Field Control Problems
Stochastic WaveNet: A Generative Latent Variable Model for Sequential Data
Instrumental variables regression
Non-Negative Networks Against Adversarial Attacks
The Limits of Post-Selection Generalization
Unsupervised Training for 3D Morphable Model Regression
Arithmetic Circuits with Locally Low Algebraic Rank
Crime Event Embedding with Unsupervised Feature Selection
A Polynomial-Time Algorithm for 2-stable Instances of the k-terminal cut Problem
Minibatch Gibbs Sampling on Large Graphical Models
Stability Conditions for Cluster Synchronization in Networks of Kuramoto Oscillators
Uncertain Fate of Fair Sampling in Quantum Annealing
FPGA acceleration of Model Predictive Control for Iter Plasma current and shape control
On the algorithmic complexity of finding hamiltonian cycles in special classes of planar cubic graphs
Two Groups in a Curie-Weiss Model with Heterogeneous Coupling
Formulations for designing robust networks. An application to wind power collection
Data-Driven Decentralized Optimal Power Flow
Solving the Steiner Tree Problem in graphs with Variable Neighborhood Descent
VTracker: Impact of User Factors On Users’ Intention to Adopt Dietary Intake Monitoring System with Auto Workout Tracker