Unsupervised Domain Adaptation with Copula Models

We study the task of unsupervised domain adaptation, where no labeled data from the target domain is provided during training time. To deal with the potential discrepancy between the source and target distributions, both in features and labels, we exploit a copula-based regression framework. The benefits of this approach are two-fold: (a) it allows us to model a broader range of conditional predictive densities beyond the common exponential family, (b) we show how to leverage Sklar’s theorem, the essence of the copula formulation relating the joint density to the copula dependency functions, to find effective feature mappings that mitigate the domain mismatch. By transforming the data to a copula domain, we show on a number of benchmark datasets (including human emotion estimation), and using different regression models for prediction, that we can achieve a more robust and accurate estimation of target labels, compared to recently proposed feature transformation (adaptation) methods.
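
For reference, Sklar's theorem as invoked above can be written out as follows. This is the standard textbook statement for continuous marginals with densities, not a formula quoted from the paper; the symbols F_j, f_j, C and c are notation chosen here for illustration.

```latex
% Sklar's theorem: a joint CDF F with continuous marginals F_1, ..., F_d
% factors through a unique copula C.
F(x_1, \dots, x_d) = C\bigl(F_1(x_1), \dots, F_d(x_d)\bigr)

% Differentiating gives the joint density in terms of the copula density c
% and the marginal densities f_j, with u_j = F_j(x_j).
f(x_1, \dots, x_d) = c(u_1, \dots, u_d) \prod_{j=1}^{d} f_j(x_j),
\qquad u_j = F_j(x_j)
```

The mapping x_j to u_j = F_j(x_j) is what "transforming the data to a copula domain" usually refers to; how the paper estimates the marginals is not stated in the abstract.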


Fully Bayesian Estimation Under Informative Sampling

Bayesian estimation is increasingly popular for performing model based inference to support policymaking. These data are often collected from surveys under informative sampling designs where subject inclusion probabilities are designed to be correlated with the response variable of interest. Sampling weights constructed from marginal inclusion probabilities are typically used to form an exponentiated pseudo likelihood that adjusts the population likelihood for estimation on the sample. We propose an alternative adjustment based on a Bayes rule construction that simultaneously performs weight smoothing and estimates the population model parameters in a fully Bayesian construction. We formulate conditions on known marginal and pairwise inclusion probabilities that define a class of sampling designs where L_{1} consistency of the joint posterior is guaranteed. We compare performances between the two approaches on synthetic data.
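
To make the baseline concrete, the exponentiated pseudo likelihood mentioned above can be sketched as a weighted log-likelihood. The sketch below assumes a normal model with known variance and weights rescaled to sum to the sample size; both choices, and the function names, are illustrative assumptions and not taken from the paper. It does not implement the paper's fully Bayesian alternative.

```python
import math


def pseudo_log_likelihood(y, weights, mu, sigma=1.0):
    """Log of prod_i p(y_i | mu)^{w_i} for a normal model with known sigma.

    Weights are rescaled to sum to len(y); this normalization is an
    assumption made for the sketch.
    """
    if len(y) != len(weights) or not y:
        raise ValueError("y and weights must be non-empty and equally long")
    scale = len(y) / sum(weights)
    total = 0.0
    for yi, wi in zip(y, weights):
        log_density = (-0.5 * math.log(2.0 * math.pi * sigma ** 2)
                       - (yi - mu) ** 2 / (2.0 * sigma ** 2))
        total += scale * wi * log_density
    return total


def weighted_mean(y, weights):
    """Maximizer of the pseudo log-likelihood in mu for this normal model."""
    return sum(w * v for w, v in zip(weights, y)) / sum(weights)
```

In this toy model the pseudo likelihood is maximized at the weighted mean rather than the plain sample mean, which is how the weights shift estimation from the sample back toward the population.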


Toward a System Building Agenda for Data Integration

In this paper we argue that the data management community should devote far more effort to building data integration (DI) systems, in order to truly advance the field. Toward this goal, we make three contributions. First, we draw on our recent industrial experience to discuss the limitations of current DI systems. Second, we propose an agenda to build a new kind of DI system to address these limitations. These systems guide users through the DI workflow, step by step. They provide tools to address the ‘pain points’ of the steps, and the tools are built on top of the Python data science and Big Data ecosystem (PyData). We discuss how to foster an ecosystem of such tools within PyData, then use it to build DI systems for collaborative/cloud/crowd/lay user settings. Finally, we discuss ongoing work at Wisconsin, which suggests that these DI systems are highly promising and that building them raises many interesting research challenges.


Complex Correntropy Function: properties, and application to a channel equalization problem

The use of correntropy as a similarity measure has been increasing in different scenarios due to its well-known ability to extract high-order statistical information from data. Recently, a new similarity measure between complex random variables was defined and called complex correntropy. Based on a Gaussian kernel, it extends the benefits of correntropy to complex-valued data. However, its properties have not yet been formalized. This paper studies the properties of this new similarity measure and extends its definition to positive-definite kernels. Complex correntropy is then applied to a channel equalization problem, where it achieves good results compared with other algorithms such as the complex least mean square (CLMS), complex recursive least squares (CRLS), and least absolute deviation (LAD).
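
A sample estimate of a Gaussian-kernel similarity between two complex sequences might look like the sketch below. It assumes the kernel is normalized as 1/(2*pi*sigma^2) and applied to the modulus of the difference; the abstract does not give the formula, so the normalization and the function name are assumptions.

```python
import math


def complex_correntropy(x, y, sigma=1.0):
    """Average Gaussian kernel of the differences x_n - y_n.

    Assumed kernel: exp(-|z|^2 / (2 sigma^2)) / (2 pi sigma^2), where z is
    the complex difference. Python's abs() returns the modulus for complex
    numbers, so real and imaginary parts are both taken into account.
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and equally long")
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)
    total = 0.0
    for a, b in zip(x, y):
        total += math.exp(-abs(a - b) ** 2 / (2.0 * sigma ** 2))
    return norm * total / len(x)
```

Under this assumed form the value is largest when the two sequences coincide and decays as they move apart, with sigma controlling how quickly large errors are discounted.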


Toward Scalable Machine Learning and Data Mining: the Bioinformatics Case

In an effort to overcome the data deluge in computational biology and bioinformatics and to facilitate bioinformatics research in the era of big data, we identify some of the most influential algorithms that have been widely used in the bioinformatics community. These top data mining and machine learning algorithms cover classification, clustering, regression, graphical model-based learning, and dimensionality reduction. The goal of this study is to guide the focus of scalable computing experts in the endeavor of applying new storage and scalable computation designs to bioinformatics algorithms that merit their attention most, following the engineering maxim of ‘optimize the common case’.


Matching Anonymized and Obfuscated Time Series to Users’ Profiles

Many popular applications use traces of user data to offer various services to their users; example applications include driver-assistance systems and smart home services. However, revealing user information to such applications puts users’ privacy at stake, as adversaries can infer sensitive private information about the users such as their behaviors, interests, and locations. Recent research shows that adversaries can compromise users’ privacy when they use such applications even when the traces of users’ information are protected by mechanisms like anonymization and obfuscation. In this work, we derive the theoretical bounds on the privacy of users of these applications when standard protection mechanisms are deployed. We build on our recent study in the area of location privacy, in which we introduced formal notions of location privacy for anonymization-based location privacy-protection mechanisms. More specifically, we derive the fundamental limits of user privacy when both anonymization and obfuscation-based protection mechanisms are applied to users’ time series of data. We investigate the impact of such mechanisms on the tradeoff between privacy protection and user utility. In particular, we study achievability results for the case where the time series of users are governed by an i.i.d. process. The converse results are proved both for the i.i.d. case and for the more general Markov chain model. We demonstrate that as the number of users in the network grows, the obfuscation-anonymization plane can be divided into two regions: in the first region, all users have perfect privacy, and, in the second region, no user has privacy.


Enabling Quality Control for Entity Resolution: A Human and Machine Cooperative Framework

Even though many machine algorithms have been proposed for entity resolution, it remains very challenging to find a solution with quality guarantees. In this paper, we propose a novel HUman and Machine cOoperative (HUMO) framework for entity resolution (ER), which divides an ER workload between machine and human. HUMO enables a mechanism for quality control that can flexibly enforce both precision and recall levels. We introduce the optimization problem of HUMO, minimizing human cost given a quality requirement, and then present three optimization approaches: a conservative baseline one purely based on the monotonicity assumption of precision, a more aggressive one based on sampling and a hybrid one that can take advantage of the strengths of both previous approaches. Finally, we demonstrate by extensive experiments on real and synthetic datasets that HUMO can achieve high-quality results with reasonable return on investment (ROI) in terms of human cost, and it performs considerably better than the state-of-the-art alternative in quality control.


Bag-of-Vector Embeddings of Dependency Graphs for Semantic Induction

Vector-space models, from word embeddings to neural network parsers, have many advantages for NLP. But how to generalise from fixed-length word vectors to a vector space for arbitrary linguistic structures is still unclear. In this paper we propose bag-of-vector embeddings of arbitrary linguistic graphs. A bag-of-vector space is the minimal nonparametric extension of a vector space, allowing the representation to grow with the size of the graph, but not tying the representation to any specific tree or graph structure. We propose efficient training and inference algorithms based on tensor factorisation for embedding arbitrary graphs in a bag-of-vector space. We demonstrate the usefulness of this representation by training bag-of-vector embeddings of dependency graphs and evaluating them on unsupervised semantic induction for the Semantic Textual Similarity and Natural Language Inference tasks.


Testing for Feature Relevance: The HARVEST Algorithm

Feature selection with high-dimensional data and a very small proportion of relevant features poses a severe challenge to standard statistical methods. We have developed a new approach (HARVEST) that is straightforward to apply, albeit somewhat computer-intensive. This algorithm can be used to pre-screen a large number of features to identify those that are potentially useful. The basic idea is to evaluate each feature in the context of many random subsets of other features. HARVEST is predicated on the assumption that an irrelevant feature can add no real predictive value, regardless of which other features are included in the subset. Motivated by this idea, we have derived a simple statistical test for feature relevance. Empirical analyses and simulations produced so far indicate that the HARVEST algorithm is highly effective in predictive analytics, both in science and business.


The Deep Ritz method: A deep learning-based numerical algorithm for solving variational problems

We propose a deep learning based method, the Deep Ritz Method, for numerically solving variational problems, particularly the ones that arise from partial differential equations. The Deep Ritz method is naturally nonlinear, naturally adaptive and has the potential to work in rather high dimensions. The framework is quite simple and fits well with the stochastic gradient descent method used in deep learning. We illustrate the method on several problems including some eigenvalue problems.


Towards Understanding the Evolution of Vocabulary Terms in Knowledge Graphs

Vocabularies are used for modeling data in Knowledge Graphs (KG) like the Linked Open Data Cloud and Wikidata. During their lifetime, the vocabularies of the KGs are subject to changes. New terms are coined, while existing terms are modified or declared as deprecated. We first quantify the amount and frequency of changes in vocabularies. Subsequently, we investigate to what extent and when the changes are adopted in the evolution of the KGs. We conduct our experiments on three large-scale KGs for which time-stamped snapshots are available, namely the Billion Triples Challenge datasets, the Dynamic Linked Data Observatory dataset, and Wikidata. Our results show that the change frequency of terms is rather low, but that changes can have a high impact when adopted on a large amount of distributed graph data on the web. Furthermore, not all coined terms are used and most of the deprecated terms are still used by data publishers. There are variations in the adoption time of terms coming from different vocabularies, ranging from very fast (a few days) to very slow (a few years). Surprisingly, there are also adoptions we could observe even before the vocabulary changes are published. Understanding this adoption is important, since otherwise it may lead to wrong assumptions about the modeling status of data published on the web and may result in difficulties when querying the data from distributed sources.


DTATG: An Automatic Title Generator based on Dependency Trees

We study automatic title generation for a given block of text and present a method called DTATG to generate titles. DTATG first extracts a small number of central sentences that convey the main meanings of the text and are in a suitable structure for conversion into a title. DTATG then constructs a dependency tree for each of these sentences and removes certain branches using a Dependency Tree Compression Model we devise. We also devise a title test to determine if a sentence can be used as a title. If a trimmed sentence passes the title test, then it becomes a title candidate. DTATG selects the title candidate with the highest ranking score as the final title. Our experiments showed that DTATG can generate adequate titles. We also showed that DTATG-generated titles have higher F1 scores than those generated by the previous methods.


libact: Pool-based Active Learning in Python

libact is a Python package designed to make active learning easier for general users. The package not only implements several popular active learning strategies, but also features the active-learning-by-learning meta-algorithm that helps users automatically select the best strategy on the fly. Furthermore, the package provides a unified interface for implementing more strategies, models and application-specific labelers. The package is open-source on GitHub, and can be easily installed from the Python Package Index repository.


Visual Reasoning with Natural Language

Natural language provides a widely accessible and expressive interface for robotic agents. To understand language in complex environments, agents must reason about the full range of language inputs and their correspondence to the world. Such reasoning over language and vision is an open problem that is receiving increasing attention. While existing data sets focus on visual diversity, they do not display the full range of natural language expressions, such as counting, set reasoning, and comparisons. We propose a simple task for natural language visual reasoning, where images are paired with descriptive statements. The task is to predict if a statement is true for the given scene. This abstract describes our existing synthetic images corpus and our current work on collecting real vision data.


Building a Structured Query Engine

Finding patterns in data and being able to retrieve information from those patterns is an important task in information retrieval. Complex search requirements that cannot be fulfilled by simple string matching and require exploring certain patterns in data demand a better query engine that supports searching via structured queries. In this article, we build a structured query engine that supports searching data through structured queries along the lines of ElasticSearch. We show how we achieved real-time indexing and retrieval of data through a RESTful API, and how complex queries can be created and processed using efficient data structures we created for storing the data in a structured way. Finally, we conclude with an example of a movie recommendation system built on top of this query engine.


Deep Abstract Q-Networks

We examine the problem of learning and planning on high-dimensional domains with long horizons and sparse rewards. Recent approaches have shown great successes in many Atari 2600 domains. However, domains with long horizons and sparse rewards, such as Montezuma’s Revenge and Venture, remain challenging for existing methods. Methods using abstraction (Dietterich 2000; Sutton, Precup, and Singh 1999) have been shown to be useful in tackling long-horizon problems. We combine recent techniques of deep reinforcement learning with existing model-based approaches using an expert-provided state abstraction. We construct toy domains that elucidate the problem of long horizons, sparse rewards and high-dimensional inputs, and show that our algorithm significantly outperforms previous methods on these domains. Our abstraction-based approach outperforms Deep Q-Networks (Mnih et al. 2015) on Montezuma’s Revenge and Venture, and exhibits backtracking behavior that is absent from previous methods.


Factor selection by permutation

Researchers often have data measuring features x_{ij} of samples, such as test scores of students. In factor analysis and PCA, these features are thought to be influenced by unobserved factors, such as skills. Can we determine how many factors affect the data? Many approaches have been developed for this factor selection problem. The popular Parallel Analysis method randomly permutes each feature of the data. It selects factors if their singular values are larger than those of the permuted data. It is used by leading applied statisticians, including T Hastie, M Stephens, J Storey, R Tibshirani and WH Wong. Despite empirical evidence for its accuracy, there is currently no theoretical justification. This prevents us from knowing when it will work in the future. In this paper, we show that parallel analysis consistently selects the significant factors in certain high-dimensional factor models. The intuition is that permutations keep the noise invariant, while ‘destroying’ the low-rank signal. This provides justification for permutation methods in PCA and factor models under some conditions. A key requirement is that the factors must load on several variables. Our work points to improvements of permutation methods.
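
The permutation idea described above can be sketched for the first factor only. The code below centers the columns, permutes each feature independently, and selects the first factor if the top singular value of the data exceeds the largest top singular value seen across permutations. Using the maximum as the threshold, checking only one factor, and estimating the singular value by power iteration are simplifications assumed here; they are not the procedure analyzed in the paper.

```python
import random


def top_singular_value(rows, iters=100):
    """Estimate the largest singular value by power iteration on X^T X."""
    n, p = len(rows), len(rows[0])
    v = [1.0 + 0.01 * j for j in range(p)]  # arbitrary non-zero start vector
    for _ in range(iters):
        u = [sum(r[j] * v[j] for j in range(p)) for r in rows]  # u = X v
        w = [sum(rows[i][j] * u[i] for i in range(n)) for j in range(p)]  # w = X^T u
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    u = [sum(r[j] * v[j] for j in range(p)) for r in rows]
    return sum(x * x for x in u) ** 0.5  # ||X v|| for the unit vector v


def center_columns(rows):
    """Subtract each column's mean so permutations cannot keep a shared mean."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]


def permute_columns(rows, rng):
    """Shuffle every feature (column) independently of the others."""
    cols = [list(c) for c in zip(*rows)]
    for c in cols:
        rng.shuffle(c)
    return [list(r) for r in zip(*cols)]


def first_factor_selected(rows, n_perm=20, seed=0):
    """Return True if the top singular value beats all permuted replicates."""
    rng = random.Random(seed)
    x = center_columns(rows)
    observed = top_singular_value(x)
    permuted = [top_singular_value(permute_columns(x, rng)) for _ in range(n_perm)]
    return observed > max(permuted)
```

Permuting each column leaves its marginal distribution, and hence the noise level, unchanged while breaking the alignment across columns that a shared factor produces, which matches the intuition stated in the abstract.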


Weighted-SVD: Matrix Factorization with Weights on the Latent Factors

Matrix Factorization models, sometimes called latent factor models, are a family of methods in recommender system research that (1) generate latent factors for the users and the items and (2) predict users’ ratings on items based on these latent factors. However, current Matrix Factorization models presume that all the latent factors are equally weighted, which may not always be a reasonable assumption in practice. In this paper, we propose a new model, called Weighted-SVD, which integrates the linear regression model with the SVD model such that each latent factor is accompanied by a corresponding weight parameter. This mechanism allows the latent factors to have different weights in influencing the final ratings. The complexity of the Weighted-SVD model is slightly larger than that of the SVD model but much smaller than that of the SVD++ model. We compared the Weighted-SVD model with several latent factor models on five public datasets based on the Root-Mean-Squared Error (RMSE). The results show that the Weighted-SVD model outperforms the baseline methods on all the experimental datasets under almost all settings.
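
One way to read "a weight parameter per latent factor" is as a weighted inner product in the rating predictor. The sketch below assumes the common biased-SVD form (global mean plus user and item biases) with a weight w_k multiplying the k-th factor product; the abstract does not give the exact formula, so this form and all names are assumptions.

```python
def predict_rating(mu, b_u, b_i, p_u, q_i, w):
    """Predicted rating mu + b_u + b_i + sum_k w_k * p_uk * q_ik.

    mu: global mean rating; b_u, b_i: user and item biases;
    p_u, q_i: latent factor vectors; w: one weight per latent factor.
    The bias terms follow the usual biased-SVD model and are assumed here.
    """
    if not (len(p_u) == len(q_i) == len(w)):
        raise ValueError("p_u, q_i and w must have the same length")
    return mu + b_u + b_i + sum(wk * pk * qk for wk, pk, qk in zip(w, p_u, q_i))
```

With all weights equal to one this reduces to the plain biased-SVD prediction, and it adds only one parameter per latent factor, consistent with the abstract's remark that the model is only slightly more complex than SVD.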


Attentive Convolution

In NLP, convolutional neural networks (CNNs) have benefited less than recurrent neural networks (RNNs) from attention mechanisms. We hypothesize that this is because attention in CNNs has been mainly implemented as attentive pooling (i.e., it is applied to pooling) rather than as attentive convolution (i.e., it is integrated into convolution). Convolution is the differentiator of CNNs in that it can powerfully model the higher-level representation of a word by taking into account its local fixed-size context in input text t^x. In this work, we propose an attentive convolution network, AttentiveConvNet. It extends the context scope of the convolution operation, deriving higher-level features for a word not only from local context, but also from information extracted from nonlocal context by the attention mechanism commonly used in RNNs. This nonlocal context can come (i) from parts of the input text t^x that are distant or (ii) from a second input text, the context text t^y. In an evaluation on sentence relation classification (textual entailment and answer sentence selection) and text classification, experiments demonstrate that AttentiveConvNet has state-of-the-art performance and outperforms RNN/CNN variants with and without attention.


KV-match: An Efficient Subsequence Matching Approach for Large Scale Time Series

Time series data have exploded due to the popularity of new applications, like data center management and IoT. Time series data management systems (TSDBs) have emerged to store and query the large volume of time series data. Subsequence matching is critical in many time series mining algorithms, and many approaches have been proposed. However, the shift to distributed storage systems and the performance gap make these approaches incompatible with TSDBs. To fill this gap, we propose a new index structure, KV-index, and the corresponding matching algorithm, KV-match. KV-index is a file-based structure, which can be easily implemented on local files, HDFS or HBase tables. The KV-match algorithm probes the index efficiently with a few sequential scans. Moreover, two optimization techniques, window reduction and window reordering, are proposed to further accelerate the processing. To support queries of arbitrary lengths, we extend KV-match to KV-match_{DP}, which utilizes multiple indexes of varied lengths to process the query simultaneously. A two-dimensional dynamic programming algorithm is proposed to find the optimal query segmentation. We implement our approach on both local files and HBase tables, and conduct extensive experiments on synthetic and real-world datasets. Results show that our index is of comparable size to the popular tree-style index while our query processing is orders of magnitude more efficient.


Learning Predictive Leading Indicators for Forecasting Time Series Systems with Unknown Clusters of Forecast Tasks

We present a new method for forecasting systems of multiple interrelated time series. The method learns the forecast models together with discovering leading indicators from within the system that serve as good predictors improving the forecast accuracy and a cluster structure of the predictive tasks around these. The method is based on the classical linear vector autoregressive model (VAR) and links the discovery of the leading indicators to inferring sparse graphs of Granger causality. We formulate a new constrained optimisation problem to promote the desired sparse structures across the models and the sharing of information amongst the learning tasks in a multi-task manner. We propose an algorithm for solving the problem and document on a battery of synthetic and real-data experiments the advantages of our new method over baseline VAR models as well as the state-of-the-art sparse VAR learning methods.


DeepER — Deep Entity Resolution

Entity Resolution (ER) is a fundamental problem with many applications. Machine learning (ML)-based and rule-based approaches have been widely studied for decades, with many efforts being geared towards which features/attributes to select, which similarity functions to employ, and which blocking function to use – complicating the deployment of an ER system as a turn-key system. In this paper, we present DeepER, a turn-key ER system powered by deep learning (DL) techniques. The central idea is that distributed representations and representation learning from DL can alleviate the above human efforts for tuning existing ER systems. DeepER makes several notable contributions: encoding a tuple as a distributed representation of attribute values, building classifiers using these representations and a semantic aware blocking based on LSH, and learning and tuning the distributed representations for ER. We evaluate our algorithms on multiple benchmark datasets and achieve competitive results while requiring minimal interaction with experts.


Building Chatbots from Forum Data: Model Selection Using Question Answering Metrics

We propose to use question answering (QA) data from Web forums to train chatbots from scratch, i.e., without dialog training data. First, we extract pairs of question and answer sentences from the typically much longer texts of questions and answers in a forum. We then use these shorter texts to train seq2seq models in a more efficient way. We further improve the parameter optimization using a new model selection strategy based on QA measures. Finally, we propose to use extrinsic evaluation with respect to a QA task as an automatic evaluation method for chatbots. The evaluation shows that the model achieves a MAP of 63.5% on the extrinsic task. Moreover, it can answer correctly 49.5% of the questions when they are similar to questions asked in the forum, and 47.3% of the questions when they are more conversational in style.


What Does Explainable AI Really Mean? A New Conceptualization of Perspectives

We characterize three notions of explainable AI that cut across research fields: opaque systems that offer no insight into their algorithmic mechanisms; interpretable systems where users can mathematically analyze their algorithmic mechanisms; and comprehensible systems that emit symbols enabling user-driven explanations of how a conclusion is reached. The paper is motivated by a corpus analysis of NIPS, ACL, COGSCI, and ICCV/ECCV paper titles showing differences in how work on explainable AI is positioned in various fields. We close by introducing a fourth notion: truly explainable systems, where automated reasoning is central to output crafted explanations without requiring human post-processing as the final step of the generative process.


Change Acceleration and Detection

A generalization of the Bayesian sequential change detection problem is proposed, where the change is a latent event that should be not only detected, but also accelerated. It is assumed that the sequentially collected observations are responses to treatments selected in real time. The assigned treatments not only determine the distribution of responses before and after the change, but also influence when the change happens. The problem is to find a treatment assignment rule and a stopping rule to minimize the average total number of observations subject to a bound on the false-detection probability. An intuitive solution is proposed, which is easy to implement and achieves for a large class of change-point models the optimal performance up to a first-order asymptotic approximation. A simulation study suggests the almost exact optimality of the proposed scheme under a Markovian change-point model.


Interpretable Convolutional Neural Networks

This paper proposes a method to modify traditional convolutional neural networks (CNNs) into interpretable CNNs, in order to clarify knowledge representations in high conv-layers of CNNs. In an interpretable CNN, each filter in a high conv-layer represents a certain object part. We do not need any annotations of object parts or textures to supervise the learning process. Instead, the interpretable CNN automatically assigns an object part to each filter in a high conv-layer during the learning process. Our method can be applied to different types of CNNs with different structures. The clear knowledge representation in an interpretable CNN can help people understand the logic inside a CNN, i.e., the patterns on which the CNN bases its decisions. Experiments showed that filters in an interpretable CNN were more semantically meaningful than those in traditional CNNs.


Event Identification as a Decision Process with Non-linear Representation of Text

We propose the scale-free Identifier Network (sfIN), a novel model for event identification in documents. In general, sfIN first encodes a document into multi-scale memory stacks, then extracts special events by conducting multi-scale actions, which can be considered a special type of sequence labelling. The design of large-scale actions makes processing a long document more efficient. The whole model is trained with both supervised learning and reinforcement learning.


A concatenating framework of shortcut convolutional neural networks

It is well accepted that convolutional neural networks play an important role in learning excellent features for image classification and recognition. However, traditionally they only allow adjacent layers to be connected, limiting the integration of multi-scale information. To further improve their performance, we present a concatenating framework of shortcut convolutional neural networks. This framework can concatenate multi-scale features by shortcut connections to the fully-connected layer that is directly fed to the output layer. We conduct a large number of experiments to investigate the performance of the shortcut convolutional neural networks on many benchmark visual datasets for different tasks. The datasets include AR, FERET, FaceScrub, CelebA for gender classification, CUReT for texture classification, MNIST for digit recognition, and CIFAR-10 for object recognition. Experimental results show that the shortcut convolutional neural networks can achieve better results than the traditional ones on these tasks, with more stability in different settings of pooling schemes, activation functions, optimizations, initializations, kernel numbers and kernel sizes.


Time Series Management Systems: A Survey

The collection of time series data increases as more monitoring and automation are being deployed. These deployments range in scale from an Internet of Things (IoT) device located in a household to enormous distributed Cyber-Physical Systems (CPSs) producing large volumes of data at high velocity. To store and analyze these vast amounts of data, specialized Time Series Management Systems (TSMSs) have been developed to overcome the limitations of general purpose Database Management Systems (DBMSs) for time series management. In this paper, we present a thorough analysis and classification of TSMSs developed through academic or industrial research and documented through publications. Our classification is organized into categories based on the architectures observed during our analysis. In addition, we provide an overview of each system with a focus on the motivational use case that drove the development of the system, the functionality for storage and querying of time series a system implements, the components the system is composed of, and the capabilities of each system with regard to Stream Processing and Approximate Query Processing (AQP). Last, we provide a summary of research directions proposed by other researchers in the field and present our vision for a next generation TSMS.


An Updated Literature Review of Distance Correlation and its Applications to Time Series

The concept of distance covariance/correlation was introduced recently to characterize dependence among vectors of random variables. We review some statistical aspects of the distance covariance/correlation function and demonstrate its applicability to time series analysis. We will see that the auto-distance covariance/correlation function is able to identify nonlinear relationships and can be employed for testing the i.i.d. hypothesis. Comparisons with other measures of dependence are included.
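
As an illustration of the quantity under review, the sketch below computes the sample distance correlation of two scalar samples using the standard double-centered distance matrices. It is a plain O(n^2) version written for clarity, and the function names are chosen here; it is not code from the review.

```python
def _centered_distances(values):
    """Pairwise |a - b| distances, double-centered by row, column and grand mean."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]  # equal to column means by symmetry
    grand = sum(row_means) / n
    return [[d[i][j] - row_means[i] - row_means[j] + grand for j in range(n)]
            for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation between two equally long scalar samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)  # squared sample distance covariance
    denom = (mean_product(a, a) * mean_product(b, b)) ** 0.5
    if denom == 0.0:
        return 0.0  # at least one sample is constant
    return (max(dcov2, 0.0) / denom) ** 0.5
```

For a series `s` and lag `k`, calling `distance_correlation(s[:-k], s[k:])` gives one simple sample version of the auto-distance correlation at that lag; for a symmetric sample with `y = x**2` the Pearson correlation is zero while the distance correlation stays clearly positive.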


Hyperfield Grassmannians
Hierarchical modeling of molecular energies using a deep neural network
Lower bounds on the lattice-free rank for packing and covering integer programs
Learning the Exact Topology of Undirected Consensus Networks
Gradient Flows in Filtering and Fisher-Rao Geometry
CARMA: Contention-aware Auction-based Resource Management in Architecture
A Gaussian mixture model representation of endmember variability in hyperspectral unmixing
Distance-based Depths for Directional Data
Extremal Threshold Graphs for Matchings and Independent Sets
Ergodicity versus non-ergodicity for Probabilistic Cellular Automata on rooted trees
Language-depedent I-Vectors for LRE15
On diregular digraphs with degree two and excess three
On spectral and numerical properties of random butterfly matrices
Inclusive Prime Number Races
User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient
HEPCloud, a New Paradigm for HEP Facilities: CMS Amazon Web Services Investigation
Reconstruction from Periodic Nonlinearities, With Applications to HDR Imaging
Graphs, Skeleta and Reconstruction of Polytopes
An Efficient Load Balancing Method for Tree Algorithms
3DOF Pedestrian Trajectory Prediction Learned from Long-Term Autonomous Mobile Robot Deployment Data
A Cheeger-type exponential bound for the number of triangulated manifolds
Dense RGB-D semantic mapping with Pixel-Voxel neural network
A Game-Theoretic Method for Multi-Period Demand Response: Revenue Maximization, Power Allocation, and Asymptotic Behavior
A Multi-Resolution Model for Non-Gaussian Random Fields on a Sphere with Application to Ionospheric Electrostatic Potentials
Enhanced Linear-array Photoacoustic Beamforming using Modified Coherence Factor
Accelerated Directional Search and Derivative-Free Methods with non-Euclidean prox-structure
Speaker Role Contextual Modeling for Language Understanding and Dialogue Policy Learning
Dynamic Time-Aware Attention to Speaker Roles and Contexts for Spoken Language Understanding
Continuous-Time Relationship Prediction in Dynamic Heterogeneous Information Networks
PCANet-II: When PCANet Meets the Second Order Pooling
On heat kernel decay for the random conductance model
Group-labeled light dual multinets in the projective plane (with Appendix)
Multi-Scale Pipeline for the Search of String-Induced CMB Anisotropies
A Many-Objective Evolutionary Algorithm with Angle-Based Selection and Shift-Based Density Estimation
Full-Duplex Relay Selection in Cognitive Underlay Networks
Deep learning for source camera identification on mobile devices
Variational Grid Setting Network
Unsupervised Segmentation of Action Segments in Egocentric Videos using Gaze
Unsupervised Classification of Intrusive Igneous Rock Thin Section Images using Edge Detection and Colour Analysis
Power analysis for a linear regression model when regressors are matrix sampled
Where computer vision can aid physics: dynamic cloud motion forecasting from satellite images
New binary and ternary LCD codes
Improved Training for Self-Training
Spherical embeddings of symmetric association schemes in 3-dimensional Euclidean space
Parameterized Algorithms for Conflict-free Colorings of Graphs
Decontamination of Mutual Contamination Models
Clustering and Hitting Times of Threshold Exceedances and Applications
Robust Photometric Stereo Using Learned Image and Gradient Dictionaries
Robust Surface Reconstruction from Gradients via Adaptive Dictionary Regularization
Fast Fine-grained Image Classification via Weakly Supervised Discriminative Localization
A Coin-Tossing Conundrum
Homomorphisms are indeed a good basis for counting: Three fixed-template dichotomy theorems, for the price of one
Spectral distributions of periodic random matrix ensembles
DeepWheat: Estimating Phenotypic Traits From Images of Crops Using Deep Learning
The graph theory general position problem on some interconnection networks
Gaussian Three-Dimensional kernel SVM for Edge Detection Applications
Laplacian Simplices Associated to Digraphs
Harmonic analysis and inference for spherical distributions
Fine-grained Event Learning of Human-Object Interaction with LSTM-CRF
Bayesian estimation from few samples: community detection and related problems
Distributed and Managed: Research Challenges and Opportunities of the Next Generation Cyber-Physical Systems
DREMS-OS: An Operating System for Managed Distributed Real-time Embedded Systems
On the Complexity of Chore Division
What Words Do We Use to Lie?: Word Choice in Deceptive Messages
Heptavalent symmetric graphs with solvable stabilizers admitting vertex-transitive non-abelian simple groups
Image Dehazing using Bilinear Composition Loss Function
A Versatile Approach to Evaluating and Testing Automated Vehicles based on Kernel Methods
Efficient and Effective Single-Document Summarizations and A Word-Embedding Measurement of Quality
A Lottery Model for Center-type Problems With Outliers
A Moving-Horizon Hybrid Stochastic Game for Secure Control of Cyber-Physical Systems
Translating Videos to Commands for Robotic Manipulation with Deep Recurrent Neural Networks
Domination game on uniform hypergraphs
Velocity Field Generation for Density Control of Swarms using Heat Equation and Smoothing Kernels
Pyramidal RoR for Image Classification
Load Balancing in Hypergraphs
Personalized Fuzzy Text Search Using Interest Prediction and Word Vectorization
The Crowdfunding Game
FPT-algorithms for The Shortest Lattice Vector and Integer Linear Programming Problems
Mutual Information based Bayesian Analysis of Power System Reliability
The Width and Integer Optimization on Simplices With Bounded Minors of the Constraint Matrices
Parameter Sharing Deep Deterministic Policy Gradient for Cooperative Multi-agent Reinforcement Learning
Fully Automated Fact Checking Using External Sources
The Edwards-Wilkinson limit of the random heat equation in dimensions three and higher
Robust Tuning Datasets for Statistical Machine Translation
Large deviations for level sets of branching Brownian motion and Gaussian free fields
A New Property of Random Regular Bipartite Graphs
On the Limit Distributions Associated with Step-kind Boundary Problems
Activating the ‘Breakfast Club’: Modeling Influence Spread in Natural-World Social Networks
Mathematical foundations of matrix syntax
Multivalued matrices and forbidden configurations
On a generalization of Lie($k$): a CataLAnKe theorem
Homogenization of a Random Walk with Irreversible Rates On a Graph in $d$ Dimensions
Efficient Preconditioning for Noisy Separable NMFs by Successive Projection Based Low-Rank Approximations
Channel Hardening and Favorable Propagation in Cell-Free Massive MIMO with Stochastic Geometry
Wikipedia graph mining: dynamic structure of collective memory
Identifying Clickbait Posts on Social Media with an Ensemble of Linear Models
Ext and local cohomology modules of face rings of simplicial posets
Combinatorics of Toric Arrangements
Straggler Mitigation by Delayed Relaunch of Tasks
An Estimation-Theoretic View of Privacy
Learning event representation: As sparse as possible, but not sparser
Asymptotic Allocation Rules for a Class of Dynamic Multi-armed Bandit Problems
Creating a Social Brain for Cooperative Connected Autonomous Vehicles: Issues and Challenges
Large-Scale Quadratically Constrained Quadratic Program via Low-Discrepancy Sequences
Patrolling a Path Connecting a Set of Points with Unbalanced Frequencies of Visits
Oracle Importance Sampling for Stochastic Simulation Models
HUMOR: A Crowd-Annotated Spanish Corpus for Humor Analysis
Margin Sample Mining Loss: A Deep Learning Based Method for Person Re-identification
Breathing multichimera states in nonlocally coupled phase oscillators
Performance analysis of FSO using relays and spatial diversity under log-normal fading channel
DeepSafe: A Data-driven Approach for Checking Adversarial Robustness in Neural Networks
SE3-Pose-Nets: Structured Deep Dynamics Models for Visuomotor Planning and Control
The Dutch’s Real World Financial Institute: Introducing Quantum-Like Bayesian Networks as an Alternative Model to deal with Uncertainty
Log-majorizations for the (symplectic) eigenvalues of the Cartan barycenter
Online control of the false discovery rate with decaying memory
Bivariate Exponentiated Generalized Linear Exponential Distribution with Applications in Reliability Analysis
Hamiltonicity in random graphs is born resilient
Depth estimation using structured light flow — analysis of projected pattern flow on an object’s surface —
Temporal shape super-resolution by intra-frame motion encoding using high-fps structured light
Plethysm and fast matrix multiplication
Regular Sequences of Quasi-Nonexpansive Operators and Their Applications
The Strategy of Experts for Repeated Predictions
Sparse Doppler Sensing
Distributed Optimization for Coordinated Beamforming in Multi-Cell Multigroup Multicast Systems: Power Minimization and SINR Balancing
Design and Performance Analysis of Dual and Multi-hop Diffusive Molecular Communication Systems
Crude EEG parameter provides sleep medicine with well-defined continuous hypnograms
Diffusive Molecular Communication with Nanomachine Mobility
Random iterations of homeomorphisms on the circle
The branching-ruin number and the critical parameter of once-reinforced random walk on trees
Indirect Match Highlights Detection with Deep Convolutional Neural Networks
Raman spectroscopy of femtosecond multi-pulse irradiation of vitreous silica: experiment and simulation
Quantum Multicritical Phenomena in Disordered Weyl Semimetal
Remote Sensing Image Classification with Large Scale Gaussian Processes
On minimax nonparametric estimation of signal in Gaussian noise
sgmcmc: An R Package for Stochastic Gradient Markov Chain Monte Carlo
Orthogonal Vectors Indexing
Scalable Bayesian regression in high dimensions with multiple data sources
Lasso Regularization Paths for NARMAX Models via Coordinate Descent
Solving Two Conjectures regarding Codes for Location in Circulant Graphs
Constrained Differential Privacy for Count Data
Large deviations for the annealed Ising model on inhomogeneous random graphs: spins and degrees
Improving Spark Application Throughput Via Memory Aware Task Co-location: A Mixture of Experts Approach
Out-of-focus Blur: Image De-blurring
Adaptive Smoothing in fMRI Data Processing Neural Networks
Central limit theorem for the quenched path measures for the continuous directed polymer in $d\geq 3$ in weak disorder
Deep Convolutional Neural Networks for Interpretable Analysis of EEG Sleep Stage Scoring
Factorial characters of classical Lie groups and their combinatorial realisations
A Justification of Conditional Confidence Intervals
Analysis of Feedback Error in Automatic Repeat reQuest
Low Complexity Modem Structure for OFDM-based Orthogonal Time Frequency Space Modulation
The heptagon-wheel cocycle in the Kontsevich graph complex
Parameterized Approximation Schemes for Steiner Trees with Small Number of Steiner Vertices
Conditional Chromatic Filtering with Spatial Enhancement for Restoring Pansharpened Images
Profile extrema for visualizing and quantifying uncertainties on excursion regions. Application to coastal flooding
Global phase and magnitude synchronization of coupled oscillators with application to the control of grid-forming power inverters
Upper bounds for the function solution of the homogeneous 2D Boltzmann equation with hard potential
Volumes and Ehrhart polynomials of flow polytopes
Channel Estimation for TDD/FDD Massive MIMO Systems with Channel Covariance Computing
Synthesising Evolutionarily Stable Normative Systems
Revealing the Unseen: How to Expose Cloud Usage While Protecting User Privacy
A novel quantile-based decomposition of the indirect effect in mediation analysis with an application to infant mortality in the US population
DFT-Spread OFDM with Frequency Domain Reference Symbols
Learning hard quantum distributions with variational autoencoders
A Furstenberg type formula for the speed of distance stationary sequences
CHIPS: A Service for Collecting, Organizing, Processing, and Sharing Medical Image Data in the Cloud
A Hopf-algebraic approach to cumulants-moments relations and Wick polynomials
Effective Straggler Mitigation: Which Clones Should Attack and When?
Spinal cord gray matter segmentation using deep dilated convolutions
Neural Color Transfer between Images
On the smallest snarks with oddness 4 and connectivity 2
AUC Maximization with K-hyperplane
Resonance Graphs and Perfect Matchings of Graphs on Surfaces
Accelerating Scientific Data Exploration via Visual Query Systems
Stochastic Comparisons of Lifetimes of Two Series and Parallel Systems with Location-Scale Family Distributed Components having Archimedean Copulas
Sequential Deliberation for Social Choice
Exploring Graphs with Time Constraints by Unreliable Collections of Mobile Robots
Zeons, Permanents, the Johnson scheme, and Generalized Derangements
Identification of critical nodes in large-scale spatial networks
Accelerated Methods for $α$-Weakly-Quasi-Convex Problems
Local resilience of a random graph throughout its evolution
On the entropy power inequality for the Rényi entropy of order [0,1]
Colorful combinatorics and Macdonald polynomials
Compiling and Processing Historical and Contemporary Portuguese Corpora
The Capacity of Private Information Retrieval with Partially Known Private Side Information
Deep Learning for Unsupervised Insider Threat Detection in Structured Cybersecurity Data Streams
Entropy Inequalities for Sums in Prime Cyclic Groups
Detecting Adversarial Attacks on Neural Network Policies with Visual Foresight
Finding the optimal nets for self-folding Kirigami
On the smoothness of the partition function for multiple Schramm-Loewner evolutions
Testing for Global Network Structure Using Small Subgraph Statistics
Clustering Stream Data by Exploring the Evolution of Density Mountain
Rethinking Feature Discrimination and Polymerization for Large-scale Recognition
A note on perfect simulation for exponential random graph models
Local likelihood estimation of complex tail dependence structures in high dimensions, applied to U.S. precipitation extremes
Unsupervised Hypernym Detection by Distributional Inclusion Vector Embedding
Classification of Time-Series Images Using Deep Convolutional Neural Networks
Sentiment Perception of Readers and Writers in Emoji use
How is Distributed ADMM Affected by Network Topology?
Rényi Differential Privacy Mechanisms for Posterior Sampling
PIRVS: An Advanced Visual-Inertial SLAM System with Flexible Sensor Fusion and Hardware Co-Design
Detecting Epistatic Selection with Partially Observed Genotype Data Using Copula Graphical Models
Valuation of Employee Stock Options (ESOs) by means of Mean-Variance Hedging
Online and Distributed Robust Regressions under Adversarial Data Corruption
On Landauer principle and bound for infinite systems
End-to-end Learning for 3D Facial Animation from Raw Waveforms of Speech
Minimal Dependency Translation: a Framework for Computer-Assisted Translation for Under-Resourced Languages
Fine-Grained Head Pose Estimation Without Keypoints
Identifying Nominals with No Head Match Co-references Using Deep Learning
Natural boundary and zero distribution of random polynomials in smooth domains
An Extension of Clark-Haussman Formula and Applications
The ‘weak’ interdependence of infrastructure systems produces mixed percolation transitions in multilayer networks
Ordered Dags: HypercubeSort
VIDOSAT: High-dimensional Sparsifying Transform Learning for Online Video Denoising
Two-dimensional plasmons in the random impedance network model of disordered thin-film nanocomposites
Optimal Matroid Partitioning Problems
Maximum Matchings in Graphs for Allocating Kidney Paired Donation
Monte Carlo approximation certificates for k-means clustering
Bayesian Inference under Cluster Sampling with Probability Proportional to Size
GP-GAN: Gender Preserving GAN for Synthesizing Faces from Landmarks
Automated and Robust Quantification of Colocalization in Dual-Color Fluorescence Microscopy: A Nonparametric Statistical Approach
Asymptotic analysis of a multiclass queueing control problem under heavy-traffic with model uncertainty
Facial Key Points Detection using Deep Convolutional Neural Network – NaimishNet
Supervised Q-walk for Learning Vector Representation of Nodes in Networks
Random numerical semigroups and a simplicial complex of irreducible semigroups
Energy-Efficient Power and Bandwidth Allocation in an Integrated Sub-6 GHz — Millimeter Wave System
Joint Person Re-identification and Camera Network Topology Inference in Multiple Cameras
Annotation and Detection of Emotion in Text-based Dialogue Systems with CNN
On the trace of unimodal Lévy processes on Lipschitz domains
Equilibrium computation for zero sum games with submodular structure
Is Structure Necessary for Modeling Argument Expectations in Distributional Semantics?
Nearest Neighbor Imputation for Categorical Data by Weighting of Attributes
Training Feedforward Neural Networks with Standard Logistic Activations is Feasible
Dynamic range maximization in excitable networks
Learning Affinity via Spatial Propagation Networks
MMCR4NLP: Multilingual Multiway Corpora Repository for Natural Language Processing
Randomized Truncated SVD Levenberg-Marquardt Approach to Geothermal Natural State and History Matching
Estimating the decoherence time using non-commutative Functional Inequalities
Asymptotic Log-Harnack Inequality and Applications for Stochastic Systems of Infinite Memory
Degenerate SDEs with Singular Drift and Applications to Heisenberg Groups
Simulating Structure-from-Motion
Parameter estimation of platelets deposition: Approximate Bayesian computation with high performance computing
Improving approximate Bayesian computation via quasi Monte Carlo
De novo construction of q-ploid linkage maps using discrete graphical models
Pareto optimality in multilayer network growth
Generalised Mycielski graphs and the Borsuk-Ulam theorem
Resolution limits on visual speech recognition
Precise large deviations for random walk in random environment
Optimal DNN Primitive Selection with Partitioned Boolean Quadratic Programming
Some observations on computer lip-reading: moving from the dream to the reality
Learning the optimal scale for GWAS through hierarchical SNP aggregation
Long-time behaviour of generalised Zig-Zag process
Persistence probability of random Weyl polynomial
Which phoneme-to-viseme maps best improve visual-only computer lip-reading?
Adaptive p-value weighting with power optimality
Towards an Inferential Lexicon of Event Selecting Predicates for French
Isotropic and Steerable Wavelets in N Dimensions. A multiresolution analysis framework for ITK
Differential Privacy for Sets in Euclidean Space
Detection of Inferior Myocardial Infarction using Shallow Convolutional Neural Networks
Speaker-independent machine lip-reading with speaker-dependent viseme classifiers
Finding Talk About the Past in the Discourse of Non-Historians
Finding phonemes: improving machine lip-reading
Computing Top-k Closeness Centrality in Fully-dynamic Graphs
Scaling up Group Closeness Maximization
Fractional equations via convergence of forms
Spectral gaps and discrete magnetic Laplacians
Decoding visemes: improving machine lipreading
A MAP-Based Layered Detection Algorithm and Outage Analysis over MIMO Channels
Secure Private Information Retrieval from Colluding Databases with Eavesdroppers
Secrecy Outage Analysis over Correlated Composite Nakagami-$m$/Gamma Fading Channels
Ramsey expansions of $Λ$-ultrametric spaces
A family of transformed copulas with singular component
Person Re-Identification with Vision and Language
The tail does not determine the size of the giant
Multi-Pair Two Way AF Full-Duplex Massive MIMO Relaying with ZFR/ZFT Processing
On a predator-prey system with random switching that never converges to its equilibrium
Asymptotic harvesting of populations in random environments
Weak convergence of weighted additive functionals of long-range dependent fields
Keep It Real: Tail Probabilities of Compound Heavy-Tailed Distributions
Perturbation theory approaches to Anderson and Many-Body Localization: some lecture notes
netgwas: An R Package for Network-Based Genome-Wide Association Studies
Relationships between cycles spaces, gain graphs, graph coverings, path homology, and graph curvature
A D2D-based Protocol for Ultra-Reliable Wireless Communications for Industrial Automation
Coverage and Rate Analysis for Co-Existing RF/VLC Downlink Cellular Networks
Weak convergence rates for stochastic evolution equations and applications to nonlinear stochastic wave, HJMM, stochastic Schroedinger and linearized stochastic Korteweg–de Vries equations
Indexing the Event Calculus with Kd-trees to Monitor Diabetes
Dilated Convolutions for Modeling Long-Distance Genomic Dependencies
Analysis of Large Scale Web Experiments Using Sequences of Estimators
Decoding visemes: improving machine lipreading (PhD thesis)
Visual speech recognition: aligning terminologies for better understanding
Multiple domination models for placement of electric vehicle charging stations in road networks
Isotropic covariance functions on graphs and their edges
Visual gesture variability between talkers in continuous visual speech