A Fairness-aware Hybrid Recommender System

Recommender systems are used in a variety of domains affecting people’s lives. This has raised concerns about possible biases and discrimination that such systems might exacerbate. There are two primary kinds of biases inherent in recommender systems: observation bias and bias stemming from imbalanced data. Observation bias exists due to a feedback loop which causes the model to learn to only predict recommendations similar to previous ones. Imbalance in data occurs when systematic societal, historical, or other ambient bias is present in the data. In this paper, we address both biases by proposing a hybrid fairness-aware recommender system. Our model provides efficient and accurate recommendations by incorporating multiple user-user and item-item similarity measures, content, and demographic information, while addressing recommendation biases. We implement our model using a powerful and expressive probabilistic programming language called probabilistic soft logic. We experimentally evaluate our approach on a popular movie recommendation dataset, showing that our proposed model can provide more accurate and fairer recommendations, compared to a state-of-the-art fair recommender system.


FeatureAnalytics: An approach to derive relevant attributes for analyzing Android Malware

The ever-increasing number of Android malware samples has always been a concern for cybersecurity professionals. Even though plenty of anti-malware solutions exist, a rational and pragmatic approach to the problem is rare and has to be investigated further. In this paper, we propose a novel two-set feature selection approach based on Rough Set and Statistical Test, named RSST, to extract relevant system calls. To address the problem of a high-dimensional attribute set, we derive a suboptimal system call space by applying the proposed feature selection method to maximize the separability between malware and benign samples. Comprehensive experiments conducted on a dataset consisting of 3500 samples with 30 RSST-derived essential system calls resulted in an accuracy of 99.9% and an Area Under Curve (AUC) of 1.0, with a 1% False Positive Rate (FPR). In contrast, other feature selectors (Information Gain, CFsSubsetEval, ChiSquare, FreqSel and Symmetric Uncertainty) used in the domain of malware analysis resulted in an accuracy of 95.5% with 8.5% FPR. Moreover, in our empirical analysis, RSST-derived system calls outperform other attributes such as permissions, opcodes, API, methods, call graphs, Droidbox attributes and network traces.


Functional Intrusive Load Monitor (FILM): A Model-based Platform for Non-Intrusive Load Monitoring System Development

Non-Intrusive Load Monitoring (NILM) is an important application for monitoring household appliance activities and providing related information to the house owner and/or the utility company via a single sensor installed at the electrical entry of the house. It can be used for different purposes in the residential and industrial sectors. Thus, an increasing number of new algorithms have been developed in recent years. For these algorithms, researchers either use existing public datasets or collect their own data, which causes problems such as insufficient electrical parameters, missing ground-truth data, the absence of many appliances, and a lack of appliance information. To solve these problems, this paper presents a model-based platform for NILM system development, namely Functional Intrusive Load Monitor (FILM). By using this platform, the state transitions and activities of all the involved appliances can be preset by researchers, and multiple electrical parameters such as harmonics and power factor can be monitored or calculated. This platform will help researchers save time on collecting experimental data, precisely control individual appliance activities, and develop load signatures of devices. This paper describes the steps, structure, and requirements of building this platform. A case study is presented to help understand this platform.


Permutation Invariant Gaussian Matrix Models

Permutation invariant Gaussian matrix models were recently developed for applications in computational linguistics. A 5-parameter family of models was solved. In this paper, we use a representation theoretic approach to solve the general 13-parameter Gaussian model, which can be viewed as a zero-dimensional quantum field theory. We express the two linear and eleven quadratic terms in the action in terms of representation theoretic parameters. These parameters are coefficients of simple quadratic expressions in terms of appropriate linear combinations of the matrix variables transforming in specific irreducible representations of the symmetric group S_D where D is the size of the matrices. They allow the identification of constraints which ensure a convergent Gaussian measure and well-defined expectation values for polynomial functions of the random matrix at all orders. A graph-theoretic interpretation is known to allow the enumeration of permutation invariants of matrices at linear, quadratic and higher orders. We express the expectation values of all the quadratic graph-basis invariants and a selection of cubic and quartic invariants in terms of the representation theoretic parameters of the model.


Internet of NanoThings: Concepts and Applications

This chapter focuses on the Internet of Things from the nanoscale point of view. The chapter starts with Section 1, which provides an introduction to nanothings and nanotechnologies. The nanoscale communication paradigms and the different approaches to nanodevice development are discussed. Nanodevice characteristics are discussed and the architecture of wireless nanodevices is outlined. Section 2 describes the Internet of NanoThings (IoNT), its network architecture, and the challenges of nanoscale communication, which is essential for enabling IoNT. Section 3 gives some practical applications of IoNT. The Internet of Bio-NanoThings (IoBNT) and relevant biomedical applications are discussed. Other applications, such as military, industrial, and environmental applications, are also outlined.


Target Transfer Q-Learning and Its Convergence Analysis

Q-learning is one of the most popular methods in Reinforcement Learning (RL). Transfer Learning aims to utilize the knowledge learned from source tasks to help new tasks and improve their sample complexity. Considering that data collection in RL is more time-consuming and costly, and that Q-learning converges slowly compared to supervised learning, different kinds of transfer RL algorithms have been designed. However, most of them are heuristic, with no theoretical guarantee on the convergence rate. Therefore, it is important to clearly understand when and how transfer learning helps RL methods, and to provide a theoretical guarantee for the improvement in sample complexity. In this paper, we propose to transfer the Q-function learned in the source task to the target of the Q-learning in the new task when certain safe conditions are satisfied. We call this new transfer Q-learning method target transfer Q-learning. The safe conditions are necessary to avoid harming the new task and thus to ensure the convergence of the algorithm. We study the convergence rate of target transfer Q-learning. We prove that if the two tasks are similar with respect to their MDPs, the optimal Q-functions in the source and new RL tasks are similar, which means that the error of the transferred target Q-function in the new MDP is small. Also, the convergence rate analysis shows that target transfer Q-learning converges faster than Q-learning if the error of the transferred target Q-function is smaller than that of the current Q-function in the new task. Based on our theoretical results, we design the safe condition as follows: the Bellman error of the transferred target Q-function must be less than that of the current Q-function. Our experiments are consistent with our theoretical findings and verify the effectiveness of our proposed target transfer Q-learning method.
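
The idea of bootstrapping the Q-learning target from a transferred Q-function, gated by a Bellman-error check, can be sketched in tabular form. This is only an illustrative sketch under stated assumptions, not the authors' algorithm: the mean absolute one-step Bellman error, the toy two-state chain, and the names `bellman_error`, `q_update`, `q_source`, and `q_new` are all hypothetical choices made for this example.

```python
from collections import defaultdict


def bellman_error(q, transitions, gamma):
    """Mean absolute one-step Bellman error of q over sampled transitions."""
    total = 0.0
    for state, action, reward, next_state, next_actions in transitions:
        target = reward + gamma * max(q[(next_state, b)] for b in next_actions)
        total += abs(target - q[(state, action)])
    return total / len(transitions)


def q_update(q, bootstrap_q, transition, alpha, gamma):
    """One tabular update of q whose target bootstraps from bootstrap_q."""
    state, action, reward, next_state, next_actions = transition
    target = reward + gamma * max(bootstrap_q[(next_state, b)] for b in next_actions)
    q[(state, action)] += alpha * (target - q[(state, action)])


# Hypothetical toy chain: state 0 -> state 1 -> state 1, reward 1 on the self-loop.
transitions = [(0, "go", 0.0, 1, ["go"]), (1, "go", 1.0, 1, ["go"])]
gamma, alpha = 0.9, 0.5

# Assumed to have been learned on a similar source task.
q_source = defaultdict(float, {(0, "go"): 9.0, (1, "go"): 10.0})
# Untrained Q-function for the new task.
q_new = defaultdict(float)

# Assumed safe condition: use the source Q-function in the target only if its
# Bellman error on the new task is smaller than that of the current Q-function.
safe = bellman_error(q_source, transitions, gamma) < bellman_error(q_new, transitions, gamma)
q_update(q_new, q_source if safe else q_new, transitions[0], alpha, gamma)
```

In this toy case the source Q-function already satisfies the Bellman equation, so the check passes and the first update moves the new Q-value halfway toward the transferred target.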


Neural Approaches to Conversational AI

The present paper surveys neural approaches to conversational AI that have been developed in the last few years. We group conversational systems into three categories: (1) question answering agents, (2) task-oriented dialogue agents, and (3) chatbots. For each category, we present a review of state-of-the-art neural approaches, draw the connection between them and traditional approaches, and discuss the progress that has been made and challenges still being faced, using specific systems and models as case studies.


GAPLE: Generalizable Approaching Policy LEarning for Robotic Object Searching in Indoor Environment

We study the problem of learning a generalizable action policy for an intelligent agent to actively approach an object of interest in an indoor environment solely from its visual inputs. While scene-driven or recognition-driven visual navigation has been widely studied, prior efforts suffer severely from limited generalization capability. In this paper, we first argue that the object searching task is environment dependent while the approaching ability is general. To learn a generalizable approaching policy, we present a novel solution dubbed GAPLE, which adopts two channels of visual features, depth and semantic segmentation, as the inputs to the policy learning module. The empirical studies conducted on the House3D dataset as well as on a physical platform in a real-world scenario validate our hypothesis, and we further provide in-depth qualitative analysis.


Recurrent Flow-Guided Semantic Forecasting

Understanding the world around us and making decisions about the future is a critical component of human intelligence. As autonomous systems continue to develop, their ability to reason about the future will be the key to their success. Semantic anticipation is a relatively under-explored area that autonomous vehicles could take advantage of (e.g., forecasting pedestrian trajectories). Motivated by the need for real-time prediction in autonomous systems, we propose to decompose the challenging semantic forecasting task into two subtasks: current frame segmentation and future optical flow prediction. Through this decomposition, we build an efficient, effective, low-overhead model with three main components: a flow prediction network, a feature-flow aggregation LSTM, and an end-to-end learnable warp layer. Our proposed method achieves state-of-the-art accuracy on short-term and moving-object semantic forecasting while simultaneously reducing model parameters by up to 95% and increasing efficiency by more than 40x.


Interpretable Multi-Objective Reinforcement Learning through Policy Orchestration

Autonomous cyber-physical agents and systems play an increasingly large role in our lives. To ensure that agents behave in ways aligned with the values of the societies in which they operate, we must develop techniques that allow these agents to not only maximize their reward in an environment, but also to learn and follow the implicit constraints of society. These constraints and norms can come from any number of sources including regulations, business process guidelines, laws, ethical principles, social norms, and moral values. We detail a novel approach that uses inverse reinforcement learning to learn a set of unspecified constraints from demonstrations of the task, and reinforcement learning to learn to maximize the environment rewards. More precisely, we assume that an agent can observe traces of behavior of members of the society but has no access to the explicit set of constraints that give rise to the observed behavior. Inverse reinforcement learning is used to learn such constraints, which are then combined with a possibly orthogonal value function through the use of a contextual bandit-based orchestrator that makes a contextually appropriate choice between the two policies (constraint-based and environment reward-based) when taking actions. The contextual bandit orchestrator allows the agent to mix policies in novel ways, taking the best actions from either a reward-maximizing or a constrained policy. In addition, the orchestrator is transparent about which policy is being employed at each time step. We test our algorithms using a Pac-Man domain and show that the agent is able to learn to act optimally, act within the demonstrated constraints, and mix these two functions in complex ways.


A Meta-Learning Approach for Custom Model Training

Transfer-learning and meta-learning are two effective methods to apply knowledge learned from large data sources to new tasks. In few-class, few-shot target task settings (i.e. when there are only a few classes and training examples available in the target task), meta-learning approaches that optimize for future task learning have outperformed the typical transfer approach of initializing model weights from a pre-trained starting point. But as we experimentally show, meta-learning algorithms that work well in the few-class setting do not generalize well in many-shot and many-class cases. In this paper, we propose a joint training approach that combines both transfer-learning and meta-learning. Benefiting from the advantages of each, our method obtains improved generalization performance on unseen target tasks in both few- and many-class and few- and many-shot scenarios.


CPDist: Deep Siamese Networks for Learning Distances Between Structured Preferences

Preferences are central to decision making by both machines and humans. Representing, learning, and reasoning with preferences is an important area of study both within computer science and across the sciences. When working with preferences it is necessary to understand and compute the distance between sets of objects, e.g., the preferences of a user and the descriptions of objects to be recommended. We present CPDist, a novel neural network to address the problem of learning to measure the distance between structured preference representations. We use the popular CP-net formalism to represent preferences and then leverage deep neural networks to learn a recently proposed metric function that is computationally hard to compute directly. CPDist is a novel metric learning approach based on the use of deep siamese networks which learn the Kendall tau distance between partial orders that are induced by compact preference representations. We find that CPDist is able to learn the distance function with high accuracy and outperform existing approximation algorithms on both the regression and classification tasks using less computation time. Performance remains good even when CPDist is trained with only a small number of samples compared to the dimension of the solution space, indicating that the network generalizes well.
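
For readers unfamiliar with the Kendall tau distance, the following minimal sketch computes it for the simpler case of two total orders over the same items, by counting the pairs that the two orders rank in opposite ways. It is an illustration only: the distance studied in the abstract is defined over partial orders induced by CP-nets, which this example does not model, and the function name `kendall_tau_distance` is hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(order_a, order_b):
    """Count item pairs that appear in opposite relative order in the two rankings."""
    pos_a = {item: i for i, item in enumerate(order_a)}
    pos_b = {item: i for i, item in enumerate(order_b)}
    if pos_a.keys() != pos_b.keys():
        raise ValueError("both rankings must contain the same items")
    return sum(
        1
        for x, y in combinations(order_a, 2)
        # A negative product means the pair is ordered differently in a and b.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


same = kendall_tau_distance(["a", "b", "c"], ["a", "b", "c"])
swapped = kendall_tau_distance(["a", "b", "c"], ["a", "c", "b"])
reversed_order = kendall_tau_distance(["a", "b", "c"], ["c", "b", "a"])
```

This brute-force count takes quadratic time in the number of items, which is fine for illustration but says nothing about the cost of the CP-net setting.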


Coupled Graphs and Tensor Factorization for Recommender Systems and Community Detection

Joint analysis of data from multiple information repositories facilitates uncovering the underlying structure in heterogeneous datasets. Single and coupled matrix-tensor factorization (CMTF) has been widely used in this context for imputation-based recommendation from ratings, social network, and other user-item data. When this side information is in the form of item-item correlation matrices or graphs, existing CMTF algorithms may fall short. Alleviating current limitations, we introduce a novel model coined coupled graph-tensor factorization (CGTF) that judiciously accounts for graph-related side information. The CGTF model has the potential to overcome practical challenges, such as missing slabs from the tensor and/or missing rows/columns from the correlation matrices. A novel alternating direction method of multipliers (ADMM) is also developed that recovers the nonnegative factors of CGTF. Our algorithm enjoys closed-form updates that result in reduced computational complexity and allow for convergence claims. A novel direction is further explored by employing the interpretable factors to detect graph communities having the tensor as side information. The resulting community detection approach is successful even when some links in the graphs are missing. Results with real data sets corroborate the merits of the proposed methods relative to state-of-the-art competing factorization techniques in providing recommendations and detecting communities.


Adversarial Link Prediction in Social Networks

Link prediction is one of the fundamental tools in social network analysis, used to identify relationships that are not otherwise observed. Commonly, link prediction is performed by means of a similarity metric, with the idea that a pair of similar nodes are likely to be connected. However, traditional link prediction based on similarity metrics assumes that available network data is accurate. We study the problem of adversarial link prediction, where an adversary aims to hide a target link by removing a limited subset of edges from the observed subgraph. We show that optimal attacks on local similarity metrics (that is, metrics which use only the information about the node pair and their network neighbors) can be found in linear time. In contrast, attacking the Katz and ACT metrics, which use global information about network topology, is NP-hard. We present an approximation algorithm for optimal attacks on Katz similarity, and a principled heuristic for ACT attacks. Extensive experiments demonstrate the efficacy of our methods.
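
To make the attack setting concrete, the sketch below scores a candidate link with the common-neighbors count, one example of a local similarity metric, and shows how removing a single observed edge lowers that score. It is a toy illustration under assumptions, not the optimal attack from the paper; the graph and the helper names `build_adjacency`, `common_neighbors`, and `without_edge` are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbors}."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def common_neighbors(adj, u, v):
    """Local similarity score: number of neighbors shared by u and v."""
    return len(adj.get(u, set()) & adj.get(v, set()))


def without_edge(edges, removed):
    """Return the edge list with one undirected edge deleted."""
    return [e for e in edges if set(e) != set(removed)]


# Hypothetical observed graph: u and v share the neighbors x and y.
edges = [("u", "x"), ("v", "x"), ("u", "y"), ("v", "y")]
score_before = common_neighbors(build_adjacency(edges), "u", "v")

# The adversary deletes one edge to make the target link (u, v) look less likely.
attacked = without_edge(edges, ("u", "x"))
score_after = common_neighbors(build_adjacency(attacked), "u", "v")
```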


Semi-Supervised Sequence Modeling with Cross-View Training

Unsupervised representation learning algorithms such as word2vec and ELMo improve the accuracy of many supervised NLP models, mainly because they can take advantage of large amounts of unlabeled text. However, the supervised models only learn from task-specific labeled data during the main training phase. We therefore propose Cross-View Training (CVT), a semi-supervised learning algorithm that improves the representations of a Bi-LSTM sentence encoder using a mix of labeled and unlabeled data. On labeled examples, standard supervised learning is used. On unlabeled examples, CVT teaches auxiliary prediction modules that see restricted views of the input (e.g., only part of a sentence) to match the predictions of the full model seeing the whole input. Since the auxiliary modules and the full model share intermediate representations, this in turn improves the full model. Moreover, we show that CVT is particularly effective when combined with multi-task learning. We evaluate CVT on five sequence tagging tasks, machine translation, and dependency parsing, achieving state-of-the-art results.


The Privacy Policy Landscape After the GDPR

Every new privacy regulation brings along the question of whether it results in improving the privacy for the users or whether it creates more barriers to understanding and exercising their rights. The EU General Data Protection Regulation (GDPR) is one of the most demanding and comprehensive privacy regulations of all time. Hence, a few months after it went into effect, it is natural to study its impact on the landscape of privacy policies online. In this work, we conduct the first longitudinal, in-depth, and at-scale assessment of privacy policies before and after the GDPR. We gauge the complete consumption cycle of these policies, from the first user impressions until the compliance assessment. We create a diverse corpus of 3,086 English-language privacy policies for which we fetch the pre-GDPR and the post-GDPR versions. Via a user study with 530 participants on Amazon MTurk, we discover that the visual presentation of privacy policies has slightly improved in limited data-sensitive categories in addition to the top European websites. We also find that the readability of privacy policies suffers under the GDPR, due to almost 30% more sentences and words, despite the efforts to reduce the reliance on passive sentences. We further develop a new workflow for the automated assessment of requirements in privacy policies, building on automated natural language processing techniques. We find evidence for positive changes triggered by the GDPR, with the ambiguity level, averaged over 8 metrics, improving in over 20.5% of the policies. Finally, we show that privacy policies cover more data practices, particularly around data retention, user access rights, and specific audiences, and that an average of 15.2% of the policies improved across 8 compliance metrics. Our analysis, however, reveals a large gap that exists between the current status quo and the ultimate goals of the GDPR.


Variational Collaborative Learning for User Probabilistic Representation

Collaborative filtering (CF) has been successfully employed by many modern recommender systems. Conventional CF-based methods use the user-item interaction data as the sole information source to recommend items to users. However, CF-based methods are known to suffer from cold start and data sparsity problems. Hybrid models that utilize auxiliary information on top of interaction data have increasingly gained attention. A few ‘collaborative learning’-based models, which tightly bridge two heterogeneous learners through mutual regularization, have recently been proposed for hybrid recommendation. However, the ‘collaboration’ in the existing methods is actually asynchronous due to the alternating optimization of the two learners. Leveraging the recent advances in variational autoencoders (VAEs), we here propose a model consisting of two streams of mutually linked VAEs, named variational collaborative model (VCM). Unlike the mutual regularization used in previous works where two learners are optimized asynchronously, VCM enables a synchronous collaborative learning mechanism. Besides, the two-stream VAE setup allows VCM to fully leverage the Bayesian probabilistic representations in collaborative learning. Extensive experiments on three real-life datasets have shown that VCM outperforms several state-of-the-art methods.


Combinatorial Designs for Deep Learning

Deep learning is based on multi-layer neural networks. Such a network can be regarded as a chain of complete bipartite graphs. The nodes of the first partite set form the input layer and those of the last form the output layer. The edges of a bipartite graph function as weights, which are represented as a matrix. The values of the i-th partite set are computed by multiplying the weight matrix by the values of the (i-1)-th partite set. Using large amounts of training and teacher data, the weight parameters are estimated little by little. Overfitting (or overlearning) refers to a model that models the ‘training data’ too well. It then becomes difficult for the model to generalize to new data which were not in the training set. The most popular method to avoid overfitting is called dropout. Dropout sets a random sample of activations (nodes) to zero during the training process. A random sample of nodes causes an irregular frequency of dropped edges. We propose a combinatorial design on dropout nodes from each partite set which balances the frequency of edges. We analyze and construct such designs in this paper.
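
The layer computation and the effect of dropping nodes can be sketched in a few lines. This is a minimal illustration under assumptions: it omits activation functions and the rescaling used in practical dropout, it takes the set of dropped nodes as an explicit argument instead of sampling it, and the name `forward` and the toy weights are hypothetical. It does not implement the combinatorial designs proposed in the paper.

```python
def forward(weights, inputs, dropped=frozenset()):
    """Compute one layer: weights[j][k] is the weight of the edge from input node k to output node j.

    Input nodes whose index is in `dropped` are set to zero, which removes
    every edge leaving those nodes from the computation.
    """
    masked = [0.0 if k in dropped else x for k, x in enumerate(inputs)]
    return [sum(w * x for w, x in zip(row, masked)) for row in weights]


weights = [[1.0, 2.0], [3.0, 4.0]]
inputs = [1.0, 1.0]

full_output = forward(weights, inputs)
output_with_node_0_dropped = forward(weights, inputs, dropped={0})
```

Dropping input node 0 removes both edges that leave it, which is why the choice of dropped nodes determines how often each edge is dropped.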


Adaptive Shivers Sort: An Alternative Sorting Algorithm

We present a stable mergesort, called Adaptive Shivers Sort, that exploits the existence of monotonic runs to sort partially sorted data efficiently. We also prove that, although this algorithm is simple to implement, its computational cost, in number of comparisons performed, is optimal up to an additive linear term.
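
The notion of monotonic runs can be illustrated with a small run-decomposition routine of the kind run-based mergesorts typically start from. This sketch covers only that first step and is an assumption about the general approach; the merge policy that defines Adaptive Shivers Sort is not given in the abstract and is not shown here. The function name `monotonic_runs` is hypothetical.

```python
def monotonic_runs(values):
    """Split a list into maximal runs, each returned in nondecreasing order.

    Strictly decreasing runs are reversed; because they contain no equal
    elements, reversing them does not change the relative order of equal keys.
    """
    runs = []
    i, n = 0, len(values)
    while i < n:
        j = i + 1
        if j < n and values[j] < values[i]:
            # Extend a strictly decreasing run, then store it reversed.
            while j < n and values[j] < values[j - 1]:
                j += 1
            runs.append(values[i:j][::-1])
        else:
            # Extend a nondecreasing run.
            while j < n and values[j] >= values[j - 1]:
                j += 1
            runs.append(values[i:j])
        i = j
    return runs


runs = monotonic_runs([1, 2, 3, 2, 1, 5, 6])
```

Data that is already partially sorted yields few, long runs, so fewer merges are needed afterwards.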


Differentiable Unbiased Online Learning to Rank

Online Learning to Rank (OLTR) methods optimize rankers based on user interactions. State-of-the-art OLTR methods are built specifically for linear models. Their approaches do not extend well to non-linear models such as neural networks. We introduce an entirely novel approach to OLTR that constructs a weighted differentiable pairwise loss after each interaction: Pairwise Differentiable Gradient Descent (PDGD). PDGD breaks away from the traditional approach that relies on interleaving or multileaving and extensive sampling of models to estimate gradients. Instead, its gradient is based on inferring preferences between document pairs from user clicks and can optimize any differentiable model. We prove that the gradient of PDGD is unbiased w.r.t. user document pair preferences. Our experiments on the largest publicly available Learning to Rank (LTR) datasets show considerable and significant improvements under all levels of interaction noise. PDGD outperforms existing OLTR methods both in terms of learning speed as well as final convergence. Furthermore, unlike previous OLTR methods, PDGD also allows for non-linear models to be optimized effectively. Our results show that using a neural network leads to even better performance at convergence than a linear model. In summary, PDGD is an efficient and unbiased OLTR approach that provides a better user experience than previously possible.


Implementation of Fuzzy C-Means and Possibilistic C-Means Clustering Algorithms, Cluster Tendency Analysis and Cluster Validation

In this paper, several two-dimensional clustering scenarios are given. In those scenarios, soft partitioning clustering algorithms (Fuzzy C-means (FCM) and Possibilistic C-means (PCM)) are applied. Afterward, VAT is used to investigate the clustering tendency visually, and then, in order to check cluster validity, three types of indices (PC, DI, and DBI) are used. After observing the clustering algorithms, it was evident that each of them has its limitations; however, PCM is more robust to noise than FCM, because in FCM a noise point has to be considered a member of one of the clusters.
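
The remark about noise points can be checked numerically with the standard textbook update formulas: FCM memberships of a point must sum to one across clusters, while PCM typicalities are computed per cluster and can all be close to zero. The sketch below is an illustration under assumptions, not the paper's implementation; the cluster centers, the noise point, the scale parameters `etas`, and the function names are hypothetical.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Standard FCM membership update for one point; the values sum to 1."""
    dists = [math.dist(point, c) for c in centers]
    zero = [d == 0.0 for d in dists]
    if any(zero):
        # The point coincides with one or more centers: share full membership among them.
        return [1.0 / sum(zero) if z else 0.0 for z in zero]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


def pcm_typicalities(point, centers, etas, m=2.0):
    """Standard PCM typicality update; each value depends on one cluster only."""
    power = 1.0 / (m - 1.0)
    return [
        1.0 / (1.0 + (math.dist(point, c) ** 2 / eta) ** power)
        for c, eta in zip(centers, etas)
    ]


centers = [(0.0, 0.0), (10.0, 0.0)]
noise_point = (5.0, 1000.0)  # far away from both clusters

fcm_u = fcm_memberships(noise_point, centers)
pcm_t = pcm_typicalities(noise_point, centers, etas=[1.0, 1.0])
```

Here FCM assigns the distant point a membership of one half in each cluster, whereas both PCM typicalities are nearly zero.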


Pachinko Prediction: A Bayesian method for event prediction from social media data

The combination of large open data sources with machine learning approaches presents a potentially powerful way to predict events such as protest or social unrest. However, accounting for uncertainty in such models, particularly when using diverse, unstructured datasets such as social media, is essential to guarantee the appropriate use of such methods. Here we develop a Bayesian method for predicting social unrest events in Australia using social media data. This method uses machine learning methods to classify individual postings to social media as being relevant, and an empirical Bayesian approach to calculate posterior event probabilities. We use the method to predict events in Australian cities over a period in 2017/18.


Shift-based Primitives for Efficient Convolutional Neural Networks

We propose a collection of three shift-based primitives for building efficient compact CNN-based networks. These three primitives (channel shift, address shift, shortcut shift) can reduce the inference time on GPU while maintaining the prediction accuracy. These shift-based primitives only move the pointer and avoid memory copies, and are thus very fast. For example, the channel shift operation is 12.7x faster than channel shuffle in ShuffleNet but achieves the same accuracy. The address shift and channel shift can be merged into the point-wise group convolution and invoke only a single kernel call, taking little time to perform spatial convolution and channel shift. Shortcut shift requires no time to realize a residual connection, because space is allocated in advance. We blend these shift-based primitives with point-wise group convolution and build two inference-efficient CNN architectures named AddressNet and Enhanced AddressNet. Experiments on the CIFAR100 and ImageNet datasets show that our models are faster and achieve comparable or better accuracy.


Towards Language Agnostic Universal Representations

When a bilingual student learns to solve word problems in math, we expect the student to be able to solve these problems in both languages the student is fluent in, even if the math lessons were only taught in one language. However, current representations in machine learning are language dependent. In this work, we present a method to decouple the language from the problem by learning language-agnostic representations, thereby allowing a model to be trained in one language and applied to a different one in a zero-shot fashion. We learn these representations by taking inspiration from linguistics and formalizing Universal Grammar as an optimization process (Chomsky, 2014; Montague, 1970). We demonstrate the capabilities of these representations by showing that models trained on a single language using language-agnostic representations achieve very similar accuracies in other languages.


Interaction Detection with Bayesian Decision Tree Ensembles

Methods based on Bayesian decision tree ensembles have proven valuable in constructing high-quality predictions, and are particularly attractive in certain settings because they encourage low-order interaction effects. Although they adapt to the presence of low-order interactions for prediction purposes, we show that Bayesian decision tree ensembles are generally anti-conservative for the purpose of conducting interaction detection. We address this problem by introducing Dirichlet process forests (DP-Forests), which leverage the presence of low-order interactions by clustering the trees so that trees within the same cluster focus on detecting a specific interaction. We show on both simulated and benchmark data that DP-Forests perform well relative to existing interaction detection techniques for detecting low-order interactions, attaining very low false-positive and false-negative rates while maintaining the same performance for prediction using a comparable computational budget.


Harvesting Time-Series Data from Service-Based Systems Hosted in MANETs

We are concerned with reliably harvesting data collected from service-based systems hosted on a mobile ad hoc network (MANET). More specifically, we are concerned with time-bounded and time-sensitive time-series monitoring data describing the state of the network and system. The data are harvested in order to perform an analysis, usually one that requires a global view of the data taken from distributed sites. For example, network- and application-state data are typically analysed in order to make operational and maintenance decisions. MANETs are a challenging environment in which to harvest monitoring data, due to the inherently unstable and unpredictable connectivity between nodes, and the overhead of transferring data in a wireless medium. These limitations must be overcome to support time-series analysis of perishable and time-critical data. We present an epidemic, delay-tolerant, and intelligent method to efficiently and effectively transfer time-series data between the mobile nodes of MANETs. The method establishes a network-wide synchronization overlay to transfer increments of the data over intermediate nodes in periodic cycles. The data are then accessible from local stores at the nodes. We implemented the method in Java EE and present an evaluation on a run-time dependence discovery method for Web Service applications hosted on MANETs, and a comparison to four other methods, demonstrating that our method performs significantly better in both data availability and network overhead.


DT-LET: Deep Transfer Learning by Exploring where to Transfer

Previous transfer learning methods based on deep networks assume that knowledge should be transferred between the same hidden layers of the source domain and the target domain. This assumption doesn’t always hold true, especially when the data from the two domains are heterogeneous with different resolutions. In such cases, the most suitable numbers of layers for the source domain data and the target domain data would differ. As a result, the high-level knowledge from the source domain would be transferred to the wrong layer of the target domain. Based on this observation, the ‘where to transfer’ problem proposed in this paper should be a novel research frontier. We propose a new mathematical model named DT-LET to solve this heterogeneous transfer learning problem. In order to select the best matching of layers to transfer knowledge, we define a specific loss function to estimate the corresponding relationship between high-level features of data in the source domain and the target domain. To verify this proposed cross-layer model, experiments for two cross-domain recognition/classification tasks are conducted, and the superior results achieved demonstrate the necessity of layer correspondence searching.


Softer-NMS: Rethinking Bounding Box Regression for Accurate Object Detection

Non-maximum suppression (NMS) is essential for state-of-the-art object detectors to localize objects from a set of candidate locations. However, an accurate candidate location is sometimes not associated with a high classification score, which leads to object localization failure during NMS. In this paper, we introduce a novel bounding box regression loss for learning bounding box transformation and localization variance together. The resulting localization variance exhibits a strong connection to localization accuracy, which is then utilized in our new non-maximum suppression method to improve localization accuracy for object detection. On MS-COCO, we boost the AP of VGG-16 Faster R-CNN from 23.6% to 29.1% with a single model and nearly no additional computational overhead. More importantly, our method is able to improve the AP of ResNet-50 FPN Fast R-CNN from 36.8% to 37.8%, which is a state-of-the-art bounding box refinement result.
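
For readers unfamiliar with the baseline, the following is a minimal sketch of conventional greedy NMS, not the Softer-NMS method proposed in the abstract. The function names `iou` and `greedy_nms`, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are illustrative assumptions. The sketch shows why a well-localized candidate can be lost: suppression is decided by classification score alone.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first.

    Standard greedy NMS (assumed baseline): visit boxes in descending
    score order and keep a box only if it does not overlap any
    already-kept box by more than iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical candidates: boxes 0 and 1 cover the same object.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
```

In this toy input, box 1 is discarded purely because its score is lower than that of box 0, regardless of which of the two is closer to the true object. That is the failure mode the abstract describes.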


A Kernel Embedding-based Approach for Nonstationary Causal Model Inference

Although nonstationary data are more common in the real world, most existing causal discovery methods do not take nonstationarity into consideration. In this letter, we propose a kernel embedding-based approach, ENCI, for nonstationary causal model inference where data are collected from multiple domains with varying distributions. In ENCI, we transform the complicated relation of a cause-effect pair into a linear model of variables whose observations correspond to the kernel embeddings of the cause-and-effect distributions in different domains. In this way, we are able to estimate the causal direction by exploiting the causal asymmetry of the transformed linear model. Furthermore, we extend ENCI to causal graph discovery for multiple variables by transforming the relations among them into a linear non-Gaussian acyclic model. We show that by exploiting the nonstationarity of distributions, both cause-effect pairs and two kinds of causal graphs are identifiable under mild conditions. Experiments on synthetic and real-world data are conducted to justify the efficacy of ENCI over major existing methods.


Query Understanding via Entity Attribute Identification

Understanding searchers’ queries is an essential component of semantic search systems. In many cases, search queries involve specific attributes of an entity in a knowledge base (KB), which can be further used to find query answers. In this study, we aim to advance the understanding of queries by identifying their related entity attributes from a knowledge base. To this end, we introduce the task of entity attribute identification and propose two methods to address it: (i) a model based on a Markov Random Field, and (ii) a learning-to-rank model. We develop a human-annotated test collection and show that our proposed methods can bring significant improvements over the baseline methods.


Identification and Visualization of the Underlying Independent Causes of the Diagnostic of Diabetic Retinopathy made by a Deep Learning Classifier

Interpretability is a key factor in the design of automatic classifiers for medical diagnosis. Deep learning models have proven to be very effective classifiers when trained in a supervised way with enough data. The main concern is the difficulty of inferring rational interpretations from them. Various attempts have been made in recent years to convert deep learning classifiers from high-confidence statistical black-box machines into self-explanatory models. In this paper we take a step further in the generation of explanations by identifying the independent causes that a deep learning model uses for classifying an image into a certain class. We use a combination of Independent Component Analysis with a Score Visualization technique. We study the medical problem of classifying an eye fundus image into one of 5 levels of Diabetic Retinopathy. We conclude that only 3 independent components are enough for the differentiation and correct classification between the 5 standard disease classes. We propose a method for visualizing them and detecting lesions from the generated visual maps.


Causal Inference and Mechanism Clustering of a Mixture of Additive Noise Models

The inference of the causal relationship between a pair of observed variables is a fundamental problem in science, and most existing approaches are based on a single causal model. In practice, however, observations are often collected from multiple sources with heterogeneous causal models due to certain uncontrollable factors, which renders causal analysis results obtained by a single model questionable. In this paper, we generalize the Additive Noise Model (ANM) to a mixture model, which consists of a finite number of ANMs, and provide the condition for its causal identifiability. To conduct model estimation, we propose the Gaussian Process Partially Observable Model (GPPOM), and incorporate independence enforcement into it to learn the latent parameter associated with each observation. Causal inference and clustering according to the underlying generating mechanisms of the mixture model are addressed in this work. Experiments on synthetic and real data demonstrate the effectiveness of our proposed approach.
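
As a notational sketch only, a mixture of a finite number of ANMs could be written as follows. The symbols $f$, $\theta_c$, $C$, and $\pi_c$ are assumptions chosen for illustration and are not taken from the abstract. A single ANM states that the effect is a function of the cause plus noise that is independent of the cause; the mixture lets the function vary with a latent, per-observation parameter.

```latex
% Single additive noise model (standard form): X is the cause, Y the effect.
Y = f(X) + \varepsilon, \qquad \varepsilon \perp X

% Assumed mixture form: each observation n has its own latent parameter
% \theta^{(n)}, taking one of C values \theta_1, \dots, \theta_C.
y^{(n)} = f\bigl(x^{(n)}; \theta^{(n)}\bigr) + \varepsilon^{(n)}, \qquad
\theta^{(n)} \in \{\theta_1, \dots, \theta_C\}, \qquad
P\bigl(\theta^{(n)} = \theta_c\bigr) = \pi_c, \quad \sum_{c=1}^{C} \pi_c = 1
```

Under this reading, clustering observations by generating mechanism amounts to grouping them by the inferred value of $\theta^{(n)}$.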


Language Identification with Deep Bottleneck Features
Monolingual sentence matching for text simplification
Close to Human Quality TTS with Transformer
Smart grid modeling and simulation – Comparing GridLAB-D and RAPSim via two Case studies
Millimeter-Wave Over-the-Air Signal-to-Interference-plus-Noise-Ratio Measurements Using a MIMO Testbed
Neural network approach to classifying alarming student responses to online assessment
Neural Educational Recommendation Engine (NERE)
Finite Sample Analysis of the GTD Policy Evaluation Algorithms in Markov Setting
Constrained Exploration and Recovery from Experience Shaping
Sarrus rules and dihedral groups
Understanding Compressive Adversarial Privacy
Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives
Using JSON-LD to Compose Different IoT and Cloud Services
Optical and RF Metrology for 5G
Lexical Bias In Essay Level Prediction
Networks and the Resilience and Fall of Empires: a Macro-Comparison of the Imperium Romanum and Imperial China
A note on cycles in graphs with specified radius and diameter
Parameter inference and model comparison using theoretical predictions from noisy simulations
Privacy in Index Coding: $k$-Limited-Access Schemes
Global Weighted Average Pooling Bridges Pixel-level Localization and Image-level Classification
Asymptotically Optimal Inventory Control for Assemble-to-Order Systems
Equitable List Vertex Colourability and Arboricity of Grids
Opacity, Obscurity, and the Geometry of Question-Asking
Stable Random Fields, Bowen-Margulis measures and Extremal Cocycle Growth
How do you correct run-on sentences it’s not as easy as it seems
Differential Dynamic Programming for Nonlinear Dynamic Games
onlineSPARC: a Programming Environment for Answer Set Programming
Temporal Interpolation as an Unsupervised Pretraining Task for Optical Flow Estimation
Short directed cycles in bipartite digraphs
Quantifying total uncertainty in physics-informed neural networks for solving forward and inverse stochastic problems
Mirror Descent and Constrained Online Optimization Problems
Estimating minimum effect with outlier selection
A Game-theoretic Framework for Security-aware Sensor Placement Problem in Networked Control Systems
The Voter Basis and the Admissibility of Tree Characters
Evolutionary Shelah-Spencer Graphs
Inverse Potential Problems for Divergence of Measures with Total Variation Regularization
Adversarial Recommendation: Attack of the Learned Fake Users
Unsupervised Image to Sequence Translation with Canvas-Drawer Networks
Augmenting Input Method Language Model with user Location Type Information
Regularity and multiplicity of toric rings of three-dimensional Ferrers diagrams
Unrestricted Adversarial Examples
Primitive and geometric-progression-free sets without large gaps
A convex program for bilinear inversion of sparse vectors
Comment on All-optical machine learning using diffractive deep neural networks
Uniform distributions on curves and optimal quantization
A Unified Framework for the Tractable Analysis of Multi-Antenna Wireless Networks
Gaussian fluctuations for linear eigenvalue statistics of products of independent iid random matrices
Toric degenerations of cluster varieties and cluster duality
Focus On What’s Important: Self-Attention Model for Human Pose Estimation
The Impact of Correlated Blocking on Millimeter-Wave Personal Networks
Security Constrained AC Transmission Network Expansion Planning
Galaxy morphology prediction using capsule networks
Learning to Localize and Align Fine-Grained Actions to Sparse Instructions
Secure and Energy-Efficient Transmissions in Cache-Enabled Heterogeneous Cellular Networks: Performance Analysis and Optimization
A Byte-sized Approach to Named Entity Recognition
Towards Secure Blockchain-enabled Internet of Vehicles: Optimizing Consensus Management Using Reputation and Contract Theory
Constructing Financial Sentimental Factors in Chinese Market Using Natural Language Processing
Understanding Fake Faces
From the Liouville to the Smoluchowski equation for a colloidal solute particle in a solvent
Geometric Multi-Model Fitting by Deep Reinforcement Learning
Relating Zipf’s law to textual information
RPNet: an End-to-End Network for Relative Camera Pose Estimation
Chaos and Order in the Bitcoin Market
Active image restoration
Entropy-Assisted Multi-Modal Emotion Recognition Framework Based on Physiological Signals
Tail probabilities for short-term returns on stocks
Medical Knowledge Embedding Based on Recursive Neural Network for Multi-Disease Diagnosis
On the performance of the Euler-Maruyama scheme for SDEs with discontinuous drift coefficient
Further Results on Circuit Codes
A 2-Approximation Algorithm for Feedback Vertex Set in Tournaments
Trusted Multi-Party Computation and Verifiable Simulations: A Scalable Blockchain Approach
Cascade Attention Network for Person Search: Both Image and Text-Image Similarity Selection
Automated Classification of Sleep Stages and EEG Artifacts in Mice with Deep Learning
Artistic Instance-Aware Image Filtering by Convolutional Neural Networks
A default prior for regression coefficients
Optimizing a Generalized Gini Index in Stable Marriage Problems: NP-Hardness, Approximation and a Polynomial Time Special Case
Chebyshev approximation and the global geometry of sloppy models
Sharp transition of the invertibility of the adjacency matrices of sparse random graphs
Simulation and Testing Results for a Sub-Bottom Imaging Sonar
Some notes on the signed bad number in bipartite graphs
Levelness of toric rings arising from order and chain polytopes
On Line Graphs and 2-variegated graphs
Spectrum and Energy Efficient Multiple Access for Detection in Wireless Sensor Networks
Recent Advances on Intersection Graphs of Hypergraphs: A Survey
Steady-state Analysis of a Neural-cognition Based Human-social Behavior Model
Bilateral tail estimate for distribution of self normalizes sums of independent centered random variables under natural norming
Symplectic Matroids, Circuits, and Signed Graphs
Parametric Synthesis of Text on Stylized Backgrounds using PGGANs
Ranking of Social Media Alerts with Workload Bounds in Emergency Operation Centers
On 3-Inflatable Permutations
SelfKin: Self Adjusted Deep Model For Kinship Verification
SqueezeSegV2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a LiDAR Point Cloud
On the relation of separability, bandwidth and embedding
An improved algorithm to compute the $ω$-primality
P-value: A Bless or A Curse for Evidence-Based Studies?
Legal Assignments and fast EADAM with consent via classical theory of stable matchings
On Eulerian orientations of even-degree hypercubes
A Train Status Assistant for Indian Railways
EXTRA: Explaining Team Recommendation in Networks
Fundamental Limits of Invisible Flow Fingerprinting
Adversarial Defense via Data Dependent Activation Function and Total Variation Minimization
Semiparametric Mixed-Scale Models Using Shared Bayesian Forests
Evolution of Threats in the Global Risk Network
Permissioned Blockchain Technologies for Academic Publishing
Provably Correct Automatic Subdifferentiation for Qualified Programs
Generalized Low-Rank Optimization for Topological Cooperation in Ultra-Dense Networks
Security Diffusion Games
On the Maximum of Dependent Gaussian Random Variables: A Sharp Bound for the Lower Tail
A Learning Framework for Robust Bin Picking by Customized Grippers
A Learning Framework for High Precision Industrial Assembly
An explicit solution for a multimarginal mass transportation problem
Self Attention Grid for Person Re-Identification
Bounds on tail probabilities for quadratic forms in dependent sub-gaussian random variables
Periodic Splines and Gaussian Processes for the Resolution of Linear Inverse Problems
Learning for Video Super-Resolution through HR Optical Flow Estimation
EC-GSM-IoT Network Synchronization with Support for Large Frequency Offsets
The use of Virtual Reality in Enhancing Interdisciplinary Research and Education