Context-aware Path Ranking (C-PR)
Knowledge base (KB) completion aims to infer missing facts from existing ones in a KB. Among various approaches, path ranking (PR) algorithms have received increasing attention in recent years. PR algorithms enumerate paths between entity pairs in a KB and use those paths as features to train a model for missing fact prediction. Because of their good performance and high model interpretability, several PR methods have been proposed. However, most existing methods suffer from scalability problems (high RAM consumption) and feature explosion (training on an exponentially large number of features). This paper proposes a Context-aware Path Ranking (C-PR) algorithm to solve these problems by introducing a selective path exploration strategy. C-PR learns global semantics of entities in the KB using word embedding and leverages the knowledge of entity semantics to enumerate contextually relevant paths using bidirectional random walk. Experimental results on three large KBs show that the path features (fewer in number) discovered by C-PR not only improve predictive performance but also are more interpretable than those of existing baselines. …
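
To make the "paths as features" idea concrete, here is a minimal sketch of the enumeration step that PR algorithms share. It is an exhaustive, depth-limited search over a toy KB, which is the costly step that C-PR is described as making selective; it does not implement C-PR's embedding-guided bidirectional random walk. The triples, the `inv_` prefix for inverse edges, and the function names are all hypothetical choices made for this illustration.

```python
from collections import defaultdict


def build_graph(triples):
    """Index (head, relation, tail) triples as an adjacency map with inverse edges."""
    graph = defaultdict(list)
    for head, rel, tail in triples:
        graph[head].append((rel, tail))
        graph[tail].append(("inv_" + rel, head))  # assumed naming for inverse relations
    return graph


def enumerate_paths(graph, source, target, max_len=3):
    """Return the distinct relation sequences (at most max_len edges) linking source to target."""
    paths = []
    stack = [(source, (), (source,))]  # (current node, relations so far, visited nodes)
    while stack:
        node, rels, visited = stack.pop()
        if node == target and rels:
            paths.append(rels)
            continue
        if len(rels) == max_len:
            continue
        for rel, nxt in graph[node]:
            if nxt not in visited:  # keep paths simple (no repeated entities)
                stack.append((nxt, rels + (rel,), visited + (nxt,)))
    return sorted(set(paths))


# Toy KB, invented for this example.
triples = [
    ("alice", "born_in", "paris"),
    ("paris", "city_of", "france"),
    ("alice", "works_for", "acme"),
    ("acme", "based_in", "france"),
]
paths = enumerate_paths(build_graph(triples), "alice", "france")
print(paths)
```

Each returned relation sequence would become one feature for the entity pair. On a real KB the number of such sequences grows quickly with `max_len`, which is the feature explosion the abstract refers to.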

Fairness-Aware Algorithm for Concurrent Service Selection (FASS)
The increasing momentum of service-oriented architecture has led to the emergence of diverse delivered services, so service selection is required to obtain a target service that fulfills the requirements of both users and service providers. Although many existing works have extensively handled the issue of service selection, it remains an open question in the case where requests from multiple users are served simultaneously by a certain set of shared candidate services. Meanwhile, there are constraints on the context of service selection, e.g. service placement location and contracts between users and service providers. In this paper, we focus on QoS-aware service selection with constraints from a fairness perspective, with the objective of achieving max-min fairness across multiple service requests sharing candidate service sets. More specifically, we study the problem of fairly selecting services from shared candidate sets while service providers are self-motivated to offer better services with higher QoS values. We formulate this problem as a lexicographical maximization problem, which is far from trivial to deal with in practice due to its inherently multi-objective and discrete nature. A fairness-aware algorithm for concurrent service selection (FASS) is proposed, whose basic idea is to iteratively solve the single-objective subproblems by transforming them into linear programming problems. Experimental results based on real-world datasets validate the effectiveness and practicality of our proposed approach. …
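
The lexicographical (max-min fair) objective can be illustrated on a tiny instance. The sketch below is a brute-force search, not the iterative linear-programming procedure that FASS is described as using, and it assumes that each service can serve at most one request, which the abstract does not state. The QoS matrix and the function name `leximin_assignment` are invented for this example.

```python
from itertools import permutations


def leximin_assignment(qos):
    """Pick one distinct service per request so the sorted QoS vector is lexicographically largest.

    qos[r][s] is the QoS that request r would obtain from service s.
    """
    n_requests = len(qos)
    n_services = len(qos[0])
    best_key, best_assign = None, None
    for assign in permutations(range(n_services), n_requests):
        # Ascending order puts the worst-off request first, so comparing keys
        # maximizes the minimum, then the second smallest value, and so on.
        key = sorted(qos[r][s] for r, s in enumerate(assign))
        if best_key is None or key > best_key:
            best_key, best_assign = key, assign
    return best_assign, best_key


# Hypothetical QoS values for three requests and three shared services.
qos = [
    [0.9, 0.8, 0.1],
    [0.9, 0.2, 0.3],
    [0.5, 0.4, 0.6],
]
assign, key = leximin_assignment(qos)
print(assign, key)
```

In this instance, giving request 0 its individually best service (index 0) would leave request 1 with at most 0.3, whereas the lexicographic optimum hands service 0 to request 1 and keeps every request at 0.6 or above. The search is factorial in the number of services, which is why a practical method needs something like the LP-based decomposition mentioned above.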

Multi-View Multiple Clustering (MVMC)
Multiple clustering aims at exploring alternative clusterings to organize the data into meaningful groups from different perspectives. Existing multiple clustering algorithms are designed for single-view data. We assume that the individuality and commonality of multi-view data can be leveraged to generate high-quality and diverse clusterings. To this end, we propose a novel multi-view multiple clustering (MVMC) algorithm. MVMC first adapts multi-view self-representation learning to explore the individuality encoding matrices and the shared commonality matrix of multi-view data. It additionally reduces the redundancy among the matrices (i.e., enhances their individuality) using the Hilbert-Schmidt Independence Criterion (HSIC), and collects shared information by forcing the shared matrix to be smooth across all views. It then applies matrix factorization to the individual matrices, along with the shared matrix, to generate diverse, high-quality clusterings. We further extend multiple co-clustering to multi-view data and propose a solution called multi-view multiple co-clustering (MVMCC). Our empirical study shows that MVMC (MVMCC) can exploit multi-view data to generate multiple high-quality and diverse clusterings (co-clusterings), with superior performance to the state-of-the-art methods. …

Second-Level Global Sensitivity Analysis (GSA2)
Global sensitivity analysis (GSA) of numerical simulators aims at studying the global impact of the input uncertainties on the output. To perform GSA, statistical tools based on input/output dependence measures are commonly used. We focus here on dependence measures based on reproducing kernel Hilbert spaces: the Hilbert-Schmidt Independence Criterion, denoted HSIC. Sometimes, the probability distributions modeling the uncertainty of the inputs may themselves be uncertain, and it is important to quantify the global impact of this uncertainty on GSA results. We call this second-level global sensitivity analysis (GSA2). However, GSA2, when performed with a double Monte Carlo loop, requires a large number of model evaluations, which is intractable with CPU-time-expensive simulators. To cope with this limitation, we propose a new statistical methodology based on a single Monte Carlo loop with a limited calculation budget. First, we build a single sample of inputs from a well-chosen probability distribution and compute the associated code outputs. From this input/output sample, we perform GSA for various assumed probability distributions of inputs by using weighted estimators of HSIC measures. Statistical properties of these weighted estimators are demonstrated. We then define 2nd-level HSIC-based measures between the probability distributions of inputs and GSA results, which constitute GSA2 indices. The efficiency of our GSA2 methodology is illustrated on an analytical example, on which several technical options are compared. Finally, an application to a test case simulating a severe accident scenario on a nuclear reactor is provided. …
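
For readers unfamiliar with HSIC, the following is a minimal sketch of the plain (unweighted) biased estimator, trace(K H L H) / n^2, for scalar samples. It is not the weighted estimator proposed for GSA2. The Gaussian kernel, the fixed bandwidth of 1.0, the 1/n^2 normalization (some references use 1/(n-1)^2), and the sample values are all assumptions made for this illustration.

```python
import math


def gaussian_gram(xs, bandwidth=1.0):
    """Gram matrix of a Gaussian kernel on scalar samples."""
    return [[math.exp(-((a - b) ** 2) / (2 * bandwidth ** 2)) for b in xs] for a in xs]


def center(gram):
    """Double-center a symmetric Gram matrix: H K H with H = I - (1/n) 1 1^T."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    col_means = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_means) / n
    return [
        [gram[i][j] - row_means[i] - col_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(xs, ys, bandwidth=1.0):
    """Biased HSIC estimate trace(K H L H) / n^2, computed as trace((H K H) L) / n^2."""
    n = len(xs)
    k_centered = center(gaussian_gram(xs, bandwidth))
    l_gram = gaussian_gram(ys, bandwidth)
    trace = sum(k_centered[i][j] * l_gram[j][i] for i in range(n) for j in range(n))
    return trace / n ** 2


# Hypothetical input sample and two candidate outputs.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
dependent = hsic(xs, [x * x for x in xs])   # output is a function of the input
constant = hsic(xs, [1.0] * len(xs))        # output ignores the input
print(dependent, constant)
```

A constant output yields an estimate of zero up to rounding error, while an output that depends on the input yields a strictly positive value. In a GSA setting, one such value would be computed per input variable and used to rank or screen inputs.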