tax2vec google
The use of background knowledge remains largely unexploited in many text classification tasks. In this work, we explore word taxonomies as a means of constructing new semantic features, which may improve the performance and robustness of the learned classifiers. We propose tax2vec, a parallel algorithm for constructing taxonomy-based features, and demonstrate its use on six short-text classification problems, including gender, age and personality type prediction, drug effectiveness and side effect prediction, and news topic prediction. The experimental results indicate that the interpretable features constructed using tax2vec can notably improve the performance of classifiers; the constructed features, in combination with fast, linear classifiers tested against strong baselines such as hierarchical attention neural networks, achieved comparable or better classification results on short documents. Further, tax2vec can also be used to extract corpus-specific keywords. Finally, we investigated the semantic space of potential features, where we observed a distribution similar to the well-known Zipf's law. …
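The core idea of building semantic features from a word taxonomy can be sketched with WordNet hypernyms: map each token to its hypernym chain and keep the most frequent hypernyms as a feature vocabulary. This is only an illustrative sketch, not the authors' tax2vec implementation; the function name hypernym_features and the simple frequency-based selection are assumptions standing in for tax2vec's feature-selection heuristics and parallelism.

```python
# Minimal sketch of taxonomy-based feature construction with WordNet hypernyms.
# Not the authors' tax2vec; hypernym_features and the frequency-based
# selection below are illustrative assumptions.
from collections import Counter

import nltk
from nltk.corpus import wordnet as wn


def hypernym_features(documents, top_k=50):
    """Represent each document by counts over the top_k most frequent
    WordNet hypernyms of its tokens."""
    nltk.download("wordnet", quiet=True)
    doc_hypernyms = []
    for doc in documents:
        counts = Counter()
        for token in doc.lower().split():
            for synset in wn.synsets(token)[:1]:      # most common sense only
                for path in synset.hypernym_paths():  # root -> synset chains
                    counts.update(h.name() for h in path)
        doc_hypernyms.append(counts)

    # Keep the top_k hypernyms by corpus frequency as the semantic vocabulary.
    total = Counter()
    for d in doc_hypernyms:
        total.update(d)
    vocab = [h for h, _ in total.most_common(top_k)]

    X = [[d[h] for h in vocab] for d in doc_hypernyms]  # count matrix
    return X, vocab
```

The resulting count matrix can be concatenated with ordinary bag-of-words or TF-IDF features before training a linear classifier, which mirrors the role the taxonomy features play in the paper.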

Katz Centrality google
In graph theory, the Katz centrality of a node is a measure of centrality in a network. It was introduced by Leo Katz in 1953 and is used to measure the relative degree of influence of an actor (or node) within a social network. Unlike typical centrality measures, which consider only the shortest path (the geodesic) between a pair of actors, Katz centrality measures influence by taking into account the total number of walks between a pair of actors. It is similar to Google's PageRank and to eigenvector centrality.
· Katz centrality can be used to compute centrality in directed networks such as citation networks and the World Wide Web.
· Katz centrality is more suitable in the analysis of directed acyclic graphs where traditionally used measures like eigenvector centrality are rendered useless.
· Katz centrality can also be used in estimating the relative status or influence of actors in a social network.
· In neuroscience, Katz centrality has been found to correlate with the relative firing rate of neurons in a neural network. …
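To make the definition concrete, the sketch below computes Katz centrality from its closed form x = β(I − αA^T)^{-1}·1 with plain NumPy. The helper name katz_centrality and the default choice of α are assumptions for this example, not a reference implementation.

```python
import numpy as np


def katz_centrality(A, alpha=None, beta=1.0):
    """Katz centrality via the closed form x = beta * (I - alpha * A^T)^{-1} * 1.
    A[i, j] = 1 denotes an edge from node i to node j; alpha must stay below
    1 / lambda_max(A) for the underlying walk series to converge."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if alpha is None:
        lam = max(abs(np.linalg.eigvals(A)))
        # Directed acyclic graphs have spectral radius 0, so any alpha works;
        # otherwise stay safely below the convergence bound.
        alpha = 0.5 if lam < 1e-10 else 0.9 / lam
    x = np.linalg.solve(np.eye(n) - alpha * A.T, beta * np.ones(n))
    return x / np.linalg.norm(x)


# Tiny directed (acyclic) example: node 2 receives walks from both 0 and 1,
# so it gets the highest score.
A = np.array([[0, 1, 1],
              [0, 0, 1],
              [0, 0, 0]])
print(katz_centrality(A))
```

Because the score of a node counts walks of every length (damped by α), a node pointed to by well-connected nodes ranks higher than one with the same in-degree from peripheral nodes, which is the property the bullets above rely on.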


Parallel Matrix Condensation google
Calculating the log-determinant of a matrix is useful for statistical computations used in machine learning, such as generative learning, which uses the log-determinant of the covariance matrix to calculate the log-likelihood of model mixtures. The log-determinant calculation becomes challenging as the number of variables becomes large, so finding a practical speedup for this computation is useful. In this study, we present a parallel matrix condensation algorithm for calculating the log-determinant of a large matrix. We demonstrate that in a distributed environment, Parallel Matrix Condensation has several advantages over the well-known Parallel Gaussian Elimination, including higher data distribution efficiency and fewer data communication operations. We test our Parallel Matrix Condensation against a self-implemented Parallel Gaussian Elimination as well as ScaLAPACK (Scalable Linear Algebra Package) on matrices ranging from 1000×1000 to 8000×8000 for 1, 2, 4, 8, 16, 32, 64 and 128 processors. The results show that Matrix Condensation yields the best speed-up among all tested algorithms. The code is available at https://…/MatrixCondensation
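For intuition, the condensation idea can be shown serially: Chio-style condensation reduces an n×n matrix to an (n−1)×(n−1) one with det(A) = det(B)/a11^{n−2}, and accumulating logarithms along the way yields the log-determinant without overflow. The sketch below is a plain serial version under that standard identity; the paper's contribution, distributing the condensation step across processors, is not attempted here, and the name logdet_condensation is hypothetical.

```python
import numpy as np


def logdet_condensation(A):
    """Log-|determinant| (and sign) of a square matrix via repeated
    Chio condensation. Serial sketch only; the paper parallelises the
    condensation step in a distributed setting."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    log_det, sign = 0.0, 1.0
    while n > 1:
        # Partial pivoting: make the (0, 0) entry non-zero (and large).
        p = int(np.argmax(np.abs(A[:, 0])))
        if A[p, 0] == 0.0:
            return -np.inf, 0.0          # singular matrix
        if p != 0:
            A[[0, p]] = A[[p, 0]]        # row swap flips the sign
            sign = -sign
        a = A[0, 0]
        # Chio condensation: B[i, j] = a * A[i+1, j+1] - A[i+1, 0] * A[0, j+1],
        # with det(A) = det(B) / a**(n - 2).
        B = a * A[1:, 1:] - np.outer(A[1:, 0], A[0, 1:])
        log_det -= (n - 2) * np.log(abs(a))
        sign *= np.sign(a) ** (n - 2)
        A, n = B, n - 1
    log_det += np.log(abs(A[0, 0]))
    sign *= np.sign(A[0, 0])
    return log_det, sign


# Example: det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]) = -3.
print(logdet_condensation([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))  # (log 3, -1.0)
```

Each condensation step touches only one pivot row and column, which is what makes the data distribution and communication pattern favourable in the parallel setting the abstract describes.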

Entropic Spectral Learning google
We present a novel algorithm for learning the spectral density of large-scale networks using stochastic trace estimation and the method of maximum entropy. The complexity of the algorithm is linear in the number of non-zero elements of the matrix, offering a computational advantage over other algorithms. We apply our algorithm to the problem of community detection in large networks and show state-of-the-art performance on both synthetic and real datasets. …
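The stochastic trace estimation half of such a method can be sketched with Hutchinson's estimator: random probe vectors and matrix-vector products give unbiased estimates of the spectral moments tr(A^k)/n, and the cost is linear in the number of non-zero entries, matching the complexity claim above. The maximum-entropy fit of a density to these moments is omitted here, and spectral_moments_hutchinson is an illustrative name, not the authors' code.

```python
import numpy as np


def spectral_moments_hutchinson(A, num_moments=10, num_probes=30, rng=None):
    """Estimate the spectral moments tr(A^k) / n of a symmetric matrix with
    Hutchinson's stochastic trace estimator. These moment estimates are the
    constraints a maximum-entropy step would fit a density to."""
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    moments = np.zeros(num_moments + 1)
    for _ in range(num_probes):
        z = rng.choice([-1.0, 1.0], size=n)   # Rademacher probe vector
        v = z.copy()
        moments[0] += z @ v                   # z^T A^0 z = z^T z
        for k in range(1, num_moments + 1):
            v = A @ v                         # v = A^k z, one matvec per moment
            moments[k] += z @ v               # unbiased estimate of tr(A^k)
    return moments / (num_probes * n)         # normalised moments tr(A^k) / n
```

Since only matrix-vector products with A are required, A can be kept sparse (e.g. a graph adjacency or Laplacian matrix), and the per-probe cost is proportional to the number of non-zero entries.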