Given a symmetric nonnegative matrix A, symmetric nonnegative matrix factorization (symNMF) is the problem of finding a nonnegative matrix H, usually with far fewer columns than A, such that A ≈ HH^T. SymNMF can be used for data analysis, in particular for various clustering tasks. In this paper, we propose simple and very efficient coordinate descent schemes for this problem that can handle large and sparse input matrices. The effectiveness of our methods is illustrated on synthetic and real-world data sets, and we show that they perform favorably compared to recent state-of-the-art methods.
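To make the symNMF objective concrete, the following is a minimal sketch that writes the input matrix as A and the factor as H (notation assumed here) and reduces the squared Frobenius error between A and HH^T. It uses plain projected gradient descent with backtracking, which is a simpler stand-in and not the coordinate descent schemes proposed in the abstract. All function names and parameters are illustrative.

```python
# Sketch only: projected gradient descent for symNMF, NOT the paper's
# coordinate descent schemes. Names and defaults are illustrative.
import random

def sym_error(A, H):
    """Squared Frobenius error ||A - H H^T||_F^2."""
    n = len(A)
    return sum(
        (A[i][j] - sum(a * b for a, b in zip(H[i], H[j]))) ** 2
        for i in range(n) for j in range(n)
    )

def gradient(A, H):
    """Gradient of (1/4)||A - H H^T||_F^2 for symmetric A: (H H^T - A) H."""
    n, r = len(H), len(H[0])
    R = [[sum(a * b for a, b in zip(H[i], H[j])) - A[i][j] for j in range(n)]
         for i in range(n)]
    return [[sum(R[i][j] * H[j][k] for j in range(n)) for k in range(r)]
            for i in range(n)]

def symnmf_pgd(A, r, iters=200, seed=0):
    rng = random.Random(seed)
    n = len(A)
    H = [[rng.random() for _ in range(r)] for _ in range(n)]
    step = 1.0
    for _ in range(iters):
        G = gradient(A, H)
        err = sym_error(A, H)
        # Backtracking: halve the step until the projected update
        # (clipped at zero to stay nonnegative) does not increase the error.
        while step > 1e-12:
            H_new = [[max(0.0, H[i][k] - step * G[i][k]) for k in range(r)]
                     for i in range(n)]
            if sym_error(A, H_new) <= err:
                H = H_new
                break
            step *= 0.5
    return H

# Two-block similarity matrix: rows 0-1 and rows 2-3 form two clusters.
A = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
H = symnmf_pgd(A, r=2)
print(sym_error(A, H))
```

In a clustering use, each row of H would typically be assigned to the column holding its largest entry. That reading is a common convention rather than something stated in the abstract.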
Deep learning has demonstrated the power of detailed modeling of complex high-order (multivariate) interactions in data. For some learning tasks there is power in learning models that are not only Deep but also Broad. By Broad, we mean models that incorporate evidence from large numbers of features. This is especially valuable in applications where many different features and combinations of features each carry a small amount of information about the class. The most accurate models will integrate all that information. In this paper, we propose an algorithm for Deep Broad Learning called DBL. The proposed algorithm has a tunable parameter that specifies the depth of the model. It provides straightforward paths towards out-of-core learning for large data. We demonstrate that DBL learns models from large quantities of data with accuracy that is highly competitive with the state-of-the-art.
In a distributed network environment, the diffusion least mean squares (diffusion-LMS) algorithm converges faster than the original LMS algorithm. It has also been observed that diffusion-LMS generally outperforms other distributed LMS algorithms such as spatial LMS and incremental LMS. However, neither the original LMS nor diffusion-LMS is applicable in non-linear environments where data may not be linearly separable. A variant of LMS called kernel-LMS (KLMS) has been proposed in the literature for such non-linearities. In this paper, we propose a kernelised version of diffusion-LMS for non-linear distributed environments. Simulations show that the proposed approach has superior convergence compared to algorithms of the same genre. We also introduce a technique to predict the transient and steady-state behaviour of the proposed algorithm. The techniques proposed in this work (or algorithms of the same genre) can be easily extended to distributed parameter estimation applications such as cooperative spectrum sensing and massive multiple input multiple output (MIMO) receiver design, which are potential components of 5G communication systems.
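As background for the kernel-LMS building block mentioned above, here is a minimal single-node sketch of the standard KLMS recursion with a Gaussian kernel: each sample adds one kernel centre whose coefficient is the step size times the prediction error. It does not include the diffusion (multi-node cooperation) step that the abstract proposes, and the class name, step size, and kernel width are assumptions chosen for illustration.

```python
# Sketch only: single-node kernel LMS (KLMS), without any diffusion step.
import math

def gaussian_kernel(x, y, width=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * width ** 2))

class KLMS:
    def __init__(self, step=0.5, width=1.0):
        self.step = step
        self.width = width
        self.centers = []   # stored inputs x_n
        self.coeffs = []    # step * e_n for each stored input

    def predict(self, x):
        return sum(c * gaussian_kernel(x, z, self.width)
                   for c, z in zip(self.coeffs, self.centers))

    def update(self, x, d):
        """Process one (input, desired output) pair; return the prior error."""
        err = d - self.predict(x)
        self.centers.append(tuple(x))
        self.coeffs.append(self.step * err)
        return err

# Learn a non-linear map d = sin(x) from a deterministic stream of inputs.
model = KLMS(step=0.5, width=1.0)
inputs = [((37 * i) % 100) / 100 * 6 - 3 for i in range(300)]
errors = [model.update((x,), math.sin(x)) for x in inputs]
print(sum(e * e for e in errors[:50]) / 50, sum(e * e for e in errors[-50:]) / 50)
```

Because the model grows by one centre per sample, practical variants usually add some form of sparsification. That is outside the scope of this sketch.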
Applications that learn from opinionated documents, like tweets or product reviews, face two challenges. First, the opinionated documents constitute an evolving stream, where both the author’s attitude and the vocabulary itself may change. Second, labels of documents are scarce and labels of words are unreliable, because the sentiment of a word depends on the (unknown) context in the author’s mind. Most of the research on mining over opinionated streams focuses on the first aspect of the problem, whereas for the second a continuous supply of labels from the stream is assumed. Such an assumption is utopian, though, as the stream is infinite and the labeling cost is prohibitive. To this end, we investigate the potential of active stream learning algorithms that ask for labels on demand. Our proposed ACOSTREAM approach works with limited labels: it uses an initial seed of labeled documents, occasionally requests additional labels for documents from the human expert, and incrementally adapts to the underlying stream while exploiting the available labeled documents. At its core, ACOSTREAM consists of a multinomial Naive Bayes (MNB) classifier coupled with ‘sampling’ strategies for requesting class labels for new unlabeled documents. In the experiments, we evaluate the classifier performance over time by varying: (a) the class distribution of the opinionated stream, while assuming that the set of the words in the vocabulary is fixed but their polarities may change with the class distribution; and (b) the number of unknown words arriving at each moment, while the class polarity may also change. Our results show that active learning on a stream of opinionated documents delivers good performance while requiring a small selection of labels.
An l1-norm penalized orthogonal forward regression (l1-POFR) algorithm is proposed based on the concept of leave-one-out mean square error (LOOMSE). Firstly, a new l1-norm penalized cost function is defined in the constructed orthogonal space, and each orthogonal basis is associated with an individually tunable regularization parameter. Secondly, due to orthogonal computation, the LOOMSE can be analytically computed without actually splitting the data set, and moreover a closed form of the optimal regularization parameter in terms of minimal LOOMSE is derived. Thirdly, a lower bound for regularization parameters is proposed, which can be used for robust LOOMSE estimation by adaptively detecting and removing regressors to an inactive set so that the computational cost of the algorithm is significantly reduced. Illustrative examples are included to demonstrate the effectiveness of this new l1-POFR approach.
Text analysis includes lexical analysis of the text and has been widely studied and used in diverse applications. In the last decade, researchers have proposed many efficient solutions to analyze and classify large text datasets. However, analysis and classification of short text is still a challenge because (1) the data is very sparse, (2) it contains noise words, and (3) it is difficult to understand the syntactic structure of the text. Short Messaging Service (SMS) is a text messaging service for mobile and smart phones, and it is widely used by mobile users. Because of the popularity of SMS, marketing companies nowadays also use this service for direct marketing, known as SMS marketing. In this paper, we propose an ontology-based SMS Controller which analyzes a text message and classifies it, using an ontology, as legitimate or spam. The proposed system has been tested on different scenarios, and experimental results show that the proposed solution is effective in terms of both efficiency and time.
Two approaches for graph-based semi-supervised learning are proposed. The first approach is based on iteration of an affine map. A key element of the affine map iteration is sparse matrix-vector multiplication, which has several very efficient parallel implementations. The second approach belongs to the class of Markov Chain Monte Carlo (MCMC) algorithms. It is based on sampling of nodes by performing a random walk on the graph. The latter approach is distributed by its nature and can be easily implemented on several processors or over the network. Both theoretical and practical evaluations are provided. It is found that the nodes are classified into their class with very small error. The sampling algorithm’s ability to track new incoming nodes and to classify them is also demonstrated.
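The affine map iteration can be illustrated with a generic label-propagation-style update, F ← α·P·F + (1 − α)·Y, where P is the row-normalised adjacency matrix and Y encodes the labeled seed nodes. The exact map used in the paper is not given in the abstract, so this form, the damping value α = 0.85, and all names below are assumptions. The adjacency-list loop plays the role of the sparse matrix-vector product.

```python
# Sketch only: a generic affine iteration F <- alpha * P F + (1 - alpha) * Y.
# The specific map, alpha, and names are illustrative assumptions.

def propagate(adj, seeds, n_classes, alpha=0.85, iters=100):
    """adj: list of neighbour lists; seeds: {node: class} for labeled nodes."""
    n = len(adj)
    Y = [[0.0] * n_classes for _ in range(n)]
    for node, cls in seeds.items():
        Y[node][cls] = 1.0
    F = [row[:] for row in Y]
    for _ in range(iters):
        F_new = []
        for i in range(n):
            deg = len(adj[i])
            row = []
            for c in range(n_classes):
                # Sparse "matrix-vector" step: average over the neighbours of i.
                avg = sum(F[j][c] for j in adj[i]) / deg if deg else 0.0
                row.append(alpha * avg + (1.0 - alpha) * Y[i][c])
            F_new.append(row)
        F = F_new
    # Assign each node to the class with the largest score.
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]

# Two triangles (0,1,2) and (3,4,5) joined by the edge 2-3; one seed per class.
adj = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
labels = propagate(adj, seeds={0: 0, 5: 1}, n_classes=2)
print(labels)
```

Each row update depends only on a node's neighbours, which is why this kind of iteration parallelises well.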
In this paper, we propose another version of the help-training approach, employing a Probabilistic Neural Network (PNN) that improves the performance of the main discriminative classifier in the semi-supervised strategy. We introduce the PNN-training algorithm and use it to train a support vector machine (SVM) with a small number of labeled examples and a large number of unlabeled examples. We try to find the best labels for the unlabeled data and then use the SVM to enhance the classification rate. We test our method on two well-known benchmarks and show its efficiency in comparison with previous methods.
We propose a quantization-based approach for fast approximate Maximum Inner Product Search (MIPS). Each database vector is quantized in multiple subspaces via a set of codebooks, learned directly by minimizing the inner product quantization error. Then, the inner product of a query with a database vector is approximated as the sum of inner products with the subspace quantizers. Unlike recently proposed LSH approaches to MIPS, the database vectors and queries do not need to be augmented in a higher-dimensional feature space. We also provide a theoretical analysis of the proposed approach, consisting of concentration results under mild assumptions. Furthermore, if a small sample of example queries is given at training time, we propose a modified codebook learning procedure which further improves the accuracy. Experimental results on a variety of datasets, including those arising from deep neural networks, show that the proposed approach significantly outperforms the existing state-of-the-art.
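The approximation step described above, summing per-subspace inner products between the query and the assigned codewords, can be sketched as follows. The codebooks here are written by hand and each database sub-vector is assigned to its nearest codeword in Euclidean distance. Both are simplifying assumptions: the abstract's method learns the codebooks by minimizing inner product quantization error. All names are illustrative.

```python
# Sketch only: subspace-quantized inner products with hand-made codebooks.
# The paper learns its codebooks; nothing here reproduces that procedure.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def split(v, n_sub):
    """Split a vector into n_sub contiguous, equally sized sub-vectors."""
    size = len(v) // n_sub
    return [v[k * size:(k + 1) * size] for k in range(n_sub)]

def encode(x, codebooks):
    """Per subspace, the index of the nearest codeword (Euclidean; assumed)."""
    codes = []
    for part, book in zip(split(x, len(codebooks)), codebooks):
        dists = [sum((p - w) ** 2 for p, w in zip(part, word)) for word in book]
        codes.append(dists.index(min(dists)))
    return codes

def approx_inner_product(q, codes, codebooks):
    """Sum of query-codeword inner products, one table lookup per subspace."""
    parts = split(q, len(codebooks))
    tables = [[dot(part, word) for word in book]
              for part, book in zip(parts, codebooks)]
    return sum(table[c] for table, c in zip(tables, codes))

# Two subspaces of dimension 2, two codewords each.
codebooks = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
x = [0.9, 0.1, 0.2, 1.1]          # database vector
q = [1.0, 2.0, 3.0, 4.0]          # query
codes = encode(x, codebooks)
print(codes, approx_inner_product(q, codes, codebooks), dot(q, x))
```

The lookup tables depend only on the query, so they are computed once per query and reused for every database vector, which is where the speed-up of quantization-based search comes from.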
We consider the problem of efficient financial surveillance aimed at ‘on-the-go’ detection of structural breaks (anomalies) in ‘live’-monitored financial time series. With the problem approached statistically, viz. as that of multi-cyclic sequential (quickest) change-point detection, we propose a semi-parametric multi-cyclic change-point detection procedure to promptly spot anomalies as they occur in the time series under surveillance. The proposed procedure is a derivative of the likelihood ratio-based Shiryaev-Roberts (SR) procedure; the latter is a quasi-Bayesian surveillance method known to deliver the fastest (in the multi-cyclic sense) speed of detection, whatever the false alarm frequency. We offer a case study where we first carry out, step by step, statistical analysis of a set of real-world financial data, and then set up and devise (a) the proposed SR-based anomaly-detection procedure and (b) the celebrated Cumulative Sum (CUSUM) chart to detect structural breaks in the data. While both procedures performed well, the proposed SR-derivative, in line with intuition, seemed slightly better.
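For orientation, the classical parametric forms of the two detection statistics named above can be written as short recursions. With likelihood ratio Λ_n for the n-th observation, Shiryaev-Roberts uses R_n = (1 + R_{n-1})·Λ_n and CUSUM uses W_n = max(0, W_{n-1} + log Λ_n), each raising an alarm when its threshold is crossed. The sketch below assumes a known Gaussian mean shift so that Λ_n has a closed form. The abstract's procedure is a semi-parametric derivative of SR and is not reproduced here, and the thresholds and names are illustrative.

```python
# Sketch only: classical SR and CUSUM recursions under an assumed Gaussian
# mean-shift model (mean 0 before the change, mean `shift` after it).
import math

def log_lr(x, shift=1.0, sigma=1.0):
    """Log-likelihood ratio of N(shift, sigma^2) against N(0, sigma^2)."""
    return shift * x / sigma ** 2 - shift ** 2 / (2.0 * sigma ** 2)

def shiryaev_roberts(xs, threshold, **model):
    """Return the 1-based alarm time of the SR statistic, or None."""
    r = 0.0
    for n, x in enumerate(xs, start=1):
        r = (1.0 + r) * math.exp(log_lr(x, **model))
        if r >= threshold:
            return n
    return None

def cusum(xs, threshold, **model):
    """Return the 1-based alarm time of the CUSUM statistic, or None."""
    w = 0.0
    for n, x in enumerate(xs, start=1):
        w = max(0.0, w + log_lr(x, **model))
        if w >= threshold:
            return n
    return None

# Noise-free toy series: level 0 for 20 steps, then level 1.
series = [0.0] * 20 + [1.0] * 20
print(shiryaev_roberts(series, threshold=20.0), cusum(series, threshold=3.0))
```

In a multi-cyclic setting the statistic would be reset to zero after each alarm and monitoring would continue. The sketch stops at the first alarm for brevity.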
Advances in information technology and its widespread growth in several areas of business, engineering, medical, and scientific studies are resulting in an information/data explosion. Knowledge discovery and decision making from such rapidly growing voluminous data is a challenging task in terms of data organization and processing. Addressing it is an emerging trend known as Big Data computing, a new paradigm that combines large-scale computing, new data-intensive techniques, and mathematical models to build data analytics. Big Data computing demands huge storage and computing capacity for data curation and processing, which could be delivered from on-premise or cloud infrastructures. This paper discusses the evolution of Big Data computing, the differences between traditional data warehousing and Big Data, a taxonomy of Big Data computing and its underpinning technologies, the integrated platform of Big Data and Clouds known as Big Data Clouds, and the layered architecture and components of a Big Data Cloud, and finally discusses open technical challenges and future directions.