Compositional Coding
Text classification is a challenging problem that aims to identify the category of a text. Capsule Networks (CapsNets) have recently been proposed for image classification and have been shown to hold several advantages over Convolutional Neural Networks (CNNs), yet their validity in the text domain remains less explored. Recently, an effective method named deep compositional code learning was proposed, which greatly reduces the number of word-embedding parameters without any significant sacrifice in performance. In this paper, we introduce the Compositional Coding (CC) mechanism between capsules, and we propose a new routing algorithm based on k-means clustering theory. Experiments conducted on eight challenging text classification datasets show that the proposed method achieves accuracy comparable to the state-of-the-art approach with significantly fewer parameters. …
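To make the parameter savings behind compositional coding concrete, here is a minimal Python (NumPy) sketch of compositional code embeddings, assuming each word is assigned M discrete codes over K-entry codebooks and its vector is the sum of the selected codebook rows; the sizes, random values, and names are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

# Minimal sketch of compositional code embeddings: each word gets M discrete
# codes, one per codebook of K entries, and its embedding is the sum of the
# selected codebook vectors. Sizes and values are illustrative assumptions.
M, K, dim, vocab = 8, 16, 128, 10_000

rng = np.random.default_rng(0)
codebooks = rng.normal(size=(M, K, dim))        # M shared codebooks
codes = rng.integers(0, K, size=(vocab, M))     # learned code per word/codebook

def embed(word_id: int) -> np.ndarray:
    """Compose a word embedding from its M selected codebook vectors."""
    return codebooks[np.arange(M), codes[word_id]].sum(axis=0)

print(embed(42).shape)  # (128,)
```

With this scheme, storage is vocab x M small integers plus M x K x dim shared codebook floats, instead of vocab x dim floats for a dense embedding table, which is where the parameter savings come from.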

Semantic Brand Score (SBS)
The Semantic Brand Score (SBS) is a new measure of brand importance calculated on text data, combining methods from social network analysis and semantic analysis. The metric is flexible, as it can be used in different contexts and across products, markets and languages, and it is applicable not only to brands but also to multiple sets of words. The SBS, described together with its three dimensions of brand prevalence, diversity and connectivity, represents a contribution to research on brand equity and on word co-occurrence networks. It can be used to support decision-making processes within companies; for example, it can be applied to forecast a company's stock price or to assess brand importance relative to competitors. On the one hand, the SBS relates to familiar constructs of brand equity; on the other, it offers new perspectives for effective strategic management of brands in the era of big data. …
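A rough sketch of how the three SBS dimensions could be computed on a word co-occurrence network with networkx; the toy corpus, the whole-document co-occurrence window, and the use of term frequency, degree, and betweenness centrality for prevalence, diversity, and connectivity are assumptions for illustration, and the authors' exact preprocessing and standardization are not shown.

```python
import itertools
import networkx as nx

# Toy sketch of the three SBS dimensions on a word co-occurrence network.
docs = [["acme", "quality", "price"],
        ["acme", "service", "quality"],
        ["brandx", "price"]]

G = nx.Graph()
freq = {}
for doc in docs:
    for word in doc:
        freq[word] = freq.get(word, 0) + 1
    for u, v in itertools.combinations(sorted(set(doc)), 2):
        weight = G.get_edge_data(u, v, {"weight": 0})["weight"] + 1
        G.add_edge(u, v, weight=weight)

brand = "acme"
prevalence = freq[brand]                             # frequency of the brand
diversity = G.degree[brand]                          # distinct co-occurring words
connectivity = nx.betweenness_centrality(G)[brand]   # brokerage between words
print(prevalence, diversity, connectivity)
```

The overall score is then obtained by combining the three dimensions, typically after standardizing each of them so that they are comparable across corpora.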

Copula Statistic (CoS)
A new index based on empirical copulas, termed the Copula Statistic (CoS), is introduced for assessing the strength of multivariate dependence and for testing statistical independence. New properties of copulas are proved, which allow us to define the CoS in terms of a relative distance function between the empirical copula, the Fréchet-Hoeffding bounds, and the independence copula. Monte Carlo simulations reveal that for large sample sizes the CoS is approximately normal. This property is utilised to develop a CoS-based statistical test of independence against various noisy functional dependencies. It is shown that this test exhibits higher statistical power than the Total Information Coefficient (TICe), the Distance Correlation (dCor), the Randomized Dependence Coefficient (RDC), and the Copula Correlation (Ccor) for monotonic and circular functional dependencies. Furthermore, the R^2-equitability of the CoS is investigated for estimating the strength of a collection of functional dependencies with additive Gaussian noise. Finally, the CoS is applied to a real stock market data set, from which we infer that a bivariate analysis is insufficient to unveil multivariate dependencies, and to two gene expression data sets of yeast and of E. coli, which allow us to demonstrate the good performance of the CoS. …
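To make the ingredients concrete, the following sketch builds the bivariate empirical copula from rank pseudo-observations and measures its maximum deviation from the independence copula Pi(u, v) = uv on synthetic data; this simple distance is an illustrative proxy for the idea, not the paper's exact CoS formula.

```python
import numpy as np
from scipy.stats import rankdata

# Synthetic data with a noisy functional dependence (illustrative only).
rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = x**2 + 0.1 * rng.normal(size=500)

n = len(x)
u = rankdata(x) / n        # pseudo-observations: normalized ranks
v = rankdata(y) / n

def empirical_copula(s: float, t: float) -> float:
    """C_n(s, t) = fraction of points with u <= s and v <= t."""
    return np.mean((u <= s) & (v <= t))

# Maximum deviation from the independence copula over a grid.
grid = np.linspace(0.05, 0.95, 19)
dist = max(abs(empirical_copula(s, t) - s * t) for s in grid for t in grid)
print(f"max deviation from independence copula: {dist:.3f}")
```

Under independence the empirical copula stays close to uv, so large deviations signal dependence; the CoS refines this idea using a relative distance that also involves the Fréchet-Hoeffding bounds.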

AdaBoost+SVM
The AdaBoost algorithm is known for its resistance to overfitting. Understanding the mysteries of this phenomenon is a fascinating fundamental theoretical problem. Many studies have been devoted to explaining it from the statistical view and from margin theory. In this paper, we explain it from a feature-learning viewpoint and propose the AdaBoost+SVM algorithm, which accounts for AdaBoost's resistance to overfitting in a direct and easily understood way. First, we adopt the AdaBoost algorithm to learn the base classifiers. Then, instead of directly forming a weighted combination of the base classifiers, we regard their outputs as features and input them to an SVM classifier. From this, new coefficients and a bias are obtained, which are used to construct the final classifier. We explain the rationale for this approach and present a theorem stating that as the dimension of these features increases, the performance of the SVM does not degrade, which explains AdaBoost's resistance to overfitting. …
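A minimal Python sketch of the described pipeline using scikit-learn: AdaBoost learns decision-stump base classifiers, their +/-1 predictions become features, and a linear SVM learns new coefficients and a bias over them; the synthetic dataset and the number of estimators are illustrative choices, not the paper's experimental setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import LinearSVC

# Step 1: AdaBoost learns the base classifiers (default base estimators
# are depth-1 decision trees, i.e. decision stumps).
X, y = make_classification(n_samples=500, random_state=0)
ada = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)

# Step 2: instead of AdaBoost's fixed weighted vote, use each stump's
# +/-1 prediction as one feature of a new representation.
F = np.array([est.predict(X) * 2 - 1 for est in ada.estimators_]).T

# Step 3: an SVM re-learns the combination, yielding new coefficients
# (svm.coef_) and bias (svm.intercept_) for the final classifier.
svm = LinearSVC().fit(F, y)
print(svm.score(F, y))
```

The design point is that the SVM's margin-based objective replaces AdaBoost's fixed combination weights, so adding more base-classifier features should not hurt the SVM's performance.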