Correlated Anomaly Detection (CAD)
Correlated anomaly detection (CAD) from streaming data is a type of group anomaly detection and an essential task in useful real-time data mining applications like botnet detection, financial event detection, industrial process monitor, etc. …
TaxoGen
Taxonomy construction is not only a fundamental task for semantic analysis of text corpora, but also an important step for applications such as information filtering, recommendation, and Web search. Existing pattern-based methods extract hypernym-hyponym term pairs and then organize these pairs into a taxonomy. However, by considering each term as an independent concept node, they overlook the topical proximity and the semantic correlations among terms. In this paper, we propose a method for constructing topic taxonomies, wherein every node represents a conceptual topic and is defined as a cluster of semantically coherent concept terms. Our method, TaxoGen, uses term embeddings and hierarchical clustering to construct a topic taxonomy in a recursive fashion. To ensure the quality of the recursive process, it consists of: (1) an adaptive spherical clustering module for allocating terms to proper levels when splitting a coarse topic into fine-grained ones; (2) a local embedding module for learning term embeddings that maintain strong discriminative power at different levels of the taxonomy. Our experiments on two real datasets demonstrate the effectiveness of TaxoGen compared with baseline methods. …
Hierarchical Navigable Small World (HNSW,Hierarchical NSW)
We present a new algorithm for the approximate nearest neighbor search based on navigable small world graphs with controllable hierarchy (Hierarchical NSW) admitting simple insertion, deletion and K-nearest neighbor queries. The Hierarchical NSW is a fully graph-based approach without a need for additional search structures (such as kd-trees or Cartesian concatenation) typically used at coarse search stage of the most proximity graph techniques. The algorithm incrementally builds a layered structure consisting from hierarchical set of proximity graphs (layers) for nested subsets of the stored elements. The maximum layer in which an element is present is selected randomly with exponentially decaying probability distribution. This allows producing graphs similar to the previously studied Navigable Small World (NSW) structures while additionally having the links separated by their characteristic distance scales. Starting search from the upper layer instead of random seeds together with utilizing the scale separation boosts the performance compared to the NSW and allows a logarithmic complexity scaling. Additional employment of a simple heuristic for selecting proximity graph neighbors increases performance at high recall and in case of highly clustered data. Performance evaluation on a large number of datasets has demonstrated that the proposed general metric space method is able to strongly outperform many previous state-of-art vector-only approaches such as FLANN, FALCONN and Annoy. Similarity of the algorithm to a well-known 1D skip list structure allows straightforward efficient and balanced distributed implementation.
A Comparative Study on Hierarchical Navigable Small World Graphs …
Imitation Network
In this paper, we propose imitation networks, a simple but effective method for training neural networks with a limited amount of training data. Our approach inherits the idea of knowledge distillation that transfers knowledge from a deep or wide reference model to a shallow or narrow target model. The proposed method employs this idea to mimic predictions of reference estimators that are much more robust against overfitting than the network we want to train. Different from almost all the previous work for knowledge distillation that requires a large amount of labeled training data, the proposed method requires only a small amount of training data. Instead, we introduce pseudo training examples that are optimized as a part of model parameters. Experimental results for several benchmark datasets demonstrate that the proposed method outperformed all the other baselines, such as naive training of the target model and standard knowledge distillation. …
If you did not already know
20 Monday Mar 2023
Posted What is ...
in