Deep Differential Recurrent Neural Network (DDRNN)
Thanks to their special gating schemes, Long Short-Term Memory (LSTM) networks have shown greater potential for processing complex sequential information than traditional Recurrent Neural Networks (RNNs). The conventional LSTM, however, fails to take into account the salient spatio-temporal dynamics present in sequential input data. This problem was first addressed by the differential Recurrent Neural Network (dRNN), which uses a differential gating scheme known as Derivative of States (DoS). DoS uses higher orders of internal state derivatives to analyze the change in information gain caused by salient motions between successive frames. A weighted combination of several orders of DoS is then used to modulate the gates in dRNN. While each individual order of DoS is good at modeling a certain level of salient spatio-temporal sequences, summing all the orders of DoS can distort the detected motion patterns. To address this problem, we propose to control the LSTM gates via individual orders of DoS and to stack multiple levels of LSTM cells in increasing order of state derivatives. The proposed model progressively builds up the ability of the LSTM gates to detect salient dynamical patterns, with deeper stacked layers modeling higher orders of DoS; the resulting model is therefore termed the deep differential Recurrent Neural Network (d2RNN). The effectiveness of the proposed model is demonstrated on two publicly available human activity datasets, NUS-HGA and Violent-Flows, where it outperforms both LSTM and non-LSTM based state-of-the-art algorithms. …
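A minimal PyTorch sketch of the stacking idea: each deeper level runs an LSTM over the first-order temporal difference of the hidden states from the level below, so level k loosely plays the role of the k-th order DoS. The class and argument names (DeepDifferentialRNN, num_levels, and so on) are our own illustrative assumptions, not the authors' implementation, which additionally uses the DoS to modulate the LSTM gates themselves.

```python
import torch
import torch.nn as nn

class DeepDifferentialRNN(nn.Module):
    """Sketch: stack LSTMs so that level k sees the k-th order
    temporal difference of the level below (a rough proxy for DoS)."""

    def __init__(self, input_size, hidden_size, num_levels=3, num_classes=10):
        super().__init__()
        sizes = [input_size] + [hidden_size] * (num_levels - 1)
        self.lstms = nn.ModuleList(
            [nn.LSTM(s, hidden_size, batch_first=True) for s in sizes])
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, x):                      # x: (batch, time, features)
        seq = x
        last_hidden = None
        for lstm in self.lstms:
            seq, (h_n, _) = lstm(seq)
            last_hidden = h_n[-1]
            # First-order temporal difference of the hidden states feeds
            # the next level, emulating one higher order of DoS.
            seq = seq[:, 1:, :] - seq[:, :-1, :]
        return self.classifier(last_hidden)

# Usage: classify a batch of 8 sequences of 20 frames with 64-dim features.
model = DeepDifferentialRNN(input_size=64, hidden_size=128, num_levels=3)
logits = model(torch.randn(8, 20, 64))        # -> shape (8, 10)
```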
Relational Concept Analysis (RCA)
The processing of complex data is among the major concerns of knowledge discovery from data (KDD). Indeed, a large part of the data worth analyzing is stored in relational databases and, more recently, on the Web of Data. This clearly underscores the need for Entity-Relationship and RDF-compliant data mining (DM) tools. We study an approach to the underlying multi-relational data mining (MRDM) problem which relies on formal concept analysis (FCA) as a framework for clustering and classification. Our relational concept analysis (RCA) extends FCA to the processing of multi-relational datasets, i.e., datasets with multiple sorts of individuals, each provided with its own set of attributes, and relationships among those individuals. Given such a dataset, RCA constructs a set of concept lattices, one per object sort, through an iterative analysis process that converges to a fixed point. In doing so, it abstracts the links between objects into attributes akin to role restrictions from description logics (DLs). We address key aspects of the iterative calculation, such as the evolution of data descriptions along the iterations and the termination of the process. We describe implementations of RCA and list applications to problems from software and knowledge engineering.
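As a toy illustration of one RCA iteration, the Python sketch below computes the formal concepts of a small context by brute force and then "scales" a relation into existential attributes of the form "exists r:C", in the spirit of the DL role restrictions mentioned above. The function names and the miniature dataset are illustrative assumptions; real RCA implementations repeat this scaling-and-lattice-building step until the fixed point is reached.

```python
from itertools import chain, combinations

def powerset(iterable):
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def common_attrs(objs, ctx, all_attrs):
    """Attributes shared by all objects in objs (FCA derivation operator)."""
    if not objs:
        return set(all_attrs)
    return set.intersection(*(set(ctx[o]) for o in objs))

def objects_with(attrs, ctx):
    """Objects owning every attribute in attrs (the dual derivation)."""
    return {o for o, a in ctx.items() if attrs <= set(a)}

def concepts(ctx):
    """All (extent, intent) pairs of a binary context, brute force."""
    all_attrs = set().union(*ctx.values())
    result = []
    for objs in powerset(ctx):
        intent = common_attrs(set(objs), ctx, all_attrs)
        extent = objects_with(intent, ctx)
        if extent == set(objs):               # closed pair => formal concept
            result.append((extent, intent))
    return result

def scale_existential(ctx, relation, target_concepts, rel_name):
    """One RCA step: object o gains 'exists rel:ck' if o is linked to
    some member of concept ck's extent on the other sort."""
    new_ctx = {o: set(attrs) for o, attrs in ctx.items()}
    for k, (extent, _) in enumerate(target_concepts):
        for o in new_ctx:
            if relation.get(o, set()) & extent:
                new_ctx[o].add(f'exists {rel_name}:c{k}')
    return new_ctx

# Toy multi-relational dataset: researchers and the tools they use.
tools = {'t1': {'open_source'}, 't2': {'open_source', 'gui'}, 't3': {'gui'}}
people = {'p1': {'senior'}, 'p2': set()}
uses = {'p1': {'t1'}, 'p2': {'t2', 't3'}}

tool_concepts = concepts(tools)
people = scale_existential(people, uses, tool_concepts, 'uses')
# Re-running concepts(people) and re-scaling until nothing changes
# reaches the fixed point mentioned above.
```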
On-demand Relational Concept Analysis …
Majority-CRF
We explore active learning (AL) utterance selection for improving the accuracy of a natural language understanding (NLU) system on new, underrepresented domains. We propose an AL algorithm called Majority-CRF that uses an ensemble of classification and sequence labeling models to guide utterance selection for annotation. Experiments on three domains show that Majority-CRF achieves a 6.6%-9% relative error rate reduction compared to random sampling with the same annotation budget, and statistically significant improvements over other AL approaches. Additionally, case studies with human-in-the-loop AL on six new domains show a 4.6%-9% improvement on an existing NLU system. …
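The abstract does not spell out the algorithm, but the scikit-learn sketch below shows one plausible reading of ensemble-guided selection: several simple in-domain vs. out-of-domain classifiers vote on each unlabeled utterance, and the utterances with the most contested votes are sent for annotation. The function name and feature choices are our assumptions; the actual Majority-CRF additionally incorporates a CRF sequence labeling model.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

def select_for_annotation(labeled_texts, labels, pool_texts, budget=5):
    """Sketch of ensemble-guided AL: train several binary classifiers
    (1 = target domain, 0 = other), have them vote on the unlabeled
    pool, and pick the utterances with the most disagreement."""
    vec = TfidfVectorizer(ngram_range=(1, 2))
    X = vec.fit_transform(labeled_texts)
    X_pool = vec.transform(pool_texts)
    ensemble = [LogisticRegression(max_iter=1000), LinearSVC(), MultinomialNB()]
    votes = np.zeros(X_pool.shape[0])
    for clf in ensemble:
        clf.fit(X, labels)
        if hasattr(clf, 'decision_function'):
            margin = clf.decision_function(X_pool)
        else:
            margin = clf.predict_proba(X_pool)[:, 1] - 0.5
        votes += np.sign(margin)              # +1 in-domain, -1 out
    # Small |vote sum| = classifiers disagree = most informative to label.
    order = np.argsort(np.abs(votes))
    return [pool_texts[i] for i in order[:budget]]

# Usage with toy data (1 = music domain, 0 = other).
picked = select_for_annotation(
    ['play some jazz', 'order a pizza', 'play my playlist', 'call mom'],
    [1, 0, 1, 0],
    ['play rock music', 'book a taxi', 'pizza near me', 'play something'],
    budget=2)
```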
Large Scale Incremental Learning
Modern machine learning suffers from catastrophic forgetting when learning new classes incrementally: performance degrades dramatically because data from the old classes is missing. Incremental learning methods have been proposed to retain the knowledge acquired from old classes by using knowledge distillation and by keeping a few exemplars from the old classes. These methods, however, struggle to scale up to a large number of classes. We believe this is due to the combination of two factors: (a) the data imbalance between the old and new classes, and (b) the increasing number of visually similar classes. Distinguishing between a growing number of visually similar classes is particularly challenging when the training data is imbalanced. We propose a simple and effective method to address this data imbalance issue. We found that the last fully connected layer has a strong bias towards the new classes, and that this bias can be corrected by a linear model. With two bias parameters, our method performs remarkably well on two large datasets, ImageNet (1000 classes) and MS-Celeb-1M (10000 classes), outperforming the state-of-the-art algorithms by 11.1% and 13.2%, respectively. …
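The two-parameter correction is small enough to sketch directly: a scale and a shift are fit on a small held-out set (with the rest of the network frozen) and applied only to the new-class logits, leaving the old-class logits untouched. A minimal PyTorch rendering of that idea, with names of our own choosing:

```python
import torch
import torch.nn as nn

class BiasCorrection(nn.Module):
    """Two-parameter linear correction applied to new-class logits only."""

    def __init__(self, num_old_classes):
        super().__init__()
        self.num_old = num_old_classes
        self.alpha = nn.Parameter(torch.ones(1))   # scale for new classes
        self.beta = nn.Parameter(torch.zeros(1))   # shift for new classes

    def forward(self, logits):                     # logits: (batch, classes)
        old, new = logits[:, :self.num_old], logits[:, self.num_old:]
        return torch.cat([old, self.alpha * new + self.beta], dim=1)

# Fit alpha/beta on a small balanced validation set while the rest of
# the network stays frozen (logits here are stand-ins for its outputs).
bic = BiasCorrection(num_old_classes=80)
optimizer = torch.optim.SGD(bic.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()
logits = torch.randn(16, 100)                      # 80 old + 20 new classes
targets = torch.randint(0, 100, (16,))
optimizer.zero_grad()
loss = criterion(bic(logits), targets)
loss.backward()
optimizer.step()
```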