EnsembleNet google
Ensembling is a universally useful approach to boost the performance of machine learning models. However, individual models in an ensemble are typically trained independently in separate stages, without information access about the overall ensemble. In this paper, model ensembles are treated as first-class citizens, and their performance is optimized end-to-end with parameter sharing and a novel loss structure that improves generalization. On large-scale datasets including ImageNet, Youtube-8M, and Kinetics, we demonstrate a procedure that starts from a strongly performing single deep neural network, and constructs an EnsembleNet that has both a smaller size and better performance. Moreover, an EnsembleNet can be trained in one stage just like a single model without manual intervention. …

COMpact Image Captioning (COMIC) google
Recent works in image captioning have shown very promising raw performance. However, we realize that most of these encoder-decoder style networks with attention do not scale naturally to large vocabulary size, making them difficult to be deployed on embedded system with limited hardware resources. This is because the size of word and output embedding matrices grow proportionally with the size of vocabulary, adversely affecting the compactness of these networks. To address this limitation, this paper introduces a brand new idea in the domain of image captioning. That is, we tackle the problem of compactness of image captioning models which is hitherto unexplored. We showed that, our proposed model, named COMIC for COMpact Image Captioning, achieves comparable results in five common evaluation metrics with state-of-the-art approaches on both MS-COCO and InstaPIC-1.1M datasets despite having an embedding vocabulary size that is 39x – 99x smaller. …

Self-Adaptive Neuro-Fuzzy Inference System (SANFIS) google
This paper presents a self-adaptive neuro-fuzzy inference system (SANFIS) that is capable of self-adapting and self-organizing its internal structure to acquire a parsimonious rule-base for interpreting the embedded knowledge of a system from the given training data set. A connectionist topology of fuzzy basis functions with their universal approximation capability is served as a fundamental SANFIS architecture that provides an elasticity to be extended to all existing fuzzy models whose consequent could be fuzzy term sets, fuzzy singletons, or functions of linear combination of input variables. Without a priori knowledge of the distribution of the training data set, a novel mapping-constrained agglomerative clustering algorithm is devised to reveal the true cluster configuration in a single pass for an initial SANFIS construction, estimating the location and variance of each cluster. Subsequently, a fast recursive linear/nonlinear least-squares algorithm is performed to further accelerate the learning convergence and improve the system performance. Good generalization capability, fast learning convergence and compact comprehensible knowledge representation summarize the strength of SANFIS. Computer simulations for the Iris, Wisconsin breast cancer, and wine classifications show that SANFIS achieves significant improvements in terms of learning convergence, higher accuracy in recognition, and a parsimonious architecture. …

Statistical Matching Problem google
The statistical matching problem is a data integration problem with structured missing data. The general form involves the analysis of multiple datasets that only have a strict subset of variables jointly observed across all datasets. The simplest version involves two datasets, labelled A and B, with three variables of interest $X, Y$ and $Z$. Variables $X$ and $Y$ are observed in dataset A and variables $X$ and $Z$ are observed in dataset $B$. Statistical inference is complicated by the absence of joint $(Y, Z)$ observations. Parametric modelling can be challenging due to identifiability issues and the difficulty of parameter estimation. …