Convolutional Block Attention Module (CBAM)
We propose Convolutional Block Attention Module (CBAM), a simple yet effective attention module for feed-forward convolutional neural networks. Given an intermediate feature map, our module sequentially infers attention maps along two separate dimensions, channel and spatial, then the attention maps are multiplied to the input feature map for adaptive feature refinement. Because CBAM is a lightweight and general module, it can be integrated into any CNN architectures seamlessly with negligible overheads and is end-to-end trainable along with base CNNs. We validate our CBAM through extensive experiments on ImageNet-1K, MS~COCO detection, and VOC~2007 detection datasets. Our experiments show consistent improvements in classification and detection performances with various models, demonstrating the wide applicability of CBAM. The code and models will be publicly available. …
Full-Jacobian
Non-linear functions such as neural networks can be locally approximated by affine planes. Recent works make use of input-Jacobians, which describe the normal to these planes. In this paper, we introduce full-Jacobians, which includes this normal along with an additional intercept term called the bias-Jacobians, that together completely describe local planes. For ReLU neural networks, bias-Jacobians correspond to sums of gradients of outputs w.r.t. intermediate layer activations. We first use these full-Jacobians for distillation by aligning gradients of their intermediate representations. Next, we regularize bias-Jacobians alone to improve generalization. Finally, we show that full-Jacobian maps can be viewed as saliency maps. Experimental results show improved distillation on small data-sets, improved generalization for neural network training, and sharper saliency maps. …
Ordinal Forests (OF)
Ordinal forests (OF) are a method for ordinal regression with high-dimensional and low-dimensional data that is able to predict the values of the ordinal target variable for new observations and at the same time estimate the relative widths of the classes of the ordinal target variable. Using a (permutation-based) variable importance measure it is moreover possible to rank the importances of the covariates. …
Student’s t-Generative Adversarial Network
Generative Adversarial Networks (GANs) have a great performance in image generation, but they need a large scale of data to train the entire framework, and often result in nonsensical results. We propose a new method referring to conditional GAN, which equipments the latent noise with mixture of Student’s t-distribution with attention mechanism in addition to class information. Student’s t-distribution has long tails that can provide more diversity to the latent noise. Meanwhile, the discriminator in our model implements two tasks simultaneously, judging whether the images come from the true data distribution, and identifying the class of each generated images. The parameters of the mixture model can be learned along with those of GANs. Moreover, we mathematically prove that any multivariate Student’s t-distribution can be obtained by a linear transformation of a normal multivariate Student’s t-distribution. Experiments comparing the proposed method with typical GAN, DeliGAN and DCGAN indicate that, our method has a great performance on generating diverse and legible objects with limited data. …
If you did not already know
23 Sunday Jan 2022
Posted What is ...
in