DeepShift
Deep learning models, especially deep convolutional neural networks (DCNNs), have achieved high accuracy in several computer vision applications. However, for deployment in mobile environments, their high computation and power budgets are a major bottleneck. Convolution layers and fully connected layers, because of their intensive use of multiplications, are the dominant contributors to this computation budget. This paper proposes to tackle the problem by introducing two new operations, convolutional shifts and fully-connected shifts, which replace multiplications altogether with bitwise shifts and bitwise negation. Neural network architectures built from these operations are referred to as DeepShift models. With such multiplication-free DeepShift models, the authors obtain accuracies of up to 93.6% on the CIFAR10 dataset and Top-1/Top-5 accuracies of 70.9%/90.13% on the ImageNet dataset. Extensive testing on various well-known CNN architectures, after converting all of their convolution and fully connected layers to their bitwise-shift counterparts, shows that in some architectures the Top-1 accuracy drops by less than 4% and the Top-5 accuracy by less than 1.5%. The experiments were conducted in the PyTorch framework, and the code for training and inference is submitted along with the paper and will be made available online. …
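To make the core trick concrete, here is a minimal NumPy sketch (not the authors' implementation; all names are illustrative) of a multiplication-free dot product in which every weight is constrained to a signed power of two, so multiplying by it reduces to a sign flip plus a shift. On float inputs the shift is emulated with a power of two; real hardware would use an integer bitwise shift.

```python
import numpy as np

def shift_linear(x, signs, shifts):
    """Dot product where each weight is sign * 2**shift, so x * w becomes
    a sign flip plus a shift (emulated here with np.exp2 for float inputs)."""
    # signs in {-1, +1}; shifts are integers (negative shifts scale down)
    return (x * signs * np.exp2(shifts)).sum(axis=-1)

# Toy usage: 4 inputs, weights constrained to signed powers of two
x = np.array([0.5, -1.25, 2.0, 0.75])
signs = np.array([1, -1, 1, -1])
shifts = np.array([-1, 0, -2, 1])       # effective weights: 0.5, -1.0, 0.25, -2.0
print(shift_linear(x, signs, shifts))   # matches np.dot(x, signs * np.exp2(shifts))
```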

L1-Norm Batch Normalization (L1BN)
Batch Normalization (BN) has been proven to be quite effective at accelerating and improving the training of deep neural networks (DNNs). However, BN brings additional computation, consumes more memory, and generally slows down training by a large margin, which aggravates the training effort. Furthermore, the nonlinear square and square-root operations in BN also impede low bit-width quantization techniques, which draw much attention in the deep learning hardware community. In this work, we propose an L1-norm BN (L1BN) with only linear operations in both the forward and the backward propagations during training. L1BN is shown to be approximately equivalent to the original L2-norm BN (L2BN) up to multiplication by a scaling factor. Experiments on various convolutional neural networks (CNNs) and generative adversarial networks (GANs) reveal that L1BN maintains almost the same accuracy and convergence rate as L2BN but with higher computational efficiency. On an FPGA platform, the proposed signum and absolute-value operations in L1BN achieve a 1.5$\times$ speedup and save 50\% power consumption compared with the original costly square and root operations, respectively. This hardware-friendly normalization method not only surpasses L2BN in speed, but also simplifies the hardware design of ASIC accelerators with higher energy efficiency. Last but not least, L1BN promises a fully quantized training of DNNs, which is crucial for future adaptive terminal devices. …
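The following NumPy sketch is an assumption about the shape of the method, not the paper's code: the standard deviation used by L2BN is replaced by the mean absolute deviation, rescaled by the constant $\sqrt{\pi/2}$ so that it approximates the standard deviation when activations are roughly Gaussian. The per-sample computation then needs only means, absolute values, and signs.

```python
import numpy as np

def l1_batch_norm(x, gamma, beta, eps=1e-5):
    """Sketch of an L1-norm batch-norm forward pass over the batch axis.
    The sqrt(mean((x - mu)**2)) of standard (L2) BN is replaced by the mean
    absolute deviation, rescaled by sqrt(pi/2) (a fixed constant) so it
    approximates the standard deviation for roughly Gaussian activations."""
    mu = x.mean(axis=0)
    # L1 "deviation": only |.| and mean per sample, no square or square root
    sigma_l1 = np.sqrt(np.pi / 2.0) * np.abs(x - mu).mean(axis=0)
    x_hat = (x - mu) / (sigma_l1 + eps)
    return gamma * x_hat + beta

# Toy usage on a batch of 8 four-dimensional activations
x = np.random.randn(8, 4).astype(np.float32)
out = l1_batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
```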

SuperSCS
We present SuperSCS: a fast and accurate method for solving large-scale convex conic problems. SuperSCS combines the SuperMann algorithmic framework with Douglas-Rachford splitting applied to the homogeneous self-dual embedding of conic optimization problems: a model that simultaneously encodes the optimality conditions and the infeasibility/unboundedness certificates of the original problem. SuperMann allows the use of fast quasi-Newtonian directions such as a modified restarted Broyden-type direction and Anderson's acceleration. …
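As background, the plain Douglas-Rachford fixed-point iteration that SuperSCS builds on can be sketched in a few lines. This is a generic illustration on a toy feasibility problem, not the SuperSCS solver or its API; SuperSCS wraps iterations of this kind in the SuperMann scheme and accelerates them with quasi-Newton (Broyden / Anderson) directions.

```python
import numpy as np

def douglas_rachford(prox_f, prox_g, z0, n_iter=200):
    """Basic Douglas-Rachford splitting for minimizing f + g:
    x = prox_g(z); y = prox_f(2x - z); z <- z + y - x."""
    z = z0.copy()
    for _ in range(n_iter):
        x = prox_g(z)
        y = prox_f(2.0 * x - z)
        z = z + y - x            # fixed-point update on the "shadow" variable
    return prox_g(z)             # the primal iterate

# Toy usage: find a point in the intersection of a box and a halfspace,
# where both prox operators reduce to Euclidean projections.
proj_box = lambda v: np.clip(v, -1.0, 1.0)

def proj_halfspace(v, a=np.array([1.0, 1.0]), b=0.5):   # {x : a.x <= b}
    viol = max(0.0, (a @ v - b) / (a @ a))
    return v - viol * a

x_star = douglas_rachford(proj_halfspace, proj_box, z0=np.array([3.0, -2.0]))
```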

Time-Variant System
A time-variant system is a system that is not time-invariant (TIV). Roughly speaking, its output characteristics depend explicitly upon time. In other words, it is a system in which certain quantities governing its behavior change with time, so that it responds differently to the same input applied at different times. …
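A minimal, assumed example (not from the entry itself): the discrete-time system y[n] = n · x[n] is time-variant, because delaying the input does not simply delay the output.

```python
import numpy as np

# y[n] = n * x[n]: the gain depends explicitly on the time index n,
# so the system is time-variant.
def system(x):
    n = np.arange(len(x))
    return n * x

x = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])   # impulse at n = 0
x_delayed = np.roll(x, 2)                       # same impulse at n = 2

y_shifted_output = np.roll(system(x), 2)        # shift of the original output: all zeros
y_delayed_input = system(x_delayed)             # response to the shifted input: 2 at n = 2

# For a time-invariant system these two would match; here they do not.
print(np.array_equal(y_delayed_input, y_shifted_output))   # False
```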