Modular, Optimal Learning Testing Environment (MOLTE) google
We address the relative paucity of empirical testing of learning algorithms (of any type) by introducing a new public-domain Modular, Optimal Learning Testing Environment (MOLTE) for Bayesian ranking and selection problems, stochastic bandits, and sequential experimental design problems. The Matlab-based simulator allows the comparison of a number of learning policies (represented as a series of .m modules) in the context of a wide range of problems (each represented in its own .m module), which makes it easy to add new algorithms and new test problems. State-of-the-art policies and various problem classes are provided in the package. The choice of problems and policies is guided through a spreadsheet-based interface. Different graphical metrics are included. MOLTE is designed to be compatible with parallel computing to scale up from local desktops to clusters and clouds. We offer MOLTE as an easy-to-use tool for the research community that will make it possible to perform much more comprehensive testing, spanning a broader selection of algorithms and test problems. We demonstrate the capabilities of MOLTE through a series of comparisons of policies on a starter library of test problems. We also address the problems of tuning and constructing priors, which have been largely overlooked in the optimal learning literature. We envision MOLTE as a modest spur, providing researchers with an easy environment in which to study interesting questions in optimal learning. …
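MOLTE itself is distributed as Matlab .m modules, but the modular idea it implements can be sketched in a few lines. Below is a hypothetical Python illustration (the class and function names are assumptions, not the MOLTE interface): each test problem and each learning policy is a plug-in component, and a small harness compares policies by their average opportunity cost over repeated runs.

```python
# Hypothetical sketch of a MOLTE-style comparison harness (not the actual
# MOLTE interface, which consists of Matlab .m modules): policies and test
# problems are plug-in components compared on mean opportunity cost.
import numpy as np

class GaussianBanditProblem:
    """Test problem: k alternatives with unknown Gaussian means."""
    def __init__(self, k, rng):
        self.means = rng.normal(0.0, 1.0, size=k)
        self.rng = rng
    def sample(self, arm):
        return self.means[arm] + self.rng.normal(0.0, 1.0)
    def opportunity_cost(self, chosen):
        return self.means.max() - self.means[chosen]

def epsilon_greedy(estimates, counts, t, rng, eps=0.1):
    if rng.random() < eps:
        return int(rng.integers(len(estimates)))
    return int(np.argmax(estimates))

def ucb(estimates, counts, t, rng, c=2.0):
    bonus = np.sqrt(c * np.log(t + 1) / np.maximum(counts, 1))
    return int(np.argmax(estimates + bonus))

def run(problem, policy, budget, rng):
    k = len(problem.means)
    estimates, counts = np.zeros(k), np.zeros(k)
    for t in range(budget):
        arm = policy(estimates, counts, t, rng)
        reward = problem.sample(arm)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    # Final recommendation is the alternative believed to be best.
    return problem.opportunity_cost(int(np.argmax(estimates)))

rng = np.random.default_rng(0)
for name, policy in [("eps-greedy", epsilon_greedy), ("UCB", ucb)]:
    costs = [run(GaussianBanditProblem(10, rng), policy, 200, rng) for _ in range(50)]
    print(name, "mean opportunity cost:", np.mean(costs))
```

Adding a new policy or test problem then amounts to dropping in another function or class, mirroring how MOLTE adds new .m modules.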

Vanishing Gradient Problem google
In machine learning, the vanishing gradient problem is a difficulty found in training artificial neural networks with gradient-based learning methods and backpropagation. In such methods, each of the neural network’s weights receives an update proportional to the partial derivative of the error function with respect to the current weight in each iteration of training. The problem is that in some cases the gradient will be vanishingly small, effectively preventing the weight from changing its value. In the worst case, this may completely stop the neural network from further training. As one example of the problem’s cause, traditional activation functions such as the hyperbolic tangent function have derivatives in the range (0, 1], and backpropagation computes gradients by the chain rule. This has the effect of multiplying n of these small numbers to compute gradients of the ‘front’ layers in an n-layer network, meaning that the gradient (error signal) decreases exponentially with n and the front layers train very slowly. Backpropagation allowed researchers to train supervised deep artificial neural networks from scratch, initially with little success. Hochreiter’s diploma thesis of 1991[1][2] formally identified the reason for this failure as the ‘vanishing gradient problem’, which not only affects many-layered feedforward networks,[3] but also recurrent networks.[4] The latter are trained by unfolding them into very deep feedforward networks, where a new layer is created for each time step of an input sequence processed by the network. When activation functions are used whose derivatives can take on larger values, one risks encountering the related exploding gradient problem. …
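A small numerical experiment makes the exponential decay concrete. The sketch below (an assumed toy setup, not tied to any particular network) backpropagates a unit gradient through a stack of tanh layers and prints how much of it survives by the time it reaches the first layer.

```python
# Minimal illustration (assumed toy setup): backpropagating through n tanh
# layers multiplies n derivatives lying in (0, 1], so the gradient reaching
# the first ('front') layer shrinks roughly exponentially with depth.
import numpy as np

rng = np.random.default_rng(0)
width, depth = 32, 30
weights = [rng.normal(0, 1.0 / np.sqrt(width), (width, width)) for _ in range(depth)]

# Forward pass through tanh layers, keeping pre-activations for backprop.
x = rng.normal(size=width)
activations, pre_acts = [x], []
for W in weights:
    z = W @ activations[-1]
    pre_acts.append(z)
    activations.append(np.tanh(z))

# Backward pass: the chain rule multiplies by diag(1 - tanh(z)^2) and W^T
# at every layer on the way back to the input.
grad = np.ones(width)                      # gradient at the output layer
norms = []
for W, z in zip(reversed(weights), reversed(pre_acts)):
    grad = W.T @ (grad * (1.0 - np.tanh(z) ** 2))
    norms.append(np.linalg.norm(grad))

print("gradient norm at last hidden layer   :", norms[0])
print("gradient norm at first ('front') layer:", norms[-1])
```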

Beholder-GAN google
Beauty is in the eye of the beholder. This maxim, emphasizing the subjectivity of the perception of beauty, has enjoyed a wide consensus since ancient times. In the digital era, data-driven methods have been shown to be able to predict human-assigned beauty scores for facial images. In this work, we augment this ability and train a generative model that generates faces conditioned on a requested beauty score. In addition, we show how this trained generator can be used to beautify an input face image. By doing so, we achieve an unsupervised beautification model, in the sense that it relies on no ground truth target images. …
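As a rough illustration of the conditioning mechanism described above, the sketch below (an assumed toy architecture, not the actual Beholder-GAN model) feeds a scalar beauty score into a generator alongside the noise vector, so the same latent code can be re-rendered with a higher requested score.

```python
# Conceptual sketch (assumed toy architecture, not the Beholder-GAN model):
# a generator conditioned on a scalar beauty score; re-running the same
# latent code with a higher score is the basic beautification idea.
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    def __init__(self, z_dim=64, img_dim=3 * 32 * 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + 1, 256), nn.ReLU(),   # +1 input for the beauty score
            nn.Linear(256, img_dim), nn.Tanh(),
        )
    def forward(self, z, beauty):
        return self.net(torch.cat([z, beauty], dim=1))

gen = ConditionalGenerator()
z = torch.randn(1, 64)
plain = gen(z, torch.tensor([[0.3]]))       # face at a low requested score
beautified = gen(z, torch.tensor([[0.9]]))  # same latent, higher requested score
print(plain.shape, beautified.shape)
```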

Multiple Graph Optimized Convolutional Network (M-GOCN) google
Graph Convolutional Networks (GCNs) have been widely studied for graph data representation and learning tasks. Existing GCNs generally use a fixed single graph, which may be suboptimal for data representation/learning, and they are also hard to extend to multiple graphs. To address these issues, we propose a novel Graph Optimized Convolutional Network (GOCN) for graph data representation and learning. GOCN is motivated by our re-interpretation of graph convolution within a regularization/optimization framework. The core idea of GOCN is to formulate graph optimization and graph convolutional representation in a unified framework and thus to conduct both cooperatively, boosting their respective performance in the GCN learning scheme. Moreover, based on the proposed unified graph optimization-convolution framework, we propose a novel Multiple Graph Optimized Convolutional Network (M-GOCN) to naturally handle data with multiple graphs. Experimental results demonstrate the effectiveness and benefits of the proposed GOCN and M-GOCN. …
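For reference, the propagation rule that GOCN re-interprets is the standard graph convolution H' = σ(Â H W) with Â = D^{-1/2}(A + I)D^{-1/2}. The sketch below (assumed notation, not the authors' code) implements this single layer on a toy graph; GOCN's contribution is, roughly, to optimize the graph itself jointly with the convolutional representation rather than keeping Â fixed.

```python
# Sketch of the standard GCN layer H' = ReLU(A_hat H W) that GOCN builds on
# (assumed notation, not the authors' code); GOCN additionally optimizes the
# graph instead of fixing A_hat.
import numpy as np

def normalized_adjacency(A):
    """A_hat = D^{-1/2} (A + I) D^{-1/2}, with self-loops added."""
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt

def gcn_layer(A_hat, H, W):
    return np.maximum(A_hat @ H @ W, 0.0)    # ReLU(A_hat H W)

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)    # toy 4-node graph
H = rng.normal(size=(4, 5))                  # node features
W = rng.normal(size=(5, 8))                  # layer weights
A_hat = normalized_adjacency(A)
print(gcn_layer(A_hat, H, W).shape)          # (4, 8)
```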