**Genetic Programming for Reinforcement Learning (GPRL)**

The search for interpretable reinforcement learning policies is of high academic and industrial interest. Especially for industrial systems, domain experts are more likely to deploy autonomously learned controllers if they are understandable and convenient to evaluate. Basic algebraic equations are supposed to meet these requirements, as long as they are restricted to an adequate complexity. Here we introduce the genetic programming for reinforcement learning (GPRL) approach based on model-based batch reinforcement learning and genetic programming, which autonomously learns policy equations from pre-existing default state-action trajectory samples. GPRL is compared to a straight-forward method which utilizes genetic programming for symbolic regression, yielding policies imitating an existing well-performing, but non-interpretable policy. Experiments on three reinforcement learning benchmarks, i.e., mountain car, cart-pole balancing, and industrial benchmark, demonstrate the superiority of our GPRL approach compared to the symbolic regression method. GPRL is capable of producing well-performing interpretable reinforcement learning policies from pre-existing default trajectory data. … **Adaptive Robust Control**

In this paper we propose a new methodology for solving an uncertain stochastic Markovian control problem in discrete time. We call the proposed methodology the adaptive robust control. We demonstrate that the uncertain control problem under consideration can be solved in terms of associated adaptive robust Bellman equation. The success of our approach is to the great extend owed to the recursive methodology for construction of relevant confidence regions. We illustrate our methodology by considering an optimal portfolio allocation problem, and we compare results obtained using the adaptive robust control method with some other existing methods. … **Active Function Cross-Entropy Clustering (afCEC)**

Active function cross-entropy clustering partitions the n-dimensional data into the clusters by finding the parameters of the mixed generalized multivariate normal distribution, that optimally approximates the scattering of the data in the n-dimensional space, whose density function is of the form: p_1*N(mi_1,^sigma_1,sigma_1,f_1)+…+p_k*N(mi_k,^sigma_k,sigma_k,f_k). The above-mentioned generalization is performed by introducing so called ‘f-adapted Gaussian densities’ (i.e. the ordinary Gaussian densities adapted by the ‘active function’). Additionally, the active function cross-entropy clustering performs the automatic reduction of the unnecessary clusters. For more information please refer to P. Spurek, J. Tabor, K.Byrski, ‘Active function Cross-Entropy Clustering’ (2017) <doi:10.1016/j.eswa.2016.12.011>. …

# If you did not already know

**16**
*Friday*
Mar 2018

Posted What is ...

in
Advertisements