The aim of this survey is an attempt to review the kind of machine learning and stochastic techniques and the ways existing work currently uses machine learning and stochastic methods for the challenging problem of visual tracking. It is not intended to study the whole tracking literature of the last decades as this seems impossible by the incredible vast number of published papers. This first draft version of the article focuses very targeted on recent literature that suggests Siamese networks for the learning of tracking. This approach promise a step forward in terms of robustness, accuracy and computational efficiency. For example, the representative tracker SINT performs currently best on the popular OTB-2013 benchmark with AuC/IoU/prec. 65.5/62.5/84.8 % for the one-pass experiment (OPE). The CVPR’17 work CVNet by the Oxford group shows the approach’s large potential of HW/SW co-design with network memory needs around 600 kB and frame-rates of 75 fps and beyond. Before a detailed description of this approach is given, the article recaps the definition of tracking, the current state-of-the-art view on designing algorithms and the state-of-the-art of trackers by summarising insights from existing literature. In future, the article will be extended by the review of two alternative approaches, the one using very general recurrent networks such as the Long Shortterm Memory (LSTM) networks and the other most obvious approach of applying sole convolutional networks (CNN), the earliest approach since the idea of deep learning tracking appeared at NIPS’13. Siamese Learning Visual Tracking: A Survey