**Beginning deep learning with 500 lines of Julia**

There are a number of deep learning packages out there. However, most sacrifice readability for efficiency. This has two disadvantages: (1) It is difficult for a beginner to understand what the code is doing, which is a shame because sometimes the code can be a lot simpler than the underlying math. (2) Every other day new ideas come out for optimization, regularization, etc. If the package you use already implements the trick, great. But if not, it is difficult for a researcher to test the new idea using impenetrable code with a steep learning curve. So I started writing KUnet.jl, which currently implements backprop with basic units like relu, standard loss functions like softmax, dropout for generalization, L1 and L2 regularization, and optimization using SGD, momentum, ADAGRAD, Nesterov’s accelerated gradient, etc. in less than 500 lines of Julia code. Its speed is competitive with the fastest GPU packages. For installation and usage information, please refer to the GitHub repo. The remainder of this post will present (a slightly cleaned up version of) the code as a beginner’s neural network tutorial (modeled after Honnibal’s excellent parsing example).

**Matrix Cheatsheet**

The Matrix Cheatsheet by Sebastian Raschka is licensed under a Creative Commons Attribution 4.0 International License.

**Markov Chains – A visual explanation**

Markov chains, named after Andrey Markov, are mathematical systems that hop from one “state” (a situation or set of values) to another. For example, if you made a Markov chain model of a baby’s behavior, you might include “playing,” “eating,” “sleeping,” and “crying” as states, which together with other behaviors could form a “state space”: a list of all possible states. On top of the state space, a Markov chain tells you the probability of hopping, or “transitioning,” from one state to any other state, e.g., the chance that a baby currently playing will fall asleep in the next five minutes without crying first.
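To make the idea of a state space plus transition probabilities concrete, here is a minimal Python sketch of the baby example. The four states come from the text above; the probabilities in `TRANSITIONS` are invented purely for illustration, and each step is assumed to stand for one five-minute interval.

```python
import random

# States taken from the example; the probabilities are hypothetical.
STATES = ["playing", "eating", "sleeping", "crying"]

# TRANSITIONS[a][b] is the assumed probability of moving from a to b
# in one step (one five-minute interval). Each row sums to 1.
TRANSITIONS = {
    "playing":  {"playing": 0.5, "eating": 0.2, "sleeping": 0.2, "crying": 0.1},
    "eating":   {"playing": 0.3, "eating": 0.2, "sleeping": 0.4, "crying": 0.1},
    "sleeping": {"playing": 0.2, "eating": 0.3, "sleeping": 0.4, "crying": 0.1},
    "crying":   {"playing": 0.1, "eating": 0.4, "sleeping": 0.3, "crying": 0.2},
}


def next_state(current, rng):
    """Sample the next state using only the current state (the Markov property)."""
    row = TRANSITIONS[current]
    return rng.choices(list(row.keys()), weights=list(row.values()), k=1)[0]


def simulate(start, steps, seed=0):
    """Return a random path of `steps` transitions beginning at `start`."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        path.append(next_state(path[-1], rng))
    return path


def step_distribution(dist):
    """Push a probability distribution over states forward by one step."""
    return {
        target: sum(dist[source] * TRANSITIONS[source][target] for source in STATES)
        for target in STATES
    }


if __name__ == "__main__":
    print(simulate("playing", 10))
    start = {"playing": 1.0, "eating": 0.0, "sleeping": 0.0, "crying": 0.0}
    print(step_distribution(start))
```

With these made-up numbers, the chance that a playing baby is asleep one step later is simply the `"playing"` row's `"sleeping"` entry; applying `step_distribution` repeatedly gives the distribution further into the future.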

**AdaBoost Sparse Input Support**

AdaBoost is a meta-classifier: it repeatedly trains many base classifiers that are individually not very accurate and pools their results to make a more accurate classifier. This is a common ensemble method known as boosting. In addition, AdaBoost looks for examples that most base classifiers have trouble getting right and increases the focus on them in the hope of improving overall prediction accuracy. We can demonstrate AdaBoost homing in on hard samples by training it to recognize the integer value from an image of a handwritten digit. After training, we can see which examples it had the most trouble with by examining the sample weights: images with a high sample weight are harder for the classifier to get right.
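The reweighting idea can be sketched in a few lines of plain Python. This is a toy version of discrete AdaBoost with threshold "stumps" on one-dimensional data, not the implementation the post refers to; the dataset `XS`/`YS` and all function names are hypothetical and chosen only so that two points (x = 4 and x = 5) sit on the wrong side of an otherwise clean split.

```python
import math

# Hypothetical 1-D toy data with labels in {-1, +1}.
# The points x=4 (+1) and x=5 (-1) are deliberately hard to separate.
XS = [1, 2, 3, 4, 5, 6, 7, 8]
YS = [-1, -1, -1, 1, -1, 1, 1, 1]


def stump_predict(x, threshold, sign):
    """A stump predicts `sign` when x > threshold and `-sign` otherwise."""
    return sign if x > threshold else -sign


def best_stump(xs, ys, weights):
    """Return (weighted_error, threshold, sign) of the lowest-error stump."""
    values = sorted(set(xs))
    thresholds = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for threshold in thresholds:
        for sign in (1, -1):
            err = sum(
                w for w, x, y in zip(weights, xs, ys)
                if stump_predict(x, threshold, sign) != y
            )
            if best is None or err < best[0]:
                best = (err, threshold, sign)
    return best


def adaboost(xs, ys, rounds):
    """Train `rounds` stumps; return the ensemble and the final sample weights."""
    n = len(xs)
    weights = [1.0 / n] * n  # start with uniform focus on every sample
    ensemble = []
    for _ in range(rounds):
        err, threshold, sign = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)  # this stump's vote strength
        ensemble.append((alpha, threshold, sign))
        # Misclassified samples are scaled up, correct ones scaled down.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, sign))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble, weights


def predict(ensemble, x):
    """Weighted vote of all stumps."""
    score = sum(a * stump_predict(x, t, s) for a, t, s in ensemble)
    return 1 if score > 0 else -1


if __name__ == "__main__":
    _, final_weights = adaboost(XS, YS, rounds=2)
    for x, w in zip(XS, final_weights):
        print(x, round(w, 3))
```

After two rounds on this toy data, the two awkward points carry the largest sample weights, which is the same signal the handwritten-digit demonstration relies on: high weight marks a sample the base classifiers keep getting wrong.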

**Should I use premium Diesel? Setup**

Since I drive quite a lot, I have some interest in getting the most km out of every euro spent on fuel. One thing I can change is the fuel itself. The oil companies sell a premium fuel, which is supposed to be better for both the engine and fuel consumption. On the other hand, it is easy to find counterclaims saying that it does not improve fuel consumption, in which case the extra cost would be a waste. In this post I create a setup to check the claims.