“A linear composition of a bunch of linear functions is still just a linear function, so most neural networks use non-linear activation functions.” Ivan Vasilev