**Convolutional Neural Network**

Neural networks have been around for decades and have seen their ups and downs. Convolutional Neural Networks (CNNs) in particular have proven extremely powerful for image recognition problems. Before discussing CNNs, let’s revisit the concepts of neural networks and then see how CNNs help. Neural Networks (NNs) are collections of neurons connected in an acyclic graph. The output of one neuron can become an input to other neurons. Cycles are prohibited because they would lead to an infinite loop in the forward pass of the network. Neural network models are organized into distinct layers of neurons. For a regular NN, the most common layer type is the fully connected layer, in which neurons in two adjacent layers are fully pairwise connected but neurons within a single layer share no connections.
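
To make the fully connected layer concrete, here is a minimal sketch of a forward pass in plain Python. It is not taken from the original post: the function names, the sigmoid activation, and the example weights are all assumptions chosen for illustration. Each output neuron reads every input from the previous layer, and no neuron reads from its own layer, so the computation runs once from input to output with no cycles.

```python
import math


def dense_forward(inputs, weights, biases):
    """One fully connected layer: every output neuron sees every input.

    weights[j] holds the incoming weights of output neuron j.
    The sigmoid activation is an illustrative assumption.
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        # Weighted sum over ALL inputs from the previous layer.
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs


def forward(inputs, layers):
    """Feed activations through the layers in order (acyclic, so one pass)."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_forward(activations, weights, biases)
    return activations


# Hypothetical network: 3 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
output = forward([1.0, 2.0, 3.0], layers)
print(output)
```

Because the graph is acyclic, `forward` is a single loop over the layers; a cycle would mean some layer needs its own output before it can be computed.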

**Clustering Customers for Machine Learning With Hadoop and Mahout**

We wanted to create and test a solution that would let us group similar customers using different sets of dimensions, depending on the information we wanted to provide or obtain. We thought about introducing clustering technology and algorithms to group our customers. This would be a very rough implementation that would allow us to prove certain techniques and solutions for this type of problem; it certainly would NOT cover all the nuances that machine learning algorithms and analysis carry with them. Many liberties were taken to get to a proof of concept. The code presented here is not 100% the same code used in the spike, but it is a very close approximation. This post covers the implementation of the solution.

**Introduction to Apache Spark**

Apache Spark is a fast, general-purpose cluster computing system. The latest version can be downloaded from http://…/downloads.html. In this post, we will perform some basic data manipulations using Spark and Python.

**Stand-alone Code for Numerical Computing**

For this week’s resource post, see the page Stand-alone code for numerical computing. It points to small, self-contained bits of code for special functions (log gamma, erf, etc.) and for random number generation (normal, Poisson, gamma, etc.). The code is available in Python, C++, and C# versions. It could easily be translated into other languages since it hardly uses any language-specific features. I wrote these functions for projects where you don’t have a numerical library available or would like to minimize dependencies. If you have access to a numerical library, such as SciPy in Python, then by all means use it (although SciPy is missing some of the random number generators provided here). In C++ and especially C#, it’s harder to find some of this functionality.
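
As an illustration of what stand-alone code of this kind can look like, here is a sketch of the error function that depends only on `math.exp`. It is not the code from the linked page: the function name `erf_approx` is hypothetical, and the formula is an assumption on my part, namely the rational approximation 7.1.26 from Abramowitz and Stegun, whose absolute error is about 1.5e-7.

```python
import math


def erf_approx(x):
    """Approximate erf(x) using Abramowitz & Stegun formula 7.1.26.

    Illustrative sketch only; absolute error is roughly 1.5e-7.
    """
    # Coefficients of formula 7.1.26.
    a1 = 0.254829592
    a2 = -0.284496736
    a3 = 1.421413741
    a4 = -1.453152027
    a5 = 1.061405429
    p = 0.3275911

    # erf is odd, so compute for |x| and restore the sign at the end.
    sign = -1.0 if x < 0 else 1.0
    x = abs(x)

    t = 1.0 / (1.0 + p * x)
    # Horner evaluation of a1*t + a2*t^2 + a3*t^3 + a4*t^4 + a5*t^5.
    poly = ((((a5 * t + a4) * t + a3) * t + a2) * t + a1) * t
    return sign * (1.0 - poly * math.exp(-x * x))


print(erf_approx(0.5))
```

The function uses nothing beyond arithmetic and an exponential, which is what makes code in this style easy to port to other languages. In Python itself, `math.erf` is available in the standard library and is the better choice when you can use it.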

**A Comprehensive Guide to Parametric Survival Analysis**

Survival analysis is one of the less understood yet highly applied techniques used by business analysts. That is a dangerous combination! Not many analysts understand the science and application of survival analysis, but because it fits so many scenarios naturally, it is difficult to avoid!