Are These Losses from The Same Distribution?
In the Advanced Measurement Approaches (AMA) for operational risk models, a bank needs to segment operational losses into homogeneous segments known as ‘Units of Measure (UoM)’, which are often defined by the combination of lines of business (LOB) and Basel II event types. However, how do we show that the losses in one UoM are statistically different from those in another UoM? The answer is to test whether the losses from different UoMs are distributionally different or equivalent.
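One common way to compare two loss samples is the two-sample Kolmogorov-Smirnov test. The excerpt does not say which test the article uses, so the sketch below is only an illustration of the idea. The function names `ks_statistic` and `ks_reject` are hypothetical, and the rejection threshold uses the standard large-sample approximation, which may be rough for small UoMs.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest vertical gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # The supremum of |F_a - F_b| is attained at one of the observed values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def ks_reject(sample_a, sample_b, alpha=0.05):
    """True if 'same distribution' is rejected at level alpha.

    Assumption: uses the asymptotic critical value
    sqrt(-ln(alpha / 2) / 2) * sqrt((n + m) / (n * m)).
    """
    n, m = len(sample_a), len(sample_b)
    critical = math.sqrt(-math.log(alpha / 2) / 2) * math.sqrt((n + m) / (n * m))
    return ks_statistic(sample_a, sample_b) > critical
```

Calling `ks_reject(losses_uom_1, losses_uom_2)` on the loss amounts of two UoMs (both argument names are placeholders) gives a first indication of whether the two segments can be treated as one.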

Getting Mongo-ed in NoSQL manager, R & Python
MongoDB is the most popular NoSQL database out there. It is used by several big companies such as eBay, Craigslist and Foursquare. Most popular data analysis tools, such as R and Python, offer packages that integrate with MongoDB. These packages let people use Mongo and its powerful features from the environment of their choice. Some of the popular packages that integrate MongoDB with analytics tools are RMongo, PyMongo, Mongolite, JSON Studio and jSonarR.

Kaggle Ensembling Guide
Model ensembling is a very powerful technique to increase accuracy on a variety of ML tasks. In this article I will share my ensembling approaches for Kaggle Competitions.

The Overlap Coefficient
I’m currently working with a client who is researching the best covariance matrices for financial time series. Specifically, they are looking at what best describes one-month out-of-sample distributions. They are not concerned with the means, just the variances.
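As a rough illustration, and assuming the overlap coefficient in the title means the area shared by two probability densities (the excerpt itself does not define it), the sketch below integrates the pointwise minimum of two normal densities numerically. The function names and the choice of normal distributions are assumptions made for the example, not taken from the article.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def overlap_coefficient(mu1, sigma1, mu2, sigma2, steps=20000):
    """Approximate the area under min(f1, f2) with the midpoint rule.

    Returns a value between 0 (no overlap) and 1 (identical densities).
    """
    # Integrate over a range wide enough to cover both densities.
    lo = min(mu1 - 8 * sigma1, mu2 - 8 * sigma2)
    hi = max(mu1 + 8 * sigma1, mu2 + 8 * sigma2)
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        total += min(normal_pdf(x, mu1, sigma1), normal_pdf(x, mu2, sigma2))
    return total * width
```

With both means set to zero, as in `overlap_coefficient(0.0, 1.0, 0.0, 2.0)`, the result depends only on the two standard deviations, which matches a setting where only the variance matters.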

What’s Wrong With Deep Learning