My 1st Kaggle ConvNet: Getting to 3rd Percentile in 3 months

The Diabetic Retinopathy challenge on Kaggle has just finished. The goal of the competition was to predict the presence and severity of diabetic retinopathy from photographs of eyes. I finished in 20th place using a Convolutional Neural Network (ConvNet). In this post I’ll explain my learning process and progress as I implemented my first ConvNet over the last 3 months. Throughout, I’ll link to the implementations in my code, which is available on GitHub for anyone who wishes to replicate my score.

Deep learning

In the last chapter we learned that deep neural networks are often much harder to train than shallow neural networks. That’s unfortunate, since we have good reason to believe that if we could train deep nets they’d be much more powerful than shallow nets. But while the news from the last chapter is discouraging, we won’t let it stop us. In this chapter, we’ll develop techniques which can be used to train deep networks, and apply them in practice. We’ll also look at the broader picture, briefly reviewing recent progress on using deep nets for image recognition, speech recognition, and other applications. And we’ll take a brief, speculative look at what the future may hold for neural nets, and for artificial intelligence. The chapter is a long one. To help you navigate, let’s take a tour. The sections are only loosely coupled, so provided you have some basic familiarity with neural nets, you can jump to whatever most interests you.

A Visual Introduction to Machine Learning

In machine learning, computers apply statistical learning techniques to automatically identify patterns in data. These techniques can be used to make highly accurate predictions.

The Brain vs Deep Learning Part I: Computational Complexity — Or Why the Singularity Is Nowhere Near

In this blog post I will delve into the brain and explain its basic information processing machinery and compare it to deep learning. I do this by moving step by step along the brain’s electrochemical and biological information processing pipeline and relating it directly to the architecture of convolutional nets. Thereby we will see that a neuron and a convolutional net are very similar information processing machines. While performing this comparison, I will also discuss the computational complexity of these processes and thus derive an estimate for the brain’s overall computational power. I will use these estimates, along with knowledge from high-performance computing, to show that it is unlikely that there will be a technological singularity in this century.

Modelling Occurrence of Events, with some Exposure

This afternoon, an interesting point was raised, and I wanted to get back to it (since I published a post on that same topic a long time ago). How can we adapt a logistic regression when the observations do not all have the same exposure?

Visualising Claims Frequency

A few years ago, I published a post on visualizing the empirical claims frequency in a portfolio. I wanted to update the code.
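As a rough sketch of what such a visualisation involves (my own minimal example, not the post's code — the age covariate, binning, and simulated data are assumptions), one can aggregate claim counts and exposure per bucket of a covariate and plot their ratio:

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")          # non-interactive backend; write the figure to a file
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
n = 10_000
df = pd.DataFrame({
    "age": rng.integers(18, 80, n),        # hypothetical covariate (driver age)
    "exposure": rng.uniform(0.1, 1.0, n),  # policy-years observed
})
df["claims"] = rng.poisson(0.1 * df["exposure"])  # simulated counts, true rate 0.1

# Empirical frequency per age band = total claims / total exposure in the band
df["age_band"] = pd.cut(df["age"], bins=range(15, 85, 10))
agg = df.groupby("age_band", observed=True).agg(
    claims=("claims", "sum"), exposure=("exposure", "sum"))
agg["frequency"] = agg["claims"] / agg["exposure"]

agg["frequency"].plot(kind="bar", ylabel="claims per policy-year")
plt.tight_layout()
plt.savefig("claims_frequency.png")
```

Dividing summed claims by summed exposure (rather than averaging per-policy rates) is what makes the plotted frequency a proper exposure-weighted estimate.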

CrowdFlower Winner’s Interview: 1st place, Chenglong Chen

The Crowdflower Search Results Relevance competition asked Kagglers to evaluate the accuracy of e-commerce search engines on a scale of 1-4 using a dataset of queries & results. Chenglong Chen finished ahead of 1,423 other data scientists to take first place. He shares his approach with us from his home in Guangzhou, Guangdong, China.

Statistical Models of Judgment and Choice: Deciding What Matters Guided by Attention and Intention

Preference begins with attention, a form of intention-guided perception. You enter the store thirsty on a hot summer day, and all you can see is the beverage cooler at the far end of the aisle with your focus drawn toward the cold beverages that you immediately recognize and desire. Focal attention is such a common experience that we seldom appreciate the important role that it plays in almost every activity. For instance, how are you able to read this post? Automatically and without awareness, you see words and phrases by blurring everything else in your perceptual field.