Your Guide to Master Hypothesis Testing in Statistics

In today’s article, I will explain hypothesis testing and how to read statistical significance to separate signal from noise in data – exactly what my new manager wanted me to do!

Artificial Neural Networks for Beginners

Deep learning is a very hot topic these days, especially in computer vision applications, and you have probably seen it in the news and become curious. Now the question is, how do you get started with it? Today’s guest blogger, Toshi Takeuchi, gives us a quick tutorial on artificial neural networks as a starting point for your study of deep learning.

An efficient, flexible distributed framework for deep learning

•To Mix and Maximize: Mix all flavors of programming models to maximize flexibility and efficiency.
•Lightweight and scalable: Minimal build dependencies, scales to multiple GPUs, and ready for distributed use.
•Auto parallelization: Write numpy-style ndarray GPU programs, which will be automatically parallelized.
•Language agnostic: Supports Python and C++, with more to come.
•Cloud friendly: Directly load/save from S3, HDFS, Azure.
•Easy extensibility: Extending requires no GPU programming.

Spark vs. Hadoop: Not Enemies, but Sidekicks

With all you have been hearing about Apache Spark and the things it can do, you might be wondering: whatever happened to Hadoop? After all, MapR is still one of the biggest Hadoop distribution providers. You might think that Apache Spark is replacing Hadoop, but that’s anything but the case. Spark represents the next step for Hadoop.

Working With SEM Keywords in R

The following post was republished from two previous posts on an older blog of mine that is no longer available. They are from several years ago and relate to two critical questions that I encountered. One, how can I automatically generate hundreds of thousands of keywords for a search engine marketing campaign? Two, how can I develop an effective system for examining keywords based on different characteristics?

5 Text Classification Case Studies Using SciKit Learn

1. News Classification for Startup Intelligence
2. News Classification for Investing
3. Web page Classification
4. Email Spam Classification
5. Matching user profiles to music listener profiles

100 Most Popular Machine Learning Talks at VideoLectures.Net

100 Most Popular Machine Learning Video Talks

Recurrent neural networks, Time series data and IoT – Part One

In this series of exploratory blog posts, we explore the relationship between recurrent neural networks (RNNs) and IoT data. The article is written by Ajit Jaokar, Dr Paul Katsande and Dr Vinay Mehendiratta as part of the Data Science for Internet of Things – practitioners course. RNNs are already used for time series analysis. Because IoT problems can often be modelled as time series, RNNs could apply to IoT data. In this multi-part blog, we first discuss time series applications and then discuss how RNNs could apply to them. Finally, we discuss applicability to IoT.

What Makes A Good Data Visualization?

This graphic visualises the four elements I think are necessary for a “good” visualization, i.e. one that works. These elements form the backbone of my process and are also what I teach in my dataviz workshops.