6 Easy Steps to Learn Naive Bayes Algorithm (with code in Python)

You are working on a classification problem and you have generated your set of hypotheses, created features and discussed the importance of variables. Within an hour, stakeholders want to see the first cut of the model. What will you do? You have hundreds of thousands of data points and quite a few variables in your training data set. In such a situation, if I were in your place, I would use ‘Naive Bayes’, which can be extremely fast relative to other classification algorithms. It applies Bayes’ theorem of probability to predict the class of unseen data points. In this article, I’ll explain the basics of this algorithm, so that the next time you come across large data sets, you can put it into action. In addition, if you are new to Python, the code provided in this article should help you get started.
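To make the idea concrete, here is a minimal sketch of a categorical Naive Bayes classifier in plain Python. It is not the code from the linked article; the function names (`train_naive_bayes`, `predict`), the toy weather data, and the Laplace smoothing parameter `alpha` are all assumptions chosen for illustration. The classifier scores each class by its log prior plus the sum of per-feature log likelihoods, treating features as conditionally independent given the class.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    vocab = defaultdict(set)             # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab


def predict(model, row, alpha=1.0):
    """Return the class with the highest smoothed log posterior score."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior P(class)
        for i, value in enumerate(row):
            count = value_counts[(i, label)][value]
            # Laplace-smoothed estimate of P(feature value | class)
            score += math.log((count + alpha) / (n + alpha * len(vocab[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: (outlook, temperature) -> did we play outside?
rows = [
    ("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
    ("rainy", "cool"), ("overcast", "hot"), ("overcast", "cool"),
]
labels = ["no", "no", "yes", "yes", "yes", "yes"]

model = train_naive_bayes(rows, labels)
print(predict(model, ("rainy", "cool")))  # yes
print(predict(model, ("sunny", "hot")))   # no
```

Training is a single counting pass over the data and prediction is a handful of lookups per class, which is why the method tends to be fast on large data sets. For real work, a library implementation such as the ones in scikit-learn is likely a better starting point than this sketch.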

Rolling in the Deep (Learning)

Deep Learning has been getting a lot of press lately, and is one of the hottest buzz terms in tech these days. Just check out a few recent headlines from Forbes or MIT Tech Review and you will surely see these words pop up at least once. But what is this strange concept everybody is talking about? Is it just a fleeting craze that everybody will forget in a few years (or maybe months)? What is all the hype about? We will try to answer these questions, and a few more, in the following post.

Convolutional Wasserstein Distances

This paper introduces a new class of algorithms for optimization problems involving optimal transportation over geometric domains. Our main contribution is to show that optimal transportation can be made tractable over large domains used in graphics, such as images and triangle meshes, improving performance by orders of magnitude compared to previous work. To this end, we approximate optimal transportation distances using entropic regularization. The resulting objective contains a geodesic distance-based kernel that can be approximated with the heat kernel. This approach leads to simple iterative numerical schemes with linear convergence, in which each iteration only requires Gaussian convolution or the solution of a sparse, pre-factored linear system. We demonstrate the versatility and efficiency of our method on tasks including reflectance interpolation, color transfer, and geometry processing.
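The iterative scheme described in the abstract can be illustrated with a small sketch of entropy-regularized transport between two histograms on a 1D grid. This is a simplified illustration under stated assumptions, not the paper's implementation: it uses a flat grid with unit spacing instead of a mesh, ignores area weights, and applies the Gaussian kernel densely in O(n²) where the paper relies on fast convolution or a pre-factored linear solve. The names `apply_gaussian_kernel`, `sinkhorn_scalings` and `plan_marginals` are hypothetical. With a squared-distance cost, the kernel width `sigma` plays the role of the regularization strength (roughly `2 * sigma**2`).

```python
import math


def apply_gaussian_kernel(x, sigma):
    """Return K @ x for K[i][j] = exp(-(i - j)**2 / (2 * sigma**2))."""
    n = len(x)
    return [
        sum(math.exp(-((i - j) ** 2) / (2.0 * sigma ** 2)) * x[j] for j in range(n))
        for i in range(n)
    ]


def sinkhorn_scalings(p, q, sigma=2.0, iterations=200):
    """Alternately rescale u and v so that diag(u) K diag(v) has marginals p and q."""
    u = [1.0] * len(p)
    v = [1.0] * len(q)
    for _ in range(iterations):
        # Each half-step needs only one kernel application (a Gaussian blur).
        u = [pi / kv for pi, kv in zip(p, apply_gaussian_kernel(v, sigma))]
        # K is symmetric here, so applying K also applies its transpose.
        v = [qj / ku for qj, ku in zip(q, apply_gaussian_kernel(u, sigma))]
    return u, v


def plan_marginals(u, v, sigma=2.0):
    """Row and column sums of the implicit transport plan diag(u) K diag(v)."""
    rows = [ui * kv for ui, kv in zip(u, apply_gaussian_kernel(v, sigma))]
    cols = [vj * ku for vj, ku in zip(v, apply_gaussian_kernel(u, sigma))]
    return rows, cols


# Two strictly positive histograms on a 5-point grid (made-up values).
p = [0.5, 0.2, 0.1, 0.1, 0.1]
q = [0.1, 0.1, 0.1, 0.2, 0.5]

u, v = sinkhorn_scalings(p, q)
row_sums, col_sums = plan_marginals(u, v)
print(row_sums)  # close to p
print(col_sums)  # close to q
```

The transport plan is never stored explicitly: only the two scaling vectors are updated, and the kernel is touched solely through `apply_gaussian_kernel`. That is the property that lets a fast blur stand in for the dense matrix product on large images or meshes.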

Big Data Monetization Lessons from Zillow

• Develop a Data Exchange with Customers on the Platform of your Customer’s Preference
• Fuse Data from Many Sources and Formats
• Bring Scale to Data with Indices
• Solve Customer Needs with Data Products
• Achieve Market Scale Quickly
• Treat and View Your Data as an Asset
• Focus Data Products on Sellers: Reminder that Sellers (and Their Agents) are More Willing to Pay Than Buyers