Fitting models with discrete parameters in Stan

This book, “Bayesian Cognitive Modeling: A Practical Course,” by Michael Lee and E. J. Wagenmakers, has a bunch of examples of Stan models with discrete parameters—mixture models of various sorts—with Stan code written by Martin Smira! It’s a good complement to the Finite Mixtures chapter in the Stan manual.
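For readers wondering how a discrete parameter can be fit at all: Stan's gradient-based samplers do not sample discrete parameters directly, so mixture models are usually written with the discrete component indicator summed out of the likelihood. The sketch below shows that marginalization in plain Python for a normal mixture. It is an illustration only, not the code from the book or from Martin Smira's translations, and the function names (`log_normal_pdf`, `log_sum_exp`, `mixture_log_likelihood`) are made up for this example.

```python
import math

def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * ((x - mean) / sd) ** 2

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def mixture_log_likelihood(data, weights, means, sds):
    """Log likelihood with the discrete component indicator summed out.

    For each observation, the unknown component z is marginalized:
    p(x) = sum_k weights[k] * Normal(x | means[k], sds[k]).
    """
    total = 0.0
    for x in data:
        terms = [
            math.log(w) + log_normal_pdf(x, m, s)
            for w, m, s in zip(weights, means, sds)
        ]
        total += log_sum_exp(terms)
    return total

# Hypothetical data and parameter values, chosen only for illustration.
example = mixture_log_likelihood(
    data=[-1.2, 0.3, 4.8, 5.1],
    weights=[0.5, 0.5],
    means=[0.0, 5.0],
    sds=[1.0, 1.0],
)
```

Working on the log scale with a log-sum-exp step keeps the sum stable when one component's density is tiny compared with the others.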

Liberty Mutual Property Inspection, Winner’s Interview: Qingchen Wang

The hugely popular Liberty Mutual Group: Property Inspection Prediction competition wrapped up on August 28, 2015 with Qingchen Wang at the top of a crowded leaderboard. A total of 2,362 players on 2,236 teams competed to predict how many hazards a property inspector would count during a home inspection.

How TDA and Machine Learning Enhance Each Other

People new to topological data analysis (TDA) often ask me some form of the question, “What’s the difference between Machine Learning and TDA?” It’s a hard question to answer, in part because it depends on what you mean by Machine Learning (ML).

A Few Days of Python: Using R in Python

Using R Functions in Python

Call R functions from any application with the AzureML package

If you’ve developed a useful function in R (say, a function to make a forecast or prediction from a statistical model), you may want to call that function from an application other than R. For example, you might want to display the forecast (calculated in R) as part of a desktop, web-based or mobile application. One solution is to install R alongside the application and call it directly, but that can be difficult — or impossible, in the case of mobile apps. (You also need to be careful to comply with R’s open-source GPL2 license.)

A Data Cleaning Example

The objective is to separate these key-value pairs and store the values in corresponding key columns. The hadleyverse packages make this task a fairly simple one, especially tidyr, stringr and magrittr.
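The original post does this in R with tidyr and stringr. As a rough, language-neutral illustration of the same idea, here is a small Python sketch. The raw record layout (`key=value` pairs separated by semicolons) and the sample records are assumptions made for this example, since the excerpt does not show the actual data.

```python
# Hypothetical raw records; the "key=value;key=value" layout is an assumption.
raw_records = [
    "name=Ann;age=30;city=Oslo",
    "name=Bob;city=Lima",
    "age=41;name=Cy",
]

def parse_record(record, pair_sep=";", kv_sep="="):
    """Split one raw string into a {key: value} dict, skipping malformed items."""
    result = {}
    for item in record.split(pair_sep):
        if kv_sep in item:
            key, value = item.split(kv_sep, 1)
            result[key.strip()] = value.strip()
    return result

def to_columns(records):
    """Return (column names, rows); keys missing from a record become None."""
    parsed = [parse_record(r) for r in records]
    columns = []
    for row in parsed:
        for key in row:
            if key not in columns:
                columns.append(key)  # keep first-seen order of keys
    rows = [[row.get(col) for col in columns] for row in parsed]
    return columns, rows

columns, rows = to_columns(raw_records)
```

Each key becomes one column, and records that lack a key get an explicit missing value rather than shifting the remaining values into the wrong column.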

A Segmentation Of The World According To Migration Flows ft. Leaflet

In this post I analyze two datasets from Enigma:
• Migration flows: Every 10 years, since 1960, the World Bank estimates migrations worldwide (267,960 rows)
• World population: Values and percentages of populations for each nation examined beginning in year 1960, by the World Bank’s Health, Nutrition and Population project (4,168,185 rows)

Using Linear Regression to Predict Energy Output of a Power Plant

In this article, I will show you how to fit a linear regression to predict the energy output at a Combined Cycle Power Plant (CCPP). The dataset is obtained from the UCI Machine Learning Repository. It contains five columns: Ambient Temperature (AT), Ambient Pressure (AP), Relative Humidity (RH), Exhaust Vacuum (EV), and the net hourly electrical energy output (PE) of the plant. The first four are the attributes, and are used to predict the output, PE.
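To make the setup concrete, here is a minimal ordinary least squares sketch in plain Python with four predictors and one output. It does not use the UCI data: the rows are synthetic, the predictors are drawn on a standardized scale, and the "true" coefficients are invented purely so the fit can be checked. The helper names (`solve_linear_system`, `fit_ols`, `predict`) are likewise hypothetical, and this is not the code from the article.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(features, targets):
    """Return [intercept, coef_1, ...] from the normal equations X'X b = X'y."""
    design = [[1.0] + list(row) for row in features]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coefs, row):
    """Predicted output for one row of predictors."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))

# Synthetic stand-ins for (AT, AP, RH, EV) -> PE; all numbers are made up.
random.seed(0)
true_coefs = [450.0, -15.0, 0.5, -2.0, -3.0]  # intercept, AT, AP, RH, EV
features = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(200)]
targets = [predict(true_coefs, row) + random.gauss(0, 0.01) for row in features]

coefs = fit_ols(features, targets)
```

With real data you would more likely reach for a statistics library, which also reports standard errors and fit diagnostics. The normal-equations solver here is only meant to show what the fit computes.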

What Types of Questions Can Data Science Answer?

Machine learning (ML) is the motor that drives data science. Each ML method (also called an algorithm) takes in data, turns it over, and spits out an answer. ML algorithms do the part of data science that is the trickiest to explain and the most fun to work with. That’s where the mathematical magic happens. ML algorithms can be grouped into families based on the type of question they answer. These families can help guide your thinking as you formulate your razor-sharp question.