A Tutorial on Latin Hypercube Design of Experiments

The growing power of computers has enabled techniques created for the design and analysis of simulations to be applied to a large spectrum of problems and to reach a high level of acceptance among practitioners. Generally, when simulations are time consuming, a surrogate model replaces the computer code in further studies (e.g., optimization, sensitivity analysis). The first step in successful surrogate modeling and statistical analysis is planning the input configuration that is used to exercise the simulation code. Among the strategies devised for computer experiments, Latin hypercube designs have become particularly popular. This paper provides a tutorial on Latin hypercube design of experiments, highlighting potential reasons for its widespread use. The discussion starts with the early developments in optimization of the point selection and goes all the way to the pitfalls of the indiscriminate use of Latin hypercube designs. Final thoughts are given on opportunities for future research.

Multinomial Logit as an Iterated Logit Regression

For the second section of the course at ENSAE, yesterday, we saw how to run a multinomial logistic regression model. It is simply an extension of the binomial logistic regression. But actually, it is also possible to consider iterated binomial regressions. Consider here a response variable $Y$ with a multinomial distribution (three levels, to have something more general than the binomial), taking values $\{A,B,C\}$, with respective probabilities $\mathbf{p}=(p_A,p_B,p_C)$. Here is some code to generate multinomial variables.
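As a minimal sketch of the idea (the original post's code is not reproduced here), the Python snippet below draws $Y$ in two ways: directly from the three-level distribution, and as a sequence of two binary decisions (first "A or not A", then "B or C given not A"). The second version mirrors the iterated-binomial view without covariates. The function names, the seed values and the probabilities $(0.2, 0.3, 0.5)$ are assumptions chosen for illustration.

```python
import numpy as np

LABELS = ("A", "B", "C")

def sample_multinomial(n, p, seed=0):
    """Draw n labels directly from the three-level distribution p."""
    rng = np.random.default_rng(seed)
    return rng.choice(LABELS, size=n, p=p)

def sample_iterated(n, p, seed=0):
    """Draw the same distribution as two successive binary decisions."""
    rng = np.random.default_rng(seed)
    p_a = p[0]                          # P(Y = A)
    p_b_given_not_a = p[1] / (1 - p_a)  # P(Y = B | Y != A)
    is_a = rng.random(n) < p_a
    is_b = rng.random(n) < p_b_given_not_a
    # B or C is only looked at when the first decision was "not A".
    return np.where(is_a, "A", np.where(is_b, "B", "C"))

def frequencies(y):
    """Empirical frequency of each label, in the order A, B, C."""
    return np.array([np.mean(y == label) for label in LABELS])

p = (0.2, 0.3, 0.5)  # illustrative probabilities, not from the post
y_direct = sample_multinomial(10_000, p, seed=0)
y_iter = sample_iterated(10_000, p, seed=1)
print(frequencies(y_direct), frequencies(y_iter))
```

Both sets of empirical frequencies should be close to $\mathbf{p}$, which is the property that lets a multinomial model be rebuilt from successive binomial ones.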

Choosing an Open Source Machine Learning Library: TensorFlow, Theano, Torch, scikit-learn, Caffe

From healthcare and security to marketing personalization, despite being at the early stages of development, machine learning has been changing the way we use technology to solve business challenges and everyday tasks. This potential has prompted companies to start looking at machine learning as a relevant opportunity rather than a distant, unattainable virtue. We’ve already discussed machine-learning-as-a-service tools for your ML projects. But now let’s look at free and open source software that allows everyone to board the machine learning train without spending time and resources on infrastructure support.

Formal ways to compare forecasting models: Rolling windows

When working with time-series forecasting we often have to choose between a few potential models, and the best way is to test each model in pseudo-out-of-sample estimations. In other words, we simulate a forecasting situation where we drop some data from the estimation sample to see how each model performs. Naturally, if you do only one (or just a few) forecasting tests, your results will have no robustness, and in the next forecast the results may change drastically. Another possibility is to estimate the model on, let’s say, half of the sample, and use the estimated model to forecast the other half. This is better than a single forecast, but it does not account for possible changes in the structure of the data over time because you have only one estimation of the model. The most accurate way to compare models is using rolling windows. Suppose you have, for example, 200 observations of a time series. First you estimate the model with the first 100 observations to forecast observation 101. Then you include observation 101 in the estimation sample and estimate the model again to forecast observation 102. The process is repeated until you have a forecast for all 100 out-of-sample observations. This procedure is also called an expanding window. If you drop the first observation in each iteration to keep the window size always the same, then you have a fixed rolling window estimation. In the end you will have 100 forecasts for each model, and you can calculate RMSE and MAE and run formal tests such as Diebold-Mariano.
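The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the series is a simulated random walk with 200 observations, and the two "models" (a last-value forecast and a window-mean forecast) are hypothetical stand-ins for whatever models are actually being compared. The helper name `window_forecasts` is also an assumption.

```python
import numpy as np

def window_forecasts(y, forecast_fn, first_train=100, fixed=False):
    """One-step-ahead pseudo-out-of-sample forecasts.

    fixed=False: expanding window (the sample grows by one each step).
    fixed=True:  fixed rolling window (always the last `first_train` points).
    """
    preds = []
    for t in range(first_train, len(y)):
        start = t - first_train if fixed else 0
        preds.append(forecast_fn(y[start:t]))  # forecast observation t
    return np.array(preds)

def naive_last(train):
    """Forecast the next value as the last observed value."""
    return train[-1]

def window_mean(train):
    """Forecast the next value as the mean of the estimation window."""
    return train.mean()

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=200))  # simulated series, 200 observations
actual = y[100:]                     # the 100 out-of-sample observations

results = {}
for name, fn in [("naive_last", naive_last), ("window_mean", window_mean)]:
    preds = window_forecasts(y, fn, first_train=100, fixed=True)
    err = actual - preds
    results[name] = {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
    }
print(results)
```

Setting `fixed=False` gives the expanding-window variant. The per-forecast errors in `err` are also the inputs a formal comparison such as the Diebold-Mariano test would work from; that test is not implemented here.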

How Happy is Your Country? Happy Planet Index Visualized

The Happy Planet Index (HPI) is an index of human well-being and environmental impact that was introduced by NEF, a UK-based economic think tank promoting social, economic and environmental justice. It ranks 140 countries according to “what matters most: sustainable wellbeing for all”.

TensorFlow: What Parameters to Optimize?

Learning the TensorFlow Core API, which is the lowest-level API in TensorFlow, is a very good first step in learning TensorFlow because it lets you understand the kernel of the library. Here is a very simple example of the TensorFlow Core API in which we create and train a linear regression model.
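The post's TensorFlow code is not reproduced here. To make concrete which parameters are being optimized in a linear regression, the following is a plain NumPy sketch rather than TensorFlow: it fits a slope `W` and an intercept `b` by gradient descent on a sum-of-squares loss. The data points, starting values, learning rate and step count are all assumptions chosen for illustration.

```python
import numpy as np

def fit_linear(x, y, w0=0.3, b0=-0.3, lr=0.01, steps=1000):
    """Fit y ~ W * x + b by gradient descent on the sum of squared errors."""
    W, b = w0, b0
    for _ in range(steps):
        err = W * x + b - y            # residuals of the current model
        grad_W = 2.0 * np.sum(err * x) # d(loss)/dW
        grad_b = 2.0 * np.sum(err)     # d(loss)/db
        W -= lr * grad_W
        b -= lr * grad_b
    return W, b

# Illustrative data lying exactly on the line y = -x + 1.
x_train = np.array([1.0, 2.0, 3.0, 4.0])
y_train = np.array([0.0, -1.0, -2.0, -3.0])

W, b = fit_linear(x_train, y_train)
loss = float(np.sum((W * x_train + b - y_train) ** 2))
print(W, b, loss)
```

In a TensorFlow version, `W` and `b` would be the trainable variables and the library would compute the two gradients automatically; the update rule itself is the same.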