More on causes of death in Netherlands over the years
Last week I had a post ‘Deaths in the Netherlands by cause and age’. While creating that post I made one plot which I did not show. It shows something odd: there is vertical striping. Hence mortality varies by year across ages.

Little Debate: Data Priorities for all Industries
The figure titled ‘Data Pipeline’ is from an article by Jeffrey T. Leek & Roger D. Peng titled ‘Statistics: P values are just the tip of the iceberg’. Both are well-known scientists in the field of statistics and data science, and for them, there is no need to debate the importance of data integrity; it is a fundamental concept. Current terminology uses the term ‘tidy data’, a phrase coined by Hadley Wickham in an article of the same name. Whatever you call it, as scientists they understand the consequences of bad data. Business decisions today are frequently driven by results from data analysis, and, as such, this requires today’s executives to also understand these same consequences. Bad data leads to bad decisions.

Review: Machine Learning with R Cookbook
‘Machine Learning with R Cookbook’ by Chiu Yu-Wei is nothing more or less than it purports to be: a collection of 110 recipes for applying Data Analysis and Machine Learning techniques in R. I was asked by the publishers to review this book and found it to be an interesting and informative read. It will not help you understand how Machine Learning works (that’s not the goal!) but it will help you quickly learn how to apply Machine Learning techniques to your own problems.