**Lessons Learned Reproducing a Deep Reinforcement Learning Paper**

There are a lot of neat things going on in deep reinforcement learning. One of the coolest things from last year was OpenAI and DeepMind’s work on training an agent using feedback from a human rather than a classical reward signal. There’s a great blog post about it at Learning from Human Preferences, and the original paper is at Deep Reinforcement Learning from Human Preferences.

**Comet.ml – Machine Learning Experiment Management**

This article presents comet.ml – a platform that allows tracking machine learning experiments with an emphasis on collaboration and knowledge sharing.

**Where Analytics, Data Science, Machine Learning Were Applied: Trends and Analysis**

CRM/Consumer Analytics, Finance, and Banking are still the leading applications, but Health Care and Fraud Detection are gaining. Anti-spam, Manufacturing, and Social are the fastest-growing sectors in 2017, while Oil / Gas / Energy and Social Network analysis have declined.

**P-Values, Sample Size and Data Mining**

Recently, a paper was presented at our university that showed a significant effect for a variable of interest but had a relatively small number of observations. One colleague suggested that we should interpret the significance of the results with care since the number of observations was fairly small. This ignited some discussion. Given that the significance test computed exact p-values, why should the significance of the results be less convincing than if we had a larger sample? Independent of the sample size, the probability of finding a significant result at the 5% level when there is actually no effect should be 5%. The discussion turned to the question of whether small sample sizes could sometimes be problematic because they may magnify possible biases if some data mining takes place. Indeed, the dangers of data mining and ‘p-hacking’ are a regular theme in the statistics literature and on statistics blogs. To get a better feel for the relationship between sample size and possible biases from data mining, I ran several Monte Carlo simulations in R, shown below.
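To make the worry about data mining concrete, here is a minimal sketch of that kind of simulation — in Python rather than the article’s R, with arbitrary sample sizes and trial counts. When an analyst tries several candidate regressors on pure noise and keeps only the best p-value, the chance of a “significant” finding rises well above the nominal 5%:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def false_positive_rate(n, n_specs, trials=1000):
    """Fraction of trials in which the best of n_specs tests on pure
    noise (no true effect anywhere) comes out 'significant' at 5%."""
    hits = 0
    for _ in range(trials):
        y = rng.normal(size=n)
        # "data mining": try several candidate regressors, keep the best p-value
        best_p = min(
            stats.pearsonr(rng.normal(size=n), y)[1] for _ in range(n_specs)
        )
        hits += best_p < 0.05
    return hits / trials

print(false_positive_rate(n=20, n_specs=1))   # close to the nominal 0.05
print(false_positive_rate(n=20, n_specs=10))  # far above 0.05
```

With one pre-registered test the 5% level holds; with ten tries per dataset, roughly 1 − 0.95¹⁰ ≈ 40% of pure-noise datasets yield a “finding”.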

**How to use dplyr’s mutate in R without a vectorized function**

If you’re reading this, you’ve either encountered this problem before, or you got to this article out of curiosity (in which case you probably don’t know what problem I’m talking about). A few days ago a client gave me code for a function that, given the path to a patient’s file, generates a useful ID for the patient.
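The underlying problem is general: a function written for a single value breaks when handed a whole column inside a data pipeline. One of the standard dplyr fixes is wrapping the function with R’s `Vectorize()`; the same pattern can be sketched in Python with `numpy.vectorize` (the `patient_id` function below is a hypothetical stand-in for the client’s code):

```python
import numpy as np

# Hypothetical stand-in for the client's function: it works on a single
# path string, not on a whole vector of paths at once.
def patient_id(path):
    return path.split("/")[-1].removesuffix(".txt").upper()

paths = np.array(["records/a1.txt", "records/b2.txt"])

# Wrapping the scalar function, as R's Vectorize() does, lets it be
# applied element-wise inside a column-oriented pipeline:
ids = np.vectorize(patient_id)(paths)
```

Like `Vectorize()` in R, `np.vectorize` is a convenience loop rather than a performance optimization — it just makes the scalar function composable with column operations.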

**4 things business leaders should know as they explore AI and deep learning**

1. There’s an AI skills gap

2. Companies are addressing the AI skills gap through training

3. Initial deep learning projects often focus on safe upgrades

4. TensorFlow is the most popular deep learning tool

**4 Types of Machine Intelligence You Should Know**

1. Cognitive Computing,

2. AI,

3. Machine Learning, and

4. Deep Learning

are often used to describe the same thing, when they actually differ. We explain what the differences are so you can better understand how the pieces fit together.

**Logistic Regression in R Tutorial**

Logistic regression is yet another technique machine learning has borrowed from the field of statistics. It’s a powerful statistical method for modeling a binomial outcome with one or more explanatory variables. It measures the relationship between a categorical dependent variable and one or more independent variables by estimating probabilities with the logistic function, the cumulative distribution function of the logistic distribution.
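The tutorial itself is in R, but the estimation it describes can be sketched in a few lines of Python — a toy fit by gradient descent on the log-loss, with an arbitrary learning rate, step count, and synthetic data:

```python
import numpy as np

def sigmoid(z):
    # the logistic function: maps any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, steps=5000):
    """Fit logistic regression weights by gradient descent on the log-loss.
    X: (n, k) matrix of explanatory variables; y: (n,) array of 0/1 outcomes."""
    X = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)              # current probability estimates
        w -= lr * X.T @ (p - y) / len(y)  # gradient of the mean log-loss
    return w

# toy data: the outcome becomes more likely as x grows (true slope = 2)
rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=200)
y = (rng.uniform(size=200) < sigmoid(2 * x)).astype(float)
w = fit_logistic(x.reshape(-1, 1), y)  # w[0] ≈ intercept, w[1] ≈ slope
```

In practice you would call R’s `glm(y ~ x, family = binomial)` (or scikit-learn’s `LogisticRegression`) rather than hand-rolling the optimizer; the sketch only shows what those fitters estimate.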
