The Paradox of Replication, and the vindication of the P-value (but she can go deeper)

Critic 1: It’s much too easy to get small P-values. Critic 2: We find it very difficult to get small P-values; only 36 of 100 psychology experiments were found to yield small P-values in the recent Open Science collaboration.

Data Visualization Tools For Data scientists & analysts

Infographic

Logistic Regression in R – Part One

Logistic regression is used to analyze the relationship between a dichotomous dependent variable and one or more categorical or continuous independent variables. It specifies the likelihood of the response variable as a function of various predictors. The model expressed as log(odds) = \beta_0 + \beta_1*x_1 + … + \beta_n*x_n , where \beta refers to the parameters and x_i represents the independent variables. The log(odds), or log of the odds ratio, is defined as ln[\frac{p}{1-p}]. It expresses the natural logarithm of the ratio between the probability that an event will occur, p(Y=1), to the probability that an event will not occur, p(Y=0).

Correction For Spatial And Temporal Auto-Correlation In Panel Data: Using R To Estimate Spatial HAC Errors Per Conley

Economists and political scientists often employ panel data that track units (e.g., firms or villages) over time. When estimating regression models using such data, we often need to be concerned about two forms of auto-correlation: serial (within units over time) and spatial (across nearby units). As Cameron and Miller (2013) note in their excellent guide to cluster-robust inference, failure to account for such dependence can lead to incorrect conclusions: “[f]ailure to control for within-cluster error correlation can lead to very misleadingly small standard errors…” (p. 4).

R: Why it’s the next programming language you should learn