“Once the often laborious task of data munging is complete, the next step in the data science process is to become intimately familiar with the data set by performing what’s called Exploratory Data Analysis (EDA). The way to gain this level of familiarity is to utilize the features of the statistical environment you’re using (R, Matlab, SAS, Python, etc.) that support this effort – numeric summaries, aggregations, distributions, densities, reviewing all the levels of factor variables, applying general statistical methods, exploratory plots, and expository plots.” Daniel Gutierrez (November 5, 2014)
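
To make a few of the techniques named in the quote concrete (numeric summaries, factor levels, and aggregations), here is a minimal Python sketch that uses only the standard library. The records and the field names `species` and `length` are invented purely for illustration; they do not come from the quoted article, and a real analysis would more likely lean on a dedicated statistical environment.

```python
from collections import Counter, defaultdict
import statistics

# Hypothetical toy records, invented purely for illustration.
records = [
    {"species": "a", "length": 5.1},
    {"species": "a", "length": 4.9},
    {"species": "b", "length": 6.3},
    {"species": "b", "length": 5.8},
    {"species": "b", "length": 7.1},
]

lengths = [r["length"] for r in records]

# Numeric summary of a single numeric variable.
summary = {
    "n": len(lengths),
    "min": min(lengths),
    "median": statistics.median(lengths),
    "mean": statistics.mean(lengths),
    "max": max(lengths),
    "stdev": statistics.stdev(lengths),
}

# Levels of a factor (categorical) variable, with how often each occurs.
levels = Counter(r["species"] for r in records)

# Aggregation: mean length within each level of the factor.
groups = defaultdict(list)
for r in records:
    groups[r["species"]].append(r["length"])
group_means = {level: statistics.mean(vals) for level, vals in groups.items()}

print(summary)
print(levels)
print(group_means)
```

Even on a toy data set, putting the overall summary next to the per-level means is a quick way to spot when one group is driving the overall picture, which is the kind of familiarity the quote is describing.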
