Awesome Data Viz – Curated List of Data Viz Frameworks, Libraries, Software

A curated list of awesome data visualization frameworks, libraries, and software. Inspired by awesome-python.

Data Manipulation with dplyr

dplyr is an R package for data manipulation, written and maintained by Hadley Wickham. It provides easy-to-use functions that are very handy for exploratory data analysis and manipulation. Here, I will give a basic overview of some of the most useful functions in the package. For this article, I will be using the airquality dataset from the datasets package, which contains air quality measurements taken in New York from May to September 1973.

5 New R Packages for Data Scientists

1. AzureML
2. distcomp
3. rotationForest
4. rpca
5. SwarmSVM

Generating Poetry with PoetRNN

A few months ago I read Andrej Karpathy’s blog post about using RNNs to generate text. I was amazed by the quality of the results (it basically wrote compilable LaTeX code, which as a mathematician blew my mind). I saw that a lot of people had been doing some really cool stuff with these networks, so I decided to try it out myself. Being a machine learning/Python novice, I decided I would learn far more if I started more or less from scratch. Thus PoetRNN was born.

Deploying a car price model using R and AzureML

Microsoft recently released the AzureML R package, which allows R users to publish their R models (or any R function) as a web service on the Microsoft Azure Machine Learning platform. Of course, I wanted to test the new package, so I performed the following steps.

Generalized Linear Models (GLMs) in R, Part 4: Options, Link Functions, and Interpretation

Last year I wrote several articles (GLM in R 1, GLM in R 2, GLM in R 3) that provided an introduction to Generalized Linear Models (GLMs) in R. As a reminder, Generalized Linear Models are an extension of linear regression models that allow the dependent variable to follow a non-normal distribution.
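To make the role of the link function concrete, here is a minimal sketch (in Python rather than R, and not taken from the article itself). In a GLM the linear predictor eta = b0 + b1*x1 + ... is mapped to the mean of the response through the inverse of the link function. The three links shown are the defaults R uses for the gaussian, poisson, and binomial families. The function name predict_mean and the coefficient values are made up for illustration only.

```python
import math

# Inverse link functions: each maps the linear predictor (eta)
# back to the mean of the response.
INVERSE_LINKS = {
    "identity": lambda eta: eta,                        # gaussian family default
    "log": lambda eta: math.exp(eta),                   # poisson family default
    "logit": lambda eta: 1.0 / (1.0 + math.exp(-eta)),  # binomial family default
}


def predict_mean(coefs, x, link):
    """Predicted mean for one observation (hypothetical helper).

    coefs is [intercept, b1, b2, ...] and x is [x1, x2, ...].
    """
    eta = coefs[0] + sum(b * xi for b, xi in zip(coefs[1:], x))
    return INVERSE_LINKS[link](eta)


# Illustrative (made-up) coefficients: intercept 0.5, slope 0.2.
coefs = [0.5, 0.2]
x = [1.0]
for link in INVERSE_LINKS:
    print(link, round(predict_mean(coefs, x, link), 4))
```

The same linear predictor gives a different predicted mean under each link, which is why coefficients from a log or logit model are interpreted on the link scale and not on the scale of the response.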

List of Machine Learning Certifications and Best Data Science Bootcamps

Everyone has a different style of learning. Hence, there are multiple ways to become a data scientist. You can learn from tutorials, blogs, books, hackathons, videos, and whatnot! I personally like self-paced learning aided by help from a community; it works best for me. What works best for you? If your answer to the question above is classroom or instructor-led certifications, you should check out machine learning certifications and data science bootcamps. They offer a great way to learn and prepare you for the role and expectations of a data scientist.