Recommendations – SlideShare Presentations on Data Science

Everyone has their own learning style! If you need close hand-holding and guidance, an easygoing MOOC is probably the best place to start. However, if you are a quick learner who doesn’t need someone to explain a lot of context, and you prefer to glance through concepts, apply them a bit, and then refer back to them, presentations can be really handy!

First year books

I had to read a lot of books in graduate school. Some were life-changing, and others were forgettable. If I could bring a reading list back in time for my ‘first year’ graduate self, it would include the following …

How do you know if your model is going to work? Part 2: In-training set measures

When fitting and selecting models in a data science project, how do you know that your final model is good? And how sure are you that it’s better than the models that you rejected? In Part 2 of our four-part mini-series “How do you know if your model is going to work?” we develop in-training set measures.

A Simple Interactive Map Of US Prisons With Leaflet

Some time ago, I discovered Enigma, an amazing open platform that unifies billions of records from thousands of government sources to make the world of public data universally accessible and useful. This is the first experiment I have done using data from Enigma.

comic phylogenetic tree with ggtree and comicR

ggtree applies the concepts of the grammar of graphics to phylogenetic tree presentation and makes it easy to add multiple layers of text and even figures above a tree. Here, I cartoonize a phylogenetic tree generated by ggtree with comicR, a fun package for generating comic (xkcd-like) graphs in R. Have fun with ggtree and comicR.

How to Search the Internet by Chemical Structure

iScienceSearch is a search engine for chemists and biologists that lets you search many structure databases, from PubChem to Wikipedia, in one go. You can search by structure, substructure, similarity, text, and synonyms.

Building Packages in R – Part 0: Setting Up R

One of the highly touted features of R is that it allows you, me, and everyone else to create packages. Packages are collections of functions that enable end users to analyze their data more quickly and efficiently. The package framework not only lets experts distribute their statistical approaches and techniques to a wide audience via CRAN, it also lets us collect the functions we define ourselves in a single place. Packages are a great way of making your analysis more efficient if you often find yourself repeating the same steps. Over the coming weeks I will try to give a brief and simple introduction to the basics of building R packages for your own home use. Mostly, it will contain things I wish someone had told me when I started combining my own functions into packages. This week we’ll start with the prerequisites of your R setup, which is why I labeled it Part 0.