5 interesting subtle insights from TED videos data analysis in R

This post aims to bring out some not-so-obvious insights from analyzing TED videos posted on TED.com. For those who do not know what TED is, here’s the summary from Wikipedia: TED (Technology, Entertainment, Design) is a media organization which posts talks online for free distribution, under the slogan “ideas worth spreading”.

Introducing NIMA: Neural Image Assessment

Quantification of image quality and aesthetics has been a long-standing problem in image processing and computer vision. While technical quality assessment deals with measuring pixel-level degradations such as noise, blur, and compression artifacts, aesthetic assessment captures semantic-level characteristics associated with emotions and beauty in images. Recently, deep convolutional neural networks (CNNs) trained with human-labelled data have been used to address the subjective nature of image quality for specific classes of images, such as landscapes. However, these approaches can be limited in their scope, as they typically categorize images into two classes of low and high quality. Our proposed method predicts the distribution of ratings. This leads to a more accurate quality prediction with higher correlation to the ground truth ratings, and is applicable to general images.
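To see why predicting a distribution of ratings is richer than a binary low/high label, here is a minimal sketch in R with a made-up predicted distribution over the scores 1–10: the distribution collapses to a single mean quality score, and its spread tells you how much raters disagree. The probabilities below are purely hypothetical, not NIMA outputs.

```r
scores <- 1:10
# Hypothetical predicted probability for each rating bucket
probs <- c(0.01, 0.02, 0.05, 0.10, 0.20, 0.25, 0.20, 0.10, 0.05, 0.02)
probs <- probs / sum(probs)                       # normalize, just in case

mean_score <- sum(scores * probs)                 # expected quality rating
sd_score <- sqrt(sum(probs * (scores - mean_score)^2))  # rater disagreement
```

A binary classifier would discard `sd_score` entirely, yet two images with the same mean can have very different levels of rater agreement.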

New Poll: When will Artificial General Intelligence (AGI) be achieved?

Artificial General Intelligence (AGI) is defined as machine intelligence that could successfully perform any intellectual task that a human can. AGI could potentially bring enormous benefits: curing diseases, providing ample leisure time, eliminating car fatalities through safe self-driving cars, and more. AGI also poses enormous risks: eliminating most (or all) jobs, dramatically increasing inequality, and perhaps presenting an existential threat to humanity, as Elon Musk and Stephen Hawking warn. If AGI is achieved, the Singularity, when computer intelligence increases exponentially, could follow soon after.

The state of AI adoption

Artificial intelligence (AI) has attracted a lot of media coverage recently, and companies are rushing to figure out how AI technologies will impact them. Much of the coverage is devoted to research breakthroughs or new product offerings. But how are companies integrating AI into their underlying businesses? In this post, we share slides and notes from a talk we gave this past September at the AI Conference in San Francisco, offering an overview of the state of adoption and some suggestions to companies interested in implementing AI technologies.

How to Perform Hierarchical Clustering using R

Clustering is a technique for grouping similar data points together and separating dissimilar observations into different groups, or clusters. In Hierarchical Clustering, clusters are created such that they have a predetermined ordering, i.e. a hierarchy. For example, consider the concept hierarchy of a library: a library has many sections, each section has many books, and the books are grouped according to their subject. This forms a hierarchy. In Hierarchical Clustering, this hierarchy of clusters can be built either from the top down or from the bottom up, giving two types: Divisive and Agglomerative. Let’s discuss them in detail.
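As a quick illustration of the agglomerative (bottom-up) variety, here is a minimal sketch in base R using the built-in `mtcars` data set (the original post may use different data and linkage settings): compute pairwise distances, agglomerate with `hclust`, then cut the resulting dendrogram into a chosen number of clusters.

```r
# Agglomerative hierarchical clustering on the built-in mtcars data
d <- dist(scale(mtcars), method = "euclidean")  # pairwise distances on scaled columns
hc <- hclust(d, method = "complete")            # complete-linkage agglomeration
clusters <- cutree(hc, k = 3)                   # cut the dendrogram into 3 clusters

plot(hc)          # draw the dendrogram (the full hierarchy)
table(clusters)   # how many cars fall in each cluster
```

Divisive clustering works in the opposite direction, starting from one all-encompassing cluster and splitting it; in R it is available, for example, as `cluster::diana`.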

Teaching the tidyverse to beginners

This year, I taught three classes: a 12-hour class to colleagues who work with me, a 15-hour class to master’s students, and 3 hours again to some of my colleagues. Each time, I decided to focus (almost) entirely on the tidyverse, and I must say that I am not disappointed with the results! The 12-hour class was divided into two 6-hour days. It was a bit intense, especially the last 3 hours, which took place on a Friday afternoon. The crowd was composed of some economists who had experience with STATA, others who mostly used Excel, and finally some colleagues from the IT department who sometimes need to dig into data themselves. Apart from 2 people, none of the others had ever used R. We went from 0 to being able to produce the plot below by the end of the first day (so 6 hours in). Keep in mind that practically none of them had even opened RStudio before. I show the code so you can see the progress made in just a few hours:
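The code from the original post is not reproduced here; as a purely hypothetical stand-in, the sketch below shows the kind of dplyr-plus-ggplot2 pipeline a beginner can realistically write after a day with the tidyverse, using the built-in `mtcars` data set.

```r
library(dplyr)
library(ggplot2)

# Hypothetical example: summarise a data set, then plot the summary
summary_df <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), n = n())

ggplot(summary_df, aes(x = factor(cyl), y = mean_mpg)) +
  geom_col() +
  labs(x = "Number of cylinders", y = "Average miles per gallon")
```

The appeal for beginners is that each step reads left to right as a sentence: take the data, group it, summarise it, plot it.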