Mapping 24 Emotions Conveyed by Brief Human Vocalization

Emotional vocalizations are central to human social life. Recent studies have documented that people recognize at least 13 emotions in brief vocalizations. This capacity emerges early in development, is preserved in some form across cultures, and informs how people respond emotionally to music. What is poorly understood is how emotion recognition from vocalization is structured within what we call a semantic space, the study of which addresses questions critical to the field: How many distinct kinds of emotions can be expressed? Do expressions convey emotion categories or affective appraisals (e.g., valence, arousal)? Is the recognition of emotion expressions discrete or continuous? Guided by a new theoretical approach to emotion taxonomies, we apply large-scale data collection and analysis techniques to judgments of 2,032 emotional vocal bursts produced in laboratory settings (Study 1) and 48 found in the real world (Study 2) by U.S. English speakers (N = 1,105). We find that vocal bursts convey at least 24 distinct kinds of emotion. Emotion categories (sympathy, awe), more so than affective appraisals (including valence and arousal), organize emotion recognition. In contrast to discrete emotion theories, the emotion categories conveyed by vocal bursts are bridged by smooth gradients with continuously varying meaning. We visualize the complex, high-dimensional space of emotion conveyed by brief human vocalization within an online interactive map.

5 New Generative Adversarial Network (GAN) Architectures For Image Synthesis

AI image synthesis has made impressive progress since Generative Adversarial Networks (GANs) were introduced in 2014. GANs were originally only capable of generating small, blurry, black-and-white pictures, but now we can generate high-resolution, realistic and colorful pictures that you can hardly distinguish from real photographs. Here we have summarized for you 5 recently introduced GAN architectures that are used for image synthesis.

Essential NLP Tools, Code, and Tips

In a previous article, we introduced the influential impact of natural language processing (NLP) in different industries and explained the way this discipline is reshaping several fields, yet facing huge challenges on its way. The main drawbacks we face these days with NLP relate to the fact that language is very tricky. The process of understanding and manipulating language is extremely complex, and for this reason, it is common to use different techniques to handle different challenges before binding everything together. Programming languages like Python or R are highly used to perform these techniques, but before diving into code lines (that will be the topic of a different article), it’s important to understand the concepts beneath them.

Tackling Bias in Machine Learning

Machine learning and AI applications are used across industries, from recommendation engines to self-driving cars and more. When machine learning is used in automated decision-making, it can create issues with transparency, accountability, and equity. For example, last year it came to light that the AI tool Amazon built to automate their hiring process had to be shut down because it was discriminating against women.

The FAIR Guiding Principles for scientific data management and stewardship

There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders – representing academia, industry, funding agencies, and scholarly publishers – have come together to design and jointly endorse a concise and measureable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them, and some exemplar implementations in the community.
FAIR = Findability, Accessibility, Interoperability and Reusability

Is Your Data FAIR? An Open Data Checklist for Success

Many believe that data should be recognized as the ‘oil of the 21st century’, the world’s most valuable resource. This opinion is spreading, with 64% of more than 1,000 researchers who contributed to the State of Open Data Report in 2018 making their data available, compared to just 57% in 2016. There are huge social and financial benefits that businesses and economies can realize if they can successfully leverage Open Data. Despite this, there are still some hurdles for data professionals to leap. A great way to start is to consider whether your data meets the criteria for what’s known as the FAIR principles. These are Findability, Accessibility, Interoperability and Reusability.

Overcoming distrust on the path to productive analytics

We outline the importance of overcoming distrust in data and analytics, with tips on how to align all stakeholders, being a data optimist, streamlining the process, and more.

Artificial Neural Networks Optimization using Genetic Algorithm with Python

In a previous tutorial titled ‘Artificial Neural Network Implementation using NumPy and Classification of the Fruits360 Image Dataset’ available in my LinkedIn profile at this link https://…implementation-using-numpy-fruits360-gad, an artificial neural network (ANN) is created for classifying 4 classes of the Fruits360 image dataset. The source code used in this tutorial is available in my GitHub page here: https://…/NumPyANN A quick summary of this tutorial is extracting the feature vector (360 bins hue channel histogram) and reducing it to just 102 element by using a filter-based technique using the standard deviation. Later, the ANN is built from scratch using NumPy.

The Credibility Crisis in Data Science

There’s a tendency to focus on minutia, like what neural network architecture are we using? Are we using R or are we using Python? What method are you using? These kind of things, which are not as important to decision makers. One thing that I’ve heard just in working here at Civis is like you say from a CEO of a very large company that every one would know, if I mentioned who they were. I mean it’s basically saying if I have all these data scientists, I have hundreds of data scientists and I have no idea what the fuck they do all day. It’s like a part of a profession, part of a class of jobs. That’s not what you wanna be. You don’t want the decision makers and the people who are supposed to be benefiting from your insights, not able to discern what it is you do not understanding what your output is.

A gentle introduction to SHAP values in R

This novel approach allows us to dig a little bit more in the complexity of the predictive model results, while it allows us to explore the relationships between variables for predicted case.

Network Analysis of Emotions

In this month’s post, I set out to create a visual network of emotions. Emotion Dynamics tells us that different emotions are highly interconnected, such that one emotion morphs into another and so on. I’ll be using a large dataset from an original study published in PLOS ONE by Trampe, Quoidbach, and Taquet (2015). Thanks to Google Dataset Search, I was able to locate this data. The data is collected from 11,000 participants who completed daily questionnaires on the emotions they felt at a given moment. The original paper is fascinating and I highly encourage checking it out – not to mention that the author’s analysis is the inspiration for this post. The raw data can be freely accessed from the author’s OSF page (link in online article) – props to them for publishing the data! What is a network? In a sentence, a network is a complex set of interrelations between variables. Some terminology: nodes are the variables (in this case, emotions), and edges are the relationships between the variables. Networks can be directed, which means that variables are linked in a sequence (e.g, from emotion A to emotion B), or undirected, which just shows the relationships. Trampe et al. (2015) created an undirected network in their paper, but the data also allows for a directed network – and this is what I’m going to make for this post.

Speeding Up and Perfecting Your Work Using Parallel Computing

A detailed guide of Python multiprocessing vs. PySpark mapPartition. In science, behind every achievement is grinding, rigorous work. And success is unlikely to happen with one attempt. As a data scientist, you probably deal with huge amount of data and computations, perform repeated tests and experiments on your day-to-day work. Though you don’t want to turn your rewarding, stimulating job into tedious one by waiting the time-consuming operation repeating again and again, observation after observation.

Creating a discord sentiment analysis bot using VADER

Sentiment analysis refers to the use of natural language processing, text analysis, computational linguistics, and many more to identify and quantify the sentiment of some kind of text or audio. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media. VADER makes it easy for us to create a sentiment analysis application. In this article, we will create a discord bot that can analyze the sentiment of the written messages.

The Complete TensorFlow Tutorial for Newbies

TensorFlow is a robust framework for machine learning and deep learning. It makes it easier to build models and deploy them for production. It is the most popular framework among developers. This comes with no surprise, as the framework is also available for web-based machine learning (TensorFlow.js) and for on-device inference (TensorFlow Lite). Furthermore, with the recent announcement of TensorFlow 2.0, the framework will soon be easier to use, as the syntax will be simplified with fewer APIs, and it will support the Julia programming language. It is a great time to get started with TensorFlow, and mastering it is an important asset data scientists. This tutorial will get you up and running with TensorFlow. Note that TensorFlow 2.0 is not stable, so we will focus on the previous stable version. We will first install the framework in the easiest way possible, then we will write a few functions to learn the syntax and to use a few APIs. Finally, we will write a model that will recognize hand signs. Let’s get started!

Introducing Mercury-ML: an open-source ‘messenger of the machine learning gods’

In the ancient Roman mythology, the god Mercury was known as ‘the messenger of the gods’. Wearing winged shoes and a winged hat, he zipped between Mount Olympus and the kingdoms of men and saw to it that the will of the gods was known. He wasn’t the strongest, the wisest, the most revered, or the most feared of the gods, but he was fleet of foot and cunning and could be relied upon to steer events to their desired outcomes. Without him Perseus could not have defeated Medusa; Odysseus would have fallen to Circe’s spells; and Hercules could not have dragged Cerberus from Hades, thereby completing the final of his 12 mythical labours… With this post I would like to introduce a new initiative called Mercury-ML, and open-source ‘messenger of the machine learning gods’.

Interactive spreadsheets in Jupyter

ipywidgets plays an essential part in the Jupyter ecosystem; it brings interactivity between user and data. Widgets are eventful Python objects that often have a visual representation in the Jupyter Notebook or JupyterLab: a button, a slider, a text input, a checkbox… More than a library of interactive widgets, ipywidgets is a powerful framework upon which it is straightforward to create new custom widgets. Developers can quickly start their own widgets library with best practices of code structure and packaging using the widget-cookiecutter project. You can find examples of really nice widgets libraries in the blog-post: Video streaming in the Jupyter Notebook. A spreadsheet is an interactive tool for data analysis in a tabular form. It consists of cells and cell ranges. It supports value dependent cell formatting/styling and one can apply mathematical functions on cells and perform chained computations. It is the perfect user interface for statistical and financial operations. The Jupyter Notebook was lacking a spreadsheet library, that’s when ipysheet comes into play.