Bringing AI to the enterprise is easier said than done, and it is rife with challenges. Many businesses find themselves paralyzed by inaction, unsure of where to begin. Download the toolkit to find Your Path to Enterprise AI. Get tips from experts as well as in-depth how-tos on the required elements, all in easy-to-digest language that will ensure you’re ready to get started.
Natural language understanding(NLU) is one of the disciplines that has seen leading the deep learning revolution of the last few years. From basic chatbots to general-purpose digital assistants, conversational interfaces have become of the most prevalent manifestations of artificial intelligence(AI) influencing our daily lives. Despite the remarkable progress, NLU applications seem to be mostly constrained to task-specific models in which the language representation is very tailored to a specific task. Recently, researchers from Microsoft published a new paper and implementation of a new technique that can learn language representations across different NLU tasks. The specialization of NLU models has its roots on something call language embeddings. Conceptually, a language embedding is a process of mapping symbolic natural language text (for example, words, phrases and sentences) to semantic vector representations. Currently, most NLU models rely on domain-specific language embeddings that can’t be applicable to other NLU tasks. In order to create more general-purpose conversational applications, we need language embedding models that can be reused across different NLU tasks.
SVMs (Support Vector Machines) are a way to classify data by finding the optimal plane or hyperplane that separates the data. In 2D, the separation is a plane; In higher dimensions, it’s a hyperplane. For simplicity, the following picture shows how SVM works for a two-dimensional set.
Whether you’re a data scientist building an implementation case to present to executives or a non-data scientist leader trying to figure this out there’s a need for a much broader framework of strategic thinking around how to capture the value of AI/ML.
In this blog post, I discuss the design and implementation of kubeCDN, a tool designed to simplify geo-replication of Kubernetes clusters in order to deploy services with high availability on a global scale.
The central topic of this thesis are composition models of distributional semantics and their application for representing the semantics of German and English nominal compounds.Composition models are mathematical transformations that, given a compound like Apfelbaum ‘apple tree’, can be applied to the vector representations of Apfel ‘apple’ and Baum ‘tree’ to obtain a vector representation for the compound Apfelbaum ‘apple tree’. The new composed representation is deemed appropriate if it is similar to the representation of Apfelbaum that can be directly learned from large corpora using distributional methods.
Deep reinforcement learning (RL) techniques can be used to learn policies for complex tasks from visual inputs, and have been applied with great success to classic Atari 2600 games. Recent work in this field has shown that it is possible to get super-human performance in many of them, even in challenging exploration regimes such as that exhibited by Montezuma’s Revenge. However, one of the limitations of many state-of-the-art approaches is that they require a very large number of interactions with the game environment, often much larger than what people would need to learn to play well. One plausible hypothesis explaining why people learn these tasks so much more efficiently is that they are able to predict the effect of their own actions, and thus implicitly learn a model of which action sequences will lead to desirable outcomes. This general idea – building a so-called model of the game and using it to learn a good policy for selecting actions – is the main premise of model-based reinforcement learning (MBRL).
Object Detection in Aerial Images is a challenging and interesting problem. With the cost of drones decreasing, there is a surge in amount of aerial data being generated. It will be very useful to have models that can extract valuable information from aerial data. Retina Net is the most famous single stage detector and in this blog, I want to test it out on an aerial images of pedestrians and bikers from the Stanford Drone Data set. See a sample image below. This is a challenging problem since most objects are only a few pixels wide, some objects are occluded and objects in shade are even harder to detect. I have read several blogs of object detection on aerial images or cars/planes but there are only a few links for pedestrian detection aerially which is especially challenging.
Explainable AI is an essential component of a ‘Human AI’, i.e., an AI that expands human experience, instead of replacing it. It will be impossible to gain the trust of people in AI tools that make crucial decisions in an opaque way without explaining the rationale followed, especially in areas where we do not want to completely delegate decisions to machines. On the contrary, the last decade has witnessed the rise of a black box society. Black box AI systems for automated decision making, often based on machine learning over big data, map a user’s features into a class predicting the behavioural traits of individuals, such as credit risk, health status, etc., without exposing the reasons why. This is problematic not only for lack of transparency, but also for possible biases inherited by the algorithms from human prejudices and collection artifacts hidden in the training data, which may lead to unfair or wrong decisions .
A fundamental aspect of the reproducible research framework is that (statistical) analysis can be reproduced; that is, given a set of instructions (or a script file) the exact results can be achieved by another analyst with the same raw data. This idea may seem intuitive, but in practice it can be difficult to achieve in an analytical environment that is always evolving and changing.
This tutorial provides a gentle introduction to the particle Metropolis-Hastings (PMH) algorithm for parameter inference in nonlinear state-space models together with a software implementation in the statistical programming language R. We employ a step-by-step approach to develop an implementation of the PMH algorithm (and the particle filter within) together with the reader. This final implementation is also available as the package pmhtutorial from the Comprehensive R Archive Network (CRAN) repository. Throughout the tutorial, we provide some intuition as to how the algorithm operates and discuss some solutions to problems that might occur in practice. To illustrate the use of PMH, we consider parameter inference in a linear Gaussian state-space model with synthetic data and a nonlinear stochastic volatility model with real-world data.
Co-clustering (also known as biclustering), is an important extension of cluster analysis since it allows to simultaneously group objects and features in a matrix, resulting in row and column clusters that are both more accurate and easier to interpret. This paper presents the theory underlying several effective diagonal and non-diagonal co-clustering algorithms, and describes CoClust, a package which provides implementations for these algorithms. The quality of the results produced by the implemented algorithms is demonstrated through extensive tests performed on datasets of various size and balance. CoClust has been designed to complete and easily interface with popular Python machine learning libraries such as scikit-learn.
We provide a package called plssem that fits partial least squares structural equation models, which is often considered an alternative to the commonly known covariance-based structural equation modeling. plssem is developed in line with the algorithm provided by Wold (1975) and Lohmöller (1989). To demonstrate its features, we present an empirical application on the relationship between perception of self-attractiveness and two specific types of motivations for working out using a real-life data set. In the paper we also show that, in line with other software performing structural equation modeling, plssem can be used for putting in relation single-item observed variables too and not only for latent variable modeling.
In paired comparison models, the inclusion of covariates is a tool to account for the heterogeneity of preferences and to investigate which characteristics determine the preferences. Although methods for the selection of variables have been proposed no coherent framework that combines all possible types of covariates is available. There are three different types of covariates that can occur in paired comparisons, the covariates can either vary over the subjects, the objects or both the subjects and the objects of the paired comparisons. This paper gives an overview over all possible types of covariates in paired comparisons and introduces a general framework to include covariate effects into BradleyTerry models. For each type of covariate, appropriate penalty terms that allow for sparser models and, therefore, easier interpretation are proposed. The whole framework is implemented in the R package BTLLasso. The main functionality and the visualization tools of the package are introduced and illustrated by real data sets.
Designing neural networks for various tasks like image classification and natural language understanding often requires significant architecture engineering and expertise. Enter Neural Architecture Search (NAS), a task to automate the manual process of designing neural networks. NAS owes its growing research interest to the increasing prominence of deep learning models of late.