Dawid-Skene Algorithm (DSA) google
More and more online communities classify contributions based on collaborative ratings of these contributions. A popular method for such a rating-based classification is the Dawid-Skene algorithm (DSA). However, despite its popularity, DSA has two major shortcomings:
(1) It is vulnerable to raters with a low competence, i.e., a low probability of rating correctly.
(2) It is defenseless against collusion attacks.
In a collusion attack, raters coordinate to rate the same data objects with the same value to artificially increase their remuneration.
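At its core, DSA is an EM loop that alternates between estimating each item's true class and each rater's confusion matrix. A minimal tabular sketch (variable names, initialization, and smoothing choices are ours, not from a reference implementation):

```python
import numpy as np

def dawid_skene(labels, n_iter=50):
    """EM estimation of true item classes from noisy rater labels.

    labels: int array of shape (n_items, n_raters), values in {0..K-1},
            -1 for missing; each item is assumed to have at least one label.
    Returns an (n_items, K) array of posterior class probabilities.
    """
    n_items, n_raters = labels.shape
    K = int(labels.max()) + 1
    # Initialize posteriors with a (soft) majority vote.
    T = np.zeros((n_items, K))
    for i in range(n_items):
        for r in range(n_raters):
            if labels[i, r] >= 0:
                T[i, labels[i, r]] += 1.0
    T /= T.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # M-step: re-estimate class prior and per-rater confusion matrices.
        prior = np.clip(T.mean(axis=0), 1e-6, None)
        conf = np.full((n_raters, K, K), 1e-6)  # smoothed soft counts
        for r in range(n_raters):
            for i in range(n_items):
                if labels[i, r] >= 0:
                    conf[r, :, labels[i, r]] += T[i]
        conf /= conf.sum(axis=2, keepdims=True)
        # E-step: recompute class posteriors from prior and confusion matrices.
        logT = np.tile(np.log(prior), (n_items, 1))
        for i in range(n_items):
            for r in range(n_raters):
                if labels[i, r] >= 0:
                    logT[i] += np.log(conf[r, :, labels[i, r]])
        T = np.exp(logT - logT.max(axis=1, keepdims=True))
        T /= T.sum(axis=1, keepdims=True)
    return T
```

Note that colluding raters who always vote identically simply end up with near-identical confusion matrices; nothing in this loop flags that, which is the defenselessness against collusion noted above.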
Error Rate Analysis of Labeling by Crowdsourcing
Fast Dawid-Skene

Successor Representation (SR) google
Here we propose using the successor representation (SR) to accelerate learning in a constructive knowledge system based on general value functions (GVFs). In real-world settings like robotics for unstructured and dynamic environments, it is infeasible to model all meaningful aspects of a system and its environment by hand due to both complexity and size. Instead, robots must be capable of learning and adapting to changes in their environment and task, incrementally constructing models from their own experience. GVFs, taken from the field of reinforcement learning (RL), are a way of modeling the world as predictive questions. One approach to such models proposes a massive network of interconnected and interdependent GVFs, which are incrementally added over time. It is reasonable to expect that new, incrementally added predictions can be learned more swiftly if the learning process leverages knowledge gained from past experience. The SR provides such a means of separating the dynamics of the world from the prediction targets and thus capturing regularities that can be reused across multiple GVFs. As a primary contribution of this work, we show that using SR-based predictions can improve sample efficiency and learning speed in a continual learning setting where new predictions are incrementally added and learned over time. We analyze our approach in a grid-world and then demonstrate its potential on data from a physical robot arm. …
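The separation of world dynamics from prediction targets can be made concrete in the tabular case: the SR matrix M[s, s'] estimates the expected discounted future occupancy of s' starting from s, is learned once by TD(0), and then any new prediction target (cumulant vector c) is answered as M @ c without relearning the dynamics. A small sketch on a hypothetical 5-state chain (the environment and constants are ours, purely illustrative):

```python
import numpy as np

n_states, gamma, alpha = 5, 0.9, 0.1
M = np.zeros((n_states, n_states))   # tabular successor representation
I = np.eye(n_states)
rng = np.random.default_rng(0)

# Random-walk policy on a ring of states; TD(0) update for the SR:
# M[s] <- M[s] + alpha * (onehot(s) + gamma * M[s'] - M[s]).
s = 0
for _ in range(20000):
    s_next = (s + rng.choice([-1, 1])) % n_states
    M[s] += alpha * (I[s] + gamma * M[s_next] - M[s])
    s = s_next

# Reuse: two different GVF-style prediction targets answered from the same SR,
# without any further interaction with the environment.
c1 = I[3]                      # discounted future occupancy of state 3
c2 = rng.random(n_states)      # an arbitrary new cumulant
v1, v2 = M @ c1, M @ c2
```

At the fixed point each row of M sums to 1 / (1 - gamma), since M = I + gamma * P * M for the policy's transition matrix P; this is a handy sanity check on the learned SR.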

Gradient Deflection google
Over-parameterized neural networks generalize well in practice without any explicit regularization. Although it has not been proven yet, empirical evidence suggests that implicit regularization plays a crucial role in deep learning and prevents the network from overfitting. In this work, we introduce the gradient gap deviation and the gradient deflection as statistical measures, corresponding to the network curvature and the Hessian matrix, to analyze variations of network derivatives with respect to input parameters, and investigate how implicit regularization works in ReLU neural networks from both theoretical and empirical perspectives. Our results reveal that the network output between each pair of input samples is properly controlled by random initialization and stochastic gradient descent, so that the interpolation between samples stays almost straight; this results in the low complexity of over-parameterized neural networks. …
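The "almost straight interpolation between samples" claim can be probed directly: evaluate the network along the segment between two inputs and measure how far its output deviates from the straight line connecting the endpoint outputs. This is an illustrative measurement, not the paper's exact definitions of gradient gap deviation or gradient deflection, and the tiny random (untrained) network below is ours:

```python
import numpy as np

rng = np.random.default_rng(1)

def relu_net(x, weights):
    """Forward pass of a small fully connected ReLU network with scalar output."""
    h = x
    for W in weights[:-1]:
        h = np.maximum(h @ W, 0.0)
    return h @ weights[-1]  # final layer is linear

d = 8
weights = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(2)]
weights.append(rng.standard_normal(d) / np.sqrt(d))  # scalar readout

x1, x2 = rng.standard_normal(d), rng.standard_normal(d)
f1, f2 = relu_net(x1, weights), relu_net(x2, weights)

# Maximum deviation of the output along the segment from the straight
# line between f(x1) and f(x2); zero would mean perfectly linear behavior.
alphas = np.linspace(0.0, 1.0, 51)
deviation = max(
    abs(relu_net((1 - a) * x1 + a * x2, weights) - ((1 - a) * f1 + a * f2))
    for a in alphas
)
```

The abstract's claim is that, for trained over-parameterized networks, this kind of deviation stays small between training samples; on an untrained random network it need not be.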

Visual Data Management System (VDMS) google
We introduce the Visual Data Management System (VDMS), a data management solution that enables efficient access to big visual data to support visual analytics. This is achieved by searching for relevant visual data via metadata stored as a graph, and by enabling faster access to visual data through new machine-friendly storage formats. VDMS differs from existing large-scale photo serving, video streaming, and textual big-data management systems in its primary focus on supporting machine learning and data analytics pipelines that use visual data, and in its treatment of visual data such as images, videos, and feature vectors as first-class entities. We describe how to use VDMS via its user-friendly interface, and how it enables rich and efficient vision analytics through a machine learning pipeline for processing medical images. We show a 2x performance improvement on complex queries over a comparable setup. …
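VDMS accepts queries expressed as JSON arrays of commands that are matched against the metadata graph. A sketch of what such a query might look like, following our understanding of VDMS's JSON interface (the FindImage command is from VDMS; the metadata property names here are hypothetical, chosen to fit the medical-imaging example):

```python
import json

# Illustrative VDMS-style JSON query: find CT images for one patient via
# metadata constraints, and return the listed properties with each match.
# Property names ("modality", "patient_id") are hypothetical examples.
query = [
    {
        "FindImage": {
            "constraints": {
                "modality": ["==", "CT"],
                "patient_id": ["==", "P-001"],
            },
            "results": {"list": ["modality", "patient_id"]},
        }
    }
]

payload = json.dumps(query)  # this JSON is what a VDMS client would send to the server
```

Executing the query requires a running VDMS server and a client connection, which is why this sketch stops at constructing the JSON payload.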