SemTK
The relatively recent adoption of Knowledge Graphs as an enabling technology in multiple high-profile artificial intelligence and cognitive applications has led to growing interest in the Semantic Web technology stack. Many semantics-related tools, however, are focused on serving experts with a deep understanding of semantic technologies. For example, triplification of relational data is available, but there is no open-source tool that allows a user unfamiliar with OWL/RDF to import data into a semantic triple store in an intuitive manner. Further, many tools require users to have a working understanding of SPARQL to query data. Casual users interested in benefiting from the power of Knowledge Graphs have few tools available for exploring, querying, and managing semantic data. We present SemTK, the Semantics Toolkit, a user-friendly suite of tools that gives both expert and non-expert semantics users convenient ingestion of relational data, simplified query generation, and more. The exploration of ontologies and instance data is performed through SPARQLgraph, an intuitive web-based user interface in SemTK that a lay user can understand and navigate. The open-source version of SemTK is available at http://semtk.research.ge.com.
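
To make "triplification of relational data" concrete, here is a minimal sketch of turning table rows into subject-predicate-object triples. It does not use SemTK or its API; the function name `triplify`, the `http://example.org/` base URI, and the sample rows are all hypothetical and chosen only for illustration.

```python
def triplify(rows, table, key, base="http://example.org/"):
    """Convert relational rows (dicts) into (subject, predicate, object) triples.

    Assumption: each row becomes one resource whose URI is built from the
    table name and the value of the key column. This is an illustrative
    naming scheme, not the one SemTK uses.
    """
    triples = []
    for row in rows:
        subject = f"{base}{table}/{row[key]}"
        # Declare the resource as an instance of a class named after the table.
        triples.append((subject, "rdf:type", f"{base}{table}"))
        for column, value in row.items():
            # Skip the key column (already encoded in the URI) and NULLs.
            if column == key or value is None:
                continue
            triples.append((subject, f"{base}{table}#{column}", value))
    return triples


# Hypothetical sample data standing in for a relational table.
rows = [
    {"id": 1, "name": "turbine-A", "site": "Niskayuna"},
    {"id": 2, "name": "turbine-B", "site": None},
]
triples = triplify(rows, table="Asset", key="id")
for t in triples:
    print(t)
```

A real ingestion tool would additionally validate the rows against an ontology and write the triples to a triple store; the sketch only shows the row-to-triple mapping itself.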

ARCHANGEL
We present ARCHANGEL, a novel distributed-ledger-based system for assuring the long-term integrity of digital video archives. First, we describe a novel deep network architecture for computing compact temporal content hashes (TCHs) from audio-visual streams with durations of minutes or hours. Our TCHs are sensitive to accidental or malicious content modification (tampering) but invariant to the codec used to encode the video. This is necessary due to the curatorial requirement for archives to format-shift video over time to ensure future accessibility. Second, we describe how the TCHs (and the models used to derive them) are secured via a proof-of-authority blockchain distributed across multiple independent archives. We report on the efficacy of ARCHANGEL within the context of a trial deployment in which the national government archives of the United Kingdom, Estonia and Norway participated. …

Kernel Mean-p Power Error (KMPE)
Correntropy is a second-order statistical measure in kernel space that has been successfully applied in robust learning and signal processing. In this paper, we define a non-second-order statistical measure in kernel space, called the kernel mean-p power error (KMPE), which includes the correntropic loss (CLoss) as a special case. Some basic properties of KMPE are presented. In particular, we apply the KMPE to extreme learning machine (ELM) and principal component analysis (PCA), and develop two robust learning algorithms, namely ELM-KMPE and PCA-KMPE. Experimental results on synthetic and benchmark data show that the developed algorithms can achieve consistently better performance when compared with some existing methods. …
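
The abstract does not give a formula, so the following is a sketch under stated assumptions: a Gaussian kernel with bandwidth `sigma`, and an empirical KMPE taken as the sample mean of (1 - kernel(e))^(p/2) over the errors e. Under that assumed form, setting p = 2 recovers the correntropic loss, which is what the example checks. The function names are hypothetical.

```python
import math


def gaussian_kernel(e, sigma):
    """Gaussian kernel evaluated at the error e (assumed kernel choice)."""
    return math.exp(-(e * e) / (2.0 * sigma * sigma))


def kmpe(errors, p=2.0, sigma=1.0):
    """Empirical KMPE under the assumed form: mean of (1 - kernel(e))^(p/2)."""
    total = sum((1.0 - gaussian_kernel(e, sigma)) ** (p / 2.0) for e in errors)
    return total / len(errors)


def closs(errors, sigma=1.0):
    """Empirical correntropic loss: 1 minus the mean kernel value."""
    return 1.0 - sum(gaussian_kernel(e, sigma) for e in errors) / len(errors)


# Hypothetical error samples, including one large outlier.
errors = [0.1, -0.3, 0.2, 8.0]

print("KMPE (p=2):", kmpe(errors, p=2.0))
print("CLoss     :", closs(errors))
print("KMPE (p=1):", kmpe(errors, p=1.0))
```

Because each term is bounded above by 1, a single large outlier cannot dominate the average the way it does in a mean squared error, which is the intuition behind using such a loss for robust learning.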

Adaptive Function-on-Scalar Regression with a Smoothing Elastic Net (AFSSEN)
This paper presents a new methodology, called AFSSEN, to simultaneously select significant predictors and produce smooth estimates in a high-dimensional function-on-scalar linear model with sub-Gaussian errors. Outcomes are assumed to lie in a general real separable Hilbert space, H, while parameters lie in a subspace known as a Cameron-Martin space, K, which is closely related to Reproducing Kernel Hilbert Spaces, so that parameter estimates inherit particular properties, such as smoothness or periodicity, without enforcing such properties on the data. We propose a regularization method in the style of an adaptive Elastic Net penalty that involves mixing two types of functional norms, providing fine-tuned control of both the smoothing and variable selection in the estimated model. Asymptotic theory is provided in the form of a functional oracle property, and the paper concludes with a simulation study demonstrating the advantage of using AFSSEN over existing methods in terms of prediction error and variable selection. …
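
The abstract does not state the objective, so the following is only a sketch of what a penalty "mixing two types of functional norms" might look like, not the paper's exact formulation. The symbols N (observations), I (predictors), the tuning parameters \lambda_K and \lambda_H, and the adaptive weights w_i are assumptions introduced here for illustration. The first line is the standard function-on-scalar linear model; the second is an assumed elastic-net-style objective in which a squared K-norm term controls smoothness and a weighted H-norm term drives variable selection.

```latex
% Function-on-scalar linear model: functional outcomes, scalar predictors.
Y_n = \sum_{i=1}^{I} X_{ni}\,\beta_i + \varepsilon_n,
\qquad Y_n,\ \varepsilon_n \in \mathbb{H}, \quad \beta_i \in \mathbb{K}

% Illustrative adaptive elastic-net-style objective (assumed form).
\hat{\beta} = \arg\min_{\beta \in \mathbb{K}^{I}}
  \frac{1}{2N} \sum_{n=1}^{N}
    \Bigl\| Y_n - \sum_{i=1}^{I} X_{ni}\,\beta_i \Bigr\|_{\mathbb{H}}^{2}
  + \frac{\lambda_K}{2} \sum_{i=1}^{I} \|\beta_i\|_{\mathbb{K}}^{2}
  + \lambda_H \sum_{i=1}^{I} w_i \,\|\beta_i\|_{\mathbb{H}}
```

In this assumed form, a predictor is dropped when its coefficient function \beta_i is shrunk to zero in the H-norm, while the K-norm term keeps the surviving estimates smooth.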