Data Markets to support AI for All: Pricing, Valuation and Governance

We discuss a data market technique based on intrinsic (relevance and uniqueness) as well as extrinsic value (influenced by supply and demand) of data. For intrinsic value, we explain how to perform valuation of data in absolute terms (i.e just by itself), or relatively (i.e in comparison to multiple datasets) or in conditional terms (i.e valuating new data given currently existing data).

FAT-DeepFFM: Field Attentive Deep Field-aware Factorization Machine

Click through rate (CTR) estimation is a fundamental task in personalized advertising and recommender systems. Recent years have witnessed the success of both the deep learning based model and attention mechanism in various tasks in computer vision (CV) and natural language processing (NLP). How to combine the attention mechanism with deep CTR model is a promising direction because it may ensemble the advantages of both sides. Although some CTR model such as Attentional Factorization Machine (AFM) has been proposed to model the weight of second order interaction features, we posit the evaluation of feature importance before explicit feature interaction procedure is also important for CTR prediction tasks because the model can learn to selectively highlight the informative features and suppress less useful ones if the task has many input features. In this paper, we propose a new neural CTR model named Field Attentive Deep Field-aware Factorization Machine (FAT-DeepFFM) by combining the Deep Field-aware Factorization Machine (DeepFFM) with Compose-Excitation network (CENet) field attention mechanism which is proposed by us as an enhanced version of Squeeze-Excitation network (SENet) to highlight the feature importance. We conduct extensive experiments on two real-world datasets and the experiment results show that FAT-DeepFFM achieves the best performance and obtains different improvements over the state-of-the-art methods. We also compare two kinds of attention mechanisms (attention before explicit feature interaction vs. attention after explicit feature interaction) and demonstrate that the former one outperforms the latter one significantly.

The Statistical Finite Element Method

The finite element method (FEM) is one of the great triumphs of modern day applied mathematics, numerical analysis and algorithm development. Engineering and the sciences benefit from the ability to simulate complex systems with FEM. At the same time the ability to obtain data by measurements from these complex systems, often through sensor networks, poses the question of how one systematically incorporates data into the FEM, consistently updating the finite element solution in the face of mathematical model misspecification with physical reality. This paper presents a statistical construction of FEM which goes beyond forward uncertainty propagation or solving inverse problems, and for the first time provides the means for the coherent synthesis of data and FEM.

IPC: A Benchmark Data Set for Learning with Graph-Structured Data

Benchmark data sets are an indispensable ingredient of the evaluation of graph-based machine learning methods. We release a new data set, compiled from International Planning Competitions (IPC), for benchmarking graph classification, regression, and related tasks. Apart from the graph construction (based on AI planning problems) that is interesting in its own right, the data set possesses distinctly different characteristics from popularly used benchmarks. The data set, named IPC, consists of two self-contained versions, grounded and lifted, both including graphs of large and skewedly distributed sizes, posing substantial challenges for the computation of graph models such as graph kernels and graph neural networks. The graphs in this data set are directed and the lifted version is acyclic, offering the opportunity of benchmarking specialized models for directed (acyclic) structures. Moreover, the graph generator and the labeling are computer programmed; thus, the data set may be extended easily if a larger scale is desired. The data set is accessible from \url{https://…/IPC-graph-data}.

End-to-End Entity Resolution for Big Data: A Survey

One of the most important tasks for improving data quality and the reliability of data analytics results is Entity Resolution (ER). ER aims to identify different descriptions that refer to the same real-world entity, and remains a challenging problem. While previous works have studied specific aspects of ER (and mostly in traditional settings), in this survey, we provide for the first time an end-to-end view of modern ER workflows, and of the novel aspects of entity indexing and matching methods in order to cope with more than one of the Big Data characteristics simultaneously. We present the basic concepts, processing steps and execution strategies that have been proposed by different communities, i.e., database, semantic Web and machine learning, in order to cope with the loose structuredness, extreme diversity, high speed and large scale of entity descriptions used by real-world applications. Finally, we provide a synthetic discussion of the existing approaches, and conclude with a detailed presentation of open research directions.

Transfer Entropy in Continuous Time

Transfer entropy (TE) was introduced by Schreiber in 2000 as a measurement of the predictive capacity of one stochastic process with respect to another. Originally stated for discrete time processes, we expand the theory of TE to stochastic processes indexed over a compact interval taking values in a Polish state space. We provide a definition for continuous time TE using the Radon-Nikodym Theorem, random measures, and projective limits of probability spaces. Furthermore, we provide necessary and sufficient conditions to obtain this definition as a limit of discrete time TE, as well as illustrate its application via an example involving Poisson point processes. As a derivative of continuous time TE, we also define the transfer entropy rate between two processes and show that (under mild assumptions) their stationarity implies a constant rate. We also investigate TE between homogeneous Markov jump processes and discuss some open problems and possible future directions.

Joint Learning of Neural Networks via Iterative Reweighted Least Squares

In this paper, we introduce the problem of jointly learning feed-forward neural networks across a set of relevant but diverse datasets. Compared to learning a separate network from each dataset in isolation, joint learning enables us to extract correlated information across multiple datasets to significantly improve the quality of learned networks. We formulate this problem as joint learning of multiple copies of the same network architecture and enforce the network weights to be shared across these networks. Instead of hand-encoding the shared network layers, we solve an optimization problem to automatically determine how layers should be shared between each pair of datasets. Experimental results show that our approach outperforms baselines without joint learning and those using pretraining-and-fine-tuning. We show the effectiveness of our approach on three tasks: image classification, learning auto-encoders, and image generation.

Data Processing Protocol for Regression of Geothermal Times Series with Uneven Intervals

Regression of data generated in simulations or experiments has important implications in sensitivity studies, uncertainty analysis, and prediction accuracy. Depending on the nature of the physical model, data points may not be evenly distributed. It is not often practical to choose all points for regression of a model because it doesn’t always guarantee a better fit. Fitness of the model is highly dependent on the number of data points and the distribution of the data along the curve. In this study, the effect of the number of points selected for regression is investigated and various schemes aimed to process regression data points are explored. Time series data i.e., output varying with time, is our prime interest mainly the temperature profile from enhanced geothermal system. The objective of the research is to find a better scheme for choosing a fraction of data points from the entire set to find a better fitness of the model without losing any features or trends in the data. A workflow is provided to summarize the entire protocol of data preprocessing, regression of mathematical model using training data, model testing, and error analysis. Six different schemes are developed to process data by setting criteria such as equal spacing along axes (X and Y), equal distance between two consecutive points on the curve, constraint in the angle of curvature, etc. As an example for the application of the proposed schemes, 1 to 20% of the data generated from the temperature change of a typical geothermal system is chosen from a total of 9939 points. It is shown that the number of data points, to a degree, has negligible effect on the fitted model depending on the scheme. The proposed data processing schemes are ranked in terms of R2 and NRMSE values.

On Conditioning GANs to Hierarchical Ontologies

The recent success of Generative Adversarial Networks (GAN) is a result of their ability to generate high quality images from a latent vector space. An important application is the generation of images from a text description, where the text description is encoded and further used in the conditioning of the generated image. Thus the generative network has to additionally learn a mapping from the text latent vector space to a highly complex and multi-modal image data distribution, which makes the training of such models challenging. To handle the complexities of fashion image and meta data, we propose Ontology Generative Adversarial Networks (O-GANs) for fashion image synthesis that is conditioned on an hierarchical fashion ontology in order to improve the image generation fidelity. We show that the incorporation of the ontology leads to better image quality as measured by Fr\'{e}chet Inception Distance and Inception Score. Additionally, we show that the O-GAN achieves better conditioning results evaluated by implicit similarity between the text and the generated image.

Recent Advances in Diversified Recommendation

With the rapid development of recommender systems, accuracy is no longer the only golden criterion for evaluating whether the recommendation results are satisfying or not. In recent years, diversity has gained tremendous attention in recommender systems research, which has been recognized to be an important factor for improving user satisfaction. On the one hand, diversified recommendation helps increase the chance of answering ephemeral user needs. On the other hand, diversifying recommendation results can help the business improve product visibility and explore potential user interests. In this paper, we are going to review the recent advances in diversified recommendation. Specifically, we first review the various definitions of diversity and generate a taxonomy to shed light on how diversity have been modeled or measured in recommender systems. After that, we summarize the major optimization approaches to diversified recommendation from a taxonomic view. Last but not the least, we project into the future and point out trending research directions on this topic.

An Information Theoretic Interpretation to Deep Neural Networks

It is commonly believed that the hidden layers of deep neural networks (DNNs) attempt to extract informative features for learning tasks. In this paper, we formalize this intuition by showing that the features extracted by DNN coincide with the result of an optimization problem, which we call the `universal feature selection’ problem, in a local analysis regime. We interpret the weights training in DNN as the projection of feature functions between feature spaces, specified by the network structure. Our formulation has direct operational meaning in terms of the performance for inference tasks, and gives interpretations to the internal computation results of DNNs. Results of numerical experiments are provided to support the analysis.

MAIA: A Microservices-based Architecture for Industrial Data Analytics

In recent decades, it has become a significant tendency for industrial manufacturers to adopt decentralization as a new manufacturing paradigm. This enables more efficient operations and facilitates the shift from mass to customized production. At the same time, advances in data analytics give more insights into the production lines, thus improving its overall productivity. The primary objective of this paper is to apply a decentralized architecture to address new challenges in industrial analytics. The main contributions of this work are therefore two-fold: (1) an assessment of the microservices’ feasibility in industrial environments, and (2) a microservices-based architecture for industrial data analytics. Also, a prototype has been developed, analyzed, and evaluated, to provide further practical insights. Initial evaluation results of this prototype underpin the adoption of microservices in industrial analytics with less than 20ms end-to-end processing latency for predicting movement paths for 100 autonomous robots on a commodity hardware server. However, it also identifies several drawbacks of the approach, which is, among others, the complexity in structure, leading to higher resource consumption.

Machine Learning based English Sentiment Analysis

Sentiment analysis or opinion mining aims to determine attitudes, judgments and opinions of customers for a product or a service. This is a great system to help manufacturers or servicers know the satisfaction level of customers about their products or services. From that, they can have appropriate adjustments. We use a popular machine learning method, being Support Vector Machine, combine with the library in Waikato Environment for Knowledge Analysis (WEKA) to build Java web program which analyzes the sentiment of English comments belongs one in four types of woman products. That are dresses, handbags, shoes and rings. We have developed and test our system with a training set having 300 comments and a test set having 400 comments. The experimental results of the system about precision, recall and F measures for positive comments are 89.3%, 95.0% and 92,.1%; for negative comments are 97.1%, 78.5% and 86.8%; and for neutral comments are 76.7%, 86.2% and 81.2%.