Computing the Unique Information

Given a set of predictor variables and a response variable, how much information do the predictors have about the response, and how is this information distributed between unique, complementary, and shared components? Recent work has proposed to quantify the unique component of the decomposition as the minimum value of the conditional mutual information over a constrained set of information channels. We present an efficient iterative divergence minimization algorithm to solve this optimization problem with convergence guarantees, and we evaluate its performance against other techniques.
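
The quantity being minimized can be illustrated on a small discrete example. The following is a minimal sketch, assuming a joint distribution stored as a Python dictionary; it only evaluates the conditional mutual information I(Y; X1 | X2) at one fixed distribution and does not implement the constrained minimization or the authors' iterative algorithm. The function name and the two toy distributions are illustrative choices, not taken from the paper.

```python
from collections import defaultdict
from math import log2


def conditional_mutual_information(joint):
    """Return I(Y; X1 | X2) in bits for a joint pmf given as {(y, x1, x2): prob}."""
    p_x2 = defaultdict(float)
    p_y_x2 = defaultdict(float)
    p_x1_x2 = defaultdict(float)
    # Accumulate the marginals needed by the definition.
    for (y, x1, x2), p in joint.items():
        p_x2[x2] += p
        p_y_x2[(y, x2)] += p
        p_x1_x2[(x1, x2)] += p

    total = 0.0
    for (y, x1, x2), p in joint.items():
        if p > 0.0:
            # p(y, x1 | x2) / (p(y | x2) * p(x1 | x2)), written with joint marginals.
            ratio = p * p_x2[x2] / (p_y_x2[(y, x2)] * p_x1_x2[(x1, x2)])
            total += p * log2(ratio)
    return total


# Toy case 1 (assumed): Y = X1 XOR X2 with independent uniform bits.
xor_joint = {(x1 ^ x2, x1, x2): 0.25 for x1 in (0, 1) for x2 in (0, 1)}

# Toy case 2 (assumed): Y is a copy of X2, and X1 is an independent uniform bit.
copy_x2_joint = {(x2, x1, x2): 0.25 for x1 in (0, 1) for x2 in (0, 1)}

cmi_xor = conditional_mutual_information(xor_joint)
cmi_copy = conditional_mutual_information(copy_x2_joint)
```

In the XOR case, knowing X2 leaves one full bit about Y that only X1 can supply, while in the copy case X1 adds nothing once X2 is known.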

Big Data Systems Meet Machine Learning Challenges: Towards Big Data Science as a Service

Recently, we have been witnessing huge advancements in the scale of data we routinely generate and collect in pretty much everything we do, as well as in our ability to exploit modern technologies to process, analyze and understand this data. The intersection of these trends is what is nowadays called Big Data Science. Cloud computing represents a practical and cost-effective solution for supporting Big Data storage, processing and sophisticated analytics applications. We analyze in detail the building blocks of the software stack for supporting big data science as a commodity service for data scientists. We provide various insights about the latest ongoing developments and open challenges in this domain.

Hamilton-Jacobi Reachability: A Brief Overview and Recent Advances

Hamilton-Jacobi (HJ) reachability analysis is an important formal verification method for guaranteeing performance and safety properties of dynamical systems; it has been applied to many small-scale systems in the past decade. Its advantages include compatibility with general nonlinear system dynamics, formal treatment of bounded disturbances, and the availability of well-developed numerical tools. The main challenge is addressing its exponential computational complexity with respect to the number of state variables. In this tutorial, we present an overview of basic HJ reachability theory and provide instructions for using the most recent numerical tools, including an efficient GPU-parallelized implementation of a Level Set Toolbox for computing reachable sets. In addition, we review some of the current work in high-dimensional HJ reachability to show how the dimensionality challenge can be alleviated via various general theoretical and application-specific insights.

MRNet-Product2Vec: A Multi-task Recurrent Neural Network for Product Embeddings

E-commerce websites such as Amazon, Alibaba, Flipkart, and Walmart sell billions of products. Machine learning (ML) algorithms involving products are often used to improve the customer experience and increase revenue, e.g., product similarity, recommendation, and price estimation. The products are required to be represented as features before training an ML algorithm. In this paper, we propose an approach called MRNet-Product2Vec for creating generic embeddings of products within an e-commerce ecosystem. We learn a dense and low-dimensional embedding where a diverse set of signals related to a product are explicitly injected into its representation. We train a Discriminative Multi-task Bidirectional Recurrent Neural Network (RNN), where the input is a product title fed through a Bidirectional RNN and at the output, product labels corresponding to fifteen different tasks are predicted. The task set includes several intrinsic characteristics about a product such as price, weight, size, color, popularity, and material. We evaluate the proposed embedding quantitatively and qualitatively. We demonstrate that the embeddings are almost as good as a sparse and extremely high-dimensional TF-IDF representation in spite of having less than 3% of the TF-IDF dimension. We also use a multimodal autoencoder for comparing products from different language-regions and show preliminary yet promising qualitative results.

AutoCon: Regression Testing for Detecting Cache Contention Anomalies Using Autoencoder

Cache contention is an important type of performance anomaly in this multi-core and many-core era. It can cause a significant slowdown in parallel programs. However, it is hard to detect and often not visible in the source code. As software changes over time, modifications in code can introduce cache contention anomalies. One way to detect such anomalies is to use performance regression testing. Prior approaches for cache contention detection are either not suitable for performance regression testing or require knowledge about a specific type of contention behavior. To remedy these shortcomings, we propose AutoCon. It works by finding the modified functions and collecting hardware performance counter profiles for them. It uses an unsupervised learning technique, called Autoencoder, to learn the contention behavior implied by the profiles (collected from the older version of code). Then, it checks the profiles collected from the newer version of code to determine whether the contention pattern (implied by the profiles) is anomalous. If so, AutoCon reports a cache contention anomaly. Finally, it performs root cause analysis to provide detailed debugging information. AutoCon is the first learning-based cache contention detector that does not require any positive example of contention anomalies. We evaluated AutoCon with 13 real-world cache contention anomalies as well as 7 open source programs. AutoCon detected all types of cache contention anomalies with only 3.7% profiling overhead (on average). Moreover, compared to a state-of-the-art cache contention detector, AutoCon detected more anomalies.

Heteroscedastic BART Using Multiplicative Regression Trees

Bayesian additive regression trees (BART) has become increasingly popular as a flexible and scalable non-parametric model useful in many modern applied statistics regression problems. It brings many advantages to the practitioner dealing with large and complex non-linear response surfaces, such as a matrix-free formulation and the lack of a requirement to specify a regression basis a priori. However, while flexible in fitting the mean, the basic BART model relies on the standard i.i.d. normal model for the errors. This assumption is unrealistic in many applications. Moreover, in many applied problems understanding the relationship between the variance and predictors can be just as important as that of the mean model. We develop a novel heteroscedastic BART model to alleviate these concerns. Our approach is entirely non-parametric and does not rely on an a priori basis for the variance model. In BART, the conditional mean is modeled as a sum of trees, each of which determines a contribution to the overall mean. In this paper, we model the conditional variance with a product of trees, each of which determines a contribution to the overall variance. We implement the approach and demonstrate it on a simple low-dimensional simulated dataset, a higher-dimensional dataset of used car prices, a fisheries dataset and data from an alcohol consumption study.
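
The sum-of-trees mean and product-of-trees variance can be pictured with a toy example. This is a minimal sketch, assuming hand-written depth-one trees (stumps) with fixed leaf values; it shows only how the two ensembles combine and does not perform any Bayesian fitting. The helper names and all numeric values are hypothetical.

```python
import math


def stump(feature, threshold, left_value, right_value):
    """Return a depth-one tree: left_value if x[feature] < threshold, else right_value."""
    def tree(x):
        return left_value if x[feature] < threshold else right_value
    return tree


def conditional_mean(x, mean_trees):
    # Each tree adds its leaf value to the overall mean.
    return sum(tree(x) for tree in mean_trees)


def conditional_variance(x, variance_trees):
    # Each tree multiplies the overall variance by its (positive) leaf value.
    return math.prod(tree(x) for tree in variance_trees)


# Hypothetical ensembles; variance leaves are kept strictly positive.
mean_trees = [stump(0, 0.5, -1.0, 1.0), stump(1, 0.0, 0.25, 0.75)]
variance_trees = [stump(0, 0.5, 0.5, 2.0), stump(1, 0.0, 1.0, 4.0)]

x = (0.8, -0.3)
mean_at_x = conditional_mean(x, mean_trees)
variance_at_x = conditional_variance(x, variance_trees)
```

A multiplicative ensemble keeps the variance positive as long as every leaf value is positive, which is one reason a product is a natural counterpart to the additive mean.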

FogStore: Toward a Distributed Data Store for Fog Computing

Stateful applications and virtualized network functions (VNFs) can benefit from state externalization to increase their reliability, scalability, and inter-operability. To keep and share the externalized state, distributed data stores (DDSs) are a powerful tool allowing for the management of classical trade-offs in consistency, availability and partition tolerance. With the advent of Fog and Edge Computing, stateful applications and VNFs are pushed from the data centers toward the network edge. This poses new challenges for DDSs that are tailored to a deployment in Cloud data centers. In this paper, we propose two novel design goals for DDSs that are tailored to Fog Computing: (1) Fog-aware replica placement, and (2) context-sensitive differential consistency. To realize those design goals on top of existing DDSs, we propose the FogStore system. FogStore manages the needed adaptations in replica placement and consistency management transparently, so that existing DDSs can be plugged into the system. To show the benefits of FogStore, we perform a set of evaluations using the Yahoo Cloud Serving Benchmark.

Smart Mirror: Intelligent Makeup Recommendation and Synthesis

Beautification of female facial images usually requires professional editing software, which is relatively difficult for ordinary users to operate. In this demo, we introduce a practical system for automatic and personalized facial makeup recommendation and synthesis. First, a model describing the relations among facial features, facial attributes and makeup attributes is learned as the makeup recommendation model for suggesting the most suitable makeup attributes. Then the recommended makeup attributes are seamlessly synthesized onto the input facial image.

EB-GLS: An Improved Guided Local Search Based on the Big Valley Structure

Local search is a basic building block in memetic algorithms. Guided Local Search (GLS) can improve the efficiency of local search. By changing the guide function, GLS guides a local search to escape from locally optimal solutions and find better solutions. The key component of GLS is its penalizing mechanism, which determines which feature is selected to penalize when the search is trapped in a locally optimal solution. The original GLS penalizing mechanism only makes use of the cost and the current penalty value of each feature. It is well known that many combinatorial optimization problems have a big valley structure, i.e., the better a solution is, the more likely it is to be close to a globally optimal solution. This paper proposes to use the big valley structure assumption to improve the GLS penalizing mechanism. An improved GLS algorithm called Elite Biased GLS (EB-GLS) is proposed. EB-GLS records and maintains an elite solution as an estimate of the globally optimal solutions, and reduces the chance of penalizing the features in this solution. We have systematically tested the proposed algorithm on the symmetric traveling salesman problem. Experimental results show that EB-GLS is significantly better than GLS.
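
The penalizing step can be sketched as follows. The utility cost / (1 + penalty) is the commonly cited GLS rule and matches the abstract's remark that only cost and current penalty are used. The elite bias shown here, a multiplicative `elite_discount` applied to features of the elite solution, is a hypothetical stand-in chosen for illustration and is not the rule from the paper. Feature names and numbers are invented.

```python
def select_features_to_penalize(features, cost, penalty,
                                elite_features=frozenset(), elite_discount=1.0):
    """Return the features of a local optimum with maximal penalizing utility.

    features: features present in the current locally optimal solution.
    cost, penalty: dicts mapping each feature to its cost and current penalty.
    elite_features, elite_discount: hypothetical elite bias; a discount below 1
    makes features of the elite solution less likely to be penalized.
    """
    def utility(feature):
        value = cost[feature] / (1.0 + penalty[feature])
        if feature in elite_features:
            value *= elite_discount
        return value

    utilities = {feature: utility(feature) for feature in features}
    best = max(utilities.values())
    return [feature for feature in features if utilities[feature] == best]


# Invented example: tour edges as features of a TSP local optimum.
features = ["ab", "bc", "ca"]
cost = {"ab": 10.0, "bc": 6.0, "ca": 4.0}
penalty = {"ab": 1, "bc": 0, "ca": 0}

plain_choice = select_features_to_penalize(features, cost, penalty)
biased_choice = select_features_to_penalize(
    features, cost, penalty, elite_features={"bc"}, elite_discount=0.5)
```

Without the bias, edge "bc" has the highest utility (6 versus 5 and 4); once "bc" is treated as part of the elite solution and discounted, "ab" is selected instead.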

Abandon Statistical Significance

In science publishing and many areas of research, the status quo is a lexicographic decision rule in which any result is first required to have a p-value that surpasses the 0.05 threshold and only then is consideration–often scant–given to such factors as prior and related evidence, plausibility of mechanism, study design and data quality, real world costs and benefits, novelty of finding, and other factors that vary by research domain. There have been recent proposals to change the p-value threshold, but instead we recommend abandoning the null hypothesis significance testing paradigm entirely, leaving p-values as just one of many pieces of information with no privileged role in scientific publication and decision making. We argue that this radical approach is both practical and sensible.

The Long Term Fréchet distribution: Estimation, Properties and its Application

In this paper a new long-term survival distribution is proposed. The so-called long term Fréchet distribution allows us to fit data where a part of the population is not susceptible to the event of interest. This model may be used, for example, in clinical studies where a portion of the population can be cured during a treatment. We give an account of the mathematical properties of the new distribution, such as its moments and survival properties. The maximum likelihood estimators (MLEs) for the parameters are also presented. A numerical simulation is carried out in order to verify the performance of the MLEs. Finally, an important application related to the leukemia-free survival times for transplant patients is discussed to illustrate our proposed distribution.
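
A long-term model of this kind is often written as a mixture in which a fraction of the population never experiences the event. The sketch below assumes that standard mixture form together with the usual two-parameter Fréchet distribution; the paper's own parameterization may differ, and the function and argument names are illustrative.

```python
import math


def frechet_survival(t, shape, scale):
    """Survival function of a Frechet distribution: 1 - exp(-(t / scale) ** -shape)."""
    if t <= 0.0:
        return 1.0
    return 1.0 - math.exp(-((t / scale) ** (-shape)))


def long_term_survival(t, cured_fraction, shape, scale):
    """Assumed mixture form: the cured fraction survives forever, the rest follow Frechet."""
    return cured_fraction + (1.0 - cured_fraction) * frechet_survival(t, shape, scale)


# Illustrative parameter values only.
early = long_term_survival(0.0, cured_fraction=0.3, shape=2.0, scale=1.0)
middle = long_term_survival(1.0, cured_fraction=0.3, shape=2.0, scale=1.0)
late = long_term_survival(1e6, cured_fraction=0.3, shape=2.0, scale=1.0)
```

Under this assumption the population survival curve starts at 1 and levels off at the cured fraction rather than at 0, which is the feature that lets the model fit data with non-susceptible individuals.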

A Comparative Study of Association Rule Mining Algorithms on Grid and Cloud Platform

Association rule mining is a time-consuming process because it is both data intensive and computation intensive. In order to mine large volumes of data and to enhance the scalability and performance of existing sequential association rule mining algorithms, parallel and distributed algorithms have been developed. These traditional parallel and distributed algorithms are based on homogeneous platforms and are not well suited to heterogeneous platforms such as grid and cloud. This requires the design of new algorithms that address the issues of good data set partitioning and distribution, load balancing strategy, and optimization of communication and synchronization techniques among processors in such heterogeneous systems. Grid and cloud are emerging platforms for distributed data processing, and various association rule mining algorithms have been proposed on such platforms. This survey article brings together a brief architectural overview of distributed systems and various recent grid-based and cloud-based association rule mining algorithms, with a comparative perspective. We differentiate between association rule mining algorithms developed on these architectures on the basis of data locality, programming paradigm, fault tolerance, communication cost, and partitioning and distribution of data sets. Although the survey does not cover all algorithms, it can be very useful for new researchers working on distributed association rule mining algorithms.

A Comprehensive Survey of Graph Embedding: Problems, Techniques and Applications

Graphs are an important data representation that appears in a wide variety of real-world scenarios. Effective graph analytics provides users with a deeper understanding of what is behind the data, and thus can benefit many useful applications such as node classification, node recommendation, link prediction, etc. However, most graph analytics methods suffer from high computation and space costs. Graph embedding is an effective yet efficient way to solve the graph analytics problem. It converts the graph data into a low dimensional space in which the graph structural information and graph properties are maximally preserved. In this survey, we conduct a comprehensive review of the literature in graph embedding. We first introduce the formal definition of graph embedding as well as the related concepts. After that, we propose two taxonomies of graph embedding which correspond to what challenges exist in different graph embedding problem settings and how the existing work addresses these challenges in its solutions. Finally, we summarize the applications that graph embedding enables and suggest four promising future research directions in terms of computation efficiency, problem settings, techniques and application scenarios.

Predicting Runtime Distributions using Deep Neural Networks

Many state-of-the-art algorithms for solving hard combinatorial problems include elements of stochasticity that lead to high variations in runtime, even for a fixed problem instance, across runs with different pseudo-random number seeds. Knowledge about the runtime distributions (RTDs) of algorithms on given problem instances can be exploited in various meta-algorithmic procedures, such as algorithm selection, portfolios, and randomized restarts. Previous work has shown that machine learning can be used to individually predict mean, median and variance of RTDs. To establish a new state-of-the-art in predicting RTDs, we demonstrate that the parameters of an RTD should be learned jointly and that neural networks can do this well by directly optimizing the likelihood of an RTD given runtime observations. In an empirical study involving four algorithms for SAT solving and AI planning, we show that our neural networks predict the true RTDs of unseen instances better than previous methods. As an exemplary application of RTD predictions, we show that our RTD models also yield good predictions of running these algorithms in parallel.
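
Learning the parameters of an RTD jointly by likelihood can be illustrated without a neural network. The sketch below assumes a log-normal runtime distribution, chosen here only for illustration and not necessarily a family used in the paper. It defines the negative log-likelihood that a model predicting (mu, sigma) would be trained to minimize, and compares it with the closed-form maximum likelihood fit. The runtimes are made-up values.

```python
import math


def lognormal_nll(runtimes, mu, sigma):
    """Average negative log-likelihood of positive runtimes under LogNormal(mu, sigma)."""
    total = 0.0
    for t in runtimes:
        z = (math.log(t) - mu) / sigma
        total += math.log(t) + math.log(sigma) + 0.5 * math.log(2.0 * math.pi) + 0.5 * z * z
    return total / len(runtimes)


def lognormal_mle(runtimes):
    """Closed-form joint maximum likelihood estimate of (mu, sigma)."""
    logs = [math.log(t) for t in runtimes]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma


# Made-up runtimes (seconds) of one algorithm on one instance across seeds.
runtimes = [0.8, 1.5, 2.1, 3.9, 7.4]

mu_hat, sigma_hat = lognormal_mle(runtimes)
nll_at_mle = lognormal_nll(runtimes, mu_hat, sigma_hat)
nll_shifted_mu = lognormal_nll(runtimes, mu_hat + 0.3, sigma_hat)
nll_scaled_sigma = lognormal_nll(runtimes, mu_hat, 1.5 * sigma_hat)
```

In a learned setting, mu and sigma would be outputs of a network conditioned on instance features, and the same loss would be averaged over observed runtimes from many instances.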

EraseReLU: A Simple Way to Ease the Training of Deep Convolution Neural Networks

For most state-of-the-art architectures, the Rectified Linear Unit (ReLU) has become a standard component accompanying each layer. Although ReLU can ease network training to an extent, its blocking of negative values may suppress the propagation of useful information and make very deep Convolutional Neural Networks (CNNs) difficult to optimize. Moreover, stacked layers with nonlinear activations struggle to approximate the intrinsic linear transformations between feature representations. In this paper, we investigate the effect of erasing the ReLUs of certain layers and apply it to various representative architectures. We name our approach ‘EraseReLU’. It can ease the optimization and improve the generalization performance for very deep CNN models. In experiments, this method successfully improves the performance of various representative architectures, and we report the improved results on SVHN, CIFAR-10/100, and ImageNet-1k. By using EraseReLU, we achieve state-of-the-art single-model performance on CIFAR-100 with 83.47% accuracy. Code will be released soon.

SwGridNet: A Deep Convolutional Neural Network based on Grid Topology for Image Classification

Deep convolutional neural networks (CNNs) achieve remarkable performance in image classification tasks. Recent studies, however, report that generalization ability is more important than the depth of neural networks for improving the accuracy rate in image classification tasks. In this paper, I propose a new neural network called SwGridNet. SwGridNets contain many convolutional processing units which connect with each other as a grid network where there are many processing paths between input and output. SwGridNets have a high generalization ability because the multi-path architecture has the same effect as ensemble learning. In this paper, I describe the details of the network architecture of SwGridNets. I also show experimental results which indicate that the performance of SwGridNets is close to that of state-of-the-art deep CNNs.

Annotation based automatic action processing

With a strong motivational background in search engine optimization, the amount of structured data on the web is growing rapidly. The main search engine providers are promising a great increase in visibility through annotation of the web page’s content with the vocabulary of and thus providing it as structured data. But besides the usage by search engines, the data can be used in various other ways, for example for automatic processing of annotated web services or actions. In this work we present an approach to consume and process annotated data on the web and give an idea of what a best practice could look like.

A Petri Nets Model for Blockchain Analysis

A Blockchain is a global shared infrastructure where cryptocurrency transactions among addresses are recorded, validated and made publicly available in a peer-to-peer network. To date the best known and most important cryptocurrency is Bitcoin. In this paper we focus on this cryptocurrency and in particular on the modeling of the Bitcoin Blockchain by using the Petri Nets formalism. The proposed model allows us to quickly collect information about identities owning Bitcoin addresses and to recover measures and statistics on the Bitcoin network. By exploiting algebraic formalism, we reconstructed an entities network associated with Blockchain transactions, grouping Bitcoin addresses into the single entity that holds the permissions to manage the Bitcoins held by those addresses. The model also allows us to identify a set of behaviours typical of Bitcoin owners, like that of using an address only once, and to reconstruct chains for this behaviour together with the rate of firing. Our model is highly flexible and can easily be adapted to include different features of the Bitcoin cryptocurrency system.

Intrinsic Metrics: Nearest Neighbor and Edge Squared Distances

Some researchers have proposed using non-Euclidean metrics for clustering data points. Generally, the metric should recognize that two points in the same cluster are close, even if their Euclidean distance is large. Multiple proposals have been suggested, including the Edge-Squared Metric (a specific example of a graph geodesic) and the Nearest Neighbor Metric. In this paper, we prove that the edge-squared and nearest-neighbor metrics are in fact equivalent. The previous best result showed that the edge-squared metric was a 3-approximation of the Nearest Neighbor metric. This paper represents one of the first proofs equating a continuous metric with a discrete metric, using non-trivial discrete methods. Our proof uses the Kirszbraun theorem (also known as the Lipschitz Extension Theorem and Brehm’s Extension Theorem), a notable theorem in functional analysis and computational geometry. The results of our paper, combined with the results of Hwang, Damelin, and Hero, tell us that the Nearest Neighbor distance on i.i.d. samples of a density is a reasonable constant approximation of a natural density-based distance function.
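
The edge-squared metric can be computed directly on a small point set. The sketch below assumes the usual definition: the shortest-path distance in the complete graph on the points, where each edge is weighted by the squared Euclidean distance between its endpoints. It uses Floyd-Warshall for brevity and is not the construction from the paper's proof; the function name and sample points are illustrative.

```python
from itertools import product


def edge_squared_distances(points):
    """All-pairs edge-squared distances for a list of coordinate tuples."""
    n = len(points)
    # Start from direct edges weighted by squared Euclidean length.
    dist = [[sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
             for j in range(n)] for i in range(n)]
    # Floyd-Warshall: product varies its last index fastest, so k is the outer loop.
    for k, i, j in product(range(n), repeat=3):
        through_k = dist[i][k] + dist[k][j]
        if through_k < dist[i][j]:
            dist[i][j] = through_k
    return dist


# Three collinear points: hopping through the middle point is cheaper than the direct edge.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
distances = edge_squared_distances(points)
```

For the two end points, the direct edge costs 4 while the two-hop path costs 1 + 1 = 2, so dense chains of samples pull points in the same cluster close together.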

A coordinate-free theory of virtual holonomic constraints
The 6-girth-thickness of the complete graph
Learning Domain-Specific Word Embeddings from Sparse Cybersecurity Texts
Stability of Spatial Smoothness and Cluster-Size Threshold Estimates in FMRI using AFNI
Complexity of Scheduling Charging in the Smart Grid
WERd: Using Social Text Spelling Variants for Evaluating Dialectal Speech Recognition
The Covering Path Problem on a Grid
Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image
Decision making and uncertainty quantification for individualized treatments
A Multimodal, Full-Surround Vehicular Testbed for Naturalistic Studies and Benchmarking: Design, Calibration and Deployment
Hopf monoids and generalized permutahedra
An Empirical Dynamic Programming Algorithm for Continuous MDPs
Maximum oriented forcing number for complete graphs
Robust Optimization of Unconstrained Binary Quadratic Problems
Critical random forests
Achieving Parsimony in Bayesian VARs with the Horseshoe Prior
Defining a Lingua Franca to Open the Black Box of a Naïve Bayes Recommender
Topics in loop measures and the loop-erased walk
On an early paper of Maryam Mirzakhani
Attention-based Mixture Density Recurrent Networks for History-based Recommendation
On self-dual negacirculant codes of index two and four
On self-dual four circulant codes
Virtual Blood Vessels in Complex Background using Stereo X-ray Images
Recent Advances on Estimating Population Size with Link-Tracing Sampling
A preconditioning approach for improved estimation of sparse polynomial chaos expansions
EmuFog: Extensible and Scalable Emulation of Large-Scale Fog Computing Infrastructures
Novel Evaluation Metrics for Seam Carving based Image Retargeting
A dynamical systems model of unorganised segregation
Yaglom limits can depend on the starting state
Hierarchical Detail Enhancing Mesh-Based Shape Generation with 3D Generative Adversarial Network
Happy Travelers Take Big Pictures: A Psychological Study with Machine Learning and Big Data
On the Martingale Problem and Feller and Strong Feller Properties for Weakly Coupled Lévy Type Operators
Almost Difference Sets in Nonabelian Groups
UAV-Enabled Wireless Power Transfer: Trajectory Design and Energy Optimization
Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks
Inverse Reinforcement Learning with Conditional Choice Probabilities
Demography-based Facial Retouching Detection using Subclass Supervised Sparse Autoencoder
High-Resolution Shape Completion Using Deep Neural Networks for Global Structure and Local Geometry Inference
Stochastic Input Models in Online Computing
mts: a light framework for parallelizing tree search codes
Efficient Nearest-Neighbor Search for Dynamical Systems with Nonholonomic Constraints
Generalized Bayesian Updating and the Loss-Likelihood Bootstrap
Subdiffusivity of Brownian motion among a Poissonian field of moving traps
On global universality for zeros of random polynomials
Kidnapping Model: An Extension of Selten’s Game
Total stability of kernel methods
BreathRNNet: Breathing Based Authentication on Resource-Constrained IoT Devices using RNNs
Zhang $L^2$-Regularity for the solutions of Backward Doubly Stochastic Differential Equations under globally Lipschitz continuous assumptions
Navigating Between Packings of Graphic Sequences
Tolerances, robustness and parametrization of matrix properties related to optimization problems
On the genealogy and coalescence times of Bienaymé-Galton-Watson branching processes
On a coalescence process and its branching genealogy
Quantified Derandomization of Linear Threshold Circuits
Hierarchical Kriging for multi-fidelity aero-servo-elastic simulators – Application to extreme loads on wind turbines
Approximate Bayesian Inference in Linear State Space Models for Intermittent Demand Forecasting at Scale
Code Attention: Translating Code to Comments by Exploiting Domain Features
OptLayer – Practical Constrained Optimization for Deep Reinforcement Learning in the Real World
Infinite variance $H$-sssi processes as limits of particle systems
Potentials and Implications of Dedicated Highway Lanes for Autonomous Vehicles
Estimating the maximum possible earthquake magnitude using extreme value methodology: the Groningen case
Random flights connecting Porous Medium and Euler-Poisson-Darboux equations
Semantic Segmentation from Limited Training Data
STAR: Spatio-Temporal Altimeter Waveform Retracking using Sparse Representation and Conditional Random Fields
Towards Decentralised Resilient Community Cloud Infrastructures
Real-time 3D Shape Instantiation from Single Fluoroscopy Projection for Fenestrated Stent Graft Deployment
Experimenting with the p4est library for AMR simulations of two-phase flows
Barker’s algorithm for Bayesian inference with intractable likelihoods
Polynomial Cases for the Vertex Coloring Problem
Testing covariate significance in spatial point process first-order intensity
Can We Boost the Power of the Viola-Jones Face Detector Using Pre-processing? An Empirical Study
Multigroup Multicast Precoding in Massive MIMO
Decentralized Robust Control of Coupled Multi-Agent Systems under Local Signal Temporal Logic Tasks
Estimate Exchange over Network is Good for Distributed Hard Thresholding Pursuit
The Martin boundary of a free product of abelian groups
Single-pixel imaging with Morlet wavelet correlated random patterns
Subjective Simulation as a Notion of Morphism for Composing Concurrent Resources
Bernstein – von Mises theorems for statistical inverse problems II: Compound Poisson processes
Improving Language Modelling with Noise-contrastive estimation
On the Existence and Structure of Mixed Nash Equilibria for In-Band Full-Duplex Wireless Networks
Probabilistic Synchronous Parallel
Computation Error Analysis of Block Floating Point Arithmetic Oriented Convolution Neural Network Accelerator Design
Sentence Correction Based on Large-scale Language Modelling
On predictive density estimation with additional information
The GENIUS Approach to Robust Mendelian Randomization Inference
OpenCL Actors – Adding Data Parallelism to Actor-based Programming with CAF
Mining User Queries with Information Extraction Methods and Linked Data
Humanoid Robots as Agents of Human Consciousness Expansion
Tropical Land Use Land Cover Mapping in Pará (Brazil) using Discriminative Markov Random Fields and Multi-temporal TerraSAR-X Data
On overfitting and asymptotic bias in batch reinforcement learning with partial observability
A multivariate zero-inflated logistic model for microbiome relative abundance data
Compact localized states and flat bands from local symmetry partitioning
The structure of information: from probability to homology
Quantum Memristors in Quantum Photonics
Neural Machine Translation
Attention-based Wav2Text with Feature Transfer Learning
Roaring Bitmaps: Implementation of an Optimized Software Library
Planar Graph Perfect Matching is in NC
Dual polar graphs, a nil-DAHA of rank one, and non-symmetric dual q-Krawtchouk polynomials
Challenging Neural Dialogue Models with Natural Data: Memory Networks Fail on Incremental Phenomena
Bayesian Optimization for Parameter Tuning of the XOR Neural Network
Generalized Quantum Reinforcement Learning with Quantum Technologies
Pemantle’s min-plus binary tree
Universal points in the asymptotic spectrum of tensors
Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping
Bootstrapping incremental dialogue systems from minimal data: the generalisation power of dialogue grammars
VLSI Designs for Joint Channel Estimation and Data Detection in Large SIMO Wireless Systems
Mitigating the Impact of Speech Recognition Errors on Chatbot using Sequence-to-sequence Model
Planar Perfect Matching is in NC
Critical behavior of the 2D Ising model modulated by the Octonacci sequence
FiLM: Visual Reasoning with a General Conditioning Layer