• A coordinate-free theory of virtual holonomic constraints
• The 6-girth-thickness of the complete graph
• Learning Domain-Specific Word Embeddings from Sparse Cybersecurity Texts
• Stability of Spatial Smoothness and Cluster-Size Threshold Estimates in FMRI using AFNI
• Complexity of Scheduling Charging in the Smart Grid
• WERd: Using Social Text Spelling Variants for Evaluating Dialectal Speech Recognition
• The Covering Path Problem on a Grid
• Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image
• Decision making and uncertainty quantification for individualized treatments
• A Multimodal, Full-Surround Vehicular Testbed for Naturalistic Studies and Benchmarking: Design, Calibration and Deployment
• Hopf monoids and generalized permutahedra
• An Empirical Dynamic Programming Algorithm for Continuous MDPs
• Maximum oriented forcing number for complete graphs
• Robust Optimization of Unconstrained Binary Quadratic Problems
• Critical random forests
• Achieving Parsimony in Bayesian VARs with the Horseshoe Prior
• Defining a Lingua Franca to Open the Black Box of a Naïve Bayes Recommender
• Topics in loop measures and the loop-erased walk
• On an early paper of Maryam Mirzakhani
• Attention-based Mixture Density Recurrent Networks for History-based Recommendation
• On self-dual negacirculant codes of index two and four
• On self-dual four circulant codes
• Virtual Blood Vessels in Complex Background using Stereo X-ray Images
• Recent Advances on Estimating Population Size with Link-Tracing Sampling
• A preconditioning approach for improved estimation of sparse polynomial chaos expansions
• EmuFog: Extensible and Scalable Emulation of Large-Scale Fog Computing Infrastructures
• Novel Evaluation Metrics for Seam Carving based Image Retargeting
• A dynamical systems model of unorganised segregation
• Yaglom limits can depend on the starting state
• Hierarchical Detail Enhancing Mesh-Based Shape Generation with 3D Generative Adversarial Network
• Happy Travelers Take Big Pictures: A Psychological Study with Machine Learning and Big Data
• On the Martingale Problem and Feller and Strong Feller Properties for Weakly Coupled Lévy Type Operators
• Almost Difference Sets in Nonabelian Groups
• UAV-Enabled Wireless Power Transfer: Trajectory Design and Energy Optimization
• Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks
• Inverse Reinforcement Learning with Conditional Choice Probabilities
• Demography-based Facial Retouching Detection using Subclass Supervised Sparse Autoencoder
• High-Resolution Shape Completion Using Deep Neural Networks for Global Structure and Local Geometry Inference
• Stochastic Input Models in Online Computing
• mts: a light framework for parallelizing tree search codes
• Efficient Nearest-Neighbor Search for Dynamical Systems with Nonholonomic Constraints
• Generalized Bayesian Updating and the Loss-Likelihood Bootstrap
• Subdiffusivity of Brownian motion among a Poissonian field of moving traps
• On global universality for zeros of random polynomials
• Kidnapping Model: An Extension of Selten’s Game
• Total stability of kernel methods
• BreathRNNet: Breathing Based Authentication on Resource-Constrained IoT Devices using RNNs
• Zhang $L^2$-Regularity for the solutions of Backward Doubly Stochastic Differential Equations under globally Lipschitz continuous assumptions
• Navigating Between Packings of Graphic Sequences
• Tolerances, robustness and parametrization of matrix properties related to optimization problems
• On the genealogy and coalescence times of Bienaymé-Galton-Watson branching processes
• On a coalescence process and its branching genealogy
• Quantified Derandomization of Linear Threshold Circuits
• Hierarchical Kriging for multi-fidelity aero-servo-elastic simulators – Application to extreme loads on wind turbines
• Approximate Bayesian Inference in Linear State Space Models for Intermittent Demand Forecasting at Scale
• Code Attention: Translating Code to Comments by Exploiting Domain Features
• OptLayer – Practical Constrained Optimization for Deep Reinforcement Learning in the Real World
• Infinite variance $H$-sssi processes as limits of particle systems
• Potentials and Implications of Dedicated Highway Lanes for Autonomous Vehicles
• Estimating the maximum possible earthquake magnitude using extreme value methodology: the Groningen case
• Random flights connecting Porous Medium and Euler-Poisson-Darboux equations
• Semantic Segmentation from Limited Training Data
• STAR: Spatio-Temporal Altimeter Waveform Retracking using Sparse Representation and Conditional Random Fields
• Towards Decentralised Resilient Community Cloud Infrastructures
• Real-time 3D Shape Instantiation from Single Fluoroscopy Projection for Fenestrated Stent Graft Deployment
• Experimenting with the p4est library for AMR simulations of two-phase flows
• Barker’s algorithm for Bayesian inference with intractable likelihoods
• Polynomial Cases for the Vertex Coloring Problem
• Testing covariate significance in spatial point process first-order intensity
• Can We Boost the Power of the Viola-Jones Face Detector Using Pre-processing? An Empirical Study
• Multigroup Multicast Precoding in Massive MIMO
• Decentralized Robust Control of Coupled Multi-Agent Systems under Local Signal Temporal Logic Tasks
• Estimate Exchange over Network is Good for Distributed Hard Thresholding Pursuit
• The Martin boundary of a free product of abelian groups
• Single-pixel imaging with Morlet wavelet correlated random patterns
• Subjective Simulation as a Notion of Morphism for Composing Concurrent Resources
• Bernstein – von Mises theorems for statistical inverse problems II: Compound Poisson processes
• Improving Language Modelling with Noise-contrastive estimation
• On the Existence and Structure of Mixed Nash Equilibria for In-Band Full-Duplex Wireless Networks
• Probabilistic Synchronous Parallel
• Computation Error Analysis of Block Floating Point Arithmetic Oriented Convolution Neural Network Accelerator Design
• Sentence Correction Based on Large-scale Language Modelling
• On predictive density estimation with additional information
• The GENIUS Approach to Robust Mendelian Randomization Inference
• OpenCL Actors – Adding Data Parallelism to Actor-based Programming with CAF
• Mining User Queries with Information Extraction Methods and Linked Data
• Humanoid Robots as Agents of Human Consciousness Expansion
• Tropical Land Use Land Cover Mapping in Pará (Brazil) using Discriminative Markov Random Fields and Multi-temporal TerraSAR-X Data
• On overfitting and asymptotic bias in batch reinforcement learning with partial observability
• A multivariate zero-inflated logistic model for microbiome relative abundance data
• Compact localized states and flat bands from local symmetry partitioning
• The structure of information: from probability to homology
• Quantum Memristors in Quantum Photonics
• Neural Machine Translation
• Attention-based Wav2Text with Feature Transfer Learning
• Roaring Bitmaps: Implementation of an Optimized Software Library
• Planar Graph Perfect Matching is in NC
• Dual polar graphs, a nil-DAHA of rank one, and non-symmetric dual q-Krawtchouk polynomials
• Challenging Neural Dialogue Models with Natural Data: Memory Networks Fail on Incremental Phenomena
• Bayesian Optimization for Parameter Tuning of the XOR Neural Network
• Generalized Quantum Reinforcement Learning with Quantum Technologies
• Pemantle’s min-plus binary tree
• Universal points in the asymptotic spectrum of tensors
• Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping
• Bootstrapping incremental dialogue systems from minimal data: the generalisation power of dialogue grammars
• VLSI Designs for Joint Channel Estimation and Data Detection in Large SIMO Wireless Systems
• Mitigating the Impact of Speech Recognition Errors on Chatbot using Sequence-to-sequence Model
• Planar Perfect Matching is in NC
• Critical behavior of the 2D Ising model modulated by the Octonacci sequence
• FiLM: Visual Reasoning with a General Conditioning Layer
Given a set of predictor variables and a response variable, how much information do the predictors have about the response, and how is this information distributed between unique, complementary, and shared components? Recent work has proposed to quantify the unique component of the decomposition as the minimum value of the conditional mutual information over a constrained set of information channels. We present an efficient iterative divergence minimization algorithm to solve this optimization problem with convergence guarantees, and we evaluate its performance against other techniques.
Recently, we have been witnessing huge advancements in the scale of data we routinely generate and collect in pretty much everything we do, as well as in our ability to exploit modern technologies to process, analyze and understand this data. The intersection of these trends is what is nowadays called Big Data Science. Cloud computing represents a practical and cost-effective solution for supporting Big Data storage, processing and sophisticated analytics applications. We analyze in detail the building blocks of the software stack for supporting big data science as a commodity service for data scientists. We provide various insights about the latest ongoing developments and open challenges in this domain.
Hamilton-Jacobi (HJ) reachability analysis is an important formal verification method for guaranteeing performance and safety properties of dynamical systems; it has been applied to many small-scale systems in the past decade. Its advantages include compatibility with general nonlinear system dynamics, formal treatment of bounded disturbances, and the availability of well-developed numerical tools. The main challenge is addressing its exponential computational complexity with respect to the number of state variables. In this tutorial, we present an overview of basic HJ reachability theory and provide instructions for using the most recent numerical tools, including an efficient GPU-parallelized implementation of a Level Set Toolbox for computing reachable sets. In addition, we review some of the current work in high-dimensional HJ reachability to show how the dimensionality challenge can be alleviated via various general theoretical and application-specific insights.
E-commerce websites such as Amazon, Alibaba, Flipkart, and Walmart sell billions of products. Machine learning (ML) algorithms involving products are often used to improve the customer experience and increase revenue, e.g., product similarity, recommendation, and price estimation. Products must be represented as features before an ML algorithm can be trained. In this paper, we propose an approach called MRNet-Product2Vec for creating generic embeddings of products within an e-commerce ecosystem. We learn a dense and low-dimensional embedding where a diverse set of signals related to a product are explicitly injected into its representation. We train a Discriminative Multi-task Bidirectional Recurrent Neural Network (RNN), where the input is a product title fed through a Bidirectional RNN and, at the output, product labels corresponding to fifteen different tasks are predicted. The task set includes several intrinsic characteristics of a product such as price, weight, size, color, popularity, and material. We evaluate the proposed embeddings quantitatively and qualitatively. We demonstrate that they are almost as good as a sparse and extremely high-dimensional TF-IDF representation in spite of having less than 3% of the TF-IDF dimension. We also use a multimodal autoencoder for comparing products from different language-regions and show preliminary yet promising qualitative results.
Cache contention is an important type of performance anomaly in this multi-core and many-core era. It can cause a significant slowdown in parallel programs. However, it is hard to detect and often not visible in the source code. As software changes over time, modifications in code can introduce cache contention anomalies. One way to detect such anomalies is to use performance regression testing. Prior approaches to cache contention detection are either not suitable for performance regression testing or require knowledge about a specific type of contention behavior. To remedy these shortcomings, we propose AutoCon. It works by finding the modified functions and collecting hardware performance counter profiles for them. It uses an unsupervised learning technique, an autoencoder, to learn the contention behavior implied by the profiles (collected from the older version of the code). Then, it checks the profiles collected from the newer version of the code to determine whether the contention pattern (implied by the profiles) is anomalous. If so, AutoCon reports a cache contention anomaly. Finally, it performs root cause analysis to provide detailed debugging information. AutoCon is the first learning-based cache contention detector that does not require any positive example of contention anomalies. We evaluated AutoCon with 13 real-world cache contention anomalies as well as 7 open source programs. AutoCon detected all types of cache contention anomalies with only 3.7% profiling overhead (on average). Moreover, compared to a state-of-the-art cache contention detector, AutoCon detected more anomalies.
Bayesian additive regression trees (BART) has become increasingly popular as a flexible and scalable non-parametric model useful in many modern applied statistics regression problems. It brings many advantages to the practitioner dealing with large and complex non-linear response surfaces, such as a matrix-free formulation and the lack of a requirement to specify a regression basis a priori. However, while flexible in fitting the mean, the basic BART model relies on the standard i.i.d. normal model for the errors. This assumption is unrealistic in many applications. Moreover, in many applied problems understanding the relationship between the variance and predictors can be just as important as that of the mean model. We develop a novel heteroscedastic BART model to alleviate these concerns. Our approach is entirely non-parametric and does not rely on an a priori basis for the variance model. In BART, the conditional mean is modeled as a sum of trees, each of which determines a contribution to the overall mean. In this paper, we model the conditional variance with a product of trees, each of which determines a contribution to the overall variance. We implement the approach and demonstrate it on a simple low-dimensional simulated dataset, a higher-dimensional dataset of used car prices, a fisheries dataset and data from an alcohol consumption study.
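The sum-of-trees and product-of-trees structure described above can be written out as follows. This is only a sketch in notation assumed for illustration: the symbols $g$, $h$, $T_j$, $M_j$, $T'_l$, $M'_l$, $m$, $m'$ and $Z$ do not appear in the abstract.

```latex
\begin{align*}
y      &= f(x) + s(x)\,Z, \qquad Z \sim N(0,1), \\
f(x)   &= \sum_{j=1}^{m} g(x;\, T_j, M_j)
          && \text{(conditional mean: sum of $m$ trees)}, \\
s^2(x) &= \prod_{l=1}^{m'} h(x;\, T'_l, M'_l)
          && \text{(conditional variance: product of $m'$ trees)}.
\end{align*}
```

Under this reading, the basic BART model with i.i.d. normal errors corresponds to the special case in which $s(x)$ is a constant $\sigma$.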
Stateful applications and virtualized network functions (VNFs) can benefit from state externalization to increase their reliability, scalability, and inter-operability. To keep and share the externalized state, distributed data stores (DDSs) are a powerful tool allowing for the management of classical trade-offs in consistency, availability and partitioning tolerance. With the advent of Fog and Edge Computing, stateful applications and VNFs are pushed from the data centers toward the network edge. This poses new challenges on DDSs that are tailored to a deployment in Cloud data centers. In this paper, we propose two novel design goals for DDSs that are tailored to Fog Computing: (1) Fog-aware replica placement, and (2) context-sensitive differential consistency. To realize those design goals on top of existing DDSs, we propose the FogStore system. FogStore manages the needed adaptations in replica placement and consistency management transparently, so that existing DDSs can be plugged into the system. To show the benefits of FogStore, we perform a set of evaluations using the Yahoo Cloud Serving Benchmark.
Female facial image beautification usually requires professional editing software, which is relatively difficult for ordinary users. In this demo, we introduce a practical system for automatic and personalized facial makeup recommendation and synthesis. First, a model describing the relations among facial features, facial attributes and makeup attributes is learned as the makeup recommendation model for suggesting the most suitable makeup attributes. Then the recommended makeup attributes are seamlessly synthesized onto the input facial image.
Local search is a basic building block in memetic algorithms. Guided Local Search (GLS) can improve the efficiency of local search. By changing the guide function, GLS guides a local search to escape from locally optimal solutions and find better solutions. The key component of GLS is its penalizing mechanism, which determines which feature is selected to penalize when the search is trapped in a locally optimal solution. The original GLS penalizing mechanism only makes use of the cost and the current penalty value of each feature. It is well known that many combinatorial optimization problems have a big valley structure, i.e., the better a solution is, the more likely it is to be close to a globally optimal solution. This paper proposes to use the big valley structure assumption to improve the GLS penalizing mechanism. An improved GLS algorithm called Elite Biased GLS (EB-GLS) is proposed. EB-GLS records and maintains an elite solution as an estimate of the globally optimal solutions, and reduces the chance of penalizing the features in this solution. We have systematically tested the proposed algorithm on the symmetric traveling salesman problem. Experimental results show that EB-GLS is significantly better than GLS.
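The penalizing step can be sketched as follows. The utility cost / (1 + penalty) is the rule used in standard GLS. The elite bias shown here (a multiplicative `elite_discount` applied to features of the elite solution) is a hypothetical stand-in, since the abstract does not state how EB-GLS reduces the chance of penalizing those features. All function and parameter names are illustrative.

```python
def select_features_to_penalize(local_opt_features, costs, penalties,
                                elite_features=frozenset(),
                                elite_discount=1.0):
    """Return the features of a local optimum with maximal utility.

    Standard GLS utility: cost / (1 + current penalty).
    ASSUMPTION: features that also occur in the elite solution have their
    utility multiplied by elite_discount (<= 1). This is one simple way to
    make them less likely to be penalized, not the paper's stated rule.
    """
    utilities = {}
    for feature in local_opt_features:
        utility = costs[feature] / (1.0 + penalties.get(feature, 0))
        if feature in elite_features:
            utility *= elite_discount
        utilities[feature] = utility
    best = max(utilities.values())
    return [f for f, u in utilities.items() if u == best]


# Toy example: features are the edges of a 3-city tour.
edge_costs = {("a", "b"): 10, ("b", "c"): 6, ("c", "a"): 10}
edge_penalties = {("a", "b"): 1}  # edge a-b was penalized once already

plain = select_features_to_penalize(edge_costs.keys(), edge_costs,
                                    edge_penalties)
biased = select_features_to_penalize(edge_costs.keys(), edge_costs,
                                     edge_penalties,
                                     elite_features={("c", "a")},
                                     elite_discount=0.5)
```

With `elite_discount=1.0` the function reduces to the original GLS selection, so the two variants can be compared on the same local optimum.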
In science publishing and many areas of research, the status quo is a lexicographic decision rule in which any result is first required to have a p-value that surpasses the 0.05 threshold and only then is consideration (often scant) given to such factors as prior and related evidence, plausibility of mechanism, study design and data quality, real world costs and benefits, novelty of finding, and other factors that vary by research domain. There have been recent proposals to change the p-value threshold, but instead we recommend abandoning the null hypothesis significance testing paradigm entirely, leaving p-values as just one of many pieces of information with no privileged role in scientific publication and decision making. We argue that this radical approach is both practical and sensible.
In this paper a new long-term survival distribution is proposed. The so-called long-term Fréchet distribution allows us to fit data where a part of the population is not susceptible to the event of interest. This model may be used, for example, in clinical studies where a portion of the population can be cured during a treatment. We give an account of the mathematical properties of the new distribution, such as its moments and survival properties. We also present the maximum likelihood estimators (MLEs) for the parameters. A numerical simulation is carried out in order to verify the performance of the MLEs. Finally, an important application related to the leukemia-free survival times of transplant patients is discussed to illustrate our proposed distribution.
Association rule mining is a time-consuming process because it is both data intensive and computation intensive. In order to mine large volumes of data and to enhance the scalability and performance of existing sequential association rule mining algorithms, parallel and distributed algorithms have been developed. These traditional parallel and distributed algorithms are based on homogeneous platforms and are not well suited to heterogeneous platforms such as grids and clouds. This requires the design of new algorithms that address the issues of good data set partitioning and distribution, load balancing strategy, and optimization of communication and synchronization among processors in such heterogeneous systems. Grids and clouds are the emerging platforms for distributed data processing, and various association rule mining algorithms have been proposed for such platforms. This survey article brings together a brief account of the architecture of distributed systems and various recent approaches to grid-based and cloud-based association rule mining algorithms, with a comparative perspective. We differentiate between approaches to association rule mining algorithms developed on these architectures on the basis of data locality, programming paradigm, fault tolerance, communication cost, and partitioning and distribution of data sets. Although the survey does not cover all algorithms, it can be very useful for new researchers working on distributed association rule mining algorithms.
Graphs are an important data representation that appears in a wide diversity of real-world scenarios. Effective graph analytics provides users with a deeper understanding of what is behind the data, and thus can benefit many useful applications such as node classification, node recommendation, link prediction, etc. However, most graph analytics methods suffer from high computation and space costs. Graph embedding is an effective and efficient way to solve the graph analytics problem. It converts the graph data into a low-dimensional space in which the graph structural information and graph properties are maximally preserved. In this survey, we conduct a comprehensive review of the literature on graph embedding. We first introduce the formal definition of graph embedding as well as the related concepts. After that, we propose two taxonomies of graph embedding which correspond to what challenges exist in different graph embedding problem settings and how the existing work addresses these challenges in its solutions. Finally, we summarize the applications that graph embedding enables and suggest four promising future research directions in terms of computation efficiency, problem settings, techniques and application scenarios.
Many state-of-the-art algorithms for solving hard combinatorial problems include elements of stochasticity that lead to high variations in runtime, even for a fixed problem instance, across runs with different pseudo-random number seeds. Knowledge about the runtime distributions (RTDs) of algorithms on given problem instances can be exploited in various meta-algorithmic procedures, such as algorithm selection, portfolios, and randomized restarts. Previous work has shown that machine learning can be used to individually predict mean, median and variance of RTDs. To establish a new state-of-the-art in predicting RTDs, we demonstrate that the parameters of an RTD should be learned jointly and that neural networks can do this well by directly optimizing the likelihood of an RTD given runtime observations. In an empirical study involving four algorithms for SAT solving and AI planning, we show that our neural networks predict the true RTDs of unseen instances better than previous methods. As an exemplary application of RTD predictions, we show that our RTD models also yield good predictions of running these algorithms in parallel.
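To make "directly optimizing the likelihood of an RTD given runtime observations" concrete, here is a minimal sketch of the loss such a model could minimize. It assumes a lognormal RTD with parameters `mu` and `sigma`. The distribution family and all names are illustrative assumptions, and the neural network that would output the parameters per instance is omitted.

```python
import math


def lognormal_nll(runtimes, mu, sigma):
    """Negative log-likelihood of observed runtimes under LogNormal(mu, sigma).

    Both parameters enter one loss, so a model that outputs (mu, sigma)
    for an instance is trained on them jointly rather than one at a time.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    nll = 0.0
    for t in runtimes:
        log_t = math.log(t)
        nll += (log_t + math.log(sigma) + 0.5 * math.log(2 * math.pi)
                + (log_t - mu) ** 2 / (2 * sigma ** 2))
    return nll


# Runtimes (in seconds) of one randomized solver on one instance, made up
# for illustration.
observed = [0.8, 1.3, 2.1, 5.4, 0.9]
logs = [math.log(t) for t in observed]
mu_hat = sum(logs) / len(logs)
sigma_hat = math.sqrt(sum((v - mu_hat) ** 2 for v in logs) / len(logs))
loss_at_mle = lognormal_nll(observed, mu_hat, sigma_hat)
```

The closed-form maximum likelihood estimates `mu_hat` and `sigma_hat` minimize this loss for a single instance, which gives a quick sanity check for any learned predictor.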
For most state-of-the-art architectures, the Rectified Linear Unit (ReLU) has become a standard component accompanying each layer. Although ReLU can ease network training to an extent, its blocking of negative values may suppress the propagation of useful information and lead to difficulty in optimizing very deep Convolutional Neural Networks (CNNs). Moreover, stacked layers with nonlinear activations struggle to approximate the intrinsic linear transformations between feature representations. In this paper, we investigate the effect of erasing the ReLUs of certain layers and apply it to various representative architectures. We name our approach 'EraseReLU'. It can ease the optimization and improve the generalization performance of very deep CNN models. In experiments, this method successfully improves the performance of various representative architectures, and we report the improved results on SVHN, CIFAR-10/100, and ImageNet-1k. By using EraseReLU, we achieve state-of-the-art single-model performance on CIFAR-100 with 83.47% accuracy. Code will be released soon.
Deep convolutional neural networks (CNNs) achieve remarkable performance in image classification tasks. Recent studies, however, report that generalization ability is more important than network depth for improving the accuracy of image classification. In this paper, I propose a new neural network called SwGridNet. SwGridNets contain many convolutional processing units which connect with each other as a grid network in which there are many processing paths between input and output. SwGridNets have a high generalization ability because the multi-path architecture has the same effect as ensemble learning. In this paper, I describe the details of the network architecture of SwGridNets. I also show experimental results which indicate that the performance of SwGridNets is close to that of state-of-the-art deep CNNs.
Strongly motivated by search engine optimization, the amount of structured data on the web is growing rapidly. The main search engine providers promise a great increase in visibility when a web page's content is annotated with the vocabulary of schema.org and thus provided as structured data. But besides its use by search engines, the data can be used in various other ways, for example for automatic processing of annotated web services or actions. In this work we present an approach to consume and process schema.org annotated data on the web and give an idea of what a best practice can look like.
A Blockchain is a global shared infrastructure where cryptocurrency transactions among addresses are recorded, validated and made publicly available in a peer-to-peer network. To date, the best known and most important cryptocurrency is Bitcoin. In this paper we focus on this cryptocurrency and in particular on the modeling of the Bitcoin Blockchain using the Petri Nets formalism. The proposed model allows us to quickly collect information about identities owning Bitcoin addresses and to recover measures and statistics on the Bitcoin network. By exploiting algebraic formalism, we reconstructed an Entities network associated with Blockchain transactions, gathering together Bitcoin addresses into the single entity holding permits to manage the Bitcoins held by those addresses. The model also allows us to identify a set of behaviours typical of Bitcoin owners, like that of using an address only once, and to reconstruct chains for this behaviour together with the rate of firing. Our model is highly flexible and can easily be adapted to include different features of the Bitcoin cryptocurrency system.
Some researchers have proposed using non-Euclidean metrics for clustering data points. Generally, the metric should recognize that two points in the same cluster are close, even if the Euclidean distance between them is large. Multiple proposals have been suggested, including the Edge-Squared Metric (a specific example of a graph geodesic) and the Nearest Neighbor Metric. In this paper, we prove that the edge-squared and nearest-neighbor metrics are in fact equivalent. The previous best result showed that the edge-squared metric was a 3-approximation of the Nearest Neighbor metric. This paper represents one of the first proofs equating a continuous metric with a discrete metric using non-trivial discrete methods. Our proof uses the Kirszbraun theorem (also known as the Lipschitz Extension Theorem and Brehm's Extension Theorem), a notable theorem in functional analysis and computational geometry. The results of our paper, combined with the results of Hwang, Damelin, and Hero, tell us that the Nearest Neighbor distance on i.i.d. samples of a density is a reasonable constant approximation of a natural density-based distance function.
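As an illustration of the discrete side of this result, the sketch below computes an edge-squared distance on a finite point set. It assumes the usual definition: the shortest-path distance in the complete graph on the points, where each edge is weighted by the squared Euclidean distance between its endpoints. The function name and the use of Floyd-Warshall are choices made for this example, not taken from the paper.

```python
def edge_squared_metric(points):
    """All-pairs edge-squared distances for a list of coordinate tuples.

    ASSUMED definition: shortest-path distance in the complete graph whose
    edge (i, j) has weight ||p_i - p_j||^2. Computed with Floyd-Warshall,
    which takes O(n^3) time and is only meant for small examples.
    """
    n = len(points)
    # Start from the direct squared Euclidean distances.
    dist = [[sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
             for j in range(n)] for i in range(n)]
    # Relax every pair through every possible intermediate point.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


# Three collinear points: hopping through the middle point is cheaper
# (1 + 1 = 2) than the direct squared distance (2^2 = 4).
d = edge_squared_metric([(0, 0), (1, 0), (2, 0)])
```

Squaring the edge lengths rewards paths made of many short hops, which is why two points joined by a dense chain of samples end up close even when they are far apart in Euclidean terms.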