A Distance Measuring Algorithm for Location Analysis

Approximating distance is one of the key challenges in facility location problems. Several algorithms have been proposed, but none of them focuses on estimating the distance between two concave regions. In this work, we present an algorithm to estimate the distance between two irregular regions of a facility location problem. The proposed algorithm can identify the distance between concave regions. We also discuss some relevant properties of the proposed algorithm. A distance-sensitive capacity location model is introduced to test the algorithm. Moreover, several special geometric cases are discussed to show the advantages and insights of the algorithm.

Network reconstruction with local partial correlation: comparative evaluation

Over the past decade, various methods have been proposed for the reconstruction of networks modeled as Gaussian Graphical Models. In this work we analyze three different approaches: the Graphical Lasso (GLasso), Graphical Ridge (GGMridge) and Local Partial Correlation (LPC). To evaluate the methods, we used high-dimensional data generated from simulated random graphs (Erdős-Rényi, Barabási-Albert, Watts-Strogatz). Performance was assessed through the Receiver Operating Characteristic (ROC) curve. In addition, the methods were used to reconstruct a co-expression network for differentially expressed genes in human cervical cancer data. The LPC method outperformed GLasso in most of the simulation cases, although GGMridge produced better ROC curves than both other methods. LPC obtained outcomes similar to those of GGMridge in the real data studies.
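For orientation, the standard identity that links a Gaussian Graphical Model to partial correlations can be written as follows. The notation (precision matrix \Omega with entries \omega_{ij}) is our own and is not taken from the abstract; this is textbook background, not a description of how any of the three methods is implemented.

```latex
% X ~ N(\mu, \Sigma), with precision matrix \Omega = \Sigma^{-1} = (\omega_{ij}).
% Partial correlation of X_i and X_j given all remaining variables:
\rho_{ij \mid \text{rest}} = -\frac{\omega_{ij}}{\sqrt{\omega_{ii}\,\omega_{jj}}}
% Edge rule of the graphical model:
(i,j) \notin E \iff \omega_{ij} = 0 \iff \rho_{ij \mid \text{rest}} = 0
```

Under this reading, reconstructing the network amounts to deciding which partial correlations are nonzero, which is why an ROC curve over edge decisions is a natural performance measure.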

ATOMO: Communication-efficient Learning via Atomic Sparsification

Distributed model training suffers from communication overheads due to frequent gradient updates transmitted between compute nodes. To mitigate these overheads, several studies propose the use of sparsified stochastic gradients. We argue that these are facets of a general sparsification method that can operate on any possible atomic decomposition. Notable examples include element-wise, singular value, and Fourier decompositions. We present ATOMO, a general framework for atomic sparsification of stochastic gradients. Given a gradient, an atomic decomposition, and a sparsity budget, ATOMO gives a random unbiased sparsification of the atoms minimizing variance. We show that methods such as QSGD and TernGrad are special cases of ATOMO and show that sparsifying gradients in their singular value decomposition (SVD), rather than the coordinate-wise one, can lead to significantly faster distributed training.
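The idea of a random unbiased sparsification under a budget can be illustrated with a minimal sketch. It assumes the atoms are given as a list of scalar coefficients (for example gradient entries or singular values), keeps atom i with probability p_i and rescales it by 1/p_i, and chooses the p_i proportional to the coefficient magnitudes, capped at 1. The function names and the capping rule are our own illustrative choices and are not claimed to reproduce the paper's exact procedure.

```python
import random


def keep_probabilities(magnitudes, budget):
    """Return keep-probabilities proportional to magnitude, capped at 1.

    The probabilities sum to at most `budget` (the expected number of kept
    atoms). Atoms whose proportional share would exceed 1 are kept with
    certainty and the leftover budget is redistributed over the rest.
    """
    probs = [0.0] * len(magnitudes)
    active = [i for i, m in enumerate(magnitudes) if m > 0]
    remaining = float(budget)
    while active:
        total = sum(magnitudes[i] for i in active)
        capped = [i for i in active if magnitudes[i] * remaining >= total]
        if not capped:
            for i in active:
                probs[i] = magnitudes[i] * remaining / total
            break
        for i in capped:
            probs[i] = 1.0
        remaining -= len(capped)
        active = [i for i in active if probs[i] < 1.0]
    return probs


def sparsify(coeffs, budget, rng):
    """Randomly zero out atoms; survivors are scaled by 1/p so E[output] = coeffs."""
    probs = keep_probabilities([abs(c) for c in coeffs], budget)
    return [
        c / p if p > 0 and rng.random() < p else 0.0
        for c, p in zip(coeffs, probs)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    print(keep_probabilities([4.0, 1.0, 1.0], budget=2))
    print(sparsify([4.0, -1.0, 1.0], budget=2, rng=rng))
```

Because each surviving atom is divided by its keep-probability, the expectation of the output equals the input, which is the unbiasedness property the abstract refers to. Applying the same routine to singular values instead of coordinates would correspond to the SVD variant, under the same assumptions.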

A framework for posterior consistency in model selection

We develop a theoretical framework for the frequentist assessment of Bayesian model selection, specifically its ability to select the (Kullback-Leibler) optimal model and to portray the corresponding uncertainty. The contribution is not proving consistency for a specific prior, but giving a general strategy for such proofs. Its basis applies to any model, prior, sample size, parameter dimensionality and (although only briefly exploited here) under model misspecification. As an immediate consequence the framework also characterizes a strong form of convergence for L_0 penalties and associated pseudo-posterior probabilities of potential interest for uncertainty quantification. The main advantage of the framework is that, instead of studying complex high-dimensional stochastic sums, it suffices to bound certain Bayes factor tails and use standard tools to determine the convergence of deterministic series. As a second contribution we deploy the framework to canonical linear regression. These findings give a high-level description of when one can achieve consistency and at what rate for a wide class of priors as a function of the data-generating truth, sample size and dimensionality. They also indicate when it is possible to use less sparse priors to improve inherent sparsity vs. power trade-offs that are not adequately captured by studying asymptotically optimal rates. Our empirical illustrations align with these findings, underlining the importance of considering the problem at hand’s characteristics to judge the quality of model selection procedures, rather than relying purely on asymptotics.

Aggregating Predictions on Multiple Non-disclosed Datasets using Conformal Prediction

Conformal Prediction is a machine learning methodology that produces valid prediction regions under mild conditions. In this paper, we explore the application of making predictions over multiple data sources of different sizes without disclosing data between the sources. We propose that each data source applies a transductive conformal predictor independently using the local data, and that the individual predictions are then aggregated to form a combined prediction region. We demonstrate the method on several data sets, and show that the proposed method produces conservatively valid predictions and reduces the variance in the aggregated predictions. We also study the effect that the number of data sources and size of each source has on aggregated predictions, as compared with equally sized sources and pooled data.
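A toy sketch of the setup described above: each source computes a transductive conformal p-value on its own data, and only the p-values are shared and combined. Several details are assumptions made for illustration, not taken from the abstract: features are scalars, the nonconformity score is the distance to the class mean, and the aggregation rule is the median of the per-source p-values.

```python
from statistics import mean, median


def tcp_p_value(data, x_new, y_new):
    """Transductive conformal p-value for the tentative label y_new.

    data is a list of (x, y) pairs held by one source. The test point is
    added with the tentative label, and every example is scored by its
    distance to the mean of its own class in the augmented set.
    """
    augmented = data + [(x_new, y_new)]
    class_means = {
        label: mean(x for x, y in augmented if y == label)
        for label in {y for _, y in augmented}
    }
    scores = [abs(x - class_means[y]) for x, y in augmented]
    # Fraction of examples at least as nonconforming as the test point.
    return sum(s >= scores[-1] for s in scores) / len(augmented)


def aggregated_region(sources, x_new, labels, epsilon):
    """Labels whose median p-value across sources exceeds epsilon."""
    region = []
    for label in labels:
        p_values = [tcp_p_value(d, x_new, label) for d in sources]
        if median(p_values) > epsilon:
            region.append(label)
    return region


if __name__ == "__main__":
    source_a = [(0.0, "a"), (0.2, "a"), (1.0, "b"), (1.2, "b")]
    source_b = [(0.1, "a"), (0.3, "a"), (0.9, "b"), (1.1, "b"), (1.3, "b")]
    print(aggregated_region([source_a, source_b], 0.1, ["a", "b"], epsilon=0.25))
```

Only p-values leave each source, so the raw data stay local. Whether a given aggregation rule preserves validity is exactly the kind of question the abstract studies; the median used here is a placeholder.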

Gear Training: A new way to implement high-performance model-parallel training

The training of deep neural networks usually requires tremendous computing resources. Therefore, many deep models are trained on large clusters instead of a single machine or GPU. Although most current research tries to run the whole model on all machines using asynchronous stochastic gradient descent (ASGD), we present a new approach to training deep models in parallel: split the model and then train its different parts separately at different speeds.

A Survey on Trust Modeling from a Bayesian Perspective

This paper is concerned with trust modeling for networked computing systems. Of particular interest is the observation that trust is a subjective notion that is invisible, implicit and uncertain in nature; it may therefore be suitable to express it with subjective probabilities and model it on the basis of Bayesian principles. Despite a few attempts to model trust in the Bayesian paradigm, the field lacks a comprehensive overview of Bayesian methods and their theoretical connections to alternative approaches. This paper presents a study to fill this gap. It provides a comprehensive review and analysis of the literature, showing that a great deal of existing work, whether or not it was proposed on the basis of Bayesian principles, can be cast into a general Bayesian paradigm termed subjective Bayesian trust (SBT) theory here. The SBT framework can thus act as a general theoretical infrastructure for comparing or analyzing theoretical ties among existing trust models, and for developing novel models. The aim of this study is twofold. One is to gain insights about Bayesian philosophy in modeling trust. The other is to drive current research a step ahead in seeking a high-level, abstract way of modeling and evaluating trust.

On Predictability of Time Series

The method to estimate the predictability of human mobility was proposed in [C. Song et al., Science 327, 1018 (2010)], which is extensively followed in exploring the predictability of disparate time series. However, the ambiguous description in the original paper leads to some misunderstandings, including the inconsistent logarithm bases in the entropy estimator and the entropy-predictability-conversion equation, as well as the details in the calculation of the Lempel-Ziv estimator, which further results in remarkably overestimated predictability. This paper demonstrates the degree of overestimation by four different types of theoretically generated time series and an empirical data set, and shows the intrinsic deviation of the Lempel-Ziv estimator for highly random time series. This work provides a clear picture on this issue and thus helps researchers in correctly estimating the predictability of time series.
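The logarithm-base issue can be made concrete with a small sketch of the entropy-to-predictability conversion. It assumes the commonly used Fano-type relation S = H(P) + (1 - P) log2(N - 1), where H is the binary entropy in bits, N is the number of distinct states and P is the maximum predictability. The function name and the bisection solver are our own choices; the entropy estimate itself (for example a Lempel-Ziv estimator) is taken as given.

```python
import math


def max_predictability(entropy_bits, n_states, tol=1e-10):
    """Solve S = H(P) + (1 - P) * log2(N - 1) for P on [1/N, 1] by bisection.

    entropy_bits must be expressed in bits, i.e. with the same base-2
    logarithm that appears in the conversion equation.
    """
    if n_states < 2:
        raise ValueError("need at least two distinct states")
    if not 0.0 <= entropy_bits <= math.log2(n_states):
        raise ValueError("entropy must lie in [0, log2(N)] bits")

    def fano_gap(p):
        h = 0.0
        for q in (p, 1.0 - p):
            if q > 0.0:
                h -= q * math.log2(q)
        return h + (1.0 - p) * math.log2(n_states - 1) - entropy_bits

    # fano_gap is non-increasing on [1/N, 1], so bisection applies.
    lo, hi = 1.0 / n_states, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if fano_gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    s_bits = 1.5
    consistent = max_predictability(s_bits, n_states=4)
    # Feeding an entropy measured in nats into the base-2 equation
    # understates the entropy and therefore overstates predictability.
    mismatched = max_predictability(s_bits * math.log(2), n_states=4)
    print(consistent, mismatched)
```

Since the right-hand side decreases in P on the admissible interval, any underestimate of the entropy, including one caused by mixing natural and base-2 logarithms, yields a larger P, which matches the direction of the overestimation described above.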

Data augmentation instead of explicit regularization

Modern deep artificial neural networks have achieved impressive results through models with very large capacity (compared to the number of training examples) that control overfitting with the help of different forms of regularization. Regularization can be implicit, as is the case of stochastic gradient descent and parameter sharing in convolutional layers, or explicit. The most common explicit regularization techniques, such as weight decay and dropout, reduce the effective capacity of the model and typically require the use of deeper and wider architectures to compensate for the reduced capacity. Although these techniques have proven successful in terms of improved generalization, they seem to waste capacity. In contrast, data augmentation techniques do not reduce the effective capacity and improve generalization by increasing the number of training examples. In this paper we systematically analyze the effect of data augmentation on some popular architectures and conclude that data augmentation alone, without any other explicit regularization techniques, can achieve the same or higher performance than regularized models, especially when training with fewer examples, and exhibits much higher adaptability to changes in the architecture.

Bayesian Model-Agnostic Meta-Learning

Learning to infer a Bayesian posterior from a few-shot dataset is an important step towards robust meta-learning due to the model uncertainty inherent in the problem. In this paper, we propose a novel Bayesian model-agnostic meta-learning method. The proposed method combines scalable gradient-based meta-learning with nonparametric variational inference in a principled probabilistic framework. During fast adaptation, the method is capable of learning complex uncertainty structure beyond a point estimate or a simple Gaussian approximation. In addition, a robust Bayesian meta-update mechanism with a new meta-loss prevents overfitting during meta-update. Remaining an efficient gradient-based meta-learner, the method is also model-agnostic and simple to implement. Experimental results show the accuracy and robustness of the proposed method in various tasks: sinusoidal regression, image classification, active learning, and reinforcement learning.

State Space Representations of Deep Neural Networks

This paper deals with neural networks as dynamical systems governed by differential or difference equations. It shows that the introduction of skip connections into network architectures, such as residual networks and dense networks, turns a system of static equations into a system of dynamical equations with varying levels of smoothness on the layer-wise transformations. Closed form solutions for the state space representations of general dense networks, as well as k^{th} order smooth networks, are found in general settings. Furthermore, it is shown that imposing k^{th} order smoothness on a network architecture with d-many nodes per layer increases the state space dimension by a multiple of k, and so the effective embedding dimension of the data manifold is k \cdot d-many dimensions. It follows that network architectures of these types reduce the number of parameters needed to maintain the same embedding dimension by a factor of k^2 when compared to an equivalent first-order, residual network, significantly motivating the development of network architectures of these types. Numerical simulations were run to validate parts of the developed theory.
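To illustrate why a higher-order difference equation enlarges the state space, consider the following sketch. The layer map f_l, the finite-difference form and the auxiliary state v_l are our own illustrative notation, assuming smoothness is imposed through finite differences across layers; the paper's exact definitions may differ.

```latex
% First order (residual network): one d-dimensional state per layer.
x_{l+1} - x_l = f_l(x_l)

% Second order: a second finite difference across layers.
x_{l+1} - 2x_l + x_{l-1} = f_l(x_l)

% Equivalent first-order system with v_l := x_l - x_{l-1}:
v_{l+1} = v_l + f_l(x_l), \qquad x_{l+1} = x_l + v_{l+1}

% The state (x_l, v_l) has dimension 2d; for order k, stacking the
% differences up to order k-1 in the same way gives dimension k \cdot d.
```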

Context Tree for Adaptive Session-based Recommendation

There has been growing interest in recent years, from both practical and research perspectives, in session-based recommendation tasks, as long-term user profiles often do not exist in real-life recommendation applications. In this case, recommendations for a user’s immediate next actions need to be generated based on patterns in anonymous short sessions. An often overlooked aspect is that new items with limited observations arrive continuously in many domains (e.g. news and discussion forums). Therefore, recommendations need to be adaptive to such frequent changes. In this paper, we benchmark a new nonparametric method called context tree (CT) against various state-of-the-art methods on extensive datasets for the session-based recommendation task. Apart from the standard static evaluation protocol adopted in the previous literature, we include an adaptive configuration to mimic the situation when new items with limited observations arrive continuously. Our results show that CT outperforms the two best-performing approaches (recurrent neural network; heuristic-based nearest neighbor) in the majority of the tested configurations and datasets. We analyze reasons for this and demonstrate that it is because of the better adaptation to changes in the domain, as well as the remarkable capability to learn static sequential patterns. Moreover, our running time analysis illustrates the efficiency of using CT as other nonparametric methods.

Smallify: Learning Network Size while Training

As neural networks become widely deployed in different applications and on different hardware, it has become increasingly important to optimize inference time and model size along with model accuracy. Most current techniques optimize model size, model accuracy and inference time in different stages, resulting in suboptimal results and computational inefficiency. In this work, we propose a new technique called Smallify that optimizes all three of these metrics at the same time. Specifically, we present a new method to simultaneously optimize network size and model performance by neuron-level pruning during training. Neuron-level pruning not only produces much smaller networks but also produces dense weight matrices that are amenable to efficient inference. By applying our technique to convolutional as well as fully connected models, we show that Smallify can reduce network size by 35X with a 6X improvement in inference time, with accuracy similar to that of models found by traditional training techniques.

Deconvolution-Based Global Decoding for Neural Machine Translation

A great proportion of sequence-to-sequence (Seq2Seq) models for Neural Machine Translation (NMT) adopt a Recurrent Neural Network (RNN) to generate the translation word by word in sequential order. As studies in linguistics have shown that language is not a linear word sequence but a sequence with complex structure, translation at each step should be conditioned on the whole target-side context. To tackle the problem, we propose a new NMT model that decodes the sequence with the guidance of its structural prediction of the context of the target sequence. Our model generates the translation based on the structural prediction of the target-side context so that the translation can be freed from the constraint of sequential order. Experimental results demonstrate that our model is more competitive than the state-of-the-art methods, and the analysis shows that our model is also robust when translating sentences of different lengths and that it reduces repetition thanks to the guidance from the target-side context during decoding.

LexNLP: Natural language processing and information extraction for legal and regulatory texts

LexNLP is an open source Python package focused on natural language processing and machine learning for legal and regulatory text. The package includes functionality to (i) segment documents, (ii) identify key text such as titles and section headings, (iii) extract over eighteen types of structured information like distances and dates, (iv) extract named entities such as companies and geopolitical entities, (v) transform text into features for model training, and (vi) build unsupervised and supervised models such as word embedding or tagging models. LexNLP includes pre-trained models based on thousands of unit tests drawn from real documents available from the SEC EDGAR database as well as various judicial and regulatory proceedings. LexNLP is designed for use in both academic research and industrial applications, and is distributed at https://…/lexpredict-lexnlp.

Effect of walking-distance on a queuing system of totally asymmetric simple exclusion process equipped with functions of site assignments
Fubini-Tonelli type theorem for non product measures in a product space
Stochastic HJB Equations and Regular Singular Points
Valid Post-selection Inference in Assumption-lean Linear Regression
Iteration Complexity of Randomized Primal-Dual Methods for Convex-Concave Saddle Point Problems
Shuffle-compatible permutation statistics II: the exterior peak set
Robust test statistics for the two-way MANOVA based on the minimum covariance determinant estimator
Exact, complete expressions for the thermodynamic costs of circuits
The Optimal DoF Region for the Two-User Non-Coherent SIMO Multiple-Access Channel
Branching random walks with uncountably many extinction probability vectors
WikiRef: Wikilinks as a route to recommending appropriate references for scientific Wikipedia pages
Tensor-based Hardness of the Shortest Vector Problem to within Almost Polynomial Factors
On the Hardness of Satisfiability with Bounded Occurrences in the Polynomial-Time Hierarchy
Scalable Overload Control for Large-scale Microservice Architecture
Semantically Selective Augmentation for Deep Compact Person Re-Identification
Fairness-Aware Scheduling in Multi-Numerology Based 5G New Radio
The Research of the Real-time Detection and Recognition of Targets in Streetscape Videos
Exploiting Mobility in Cache-Assisted D2D Networks: Performance Analysis and Optimization
A Co-Matching Model for Multi-choice Reading Comprehension
Adaptive Mechanism Design: Learning to Promote Cooperation
Joint Learning of Motion Estimation and Segmentation for Cardiac MR Image Sequences
Density and Distribution Evaluation for Convolution of Independent Gamma Variables
Digital compensation of the side-band-rejection ratio in a fully analog 2SB sub-millimeter receiver
CT-Realistic Lung Nodule Simulation from 3D Conditional Generative Adversarial Networks for Robust Lung Segmentation
High Dimensional Data Enrichment: Interpretable, Fast, and Data-Efficient
Parameter estimation for stochastic partial differential equations of second order
The Influence of One Strategic Agent on The Matching Market
Randomized reference models for temporal networks
Quasi-potential Calculation and Minimum Action Method for Limit Cycle
Adaptive Denoising of Signals with Shift-Invariant Structure
BSDEs driven by cylindrical martingales with application to approximate hedging in bond markets
Control and Readout Software in Superconducting Quantum Computing
Baselines and a datasheet for the Cerema AWP dataset
Ubiquity in graphs I: Topological ubiquity of trees
PubMed Labs: An experimental platform for improving biomedical literature search
Modeling Time-dependent CO$_2$ Intensities in Multi-modal Energy Systems with Storage
The topological trees with extremal Matula numbers
Learning to Estimate Indoor Lighting from 3D Objects
Ergodicity of Invariant Capacity
Reconciling Multiple Genes Trees via Segmental Duplications and Losses
Revising and Extending the Linear Response Theory for Statistical Mechanical Systems: Evaluating Observables as Predictors and Predictands
Efficient global optimization of constrained mixed variable problems
Fast Decoder for Overloaded Uniquely Decodable Synchronous CDMA
Prosody Modifications for Question-Answering in Voice-Only Settings
Statistics on functional data and covariance operators in linear inverse problems
Global rational stabilization of a class of nonlinear time-delay systems
A Fast and Easy Regression Technique for k-NN Classification Without Using Negative Pairs
Supervised Machine Learning for Analysing Spectra of Exoplanetary Atmospheres
Mixing times for the simple exclusion process in ballistic random environment
Scalable Approximation Algorithm for Graph Summarization
When and where do feed-forward neural networks learn localist representations
Analysis of Average Consensus Algorithm for Asymmetric Regular Networks
Coloring Delaunay-Edges and their Generalizations
Versatile Mobile Communications Simulation: The Vienna 5G Link Level Simulator
On Mixtures of Gamma Distributions, Distributions with Hyperbolically Monotone Densities and Generalized Gamma Convolutions (GGC)
Convergence Rates for Projective Splitting
Decentralize and Randomize: Faster Algorithm for Wasserstein Barycenters
FESTUNG: A MATLAB / GNU Octave toolbox for the discontinuous Galerkin method. Part IV: Generic problem framework and model-coupling interface
Reachability for Branching Concurrent Stochastic Games
Retinal Optic Disc Segmentation using Conditional Generative Adversarial Network
Multi-task learning of daily work and study round-trips from survey data
Dual Pattern Learning Networks by Empirical Dual Prediction Risk Minimization
A Cost-based Storage Format Selector for Materialization in Big Data Frameworks
On solid density of Cayley digraphs on finite Abelian groups
Multi-Task Deep Networks for Depth-Based 6D Object Pose and Joint Registration in Crowd Scenarios
Fractal and multifractal properties of electrographic recordings of human brain activity
Time-inhomogeneous polynomial processes
Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis
A $4$-choosable graph that is not $(8:2)$-choosable
On the Rate of Convergence to a Gamma Distribution on Wiener Space
Optimizing sequential decisions in the drift-diffusion model
Distance-Free Modeling of Multi-Predicate Interactions in End-to-End Japanese Predicate-Argument Structure Analysis
Interdependent Values without Single-Crossing
Massively Parallel Video Networks
Air-Ground Integrated Vehicular Network Slicing with Content Pushing and Caching
Effect of incommensurate potential on nodal-link semimetals
Quantum structure of glasses and the boson peak: a theory of vibrations
Deep Learning for Classification Tasks on Geospatial Vector Polygons
Object detection and tracking benchmark in industry based on improved correlation filter
Confidence ellipsoids for regression coefficients by observations from a mixture
Synthetic Perfusion Maps: Imaging Perfusion Deficits in DSC-MRI with Deep Learning
A Multimodal Classifier Generative Adversarial Network for Carry and Place Tasks from Ambiguous Language Instructions
HetNetAligner: Design and Implementation of an algorithm for heterogeneous network alignment on Apache Spark
On closeness of two discrete weighted sums
Large deviations of regression parameter estimator in continuous-time models with sub-Gaussian noise
Analytic continuation via ‘domain-knowledge free’ machine learning
Properties of Poisson processes directed by compound Poisson-Gamma subordinators
Interactive Visual Grounding of Referring Expressions for Human-Robot Interaction
Mixed-Effect Time-Varying Network Model and Application in Brain Connectivity Analysis
SVA Based Beamforming
Exponential bounds for the tail probability of the supremum of an inhomogeneous random walk
Know What You Don’t Know: Unanswerable Questions for SQuAD
Addition of Code Mixed Features to Enhance the Sentiment Prediction of Song Lyrics
An Efficient, Generalized Bellman Update For Cooperative Inverse Reinforcement Learning
Adaptive MCMC via Combining Local Samplers
Minmax-Regret $k$-Sink Location on a Dynamic Tree Network with Uniform Capacities
Compression of phase-only holograms with JPEG standard and deep learning
Distributed Kalman Filter for A Class of Nonlinear Uncertain Systems: An Extended State Method
Enhancing PHY Security of MISO NOMA SWIPT Systems With a Practical Non-Linear EH Model
Greybox fuzzing as a contextual bandits problem
Central limit theorems for non-symmetric random walks on nilpotent covering graphs: Part I
Chaining Mutual Information and Tightening Generalization Bounds
Polynomials from combinatorial $K$-theory
On the adversarial robustness of robust estimators
Hyperviscosity-Based Stabilization for Radial Basis Function-Finite Difference (RBF-FD) Discretizations of Advection-Diffusion Equations
Generative Adversarial Network Architectures For Image Synthesis Using Capsule Networks
Context-Aware Policy Reuse
The Effect of Network Width on the Performance of Large-batch Training
Distributed Evaluations: Ending Neural Point Metrics
On oracle-type local recovery guarantees in compressed sensing
Noise-based control of social dynamics
Counting subgroups of fixed order in finite abelian groups
DOOBNet: Deep Object Occlusion Boundary Detection from an Image
Joint Beamforming and Power Allocation in Downlink NOMA Multiuser MIMO Networks
Scalable Self-Adaptive Synchronous Triggering System in Superconducting Quantum Computing
Smoothed analysis of the low-rank approach for smooth semidefinite programs
On critical dynamics and thermodynamic efficiency of urban transformations
Part-of-Speech Tagging on an Endangered Language: a Parallel Griko-Italian Resource
The CCP Selector: Scalable Algorithms for Sparse Ridge Regression from Chance-Constrained Programming
Exponential drift condition and ergodicity for generalized reflected Brownian motion
Robust Object Tracking with Crow Search Optimized Multi-cue Particle Filter
Machine-learning Skyrmions
A Structured Variational Autoencoder for Contextual Morphological Inflection
Forecast evaluation with imperfect observations and imperfect models
Are All Languages Equally Hard to Language-Model
Unsupervised Disambiguation of Syncretism in Inflected Lexicons
Asymptotics for 2D critical and near-critical first-passage percolation
The distribution of sandpile groups of random regular graphs
Lost in translation: On the impact of data coding on penalized regression with interactions
Cross-Dataset Adaptation for Visual Question Answering
Learning Answer Embeddings for Visual Question Answering
VLSI Design Of Advanced Digital Filters
Stochastic seismic waveform inversion using generative adversarial networks as a geological prior
Static Quantized Radix-2 FFT/IFFT Processor for Constraints Analysis
Smart GSM Based Home Automation System
All-in-one: Multi-task Learning for Rumour Verification
Deep Reinforcement Learning for Chinese Zero pronoun Resolution
On the third-order Jacobsthal and third-order Jacobsthal-Lucas sequences and their matrix representations
Deterministic Min-Cost Matching with Delays
An asymmetric container lemma and the structure of graphs with no induced $4$-cycle
Poisson percolation on the oriented square lattice
Towards Completely Characterizing the Complexity of Boolean Nets Synthesis
Convolutional number-theoretic method to optimise integer matrix multiplication
Unsupervised Video-to-Video Translation
Leaves on the line and in the plane
Segmentation of Arterial Walls in Intravascular Ultrasound Cross-Sectional Images Using Extremal Region Selection
Collaboration Diversity and Scientific Impact
Complex network representation through multi-dimensional node projection
A Sufficient and Necessary Condition of PS-ergodicity of Periodic Measures and Generated Ergodic Upper Expectations
Granular Optimal Load-Side Control of Power Systems with Electric Spring Aggregators