Probabilities of causation of climate changes

Multiple changes in Earth’s climate system have been observed over the past decades. Determining how likely each of these changes is to have been caused by human influence is important for decision making on mitigation and adaptation policy. Here we describe an approach for deriving the probability that anthropogenic forcings have caused a given observed change. The proposed approach is anchored in causal counterfactual theory (Pearl 2009), which was introduced recently, and has in fact already been partly used, in the context of extreme weather event attribution (EA). We argue that these concepts are also relevant to, and can be straightforwardly extended to, the detection and attribution of long-term trends associated with climate change (D&A). For this purpose, and in agreement with the principle of ‘fingerprinting’ applied in the conventional D&A framework, a trajectory of change is converted into an event occurrence defined by maximizing the causal evidence associated with the forcing under scrutiny. Other key assumptions used in the conventional D&A framework, in particular those related to numerical model error, can also be adapted conveniently to this approach. Our proposal thus bridges the conventional framework and standard causal theory, in an attempt to improve the quantification of causal probabilities. An illustration suggests that our approach tends to yield a significantly higher estimate of the probability that anthropogenic forcings have caused the observed temperature change, thus supporting more assertive causal claims.
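As a rough illustration of the counterfactual quantities this abstract refers to, the sketch below computes Pearl's probabilities of necessary, sufficient, and necessary-and-sufficient causation from two event probabilities. It assumes the standard closed forms that hold under exogeneity and monotonicity; the function name and the symbols `p0` (probability of the event in the counterfactual world without the forcing) and `p1` (probability in the factual world) are illustrative choices, not taken from the paper.

```python
def causation_probabilities(p0, p1):
    """Return (PN, PS, PNS) for an event with counterfactual probability p0
    and factual probability p1.

    Assumes exogeneity and monotonicity, under which Pearl's bounds
    collapse to the closed forms used below.
    """
    if not (0.0 <= p0 < 1.0 and 0.0 < p1 <= 1.0):
        raise ValueError("need 0 <= p0 < 1 and 0 < p1 <= 1")
    pn = max(0.0, 1.0 - p0 / p1)                  # necessary causation
    ps = max(0.0, 1.0 - (1.0 - p1) / (1.0 - p0))  # sufficient causation
    pns = max(0.0, p1 - p0)                       # necessary and sufficient
    return pn, ps, pns


# Hypothetical numbers: the event is rare without the forcing, common with it.
pn, ps, pns = causation_probabilities(p0=0.01, p1=0.5)
```

With these made-up inputs the forcing is very likely a necessary cause but only moderately likely a sufficient one, which is the kind of distinction the counterfactual framing makes explicit.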

Causal inference taking into account unobserved confounding

Causal inference with observational data can be performed under an assumption of no unobserved confounders (unconfoundedness assumption). There is, however, seldom clear subject-matter or empirical evidence for such an assumption. We therefore develop uncertainty intervals for average causal effects based on outcome regression estimators and doubly robust estimators, which provide inference taking into account both sampling variability and uncertainty due to unobserved confounders. In contrast with sampling variation, uncertainty due to unobserved confounding does not decrease with increasing sample size. The intervals introduced are obtained by deriving the bias of the estimators due to unobserved confounders. We are thus also able to contrast the size of the bias due to violation of the unconfoundedness assumption with the bias due to misspecification of the models used to explain potential outcomes. This is illustrated through numerical experiments where bias due to moderate unobserved confounding dominates misspecification bias for typical situations in terms of sample size and modeling assumptions. We also study the empirical coverage of the uncertainty intervals introduced and apply the results to a study of the effect of regular food intake on health. An R-package implementing the inference proposed is available.

Graph Centrality Measures for Boosting Popularity-Based Entity Linking

Many Entity Linking systems use collective graph-based methods to disambiguate the entity mentions within a document. Most of them have focused on graph construction and initial weighting of the candidate entities, while less attention has been devoted to comparing the graph ranking algorithms. In this work, we focus on the graph-based ranking algorithms and propose to apply five centrality measures: Degree, HITS, PageRank, Betweenness and Closeness. A disambiguation graph of candidate entities is constructed for each document using the popularity method, then centrality measures are applied to choose the most relevant candidate and boost the results of the entity popularity method. We investigate the effectiveness of each centrality measure on the performance across different domains and datasets. Our experiments show that a simple and fast centrality measure such as Degree centrality can outperform other more time-consuming measures.
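The idea of re-ranking popularity-based candidates with Degree centrality can be sketched as follows. This is a minimal illustration, not the authors' system: the data layout (a dict of candidate lists with popularity priors, plus a list of relatedness edges), the tie-breaking rule, and the example entities are all assumptions made for the example.

```python
from collections import defaultdict


def degree_disambiguate(candidates, edges):
    """Pick one entity per mention using degree centrality.

    candidates: dict mapping mention -> list of (entity, popularity) pairs.
    edges: iterable of (entity_a, entity_b) links in the disambiguation graph.
    Ties in degree are broken by the popularity prior (an assumption here).
    """
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1

    chosen = {}
    for mention, cands in candidates.items():
        # Rank by degree first, then by popularity.
        best = max(cands, key=lambda c: (degree[c[0]], c[1]))
        chosen[mention] = best[0]
    return chosen


# Hypothetical document with two ambiguous mentions.
candidates = {
    "Jordan": [("Jordan_(country)", 0.6), ("Michael_Jordan", 0.4)],
    "Bulls": [("Chicago_Bulls", 0.7), ("Bull_(animal)", 0.3)],
}
edges = [("Michael_Jordan", "Chicago_Bulls")]
result = degree_disambiguate(candidates, edges)
```

In this toy case popularity alone would link "Jordan" to the country, while the single graph edge is enough for degree centrality to prefer the basketball player.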

Text Generation Based on Generative Adversarial Nets with Latent Variable

In this paper, we propose a model using a generative adversarial net (GAN) to generate realistic text. Instead of using a standard GAN, we combine a variational autoencoder (VAE) with a generative adversarial net. The use of high-level latent random variables helps to learn the data distribution and addresses the problem that a generative adversarial net tends to emit very similar data. We propose the VGAN model, where the generative model is composed of a recurrent neural network and a VAE. The discriminative model is a convolutional neural network. We train the model via policy gradient. We apply the proposed model to the task of text generation and compare it to other recent neural network based models, such as the recurrent neural network language model and SeqGAN. We evaluate the performance of the model by calculating negative log-likelihood and the BLEU score. We conduct experiments on three benchmark datasets, and results show that our model outperforms previous models.

Rank of Experts: Detection Network Ensemble

The recent advances in convolutional detectors show impressive performance improvements for large-scale object detection. However, detection performance usually decreases as the number of object classes to be detected increases, and it is practically challenging to train a single dominant model for all classes due to the limitations of detection models and datasets. In most cases, therefore, modern convolutional detectors show distinct performance differences for each object class. In this paper, in order to build an ensemble detector for large-scale object detection, we present a conceptually simple but very effective class-wise ensemble detection method named Rank of Experts. We first decompose the intractable problem of finding the best detections for all object classes into small subproblems of finding the best ones for each object class. We then solve the detection problem by ranking detectors in order of their average precision for each class, and then aggregating the responses of the top-ranked detectors (i.e. experts) for class-wise ensemble detection. The main benefit of our method is that it is easy to implement and does not require any joint training of experts for the ensemble. Based on the proposed Rank of Experts, we won 2nd place in the ILSVRC 2017 object detection competition.
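The class-wise ranking step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the per-class average precision table, the detection tuple layout `(class, score, box)`, the `top_k` parameter, and the use of plain concatenation as the aggregation step are all hypothetical; a real pipeline would typically merge overlapping boxes, for example with non-maximum suppression.

```python
def rank_of_experts(ap_table, detections, top_k=2):
    """Keep, for each class, only detections from the top_k detectors by AP.

    ap_table: dict detector -> dict class -> average precision on held-out data.
    detections: dict detector -> list of (class, score, box) tuples.
    Returns a flat list of the retained detections, grouped by class.
    """
    classes = {c for per_class in ap_table.values() for c in per_class}
    merged = []
    for cls in sorted(classes):
        # Rank detectors for this class by their average precision.
        ranked = sorted(
            ap_table, key=lambda d: ap_table[d].get(cls, 0.0), reverse=True
        )
        for expert in ranked[:top_k]:
            merged.extend(d for d in detections.get(expert, []) if d[0] == cls)
    return merged


# Hypothetical AP values and detections for three detectors.
ap_table = {
    "A": {"cat": 0.8, "dog": 0.5},
    "B": {"cat": 0.6, "dog": 0.7},
    "C": {"cat": 0.4, "dog": 0.9},
}
detections = {
    "A": [("cat", 0.9, (0, 0, 1, 1)), ("dog", 0.3, (1, 1, 2, 2))],
    "B": [("cat", 0.5, (0, 0, 1, 1))],
    "C": [("dog", 0.8, (1, 1, 2, 2))],
}
merged = rank_of_experts(ap_table, detections, top_k=1)
```

Because each class is handled independently, no joint training is needed: adding a detector only means adding a row to the AP table.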

Optimal Algorithms for Distributed Optimization

In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m} f_i(\mathbf{x})$ is strongly convex and smooth, strongly convex only, smooth only, or just convex. Our results show that Nesterov’s accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions to the proposed setup, such as proximal friendly functions, time-varying graphs, and improvement of the condition numbers.

Sobol Tensor Trains for Global Sensitivity Analysis

Sobol indices are a widespread quantitative measure for variance-based global sensitivity analysis, but computing and utilizing them remains challenging for high-dimensional systems. We propose the tensor train decomposition (TT) as a unified framework for surrogate modeling and global sensitivity analysis via Sobol indices. We first overview several strategies to build a TT surrogate of the unknown true model using either an adaptive sampling strategy or a predefined set of samples. We then introduce and derive the Sobol tensor train, which compactly represents the Sobol indices for all possible joint variable interactions which are infeasible to compute and store explicitly. Our formulation allows efficient aggregation and subselection operations: we are able to obtain related indices (closed, total, and superset indices) at negligible cost. Furthermore, we exploit an existing global optimization procedure within the TT framework for variable selection and model analysis tasks. We demonstrate our algorithms with two analytical engineering models and a parallel computing simulation data set.

Unsupervised Generative Adversarial Cross-modal Hashing

Cross-modal hashing aims to map heterogeneous multimedia data into a common Hamming space, which can realize fast and flexible retrieval across different modalities. Unsupervised cross-modal hashing is more flexible and applicable than supervised methods, since no intensive labeling work is involved. However, existing unsupervised methods learn hashing functions by preserving inter and intra correlations, while ignoring the underlying manifold structure across different modalities, which is extremely helpful to capture meaningful nearest neighbors of different modalities for cross-modal retrieval. To address the above problem, in this paper we propose an Unsupervised Generative Adversarial Cross-modal Hashing approach (UGACH), which makes full use of GAN’s ability for unsupervised representation learning to exploit the underlying manifold structure of cross-modal data. The main contributions can be summarized as follows: (1) We propose a generative adversarial network to model cross-modal hashing in an unsupervised fashion. In the proposed UGACH, given data of one modality, the generative model tries to fit the distribution over the manifold structure, and selects informative data of another modality to challenge the discriminative model. The discriminative model learns to distinguish the generated data from the true positive data sampled from the correlation graph to achieve better retrieval accuracy. These two models are trained in an adversarial way to improve each other and promote hashing function learning. (2) We propose a correlation graph based approach to capture the underlying manifold structure across different modalities, so that data of different modalities but within the same manifold can have a smaller Hamming distance, which promotes retrieval accuracy. Extensive experiments compared with 6 state-of-the-art methods verify the effectiveness of our proposed approach.

Hierarchical Bayesian image analysis: from low-level modeling to robust supervised learning

Within a supervised classification framework, labeled data are used to learn classifier parameters. Prior to that, it is generally required to perform dimensionality reduction via feature extraction. These preprocessing steps have motivated numerous research works aiming at recovering latent variables in an unsupervised context. This paper proposes a unified framework to perform classification and low-level modeling jointly. The main objective is to use the estimated latent variables as features for classification and to incorporate simultaneously supervised information to help latent variable extraction. The proposed hierarchical Bayesian model is divided into three stages: a first low-level modeling stage to estimate latent variables, a second stage clustering these features into statistically homogeneous groups and a last classification stage exploiting the (possibly badly) labeled data. Performance of the model is assessed in the specific context of hyperspectral image interpretation, unifying two standard analysis techniques, namely unmixing and classification.

Precision Learning: Towards Use of Known Operators in Neural Networks

In this paper, we consider the use of prior knowledge within neural networks. In particular, we investigate the effect of a known transform within the mapping from input data space to the output domain. We demonstrate that use of known transforms is able to change maximal error bounds and that these are additive for the entire sequence of transforms. In order to explore the effect further, we consider the problem of X-ray material decomposition as an example to incorporate additional prior knowledge. We demonstrate that inclusion of a non-linear function known from the physical properties of the system is able to reduce prediction errors, thereby improving prediction quality from SSIM values of 0.54 to 0.88. This approach is applicable to a wide set of applications in physics and signal processing that provide prior knowledge on such transforms. Maximal error estimation and network understanding could also be facilitated within the context of precision learning.

Time Limits in Reinforcement Learning

In reinforcement learning, it is common to let an agent interact with its environment for a fixed amount of time before resetting the environment and repeating the process in a series of episodes. The task that the agent has to learn can either be to maximize its performance over (i) that fixed period, or (ii) an indefinite period where time limits are only used during training to diversify experience. In this paper, we investigate theoretically how time limits could effectively be handled in each of the two cases. In the first one, we argue that the terminations due to time limits are in fact part of the environment, and propose to include a notion of the remaining time as part of the agent’s input. In the second case, the time limits are not part of the environment and are only used to facilitate learning. We argue that such terminations should not be treated as environmental ones and propose a method, specific to value-based algorithms, that incorporates this insight by continuing to bootstrap at the end of each partial episode. To illustrate the significance of our proposals, we perform several experiments on a range of environments from simple few-state transition graphs to complex control tasks, including novel and standard benchmark domains. Our results show that the proposed methods improve the performance and stability of existing reinforcement learning algorithms.
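The two proposals in this abstract can be sketched for a one-step value target. This is a minimal illustration under assumptions: the function names, the normalized remaining-time feature, and the scalar value estimates are hypothetical, and the abstract does not specify these exact forms.

```python
def time_aware_observation(obs, step, time_limit):
    """Case (i): the time limit is part of the task.

    Appends the fraction of time remaining to the observation so the agent
    can condition on it (the normalization is an assumption of this sketch).
    """
    remaining = (time_limit - step) / time_limit
    return list(obs) + [remaining]


def td_target(reward, next_value, terminated, gamma=0.99):
    """Case (ii): the time limit only exists to diversify training.

    terminated must be True only for genuine environment terminations.
    When an episode is merely cut off by the time limit, pass
    terminated=False so the target keeps bootstrapping from next_value.
    """
    if terminated:
        return reward
    return reward + gamma * next_value


# An episode cut off by the time limit still bootstraps from the next state.
cut_off_target = td_target(reward=1.0, next_value=10.0, terminated=False, gamma=0.9)
# A genuine terminal state does not.
terminal_target = td_target(reward=1.0, next_value=10.0, terminated=True, gamma=0.9)
aug = time_aware_observation([0.5], step=25, time_limit=100)
```

The key design point is keeping "the environment ended" and "the clock ran out" as separate signals, since collapsing them into a single done flag makes the value estimate at cut-off states too low.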

Probabilistic Adaptive Computation Time

We present a probabilistic model with discrete latent variables that control the computation time in deep learning models such as ResNets and LSTMs. A prior on the latent variables expresses the preference for faster computation. The amount of computation for an input is determined via amortized maximum a posteriori (MAP) inference. MAP inference is performed using a novel stochastic variational optimization method. The recently proposed Adaptive Computation Time mechanism can be seen as an ad-hoc relaxation of this model. We demonstrate training using the general-purpose Concrete relaxation of discrete variables. Evaluation on ResNet shows that our method matches the speed-accuracy trade-off of Adaptive Computation Time, while allowing for evaluation with a simple deterministic procedure that has a lower memory footprint.

Deep Learning Scaling is Predictable, Empirically

Deep learning (DL) creates impactful advances following a virtuous recipe: model architecture search, creating large training data sets, and scaling computation. It is widely believed that growing training sets and models should improve accuracy and result in better products. As DL application domains grow, we would like a deeper understanding of the relationships between training set size, computational scale, and model accuracy improvements to advance the state-of-the-art. This paper presents a large scale empirical characterization of generalization error and model size growth as training sets grow. We introduce a methodology for this measurement and test four machine learning domains: machine translation, language modeling, image processing, and speech recognition. Our empirical results show power-law generalization error scaling across a breadth of factors, resulting in power-law exponents—the ‘steepness’ of the learning curve—yet to be explained by theoretical work. Further, model improvements only shift the error but do not appear to affect the power-law exponent. We also show that model size scales sublinearly with data size. These scaling relationships have significant implications on deep learning research, practice, and systems. They can assist model debugging, setting accuracy targets, and decisions about data set growth. They can also guide computing system design and underscore the importance of continued computational scaling.
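A power-law learning curve of the kind described here can be fitted with ordinary least squares in log-log space. The sketch below assumes the form error ≈ alpha * m**beta for training set size m, which is a common parameterization rather than one stated in the abstract, and it uses synthetic data generated from known parameters rather than any results from the paper.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error ~= alpha * size**beta by least squares on log-transformed data.

    Returns (alpha, beta); beta is the slope of the learning curve in
    log-log space and is negative when error falls with more data.
    """
    xs = [math.log(m) for m in sizes]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    alpha = math.exp(mean_y - beta * mean_x)
    return alpha, beta


# Synthetic, noise-free learning curve with alpha = 2.0 and beta = -0.3.
sizes = [1e3, 1e4, 1e5, 1e6]
errors = [2.0 * m ** -0.3 for m in sizes]
alpha, beta = fit_power_law(sizes, errors)
```

On real measurements the fit would only be meaningful over the range where the curve is actually linear in log-log space, so very small and very large training sets may need to be excluded.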

Feature discovery and visualization of robot mission data using convolutional autoencoders and Bayesian nonparametric topic models
Resource Sharing of a Computing Access Point for Multi-user Mobile Cloud Offloading with Delay Constraints
Paris-Lille-3D: a large and high-quality ground truth urban point cloud dataset for automatic segmentation and classification
Thermodynamic and kinetic fragility of Freon113: the most fragile plastic crystal
Glassy anomalies in the low-temperature thermal properties of a minimally disordered crystalline solid
Balancing Out Regression Error: Efficient Treatment Effect Estimation without Smooth Propensities
Regularization of non-normal matrices by Gaussian noise – the banded Toeplitz and twisted Toeplitz cases
Fluctuation theory for level-dependent Lévy risk processes
Adaptive fast gradient method in stochastic optimization tasks
A Short-term Intervention for Long-term Fairness in the Labor Market
Inference of Dynamic Regimes in the Microbiome
On the importance of normative data in speech-based assessment
When and how much the altruism impacts your privileged information? Proposing a new paradigm in game theory: The boxers game
Multi-Channel CNN-based Object Detection for Enhanced Situation Awareness
Machine Learning and Manycore Systems Design: A Serendipitous Symbiosis
Mining Precision Interfaces From Query Logs
Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation
Budget-Aware Activity Detection with A Recurrent Policy Network
The distribution of shortest cycle lengths in random networks
Virtualized Control over Fog: Interplay Between Reliability and Latency
Finite GUE distribution with cut-off at a shock
Parameter Estimation for Subsurface flow using Ensemble Data Assimilation
A Note on 3-free Permutations
Graph Distillation for Action Detection with Privileged Information
Semantic Photometric Bundle Adjustment on Natural Sequences
Blind Gain and Phase Calibration via Sparse Spectral Methods
Towards Personalized Modeling of the Female Hormonal Cycle: Experiments with Mechanistic Models and Gaussian Processes
Significance of an excess in a counting experiment: assessing the impact of systematic uncertainties and the case with Gaussian background
Label Efficient Learning of Transferable Representations across Domains and Tasks
Spanning closed walks with bounded maximum degrees of graphs on surfaces
An interpretable latent variable model for attribute applicability in the Amazon catalogue
Experimental learning of quantum states
Hint of a Universal Law for the Financial Gains of Competitive Sport Teams. The case of Tour de France cycle race
Benford’s law first significant digit and distribution distances for testing the reliability of financial reports in developing countries
Video retrieval based on deep convolutional neural network
Maximal arcs and extended cyclic codes
Solving the kernel perfect problem by (simple) forbidden subdigraphs for digraphs in some families of generalized tournaments and generalized bipartite tournaments
Computing upper bounds for optimal density of $(t,r)$ broadcasts on the infinite grid
Optimization Methods for Inverse Problems
Susceptibility Propagation by Using Diagonal Consistency
Fundamental Limits on Data Acquisition: Trade-offs between Sample Complexity and Query Difficulty
Distance-based Camera Network Topology Inference for Person Re-identification
On Edge-Colored Saturation Problems
Generative Adversarial Networks for Electronic Health Records: A Framework for Exploring and Evaluating Methods for Predicting Drug-Induced Laboratory Test Trajectories
Capacity-Achievability of Polar Codes under Successive Cancellation List Decoding
Audio Cover Song Identification using Convolutional Neural Network
Lattice Model for Production of Gas
Speaker identification from the sound of the human breath
Rapid point-of-care Hemoglobin measurement through low-cost optics and Convolutional Neural Network based validation
Learning Depth from Monocular Videos using Direct Methods
New Techniques for Inferring L-Systems Using Genetic Algorithm
Personalized Gaussian Processes for Future Prediction of Alzheimer’s Disease Progression
Emulating satellite drag from large simulation experiments
Inertial-aided Rolling Shutter Relative Pose Estimation
Tight Hamilton cycles in cherry quasirandom $3$-uniform hypergraphs
Modeling the Multiple Sclerosis Brain Disease Using Agents: What Works and What Doesn’t?
Delineation of Skin Strata in Reflectance Confocal Microscopy Images using Recurrent Convolutional Networks with Toeplitz Attention
Improving Smiling Detection with Race and Gender Diversity
3D Facial Action Units Recognition for Emotional Expression
Kernel entropy estimation for linear processes
Graph Homomorphism Reconfiguration and Frozen $H$-Colourings
A 3D Coarse-to-Fine Framework for Automatic Pancreas Segmentation
InverseNet: Solving Inverse Problems with Splitting Networks
Closed-loop field development optimization with multipoint geostatistics and statistical assessment
Many body localization-delocalization transition in quantum Sherrington-Kirkpatrick model
Tensors, Learning, and ‘Kolmogorov Extension’ for Finite-alphabet Random Vectors
Distributed Stratified Locality Sensitive Hashing for Critical Event Prediction in the Cloud
The full characterization of the graphs with a L-eigenvalue of multiplicity $n-3$
An upper bound on the size of avoidance couplings on $K_n$
Real-time Semantic Image Segmentation via Spatial Sparsity
Closed Formulas of the Arithmetic Mean Component Competitive Ratio for the 3-Objective and 4-Objective Time Series Search Problems
On a conjecture of Karasev
A double competitive strategy based learning automata algorithm
On Quadratic Eigenvalue Complementarity Problem via DC Programming Approaches
A new exponential upper bound for the Erdős-Ginzburg-Ziv constant
Efficient determination of optimised multi-arm multi-stage experimental designs with control of generalised error-rates
A quantitative inverse theorem for the $U^4$ norm over finite fields
Spatial Modulation Aided Layered Division Multiplexing: A Spectral Efficiency Perspective
Deep Learning for Metagenomic Data: using 2D Embeddings and Convolutional Neural Networks
A bilinear version of Bogolyubov’s theorem
Learning Deep Representations for Word Spotting Under Weak Supervision
Cosmological Simulations in Exascale Era
Utilizing Domain Knowledge in End-to-End Audio Processing
Fast-SSC-Flip Decoding of Polar Codes
Topology of posets with special partial matchings
Deformable Shape Completion with Graph Convolutional Autoencoders
GANosaic: Mosaic Creation with Generative Texture Manifolds
Together or Alone: The Price of Privacy in Joint Learning
A Short Solution to the Many-Player Silent Duel with Arbitrary Consolation Prize
The Wright–Fisher model for class–dependent fitness landscapes
Neural Signatures for Licence Plate Re-identification
Locally-Iterative Distributed (Delta + 1)-Coloring below Szegedy-Vishwanathan Barrier, and Applications to Self-Stabilization and to Restricted-Bandwidth Models
Faithful Model Inversion Substantially Improves Auto-encoding Variational Inference
Prior and Likelihood Choices for Bayesian Matrix Factorisation on Small Datasets
Towards Time-Limited $\mathcal H_2$-Optimal Model Order Reduction
Energy- and Spectral- Efficiency Tradeoff for D2D-Multicasts in Underlay Cellular Networks
Deep Learning with Permutation-invariant Operator for Multi-instance Histopathology Classification
Folded Recurrent Neural Networks for Future Video Prediction
Semi-Adversarial Networks: Convolutional Autoencoders for Imparting Privacy to Face Images
On a skewed and multifractal uni-dimensional random field, as a probabilistic representation of Kolmogorov’s views on turbulence
Stochastic homogenisation of high-contrast media
Testing weak optimality of a given solution in interval linear programming revisited: NP-hardness proof, algorithm and some polynomial cases
Thresholding gradient methods in Hilbert spaces: support identification and linear convergence
Reachability Analysis of Large Linear Systems with Uncertain Inputs in the Krylov Subspace
Deep Neural Network Detects Quantum Phase Transition
Polar Coding for the Large Hadron Collider: Challenges in Code Concatenation
Don’t Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering
Robust Computation of 2D EIT Absolute Images with D-bar Methods
General Erased-Word Processes: Product-Type Filtrations, Ergodic Laws and Martin Boundaries
Explicit formulas for heat kernels on diamond fractals
Footprint and minimum distance functions
The Tightness of the Kesten-Stigum Reconstruction Bound of Symmetric Model with Multiple Mutations
Reconstruction for the Asymmetric Ising Model on Regular Trees
Near critical preferential attachment networks have small giant components
On the martingale decompositions of Gundy, Meyer, and Yoeurp in infinite dimensions
On the few products, many sums problem
DAOS for Extreme-scale Systems in Scientific Applications
The reparameterization trick for acquisition functions
Unsupervised Classification of PolSAR Data Using a Scattering Similarity Measure Derived from a Geodesic Distance
Novel Exploration Techniques (NETs) for Malaria Policy Interventions
Event-Triggered Communication and Control of Network Systems for Multi-Agent Consensus
Integrable Trotterization: Local Conservation Laws and Boundary Driving
Single-Shot Object Detection with Enriched Semantics
On the treewidth of triangulated 3-manifolds
Unsupervised Learning for Color Constancy
Deep Neural Network Architectures for Modulation Classification