Causal Decomposition in the Mutual Causation System

Inference of causality in time series has been principally based on the prediction paradigm. Nonetheless, the predictive causality approach may overlook the simultaneous and reciprocal nature of causal interactions observed in real-world phenomena. Here, we present a causal decomposition approach that is based not on prediction but on the instantaneous phase dependency between the intrinsic components of a decomposed time series. The method involves two assumptions: (1) any cause-effect relationship can be quantified with instantaneous phase dependency between the source and target decomposed as intrinsic components at a specific time scale, and (2) the phase dynamics in the target originating from the source are separable from the target itself. Using empirical mode decomposition, we show that the causal interaction is encoded in instantaneous phase dependency at a specific time scale, and this phase dependency is diminished when the causal-related intrinsic component is removed from the effect. Furthermore, we demonstrate the generic applicability of our method to both stochastic and deterministic systems, show the consistency of the causal decomposition method compared to existing methods, and finally uncover the key mode of causal interactions in both the modelled and actual predator-prey system. We anticipate that this novel approach will assist with revealing causal interactions in complex networks not accounted for by current methods.
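
The abstract does not say how the instantaneous phase dependency is measured. As a rough illustration only, the sketch below assumes one common choice, the magnitude of the mean phase-difference vector, and takes the phase series as given (in practice they would come from, for example, a Hilbert transform of each intrinsic component, which is omitted here). The function name `phase_coherence` is hypothetical.

```python
import cmath
import math


def phase_coherence(phase_a, phase_b):
    """Return |mean(exp(i * (phase_a - phase_b)))|, a value in [0, 1].

    Assumed measure, not taken from the abstract: 1 means the two phase
    series keep a constant offset, 0 means their difference is spread
    evenly around the circle.
    """
    if len(phase_a) != len(phase_b) or not phase_a:
        raise ValueError("phase series must be non-empty and of equal length")
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phase_a, phase_b))
    return abs(total) / len(phase_a)


# Two synthetic phase series locked at a constant offset of 0.5 rad.
n = 100
locked_a = [2 * math.pi * 3 * k / n for k in range(n)]
locked_b = [p - 0.5 for p in locked_a]

# Two synthetic phase series whose difference sweeps one full cycle.
drift_a = [2 * math.pi * 3 * k / n for k in range(n)]
drift_b = [2 * math.pi * 2 * k / n for k in range(n)]
```

Under this assumed measure, the locked pair scores close to one and the drifting pair close to zero; the causal decomposition idea would compare such a score before and after removing a candidate intrinsic component from the target.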

Independent component analysis for multivariate functional data

We extend two methods of independent component analysis, fourth order blind identification and joint approximate diagonalization of eigen-matrices, to vector-valued functional data. Multivariate functional data occur naturally and frequently in modern applications, and extending independent component analysis to this setting allows us to distill important information from this type of data, going a step further than the functional principal component analysis. To allow the inversion of the covariance operator we make the assumption that the dependency between the component functions lies in a finite-dimensional subspace. In this subspace we define fourth cross-cumulant operators and use them to construct the two novel, Fisher consistent methods for solving the independent component problem for vector-valued functions. Both simulations and an application on a hand gesture data set show the usefulness and advantages of the proposed methods over functional principal component analysis.

Cognitive Database: A Step towards Endowing Relational Databases with Artificial Intelligence Capabilities

We propose Cognitive Databases, an approach for transparently enabling Artificial Intelligence (AI) capabilities in relational databases. A novel aspect of our design is to first view the structured data source as meaningful unstructured text, and then use the text to build an unsupervised neural network model using a Natural Language Processing (NLP) technique called word embedding. This model captures the hidden inter-/intra-column relationships between database tokens of different types. For each database token, the model includes a vector that encodes contextual semantic relationships. We seamlessly integrate the word embedding model into existing SQL query infrastructure and use it to enable a new class of SQL-based analytics queries called cognitive intelligence (CI) queries. CI queries use the model vectors to enable complex queries such as semantic matching, inductive reasoning queries such as analogies, predictive queries using entities not present in a database, and, more generally, using knowledge from external sources. We demonstrate unique capabilities of Cognitive Databases using an Apache Spark based prototype to execute inductive reasoning CI queries over a multi-modal database containing text and images. We believe our first-of-a-kind system exemplifies using AI functionality to endow relational databases with capabilities that were previously very hard to realize in practice.
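
To make the analogy-style CI query more concrete, here is a minimal Python sketch. It is not the authors' Spark prototype: the token names and the three-dimensional vectors are invented for illustration, whereas a real model would learn its vectors from the textified database rows. The helper names `cosine` and `analogy` are likewise hypothetical.

```python
import math

# Hypothetical embedding vectors for database tokens (made-up values).
VECTORS = {
    "paris": [0.9, 0.1, 0.8],
    "france": [0.9, 0.1, 0.1],
    "rome": [0.1, 0.9, 0.8],
    "italy": [0.1, 0.9, 0.1],
    "berlin": [0.5, 0.5, 0.8],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def analogy(a, b, c, vectors=VECTORS):
    """Answer "a is to b as c is to ?" by vector arithmetic.

    Returns the token whose vector is closest (by cosine similarity)
    to b - a + c, excluding the three query tokens themselves.
    """
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [t for t in vectors if t not in (a, b, c)]
    return max(candidates, key=lambda t: cosine(vectors[t], target))


best = analogy("france", "paris", "italy")
```

In a cognitive database, a function like `cosine` would presumably be exposed to SQL as a user-defined function so that it can appear in `SELECT` and `ORDER BY` clauses; that integration is not shown here.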

Differentially Private Federated Learning: A Client Level Perspective

Federated learning is a recent advance in privacy protection. In this context, a trusted curator aggregates parameters optimized in a decentralized fashion by multiple clients. The resulting model is then distributed back to all clients, ultimately converging to a joint representative model without explicitly having to share the data. However, the protocol is vulnerable to differential attacks, which could originate from any party contributing during federated optimization. In such an attack, a client’s contribution during training and information about their data set are revealed through analyzing the distributed model. We tackle this problem and propose an algorithm for client-side differential-privacy-preserving federated optimization. The aim is to hide clients’ contributions during training, balancing the trade-off between privacy loss and model performance. Empirical studies suggest that given a sufficiently large number of participating clients, our proposed procedure can maintain client-level differential privacy at only a minor cost in model performance.
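
The abstract does not spell out the aggregation step. The sketch below assumes the generic Gaussian-mechanism recipe often used for client-level privacy: clip each client's update to a fixed L2 norm, average, and add Gaussian noise scaled to that norm. The function names, parameters, and noise scaling are illustrative assumptions, not the authors' algorithm, and no privacy accounting is performed.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale an update down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def private_average(client_updates, clip_norm, noise_multiplier, rng=None):
    """Average clipped client updates and add Gaussian noise.

    Assumption: the noise standard deviation is
    noise_multiplier * clip_norm / number_of_clients, so a single
    client's influence on the average is masked by the noise.
    """
    rng = rng or random.Random()
    m = len(client_updates)
    clipped = [clip_update(u, clip_norm) for u in client_updates]
    sigma = noise_multiplier * clip_norm / m
    dim = len(clipped[0])
    return [sum(u[i] for u in clipped) / m + rng.gauss(0.0, sigma)
            for i in range(dim)]


# Example: two clients, one with an oversized update that gets clipped.
updates = [[3.0, 4.0], [0.0, 0.0]]
noiseless = private_average(updates, clip_norm=1.0, noise_multiplier=0.0)
noisy = private_average(updates, clip_norm=1.0, noise_multiplier=1.0,
                        rng=random.Random(0))
```

Because the noise shrinks as the number of clients grows while each client's clipped contribution stays bounded, a sketch like this is consistent with the abstract's observation that many participating clients keep the performance cost small.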

Attentive Memory Networks: Efficient Machine Reading for Conversational Search

Recent advances in conversational systems have changed the search paradigm. Traditionally, a user poses a query to a search engine that returns an answer based on its index, possibly leveraging external knowledge bases and conditioning the response on earlier interactions in the search session. In a natural conversation, there is an additional source of information to take into account: utterances produced earlier in a conversation can also be referred to, and a conversational IR system has to keep track of information conveyed by the user during the conversation, even if it is implicit. We argue that the process of building a representation of the conversation can be framed as a machine reading task, where an automated system is presented with a number of statements about which it should answer questions. The questions should be answered solely by referring to the statements provided, without consulting external knowledge. The time is right for the information retrieval community to embrace this task, both as a stand-alone task and integrated in a broader conversational search setting. In this paper, we focus on machine reading as a stand-alone task and present the Attentive Memory Network (AMN), an end-to-end trainable machine reading algorithm. Its key contribution is in efficiency, achieved by having a hierarchical input encoder, iterating over the input only once. Speed is an important requirement in the setting of conversational search, as gaps between conversational turns have a detrimental effect on naturalness. On 20 datasets commonly used for evaluating machine reading algorithms we show that the AMN achieves performance comparable to the state-of-the-art models, while using considerably fewer computations.

Revisiting the Master-Slave Architecture in Multi-Agent Deep Reinforcement Learning

Many tasks in artificial intelligence require the collaboration of multiple agents. We examine deep reinforcement learning for multi-agent domains. Recent research efforts often take the form of two seemingly conflicting perspectives: the decentralized perspective, where each agent is supposed to have its own controller, and the centralized perspective, where one assumes there is a larger model controlling all agents. In this regard, we revisit the idea of the master-slave architecture by incorporating both perspectives within one framework. Such a hierarchical structure naturally leverages the advantages of each perspective. The idea of combining both perspectives is intuitive and can be well motivated from many real-world systems. However, out of a variety of possible realizations, we highlight three key ingredients, i.e. composed action representation, learnable communication and independent reasoning. With network designs to facilitate these explicitly, our proposal consistently outperforms the latest competing methods both in synthetic experiments and when applied to challenging StarCraft micromanagement tasks.

DeepFuse: A Deep Unsupervised Approach for Exposure Fusion with Extreme Exposure Image Pairs

We present a novel deep learning architecture for fusing static multi-exposure images. Current multi-exposure fusion (MEF) approaches use hand-crafted features to fuse the input sequence. However, the weak hand-crafted representations are not robust to varying input conditions. Moreover, they perform poorly for extreme exposure image pairs. Thus, it is highly desirable to have a method that is robust to varying input conditions and capable of handling extreme exposure without artifacts. Deep representations are known to be robust to input conditions and have shown phenomenal performance in a supervised setting. However, the stumbling block in using deep learning for MEF was the lack of sufficient training data and of an oracle to provide the ground truth for supervision. To address the above issues, we have gathered a large dataset of multi-exposure image stacks for training, and to circumvent the need for ground-truth images, we propose an unsupervised deep learning framework for MEF utilizing a no-reference quality metric as the loss function. The proposed approach uses a novel CNN architecture trained to learn the fusion operation without a reference ground-truth image. The model fuses a set of common low-level features extracted from each image to generate artifact-free, perceptually pleasing results. We perform extensive quantitative and qualitative evaluation and show that the proposed technique outperforms existing state-of-the-art approaches for a variety of natural images.

Bayesian model checking: A comparison of tests

Two procedures for checking Bayesian models are compared using a simple test problem based on the local Hubble expansion. Over four orders of magnitude, p-values derived from a global goodness-of-fit criterion for posterior probability density functions (Lucy 2017) agree closely with posterior predictive p-values. The former can therefore serve as an effective proxy for the difficult-to-calculate posterior predictive p-values.

Dataflow Matrix Machines and V-values: a Bridge between Programs and Neural Nets

Dataflow matrix machines generalize neural nets by replacing streams of numbers with streams of vectors (or other kinds of linear streams admitting a notion of linear combination of several streams) and adding a few more changes on top of that, namely arbitrary input and output arities for activation functions, countable-sized networks with finite dynamically changeable active part capable of unbounded growth, and a very expressive self-referential mechanism. While recurrent neural networks are Turing-complete, they form an esoteric programming platform, not conducive to practical general-purpose programming. Dataflow matrix machines are more suitable as a general-purpose programming platform, although it remains to be seen whether this platform can be made fully competitive with more traditional programming platforms currently in use. At the same time, dataflow matrix machines retain the key property of recurrent neural networks: programs are expressed via matrices of real numbers, and continuous changes to those matrices produce arbitrarily small variations in the programs associated with those matrices. Spaces of vector-like elements are of particular importance in this context. In particular, we focus on the vector space V of finite linear combinations of strings, which can also be understood as the vector space of finite prefix trees with numerical leaves, the vector space of ‘mixed rank tensors’, or the vector space of recurrent maps. This space, and a family of spaces of vector-like elements derived from it, are sufficiently expressive to cover all cases of interest we are currently aware of, and allow a compact and streamlined version of dataflow matrix machines based on a single space of vector-like elements and variadic neurons. We call elements of these spaces V-values. Their role in our context is somewhat similar to the role of S-expressions in Lisp.
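
As a rough illustration of the two views of V mentioned above, the Python sketch below stores a V-value as a mapping from key paths (tuples of strings) to numbers, takes linear combinations coefficient-wise, and converts the result into a nested prefix tree. The representation and the names `linear_combination` and `to_tree` are assumptions made for this example, not the authors' implementation; `to_tree` further assumes that every path is non-empty and that no path is a proper prefix of another.

```python
def linear_combination(terms):
    """Combine V-values given as (coefficient, value) pairs.

    Each value maps a path (tuple of strings) to a number; paths whose
    coefficients cancel to zero are dropped from the result.
    """
    result = {}
    for coef, value in terms:
        for path, leaf in value.items():
            result[path] = result.get(path, 0.0) + coef * leaf
    return {path: x for path, x in result.items() if x != 0.0}


def to_tree(value):
    """View a path-to-number mapping as a nested prefix tree.

    Assumes non-empty paths, none of which is a proper prefix of another.
    """
    root = {}
    for path, leaf in value.items():
        node = root
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = leaf
    return root


a = {("x", "y"): 1.0, ("z",): 2.0}
b = {("x", "y"): -1.0, ("w",): 3.0}

total = linear_combination([(1.0, a), (1.0, b)])  # the ("x", "y") terms cancel
doubled_tree = to_tree(linear_combination([(2.0, a)]))
```

Because every operation here is a linear combination, a stream of such values can be mixed by a matrix row in the same way a stream of numbers is mixed in an ordinary recurrent network.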

Learning a Wavelet-like Auto-Encoder to Accelerate Deep Neural Networks

Accelerating deep neural networks (DNNs) has been attracting increasing attention as it can benefit a wide range of applications, e.g., enabling mobile systems with limited computing resources to have powerful visual recognition ability. A practical strategy toward this goal usually relies on a two-stage process: operating on the trained DNNs (e.g., approximating the convolutional filters with tensor decomposition) and fine-tuning the amended network, which makes it difficult to balance the trade-off between acceleration and maintaining recognition performance. In this work, aiming at a general and comprehensive way for neural network acceleration, we develop a Wavelet-like Auto-Encoder (WAE) that decomposes the original input image into two low-resolution channels (sub-images) and incorporate the WAE into the classification neural networks for joint training. The two decomposed channels, in particular, are encoded to carry the low-frequency information (e.g., image profiles) and the high-frequency information (e.g., image details or noise), respectively, and enable reconstructing the original input image through the decoding process. Then, we feed the low-frequency channel into a standard classification network such as VGG or ResNet and employ a very lightweight network to fuse with the high-frequency channel to obtain the classification result. Compared to existing DNN acceleration solutions, our framework has the following advantages: i) it is compatible with any existing convolutional neural network for classification without amending its structure; ii) the WAE provides an interpretable way to preserve the main components of the input image for classification.

Use of Deep Learning in Modern Recommendation System: A Summary of Recent Works

With the exponential increase in the amount of digital information over the internet, online shops, online music, video and image libraries, search engines and recommendation systems have become the most convenient ways to find relevant information within a short time. In recent times, advances in deep learning have gained significant attention in the fields of speech recognition, image processing and natural language processing. Meanwhile, several recent studies have shown the utility of deep learning in the area of recommendation systems and information retrieval as well. In this short review, we cover the recent advances made in the field of recommendation using different variants of deep learning technology. We organize the review in three parts: collaborative systems, content-based systems and hybrid systems. The review also discusses the contribution of deep learning integrated recommendation systems to several application domains. The review concludes with a discussion of the impact of deep learning on recommendation systems in various domains and of whether deep learning has shown any significant improvement over conventional recommendation systems. Finally, we also provide future directions of research that are possible based on the current state of use of deep learning in recommendation systems.

Riemann-Theta Boltzmann Machine

A general Boltzmann machine with continuous visible and discrete integer-valued hidden states is introduced. Under mild assumptions about the connection matrices, the probability density function of the visible units can be solved for analytically, yielding a novel parametric density function involving a ratio of Riemann-Theta functions. The conditional expectation of a hidden state for given visible states can also be calculated analytically, yielding a derivative of the logarithmic Riemann-Theta function. The conditional expectation can be used as an activation function in a feedforward neural network, thereby increasing the modelling capacity of the network. Both the Boltzmann machine and the derived feedforward neural network can be successfully trained via standard gradient- and non-gradient-based optimization techniques.

SuperPoint: Self-Supervised Interest Point Detection and Description

This paper presents a self-supervised framework for training interest point detectors and descriptors suitable for a large number of multiple-view geometry problems in computer vision. As opposed to patch-based neural networks, our fully-convolutional model operates on full-sized images and jointly computes pixel-level interest point locations and associated descriptors in one forward pass. We introduce Homographic Adaptation, a multi-scale, multi-homography approach for boosting interest point detection accuracy and performing cross-domain adaptation (e.g., synthetic-to-real). Our model, when trained on the MS-COCO generic image dataset using Homographic Adaptation, is able to repeatedly detect a much richer set of interest points than the initial pre-adapted deep model and any other traditional corner detector. The final system gives rise to strong interest point repeatability on the HPatches dataset and outperforms traditional descriptors such as ORB and SIFT on point matching accuracy and on the task of homography estimation.

Learning with Imprinted Weights
Machine Learning for Vehicular Networks
Direct Positioning with Channel Database Assistance
Distributed Massive MIMO Channel Estimation and Channel Database Assistance
The null hypothesis of common jumps in case of irregular and asynchronous observations
Linear Block Coding for Efficient Beam Discovery in Millimeter Wave Communication Networks
Scale-invariant temporal history (SITH): optimal slicing of the past in an uncertain world
Are Extreme Value Estimation Methods Useful for Network Data?
Real-time deep hair matting on mobile devices
Approximate Profile Maximum Likelihood
A note on Linnik’s Theorem on quadratic non-residues
Algebraic lattice codes for linear fading channels
On Wasserstein Reinforcement Learning and the Fokker-Planck equation
Almost perfect transport of an entangled two-qubit state through a spin chain
Y-net: 3D intracranial artery segmentation using a convolutional autoencoder
Deep Regression Forests for Age Estimation
Calibrating Noise to Variance in Adaptive Data Analysis
Optimal P-value Weighting with Independent Information
Discovery of Shifting Patterns in Sequence Classification
Accelerating the computation of FLAPW methods on heterogeneous architectures
Mixed-effects models using the normal and the Laplace distributions: A $\mathbf{2 \times 2}$ convolution scheme for applied research
Towards Practical File Packetizations in Wireless Device-to-Device Caching Networks
Codes Correcting Two Deletions
Numerical Comparison of Leja and Clenshaw-Curtis Dimension-Adaptive Collocation for Stochastic Parametric Electromagnetic Field Problems
Fusing Multifaceted Transaction Data for User Modeling and Demographic Prediction
Equivalences and counterexamples between several definitions of the uniform large deviations principle
Hyperparameters Optimization in Deep Convolutional Neural Network / Bayesian Approach with Gaussian Process Prior
Tensor networks demonstrate the robustness of localization and symmetry protected topological phases
Metadynamics for Training Neural Network Model Chemistries: a Competitive Assessment
Linear Time Clustering for High Dimensional Mixtures of Gaussian Clouds
Further limitations of the known approaches for matrix multiplication
Some Large Sample Results for the Method of Regularized Estimators
A Learning from Demonstration Approach fusing Torque Controllers
Multi-shot Pedestrian Re-identification via Sequential Decision Making
FoldingNet: Interpretable Unsupervised Learning on 3D Point Clouds
Positive definite (p.d.) functions vs p.d. distributions
Model-based curve registration via stochastic approximation EM algorithm
Spectral pairs and positive definite tempered distributions
A Framework to Utilize DERs’ VAR Resources to Support the Grid in an Integrated T-D System
Adaptive Mantel Test for Penalized Inference, with Applications to Imaging Genetics
Learning Sight from Sound: Ambient Sound Provides Supervision for Visual Learning
Uniform Diagonalization Theorem for Complexity Classes of Promise Problems including Randomized and Quantum Classes
LVreID: Person Re-Identification with Long Sequence Videos
Hierarchical and Interpretable Skill Acquisition in Multi-task Reinforcement Learning
Block-diagonal Hessian-free Optimization for Training Neural Networks
A distributed-memory hierarchical solver for general sparse linear systems
Random Walk by Majority Rule and Lévy walk
Analysis of supervised and semi-supervised GrowCut applied to segmentation of masses in mammography images
Detection and classification of masses in mammographic images in a multi-kernel approach
A Flexible Approach to Automated RNN Architecture Generation
Mining Events with Declassified Diplomatic Documents
Lost in Time: Temporal Analytics for Long-Term Video Surveillance
Model-Based Clustering of Time-Evolving Networks through Temporal Exponential-Family Random Graph Models
On the Diversity of Realistic Image Synthesis
Zero-dimensional Donaldson-Thomas invariants of Calabi-Yau 4-folds
Packing Fraction of a Two-dimensional Eden Model with Random-Sized Particles
On P-unique hypergraphs
Uniform rates of the Glivenko-Cantelli convergence and their use in approximating Bayesian inferences
Transformation Models in High-Dimensions
Intelligent Power Control for Spectrum Sharing: A Deep Reinforcement Learning Approach
Independent sets, cliques, and colorings in graphons
Estimated Wold Representation and Spectral Density-Driven Bootstrap for Time Series
Adversarial Structured Prediction for Multivariate Measures
Extreme Value Analysis Without the Largest Values: What Can Be Done?
Monte-Carlo methods for the pricing of American options: a semilinear BSDE point of view
Particles Systems for Mean Reflected BSDEs
Variational image regularization with Euler’s elastica using a discrete gradient scheme
The fourth smallest Hamming weight in the code of the projective plane over $\mathbb{Z}/p \mathbb{Z}$
Light Field Segmentation From Super-pixel Graph Representation
Sharp concentration of the equitable chromatic number of dense random graphs
Optimization of stochastic lossy transport networks and applications to power grids
Scheduling Algorithms for Minimizing Age of Information in Wireless Broadcast Networks with Random Arrivals
Finding Competitive Network Architectures Within a Day Using UCT
Rainbow Cycles in Flip Graphs
ADINE: An Adaptive Momentum Method for Stochastic Gradient Descent
Text Indexing and Searching in Sublinear Time
Incremental Adversarial Domain Adaptation
Laplace approximation and the natural gradient for Gaussian process regression with the heteroscedastic Student-t model
Boolean Tensor Decomposition for Conjunctive Queries with Negation
In silico generation of novel, drug-like chemical matter using the LSTM neural network
Self-Supervised Damage-Avoiding Manipulation Strategy Optimization via Mental Simulation
Fast kNN mode seeking clustering applied to active learning
Symmetries and synchronization in multilayer random networks
Selfishness need not be bad
Recurrent Attentional Reinforcement Learning for Multi-label Image Recognition
Transverse-spin correlations of the random transverse-field Ising model
Accurate 3D Reconstruction of Dynamic Scenes from Monocular Image Sequences with Severe Occlusions
Differentially Private Distributed Learning for Language Modeling Tasks
The graph tessellation cover number: extremal bounds, efficient algorithms and hardness
Weighted Lattice Paths Enumeration by Gaussian Polynomials
Attribute CNNs for Word Spotting in Handwritten Documents
Partial Labeled Gastric Tumor Segmentation via patch-based Reiterative Learning
Mean–field limit of a particle approximation of the one-dimensional parabolic–parabolic Keller-Segel model without smoothing
A Distributed Frank-Wolfe Framework for Learning Low-Rank Matrices with the Trace Norm
On Counting Perfect Matchings in General Graphs
Efficiently Decodable Threshold Non-Adaptive Group Testing
Ethical Questions in NLP Research: The (Mis)-Use of Forensic Linguistics
Estimating historic movement of a climatological variable from a pair of misaligned data sets
Statistical Inference for the Population Landscape via Moment Adjusted Stochastic Gradients
Symbol-Level Selective Full-Duplex Relaying with Power and Location Optimization
Convex and weakly convex domination in prism graphs
Adaptive model predictive control for constrained, linear time varying systems
Pole placement for overdetermined 2D systems
Anchored Network Users: Stochastic Evolutionary Dynamics of Cognitive Radio Network Selection
On the time to absorption in $Λ$-coalescents
An Ensemble Model with Ranking for Social Dialogue
Sharp heat kernel estimates for spectral fractional Laplacian perturbed by gradient
Mixing Time of Vertex-Weighted Exponential Random Graphs
Experimental Phase Estimation Enhanced By Machine Learning
Learning to Act Properly: Predicting and Explaining Affordances from Images
On a generalization of the Dvoretzky-Wald-Wolfowitz theorem with an application to a robust optimization problem
Optimal Discrete Spatial Compression for Beamspace Massive MIMO Signals
Story of the Developments in Statistical Physics of Fracture, Breakdown & Earthquake: A Personal Account
Supermarket Model on Graphs
A Plünnecke-Ruzsa inequality in compact abelian groups
Temporal logic control of general Markov decision processes by approximate policy refinement
Improving Generalization Performance by Switching from Adam to SGD
Deep Learning with Lung Segmentation and Bone Shadow Exclusion Techniques for Chest X-Ray Analysis of Lung Cancer
Image Segmentation to Distinguish Between Overlapping Human Chromosomes
An Evolutionary Game Theory Model for Devaluing Rhinos
Sim2Real View Invariant Visual Servoing by Recurrent Control