Causality Refined Diagnostic Prediction

Applying machine learning in the health care domain has shown promising results in recent years. Interpretable outputs from learning algorithms are desirable for decision making by health care personnel. In this work, we explore the possibility of utilizing causal relationships to refine diagnostic prediction. We focus on the task of diagnostic prediction using discomfort drawings, and explore two ways to employ causal identification to improve the diagnostic results. Firstly, we use causal identification to infer the causal relationships among diagnostic labels; this by itself provides interpretable results to aid the decision making and training of health care personnel. Secondly, we suggest a post-processing approach where the inferred causal relationships are used to refine the prediction accuracy of a multi-view probabilistic model. Experimental results show firstly that causal identification is capable of detecting the causal relationships among diagnostic labels correctly, and secondly that there is potential for improving the accuracy of pain diagnostic prediction using the causal relationships.

The Block Point Process Model for Continuous-Time Event-Based Dynamic Networks

Many application settings involve the analysis of timestamped relations or events between a set of entities, e.g. messages between users of an on-line social network. Static and discrete-time network models are typically used as analysis tools in these settings; however, they discard a significant amount of information by aggregating events over time to form network snapshots. In this paper, we introduce a block point process model (BPPM) for dynamic networks evolving in continuous time in the form of events at irregular time intervals. The BPPM is inspired by the well-known stochastic block model (SBM) for static networks and is a simpler version of the recently-proposed Hawkes infinite relational model (IRM). We show that networks generated by the BPPM follow an SBM in the limit of a growing number of nodes and leverage this property to develop an efficient inference procedure for the BPPM. We fit the BPPM to several real network data sets, including a Facebook network with over 3,500 nodes and 130,000 events, several orders of magnitude larger than the Hawkes IRM and other existing point process network models.
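
As a rough illustration of what "events at irregular time intervals" governed by block structure can look like, the sketch below generates timestamped events between nodes. It is a simplification under stated assumptions, not the paper's model: each ordered pair of blocks is given a homogeneous Poisson process (the abstract does not specify the process), and each event is assigned to a uniformly chosen node pair within that block pair. The function name simulate_block_events and the parameters blocks, rates, and horizon are hypothetical.

```python
import random


def simulate_block_events(blocks, rates, horizon, seed=0):
    """Generate (time, sender, receiver) events from per-block-pair Poisson processes.

    blocks  : list where blocks[i] is the block label of node i
    rates   : dict mapping (block_a, block_b) to an event rate (assumed constant)
    horizon : events are generated on the time interval [0, horizon)
    """
    rng = random.Random(seed)

    # Group node indices by block label.
    members = {}
    for node, label in enumerate(blocks):
        members.setdefault(label, []).append(node)

    events = []
    for (a, b), rate in rates.items():
        if rate <= 0:
            continue
        # All ordered node pairs (i, j) with i in block a, j in block b, i != j.
        pairs = [(i, j) for i in members.get(a, [])
                 for j in members.get(b, []) if i != j]
        if not pairs:
            continue
        # Exponential inter-arrival times give a homogeneous Poisson process.
        t = rng.expovariate(rate)
        while t < horizon:
            i, j = rng.choice(pairs)  # uniform assignment within the block pair
            events.append((t, i, j))
            t += rng.expovariate(rate)

    events.sort()  # merge the per-block-pair streams into one timeline
    return events


if __name__ == "__main__":
    blocks = [0, 0, 1, 1, 1]
    rates = {(0, 0): 0.5, (0, 1): 2.0, (1, 0): 0.1, (1, 1): 1.0}
    for t, i, j in simulate_block_events(blocks, rates, horizon=3.0)[:5]:
        print(f"t={t:.3f}  {i} -> {j}")
```

Aggregating such a stream into snapshots would keep only the event counts per node pair, which is the information loss the abstract refers to.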

FearNet: Brain-Inspired Model for Incremental Learning

Incremental class learning involves sequentially learning classes in bursts of examples from the same class. This violates the assumptions that underlie methods for training standard deep neural networks, and will cause them to suffer from catastrophic forgetting. Arguably, the best method for incremental class learning is iCaRL, but it requires storing training examples for each class, making it challenging to scale. Here, we propose FearNet for incremental class learning. FearNet is a generative model that does not store previous examples, making it memory efficient. FearNet uses a brain-inspired dual-memory system in which new memories are consolidated from a network for recent memories, inspired by the mammalian hippocampal complex, to a network for long-term storage, inspired by the medial prefrontal cortex. Memory consolidation is inspired by mechanisms that occur during sleep. FearNet also uses a module inspired by the basolateral amygdala for determining which memory system to use for recall. FearNet achieves state-of-the-art performance at incremental class learning on image (CIFAR-100, CUB-200) and audio classification (AudioSet) benchmarks.

A reinforcement learning algorithm for building collaboration in multi-agent systems

This paper presents a proof-of-concept study demonstrating the viability of building collaboration among multiple agents through a standard Q-learning algorithm embedded in particle swarm optimisation. Collaboration among the agents is formulated to be achieved via a form of competition, in which the agents are expected to balance their actions so that none of them drifts away from the team and none intrudes on a neighbour's territory. Particles are equipped with a Q-learning algorithm for self-training, so that they learn how to act as members of a swarm and how to produce collaborative/collective behaviours. The results support the algorithmic structures, suggesting that substantive collaboration can be built via the proposed learning algorithm.
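
For readers unfamiliar with the "standard Q learning algorithm" mentioned above, a minimal tabular sketch follows. It shows only the textbook update rule and an epsilon-greedy action choice; how the paper embeds this in particle swarm optimisation is not described in the abstract and is not reproduced here. The names q_update and choose_action, and the default values of alpha, gamma, and epsilon, are illustrative assumptions.

```python
import random


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the new Q(state, action).

    Update rule: Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    Missing table entries are treated as 0.0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def choose_action(q, state, actions, epsilon=0.1, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q.get((state, a), 0.0))


if __name__ == "__main__":
    q = {}
    actions = ["left", "right"]
    q_update(q, "s0", "left", reward=1.0, next_state="s1", actions=actions)
    print(q)
```

In a multi-agent setting each agent (here, each particle) would typically hold its own table q and call q_update after every move; that arrangement is an assumption about the general approach rather than a statement of the paper's design.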

Intent-Aware Contextual Recommendation System

Recommender systems take inputs from user history, use an internal ranking algorithm to generate results and possibly optimize this ranking based on feedback. However, the recommender system is often unaware of the actual intent of the user and simply provides recommendations dynamically without properly understanding the thought process of the user. An intelligent recommender system is useful not only for the user but also for businesses which want to learn the tendencies of their users. Finding out the tendencies or intents of a user is a difficult problem to solve. Keeping this in mind, we set out to create an intelligent system that keeps track of the user’s activity on a web application and determines the intent of the user in each session. We devised a way to encode the user’s activity through the sessions. Then, we represent the information seen by the user in a high-dimensional format which is reduced to lower dimensions using tensor factorization techniques. The aspect of intent awareness (or scoring) is dealt with at this stage. Finally, combining the user activity data with the contextual information gives the recommendation score. The final recommendations are then ranked using filtering and collaborative recommendation techniques to show the top-k recommendations to the user. A provision for feedback is also envisioned in the current system, which informs the model to update the various weights in the recommender system. Our overall model aims to combine both frequency-based and context-based recommendation systems and quantify the intent of a user to provide better recommendations. We ran experiments on real-world timestamped user activity data, in the setting of recommending reports to the users of a business analytics tool, and the results are better than the baselines. We also tuned certain aspects of our model to arrive at optimized results.

Contextual Outlier Interpretation

Outlier detection plays an essential role in many data-driven applications to identify isolated instances that are different from the majority. While many statistical learning and data mining techniques have been used for developing more effective outlier detection algorithms, the interpretation of detected outliers has not received much attention. Interpretation is becoming increasingly important to help people trust and evaluate the developed models by providing intrinsic reasons why certain outliers are chosen. It is difficult, if not impossible, to simply apply feature selection for explaining outliers due to the distinct characteristics of various detection models, complicated structures of data in certain applications, and imbalanced distribution of outliers and normal instances. In addition, the role of the contrastive contexts in which outliers are located, as well as the relation between outliers and contexts, is usually overlooked in interpretation. To tackle the issues above, in this paper, we propose a novel Contextual Outlier INterpretation (COIN) method to explain the abnormality of existing outliers spotted by detectors. The interpretability for an outlier is achieved from three aspects: outlierness score, attributes that contribute to the abnormality, and contextual description of its neighborhoods. Experimental results on various types of datasets demonstrate the flexibility and effectiveness of the proposed framework compared with existing interpretation approaches.

TensorFlow Distributions

The TensorFlow Distributions library implements a vision of probability theory adapted to the modern deep-learning paradigm of end-to-end differentiable computation. Building on two basic abstractions, it offers flexible building blocks for probabilistic computation. Distributions provide fast, numerically stable methods for generating samples and computing statistics, e.g., log density. Bijectors provide composable volume-tracking transformations with automatic caching. Together these enable modular construction of high dimensional distributions and transformations not possible with previous libraries (e.g., pixelCNNs, autoregressive flows, and reversible residual networks). They are the workhorse behind deep probabilistic programming systems like Edward and empower fast black-box inference in probabilistic models built on deep-network components. TensorFlow Distributions has proven an important part of the TensorFlow toolkit within Google and in the broader deep learning community.

Active Betweenness Cardinality: Algorithms and Applications

Centrality rankings such as degree, closeness, betweenness, Katz, PageRank, etc. are commonly used to identify critical nodes in a graph. These methods are based on two assumptions that restrict their wider applicability. First, they assume the exact topology of the network is available. Second, they do not take into account the activity over the network and rely only on its topology. However, in many applications, the network is autonomous, vast, and distributed, and it is hard to collect the exact topology. At the same time, the underlying pairwise activity between node pairs is not uniform and node criticality strongly depends on the activity on the underlying network. In this paper, we propose active betweenness cardinality as a new measure, where the node criticalities are based not on the static structure but on the activity of the network. We show how this metric can be computed efficiently by using only local information for a given node and how we can find the most critical nodes starting from only a few nodes. We also show how this metric can be used to monitor a network and identify failed nodes. We present experimental results that show its effectiveness by demonstrating how failed nodes can be identified by measuring the active betweenness cardinality of a few nodes in the system.

Valid Inference Corrected for Outlier Removal

Ordinary least squares (OLS) estimation of a linear regression model is well-known to be highly sensitive to outliers. It is common practice to first identify and remove outliers by looking at the data, and then to fit OLS and form confidence intervals and p-values on the remaining data as if this were the original data collected. We show in this paper that this ‘detect-and-forget’ approach can lead to invalid inference, and we propose a framework that properly accounts for outlier detection and removal to provide valid confidence intervals and hypothesis tests. Our inferential procedures apply to any outlier removal procedure that can be characterized by a set of quadratic constraints on the response vector, and we show that several of the most commonly used outlier detection procedures are of this form. Our methodology is built upon recent advances in selective inference (Taylor & Tibshirani 2015), which are focused on inference corrected for variable selection. We conduct simulations to corroborate the theoretical results, and we apply our method to two classic data sets considered in the outlier detection literature to illustrate how our inferential results can differ from the traditional detect-and-forget strategy. A companion R package, outference, implements these new procedures with an interface that matches the functions commonly used for inference with lm in R.

BLADE: Filter Learning for General Purpose Image Processing

The Rapid and Accurate Image Super Resolution (RAISR) method of Romano, Isidoro, and Milanfar is a computationally efficient image upscaling method using a trained set of filters. We describe a generalization of RAISR, which we name Best Linear Adaptive Enhancement (BLADE). This approach is a trainable edge-adaptive filtering framework that is general, simple, computationally efficient, and useful for a wide range of image processing problems. We show applications to denoising, compression artifact removal, demosaicing, and approximation of anisotropic diffusion equations.

Representation Learning for Scale-free Networks

Network embedding aims to learn low-dimensional representations of vertexes in a network, while the structure and inherent properties of the network are preserved. Existing network embedding works primarily focus on preserving the microscopic structure, such as the first- and second-order proximity of vertexes, while the macroscopic scale-free property is largely ignored. The scale-free property describes the fact that vertex degrees follow a heavy-tailed distribution (i.e., only a few vertexes have high degrees) and is a critical property of real-world networks, such as social networks. In this paper, we study the problem of learning representations for scale-free networks. We first theoretically analyze the difficulty of embedding and reconstructing a scale-free network in Euclidean space, by converting our problem to the sphere packing problem. Then, we propose the ‘degree penalty’ principle for designing network embedding algorithms that preserve the scale-free property: punishing the proximity between high-degree vertexes. We introduce two implementations of our principle, utilizing spectral techniques and a skip-gram model respectively. Extensive experiments on six datasets show that our algorithms are able not only to reconstruct heavy-tailed degree distributions, but also to outperform state-of-the-art embedding models in various network mining tasks, such as vertex classification and link prediction.

Transfer Learning with Binary Neural Networks

Previous work has shown that it is possible to train deep neural networks with low precision weights and activations. In the extreme case it is even possible to constrain the network to binary values. The costly floating point multiplications are then reduced to fast logical operations. High end smart phones such as Google’s Pixel 2 and Apple’s iPhone X are already equipped with specialised hardware for image processing and it is very likely that other future consumer hardware will also have dedicated accelerators for deep neural networks. Binary neural networks are attractive in this case because the logical operations are very fast and efficient when implemented in hardware. We propose a transfer learning based architecture where we first train a binary network on Imagenet and then retrain part of the network for different tasks while keeping most of the network fixed. The fixed binary part could be implemented in a hardware accelerator while the last layers of the network are evaluated in software. We show that a single binary neural network trained on the Imagenet dataset can indeed be used as a feature extractor for other datasets.
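
The statement that floating point multiplications reduce to fast logical operations can be illustrated with a commonly used identity: for two vectors with entries in {+1, -1}, the dot product equals n minus twice the number of positions where they differ, so it can be computed with an XOR and a bit count. The sketch below is a generic illustration of that identity, not the architecture proposed in the paper; pack_bits and binary_dot are hypothetical helper names.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an int; bit i is set when values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
        elif v != -1:
            raise ValueError("values must be +1 or -1")
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n +1/-1 vectors given as packed bit words.

    Matching positions contribute +1 and mismatching positions contribute -1,
    so the result is n - 2 * popcount(a XOR b).
    """
    mismatches = bin(a_bits ^ b_bits).count("1")
    return n - 2 * mismatches


if __name__ == "__main__":
    a = [1, -1, 1, 1]
    b = [1, 1, -1, 1]
    print(binary_dot(pack_bits(a), pack_bits(b), len(a)))  # same as sum(x * y)
    print(sum(x * y for x, y in zip(a, b)))
```

Dedicated hardware can evaluate the XOR and the bit count over whole machine words at once, which is the kind of efficiency gain the abstract describes.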

Introduction to Tensor Decompositions and their Applications in Machine Learning

Tensors are multidimensional arrays of numerical values and therefore generalize matrices to multiple dimensions. While tensors first emerged in the psychometrics community in the 20th century, they have since then spread to numerous other disciplines, including machine learning. Tensors and their decompositions are especially beneficial in unsupervised learning settings, but are gaining popularity in other sub-disciplines like temporal and multi-relational data analysis, too. The scope of this paper is to give a broad overview of tensors, their decompositions, and how they are used in machine learning. As part of this, we are going to introduce basic tensor concepts, discuss why tensors can be considered more rigid than matrices with respect to the uniqueness of their decomposition, explain the most important factorization algorithms and their properties, provide concrete examples of tensor decomposition applications in machine learning, conduct a case study on tensor-based estimation of mixture models, talk about the current state of research, and provide references to available software libraries.
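
To make "tensor decomposition" concrete, the sketch below rebuilds a third-order tensor from three factor matrices in the CP (canonical polyadic) form, where each entry is T[i][j][k] = sum over r of A[i][r] * B[j][r] * C[k][r]. The abstract does not name specific decompositions, so using CP here is an assumption made for illustration; cp_reconstruct is a hypothetical helper written with plain nested lists rather than any particular tensor library.

```python
def cp_reconstruct(A, B, C):
    """Rebuild a third-order tensor from CP factor matrices.

    A is I x R, B is J x R, C is K x R (lists of rows). The result is an
    I x J x K nested list with T[i][j][k] = sum_r A[i][r] * B[j][r] * C[k][r].
    """
    rank = len(A[0])
    if any(len(row) != rank for M in (A, B, C) for row in M):
        raise ValueError("all factor matrices must have the same number of columns")
    return [[[sum(A[i][r] * B[j][r] * C[k][r] for r in range(rank))
              for k in range(len(C))]
             for j in range(len(B))]
            for i in range(len(A))]


if __name__ == "__main__":
    A = [[1, 0], [2, 1]]   # 2 x 2
    B = [[1, 1], [0, 2]]   # 2 x 2
    C = [[3, 1]]           # 1 x 2
    T = cp_reconstruct(A, B, C)
    print(T)               # a 2 x 2 x 1 tensor of rank at most 2
```

A decomposition algorithm works in the opposite direction: given T, it searches for factor matrices whose reconstruction matches T as closely as possible.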

Latent Factor Interpretations for Collaborative Filtering

Many machine learning systems utilize latent factors as internal representations for making predictions. However, since these latent factors are largely uninterpreted, predictions made using them are opaque. Collaborative filtering via matrix factorization is a prime example of such an algorithm that uses uninterpreted latent features, and yet has seen widespread adoption for many recommendation tasks. We present Latent Factor Interpretation (LFI), a method for interpreting models by leveraging interpretations of latent factors in terms of human-understandable features. The interpretation of latent factors can then replace the uninterpreted latent factors, resulting in a new model that expresses predictions in terms of interpretable features. This new model can then be interpreted using recently developed model explanation techniques. In this paper, we develop LFI for collaborative filtering based recommender systems, which are particularly challenging from an interpretation perspective. We illustrate the use of LFI interpretations on the MovieLens dataset demonstrating that latent factors can be predicted with enough accuracy for accurately replicating the predictions of the true model. Further, we demonstrate the accuracy of interpretations by applying the methodology to a collaborative recommender system using DB tropes and IMDB data and synthetic user preferences.

AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
BP-homology and an implication for symmetric polynomials
Highlighting objects of interest in an image by integrating saliency and depth
On products of Gaussian random variables
Learning from Longitudinal Face Demonstration – Where Tractable Deep Modeling Meets Inverse Reinforcement Learning
A Recursive Bayesian Approach To Describe Retinal Vasculature Geometry
Parametrised second-order complexity theory with applications to the study of interval computation
Deep Lesion Graphs in the Wild: Relationship Learning and Organization of Significant Radiology Image Findings in a Diverse Large-scale Lesion Database
Power Control and Scheduling In Low SNR Region In The Uplink of Two Cell Networks
From Cages to Trapping Sets and Codewords: A Technique to Derive Tight Upper Bounds on the Minimum Size of Trapping Sets and Minimum Distance of LDPC Codes
Hardness Results on Finding Leafless Elementary Trapping Sets and Elementary Absorbing Sets of LDPC Codes
Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations
Physics Informed Deep Learning (Part II): Data-driven Discovery of Nonlinear Partial Differential Equations
Deep learning analysis of breast MRIs for prediction of occult invasive disease in ductal carcinoma in situ
Martingale transform and Square function: some end-point weak weighted estimates
Estimation and Optimization of Composite Outcomes
Utilitarians Without Utilities: Maximizing Social Welfare for Graph Problems using only Ordinal Preferences – Full Version
Morsifications and mutations
Multilevel Bayesian Parameter Estimation in the Presence of Model Inadequacy and Data Uncertainty
An Overflow Free Fixed-point Eigenvalue Decomposition Algorithm: Case Study of Dimensionality Reduction in Hyperspectral Images
Merlin-Arthur with efficient quantum Merlin and quantum supremacy for the second level of the Fourier hierarchy
A recurrent neural network for classification of unevenly sampled variable stars
Optimal Dynamic Sensor Subset Selection for Tracking a Time-Varying Stochastic Process
Cycle double covers and non-separating cycles
Bulk diffusion in a kinetically constrained lattice gas
Nested distance for stagewise-independent processes
Symbolic vs. Bounded Synthesis for Petri Games
Limit theorems for functionals of two independent Gaussian processes
A Review on Cooperative Diversity Techniques Bypassing Channel Estimation
PSIque: Next Sequence Prediction of Satellite Images using a Convolutional Sequence-to-Sequence Network
Fractional approaches for the distribution of innovation sequence of INAR(1) processes
Intrinsic Analysis of the Sample Fréchet Mean and Sample Mean of Complex Wishart Matrices
Split-Decomposition Trees with Prime Nodes: Enumeration and Random Generation of Cactus Graphs
Limit theorems with rate of convergence under sublinear expectations
Online Knapsack Problem under Expected Capacity Constraint
Augmented Outcome-weighted Learning for Optimal Treatment Regimes
Non-orthogonal Multiple Access Assisted Multi-Region Geocast
Deep-Person: Learning Discriminative Deep Features for Person Re-Identification
Inner Product and Set Disjointness: Beyond Logarithmically Many Parties
An Adaptive Fuzzy-Based System to Simulate, Quantify and Compensate Color Blindness
Predicting readmission risk from doctors’ notes
Cost-Effective Seed Selection in Online Social Networks
Laplacian Controllability of Threshold Graphs
Image2Mesh: A Learning Framework for Single Image 3D Reconstruction
The game of plates and olives
$\mathbb{F}_{q}[G]$-modules and $G$-invariant codes
Invasion Percolation on Galton-Watson Trees
Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption
Arbitrary Facial Attribute Editing: Only Change What You Want
Multiplexing induced explosive synchronization in Kuramoto oscillators with inertia
On the Approximability and Hardness of the Minimum Connected Dominating Set with Routing Cost Constraint
Do Convolutional Neural Networks act as Compositional Nearest Neighbors?
Road Extraction by Deep Residual U-Net
IoT based Platform as a Service for Provisioning of Concurrent Applications
Interpretable Facial Relational Network Using Relational Importance
Grand unified theory for Tsetlin libraries
Orthogonal and symplectic Harish-Chandra integrals and matrix product ensembles
Local-Access Generators for Basic Random Graph Models
Small Drone Field Experiment: Data Collection & Processing
Backscatter Multiplicative Multiple-Access Systems: Fundamental Limits and Practical Design
Detailed proof of Nazarov’s inequality
FSRNet: End-to-End Learning Face Super-Resolution with Facial Priors
Speaker-Sensitive Dual Memory Networks for Multi-Turn Slot Tagging
End-to-End Optimization of Task-Oriented Dialogue Model with Deep Reinforcement Learning
Recursive Harmonic Numbers and Binomial Coefficients
Predicting the Popularity of Online Videos via Deep Neural Networks
Disorder Effects in Topological States — Brief Review of the Recent Developments
Deep Eyes: Binocular Depth-from-Focus on Focal Stack Pairs
A probabilistic approach to Dirac concentration in nonlocal models of adaptation with several resources
Tighter Lifting-Free Convex Relaxations for Quadratic Matching Problems
Photo-to-Caricature Translation on Faces in the Wild
Pipeline Generative Adversarial Networks for Facial Images Generation with Multiple Attributes
RoboJam: A Musical Mixture Density Network for Collaborative Touchscreen Interaction
Convolutional Neural Networks for Breast Cancer Screening: Transfer Learning with Exponential Decay
Stochastic Approximation on Riemannian manifolds
Learning nonlinear state-space models using smooth particle-filter-based likelihood approximations
Parameter-free $\ell_p$-Box Decoding of LDPC Codes
Leveraging Conversation Structure on Social Media to Identify Potentially Influential Users
Online Product Quantization
Energy Efficient Resource Allocation in Machine-to-Machine Communications with Multiple Access and Energy Harvesting for IoT
Partial Consensus and Conservative Fusion of Gaussian Mixtures for Distributed PHD Fusion
Topology optimization of multiple anisotropic materials, with application to self-assembling diblock copolymers
Automating Vehicles by Deep Reinforcement Learning using Task Separation with Hill Climbing
Bayesian Measurement Error Correction in Structured Additive Distributional Regression with an Application to the Analysis of Sensor Data on Soil-Plant Variability
Efficient exploration with Double Uncertain Value Networks
Reinforcement Learning To Adapt Speech Enhancement to Instantaneous Input Signal Quality
Blind estimation of white Gaussian noise variance in highly textured images
Saliency Weighted Convolutional Features for Instance Search
DeepSkeleton: Skeleton Map for 3D Human Pose Regression
Downlink Precoding with Mixed Statistical and Imperfect Instantaneous CSI for Massive MIMO Systems
A maximizing characteristic for critical configurations of chip-firing games on digraphs
A new fMRI data analysis method using cross validation: Negative BOLD responses may be the deactivations of interneurons
The Alon-Tarsi number of planar graphs
Objective Bayesian inference with proper scoring rules
On a Greedy Algorithm to Construct Universal Cycles for Permutations
Bayesian Simultaneous Estimation for Means in k Sample Problems
Compression for Smooth Shape Analysis
Curriculum Q-Learning for Visual Vocabulary Acquisition
Behavior of Wireless Body-to-Body Networks Routing Strategies for Public Protection and Disaster Relief
Data Dissemination Strategies for Emerging Wireless Body-to-Body Networks based Internet of Humans
Semi-Supervised Few-Shot Learning with Prototypical Networks
Sparse Photometric 3D Face Reconstruction Guided by Morphable Models
PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation
Occlusion-aware Hand Pose Estimation Using Hierarchical Mixture Density Network
Faster ICA under orthogonal constraint
On sets of points with few odd secants
High-performance Implementation of Matrix-free High-order Discontinuous Galerkin Methods
A review of asymptotic theory of estimating functions
Distributed Sweep Coverage Algorithm of Multi-agent Systems Using Workload Memory
On S-packing edge-colorings of cubic graphs
Deep Reinforcement Learning for De-Novo Drug Design
Gaussian Processes for Demand Unconstraining
Learning Spatio-temporal Features with Partial Expression Sequences for on-the-Fly Prediction
Joint Blind Motion Deblurring and Depth Estimation of Light Field
Generalizing Virtual Values to Multidimensional Auctions: a Non-Myersonian Approach
Dynamical systems associated to the $β$-core in the repeated prisoner’s dilemma
Deep Image Prior
Particle Optimization in Stochastic Gradient MCMC
Strong and safe Nash equilibrium in some repeated 3-player games
Two types of criticality in the brain
Learning Interesting Categorical Attributes for Refined Data Exploration
NPC: Neighbors Progressive Competition Algorithm for Classification of Imbalanced Data Sets
Forest-based methods and ensemble model output statistics for rainfall ensemble forecasting
Dimension Reduction for Robust Covariate Shift Correction
A Generative Model of 3D Object Layouts in Apartments
Extended Poisson INAR(1) processes with equidispersion, underdispersion and overdispersion
Intelligent Traffic Light Control Using Distributed Multi-agent Q Learning
A Robust Time-Domain Beam Alignment Scheme for Multi-User Wideband mmWave Systems
Connectivity jamming game for physical layer attack in peer to peer networks
Now Playing: Continuous low-power music recognition
Saccade Sequence Prediction: Beyond Static Saliency Maps
Colour Constancy: Biologically-inspired Contrast Variant Pooling Mechanism
The first order convergence law fails for random perfect graphs
Bayesian analysis of finite population sampling in multivariate co-exchangeable structures with separable covariance matric
Discrete Morse-Bott theory for CW complexes
Balanced Spanning Caterpillars
An Uncertainty Principle for Estimates of Floquet Multipliers
A Centralized Reputation Management Scheme for Isolating Malicious Controller(s) in Distributed Software-Defined Networks
A Family of Iterative Gauss-Newton Shooting Methods for Nonlinear Optimal Control
BPS/CFT correspondence IV: sigma models and defects in gauge theory
HoME: a Household Multimodal Environment
PDE-Based Optimization for Stochastic Mapping and Coverage Strategies using Robotic Ensembles
A Novel Data-Driven Framework for Risk Characterization and Prediction from Electronic Medical Records: A Case Study of Renal Failure
A Benchmarking Environment for Reinforcement Learning Based Task Oriented Dialogue Management
Embedding Words as Distributions with a Bayesian Skip-gram Model
Formation of large-scale random structure by competitive erosion
On the Parameterized Complexity of Approximating Dominating Set