What's new on arXiv

• Doctor AI: Predicting Clinical Events via Recurrent Neural Networks

Large amounts of Electronic Health Record (EHR) data have been collected from millions of patients over multiple years. This rich longitudinal EHR data documents the collective experience of physicians, including diagnoses, medication prescriptions and procedures. We argue it is now possible to leverage EHR data to model how physicians behave, and we call our model Doctor AI. Towards this goal of modeling the clinical behavior of physicians, we develop a successful application of Recurrent Neural Networks (RNN) to jointly forecast future disease diagnoses and medication prescriptions along with their timing. Unlike a traditional classification model where a single target is of interest, our model can assess the entire history of a patient and make continuous, multilabel predictions based on that history. We evaluate the performance of the proposed method on a large real-world EHR dataset covering 250K patients over 8 years. We observe that Doctor AI achieves up to 79% recall@30, significantly higher than several baselines.
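The recall@30 figure is a top-k recall for multilabel prediction. The sketch below shows one common way to compute recall@k for a single prediction step; the abstract does not spell out the paper's exact evaluation protocol, so the function name `recall_at_k`, the toy scores and the label indices are all illustrative assumptions.

```python
def recall_at_k(scores, true_labels, k):
    """Fraction of the true labels that appear among the k highest-scoring labels.

    scores:      one predicted score per candidate label (index = label id)
    true_labels: set of label ids that actually occurred (hypothetical data)
    """
    if not true_labels:
        raise ValueError("true_labels must be non-empty")
    # Rank label ids by descending score; ties are broken by lower id.
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    top_k = set(ranked[:k])
    hits = sum(1 for label in true_labels if label in top_k)
    return hits / len(true_labels)


# Toy example: 6 candidate codes, 2 of which occur at the next visit.
scores = [0.10, 0.70, 0.05, 0.60, 0.20, 0.30]
true_labels = {1, 4}
print(recall_at_k(scores, true_labels, k=2))  # 0.5
print(recall_at_k(scores, true_labels, k=4))  # 1.0
```

With k=2 only label 1 is among the top-ranked codes, so half of the true labels are recovered; widening to k=4 also captures label 4.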

• Metric Learning with Adaptive Density Discrimination

Distance metric learning (DML) approaches learn a transformation to a representation space where distance is in correspondence with a predefined notion of similarity. While such models offer a number of compelling benefits, it has been difficult for these to compete with modern classification algorithms in performance and even in feature extraction. In this work, we propose a novel approach explicitly designed to address a number of subtle yet important issues which have stymied earlier DML algorithms. It maintains an explicit model of the distributions of the different classes in representation space. It then employs this knowledge to adaptively assess similarity, and achieve local discrimination by penalizing class distribution overlap. We demonstrate the effectiveness of this idea on several tasks. Our approach achieves state-of-the-art classification results on a number of fine-grained visual recognition datasets, surpassing the standard softmax classifier and outperforming triplet loss by a relative margin of 30-40%. In terms of computational performance, it alleviates training inefficiencies in the traditional triplet loss, reaching the same error in 5-30 times fewer iterations. Beyond classification, we further validate the saliency of the learnt representations via their attribute concentration and hierarchy recovery properties, achieving 10-25% relative gains on the softmax classifier and 25-50% on triplet loss in these tasks.
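For context, the triplet loss used as a baseline above can be sketched as follows. This is the standard hinge formulation on squared Euclidean distances, not the method proposed in the paper; the margin value and the toy embeddings are assumptions chosen for illustration.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss that is zero once the negative is at least `margin`
    farther (in squared distance) from the anchor than the positive."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


anchor = (0.0, 0.0)
positive = (1.0, 0.0)       # same class as the anchor
far_negative = (3.0, 0.0)   # different class, already well separated
near_negative = (0.5, 0.0)  # different class, closer than the positive

print(triplet_loss(anchor, positive, far_negative))   # 0.0
print(triplet_loss(anchor, positive, near_negative))  # 1.75
```

Only triplets that violate the margin contribute a gradient, which is one reason training with this loss can need many iterations: most randomly drawn triplets, like the first one here, yield zero loss.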

• Probabilistic K-Means using Method of Moments

K-means is one of the most widely used clustering algorithms in data mining applications. It attempts to minimize the sum of squared Euclidean distances of the points in each cluster from the respective cluster mean. The simplicity and scalability of K-means make it very appealing. However, K-means suffers from the local minima problem and comes with no guarantee of converging to the optimal cost. K-means++ tries to address the problem by seeding the means using a distance-based sampling scheme. However, seeding the means in K-means++ needs O(K) passes through the entire dataset, which can be very costly on large datasets. Here we propose a method of seeding initial means based on higher-order moments of the data, which takes O(1) passes through the entire dataset to extract the initial set of means. Our method yields competitive performance with respect to all the existing K-means algorithms, whilst avoiding the expensive mean selection steps of K-means++ and other heuristics. We demonstrate the performance of our algorithm in comparison with the existing algorithms on various benchmark datasets.
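To see where the O(K) passes come from, here is a minimal sketch of K-means++ style seeding (sampling each new mean with probability proportional to its squared distance from the nearest mean chosen so far). It illustrates the baseline, not the moment-based method proposed in the paper; the function names, the pass counter and the toy points are assumptions.

```python
import random


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeanspp_seed(points, k, rng):
    """Pick k initial means by D^2 sampling; return (means, full_passes)."""
    means = [rng.choice(points)]
    nearest = [float("inf")] * len(points)
    passes = 0
    while len(means) < k:
        # One full pass over the data: refresh each point's squared
        # distance to its nearest mean, given the newest mean.
        nearest = [min(d, squared_distance(p, means[-1]))
                   for p, d in zip(points, nearest)]
        passes += 1
        if sum(nearest) == 0:
            # Every point coincides with a chosen mean; fall back to uniform.
            means.append(rng.choice(points))
        else:
            # Sample the next mean with probability proportional to nearest[i].
            means.append(rng.choices(points, weights=nearest, k=1)[0])
    return means, passes


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0), (10.1, 0.0)]
means, passes = kmeanspp_seed(points, k=3, rng=random.Random(0))
print(means, passes)
```

Each additional mean requires a fresh pass to update the distances, so seeding K means costs K-1 passes in this sketch. That linear dependence on K is the cost the abstract aims to replace with a constant number of passes.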

• A note on probability metrics in a categorical setting

Probability metrics constitute an important tool in probability theory and statistics \cite{DKS91}, \cite{R91}, \cite{Z83} as they are specific metrics on spaces of random variables which, by satisfying an extra condition, concord well with the randomness structure. But probability metrics suffer from the same instability under constructions as metrics. In \cite{L15}, as well as in former and related work which can be found in the references of \cite{L15}, a comprehensive setting was developed to deal with this. It is the purpose of this note to point out that these ideas can also be applied to probability metrics thus embedding them in a natural categorical framework, showing that certain constructions performed in the setting of probability theory are in fact categorical in nature. This allows us to deduce various separate results in the literature from a unified approach.

• A Random Forest Guided Tour

The random forest algorithm, proposed by L. Breiman in 2001, has been extremely successful as a general-purpose classification and regression method. The approach, which combines several randomized decision trees and aggregates their predictions by averaging, has shown excellent performance in settings where the number of variables is much larger than the number of observations. Moreover, it is versatile enough to be applied to large-scale problems, is easily adapted to various ad-hoc learning tasks, and returns measures of variable importance. The present article reviews the most recent theoretical and methodological developments for random forests. Emphasis is placed on the mathematical forces driving the algorithm, with special attention given to the selection of parameters, the resampling mechanism, and variable importance measures. This review is intended to provide non-experts easy access to the main ideas.
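The core idea of combining randomized trees and averaging their predictions can be sketched in a few lines. This is a deliberately simplified illustration, not Breiman's full algorithm: each "tree" is a depth-one regression stump fitted on a bootstrap sample using one randomly chosen feature, and all names and data below are hypothetical.

```python
import random


def fit_stump(rows, targets, feature):
    """Best single-threshold split on one feature, minimizing squared error."""
    best = None
    values = sorted({r[feature] for r in rows})
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [t for r, t in zip(rows, targets) if r[feature] <= threshold]
        right = [t for r, t in zip(rows, targets) if r[feature] > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((t - left_mean) ** 2 for t in left)
                 + sum((t - right_mean) ** 2 for t in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # Feature is constant in this sample: always predict the sample mean.
        mean = sum(targets) / len(targets)
        return (feature, float("inf"), mean, mean)
    return (feature, best[1], best[2], best[3])


def predict_stump(stump, row):
    feature, threshold, left_mean, right_mean = stump
    return left_mean if row[feature] <= threshold else right_mean


def fit_forest(rows, targets, n_trees, rng):
    """Fit n_trees stumps, each on a bootstrap sample and a random feature."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))           # random feature choice
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    """Aggregate by averaging the individual stump predictions."""
    return sum(predict_stump(s, row) for s in forest) / len(forest)


rows = [(0.0, 1.0), (1.0, 0.0), (2.0, 1.0), (3.0, 0.0)]
targets = [0.0, 0.0, 10.0, 10.0]
forest = fit_forest(rows, targets, n_trees=50, rng=random.Random(0))
print(predict_forest(forest, (0.0, 1.0)), predict_forest(forest, (3.0, 0.0)))
```

The two sources of randomness here, bootstrap resampling and random feature selection, correspond to the resampling mechanism and parameter choices the review singles out for attention. A practical forest would grow deeper trees and consider several candidate features at every split.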

• Prioritized Experience Replay

• Staleness-aware Async-SGD for Distributed Deep Learning

• Least squares estimation for the subcritical Heston model based on continuous time observations

• ACDC: A Structured Efficient Linear Layer

• Weighted multiple ergodic averages and correlation sequences

• Unitary-Group Invariant Kernels and Features from Transformed Unlabeled Data

• From generalized Tamari intervals to non-separable planar maps (extended abstract)

• On the Global Linear Convergence of Frank-Wolfe Optimization Variants

• Combining Neural Networks and Log-linear Models to Improve Relation Extraction

• Bayesian quantile regression analysis for continuous data with a discrete component at zero

• Automatic Region-wise Spatially Varying Coefficient Regression Model: an Application to National Cardiovascular Disease Mortality and Air Pollution Association Study

• Mean-Field interaction of Brownian occupation measures. II: A rigorous construction of the Pekar process

• Behavior Query Discovery in System-Generated Temporal Graphs

• Comparison of viscosity solutions of fully nonlinear degenerate parabolic Path-dependent PDEs

• Censoring Representations with an Adversary

• Infinite excursions of router walks on regular trees

• Randomization can be as helpful as a glimpse of the future in online computation

• Generation of scenarios from calibrated ensemble forecasts with a dynamic ensemble copula coupling approach

• On Mäkelä’s Conjectures: deciding if a morphic word avoids long abelian-powers

• Fast Saddle-Point Algorithm for Generalized Dantzig Selector and FDR Control with the Ordered $\ell_1$-Norm

• Matrix-Ball Construction of affine Robinson-Schensted correspondence

• Anomalous Contagion and Renormalization in Dynamical Networks with Nodal Mobility

• Trees with small b-chromatic index

• The Hopf Algebra of graph invariants

• On an adaptive preconditioned Crank-Nicolson algorithm for infinite dimensional Bayesian inferences

• Using Machine Learning to Predict the Outcome of English County twenty over Cricket Matches

• Alternative Markov Properties for Acyclic Directed Mixed Graphs

• The relationship between internet user type and user performance when carrying out simple vs. complex search tasks

• A Framework for Evaluating the Retrieval Effectiveness of Search Engines

• The retrieval effectiveness of search engines on navigational queries

• The Influence of Commercial Intent of Search Results on Their Perceived Relevance

• Ranking library materials

• What Users See – Structures in Search Engine Results Pages

• The Retrieval Effectiveness of Web Search Engines: Considering Results Descriptions

• Problems with the use of Web search engines to find results in foreign languages

• Metric learning for graph-based label propagation

• The historical Moran model

• Nonparametric estimation for irregularly sampled Lévy processes

• Cache-Conscious Run-time Decomposition of Data Parallel Computations

• Uniqueness of the extreme cases in theorems of Drisko and Erdős-Ginzburg-Ziv

• Toward Transparent Heterogeneous Systems

• Continued Classification of 3D Lattice Walks in the Positive Octant

• Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction

• Solution Repair/Recovery in Uncertain Optimization Environment

• Penalized complexity priors for degrees of freedom in Bayesian P-splines

• Infinite-dimensional calculus under weak spatial regularity of the processes

• Sparse learning of maximum likelihood model for optimization of complex loss function

• The classical-quantum divergence of complexity in the Ising spin chain

• On the minimum of a conditioned Brownian bridge

• Dummy variables and their interactions in regression analysis: examples from research on body mass index

• Generation and motion of interfaces in one-dimensional stochastic Allen-Cahn equation

• Online learning in repeated auctions

• Using Abduction in Markov Logic Networks for Root Cause Analysis

• Complex-Valued Gaussian Processes for Regression: A Widely Non-Linear Approach

• Efficient Output Kernel Learning for Multiple Tasks

• Hyperspectral Unmixing in Presence of Endmember Variability, Nonlinearity or Mismodelling Effects

• One to rule them all: a general method for fast computation on semirings isomorphic to $(\times, \max)$ on $\mathbb{R}_+$

• A Distribution Adaptive Framework for Prediction Interval Estimation Using Nominal Variables

• Preimages under the Stack-Sorting Algorithm

• Wishart Mechanism for Differentially Private Principal Components Analysis

• Expressiveness of Rectifier Networks

• Discovering Underlying Plans Based on Distributed Representations of Actions

• Bayesian hypothesis testing for one bit compressed sensing with sensing matrix perturbation

• Learning Discriminative Representations for Semantic Cross Media Retrieval

• Why are deep nets reversible: A simple theory, with implications for training

• Tree-Guided MCMC Inference for Normalized Random Measure Mixture Models

• The Invisible Hand of Dynamic Market Pricing

• Adversarial Autoencoders

• A New Smooth Approximation to the Zero One Loss with a Probabilistic Interpretation

• Net2Net: Accelerating Learning via Knowledge Transfer

• Discrete one-dimensional oriented percolation of intervals

• Competitive Multi-scale Convolution

• Local entropy as a measure for sampling solutions in Constraint Satisfaction Problems

• Two laws of large numbers for sublinear expectations

• Marginalized Two Part Models for Generalized Gamma Family of Distributions

• MOEA/D-GM: Using probabilistic graphical models in MOEA/D for solving combinatorial optimization problems

• Predicting distributions with Linearizing Belief Networks

• Learning Structured Inference Neural Networks with Label Relations

• A Bayesian Semiparametric Framework for Understanding and Predicting Customer Base Dynamics

• A Block Regression Model for Short-Term Mobile Traffic Forecasting

• Co-modularity and Co-community Detection in Large Networks

• Identifying the Absorption Bump with Deep Learning

• Rescue of endemic states in interconnected networks with adaptive coupling

• blavaan: Bayesian structural equation models via parameter expansion

• Semiparametric Estimation of CES Demand System with Observed and Unobserved Product Characteristics

AnalytiXon

~ Broaden your Horizon
