Incorporating Feedback into Tree-based Anomaly Detection

Anomaly detectors are often used to produce a ranked list of statistical anomalies, which are examined by human analysts in order to extract the actual anomalies of interest. Unfortunately, in real-world applications, this process can be exceedingly difficult for the analyst since a large fraction of high-ranking anomalies are false positives and not interesting from the application perspective. In this paper, we aim to make the analyst’s job easier by allowing for analyst feedback during the investigation process. Ideally, the feedback influences the ranking of the anomaly detector in a way that reduces the number of false positives that must be examined before discovering the anomalies of interest. In particular, we introduce a novel technique for incorporating simple binary feedback into tree-based anomaly detectors. We focus on the Isolation Forest algorithm as a representative tree-based anomaly detector, and show that we can significantly improve its performance by incorporating feedback, when compared with the baseline algorithm that does not incorporate feedback. Our technique is simple and scales well as the size of the data increases, which makes it suitable for interactive discovery of anomalies in large datasets.

A Stochastic Analysis of a Network with Two Levels of Service

In this paper, a stochastic model of a call center with a two-level architecture is analyzed. A first-level pool of operators answers calls, then identifies and handles the non-urgent ones. A call classified as urgent has to be transferred to specialized operators at the second level. When the operators of the second level are all busy, the first-level operator handling the urgent call is blocked until an operator at the second level is available. Under a scaling assumption, the evolution of the number of urgent calls blocked at level 1 is investigated. It is shown that if the ratio of the number of operators at level 2 to the number at level 1 is greater than some threshold, then, essentially, the system operates without congestion: with probability close to 1, no urgent call is blocked after some finite time. Otherwise, we prove that a positive fraction of the operators of the first level are blocked due to the congestion of the second level. Stochastic calculus with Poisson processes, coupling arguments and formulations in terms of Skorokhod problems are the main mathematical tools to establish these convergence results.
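The blocking mechanism described above can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the abstract: Poisson arrivals, exponential service times at both levels, an independent probability `p_urgent` that a call is urgent, and loss of any call that arrives when no first-level operator is free. The function name and all parameter names are hypothetical, and the code does not reproduce the paper's scaling regime or its proofs.

```python
import random


def simulate_two_level(n1, n2, arrival_rate, mu1, mu2, p_urgent, horizon, seed=0):
    """Return the time-averaged fraction of level-1 operators that are blocked.

    State variables:
      busy1   - level-1 operators currently talking to a caller
      blocked - level-1 operators holding an urgent call while level 2 is full
      busy2   - level-2 operators currently serving an urgent call
    Assumption (not from the source): a call arriving when no level-1
    operator is free is lost.
    """
    assert n1 >= 1 and n2 >= 1 and arrival_rate > 0 and mu1 > 0 and mu2 > 0
    rng = random.Random(seed)
    t = 0.0
    busy1 = blocked = busy2 = 0
    blocked_time = 0.0  # integral of `blocked` over [0, horizon]

    while True:
        free1 = n1 - busy1 - blocked
        rate_arrival = arrival_rate if free1 > 0 else 0.0
        rate_end1 = mu1 * busy1  # some level-1 conversation ends
        rate_end2 = mu2 * busy2  # some level-2 service ends
        total = rate_arrival + rate_end1 + rate_end2

        dt = rng.expovariate(total)  # time until the next event
        if t + dt >= horizon:
            blocked_time += blocked * (horizon - t)
            break
        blocked_time += blocked * dt
        t += dt

        u = rng.random() * total  # pick which event occurred
        if u < rate_arrival:
            busy1 += 1
        elif u < rate_arrival + rate_end1:
            busy1 -= 1
            if rng.random() < p_urgent:
                if busy2 < n2:
                    busy2 += 1    # transfer succeeds, level-1 operator is freed
                else:
                    blocked += 1  # level 2 is full, level-1 operator is blocked
        else:
            if blocked > 0:
                blocked -= 1      # a waiting urgent call takes the freed slot
            else:
                busy2 -= 1

    return blocked_time / (horizon * n1)


if __name__ == "__main__":
    for n2 in (2, 10, 1000):
        frac = simulate_two_level(n1=20, n2=n2, arrival_rate=50.0, mu1=1.0,
                                  mu2=0.5, p_urgent=0.3, horizon=200.0, seed=1)
        print(n2, round(frac, 3))
```

Varying `n2` for a fixed `n1` gives a rough feel for the dichotomy stated in the abstract: with few second-level operators a sizeable share of the first level sits blocked, while a large second level leaves it essentially unblocked.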

Measure differential equations

A new type of differential equation for probability measures on Euclidean spaces, called Measure Differential Equations (briefly MDEs), is introduced. MDEs correspond to Probability Vector Fields, which map measures on a Euclidean space to measures on its tangent bundle. Solutions are understood in the weak sense, and existence, uniqueness and continuous dependence results are proved under suitable conditions. The latter are expressed in terms of the Wasserstein metric on the base and fiber of the tangent bundle. MDEs represent a natural measure-theoretic generalization of Ordinary Differential Equations via a monoid morphism mapping sums of vector fields to fiber convolution of the corresponding Probability Vector Fields. Various examples, including finite-speed diffusion and concentration, are shown, together with relationships to Partial Differential Equations. Finally, MDEs are also natural mean-field limits of multi-particle systems, with convergence results extending the classical Dobrushin approach.
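As a sketch of what "weak sense" could mean here, assume the usual formulation against test functions; the notation below is an assumption and is not taken from the abstract. Writing $V[\mu]$ for the Probability Vector Field evaluated at $\mu$, and $\mu_t$ for the solution at time $t$, the equation $\dot\mu = V[\mu]$ would be read as:

```latex
\frac{d}{dt}\int_{\mathbb{R}^n} f(x)\,d\mu_t(x)
  = \int_{T\mathbb{R}^n} \nabla f(x)\cdot v \; dV[\mu_t](x,v),
\qquad f \in C_c^\infty(\mathbb{R}^n).
```

Under this reading, choosing $V[\mu] = \mu \otimes \delta_{v(x)}$ for an ordinary vector field $v$ reduces the right-hand side to $\int \nabla f(x)\cdot v(x)\,d\mu_t(x)$, which is the weak form of the continuity equation $\partial_t \mu + \nabla\cdot(v\mu) = 0$ associated with the ODE $\dot x = v(x)$.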

Designing Strassen’s algorithm
Machine Learning Topological Invariants with Neural Networks
Fast(er) Exact Decoding and Global Training for Transition-Based Dependency Parsing via a Minimal Feature Set
Secure Communications for the Two-user Broadcast Channel with Random Traffic
Seasonal Effects on Honey Bee Population Dynamics: a Nonautonomous System of Difference Equations
LangPro: Natural Language Theorem Prover
Proposal for a fully decentralized blockchain and proof-of-work algorithm for solving NP-complete problems
End-to-end Training for Whole Image Breast Cancer Diagnosis using An All Convolutional Design
Sample path properties of permanental processes
Stability of patterns in the Abelian sandpile
A Scalable and Statistically Robust Beam Alignment Technique for mm-Wave Systems
Effects of Arrival Type and Degree of Saturation on Queue Length Estimation at Signalized Intersections
Transmission clusters in the HIV-1 epidemic among men who have sex with men in Montreal, Quebec, Canada
On the Upper Limit of Separability
Learning Fine-Grained Knowledge about Contingent Relations between Everyday Events
Inference of Fine-Grained Event Causality from Blogs and Films
An algorithm to simulate alternating Turing machine in signal machine
A Probabilistic proof of the breakdown of Besov regularity in $L$-shaped domains
The Hammersley-Welsh bound for self-avoiding walk revisited
Isotonic regression in general dimensions
Characterizing Migration Dynamics with Convolution-Based Movement Models
A Compressive Sensing Approach to Community Detection with Applications
Graphical Lasso and Thresholding: Equivalence and Closed-form Solutions
Dynamic Bayesian Influenza Forecasting in the United States with Hierarchical Discrepancy
Learning Invariant Riemannian Geometric Representations Using Deep Nets
On Scheduling a Photolithography Area Containing Cluster Tools
Automatically Generating Commit Messages from Diffs using Neural Machine Translation
Fast scrambling in holographic Einstein-Podolsky-Rosen pair
Integer sorting on multicores: some (experiments and) observations
Inferring Narrative Causality between Event Pairs in Films
Unsupervised Induction of Contingent Event Pairs from Film Scenes
Finite Sample Inference for Targeted Learning
Estimation in Semiparametric Quantile Factor Models
Leveraging Deep Neural Network Activation Entropy to cope with Unseen Data in Speech Recognition
Upper and Lower Bounds on the Capacity of Amplitude-Constrained MIMO Channels
Dynamic Asset Price Jumps and the Performance of High Frequency Tests and Measures
Action Classification and Highlighting in Videos
Decompositions of amplituhedra
Non-invasive acquisition of fetal ECG from the maternal thorax: a feasibility study and a call for open data sets
Apéry sets of shifted numerical monoids
Learning a Generative Adversarial Network for High Resolution Artwork Synthesis
Confidence Intervals that Utilize Uncertain Prior Information about Exogeneity in Panel Data
Video Summarization with Attention-Based Encoder-Decoder Networks
Differentiable cellular automata
Thresholds Optimization for One-Bit Feedback Multi-User Scheduling
Translations in the exponential Orlicz space with Gaussian weight
Determinantal Multivariate Polynomials
Characterization of Determinantal Bivariate Polynomials
Universal simplicial complexes inspired by toric topology
Social Network Analysis Using Coordination Games
Anagram-free colourings of graph subdivisions
Exponentially many nowhere-zero $Z_3$-, $Z_4$-, and $Z_6$-flows
Fast Landmark Localization with 3D Component Reconstruction and CNN for Cross-Pose Recognition
One-loop topological expansion for spin glasses in the large connectivity limit
ICDAR2017 Competition on Reading Chinese Text in the Wild (RCTW-17)
SINR-Threshold Scheduling with Binary Power Control for D2D Networks
The exact asymptotics of the large deviation probabilities in the multivariate boundary crossing problem
On the Distribution and Model Selection Properties of the Lasso Estimator in Low and High Dimensions
Identifying Products in Online Cybercrime Marketplaces: A Dataset for Fine-grained Domain Adaptation
Einstein relation and linear response in one-dimensional Mott variable-range hopping
Moments of the multivariate Gaussian complex random variable
Optimal Distributed Control of Multi-agent Systems in Contested Environments via Reinforcement Learning
ALCN: Meta-Learning for Contrast Normalization Applied to Robust 3D Pose Estimation
Some Liouville-type results for eigenfunctions of elliptic operators
Automatic Semantic Style Transfer using Deep Convolutional Neural Networks and Soft Masks
Neural Class-Specific Regression for face verification
Abnormal Event Detection in Videos using Generative Adversarial Nets
Key features of Turing systems are determined purely by network topology
Sensitivity and Robustness of Quantum Spin-1/2 Rings to Parameter Uncertainty
Speeding up non-Markovian First Passage Percolation with a few extra edges
Simple Compact Monotone Tree Drawings
A comment on Stein’s unbiased risk estimate for reduced rank estimators
Quality Enhancement by Weighted Rank Aggregation of Crowd Opinion
Enhancing and comparing methods for the detection of fishing activity from Vessel Monitoring System data
Generating Video Descriptions with Topic Guidance
Video Captioning with Guidance of Multimodal Latent Topics
Regime-Specific Multi-Objective Variational Principle of Compromise in Competition at Mesoscales
Tunneling behavior of Ising and Potts models on grid graphs
An Incremental Redundancy HARQ Scheme for Polar Code
Performance of SiPMs in the nonlinear region
On Boosting, Tug of War, and Lexicographic Programming
Quantifying Facial Age by Posterior of Age Comparisons
Visualizing Co-Phylogenetic Reconciliations
General Robust Bayes Pseudo-Posterior: Exponential Convergence results with Applications
Robust Wald-Type Tests under Random Censoring
Bounds on entanglement dimensions and quantum graph parameters via noncommutative polynomial optimization
Walk entropy and walk-regularity
Human and Machine Judgements for Russian Semantic Relatedness
Algorithmic patterns for $\mathcal{H}$-matrices on many-core processors
Sketching the order of events
Greene–Kleitman invariants for Sulzgruber insertion
Gaussian Processes for HRF estimation for BOLD fMRI
Disjoint Dominating Sets with a Perfect Matching
Zero-sum $K_m$ over $\mathbb{Z}$ and the story of $K_4$
Few Sequence Pairs Suffice: Representing All Rectangle Placements
Learning Lexico-Functional Patterns for First-Person Affect
Design and Analysis of the NIPS 2016 Review Process
Proactive Eavesdropping via Jamming over HARQ-Based Communications
Transfer Learning across Low-Resource, Related Languages for Neural Machine Translation
Box polynomials and the excedance matrix
Efficient tracking of a growing number of experts
On Baire Measurable Colorings of Group Actions
Better Decision Making in Drug Development Through Adoption of Formal Prior Elicitation
On the Bayesian calibration of expensive computer models with input dependent parameters
Inferring Human Activities Using Robust Privileged Probabilistic Learning
Walking Through Waypoints
Model based learning for accelerated, limited-view 3D photoacoustic tomography
3D Visual Perception for Self-Driving Cars using a Multi-Camera System: Calibration, Mapping, Localization, and Obstacle Detection
Reconstructing Generalized Staircase Polygons with Uniform Step Length
Predicting Cardiovascular Risk Factors from Retinal Fundus Photographs using Deep Learning
An analysis of the Act 43 Wisconsin Assembly district map using the $\sqrt{\varepsilon}$ test