If you did not already know

Contextual Bilateral Loss (CoBi)
This paper shows that when applying machine learning to digital zoom for photography, it is beneficial to use real, RAW sensor data for training. Existing learning-based super-resolution methods do not use real sensor data, instead operating on RGB images. In practice, these approaches result in loss of detail and accuracy in their digitally zoomed output when zooming in on distant image regions. We also show that synthesizing sensor data by resampling high-resolution RGB images is an oversimplified approximation of real sensor data and noise, resulting in worse image quality. The key barrier to using real sensor data for training is that ground truth high-resolution imagery is missing. We show how to obtain the ground-truth data with optically zoomed images and contribute a dataset, SR-RAW, for real-world computational zoom. We use SR-RAW to train a deep network with a novel contextual bilateral loss (CoBi) that delivers critical robustness to mild misalignment in input-output image pairs. The trained network achieves state-of-the-art performance in 4X and 8X computational zoom. …

Uncertainty-Aware Feature Selection (UAFS)
Missing data are a concern in many real world data sets and imputation methods are often needed to estimate the values of missing data, but data sets with excessive missingness and high dimensionality challenge most approaches to imputation. Here we show that appropriate feature selection can be an effective preprocessing step for imputation, allowing for more accurate imputation and subsequent model predictions. The key feature of this preprocessing is that it incorporates uncertainty: by accounting for uncertainty due to missingness when selecting features we can reduce the degree of missingness while also limiting the number of uninformative features being used to make predictive models. We introduce a method to perform uncertainty-aware feature selection (UAFS), provide a theoretical motivation, and test UAFS on both real and synthetic problems, demonstrating that across a variety of data sets and levels of missingness we can improve the accuracy of imputations. Improved imputation due to UAFS also results in improved prediction accuracy when performing supervised learning using these imputed data sets. Our UAFS method is general and can be fruitfully coupled with a variety of imputation methods. …

Deep Feature Fusion-Audio and Text Modal Fusion (DFF-ATMF)
Sentiment analysis research has developed rapidly in the last decade and has attracted widespread attention from academia and industry, but most of it is based on text. However, information in the real world usually comes in different modalities. In this paper, we consider the task of Multimodal Sentiment Analysis using Audio and Text Modalities, and propose a novel fusion strategy, including Multi-Feature Fusion and Multi-Modality Fusion, to improve the accuracy of Audio-Text Sentiment Analysis. We call this the Deep Feature Fusion-Audio and Text Modal Fusion (DFF-ATMF) model, and the features learned from it are complementary to each other and robust. Experiments with the CMU-MOSI corpus and the recently released CMU-MOSEI corpus for YouTube video sentiment analysis show very competitive results for our proposed model. Surprisingly, our method also achieved state-of-the-art results on the IEMOCAP dataset, indicating that our proposed fusion strategy also generalizes extremely well to Multimodal Emotion Recognition. …

Distance Metric Learned Collaborative Representation Classifier (DML-CRC)
Any generic deep machine learning algorithm is essentially a function fitting exercise, where the network tunes its weights and parameters to learn discriminatory features by minimizing some cost function. Though the network tries to learn the optimal feature space, it seldom tries to learn an optimal distance metric in the cost function, and hence misses out on an additional layer of abstraction. We present a simple, effective way of achieving this by learning a generic Mahalanobis distance in a collaborative loss function in an end-to-end fashion with any standard convolutional network as the feature learner. The proposed method, DML-CRC, gives state-of-the-art performance on the benchmark fine-grained classification datasets CUB Birds, Oxford Flowers and Oxford-IIIT Pets using the VGG-19 deep network. The method is network agnostic and can be used for any similar classification tasks. …
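As a rough illustration of the distance being learned, here is a minimal sketch of a Mahalanobis distance for a fixed matrix. The parameterization M = LᵀL (which keeps M positive semidefinite) and the names `mahalanobis`, `identity`, and `stretch` are assumptions made for this example. The abstract does not say how DML-CRC parameterizes or trains the metric, and the collaborative loss is not shown.

```python
import math

def mahalanobis(x, y, L):
    """Return sqrt((x - y)^T M (x - y)) with M = L^T L (assumed parameterization)."""
    diff = [a - b for a, b in zip(x, y)]
    # Project the difference with L; the squared norm of L @ diff equals diff^T M diff.
    proj = [sum(l_ij * d for l_ij, d in zip(row, diff)) for row in L]
    return math.sqrt(sum(p * p for p in proj))

# With L = I the metric reduces to the ordinary Euclidean distance.
identity = [[1.0, 0.0], [0.0, 1.0]]
# A hypothetical "learned" L that stretches the first feature axis by 2.
stretch = [[2.0, 0.0], [0.0, 1.0]]

d_euclid = mahalanobis([0.0, 0.0], [3.0, 4.0], identity)
d_stretch = mahalanobis([0.0, 0.0], [3.0, 4.0], stretch)
```

In an end-to-end setting the entries of L would be trainable parameters updated together with the network weights. Here they are fixed by hand only to show how the metric reshapes distances.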

Magister Dixit

“And that’s where the statistician needs to take it easy:
• Start with the results, so the audience has a clear view on the outcome
• Proceed to explain the analysis simply and with a minimum of statistical jargon
• Describe what an algorithm does, not the specifics of your killer algo
• Visualize the inputs (e.g.: a correlation matrix showing an ‘influence heat map’)
• Visualize the process (e.g.: a regression line on a chief predictor variable)
• Visualize the results (e.g.: a lift chart to show how much the analysis is improving results)
• Always, always tie each step back to the business challenge
• Always be open to questions and feedback.”
Andrew Pease (November 3, 2014)

What’s new on arXiv – Complete List

Object-Capability as a Means of Permission and Authority in Software Systems
A Scalable Framework for Multilevel Streaming Data Analytics using Deep Learning
Mutual Reinforcement Learning
Logic Conditionals, Supervenience, and Selection Tasks
Graph Interpolating Activation Improves Both Natural and Robust Accuracies in Data-Efficient Deep Learning
Evaluating Explanation Without Ground Truth in Interpretable Machine Learning
A Self-Attentive model for Knowledge Tracing
Deep Social Collaborative Filtering
Meta-Learning for Black-box Optimization
Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches
Quantum Data Fitting Algorithm for Non-sparse Matrices
Perception of visual numerosity in humans and machines
Selection Heuristics on Semantic Genetic Programming for Classification Problems
Natural Adversarial Examples
Mediation Challenges and Socio-Technical Gaps for Explainable Deep Learning Applications
Construction and enumeration for self-dual cyclic codes of even length over $\mathbb{F}_{2^m} + u\mathbb{F}_{2^m}$
On the Relationships Between Average Channel Capacity, Average Bit Error Rate, Outage probability and Outage Capacity over Additive White Gaussian Noise Channels
The Bach Doodle: Approachable music composition with machine learning at scale
Topology Based Scalable Graph Kernels
Boosting Resolution and Recovering Texture of micro-CT Images with Deep Learning
Adaptive Flux-Only Least-Squares Finite Element Methods for Linear Transport Equations
DeepRace: Finding Data Race Bugs via Deep Learning
Integrating the Data Augmentation Scheme with Various Classifiers for Acoustic Scene Modeling
On Convergence and Optimality of Best-Response Learning with Policy Types in Multiagent Systems
Geometric Convergence of Distributed Gradient Play in Games with Unconstrained Action Sets
A Short Note on the Kinetics-700 Human Action Dataset
A portable potentiometric electronic tongue leveraging smartphone and cloud platforms
Concentration of the matrix-valued minimum mean-square error in optimal Bayesian inference
Cataloging Accreted Stars within Gaia DR2 using Deep Learning
Linear Receivers for Massive MIMO Systems with One-Bit ADCs
Slow Feature Analysis for Human Action Recognition
Robust Variational Autoencoders for Outlier Detection in Mixed-Type Data
Quant GANs: Deep Generation of Financial Time Series
Towards Near-imperceptible Steganographic Text
Period mimicry: A note on the $(-1)$-evaluation of the peak polynomials
A row-invariant parameterized algorithm for integer programming
Sampled-Data Observers for Delay Systems and Hyperbolic PDE-ODE Loops
Comparison Between Algebraic and Matrix-free Geometric Multigrid for a Stokes Problem on Adaptive Meshes with Variable Viscosity
CupQ: A New Clinical Literature Search Engine
A Stratification Approach to Partial Dependence for Codependent Variables
Output maximization container loading problem with time availability constraints
Imaginary replica analysis of loopy regular random graphs
PPO Dash: Improving Generalization in Deep Reinforcement Learning
Towards network admissible optimal dispatch of flexible loads in distribution networks
MaskPlus: Improving Mask Generation for Instance Segmentation
Ramanujan Congruences for Fractional Partition Functions
Asymptotic stabilization of a system of coupled $n$th–order differential equations with potentially unbounded high-frequency oscillating perturbations
DOD-ETL: Distributed On-Demand ETL for Near Real-Time Business Intelligence
Real-time Facial Surface Geometry from Monocular Video on Mobile GPUs
Deep learning-based color holographic microscopy
Overcoming the curse of dimensionality in the numerical approximation of Allen-Cahn partial differential equations via truncated full-history recursive multilevel Picard approximations
Lower Bounding the AND-OR Tree via Symmetrization
Padé Activation Units: End-to-end Learning of Flexible Activation Functions in Deep Networks
Condensed Ricci Curvature of Complete and Strongly Regular Graphs
Defining mediation effects for multiple mediators using the concept of the target randomized trial
Real-time Hair Segmentation and Recoloring on Mobile GPUs
Binary Decision Diagrams: from Tree Compaction to Sampling
Almost all Steiner triple systems are almost resolvable
Low-supervision urgency detection and transfer in short crisis messages
A Data-Driven Game-Theoretic Approach for Behind-the-Meter PV Generation Disaggregation
Designing Perfect Simulation Algorithms using Local Correctness
Development of a General Momentum Exchange Devices Fault Model for Spacecraft Fault-Tolerant Control System Design
Independence numbers of Johnson-type graphs
AugLabel: Exploiting Word Representations to Augment Labels for Face Attribute Classification
Elastic depths for detecting shape anomalies in functional data
Some error estimates for the DEC method in the plane
High-order couplings in geometric complex networks of neurons
Partitioning Graphs for the Cloud using Reinforcement Learning
Increasing Power for Observational Studies of Aberrant Response: An Adaptive Approach
Subspace Determination through Local Intrinsic Dimensional Decomposition: Theory and Experimentation
Efficient Pipeline for Camera Trap Image Review
Hands Off my Database: Ransomware Detection in Databases through Dynamic Analysis of Query Sequences
Improving 3D Object Detection for Pedestrians with Virtual Multi-View Synthesis Orientation Estimation
Nonlinear filtering of stochastic differential equations driven by correlated Lévy noises
Rethinking RGB-D Salient Object Detection: Models, Datasets, and Large-Scale Benchmarks
AR(1) processes driven by second-chaos white noise: Berry-Esséen bounds for quadratic variation and parameter estimation
A simplified proof of CLT for convex bodies
Some Black-box Reductions for Objective-robust Discrete Optimization Problems Based on their LP-Relaxations
Planar graphs without 7-cycles and butterflies are DP-4-colorable
Study of Max-Link Relay Selection with Buffers for Multi-Way Cooperative Multi-Antenna Systems
2nd Place Solution to the GQA Challenge 2019
Efficient Autonomy Validation in Simulation with Adaptive Stress Testing
Instant Motion Tracking and Its Applications to Augmented Reality
Asynchronous Coded Caching
A Bird’s Eye View of Nonlinear System Identification
Ethical Underpinnings in the Design and Management of ICT Projects
Alternating Dynamic Programming for Multiple Epidemic Change-Point Estimation
Stochastic viscosity solutions for stochastic integral-partial differential equations and singular stochastic control
Hydrodynamic synchronization and collective dynamics of colloidal particles driven along a circular path
A Quantum-inspired Algorithm for General Minimum Conical Hull Problems
Energy-efficient Alternating Iterative Secure Structure of Maximizing Secrecy Rate for Directional Modulation Networks
Stereo-based terrain traversability analysis using normal-based segmentation and superpixel surface analysis
EL-Shelling on Comodernistic Lattices
Adversarial Sensor Attack on LiDAR-based Perception in Autonomous Driving
Automated Deobfuscation of Android Native Binary Code
CL-Shellable Posets with No EL-Shellings
Noise Removal of FTIR Hyperspectral Images via MMSE
An Inter-Layer Weight Prediction and Quantization for Deep Neural Networks based on a Smoothly Varying Weight Hypothesis
Quality-aware skill translation models for expert finding on StackOverflow
Improved Reinforcement Learning through Imitation Learning Pretraining Towards Image-based Autonomous Driving
The Quantum Version Of Classification Decision Tree Constructing Algorithm C5.0
Deep inspection: an electrical distribution pole parts study via deep neural networks
The continuous Bernoulli: fixing a pervasive error in variational autoencoders
Coherency and Online Signal Selection Based Wide Area Control of Wind Integrated Power Grid
Discontinuous Galerkin Finite Element Methods for the Landau-de Gennes Minimization Problem of Liquid Crystals
Modeling competitive evolution of multiple languages
Vibrational spectrum derived from the local mechanical response in disordered solids
AirwayNet: A Voxel-Connectivity Aware Approach for Accurate Airway Segmentation Using Convolutional Neural Networks
Broadcast Distributed Voting Algorithm in Population Protocols
Quantifying replicability and consistency in systematic reviews
Labelings vs. Embeddings: On Distributed Representations of Distances
A generic rule-based system for clinical trial patient selection
The Impact of Tribalism on Social Welfare
Distributed data storage for modern astroparticle physics experiments
Light Multi-segment Activation for Model Compression
Global and local pointwise error estimates for finite element approximations to the Stokes problem on convex polyhedra
Separable Convolutional LSTMs for Faster Video Segmentation
Cascade RetinaNet: Maintaining Consistency for Single-Stage Object Detection
Learning Depth from Monocular Videos Using Synthetic Data: A Temporally-Consistent Domain Adaptation Approach
Minimal-norm static feedbacks using dissipative Hamiltonian matrices
Deep Reinforcement Learning Based Robot Arm Manipulation with Efficient Training Data through Simulation
A Unified Framework for Problems on Guessing, Source Coding and Task Partitioning
A General Framework for Uncertainty Estimation in Deep Learning
Modeling User Selection in Quality Diversity
Partial Solvers for Generalized Parity Games
Mango Tree Net — A fully convolutional network for semantic segmentation and individual crown detection of mango trees
Single-bit-per-weight deep convolutional neural networks without batch-normalization layers for embedded systems
Human Pose Estimation for Real-World Crowded Scenarios
The Bregman-Tweedie Classification Model
Assessing Refugees’ Integration via Spatio-temporal Similarities of Mobility and Calling Behaviors
Performance Assessment of Kron Reduction in the Numerical Analysis of Polyphase Power Systems
A theorem about partitioning consecutive numbers
Improving Bayesian Local Spatial Models in Large Data Sets
On the $L_p$-error of the Grenander-type estimator in the Cox model
Graphs with large girth and free groups
Semi-supervised Breast Lesion Detection in Ultrasound Video Based on Temporal Coherence
Machine learning without a feature set for detecting bursts in the EEG of preterm infants
Language comparison via network topology
A Subjective Interestingness measure for Business Intelligence explorations
Abstract categorial grammars with island constraints and effective decidability
On removal of perfect matching from folded hypercubes
Fused Detection of Retinal Biomarkers in OCT Volumes
On the Variational Iteration Method for the Nonlinear Volterra Integral Equation
Stochastic Evolution of spatial populations: From configurations to genealogies and back
A Unified Deep Framework for Joint 3D Pose Estimation and Action Recognition from a Single RGB Camera
Random projections and sampling algorithms for clustering of high-dimensional polygonal curves
Representative Days for Expansion Decisions in Power Systems
Effect of disorder correlation on Anderson localization of two-dimensional massless pseudospin-1 Dirac particles in a random one-dimensional scalar potential
Lossless Prioritized Embeddings
Positive specializations of symmetric Grothendieck polynomials
Stochastic gradient Markov chain Monte Carlo
Detecting anomalies in fibre systems using 3-dimensional image data
Speed estimation evaluation on the KITTI benchmark based on motion and monocular depth information
X-Net: Brain Stroke Lesion Segmentation Based on Depthwise Separable Convolution and Long-range Dependencies
Latent Adversarial Defence with Boundary-guided Generation
Uniqueness and characterization of local minimizers for the interaction energy with mildly repulsive potentials
CLCI-Net: Cross-Level fusion and Context Inference Networks for Lesion Segmentation of Chronic Stroke
Gender Balance in Computer Science and Engineering in Italian Universities
Threshold Logical Clocks for Asynchronous Distributed Coordination and Consensus
Improving Semantic Segmentation via Dilated Affinity
Outliers in meta-analysis: an asymmetric trimmed-mean approach
Applying twice a minimax theorem
Transmission Power Control for Remote State Estimation in Industrial Wireless Sensor Networks
Unforeseen Evidence
Computing Nested Fixpoints in Quasipolynomial Time
Data Selection for training Semantic Segmentation CNNs with cross-dataset weak supervision
Morphisms of Skew Hadamard Matrices
Cayley Structures and Coset Acyclicity
Adaptive Prior Selection for Repertoire-based Online Learning in Robotics
Dark Patterns at Scale: Findings from a Crawl of 11K Shopping Websites
Neural Language Model Based Training Data Augmentation for Weakly Supervised Early Rumor Detection
Uncertainty-aware Self-ensembling Model for Semi-supervised 3D Left Atrium Segmentation
Structured Variational Inference in Unstable Gaussian Process State Space Models
Information processing constraints in travel behaviour modelling: A generative learning approach
Embedded Ridge Approximations: Constructing Ridge Approximations Over Localized Scalar Fields For Improved Simulation-Centric Dimension Reduction
Pedestrian Tracking by Probabilistic Data Association and Correspondence Embeddings
Homophily as a process generating social networks: insights from Social Distance Attachment model
A note on duality theorems in mass transportation
How much real data do we actually need: Analyzing object detection performance using synthetic and real data
SGD momentum optimizer with step estimation by online parabola model
Shrinkage in the Time-Varying Parameter Model Framework Using the R Package shrinkTVP
RadioTalk: a large-scale corpus of talk radio transcripts
Prediction of neural network performance by phenotypic modeling
Anatomically-Informed Multiple Linear Assignment Problems for White Matter Bundle Segmentation
On The Termination of a Flooding Process
Data-driven strategies for optimal bicycle network growth
A Reduced Order technique to study bifurcating phenomena: application to the Gross-Pitaevskii equation
Variable selection in sparse high-dimensional GLARMA models
Massive MU-MIMO-OFDM Uplink with Direct RF-Sampling and 1-Bit ADCs
On the smallest singular value of multivariate Vandermonde matrices with clustered nodes
Measuring I2P Censorship at a Global Scale
Two-stage sample robust optimization
A Two-Stage Approach to Multivariate Linear Regression with Sparsely Mismatched Data
Step-by-Step Community Detection for Volume-Regular Graphs
Efficient Segmentation: Learning Downsampling Near Semantic Boundaries
The Tradeoff Between Privacy and Accuracy in Anomaly Detection Using Federated XGBoost
On the Performance of Renewable Energy-Powered UAV-Assisted Wireless Communications
Security Smells in Infrastructure as Code Scripts
EnforceNet: Monocular Camera Localization in Large Scale Indoor Sparse LiDAR Point Cloud
Predicting Next-Season Designs on High Fashion Runway
From Harnack inequality to heat kernel estimates on metric measure spaces and applications
Explaining Classifiers with Causal Concept Effect (CaCE)
Fast, Provably convergent IRLS Algorithm for p-norm Linear Regression
On the “steerability” of generative adversarial networks
Ordinal pattern probabilities for symmetric random walks
Tightness and tails of the maximum in 3D Ising interfaces
Hubs and authorities of the scientific migration network

What’s new on arXiv

A Scalable Framework for Multilevel Streaming Data Analytics using Deep Learning

The rapid growth of data in velocity, volume, value, variety, and veracity has enabled exciting new opportunities and presented big challenges for businesses of all types. Recently, there has been considerable interest in developing systems for processing continuous data streams, driven by the increasing need for real-time analytics for decision support in business, healthcare, manufacturing, and security. The analytics of streaming data usually relies on the output of offline analytics on static or archived data. However, businesses and organizations, like our industry partner Gnowit, strive to provide their customers with real-time market information and continuously look for a unified analytics framework that can integrate both streaming and offline analytics in a seamless fashion to extract knowledge from large volumes of hybrid streaming data. We present our study on designing a multilevel streaming text data analytics framework by comparing leading-edge scalable open-source, distributed, and in-memory technologies. We demonstrate the functionality of the framework for a use case of multilevel text analytics using deep learning for language understanding and sentiment analysis, including data indexing and query processing. Our framework combines Spark streaming for real-time text processing, the Long Short Term Memory (LSTM) deep learning model for higher-level sentiment analysis, and other tools for SQL-based analytical processing to provide a scalable solution for multilevel streaming text analytics.

Mutual Reinforcement Learning

Recently, collaborative robots have begun to train humans to achieve complex tasks, and the mutual information exchange between them can lead to successful robot-human collaborations. In this paper we demonstrate the application and effectiveness of a new approach called mutual reinforcement learning (MRL), where both humans and autonomous agents act as reinforcement learners in a skill transfer scenario over continuous communication and feedback. An autonomous agent initially acts as an instructor who can teach a novice human participant complex skills using the MRL strategy. While teaching skills in a physical (block-building) (n=34) or simulated (Tetris) environment (n=31), the expert tries to identify appropriate reward channels preferred by each individual and adapts itself accordingly using an exploration-exploitation strategy. These reward channel preferences can identify important behaviors of the human participants, because they may well exercise the same behaviors in similar situations later. In this way, skill transfer takes place between an expert system and a novice human operator. We divided the subject population into three groups and observed the skill transfer phenomenon, analyzing it with Simpson’s psychometric model. Five-point Likert scales were also used to identify the cognitive models of the human participants. We obtained a shared cognitive model which not only improves human cognition but enhances the robot’s cognitive strategy to understand the mental model of its human partners while building a successful robot-human collaborative framework.

Logic Conditionals, Supervenience, and Selection Tasks

Principles of cognitive economy would require that concepts about objects, properties and relations should be introduced only if they simplify the conceptualisation of a domain. Unexpectedly, classic logic conditionals, specifying structures holding within elements of a formal conceptualisation, do not always satisfy this crucial principle. The paper argues that this requirement is captured by supervenience, hereby further identified as a property necessary for compression. The resulting theory suggests an alternative explanation of the empirical experiences observable in Wason’s selection tasks, associating human performance with conditionals on the ability of dealing with compression, rather than with logic necessity.

Graph Interpolating Activation Improves Both Natural and Robust Accuracies in Data-Efficient Deep Learning

Improving the accuracy and robustness of deep neural nets (DNNs) and adapting them to small training data are primary tasks in deep learning research. In this paper, we replace the output activation function of DNNs, typically the data-agnostic softmax function, with a graph Laplacian-based high dimensional interpolating function which, in the continuum limit, converges to the solution of a Laplace-Beltrami equation on a high dimensional manifold. Furthermore, we propose end-to-end training and testing algorithms for this new architecture. The proposed DNN with graph interpolating activation integrates the advantages of both deep learning and manifold learning. Compared to conventional DNNs with the softmax function as output activation, the new framework demonstrates the following major advantages: First, it is better suited to data-efficient learning, in which we train high capacity DNNs without using a large amount of training data. Second, it remarkably improves both natural accuracy on clean images and robust accuracy on adversarial images crafted by both white-box and black-box adversarial attacks. Third, it is a natural choice for semi-supervised learning. For reproducibility, the code is available at https://…/DNN-DataDependentActivation.

Evaluating Explanation Without Ground Truth in Interpretable Machine Learning

Interpretable Machine Learning (IML) has become increasingly important in many applications, such as autonomous cars and medical diagnosis, where explanations are preferred to help people better understand how machine learning systems work and further enhance their trust towards systems. Particularly in robotics, explanations from IML are significantly helpful in providing reasons for those adverse and inscrutable actions, which could impair the safety and profit of the public. However, due to the diversified scenarios and subjective nature of explanations, we rarely have the ground truth for benchmark evaluation in IML on the quality of generated explanations. Having a sense of explanation quality not only matters for quantifying system boundaries, but also helps to realize the true benefits to human users in real-world applications. To benchmark evaluation in IML, in this paper, we rigorously define the problem of evaluating explanations, and systematically review the existing efforts. Specifically, we summarize three general aspects of explanation (i.e., predictability, fidelity and persuasibility) with formal definitions, and respectively review the representative methodologies for each of them under different tasks. Further, a unified evaluation framework is designed according to the hierarchical needs from developers and end-users, which could be easily adopted for different scenarios in practice. In the end, open problems are discussed, and several limitations of current evaluation techniques are raised for future explorations.

A Self-Attentive model for Knowledge Tracing

Knowledge tracing is the task of modeling each student’s mastery of knowledge concepts (KCs) as (s)he engages with a sequence of learning activities. Each student’s knowledge is modeled by estimating the performance of the student on the learning activities. It is an important research area for providing a personalized learning platform to students. In recent years, methods based on Recurrent Neural Networks (RNN) such as Deep Knowledge Tracing (DKT) and Dynamic Key-Value Memory Network (DKVMN) outperformed all the traditional methods because of their ability to capture complex representations of human learning. However, these methods do not generalize well when dealing with sparse data, which is the case with real-world data, as students interact with few KCs. In order to address this issue, we develop an approach that identifies the KCs from the student’s past activities that are relevant to the given KC and predicts his/her mastery based on the relatively few KCs that it picked. Since predictions are made based on relatively few past activities, it handles the data sparsity problem better than the methods based on RNN. For identifying the relevance between the KCs, we propose a self-attention based approach, Self Attentive Knowledge Tracing (SAKT). Extensive experimentation on a variety of real-world datasets shows that our model outperforms the state-of-the-art models for knowledge tracing, improving AUC by 4.43% on average.
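The idea of weighting past activities by their relevance to the current KC can be sketched with plain scaled dot-product attention. This is a generic illustration, not the SAKT architecture: the function names, the toy two-dimensional embeddings, and the use of a single query without learned projections are all assumptions made for the example.

```python
import math

def attention_weights(query, keys):
    """Softmax over scaled dot products between one query and each key."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Weighted average of the values, weighted by relevance to the query."""
    w = attention_weights(query, keys)
    dim = len(values[0])
    return [sum(w_i * v[j] for w_i, v in zip(w, values)) for j in range(dim)]

# Hypothetical embeddings: the current KC (query) and three past activities (keys).
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
# Hypothetical values: 1.0 if the past activity was answered correctly, else 0.0.
values = [[1.0], [0.0], [1.0]]

weights = attention_weights(query, keys)
context = attend(query, keys, values)
```

Past activities whose embeddings align with the query receive larger weights, so the prediction leans on the few relevant interactions rather than on the whole history.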

Deep Social Collaborative Filtering

Recommender systems are crucial to alleviate the information overload problem in online worlds. Most modern recommender systems capture users’ preferences towards items via their interactions, based on collaborative filtering techniques. In addition to the user-item interactions, social networks can also provide useful information to understand users’ preferences, as suggested by social theories such as homophily and influence. Recently, deep neural networks have been utilized for social recommendations, which facilitate both the user-item interactions and the social network information. However, most of these models cannot take full advantage of the social network information. They only use information from direct neighbors, but distant neighbors can also provide helpful information. Meanwhile, most of these models treat neighbors’ information equally without considering the specific recommendations. However, for a specific recommendation case, the information relevant to the specific item would be helpful. Besides, most of these models do not explicitly capture the neighbors’ opinions on items for social recommendations, while different opinions could affect the user differently. In this paper, to address the aforementioned challenges, we propose DSCF, a Deep Social Collaborative Filtering framework, which can exploit the social relations with various aspects for recommender systems. Comprehensive experiments on two real-world datasets show the effectiveness of the proposed framework.

Meta-Learning for Black-box Optimization

Recently, neural networks trained as optimizers under the ‘learning to learn’ or meta-learning framework have been shown to be effective for a broad range of optimization tasks including derivative-free black-box function optimization. Recurrent neural networks (RNNs) trained to optimize a diverse set of synthetic non-convex differentiable functions via gradient descent have been effective at optimizing derivative-free black-box functions. In this work, we propose RNN-Opt: an approach for learning RNN-based optimizers for optimizing real-parameter single-objective continuous functions under limited budget constraints. Existing approaches utilize an observed improvement based meta-learning loss function for training such models. We propose training RNN-Opt by using synthetic non-convex functions with known (approximate) optimal values by directly using discounted regret as our meta-learning loss function. We hypothesize that a regret-based loss function mimics typical testing scenarios, and would therefore lead to better optimizers compared to optimizers trained only to propose queries that improve over previous queries. Further, RNN-Opt incorporates simple yet effective enhancements during training and inference procedures to deal with the following practical challenges: i) Unknown range of possible values for the black-box function to be optimized, and ii) Practical and domain-knowledge based constraints on the input parameters. We demonstrate the efficacy of RNN-Opt in comparison to existing methods on several synthetic as well as standard benchmark black-box functions along with an anonymized industrial constrained optimization problem.
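To make the notion of a regret-based loss concrete, here is a minimal sketch that scores a sequence of queries against a known optimum. The abstract does not give the exact formula, so several details are assumptions: a minimization setting, regret measured as the best value seen so far minus the optimum, and discount weights `gamma ** (T - t)` that emphasize later steps. The function name `discounted_regret` is hypothetical.

```python
def discounted_regret(observed, optimum, gamma=0.9):
    """Sum of discounted gaps between the best value so far and the known optimum.

    Assumes minimization; the weighting gamma ** (T - t) is an illustrative
    choice, not necessarily the one used by RNN-Opt.
    """
    T = len(observed)
    best = float("inf")
    total = 0.0
    for t, y in enumerate(observed, start=1):
        best = min(best, y)  # best function value found up to step t
        total += gamma ** (T - t) * (best - optimum)
    return total

# Toy run: three queries on a function whose optimal value is known to be 0.
loss = discounted_regret([3.0, 1.0, 2.0], optimum=0.0, gamma=0.5)
# An optimizer that hits the optimum on its first query incurs no regret.
perfect = discounted_regret([0.0, 5.0], optimum=0.0, gamma=0.5)
```

Unlike a loss based only on observed improvement, this quantity stays positive until the optimizer actually reaches the optimum, which is the behavior the abstract argues better matches test-time use.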

Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches

Deep learning techniques have become the method of choice for researchers working on algorithmic aspects of recommender systems. With the strongly increased interest in machine learning in general, it has, as a result, become difficult to keep track of what represents the state-of-the-art at the moment, e.g., for top-n recommendation tasks. At the same time, several recent publications point out problems in today’s research practice in applied machine learning, e.g., in terms of the reproducibility of the results or the choice of the baselines when proposing new models. In this work, we report the results of a systematic analysis of algorithmic proposals for top-n recommendation tasks. Specifically, we considered 18 algorithms that were presented at top-level research conferences in the last years. Only 7 of them could be reproduced with reasonable effort. For these methods, it however turned out that 6 of them can often be outperformed with comparably simple heuristic methods, e.g., based on nearest-neighbor or graph-based techniques. The remaining one clearly outperformed the baselines but did not consistently outperform a well-tuned non-neural linear ranking method. Overall, our work sheds light on a number of potential problems in today’s machine learning scholarship and calls for improved scientific practices in this area. Source code of our experiments and full results are available at: https://…/RecSys2019_DeepLearning_Evaluation.

Quantum Data Fitting Algorithm for Non-sparse Matrices

We propose a quantum data fitting algorithm for non-sparse matrices, which is based on the Quantum Singular Value Estimation (QSVE) subroutine and a novel efficient method for recovering the signs of eigenvalues. Our algorithm generalizes the quantum data fitting algorithm of Wiebe, Braun, and Lloyd for sparse and well-conditioned matrices by adding a regularization term to avoid the over-fitting problem, which is a very important problem in machine learning. As a result, the algorithm achieves a sparsity-independent runtime of $O(\kappa^2\sqrt{N}\,\mathrm{polylog}(N)/(\epsilon\log\kappa))$ for an $N\times N$ dimensional Hermitian matrix $\bm{F}$, where $\kappa$ denotes the condition number of $\bm{F}$ and $\epsilon$ is the precision parameter. This amounts to a polynomial speedup on the dimension of matrices when compared with the classical data fitting algorithms, and a strictly less than quadratic dependence on $\kappa$.

Perception of visual numerosity in humans and machines

Numerosity perception is foundational to mathematical learning, but its computational bases are strongly debated. Some investigators argue that humans are endowed with a specialized system supporting numerical representation; others argue that visual numerosity is estimated using continuous magnitudes, such as density or area, which usually co-vary with number. Here we reconcile these contrasting perspectives by testing deep networks on the same numerosity comparison task that was administered to humans, using a stimulus space that allows to measure the contribution of non-numerical features. Our model accurately simulated the psychophysics of numerosity perception and the associated developmental changes: discrimination was driven by numerosity information, but non-numerical features had a significant impact, especially early during development. Representational similarity analysis further highlighted that both numerosity and continuous magnitudes were spontaneously encoded even when no task had to be carried out, demonstrating that numerosity is a major, salient property of our visual environment.

Selection Heuristics on Semantic Genetic Programming for Classification Problems

In a steady-state evolution, tournament selection traditionally uses the fitness function to select the parents, and negative selection chooses an individual to be replaced with an offspring. This contribution focuses on analyzing the behavior, in terms of performance, of different heuristics when used instead of the fitness function in tournament selection. The heuristics analyzed are related to measuring the similarity of the individuals in the semantic space. In addition, the analysis includes random selection and traditional tournament selection. These selection functions were implemented on our Semantic Genetic Programming system, namely EvoDAG, which is inspired by the geometric genetic operators and tested on 30 classification problems with a variable number of samples, variables, and classes. The results indicated that the combination of accuracy and the random selection, in the negative tournament, produces the best combination, and the difference in performance between this combination and the tournament selection is statistically significant. Furthermore, we compare EvoDAG’s performance using the selection heuristics against 18 classifiers that included traditional approaches as well as auto-machine-learning techniques. The results indicate that our proposal is competitive with state-of-the-art classifiers. Finally, it is worth mentioning that EvoDAG is available as open source software.

Natural Adversarial Examples

We introduce natural adversarial examples — real-world, unmodified, and naturally occurring examples that cause classifier accuracy to significantly degrade. We curate 7,500 natural adversarial examples and release them in an ImageNet classifier test set that we call ImageNet-A. This dataset serves as a new way to measure classifier robustness. Like l_p adversarial examples, ImageNet-A examples successfully transfer to unseen or black-box classifiers. For example, on ImageNet-A a DenseNet-121 obtains around 2% accuracy, an accuracy drop of approximately 90%. Recovering this accuracy is not simple because ImageNet-A examples exploit deep flaws in current classifiers including their over-reliance on color, texture, and background cues. We observe that popular training techniques for improving robustness have little effect, but we show that some architectural changes can enhance robustness to natural adversarial examples. Future research is required to enable robust generalization to this hard ImageNet test set.

Mediation Challenges and Socio-Technical Gaps for Explainable Deep Learning Applications

The presumed data owners’ right to explanations brought about by the General Data Protection Regulation in Europe has shed light on the social challenges of explainable artificial intelligence (XAI). In this paper, we present a case study with Deep Learning (DL) experts from a research and development laboratory focused on the delivery of industrial-strength AI technologies. Our aim was to investigate the social meaning (i.e. meaning to others) that DL experts assign to what they do, given a richly contextualized and familiar domain of application. Using qualitative research techniques to collect and analyze empirical data, our study has shown that participating DL experts did not spontaneously engage into considerations about the social meaning of machine learning models that they build. Moreover, when explicitly stimulated to do so, these experts expressed expectations that, with real-world DL application, there will be available mediators to bridge the gap between technical meanings that drive DL work, and social meanings that AI technology users assign to it. We concluded that current research incentives and values guiding the participants’ scientific interests and conduct are at odds with those required to face some of the scientific challenges involved in advancing XAI, and thus responding to the alleged data owners’ right to explanations or similar societal demands emerging from current debates. As a concrete contribution to mitigate what seems to be a more general problem, we propose three preliminary XAI Mediation Challenges with the potential to bring together technical and social meanings of DL applications, as well as to foster much needed interdisciplinary collaboration among AI and the Social Sciences researchers.

Document worth reading: “Introduction to Multi-Armed Bandits”

Multi-armed bandits are a simple but very powerful framework for algorithms that make decisions over time under uncertainty. An enormous body of work has accumulated over the years, covered in several books and surveys. This book provides a more introductory, textbook-like treatment of the subject. Each chapter tackles a particular line of work, providing a self-contained, teachable technical introduction and a review of the more advanced results. The chapters are as follows: Stochastic bandits; Lower bounds; Bayesian Bandits and Thompson Sampling; Lipschitz Bandits; Full Feedback and Adversarial Costs; Adversarial Bandits; Linear Costs and Semi-bandits; Contextual Bandits; Bandits and Zero-Sum Games; Bandits with Knapsacks; Incentivized Exploration and Connections to Mechanism Design. Status of the manuscript: essentially complete (modulo some polishing), except for the last chapter, which the author plans to add over the next few months.

Distilled News

Reinforcement Learning: A Survey

<Please note that this post is for my own educational purpose.>

Getting started with AI? Start here!

Many teams try to start an applied AI project by diving into algorithms and data before figuring out desired outputs and objectives. Unfortunately, that’s like raising a puppy in a New York City apartment for a few years, then being surprised that it can’t herd sheep for you.

What AI-Driven Decision Making Looks Like

Many companies have adapted to a ‘data-driven’ approach for operational decision-making. Data can improve decisions, but it requires the right processor to get the most from it. Many people assume that processor is human. The term ‘data-driven’ even implies that data is curated by – and summarized for – people to process. But to fully leverage the value contained in data, companies need to bring artificial intelligence (AI) into their workflows and, sometimes, get us humans out of the way. We need to evolve from data-driven to AI-driven workflows. Distinguishing between ‘data-driven’ and ‘AI-driven’ isn’t just semantics. Each term reflects different assets, the former focusing on data and the latter processing ability. Data holds the insights that can enable better decisions; processing is the way to extract those insights and take actions. Humans and AI are both processors, with very different abilities. To understand how best to leverage each, it’s helpful to review our own biological evolution and how decision-making has evolved in industry. Just fifty to seventy-five years ago human judgment was the central processor of business decision-making. Professionals relied on their highly-tuned intuitions, developed from years of experience (and a relatively tiny bit of data) in their domain, to, say, pick the right creative for an ad campaign, determine the right inventory levels to stock, or approve the right financial investments. Experience and gut instinct were most of what was available to discern good from bad, high from low, and risky vs. safe.

Optimization Problem in Deep Neural Networks

Training deep neural networks to achieve the best performance is a challenging task. In this post, I will explore the most common problems and their solutions. These problems include long training times, vanishing and exploding gradients, and initialization. All of these are known as optimization problems. Another category of issue that arises while training the network is the regularization problem, which I discussed in my previous post.

LIME: Explaining predictions of machine learning models (1/2)

I would like to begin by asking the following question: ‘Can we trust the model predictions just because the model performance is convincingly high on the test data?’ Many people might answer this question as ‘Yes’. But this is not always true. High model performance should not be considered an indicator to trust the model predictions, as the signals being picked up by the model can be random and might not make business sense.

P-values Explained By Data Scientist

I remember when I was having my first overseas internship at CERN as a summer student, most people were still talking about the discovery of the Higgs boson upon confirming that it met the ‘five sigma’ threshold (which means having a p-value of 0.0000003). Back then I knew nothing about p-value, hypothesis testing or even statistical significance. And you’re right. I went to google the term ‘p-value’, and what I found on Wikipedia made me even more confused… In statistical hypothesis testing, the p-value or probability value is, for a given statistical model, the probability that, when the null hypothesis is true, the statistical summary (such as the absolute value of the sample mean difference between two compared groups) would be greater than or equal to the actual observed results.
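The link between ‘five sigma’ and the quoted p-value can be checked with a few lines of Python. This is a minimal sketch, assuming a standard normal test statistic and a one-sided tail; the function name is made up for illustration.

```python
import math


def one_sided_p_value(n_sigma: float) -> float:
    """Upper-tail probability P(Z >= n_sigma) for a standard normal Z."""
    # The normal survival function expressed via the complementary error function.
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


p = one_sided_p_value(5.0)
print(f"five sigma -> p = {p:.2e}")
```

The result is close to 3 in 10 million, which matches the 0.0000003 figure above; a two-sided test would double it.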

Implementing and Analyzing different Activation Functions and Weight Initialization Methods Using Python

In this post, we will discuss how to implement different combinations of non-linear activation functions and weight initialization methods in python. Also, we will analyze how the choice of activation function and weight initialization method will have an effect on accuracy and the rate at which we reduce our loss in a deep neural network using a non-linearly separable toy data set. This is a follow-up post to my previous post on activation functions and weight initialization methods. Note: This article assumes that the reader has a basic understanding of Neural Network, weights, biases, and backpropagation. If you want to learn the basics of the feed-forward neural network, check out my previous article (Link at the end of this article).

Natural Language Processing is Fun!

Computers are great at working with structured data like spreadsheets and database tables. But us humans usually communicate in words, not in tables. That’s unfortunate for computers.

Conversational AI – but where is the I?

I remember the first time I saw a computer: it was a Power Macintosh 5260 (with Monkey Island on it). I was around 5 years old and I looked at it as if it belonged to another universe. It did: I was not allowed to get anywhere close to it within a 5 mile radius; it was my older brother’s! That did not stop me. I browsed it for hours. The possibilities of computers were infinite and, fuelled by the inspiration of sci-fi worlds, the dream of talking machines, machines that can assist humans, think for themselves and even have feelings, never stopped. I kept dreaming about the possibilities of the future.

Top 5 Mistakes of Greenhorn Data Scientists

1. Enter ‘Generation Kaggle’
2. Neural Networks are the cure to everything
3. Machine Learning is the Product
4. Confuse Causation with Correlation
5. Optimize the wrong metrics

Unity Machine Learning Agents Toolkit

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source Unity plugin that enables games and simulations to serve as environments for training intelligent agents. Agents can be trained using reinforcement learning, imitation learning, neuroevolution, or other machine learning methods through a simple-to-use Python API. We also provide implementations (based on TensorFlow) of state-of-the-art algorithms to enable game developers and hobbyists to easily train intelligent agents for 2D, 3D and VR/AR games. These trained agents can be used for multiple purposes, including controlling NPC behavior (in a variety of settings such as multi-agent and adversarial), automated testing of game builds and evaluating different game design decisions pre-release. The ML-Agents toolkit is mutually beneficial for both game developers and AI researchers as it provides a central platform where advances in AI can be evaluated on Unity’s rich environments and then made accessible to the wider research and game developer communities.

The Best Refactoring You’ve Never Heard Of

Hello everyone. I’m so excited to be here at Compose, along with so many enthusiastic and some very advanced functional programmers. I live a dual life. By day, I teach computers how to think about code more deeply and, by night, I teach people how to think about code more deeply. So, this is the talk I’ve been really excited about for the last year; this is hands down the coolest thing I learned in the year of 2018. I was just reading this paper about programming language semantics and it was like, ‘Oh, these two things look completely different! Here’s how they’re the same, you do this.’ I was like: wait, what was that? What? It explains so many changes that I see people like myself already do. This got all of one slide in the web course I teach, but now I’ll get a chance to really explain why it’s so cool. You are all here to learn.

How to Get Started with NLP – 6 Unique Methods to Perform Tokenization

Are you fascinated by the amount of text data available on the internet? Are you looking for ways to work with this text data but aren’t sure where to begin? Machines, after all, recognize numbers, not the letters of our language. And that can be a tricky landscape to navigate in machine learning.
1. Tokenization using Python’s split() function
2. Tokenization using Regular Expressions (RegEx)
3. Tokenization using NLTK
4. Tokenization using the spaCy library
5. Tokenization using Keras
6. Tokenization using Gensim
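The first two methods in the list need only the Python standard library. Below is a minimal sketch with a made-up sample sentence; the regular expressions are illustrative choices, not necessarily the ones used in the article.

```python
import re

text = "Machines recognize numbers, not letters. Tokenization bridges the gap!"

# 1. str.split(): splits on whitespace, so punctuation stays attached to words.
word_tokens = text.split()

# 2. Regular expressions: \w+ keeps runs of word characters and drops punctuation.
regex_tokens = re.findall(r"\w+", text)

# Sentence-level tokens: split on whitespace that follows '.', '!' or '?'.
sentences = re.split(r"(?<=[.!?])\s+", text)

print(word_tokens)
print(regex_tokens)
print(sentences)
```

Comparing the first two outputs shows the main trade-off: `split()` leaves tokens such as `numbers,` with the comma attached, while the regex version strips it. The library-based methods (NLTK, spaCy, Keras, Gensim) handle such cases with their own rules.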

Popular Machine Learning Applications and Use Cases in our Daily Life

1. Machine Learning Use Cases in Smartphones
• Voice Assistants
• Smartphone Cameras
• App Store and Play Store Recommendations
• Face Unlock – Smartphones
2. Machine Learning Use Cases in Transportation
• Dynamic Pricing in Travel
• Transportation and Commuting – Uber
• Google Maps
3. Machine Learning Use Cases in Popular Web Services
• Email filtering
• Google Search
• Google Translate
• LinkedIn and Facebook recommendations and ads
4. Machine Learning Use Cases in Sales and Marketing
• Recommendation Engines
• Personalized Marketing
• Customer Support Queries (and Chatbots)
6. Machine Learning Use Cases in Security
• Video Surveillance
• Cyber Security (Captchas)
7. Machine Learning Use Cases in the Financial Domain
• Catching Fraud in Banking
• Personalized Banking
8. Other Popular Machine Learning Use Cases
• Self-Driving Cars

Lorenz ’96 is too easy! Machine learning research needs a more realistic toy model.

Ed Lorenz was a genius at coming up with simple models that capture the essence of a problem in a much more complex system. His famous butterfly model from 1963 jump-started chaos research, followed by more sophisticated models to describe upscale error growth (1969) and the general circulation of the atmosphere (1984). In 1995, he created another chaotic model that shall be the topic of this blog post. Confusingly, even though the original paper appeared in 1995, most people refer to the model as the Lorenz 96 (L96) model, which we will also do here.

How Deepfakes and Other Reality-Distorting AI Can Actually Help Us

We’re not far from the day when artificial intelligence will provide us with a paintbrush for reality. As the foundations we’ve relied upon lose their integrity, many people find themselves afraid of what’s to come. But we’ve always lived in a world where our senses misrepresent reality. New technologies will help us get closer to the truth by showing us where we can’t find it. From a historical viewpoint, we’ve never successfully stopped the progression of any technology and owe the level of safety and security we enjoy to that ongoing progression. While normal accidents do occur and the downsides of progress likely won’t ever cease to exist, we make the problem worse when trying to fight the inevitable. Besides, reality has never been as clear and accurate as we want to believe. We fight against new technology because we believe it creates uncertainty when, more accurately, it only shines a light on the uncertainty that’s always existed and we’ve preferred to ignore.

Let’s get it right

Article: The Scariest Thing About DeepNude Wasn’t the Software

At the end of June, Motherboard reported on a new app called DeepNude, which promised – ‘with a single click’ – to transform a clothed photo of any woman into a convincing nude image using machine learning. In the weeks since this report, the app has been pulled by its creator and removed from GitHub, though open source copies have surfaced there in recent days. Most of the coverage of DeepNude has focused on the specific dangers posed by its technical advances. ‘DeepNude is an evolution of that technology that is easier to use and faster to create than deepfakes,’ wrote Samantha Cole in Motherboard’s initial report on the app. ‘DeepNude also dispenses with the idea that this technology can be used for anything other than claiming ownership over women’s bodies.’ With its promise of single-click undressing of any woman, it made it easier than ever to manufacture naked photos – and, by extension, to use those fake nudes to harass, extort, and publicly shame women everywhere. But even following the app’s removal, there’s a lingering problem with DeepNude that goes beyond its technical advances and ease of use. It’s something older and deeper, something far more intractable – and far harder to erase from the internet – than a piece of open source code.

Paper: The Elusive Model of Technology, Media, Social Development, and Financial Sustainability

We recount in this essay the decade-long story of Gram Vaani, a social enterprise with a vision to build appropriate ICTs (Information and Communication Technologies) for participatory media in rural and low-income settings, to bring about social development and community empowerment. Other social enterprises will relate to the learning gained and the strategic pivots that Gram Vaani had to undertake to survive and deliver on its mission, while searching for a robust financial sustainability model. While we believe the ideal model still remains elusive, we conclude this essay with an open question about the reason to differentiate between different kinds of enterprises – commercial or social, for-profit or not-for-profit – and argue that all enterprises should have an ethical underpinning to their work.

Paper: Ethical Underpinnings in the Design and Management of ICT Projects

With a view towards understanding why undesirable outcomes often arise in ICT projects, we draw attention to three aspects in this essay. First, we present several examples to show that incorporating an ethical framework in the design of an ICT system is not sufficient in itself, and that ethics need to guide the deployment and ongoing management of the projects as well. We present a framework that brings together the objectives, design, and deployment management of ICT projects as being shaped by a common underlying ethical system. Second, we argue that power-based equality should be incorporated as a key underlying ethical value in ICT projects, to ensure that the project does not reinforce inequalities in power relationships between the actors directly or indirectly associated with the project. We present a method to model ICT projects to make legible their influence on the power relationships between various actors in the ecosystem. Third, we discuss that the ethical values underlying any ICT project ultimately need to be upheld by the project teams, where certain factors like political ideologies or dispersed teams may affect the rigour with which these ethical values are followed. These three aspects of having an ethical underpinning to the design and management of ICT projects, the need for having a power-based equality principle for ICT projects, and the importance of socialization of the project teams, need increasing attention in today’s age of ICT platforms where millions and billions of users interact on the same platform but which are managed by only a few people.

Paper: Canada Protocol: an ethical checklist for the use of Artificial Intelligence in Suicide Prevention and Mental Health

Introduction: To improve current public health strategies in suicide prevention and mental health, governments, researchers and private companies increasingly use information and communication technologies, and more specifically Artificial Intelligence and Big Data. These technologies are promising but raise ethical challenges rarely covered by current legal systems. It is essential to better identify and prevent potential ethical risks. Objectives: The Canada Protocol – MHSP is a tool to guide and support professionals, users, and researchers using AI in mental health and suicide prevention. Methods: A checklist was constructed based upon ten international reports on AI and ethics and two guides on mental health and new technologies. 329 recommendations were identified, of which 43 were considered as applicable to Mental Health and AI. The checklist was validated using a two-round Delphi Consultation. Results: 16 experts participated in the first round of the Delphi Consultation and 8 participated in the second round. Of the original 43 items, 38 were retained. They concern five categories: ‘Description of the Autonomous Intelligent System’ (n=8), ‘Privacy and Transparency’ (n=8), ‘Security’ (n=6), ‘Health-Related Risks’ (n=8), ‘Biases’ (n=8). The checklist was considered relevant by most users, and could need versions tailored to each category of target users.

Paper: Fairness and Diversity in the Recommendation and Ranking of Participatory Media Content

Online participatory media platforms that enable one-to-many communication among users see a significant amount of user-generated content and consequently face the problem of recommending a subset of this content to their users. We address the problem of recommending and ranking this content such that different viewpoints about a topic get exposure in a fair and diverse manner. We build our model in the context of a voice-based participatory media platform running in rural central India, for low-income and less-literate communities, that plays audio messages in a ranked list to users over a phone call and allows them to contribute their own messages. In this paper, we describe our model and evaluate it using call-logs from the platform, to compare the fairness and diversity performance of our model with the manual editorial processes currently being followed. Our models are generic and can be adapted and applied to other participatory media platforms as well.

Paper: Global AI Ethics: A Review of the Social Impacts and Ethical Implications of Artificial Intelligence

The ethical implications and social impacts of artificial intelligence have become topics of compelling interest to industry, researchers in academia, and the public. However, current analyses of AI in a global context are biased toward perspectives held in the U.S., and limited by a lack of research, especially outside the U.S. and Western Europe. This article summarizes the key findings of a literature review of recent social science scholarship on the social impacts of AI and related technologies in five global regions. Our team of social science researchers reviewed more than 800 academic journal articles and monographs in over a dozen languages. Our review of the literature suggests that AI is likely to have markedly different social impacts depending on geographical setting. Likewise, perceptions and understandings of AI are likely to be profoundly shaped by local cultural and social context. Recent research in U.S. settings demonstrates that AI-driven technologies have a pattern of entrenching social divides and exacerbating social inequality, particularly among historically-marginalized groups. Our literature review indicates that this pattern exists on a global scale, and suggests that low- and middle-income countries may be more vulnerable to the negative social impacts of AI and less likely to benefit from the attendant gains. We call for rigorous ethnographic research to better understand the social impacts of AI around the world. Global, on-the-ground research is particularly critical to identify AI systems that may amplify social inequality in order to mitigate potential harms. Deeper understanding of the social impacts of AI in diverse social settings is a necessary precursor to the development, implementation, and monitoring of responsible and beneficial AI technologies, and forms the basis for meaningful regulation of these technologies.

Paper: A Study on the Prevalence of Human Values in Software Engineering Publications, 2015-2018

Failure to account for human values in software (e.g., equality and fairness) can result in user dissatisfaction and negative socio-economic impact. Engineering these values in software, however, requires technical and methodological support throughout the development life cycle. This paper investigates to what extent software engineering (SE) research has considered human values. We investigate the prevalence of human values in recent (2015 – 2018) publications at some of the top-tier SE conferences and journals. We classify SE publications, based on their relevance to different values, against a widely used value structure adopted from social sciences. Our results show that: (a) only a small proportion of the publications directly consider values, classified as relevant publications; (b) for the majority of the values, very few or no relevant publications were found; and (c) the prevalence of the relevant publications was higher in SE conferences compared to SE journals. This paper shares these and other insights that motivate research on human values in software engineering.

Finding out why

Paper: Audits as Evidence: Experiments, Ensembles, and Enforcement

We develop tools for utilizing correspondence experiments to detect illegal discrimination by individual employers. Employers violate US employment law if their propensity to contact applicants depends on protected characteristics such as race or sex. We establish identification of higher moments of the causal effects of protected characteristics on callback rates as a function of the number of fictitious applications sent to each job ad. These moments are used to bound the fraction of jobs that illegally discriminate. Applying our results to three experimental datasets, we find evidence of significant employer heterogeneity in discriminatory behavior, with the standard deviation of gaps in job-specific callback probabilities across protected groups averaging roughly twice the mean gap. In a recent experiment manipulating racially distinctive names, we estimate that at least 85% of jobs that contact both of two white applications and neither of two black applications are engaged in illegal discrimination. To assess the tradeoff between type I and II errors presented by these patterns, we consider the performance of a series of decision rules for investigating suspicious callback behavior under a simple two-type model that rationalizes the experimental data. Though, in our preferred specification, only 17% of employers are estimated to discriminate on the basis of race, we find that an experiment sending 10 applications to each job would enable accurate detection of 7-10% of discriminators while falsely accusing fewer than 0.2% of non-discriminators. A minimax decision rule acknowledging partial identification of the joint distribution of callback rates yields higher error rates but more investigations than our baseline two-type model. Our results suggest illegal labor market discrimination can be reliably monitored with relatively small modifications to existing audit designs.

Paper: The Design of Mutual Information

We derive the functional form of mutual information (MI) from a set of design criteria and a principle of maximal sufficiency. The MI between two sets of propositions is a global quantifier of correlations and is implemented as a tool for ranking joint probability distributions with respect to said correlations. The derivation parallels the derivations of relative entropy with an emphasis on the behavior of independent variables. By constraining the functional $I$ according to special cases, we arrive at its general functional form and hence establish a clear meaning behind its definition. We also discuss the notion of sufficiency and offer a new definition which broadens its applicability.
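
For orientation, the standard discrete form of MI is $I(X;Y) = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)p(y)}$. The sketch below simply evaluates that textbook formula on two toy joint distributions; the function name `mutual_information` and the example tables are illustrative assumptions, not taken from the paper, and the paper's derivation from design criteria is not reproduced here.

```python
import math


def mutual_information(joint):
    """Mutual information (in nats) of a discrete joint distribution.

    `joint` is a list of rows with joint[i][j] = p(x_i, y_j); entries sum to 1.
    """
    px = [sum(row) for row in joint]        # marginal p(x)
    py = [sum(col) for col in zip(*joint)]  # marginal p(y)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:  # zero-probability cells contribute nothing
                mi += pxy * math.log(pxy / (px[i] * py[j]))
    return mi


# Toy tables (assumptions for illustration only).
independent = [[0.25, 0.25], [0.25, 0.25]]  # p(x, y) = p(x) p(y)
correlated = [[0.5, 0.0], [0.0, 0.5]]       # x determines y

mi_independent = mutual_information(independent)
mi_correlated = mutual_information(correlated)
```

The independent table gives an MI of zero, while the perfectly correlated one gives log 2 nats, which matches the abstract's description of MI as a way of ranking joint distributions by their correlations.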

Paper: Defining mediation effects for multiple mediators using the concept of the target randomized trial

Causal mediation approaches have been primarily developed for the goal of ‘explanation’, that is, to understand the pathways that lead from a cause to its effect. A related goal is to evaluate the impact of interventions on mediators, for example in epidemiological studies seeking to inform policies to improve outcomes for sick or disadvantaged populations by targeting intermediate processes. While there has been some methodological work on evaluating mediator interventions, no proposal explicitly defines the target estimands in terms of a ‘target trial’: the hypothetical randomized controlled trial that one might seek to emulate. In this paper, we define so-called interventional effects in terms of a target trial evaluating a number of population-level mediator interventions in the context of multiple interdependent mediators and real-world constraints of policy implementation such as limited resources, with extension to the evaluation of sequential interventions. We describe the assumptions required to identify these novel effects from observational data and a g-computation estimation method. This work was motivated by an investigation into alternative strategies for improving the psychosocial outcomes of adolescent self-harmers, based on data from the Victorian Adolescent Health Cohort Study. We use this example to show how our approach can be used to inform the prioritization of alternative courses of action. Our proposal opens up avenues for the definition and estimation of mediation effects that are policy-relevant, providing a valuable tool for building an evidence base on which to justify future time and financial investments in the development and evaluation of interventions.

Paper: Explaining Classifiers with Causal Concept Effect (CaCE)

How can we understand classification decisions made by deep neural nets? We propose answering this question by using ideas from causal inference. We define the “Causal Concept Effect” (CaCE) as the causal effect that the presence or absence of a concept has on the prediction of a given deep neural net. We then use this measure as a means to understand what drives the network’s prediction and what does not. Yet many existing interpretability methods rely solely on correlations, resulting in potentially misleading explanations. We show how CaCE can avoid such mistakes. In high-risk domains such as medicine, knowing the root cause of the prediction is crucial. If we knew that the network’s prediction was caused by arbitrary concepts such as the lighting conditions in an X-ray room instead of a medically meaningful concept, this would prevent us from disastrous deployment of such models. Estimating CaCE is difficult in situations where we cannot easily simulate the do-operator. As a simple solution, we propose learning a generative model, specifically a Variational AutoEncoder (VAE) on image pixels or image embeddings extracted from the classifier to measure VAE-CaCE. We show that VAE-CaCE is able to correctly estimate the true causal effect as compared to other baselines in controlled settings with synthetic and semi-natural high dimensional images.
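
The gap between a correlational and an interventional reading can be sketched with a toy simulation. Everything here is a hypothetical stand-in: the attributes `bright` and `lesion`, the generator `sample_image`, and the `classifier` are assumptions for illustration, and the do-operator is simulated directly rather than through a VAE as in the paper. The effect is computed as the difference in mean prediction when the concept is forced on versus forced off.

```python
import random


def sample_image(rng, do_lesion=None):
    """Toy generative process returning binary image attributes.

    Observationally, `lesion` agrees with `bright` 90% of the time, so the
    two are correlated. Passing `do_lesion` overrides `lesion` without
    touching `bright`, which mimics an intervention on the concept.
    """
    bright = rng.random() < 0.5
    if do_lesion is None:
        lesion = bright if rng.random() < 0.9 else not bright
    else:
        lesion = do_lesion
    return {"bright": bright, "lesion": lesion}


def classifier(image):
    """Toy stand-in for a network that keys on lighting only."""
    return 1.0 if image["bright"] else 0.0


def mean(values):
    return sum(values) / len(values)


def interventional_effect(rng, n=20000):
    """Mean prediction with the concept forced on, minus forced off."""
    on = mean([classifier(sample_image(rng, do_lesion=True)) for _ in range(n)])
    off = mean([classifier(sample_image(rng, do_lesion=False)) for _ in range(n)])
    return on - off


def observational_gap(rng, n=20000):
    """Difference in mean prediction between images with and without the concept."""
    images = [sample_image(rng) for _ in range(n)]
    with_concept = [classifier(im) for im in images if im["lesion"]]
    without_concept = [classifier(im) for im in images if not im["lesion"]]
    return mean(with_concept) - mean(without_concept)


rng = random.Random(0)
causal = interventional_effect(rng)
correlational = observational_gap(rng)
```

In this toy setup the observational gap is large because the concept co-occurs with bright lighting, while the interventional effect is close to zero because the classifier never looks at the concept. This is the kind of misleading correlation the abstract warns about.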

Paper: Unbiased Learning to Rank: Counterfactual and Online Approaches

This tutorial covers and contrasts the two main methodologies in unbiased Learning to Rank (LTR): Counterfactual LTR and Online LTR. There has long been an interest in LTR from user interactions; however, this form of implicit feedback is very biased. In recent years, unbiased LTR methods have been introduced to remove the effect of different types of bias caused by user behavior in search. For instance, a well-addressed type of bias is position bias: the rank at which a document is displayed heavily affects the interactions it receives. Counterfactual LTR methods deal with such types of bias by learning from historical interactions while correcting for the effect of the explicitly modelled biases. Online LTR does not use an explicit user model; in contrast, it learns through an interactive process where randomized results are displayed to the user. Through randomization the effect of different types of bias can be removed from the learning process. Though both methodologies lead to unbiased LTR, their approaches differ considerably, and so do their theoretical guarantees, empirical results, effects on the user experience during learning, and applicability. Consequently, for practitioners the choice between the two is highly consequential. By providing an overview of both approaches and contrasting them, we aim to provide an essential guide to unbiased LTR so as to aid in understanding and choosing between methodologies.
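
One common way to correct logged clicks for position bias is inverse propensity scoring (IPS), where each click is weighted by the inverse of the probability that its rank was examined. The sketch below is a minimal illustration of that idea and is not taken from the tutorial: the examination propensities, the relevance values, and all function names are assumptions, and the propensities are treated as known, which in practice they rarely are.

```python
import random

# Hypothetical probability that a user examines each rank (position bias).
PROPENSITY = [1.0, 0.5, 0.25]
# Hypothetical true relevance of three documents.
RELEVANCE = {"a": 0.2, "b": 0.8, "c": 0.8}


def simulate_log(ranking, n, rng):
    """Log n impressions of a fixed ranking; a click needs examination and relevance."""
    log = []
    for _ in range(n):
        for rank, doc in enumerate(ranking):
            examined = rng.random() < PROPENSITY[rank]
            relevant = rng.random() < RELEVANCE[doc]
            log.append((doc, rank, examined and relevant))
    return log


def naive_ctr(log):
    """Raw click-through rate per document, ignoring position bias."""
    clicks, shows = {}, {}
    for doc, _rank, clicked in log:
        shows[doc] = shows.get(doc, 0) + 1
        clicks[doc] = clicks.get(doc, 0) + int(clicked)
    return {doc: clicks[doc] / shows[doc] for doc in shows}


def ips_estimate(log):
    """Relevance estimate per document, weighting each click by 1 / propensity."""
    weighted, shows = {}, {}
    for doc, rank, clicked in log:
        shows[doc] = shows.get(doc, 0) + 1
        if clicked:
            weighted[doc] = weighted.get(doc, 0.0) + 1.0 / PROPENSITY[rank]
    return {doc: weighted.get(doc, 0.0) / shows[doc] for doc in shows}


rng = random.Random(0)
log = simulate_log(["a", "b", "c"], 20000, rng)
naive = naive_ctr(log)
ips = ips_estimate(log)
```

In this simulation the naive click-through rate makes document "c" look about as relevant as "a" simply because it sits at the bottom of the list, while the IPS estimate recovers values close to the assumed relevances. Online LTR, by contrast, would randomize the displayed rankings rather than rely on a propensity model.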

Paper: Potential Outcome and Directed Acyclic Graph Approaches to Causality: Relevance for Empirical Practice in Economics

In this essay I discuss potential outcome and graphical approaches to causality, and their relevance for empirical work in economics. I review some of the work on directed acyclic graphs, including the recent ‘The Book of Why,’ by Pearl and MacKenzie. I also discuss the potential outcome framework developed by Rubin and coauthors, building on work by Neyman. I then discuss the relative merits of these approaches for empirical work in economics, focusing on the questions each answers well, and why much of the work in economics is closer in spirit to the potential outcome framework.

R Packages worth a look

Global Envelopes (GET)
Implementation of global envelopes with intrinsic graphical interpretation which can be used for graphical Monte Carlo and permutation tests where the …

Interactive Document for Working with Variance Analysis (VTShiny)
An interactive document on the topic of variance analysis using ‘rmarkdown’ and ‘shiny’ packages. Runtime examples are provided in the package function …

Multistage Allocation (R2BEAT)
Multivariate optimal allocation for different domains in one and two stages stratified sample design. R2BEAT extends the Neyman (1934) <doi:10.2307/ …

Collinearity Detection in a Multiple Linear Regression Model (multiColl)
The detection of worrying approximate collinearity in a multiple linear regression model is a problem addressed in all existing statistical packages. H …

What’s new on arXiv – Complete List

Towards Generation of Visual Attention Map for Source Code
Automatic Repair and Type Binding of Undeclared Variables using Neural Networks
Predicting Merge Conflicts in Collaborative Software Development
Characterizing Developer Use of Automatically Generated Patches
Patterns of Effort Contribution and Demand and User Classification based on Participation Patterns in NPM Ecosystem
The Design of Mutual Information
Metamorphic Testing of a Deep Learning based Forecaster
A Divide-and-Conquer Approach towards Understanding Deep Networks
Estimation and Feature Selection in Mixtures of Generalized Linear Experts Models
Task Selection Policies for Multitask Learning
Bayesian Synthesis of Probabilistic Programs for Automatic Data Modeling
Measuring the Transferability of Adversarial Examples
Exploring Deep Anomaly Detection Methods Based on Capsule Net
A Simple Uniformly Valid Test for Inequalities
Discriminative Active Learning
What does it mean to understand a neural network?
Sequential online prediction in the presence of outliers and change points: an instant temporal structure learning approach
Dynamical Systems as Temporal Feature Spaces
Comprehensive Process Drift Detection with Visual Analytics
Quick, Stat!: A Statistical Analysis of the Quick, Draw! Dataset
A Causal Bayesian Networks Viewpoint on Fairness
Confidentiality and linked data
A study on the Interpretability of Neural Retrieval Models using DeepSHAP
A Recursive Algebraic Coloring Technique for Hardware-Efficient Symmetric Sparse Matrix-Vector Multiplication
Markov chain Monte Carlo algorithms with sequential proposals
The Age of Incorrect Information: A New Performance Metric for Status Updates
Agglomerative Attention
Pathways to Good Healthcare Services and Patient Satisfaction: An Evolutionary Game Theoretical Approach
Parallelism Theorem and Derived Rules for Parallel Coherent Transformations
Low Power Receiver Front Ends: Scaling Laws and Applications
Spectrum Sensing and Resource Allocation for 5G Heterogeneous Cloud Radio Access Networks
Cover and variable degeneracy
Variable degeneracy on toroidal graphs
Tensor Methods for Finding Approximate Stationary Points of Convex Functions
Energy-Efficient Activation and Uplink Transmission for Cellular IoT
Smile, be Happy :) Emoji Embedding for Visual Sentiment Analysis
Modeling the Uncertainty in Electronic Health Records: a Bayesian Deep Learning Approach
On Rado conditions for nonlinear Diophantine equations
Compressed Subspace Learning Based on Canonical Angle Preserving Property
FoodX-251: A Dataset for Fine-grained Food Classification
Microsoft Translator at WMT 2019: Towards Large-Scale Document-Level Neural Machine Translation
On Happy Colorings, Cuts, and Structural Parameterizations
The FAST Algorithm for Submodular Maximization
Fast Graph Sampling Set Selection Using Gershgorin Disc Alignment
Hybrid Offline-Online Design for UAV-Enabled Data Harvesting in Probabilistic LoS Channel
Metadata Extraction from Raw Astroparticle Data of TAIGA Experiment
On decomposing complete tripartite graphs into 5-cycles
Emergency DC Power Support Strategy Based on Coordinated Droop Control in Multi-Infeed HVDC System
A GPU implementation of the Discontinuous Galerkin method for simulation of diffusion in brain tissue
On improving learning capability of ELM and an application to brain-computer interface
On the Role of Time in Learning
A local maximizer for lattice width of $3$-dimensional hollow bodies
Necessary and sufficient condition for equilibrium of the Hotelling model
Wong-Zakai approximations with convergence rate for stochastic partial differential equations
Unsupervised Automatic Building Extraction Using Active Contour Model on Unregistered Optical Imagery and Airborne LiDAR Data
A Note on M-convex Functions on Jump Systems
Simple Automatic Post-editing for Arabic-Japanese Machine Translation
Guaranteeing E2E QoS via Joint Radio and NFV Resource Allocation for 5G and Beyond
An efficient method to construct self-dual cyclic codes of length $p^s$ over $\mathbb{F}_{p^m}+u\mathbb{F}_{p^m}$
A fast direct solver for two dimensional quasi-periodic multilayered medium scattering problems
A Simple BERT-Based Approach for Lexical Simplification
Designing Unimodular Sequences with Optimized Auto/cross-correlation Properties via Consensus-ADMM/PDMM Approaches
Multi-Level Order-Flow Imbalance in a Limit Order Book
Avoiding Membrane Locking with Regge Interpolation
Pointwise adaptive kernel density estimation under local approximate differential privacy
Combinatorial t-designs from quadratic functions
Markov-switching State Space Models for Uncovering Musical Interpretation
A new bound for the crossing number of wrapped butterflies
On the Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost
State Estimation in Visual Inertial Autonomous Helicopter Landing Using Optimisation on Manifold
Delivery, consistency, and determinism: rethinking guarantees in distributed stream processing
Wave solutions of Gilson-Pickering equation
On the Equivalence of Youla, System-level and Input-output Parameterizations
More Supervision, Less Computation: Statistical-Computational Tradeoffs in Weakly Supervised Learning
Feature space transformations and model selection to improve the performance of classifiers
Counterfactual Reasoning for Fair Clinical Risk Prediction
Solving Mean-Payoff Games via Quasi Dominions
Hybrid Model-Based and Data-Driven Wind Velocity Estimator for the Navigation System of a Robotic Airship
Flux superperiods and periodicity transitions in quantum Hall interferometers
ALFA: A Dataset for UAV Fault and Anomaly Detection
An Artificial Spiking Quantum Neuron
Metric Thickenings, Borsuk-Ulam Theorems, and Orbitopes
Wall-crossings for Hassett descendant potentials
Synchronization for KPZ
Discourse Behavior of Older Adults Interacting With a Dialogue Agent Competent in Multiple Topics
Autoencoding sensory substitution
Learning Neural Networks with Adaptive Regularization
Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning
TWEETQA: A Social Media Focused Question Answering Dataset
An Approach Based on Bayesian Networks for Query Selectivity Estimation
Perceptually Motivated Method for Image Inpainting Comparison
Universal Non-Intrusive Load Monitoring (NILM) Using Filter Pipelines, Probabilistic Knapsack, and Labelled Partition Maps
Compound TCP with Random Early Detection (RED): stability, bifurcation and performance analyses
On the Evolution of U.S. Temperature Dynamics
Tensor train-Karhunen-Loève expansion for continuous-indexed random fields using higher-order cumulant functions
Resource theory of asymmetric distinguishability for quantum channels
Splaying Preorders and Postorders
New Paths from Splay to Dynamic Optimality
Numerical study of vanishing and spreading dynamics of chemotaxis systems with logistic source and a free boundary
Enabling Multi-Shell b-Value Generalizability of Data-Driven Diffusion Models with Deep SHORE
Gradient Flow Based Discretized Kohn-Sham Density Functional Theory
A Novel User Representation Paradigm for Making Personalized Candidate Retrieval
FastV2C-HandNet: Fast Voxel to Coordinate Hand Pose Estimation with 3D Convolutional Neural Networks
Multilevel Particle Filters for the Non-Linear Filtering Problem in Continuous Time
Ranking sentences from product description & bullets for better search
Myers-Briggs Personality Classification and Personality-Specific Language Generation Using Pre-trained Language Models
Seedless Graph Matching via Tail of Degree Distribution for Correlated Erdos-Renyi Graphs
Remarks on Gross’ technique for obtaining a conformal Skorohod embedding of planar Brownian motion
An efficient estimator of the parameters of the Generalized Lambda Distribution
A dynamic over games drives selfish agents to win-win outcomes
Alternating Direction Method of Multipliers (ADMMs) Based Distributed Approach For Wide-Area Control
Controlling Model Complexity in Probabilistic Model-Based Dynamic Optimization of Neural Network Structures
Joint Language Identification of Code-Switching Speech using Attention based E2E Network
A Constructive Proof of Jacobi’s Identity for the Sum of Two Squares
Structural multiscale topology optimization with stress constraint for additive manufacturing
Motorway Traffic Flow Prediction using Advanced Deep Learning
Entanglement-assisted Quantum Codes from Algebraic Geometry Codes
CA-RefineNet: A Dual Input WSI Image Segmentation Algorithm Based on Attention
The Elusive Model of Technology, Media, Social Development, and Financial Sustainability
Micro, Meso, Macro: the effect of triangles on communities in networks
Linked partition ideals, directed graphs and $q$-multi-summations
Multimodal deep networks for text and image-based document classification
Mitigating the Hubness Problem for Zero-Shot Learning of 3D Objects
Edge-bipancyclicity of bubble-sort star graphs
Proper Orientation Number of Triangle-free Bridgeless Outerplanar Graphs
Reliability-Latency Performance of Frameless ALOHA with and without Feedback
GLOSS: Generative Latent Optimization of Sentence Representations
Single-Component Privacy Guarantees in Helper Data Systems and Sparse Coding
Sequence Level Semantics Aggregation for Video Object Detection
A Dual Memory Structure for Efficient Use of Replay Memory in Deep Reinforcement Learning
Energy cost for target control of complex networks
Optimal Control of a Hot Plasma
Improving the Harmony of the Composite Image by Spatial-Separated Attention Module
Investigation on N-gram Approximated RNNLMs for Recognition of Morphologically Rich Speech
Stabilized Barzilai-Borwein method
To Model or to Intervene: A Comparison of Counterfactual and Online Learning to Rank from User Interactions
Concept-Centric Visual Turing Tests for Method Validation
One-dimensional minimizers for a diffuse interface generalized antiferromagnetic model in general dimension
Stochastic Galerkin finite volume shallow flow model: well-balanced treatment over uncertain topography
On the Polarization of Rényi Entropy
Flag-transitive non-symmetric $2$-designs with $(r,λ)=1$ and exceptional groups of Lie type
Multi-hop Federated Private Data Augmentation with Sample Compression
A Neural Turing Machine for Conditional Transition Graph Modeling
Noise-Stable Rigid Graphs for Euclidean Embedding
Empirical Coordination Subject to a Fidelity Criterion
Emergence of a bicritical end point in the random crystal field Blume-Capel model
Shadow Simulated Annealing algorithm: a new tool for global optimisation and statistical inference
RaKUn: Rank-based Keyword extraction via Unsupervised learning and Meta vertex aggregation
Exponential decay of correlations in the 2D random field Ising model
Improved Penalty Algorithm for Mixed Integer PDE Constrained Optimization (MIPDECO) Problems
Anonymous and confidential file sharing over untrusted clouds
Out-of-core singular value decomposition
Neural network regression for Bermudan option pricing
Fast Algorithms and Theory for High-Dimensional Bayesian Varying Coefficient Models
Proximal Policy Optimization with Mixed Distributed Training
Unsupervised Fault Detection in Varying Operating Conditions
Color Cerberus
Naver Labs Europe’s Systems for the WMT19 Machine Translation Robustness Task
DeepSUM: Deep neural network for Super-resolution of Unregistered Multitemporal images
Robust Nonlinear Component Estimation with Tikhonov Regularization
An Efficient Framework for Visible-Infrared Cross Modality Person Re-Identification
Labels instead of coefficients: a label bracket which dominates the Jones polynomial, the Kuperberg bracket, and the normalised arrow polynomial
Detecting and Simulating Artifacts in GAN Fake Images
A new combinatorial interpretation of the Fibonacci numbers squared
Tracking sex: The implications of widespread sexual data leakage and tracking on porn websites
Inapproximability within W[1]: the case of Steiner Orientation
Federated Reinforcement Distillation with Proxy Experience Memory
Computing the Kreiss Constant of a Matrix
Deep Sequential Mosaicking of Fetoscopic Videos
Subgroups of simple primitive permutation groups defined by unordered relations
A Dimension-free Algorithm for Contextual Continuum-armed Bandits
Dynamic Tube MPC for Nonlinear Systems
Asking Clarifying Questions in Open-Domain Information-Seeking Conversations
Should we Embed? A Study on the Online Performance of Utilizing Embeddings for Real-Time Job Recommendations
Achievable Data Rate for URLLC-Enabled UAV Systems with 3-D Channel Model
Addressing Delayed Feedback for Continuous Training with Neural Networks in CTR prediction
The Elicitation of Prior Distributions for Bayesian Responsive Survey Design: Historical Data Analysis vs. Literature Review
The Many AI Challenges of Hearthstone
Recovery Guarantees for Compressible Signals with Adversarial Noise
Improved Hybrid Layered Image Compression using Deep Learning and Traditional Codecs
Posterior Predictive Treatment Assignment Methods for Causal Inference in the Context of Time-Varying Treatments
On Cyclic Finite-State Approximation of Data-Driven Systems
Automated Playtesting of Matching Tile Games
Efficient Video Generation on Complex Datasets
Improved Budgeted Connected Domination and Budgeted Edge-Vertex Domination
Probability inequalities for high dimensional time series under a triangular array framework
Experimental machine learning quantum homodyne tomography
Forced synchronization of an oscillator with a line of equilibria
A comparison of European and Asian options under Markov additive processes
Medical Concept Representation Learning from Claims Data and Application to Health Plan Payment Risk Adjustment
Hotelling Games with Multiple Line Faults
Bayesian Wavelet Shrinkage with Beta Priors
The saturation assumption yields optimal convergence of two-level adaptive BEM
Revealing posturographic features associated with the risk of falling in patients with Parkinsonian syndromes via machine learning
Facebook FAIR’s WMT19 News Translation Task Submission
Audits as Evidence: Experiments, Ensembles, and Enforcement
Zero-sum subsequences in bounded-sum $\{-r,s\}$-sequences
Multi-scale Graph-based Grading for Alzheimer’s Disease Prediction
Batch-Shaped Channel Gated Networks
A Geometric Perspective on Quantum Parameter Estimation