Risk-Sensitive Reinforcement Learning: A Constrained Optimization Viewpoint

The classic objective in a reinforcement learning (RL) problem is to find a policy that minimizes, in expectation, a long-run objective such as the infinite-horizon discounted or long-run average cost. In many practical applications, optimizing the expected value alone is not sufficient, and it may be necessary to include a risk measure in the optimization process, either as the objective or as a constraint. Various risk measures have been proposed in the literature, e.g., mean-variance tradeoff, exponential utility, the percentile performance, value at risk, conditional value at risk, prospect theory and its later enhancement, cumulative prospect theory. In this article, we focus on the combination of risk criteria and reinforcement learning in a constrained optimization framework, i.e., a setting where the goal is to find a policy that optimizes the usual objective of infinite-horizon discounted/average cost, while ensuring that an explicit risk constraint is satisfied. We introduce the risk-constrained RL framework, cover popular risk measures based on variance, conditional value-at-risk and cumulative prospect theory, and present a template for a risk-sensitive RL algorithm. We survey some of our recent work on this topic, covering problems encompassing discounted cost, average cost, and stochastic shortest path settings, together with the aforementioned risk measures in a constrained framework. This non-exhaustive survey is aimed at giving a flavor of the challenges involved in solving a risk-sensitive RL problem, and outlining some potential future research directions.
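To make two of the risk measures named above concrete, here is a minimal sketch of empirical value at risk (VaR) and conditional value at risk (CVaR) computed from sampled costs. It is not taken from the survey: the function name `empirical_var_cvar`, the use of a plain sample quantile, and the convention that larger costs are worse are all assumptions made for illustration.

```python
import numpy as np

def empirical_var_cvar(costs, alpha=0.95):
    """Estimate VaR and CVaR at level alpha from a sample of costs.

    Assumption: larger cost is worse, so risk sits in the upper tail.
    """
    costs = np.asarray(costs, dtype=float)
    # VaR: the alpha-quantile of the sampled cost distribution.
    var = np.quantile(costs, alpha)
    # CVaR: the average cost over the tail at or beyond the VaR.
    tail = costs[costs >= var]
    return var, tail.mean()

# Hypothetical sample: costs 1, 2, ..., 100.
var, cvar = empirical_var_cvar(np.arange(1, 101), alpha=0.9)
```

In a risk-constrained formulation of the kind the abstract describes, an estimate such as `cvar` would be compared against a threshold while the expected cost is minimized; how that constraint is enforced is left to the surveyed algorithms.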


A Simple Baseline Algorithm for Graph Classification

Graph classification has recently received a lot of attention from various fields of machine learning, e.g., kernel methods, sequential modeling or graph embedding. All these approaches offer promising results with different respective strengths and weaknesses. However, most of them rely on complex mathematics and require heavy computational power to achieve their best performance. We propose a simple and fast algorithm based on the spectral decomposition of the graph Laplacian to perform graph classification and get a first reference score for a dataset. We show that this method obtains competitive results compared to state-of-the-art algorithms.
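As an illustration of what a spectral baseline of this kind might look like, the sketch below turns a graph into a fixed-length feature vector built from the smallest eigenvalues of its normalized Laplacian. The abstract does not specify these details, so the choice of the normalized Laplacian, the zero padding, and the function name `laplacian_spectrum_features` are assumptions rather than the authors' exact method.

```python
import numpy as np

def laplacian_spectrum_features(adj, k):
    """Return the k smallest normalized-Laplacian eigenvalues of a graph.

    adj: symmetric adjacency matrix of an undirected graph.
    Graphs with fewer than k nodes are zero-padded (an assumption).
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    nonzero = deg > 0
    # D^{-1/2}, with isolated nodes mapped to 0 to avoid division by zero.
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[nonzero] = 1.0 / np.sqrt(deg[nonzero])
    # Normalized Laplacian: L = I - D^{-1/2} A D^{-1/2} on non-isolated nodes.
    lap = np.diag(nonzero.astype(float)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    eigvals = np.linalg.eigvalsh(lap)  # returned in ascending order
    feats = np.zeros(k)
    m = min(k, eigvals.shape[0])
    feats[:m] = eigvals[:m]
    return feats

# Hypothetical usage: a triangle graph.
triangle = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
features = laplacian_spectrum_features(triangle, k=3)
```

The resulting vectors have the same length for every graph, so they could be passed to any off-the-shelf classifier to obtain a reference score.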


What is an Ontology?

In the knowledge engineering community ‘ontology’ is usually defined in the tradition of Gruber as an ‘explicit specification of a conceptualization’. Several variations of this definition exist. In the paper we argue that (with one notable exception) these definitions are of no explanatory value, because they violate one of the basic rules for good definitions: The defining statement (the definiens) should be clearer than the term that is defined (the definiendum). In the paper we propose a different definition of ‘ontology’ and discuss how it helps to explain various phenomena: the ability of ontologies to change, the role of the choice of vocabulary, the significance of annotations, the possibility of collaborative ontology development, and the relationship between ontological conceptualism and ontological realism.


Node Representation Learning for Directed Graphs

We propose a novel approach for learning node representations in directed graphs, which maintains separate views or embedding spaces for the two distinct node roles induced by the directionality of the edges. In order to achieve this, we propose a novel alternating random walk strategy to generate training samples from the directed graph while preserving the role information. These samples are then trained using Skip-Gram with Negative Sampling (SGNS) with nodes retaining their source/target semantics. We conduct extensive experimental evaluation to showcase our effectiveness on several real-world datasets on link prediction, multi-label classification and graph reconstruction tasks. We show that the embeddings from our approach are indeed robust, generalizable and well performing across multiple kinds of tasks and networks. We show that we consistently outperform all random-walk based neural embedding methods for link prediction and graph reconstruction tasks. In addition to providing a theoretical interpretation of our method, we also show that we are considerably more robust than the other directed graph approaches.


From Hard to Soft: Understanding Deep Network Nonlinearities via Vector Quantization and Statistical Inference

Nonlinearity is crucial to the performance of a deep (neural) network (DN). To date there has been little progress understanding the menagerie of available nonlinearities, but recently progress has been made on understanding the rôle played by piecewise affine and convex nonlinearities like the ReLU and absolute value activation functions and max-pooling. In particular, DN layers constructed from these operations can be interpreted as max-affine spline operators (MASOs) that have an elegant link to vector quantization (VQ) and K-means. While this is good theoretical progress, the entire MASO approach is predicated on the requirement that the nonlinearities be piecewise affine and convex, which precludes important activation functions like the sigmoid, hyperbolic tangent, and softmax. This paper extends the MASO framework to these and an infinitely large class of new nonlinearities by linking deterministic MASOs with probabilistic Gaussian Mixture Models (GMMs). We show that, under a GMM, piecewise affine, convex nonlinearities like ReLU, absolute value, and max-pooling can be interpreted as solutions to certain natural ‘hard’ VQ inference problems, while sigmoid, hyperbolic tangent, and softmax can be interpreted as solutions to corresponding ‘soft’ VQ inference problems. We further extend the framework by hybridizing the hard and soft VQ optimizations to create a β-VQ inference that interpolates between hard, soft, and linear VQ inference. A prime example of a β-VQ DN nonlinearity is the swish nonlinearity, which offers state-of-the-art performance in a range of computer vision tasks but was developed ad hoc by experimentation. Finally, we validate with experiments an important assertion of our theory, namely that DN performance can be significantly improved by enforcing orthogonality in its linear filters.
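The interpolation between linear, soft, and hard behavior can be seen numerically in the swish family itself. The sketch below uses the commonly cited parameterization swish(x) = x · sigmoid(βx). Note that this `beta` is the usual swish slope parameter and is an assumption here; it is not necessarily the same β as in the paper's β-VQ formulation.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function via tanh.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def swish(x, beta=1.0):
    """Swish nonlinearity: x * sigmoid(beta * x)."""
    x = np.asarray(x, dtype=float)
    return x * sigmoid(beta * x)

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
linear_limit = swish(x, beta=0.0)   # equals x / 2, a purely linear map
soft_gate = swish(x, beta=1.0)      # smooth, sigmoid-gated unit
hard_limit = swish(x, beta=50.0)    # close to ReLU, max(x, 0)
```

With `beta=0` the gate is constant at 1/2 and the unit is linear, while a large `beta` sharpens the gate toward a step function and the unit approaches ReLU.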


Alternating Linear Bandits for Online Matrix-Factorization Recommendation

We consider the problem of collaborative filtering in the online setting, where items are recommended to the users over time. At each time step, the user (selected by the environment) consumes an item (selected by the agent) and provides a rating of the selected item. In this paper, we propose a novel algorithm for online matrix factorization recommendation that combines linear bandits and alternating least squares. In this formulation, the bandit feedback is equal to the difference between the ratings of the best and selected items. We evaluate the performance of the proposed algorithm over time using both cumulative regret and average cumulative NDCG. Simulation results over three synthetic datasets as well as three real-world datasets for online collaborative filtering indicate the superior performance of the proposed algorithm over two state-of-the-art online algorithms.


An Exploration of Dropout with RNNs for Natural Language Inference

Dropout is a crucial regularization technique for the Recurrent Neural Network (RNN) models of Natural Language Inference (NLI). However, dropout has not been evaluated for its effectiveness at different layers and dropout rates in NLI models. In this paper, we propose a novel RNN model for NLI and empirically evaluate the effect of applying dropout at different layers in the model. We also investigate the impact of varying dropout rates at these layers. Our empirical evaluation on a large (Stanford Natural Language Inference (SNLI)) and a small (SciTail) dataset suggests that dropout at each feed-forward connection severely affects the model accuracy at increasing dropout rates. We also show that regularizing the embedding layer is efficient for SNLI whereas regularizing the recurrent layer improves the accuracy for SciTail. Our model achieved an accuracy of 86.14% on the SNLI dataset and 77.05% on SciTail.


On $s$-distance-transitive graphs
Constituent Parsing as Sequence Labeling
Transition-based Parsing with Lighter Feed-Forward Networks
Visualization Framework for Colonoscopy Videos
Safe Adaptive Cruise Control with Road Grade Preview and V2V Communication
Robust Receiver Design for Non-orthogonal Multiple Access
Signal Adaptive Variable Selector for the Horseshoe Prior
Theoretical and Practical Aspects of the Linear Tape Scheduling Problem
A Non-asymptotic, Sharp, and User-friendly Reverse Chernoff-Cramèr Bound
Spatial Co-location Pattern Mining – A new perspective using Graph Database
3D shape retrieval basing on representatives of classes
On unconstrained optimization problems solved using CDT and triality theory
Combinatorics of $k$-Farey graphs
C2A: Crowd Consensus Analytics for Virtual Colonoscopy
On a linear functional for infinitely divisible moving average random fields
eXogenous Kalman Filter for Lithium-Ion Batteries State-of-Charge Estimation in Electric Vehicles
Local Properties via Color Energy Graphs and Forbidden Configurations
Correcting an estimator of a multivariate monotone function with isotonic regression
Hierarchical ResNeXt Models for Breast Cancer Histology Image Classification
Actor-Critic Policy Optimization in Partially Observable Multiagent Environments
Distributed Approximate Distance Oracles
Soft Concept Analysis
Depth with Nonlinearity Creates No Bad Local Minima in ResNets
Patient Subtyping with Disease Progression and Irregular Observation Trajectories
VIENA2: A Driving Anticipation Dataset
On DC based Methods for Phase Retrieval
A convex integer programming approach for optimal sparse PCA
Optimal electricity demand response contracting with responsiveness incentives
Where is this? Video geolocation based on neural network features
On the Conditional Smooth Renyi Entropy and its Applications in Guessing and Source Coding
Learning from the Kernel and the Range Space
Labeling Gaps Between Words: Recognizing Overlapping Mentions with Mention Separators
Atrial fibrosis quantification based on maximum likelihood estimator of multivariate images
Our Practice Of Using Machine Learning To Recognize Species By Voice
Sparsemax and Relaxed Wasserstein for Topic Sparsity
ComNet: Combination of Deep Learning and Expert Knowledge in OFDM Receivers
A general learning system based on neuron bursting and tonic firing
SG-One: Similarity Guidance Network for One-Shot Semantic Segmentation
Interpretability is Harder in the Multiclass Setting: Axiomatic Interpretability for Multiclass Additive Models
Degree growth for tame automorphisms of an affine quadric threefold
A Variable Reduction Method for Large-Scale Security Constrained Unit Commitment
Can We Gain More from Orthogonality Regularizations in Training Deep CNNs?
Actor-Expert: A Framework for using Action-Value Methods in Continuous Action Spaces
Norm-Range Partition: A Universal Catalyst for LSH based Maximum Inner Product Search (MIPS)
Evolution of holonic control architectures towards Industry 4.0: A short overview
Learning to Measure Change: Fully Convolutional Siamese Metric Networks for Scene Change Detection
The Bregman chord divergence
Surrogate modeling based on resampled polynomial chaos expansions
Uniform and $L^q$-Ensemble Reachability of Parameter-dependent Linear Systems
Atrial scars segmentation via potential learning in the graph-cuts framework
Distributed Mixed Voltage Angle and Frequency Droop Control of Microgrid Interconnections with Loss of Distribution-PMU Measurements
Unsupervised Detection of Anomalous Sound based on Deep Learning and the Neyman-Pearson Lemma
Do Deep Generative Models Know What They Don’t Know?
DNN-based Source Enhancement to Increase Objective Sound Quality Assessment Score
Bayesian Modelling of Lexis Mortality Data
Mining useful Macro-actions in Planning
Beyond ROUGE Scores in Algorithmic Summarization: Creating Fairness-Preserving Textual Summaries
Mean-based Heuristic Search for Real-Time Planning
PriSTE: From Location Privacy to Spatiotemporal Event Privacy
A Dual-Attention Hierarchical Recurrent Neural Network for Dialogue Act Classification
Temporal inactivation enhances robustness in an evolving system
Threat or Opportunity? – Examining Social Bots in Social Media Crisis Communication
Exploring Correlations in Multiple Facial Attributes through Graph Attention Network
Named Entity Disambiguation using Deep Learning on Graphs
A Maximum Likelihood-Based Minimum Mean Square Error Separation and Estimation of Stationary Gaussian Sources from Noisy Mixtures
Ensemble Method for Censored Demand Prediction
Optimal arrangements of hyperplanes for multiclass classification
Dating Ancient Paintings of Mogao Grottoes Using Deeply Learnt Visual Codes
The Hessenberg matrices and Catalan and its generalized numbers
Compositional coding capsule network with k-means routing for text classification
Learning sparse transformations through backpropagation
Computation via Interacting Magnetic Memory Bites: Integration of Boolean Gates
Subtleties in the interpretation of hazard ratios
Field Of Interest Proposal for Augmented Mitotic Cell Count: Comparison of two Convolutional Networks
Graph Convolutional Reinforcement Learning for Multi-Agent Cooperation
Multi-Agent Actor-Critic with Generative Cooperative Policy Network
Halfspace depth does not characterize probability distributions
Weighted asymmetric least squares regression for longitudinal data using GEE
Chance-Constrained AC Optimal Power Flow Integrating HVDC Lines and Controllability
On Number Rigidity for Pfaffian Point Processes
Cost-Sensitive Robustness against Adversarial Examples
Knowledge Graph Completion to Predict Polypharmacy Side Effects
Approximations of the boundary crossing probabilities for the maximum of moving sums
A Constraint-Reduced MPC Algorithm for Convex Quadratic Programming, with a Modified Active Set Identification Scheme
A Review on Learning Planning Action Models for Socio-Communicative HRI
Optimal terminal dimensionality reduction in Euclidean space
The Price equation program: simple invariances unify population dynamics, thermodynamics, probability, information and inference
Coalition Resilient Outcomes in Max k-Cut Games
Data-driven optimization of processes with degrading equipment
A Bayesian Nonparametrics based Robust Particle Filter Algorithm
Optimal distributed control of a stochastic Cahn-Hilliard equation
The Multi-Scale Impact of the Alzheimer’s Disease in the Topology Diversity of Astrocytes Molecular Communications Nanonetworks
Sparse constrained projection approximation subspace tracking
RCanopus: Making Canopus Resilient to Failures and Byzantine Faults
BioSentVec: creating sentence embeddings for biomedical texts
On the k-Boundedness for Existential Rules
Topological and metric recurrence for general Markov chains
Assessing the Impact of Gamification on Self-Directed Learning in Medical Students
Circuits through prescribed edges
On the number of limit cycles in asymmetric neural networks
Double-precision FPUs in High-Performance Computing: an Embarrassment of Riches?
Recovering Robustness in Model-Free Reinforcement learning
Baseline Detection in Historical Documents using Convolutional U-Nets
Adversarial Online Learning with noise
Generation of Virtual Dual Energy Images from Standard Single-Shot Radiographs using Multi-scale and Conditional Adversarial Network
Fast Dual Simulation Processing of Graph Database Queries (Supplement)
Weighted Super Poincare Inequalities for Infinite-Dimensional Extension of the Dirichlet Distribution
Enabling Efficient RDMA-based Synchronous Mirroring of Persistent Memory Transactions
ensmallen: a flexible C++ library for efficient function optimization
Coupled Longitudinal and Lateral Control of a Vehicle using Deep Learning
Description of Incomplete Financial Markets for the Discrete Time Evolution of Risk Assets
Brain Tumor Image Retrieval via Multitask Learning
On the Atkin and Swinnerton-Dyer type congruences for some truncated hypergeometric ${}_1F_0$ series
Predictive Linguistic Features of Schizophrenia
Linguistic Legal Concept Extraction in Portuguese
Unsupervised Learning of Shape and Pose with Differentiable Point Clouds
A minimax near-optimal algorithm for adaptive rejection sampling
A neuro-inspired architecture for unsupervised continual learning based on online clustering and hierarchical predictive coding
Nonhomogeneous Euclidean first-passage percolation and distance learning
Towards a context-dependent numerical data quality evaluation framework
Subcritical approximations to stochastic defocusing mass-critical nonlinear Schrödinger equation on $\mathbb{R}$
Event-triggered Natural Hazard Monitoring with Convolutional Neural Networks on the Edge
Properties of an N Time-Slice Dynamic Chain Event Graph
Human-Competitive Awards 2018
Optimality of the final model found via Stochastic Gradient Descent
Scaling up Deep Learning for PDE-based Models
New Bounds for the Dichromatic Number of a Digraph
Proactive Security: Embedded AI Solution for Violent and Abusive Speech Recognition
Bayesian multi-domain learning for cancer subtype discovery from next-generation sequencing count data
