Investigating practical, linear temporal difference learning

Off-policy reinforcement learning has many applications, including learning from demonstration, learning multiple goal-seeking policies in parallel, and representing predictive knowledge. Recently there has been a proliferation of new policy-evaluation algorithms that fill a longstanding algorithmic void in reinforcement learning: combining robustness to off-policy sampling, function approximation, linear complexity, and temporal difference (TD) updates. This paper contains two main contributions. First, we derive two new hybrid TD policy-evaluation algorithms, which fill a gap in this collection of algorithms. Second, we perform an empirical comparison to elicit which of these new linear TD methods should be preferred in different situations, and make concrete suggestions about practical use.


The Survival Complex

We introduce a new way to associate a simplicial complex called the \emph{survival complex} to a commutative semigroup with zero. Restricting our attention to the semigroup of monomials arising from an Artinian monomial ring, we determine that any such complex has an isolated point. Indeed, we show that there is exactly one isolated point essentially only in the case where the monomial ideal is generated purely by powers of the variables. This allows us to recover Beintema’s result that an Artinian monomial ring is Gorenstein if and only if it is a complete intersection. A key ingredient of the translation between the pure power result and Beintema’s result is given by the one-to-one correspondence we show between the so-called \emph{truly isolated} points of our complex and the generators of the socle of the defining ideal. In another relation between the geometry of the complex and the algebra of the ring, we essentially give a correspondence between the nontrivial connected components of the complex and the factors of a fibre product representation of the ring. Finally, we explore algorithms for building survival complexes from specified isolated points. That is, we work to build the ring out of a description of the socle.


Finite presentability and isomorphism of Cayley graphs of monoids

On the computational complexity of minimum-concave-cost flow in a two-dimensional grid

Theoretical Properties and Practical Performance of Fully Robust One-Sided Cross-Validation

Finding a Large Submatrix of a Gaussian Random Matrix

A better lower bound on average degree of 4-list-critical graphs

Random Chain Complexes

Modifications of Wald’s score tests on large dimensional covariance matrices structure

Significance Driven Hybrid 8T-6T SRAM for Energy-Efficient Synaptic Storage in Artificial Neural Networks

Multiplier-less Artificial Neurons Exploiting Error Resiliency for Energy-Efficient Neural Computing

Convergence properties of Gibbs samplers for Bayesian probit regression with proper priors

Online Tree Caching

Towards Neural Knowledge DNA

Content-based Video Indexing and Retrieval Using Corr-LDA

Approximating Bayesian confidence regions in convex inverse problems

Fourier transforms of polytopes, solid angle sums, and discrete volume

Fast Gibbs sampling for high-dimensional Bayesian inversion

Scalable Bayesian Rule Lists

CLT for linear eigenvalue statistics for a tensor product version of sample covariance matrices

The Second Neighborhood Conjecture For Oriented Graphs Missing Generalized Combs

On assessing the accuracy of defect free energy computations

Relationship Between the Uncompensated Price-Elasticity and the Income-Elasticity of Demand Under Conditions of Additive Preferences

Testing for parameter change in general integer-valued time series

Shuffle and Faà di Bruno Hopf Algebras in the Center Problem for Ordinary Differential Equations

QuotationFinder – Searching for Quotations and Allusions in Greek and Latin Texts and Establishing the Degree to Which a Quotation or Allusion Matches Its Source

On the Exit Time and Stochastic Homogenization of Isotropic Diffusions in Large Domains

Interval k-Graphs and Orders

Lie Access Neural Turing Machine

Robust hyperparameter estimation protects against hypervariable genes and improves power to detect differential expression

Combinatorial properties of poly-Bernoulli relatives

Landau-Zener Bloch oscillations with perturbed flat bands

Modelling of lung cancer survival data for critical illness insurances

Leading log expansion of combinatorial Dyson Schwinger equations

On the entropy numbers of the mixed smoothness function classes

Identification of Parallel Passages Across a Large Hebrew/Aramaic Corpus

The Erdős-Hajnal hypergraph Ramsey problem

Free subgroup numbers modulo prime powers: the non-periodic case

Improved bounds and algorithms for graph cuts and network reliability

A Structured Variational Auto-encoder for Learning Deep Hierarchies of Sparse Features

Heuristics for the Variable Sized Bin Packing Problem Using a Hybrid P-System and CUDA Architecture

Uniqueness of the extremal graph in the problem of maximizing the number of independent sets in regular graphs

Gibberish Semantics: How Good is Russian Twitter in Word Semantic Similarity Task?

Optimizing the Learning Order of Chinese Characters Using a Novel Topological Sort Algorithm

Stability and Structural Properties of Gene Regulation Networks with Coregulation Rules

Adjusting for Scorekeeper Bias in NBA Box Scores

Structured Prediction with Test-time Budget Constraints

New Statistical Perspective to The Cosmic Void Distribution

Macro vs. Micro Methods in Non-Life Claims Reserving (an Econometric Perspective)

Does quantification without adjustments work?

Counting results for sparse pseudorandom hypergraphs I

Counting results for sparse pseudorandom hypergraphs II

Maximum Pseudolikelihood Estimation for a Model-Based Clustering of time series Data

Inference in Functional Linear Quantile Regression

Quadratic covariations for the solution to a stochastic heat equation

Iterative Aggregation Method for Solving Principal Component Analysis Problems

An integral functional driven by fractional Brownian motion

Exploring the coevolution of predator and prey morphology and behavior

Exploratory data analysis for extreme values using non-parametric kernel methods

Compressing Graphs and Indexes with Recursive Graph Bisection

Access Time Tradeoffs in Archive Compression

Elementary symmetric polynomials in Stanley–Reisner face ring

Bioinformatics and Classical Literary Study

Flexible online multivariate regression with variational Bayes and the matrix-variate Dirichlet process

Discretizing Malliavin calculus

Age-structured population model of infectious disease spread applied to data on varicella prevalence in Poland

A non-crossing word cooperad for free homotopy probability theory

The maximum of the CUE field

Octahedral, dicyclic and special linear solutions of some unsolved Hamilton-Waterloo problems

Skolem Circles

Stochastic bandits on a social network: Collaborative learning with local information sharing

The valuation of American options in a multidimensional exponential Lévy model

The correlation between fragility, density and atomic interaction in glass-forming liquids

Improved Fréchet–Hoeffding bounds on $d$-copulas and applications in model-free finance

New properties of a certain method of summation of generalized hypergeometric series

Metastability for Glauber dynamics on random graphs

Perturbation bounds and degree of imprecision for uniquely convergent imprecise Markov chains

Range-based argumentation semantics as 2-valued models

On the Partition Dimension and the Twin Number of a Graph

A hierarchy of local decision

$L_2$Boosting in High-Dimensions: Rate of Convergence

Clustering coefficient of random intersection graphs with infinite degree variance

A Complex-Network Perspective on Alexander’s Wholeness

Representation of linguistic form and function in recurrent neural networks

On 132-representable Graphs

Statistical models for dynamics in extreme value processes

Even Trolls Are Useful: Efficient Link Classification in Signed Networks

Scaling limit and ageing for branching random walk in Pareto environment

Simsun permutations, simsun successions and simsun patterns

Beyond CCA: Moment Matching for Multi-View Models

Elliptic hypergeometric summations by Taylor series expansion and interpolation

Algorithms on Ideal over Complex Multiplication order

On Complex Valued Convolutional Neural Networks

One-point localization for branching random walk in Pareto environment

On the Generalised Colouring Numbers of Graphs that Exclude a Fixed Minor

Bayesian estimation of airborne fugitive emissions using a Gaussian plume model

Nanoscale artificial intelligence: creating artificial neural networks using autocatalytic reactions

Personalized and situation-aware multimodal route recommendations: the FAVOUR algorithm

A comparison of two methods for detecting abrupt changes in the variance of climatic time series

Bayesian Variable Selection for Skewed Heteroscedastic Response

Symmetril Moulds, Generic Group Schemes, Resummation of MZVs

A family of centered random walks on weight lattices conditioned to stay in Weyl chambers

Easy Monotonic Policy Iteration

Multi Snapshot Sparse Bayesian Learning for DOA Estimation

Adjusted Empirical Likelihood for Time Series Models

Continuous Analogues for the Binomial Coefficients and the Catalan Numbers

The Capacity of Private Information Retrieval

On the real-rootedness of the Veronese construction for rational formal power series