This paper frames causal structure estimation as a machine learning task. The idea is to treat indicators of causal relationships between variables as "labels" and to exploit available data on the variables of interest to provide features for the labelling task. Background scientific knowledge or any available interventional data provide labels on some causal relationships, and the remainder are treated as unlabelled. To illustrate the key ideas, we develop a distance-based approach (based on bivariate histograms) within a manifold regularization framework. We present empirical results on three different biological data sets (including examples where causal effects can be verified by experimental intervention) that together demonstrate the efficacy and general nature of the approach as well as its simplicity from a user’s point of view.
It is a fundamental precept of System Dynamics that structure leads to behavior. Clearly relating the two is one of the roadblocks to the widespread use of feedback models, as it normally depends on substantial experimentation or on the application of specialized analytic techniques that are not easily approachable by most model builders. LoopX is a tool that builds understanding of structure as it determines behavior by rendering and highlighting the structure responsible for behavior as the behavior unfolds. The tool builds on the Loops that Matter (Schoenberg 2019) approach to analyzing loop dominance by presenting the outcome of applying that theory in an easy-to-use, interactive, web-based piece of software. This is a significant step forward in addressing the challenges of automatically visualizing model behavior and linking it to generative structures identified in Sterman (2000). LoopX can be used to machine-generate high-quality causal loop diagrams from model equations at different levels of detail, based on the dynamic importance of links and variables, and to animate them based on their importance from a loop dominance perspective. Several examples are provided that demonstrate the comprehensiveness and ease of use of the tool, important attributes supporting its broad uptake.
To enhance the expressiveness and representational capacity of recurrent neural networks (RNN), a large body of work has emerged exploring stacked architectures with additional topological modifications like shortcut connections or bidirectionality. However, choosing the best network for a particular problem requires a combinatorial search over architectures and their hyperparameters. In this work, we show that a single-layer RNN can perfectly mimic an arbitrarily deep stacked RNN under specific constraints on its weight matrix and a delay between input and output. This obviates the need to manually select hyperparameters like the number of layers. Additionally, we show that weakening weight constraints while keeping the delay gives rise to partial acausality in the single-layer RNN, much like a bidirectional network. Synthetic experiments confirm that the delayed RNN can mimic bidirectional networks in perfectly solving some acausal tasks, outperforming them in others. Finally, we show that in a challenging language processing task, the delayed RNN performs within 0.3\% of the accuracy of the bidirectional network while reducing computational costs.
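The equivalence claimed above can be illustrated for the two-layer case. The following is a minimal sketch, not the authors' implementation: it assumes Elman-style tanh layers without biases, and the function names and the zero-padding convention are illustrative choices. The single-layer network uses a block lower-triangular recurrent matrix, and its output lags the stacked network by one step.

```python
import numpy as np

def stacked_rnn(xs, W1, U1, W2, U2):
    """Two-layer tanh RNN without biases; returns the top-layer state at each step."""
    h1 = np.zeros(U1.shape[0])
    h2 = np.zeros(U2.shape[0])
    outputs = []
    for x in xs:
        h1 = np.tanh(W1 @ x + U1 @ h1)
        h2 = np.tanh(W2 @ h1 + U2 @ h2)
        outputs.append(h2)
    return np.array(outputs)

def delayed_single_layer_rnn(xs, W1, U1, W2, U2):
    """Single-layer RNN with a constrained weight matrix and an output delay of one step."""
    n1, n2 = U1.shape[0], U2.shape[0]
    # Constrained recurrent matrix: the second block reads the first block's
    # previous state (W2), which plays the role of the inter-layer connection.
    U = np.block([[U1, np.zeros((n1, n2))],
                  [W2, U2]])
    # Only the first block sees the input.
    W = np.vstack([W1, np.zeros((n2, W1.shape[1]))])
    # One extra zero input lets the delayed output for the last step be read out.
    padded = np.vstack([xs, np.zeros((1, xs.shape[1]))])
    s = np.zeros(n1 + n2)
    states = []
    for x in padded:
        s = np.tanh(W @ x + U @ s)
        states.append(s[n1:])
    # Dropping the first state applies the one-step delay.
    return np.array(states[1:])
```

Under these assumptions the two functions return the same sequence up to floating-point error. A deeper stack would need one additional block and one additional step of delay per extra layer.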
Algorithmic risk assessments are increasingly used to help humans make decisions in high-stakes settings, such as medicine, criminal justice and education. In each of these cases, the purpose of the risk assessment tool is to inform actions, such as medical treatments or release conditions, often with the aim of reducing the likelihood of an adverse event such as hospital readmission or recidivism. Problematically, most tools are trained and evaluated on historical data in which the outcomes observed depend on the historical decision-making policy. These tools thus reflect risk under the historical policy, rather than under the different decision options that the tool is intended to inform. Even when tools are constructed to predict risk under a specific decision, they are often improperly evaluated as predictors of the target outcome. Focusing on the evaluation task, in this paper we define counterfactual analogues of common predictive performance and algorithmic fairness metrics that we argue are better suited for the decision-making context. We introduce a new method for estimating the proposed metrics using doubly robust estimation. We provide theoretical results showing that fairness according to the standard metric and the counterfactual metric can hold simultaneously only under strong conditions. Consequently, fairness-promoting methods that target parity in a standard fairness metric may (and, as we show empirically, do) induce greater imbalance in the counterfactual analogue. We provide empirical comparisons on both synthetic data and a real-world child welfare dataset to demonstrate how the proposed method improves upon standard practice.
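The gap between evaluating on observed outcomes and evaluating on outcomes under a specific decision can be sketched with a generic doubly robust (augmented inverse-probability-weighted) estimator. This is an illustration in the spirit of the abstract, not the authors' estimator: the simulated data, the choice of squared error as the metric, the assumption that treatment assignment is ignorable given the covariate, and all function names are assumptions made here. The propensity is taken as known, and the outcome model is deliberately crude to show that the estimate remains close when only one of the two nuisance models is correct.

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def simulate(n, seed=0):
    """Hypothetical data: higher-risk units are treated more often, treatment halves risk."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    p0 = logistic(x)                      # risk if untreated
    propensity = 0.2 + 0.6 * p0           # P(A = 1 | x), bounded away from 0 and 1
    a = rng.random(n) < propensity
    y0 = rng.random(n) < p0               # potential outcome without treatment
    y1 = rng.random(n) < 0.5 * p0         # potential outcome with treatment
    y_obs = np.where(a, y1, y0).astype(float)
    return a.astype(float), y_obs, y0.astype(float), propensity, p0

def dr_counterfactual_mse(score, a, y_obs, propensity, loss_model):
    """Doubly robust estimate of E[(Y^0 - score)^2] from observed data.

    loss_model is a (possibly misspecified) guess at E[(Y - score)^2 | X, A = 0].
    """
    loss = (y_obs - score) ** 2
    weight = (1.0 - a) / (1.0 - propensity)   # zero for treated units
    return np.mean(weight * (loss - loss_model) + loss_model)

a, y_obs, y0, propensity, p0 = simulate(200_000)
oracle = np.mean((y0 - p0) ** 2)        # uses potential outcomes, unavailable in practice
naive = np.mean((y_obs - p0) ** 2)      # evaluates against outcomes under the historical policy
dr = dr_counterfactual_mse(p0, a, y_obs, propensity, np.full_like(p0, 0.25))
print(f"oracle={oracle:.3f} naive={naive:.3f} doubly_robust={dr:.3f}")
```

In practice both nuisance functions would be estimated from data, typically with sample splitting, rather than supplied as above.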
A trend across most areas where simulation-driven development is used is the ever-increasing size and complexity of the systems under consideration, pushing established methods of modeling and simulation towards their limits. This paper complements existing surveys on large-scale modeling and simulation of physical systems by conducting expert surveys. We conducted a two-stage empirical survey in order to investigate research needs, current challenges, and promising modeling and simulation paradigms. Furthermore, we applied the analytic hierarchy process method to prioritise the strengths and weaknesses of different modeling paradigms. The results of this study show that experts consider acausal modeling techniques to be suitable for modeling large-scale systems, while causal techniques are considered less suitable.
Reasoning based on causality, rather than association, has been considered a key ingredient of real machine intelligence. However, inferring the causal relationship or structure among variables is a challenging task. In recent years, an Independent Mechanism (IM) principle was proposed, stating that the mechanism generating the cause and the one mapping the cause to the effect are independent. As a conjecture, it is argued that in the causal direction, the conditional distributions instantiated at different values of the conditioning variable have less variation than in the anti-causal direction. Existing state-of-the-art methods simply compare the variance of the RKHS mean embedding norms of these conditional distributions. In this paper, we prove that this norm-based approach sacrifices important information about the original conditional distributions. We propose a Kernel Intrinsic Invariance Measure (KIIM) to capture higher-order statistics corresponding to the shapes of the density functions. We show that our algorithm can be reduced to an eigen-decomposition task on a kernel matrix measuring intrinsic deviance/invariance. Causal directions can then be inferred by comparing the KIIM scores of the two hypothetical directions. Experiments on synthetic and real data show the advantages of our methods over existing solutions.
Identifying directed interactions between species from time series of their population densities has many uses in ecology. This key statistical task is equivalent to causal time series inference, which connects to the Granger causality (GC) concept: $x$ causes $y$ if $x$ improves the prediction of $y$ in a dynamic model. However, the entangled nature of nonlinear ecological systems has led some to question the appropriateness of Granger causality, especially in its classical linear Multivariate AutoRegressive (MAR) model form. Convergent cross mapping (CCM), developed for deterministic dynamical systems, has been suggested as an alternative, although it is less grounded in statistical theory. Here, we show that linear GC and CCM are able to uncover interactions with surprisingly similar performance for predator-prey cycles, 2-species deterministic (chaotic) or stochastic competition, as well as 10- and 20-species interaction networks. There is no correspondence between the degree of nonlinearity of the dynamics and which method performs best. Our results therefore imply that Granger causality, even in its linear MAR($p$) formulation, is a valid method for inferring interactions in nonlinear ecological networks; the choice of GC or CCM (or both) can instead be based on the aims and specifics of the analysis.
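The Granger causality definition above, that $x$ causes $y$ if $x$ improves the prediction of $y$, can be sketched for the linear autoregressive case as a comparison of two least-squares fits. This is a minimal illustration rather than the analysis used in the paper: the simulated two-variable system, its coefficients, and the function names are assumptions, and no significance test is included.

```python
import numpy as np

def lagged_design(series_list, p):
    """Design matrix with an intercept and lags 1..p of each series."""
    n = len(series_list[0])
    cols = [np.ones(n - p)]
    for s in series_list:
        for lag in range(1, p + 1):
            cols.append(s[p - lag:n - lag])   # value at time t - lag for t = p..n-1
    return np.column_stack(cols)

def granger_strength(x, y, p=1):
    """Log ratio of residual variances for predicting y: own lags only vs own lags plus lags of x."""
    target = y[p:]

    def residual_variance(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        return np.mean((target - design @ coef) ** 2)

    restricted = residual_variance(lagged_design([y], p))
    full = residual_variance(lagged_design([y, x], p))
    return np.log(restricted / full)

def simulate_pair(n=500, seed=0):
    """Hypothetical system in which x drives y and y has no effect on x."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    y = np.zeros(n)
    for t in range(1, n):
        x[t] = 0.5 * x[t - 1] + rng.normal()
        y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()
    return x, y

x, y = simulate_pair()
print("x -> y:", granger_strength(x, y))
print("y -> x:", granger_strength(y, x))
```

A value near zero means the extra lags add little predictive power; a clearly positive value suggests a directed interaction in the Granger sense.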
Chernozhukov et al. (2018) proposed the sorted effect method for nonlinear regression models. This method consists of reporting percentiles of the causal effects, in addition to the average that is commonly used, to summarize the heterogeneity in the causal effects. They also proposed using the sorted effects to carry out classification analysis, in which the observational units are classified as most or least affected if their causal effects are above or below some tail sorted effects. The SortedEffects package implements the estimation and inference methods therein and provides tools to visualize the results. This vignette serves as an introduction to the package and displays the basic functionality of its functions.
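The idea of reporting percentiles of unit-level effects and classifying units by tail effects can be sketched as follows. This is not the SortedEffects package API (the package is written in R) and it omits the inference step entirely: the logit model, the coefficient values, the binary treatment, and the function names are all assumptions made for illustration.

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def partial_effects(X, beta, beta_d):
    """Effect of switching a binary treatment on P(Y = 1) for each unit in a logit model."""
    base = X @ beta
    return logistic(base + beta_d) - logistic(base)

def sorted_effects(effects, percentiles):
    """Percentiles of the unit-level effects."""
    return np.percentile(effects, percentiles)

def classify(effects, tail=10):
    """Boolean masks for the least and most affected units, using tail percentiles."""
    low, high = np.percentile(effects, [tail, 100 - tail])
    return effects <= low, effects >= high

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(1000), rng.normal(size=1000)])   # intercept and one covariate
effects = partial_effects(X, beta=np.array([-1.0, 1.5]), beta_d=0.8)
least, most = classify(effects)
print("average effect:", effects.mean())
print("10th, 50th, 90th percentiles:", sorted_effects(effects, [10, 50, 90]))
print("units in each group:", least.sum(), most.sum())
```

The average alone hides how widely the effect varies across units in a nonlinear model, which is what the percentiles and the two groups make visible.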