**Paper**: **Tutorial and Survey on Probabilistic Graphical Model and Variational Inference in Deep Reinforcement Learning**

Probabilistic Graphical Modeling and Variational Inference play an important role in recent advances in Deep Reinforcement Learning. Aiming at a self-consistent tutorial survey, this article illustrates basic concepts of reinforcement learning with Probabilistic Graphical Models and recaps the derivation of some basic formulas. Recent advances in deep reinforcement learning along different research directions are reviewed and compared from various aspects. We offer Probabilistic Graphical Models together with detailed explanations and derivations for several use cases of Variational Inference, which serve as complementary material on top of the original contributions.
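
For orientation, most variational inference derivations start from the decomposition of the log evidence shown here. The notation is generic and an assumption of this note, not taken from the survey: $x$ is observed, $z$ is latent, and $q(z)$ is any variational distribution.

$$
\log p(x) = \underbrace{\mathbb{E}_{q(z)}\big[\log p(x, z) - \log q(z)\big]}_{\text{ELBO}} + \mathrm{KL}\big(q(z)\,\|\,p(z \mid x)\big) \;\ge\; \text{ELBO}
$$

Because the KL term is non-negative, maximizing the ELBO over $q$ tightens a lower bound on $\log p(x)$ and simultaneously pulls $q(z)$ toward the posterior $p(z \mid x)$.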

**Paper**: **Machine learning algorithms to infer trait matching and predict species interactions in ecological networks**

Ecologists have long suspected that species are more likely to interact if their traits match in a particular way. For example, a pollination interaction may be particularly likely if the proportions of a bee’s tongue match flower shape in a beneficial way. Empirical evidence for trait matching, however, varies significantly in strength among different types of ecological networks. Here, we show that ambiguity among empirical trait matching studies may have arisen at least in part from using overly simple statistical models. Using simulated and real data, we contrast conventional regression models with Machine Learning (ML) models (Random Forest, Boosted Regression Trees, Deep Neural Networks, Convolutional Neural Networks, Support Vector Machines, naive Bayes, and k-Nearest-Neighbor), testing their ability to predict species interactions based on traits, and infer trait combinations causally responsible for species interactions. We find that the best ML models can successfully predict species interactions in plant-pollinator networks (up to 0.93 AUC) and outperform conventional regression models. Our results also demonstrate that ML models can better identify the causally responsible trait matching combinations than GLMs. In two case studies, the best ML models could successfully predict species interactions in a global plant-pollinator database and infer ecologically plausible trait matching rules for a plant-hummingbird network from Costa Rica, without any prior assumptions about the system. We conclude that flexible ML models offer many advantages over traditional regression models for understanding interaction networks. We anticipate that these results extrapolate to other network types, such as trophic or competitive networks. More generally, our results highlight the potential of ML and artificial intelligence for inference beyond standard tasks such as pattern recognition.
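
To make the contrast in this abstract concrete, here is a minimal sketch (not the authors' code or data) of why a regression with main effects only can miss trait matching while a flexible learner picks it up. Everything in it is an assumption made for illustration: the simulated traits, the Gaussian matching rule, the 0.1 matching width, and the choice of a hand-written k-nearest-neighbor scorer as the flexible model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulated traits: one trait per partner (e.g. corolla depth, tongue length).
n = 2000
plant = rng.uniform(0.0, 1.0, n)
pollinator = rng.uniform(0.0, 1.0, n)
# Assumed matching rule: interaction probability peaks when the two traits are equal.
p_interact = np.exp(-((plant - pollinator) ** 2) / (2 * 0.1 ** 2))
y = (rng.uniform(size=n) < p_interact).astype(float)

X = np.column_stack([plant, pollinator])
X_train, X_test = X[:1500], X[1500:]
y_train, y_test = y[:1500], y[1500:]


def auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float(np.mean(diff > 0) + 0.5 * np.mean(diff == 0))


def fit_logistic(X, y, n_iter=25):
    """Main-effects logistic regression (no interaction terms), fitted by Newton's method."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        hessian = Xb.T @ (Xb * (p * (1 - p))[:, None]) + 1e-8 * np.eye(len(w))
        w += np.linalg.solve(hessian, Xb.T @ (y - p))
    return w


def knn_scores(X_train, y_train, X_query, k=15):
    """Fraction of interacting pairs among the k nearest training pairs in trait space."""
    dist = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)


w = fit_logistic(X_train, y_train)
glm_scores = np.column_stack([np.ones(len(X_test)), X_test]) @ w
auc_glm = auc(glm_scores, y_test)
auc_knn = auc(knn_scores(X_train, y_train, X_test), y_test)
print(f"main-effects GLM AUC: {auc_glm:.2f}, kNN AUC: {auc_knn:.2f}")
```

Under this assumed rule the interaction probability depends on the difference between the traits rather than on either trait alone, so a linear score in the two traits has little to rank on. A GLM given an explicit matching term (for example the squared trait difference) would do well here, which is the sense in which the flexible model needs fewer prior assumptions.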

**Paper**: **Causally interpretable multi-step time series forecasting: A new machine learning approach using simulated differential equations**

This work presents a new approach that generates and then analyzes a highly nonlinear, complex system of differential equations to produce interpretable time series forecasts at a high level of accuracy. This approach provides insight into the mechanisms responsible for generating past and future behavior. Core to this method is the construction of a highly nonlinear, complex system of differential equations that is then analyzed to determine the origins of behavior. This paper demonstrates the technique on Mass and Senge’s two-state Inventory Workforce model (1975) and then explores its application to the real-world problem of organogenesis in mice. The organogenesis application consists of a fourteen-state system in which the generated set of equations reproduces observed behavior with a high level of accuracy (r^2 = 0.880) and, when analyzed, produces an interpretable and causally plausible explanation for the observed behavior.

**Paper**: **Multimodal Neuroimaging Data Integration and Pathway Analysis**

With fast advancements in technologies, the collection of multiple types of measurements on a common set of subjects is becoming routine in science. Some notable examples include multimodal neuroimaging studies for the simultaneous investigation of brain structure and function, and multi-omics studies for combining genetic and genomic information. Integrative analysis of multimodal data allows scientists to interrogate new mechanistic questions. However, the data collection and generation of integrative hypotheses is outpacing available methodology for joint analysis of multimodal measurements. In this article, we study high-dimensional multimodal data integration in the context of mediation analysis. We aim to understand the roles different data modalities play as possible mediators in the pathway between an exposure variable and an outcome. We propose a mediation model framework with two data types serving as separate sets of mediators, and develop a penalized optimization approach for parameter estimation. We study both the theoretical properties of the estimator through an asymptotic analysis, and its finite-sample performance through simulations. We illustrate our method with a multimodal brain pathway analysis having both structural and functional connectivities as mediators in the association between sex and language processing.
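
As a point of reference for the mediation setting described above, the following is a minimal sketch of the classical single-mediator, product-of-coefficients decomposition fitted by ordinary least squares. It is not the paper's penalized high-dimensional estimator. The simulated data, the path strengths (0.8, 0.5, 0.3), and the helper name `ols` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Hypothetical data-generating model: exposure -> mediator -> outcome, plus a direct path.
exposure = rng.integers(0, 2, n).astype(float)                   # binary exposure
mediator = 0.8 * exposure + rng.normal(size=n)                   # assumed path a = 0.8
outcome = 0.5 * mediator + 0.3 * exposure + rng.normal(size=n)   # assumed b = 0.5, direct = 0.3


def ols(columns, target):
    """Least-squares coefficients with an intercept prepended (illustrative helper)."""
    design = np.column_stack([np.ones(len(target))] + columns)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


a = ols([exposure], mediator)[1]                    # exposure -> mediator
_, direct, b = ols([exposure, mediator], outcome)   # direct effect and mediator -> outcome
total = ols([exposure], outcome)[1]                 # total effect of exposure on outcome
indirect = a * b                                    # effect transmitted through the mediator

print(f"total={total:.3f}  direct={direct:.3f}  indirect={indirect:.3f}")
```

For linear models fitted by least squares with intercepts, the total effect equals the direct effect plus the indirect effect exactly in sample. With many candidate mediators from two modalities, as in the abstract, the analogous path coefficients become high-dimensional, which is where a penalized estimator comes in.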

**Paper**: **A New Proposal of Applications of Statistical Depth Functions in Causal Analysis of Socio-Economic Phenomena Based on Official Statistics — A Study of EU Agricultural Subsidies and Digital Developement in Poland**

Convincing causal statistical inference about socio-economic phenomena is an especially desirable basis for conducting socio-economic programs or government interventions. Unfortunately, real socio-economic issues quite often do not fulfill the restrictive assumptions of the causal analysis procedures proposed in the literature. This paper indicates certain empirical challenges and conceptual opportunities related to applying the data depth concept to causal inference about socio-economic phenomena. We show how to apply statistical functional depths to indicate the factual and counterfactual distributions commonly used in causal inference procedures. The presented framework is especially useful for causal inference based on official statistics, i.e., on already existing databases. Methodological considerations related to extremal depth, modified band depth, Fraiman-Muniz depth, and the multivariate Wilcoxon rank sum statistic are illustrated with an example: a study of the impact of EU direct agricultural subsidies on digital development in Poland in the period 2012-2019.

**Paper**: **Physics-Informed Machine Learning Models for Predicting the Progress of Reactive-Mixing**

This paper presents a physics-informed machine learning (ML) framework to construct reduced-order models (ROMs) for reactive-transport quantities of interest (QoIs) based on high-fidelity numerical simulations. QoIs include species decay, product yield, and degree of mixing. The ROMs for QoIs are applied to quantify and understand how the chemical species evolve over time. First, high-resolution datasets for constructing ROMs are generated by solving anisotropic reaction-diffusion equations using a non-negative finite element formulation for different input parameters. The non-negative finite element formulation ensures that the species concentration is non-negative (which is needed for computing QoIs) on coarse computational grids even under high anisotropy. The reactive-mixing model input parameters are a time-scale associated with flipping of velocity, a spatial-scale controlling small/large vortex structures of velocity, a perturbation parameter of the vortex-based velocity, anisotropic dispersion strength/contrast, and molecular diffusion. Second, random forests, F-test, and mutual information criterion are used to evaluate the importance of model inputs/features with respect to QoIs. Third, Support Vector Machines (SVM) and Support Vector Regression (SVR) are used to construct ROMs based on the model inputs. Then, SVR-ROMs are used to predict scaling of QoIs. Qualitatively, SVR-ROMs are able to describe the trends observed in the scaling law associated with QoIs. Fourth, the dependence of the scaling law’s exponent on model inputs/features is evaluated using $k$-means clustering. Finally, in terms of the computational cost, the proposed SVM-ROMs and SVR-ROMs are $\mathcal{O}(10^7)$ times faster than running a high-fidelity numerical simulation for evaluating QoIs.

**Paper**: **Sparse, Low-bias, and Scalable Estimation of High Dimensional Vector Autoregressive Models via Union of Intersections**

Vector autoregressive (VAR) models are widely used for causal discovery and forecasting in multivariate time series analyses in fields as diverse as neuroscience, environmental science, and econometrics. In the high-dimensional setting, model parameters are typically estimated by L1-regularized maximum likelihood; yet, when applied to VAR models, this technique produces a sizable trade-off between sparsity and bias with the choice of the regularization hyperparameter, and thus between causal discovery and prediction. That is, low-bias estimation entails dense parameter selection, and sparse selection entails increased bias; the former is useful in forecasting but less likely to yield scientific insight leading to discovery of causal influences, and conversely for the latter. This paper presents a scalable algorithm for simultaneous low-bias and low-variance estimation (hence good prediction) with sparse selection for high-dimensional VAR models. The method leverages the recently developed Union of Intersections (UoI) algorithmic framework for flexible, modular, and scalable feature selection and estimation that allows control of false discovery and false omission in feature selection while maintaining low bias and low variance. This paper demonstrates the superior performance of the UoI-VAR algorithm compared with other methods in simulation studies, exhibits its application in data analysis, and illustrates its good algorithmic scalability in multi-node distributed memory implementations.
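
The sparsity-versus-bias trade-off that motivates this paper can be seen in a small experiment. The sketch below fits an L1-regularized VAR(1) with a hand-written proximal gradient (ISTA) solver at a weak and a strong penalty. It illustrates the baseline problem only and is not the UoI-VAR algorithm. The simulated coefficient matrix, the two penalty values, and the function name `lasso_var` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical sparse VAR(1): x_t = A x_{t-1} + noise, with 6 nonzero coefficients.
p, T = 5, 500
A_true = 0.5 * np.eye(p)
A_true[0, 1] = 0.3
x = np.zeros((T + 1, p))
for t in range(T):
    x[t + 1] = A_true @ x[t] + rng.normal(size=p)

X_lag, Y = x[:-1], x[1:]   # row-wise model: Y = X_lag @ A.T + noise


def lasso_var(X_lag, Y, lam, n_iter=2000):
    """Minimise (1/2T)||Y - X_lag B||_F^2 + lam * ||B||_1 by proximal gradient (ISTA)."""
    n_obs = len(X_lag)
    gram = X_lag.T @ X_lag / n_obs
    step = 1.0 / np.linalg.eigvalsh(gram).max()    # 1 / Lipschitz constant of the gradient
    B = np.zeros((X_lag.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        grad = gram @ B - X_lag.T @ Y / n_obs
        Z = B - step * grad
        B = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0.0)   # soft-thresholding
    return B.T                                     # estimate of A


support = A_true != 0
results = {}
for lam in (0.001, 0.2):
    A_hat = lasso_var(X_lag, Y, lam)
    results[lam] = {
        "nonzeros": int(np.count_nonzero(A_hat)),
        "support_bias": float(np.mean(np.abs(A_hat[support] - A_true[support]))),
    }
    print(lam, results[lam])
```

The weak penalty keeps nearly every coefficient and estimates the true ones with little shrinkage. The strong penalty recovers a sparse pattern but shrinks the true coefficients toward zero. A single penalty value therefore has to compromise between selection and estimation, which is the tension the abstract says UoI is designed to relieve.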

**Article**: **My bet on causal reinforcement learning**

Last week, I started preparations to teach a few data science grad students on a special topic – causal modeling in reinforcement learning. Upon reflection on this topic, I’m making a bet: causal reinforcement learning will be the AI killer marketing app within the next ten years.