**Paper**: **Estimating Transfer Entropy via Copula Entropy**

Causal inference is a fundamental problem in statistics and has wide applications in different fields. Transfer Entropy (TE) is an important notion for measuring causality and is essentially conditional Mutual Information (MI). Copula Entropy (CE) is a theory for measuring statistical independence and is equivalent to MI. In this paper, we prove that TE can be represented in terms of CE alone and then propose a non-parametric method for estimating TE via CE. The proposed method was applied to analyze the Beijing PM2.5 data in the experiments. Experimental results show that the proposed method can infer causal relationships from data effectively and hence helps to understand the data better.
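The CE representation of TE can be sketched concretely. Writing $H_c(\cdot)$ for copula entropy, conditional MI decomposes as $TE(X \to Y) = -H_c(Y_{t+1}, X_t, Y_t) + H_c(X_t, Y_t) + H_c(Y_{t+1}, Y_t)$. The sketch below is not the authors' implementation; it estimates each CE term non-parametrically by rank-transforming the data to the empirical copula and applying the Kozachenko-Leonenko k-NN entropy estimator (a common choice, assumed here for illustration):

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def kl_entropy(z, k=3):
    """Kozachenko-Leonenko k-NN differential entropy estimator (nats)."""
    n, d = z.shape
    tree = cKDTree(z)
    # distance to the k-th nearest neighbour, excluding the point itself
    eps = tree.query(z, k=k + 1)[0][:, -1]
    log_cd = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)  # log volume of unit d-ball
    return -digamma(k) + digamma(n) + log_cd + d * np.mean(np.log(eps + 1e-12))

def copula_entropy(z, k=3):
    """CE = entropy of the empirical copula (rank-transformed data)."""
    u = np.argsort(np.argsort(z, axis=0), axis=0) / (len(z) - 1.0)
    return kl_entropy(u, k=k)

def transfer_entropy(x, y, lag=1, k=3):
    """TE(x -> y) = -Hc(y', x, y) + Hc(x, y) + Hc(y', y)."""
    y_next, y_t, x_t = y[lag:], y[:-lag], x[:-lag]
    col = lambda *a: np.column_stack(a)
    return (-copula_entropy(col(y_next, x_t, y_t), k)
            + copula_entropy(col(x_t, y_t), k)
            + copula_entropy(col(y_next, y_t), k))
```

On data where `x` drives `y` but not vice versa, `transfer_entropy(x, y)` should come out clearly larger than `transfer_entropy(y, x)`.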

**Paper**: **Causality and deceit: Do androids watch action movies?**

We seek causes through science, religion, and in everyday life. We get excited when a big rock causes a big splash, and we get scared when it tumbles without a cause. But our causal cognition is usually biased. The ‘why’ is influenced by the ‘who’. It is influenced by the ‘self’, and by ‘others’. We share rituals, we watch action movies, and we influence each other to believe in the same causes. The human mind is packed with subjectivity because shared cognitive biases bring us together. But they also make us vulnerable. An artificial mind is deemed to be more objective than the human mind. After many years of science-fiction fantasies about even-minded androids, they are now sold as personal or expert assistants, as brand advocates, as policy or candidate supporters, as network influencers. Artificial agents have been stunningly successful in disseminating artificial causal beliefs among humans. As malicious artificial agents continue to manipulate human cognitive biases, and deceive human communities into ostensive but expansive causal illusions, the hope for defending us has been vested in developing benevolent artificial agents, tasked with preventing and mitigating the cognitive distortions inflicted upon us by their malicious cousins. Can the distortions of human causal cognition be corrected on a more solid foundation of artificial causal cognition? In the present paper, we study a simple model of causal cognition, viewed as a quest for causal models. We show that, under very mild and hard-to-avoid assumptions, there are always self-confirming causal models, which perpetrate self-deception and seem to preclude a royal road to objectivity.

**Paper**: **Regret Analysis of Causal Bandit Problems**

We study how to learn optimal interventions sequentially given causal information represented as a causal graph along with associated conditional distributions. Causal modeling is useful in real-world problems like online advertisement, where complex causal mechanisms underlie the relationship between interventions and outcomes. We propose two algorithms, causal upper confidence bound (C-UCB) and causal Thompson Sampling (C-TS), that enjoy improved cumulative regret bounds compared with algorithms that do not use causal information. We thus resolve an open problem posed by~\cite{lattimore2016causal}. Further, we extend C-UCB and C-TS to the linear bandit setting and propose causal linear UCB (CL-UCB) and causal linear TS (CL-TS) algorithms. These algorithms enjoy a cumulative regret bound that scales only with the feature dimension. Our experiments show the benefit of using causal information. For example, we observe that even with a few hundred iterations, the regret of the causal algorithms is less than that of the standard algorithms by a factor of three. We also show that under certain causal structures, our algorithms scale better than the standard bandit algorithms as the number of interventions increases.
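The core idea behind C-UCB can be illustrated in a toy setting. The sketch below is not the authors' implementation; it assumes (for illustration) that the reward has a single binary parent `Z` in the causal graph and that the interventional distributions `P(Z=1 | do(a))` are known. Every pull then informs *every* arm, because confidence bounds are maintained over `E[reward | Z=z]` rather than per arm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Known interventional distributions P(Z=1 | do(a)) -- illustrative values.
p_z_given_do = np.array([0.1, 0.5, 0.9])
true_mu = np.array([0.2, 0.8])  # E[reward | Z=z], unknown to the learner

def c_ucb(T=2000):
    """Toy causal UCB: pool reward observations by the value of the parent Z,
    then score each intervention through the known distribution P(Z | do(a))."""
    n = np.zeros(2)                   # visit counts for each value of Z
    s = np.zeros(2)                   # reward sums for each value of Z
    pulls = np.zeros(3, dtype=int)    # how often each arm was chosen
    for t in range(1, T + 1):
        mu_hat = s / np.maximum(n, 1)
        bonus = np.sqrt(2 * np.log(t) / np.maximum(n, 1))
        ucb_z = np.minimum(mu_hat + bonus, 1.0)   # optimistic estimate of mu(z)
        # expected optimistic reward of each intervention
        scores = (1 - p_z_given_do) * ucb_z[0] + p_z_given_do * ucb_z[1]
        a = int(np.argmax(scores))
        z = int(rng.random() < p_z_given_do[a])   # sample Z under do(a)
        r = int(rng.random() < true_mu[z])        # Bernoulli reward given Z
        n[z] += 1
        s[z] += r
        pulls[a] += 1
    return pulls
```

Because the confidence intervals shrink with the total number of observations of each `Z` value rather than per-arm pull counts, the regret depends on the number of `Z` values instead of the number of interventions, which is where the improvement over standard UCB comes from.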

**Paper**: **Causal Games and Causal Nash Equilibrium**

Classical results of Decision Theory, and of its extension to the multi-agent setting, Game Theory, operate only at the associative level of information; that is, classical decision makers take into account only the probabilities of events. We go one step further and consider causal information: in this work, we define Causal Decision Problems and extend them to a multi-agent decision problem, which we call a causal game. For such games, we study belief updating in a class of strategic games in which any player's action causes some consequence via a causal model that is unknown to all players; for this reason, the most suitable framework is Harsanyi's Bayesian Game. We propose a probability updating for the Bayesian Game that takes into account each player's knowledge, in terms of probabilistic beliefs about the causal model and about what is caused by her actions as well as by the actions of every other player. Based on this probability updating, we define a Nash equilibrium for Causal Games.

**Paper**: **The Case for Evaluating Causal Models Using Interventional Measures and Empirical Data**

Causal inference is central to many areas of artificial intelligence, including complex reasoning, planning, knowledge-base construction, robotics, explanation, and fairness. An active community of researchers develops and enhances algorithms that learn causal models from data, and this work has produced a series of impressive technical advances. However, evaluation techniques for causal modeling algorithms have remained somewhat primitive, limiting what we can learn from experimental studies of algorithm performance, constraining the types of algorithms and model representations that researchers consider, and creating a gap between theory and practice. We argue for more frequent use of evaluation techniques that examine interventional measures rather than structural or observational measures, and that evaluate those measures on empirical data rather than synthetic data. We survey current evaluation practice and show that such techniques are rarely used. We show that these techniques are feasible and that data sets are available to conduct such evaluations. We also show that these techniques produce substantially different results from those obtained using structural measures and synthetic data.

**Paper**: **Comment on ‘Blessings of Multiple Causes’**

The premise of the deconfounder method proposed in ‘Blessings of Multiple Causes’ by Wang and Blei, namely that a variable that renders multiple causes conditionally independent also controls for unmeasured multi-cause confounding, is incorrect. This can be seen by noting that no fact about the observed data alone can be informative about ignorability, since ignorability is compatible with any observed data distribution. Methods to control for unmeasured confounding may be valid with additional assumptions in specific settings, but they cannot, in general, provide a checkable approach to causal inference, and they do not, in general, require weaker assumptions than the assumptions that are commonly used for causal inference. While this is outside the scope of this comment, we note that much recent work on applying ideas from latent variable modeling to causal inference problems suffers from similar issues.

**Paper**: **Interventional Experiment Design for Causal Structure Learning**

It is known that from purely observational data, a causal DAG is identifiable only up to its Markov equivalence class, and for many ground-truth DAGs, the direction of a large portion of the edges will remain unidentified. The gold standard for learning the causal DAG beyond Markov equivalence is to perform a sequence of interventions in the system and use the data gathered from the interventional distributions. We consider a setup in which, given a budget $k$, we design $k$ interventions non-adaptively. We cast the problem of finding the best intervention target set as an optimization problem which aims to maximize the number of edges whose directions are identified due to the performed interventions. First, we consider the case that the underlying causal structure is a tree. For this case, we propose an efficient exact algorithm for the worst-case gain setup, as well as an approximate algorithm for the average gain setup. We then show that the proposed approach for the average gain setup can be extended to the case of general causal structures. In this case, besides the design of interventions, calculating the objective function is also challenging. We propose an efficient exact calculator as well as two estimators for this task. We evaluate the proposed methods using synthetic as well as real data.
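The average-gain objective can be made concrete by brute force on a tiny example. The sketch below is not the authors' algorithm; it assumes (for illustration) a star-shaped tree skeleton, a uniform prior over its orientations, and counts an edge as identified when its direction is the same in every DAG consistent with both the ground truth's Markov equivalence class and the directions of the edges cut by the intervention:

```python
import itertools

import numpy as np

# A small tree skeleton: a star centred at node 1 (illustrative example).
skeleton = [(0, 1), (1, 2), (1, 3)]
n_nodes = 4

def orientations():
    """Every orientation of a tree skeleton is acyclic, hence a DAG."""
    for dirs in itertools.product([False, True], repeat=len(skeleton)):
        yield tuple((v, u) if d else (u, v) for (u, v), d in zip(skeleton, dirs))

def colliders(dag):
    """Unshielded colliders a -> b <- c (in a tree, every triple is unshielded)."""
    pa = {v: sorted(u for (u, w) in dag if w == v) for v in range(n_nodes)}
    return frozenset((b, a, c) for b in range(n_nodes)
                     for a, c in itertools.combinations(pa[b], 2))

def direction(dag, edge):
    return edge if edge in dag else (edge[1], edge[0])

def identified_edges(truth, targets, dags):
    """Edges whose direction is pinned down after observing (i) the Markov
    equivalence class and (ii) the directions of edges cut by interventions."""
    mec = [d for d in dags if colliders(d) == colliders(truth)]
    cut = [e for e in skeleton if e[0] in targets or e[1] in targets]
    consistent = [d for d in mec
                  if all(direction(d, e) == direction(truth, e) for e in cut)]
    return sum(len({direction(d, e) for d in consistent}) == 1 for e in skeleton)

# Average-gain objective for single-node interventions, under a uniform prior
# over the orientations of the skeleton.
all_dags = list(orientations())
avg_gain = [np.mean([identified_edges(t, {v}, all_dags) for t in all_dags])
            for v in range(n_nodes)]
best_target = int(np.argmax(avg_gain))
```

On this star, intervening on the centre node cuts all three edges and always orients the whole tree, so the brute-force objective picks node 1; the paper's contribution is computing and optimizing this kind of objective efficiently instead of by enumeration.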

**Paper**: **Causal Mechanism Transfer Network for Time Series Domain Adaptation in Mechanical Systems**

Data-driven models are becoming essential parts of modern mechanical systems, commonly used to capture the behavior of various equipment and varying environmental characteristics. Despite their excellent adaptivity to highly dynamic and aging equipment, these data-driven models are usually hungry for massive labels over historical data, mostly contributed by human engineers at extremely high cost. This label demand is now the major factor limiting modeling accuracy, hindering the fulfillment of envisioned applications. Fortunately, domain adaptation enhances model generalization by utilizing labelled source data as well as unlabelled target data, so that a model can be reused across domains. However, mainstream domain adaptation methods cannot achieve ideal performance on time series data, because most of them focus on static samples, and even existing time series domain adaptation methods ignore properties of time series data such as the temporal causal mechanism. In this paper, we assume that the causal mechanism is invariant and present our Causal Mechanism Transfer Network (CMTN) for time series domain adaptation. By capturing and transferring the dynamic and temporal causal mechanism of multivariate time series data, and by alleviating the time lags and differing value ranges among machines, CMTN allows data-driven models to exploit existing data and labels from similar systems, so that the resulting model on a new system is highly reliable even with very limited data. We report empirical results and lessons learned from two real-world case studies, on chiller plant energy optimization and boiler fault detection, in which CMTN outperforms existing state-of-the-art methods.