Paper: Robust contrastive learning and nonlinear ICA in the presence of outliers

Nonlinear independent component analysis (ICA) is a general framework for unsupervised representation learning that aims to recover the latent variables in data. Recent practical methods perform nonlinear ICA by solving a series of classification problems based on logistic regression. However, logistic regression is well known to be vulnerable to outliers, so its performance can be severely degraded by them. In this paper, we first theoretically analyze nonlinear ICA models in the presence of outliers. Our analysis implies that estimation in nonlinear ICA can be seriously hampered when outliers lie on the tails of the (noncontaminated) target density, which is a typical pattern of contamination. We then develop two robust nonlinear ICA methods based on the γ-divergence, a robust alternative to the KL-divergence in logistic regression. The proposed methods are shown to have desirable robustness properties in the context of nonlinear ICA. We also experimentally demonstrate that they are highly robust and outperform existing methods in the presence of outliers. Finally, the proposed method is applied to ICA-based causal discovery and shown to find a plausible causal relationship on fMRI data.
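To make the γ-divergence idea concrete, here is a minimal sketch of the γ-cross entropy for a binary logistic model in its standard form; as γ → 0 it recovers the usual (mean) logistic loss. Function and variable names are ours, and this is only an illustration of the loss, not the paper's nonlinear ICA estimator:

```python
import numpy as np

def gamma_cross_entropy(theta, X, y, gamma=0.5):
    """gamma-cross entropy of a binary logistic model (illustrative sketch).

    Large-probability outliers contribute p^gamma terms that are damped
    relative to the log loss, which is the source of the robustness.
    """
    # per-sample probability of the observed label under the logistic model
    p1 = 1.0 / (1.0 + np.exp(-X @ theta))
    p = np.where(y == 1, p1, 1.0 - p1)
    # normalizer: (sum_y p(y|x)^(1+gamma))^(gamma/(1+gamma))
    norm = (p1 ** (1 + gamma) + (1 - p1) ** (1 + gamma)) ** (gamma / (1 + gamma))
    # gamma-cross entropy; tends to the mean negative log-likelihood as gamma -> 0
    return -np.log(np.mean(p ** gamma / norm)) / gamma
```

Minimizing this quantity over `theta` (e.g. with any gradient-based optimizer) plays the role that minimizing the logistic loss plays in the non-robust methods.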

Paper: Learning Hawkes Processes from a Handful of Events

Learning the causal-interaction network of multivariate Hawkes processes is a useful task in many applications. Maximum-likelihood estimation is the most common approach when long observation sequences are available. However, when only short sequences are available, the lack of data amplifies the risk of overfitting, and regularization becomes critical. Because of the challenges of hyper-parameter tuning, state-of-the-art methods parameterize regularizers with only a single shared hyper-parameter, which limits the representational power of the model. To address both issues, in this work we develop an efficient algorithm based on variational expectation-maximization. Our approach can optimize over an extended set of hyper-parameters, and it accounts for uncertainty in the model parameters by learning a posterior distribution over them. Experimental results on both synthetic and real datasets show that our approach significantly outperforms state-of-the-art methods when observation sequences are short.
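For readers unfamiliar with the model being learned, the following sketch evaluates the conditional intensity of a multivariate Hawkes process with exponential kernels; the matrix `alpha` is the causal-interaction network the paper estimates. Names and the exponential-kernel choice are ours, not the paper's code:

```python
import numpy as np

def intensity(t, events, mu, alpha, beta):
    """Conditional intensity of a multivariate Hawkes process with
    exponential excitation kernels (illustrative sketch).

    events: list of arrays, events[j] = past event times in dimension j
    mu:     baseline rates, shape (d,)
    alpha:  causal-interaction matrix, alpha[i, j] = influence of j on i
    beta:   kernel decay rate
    """
    d = len(mu)
    lam = np.array(mu, dtype=float)  # start from the baseline rates
    for j in range(d):
        past = events[j][events[j] < t]
        # total excitation contributed by dimension j's past events
        excite = beta * np.exp(-beta * (t - past)).sum()
        lam += alpha[:, j] * excite
    return lam
```

Maximum-likelihood and variational EM approaches both work with this intensity; the paper's contribution lies in how the prior/regularizer over `alpha` and its hyper-parameters are handled.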

Paper: Space Objects Maneuvering Prediction via Maximum Causal Entropy Inverse Reinforcement Learning

This paper uses inverse reinforcement learning (RL) to determine the behavior of Space Objects (SOs) by estimating the reward function that an SO is using for control. The approach discussed in this work can be used to analyze the maneuvering of SOs from observational data. The inverse RL problem is solved using maximum causal entropy. This approach determines the optimal reward function that an SO is using while maneuvering under random disturbances, by assuming that the observed trajectories are optimal with respect to the SO's own reward function. Lastly, this paper develops results for scenarios involving Low Earth Orbit (LEO) and Geostationary Orbit (GEO) station-keeping.

Paper: Modeling National Latent Socioeconomic Health and Examination of Policy Effects via Causal Inference

This research develops a socioeconomic health index for nations through a model-based approach which incorporates spatial dependence and examines the impact of a policy through a causal modeling framework. As the gross domestic product (GDP) has come to be regarded as a dated measure and tool for benchmarking a nation's economic performance, there has been a growing consensus for an alternative measure, such as a composite 'wellbeing' index, to holistically capture a country's socioeconomic health performance. Many conventional ways of constructing wellbeing/health indices involve combining different observable metrics, such as life expectancy and education level, to form an index. However, health is inherently latent, with metrics actually being observable indicators of health. In contrast to the GDP or other conventional health indices, our approach provides a holistic quantification of the overall 'health' of a nation. We build upon the latent health factor index (LHFI) approach that has been used to assess unobservable ecological/ecosystem health. This framework integratively models the relationship between the metrics, the latent health, and the covariates that drive the notion of health. In this paper, the LHFI structure is integrated with spatial modeling and statistical causal modeling, so as to evaluate the impact of a policy variable (mandatory maternity leave days) on a nation's socioeconomic health, while formally accounting for spatial dependency among the nations. We apply our model to countries around the world using data on various metrics and potential covariates pertaining to different aspects of societal health. The approach is structured in a Bayesian hierarchical framework and results are obtained by Markov chain Monte Carlo techniques.

Paper: Fast Dimensional Analysis for Root Cause Investigation in Large-Scale Service Environment

Root cause analysis in a large-scale production environment is challenging due to the complexity of services running across global data centers. Due to the distributed nature of a large-scale system, the various hardware, software, and tooling logs are often maintained separately, making it difficult to review the logs jointly for detecting issues. Another challenge in reviewing the logs for identifying issues is the scale: there could easily be millions of entities, each with hundreds of features. In this paper we present a fast dimensional analysis framework that automates root cause analysis on structured logs with improved scalability. We first explore item-sets, i.e., groups of feature values, that identify groups of samples with sufficient support for the target failures, using the Apriori algorithm and a subsequent improvement, FP-Growth. These algorithms were designed for frequent item-set mining and association rule learning over transactional databases. After applying them to structured logs, we select the item-sets that are most unique to the target failures based on lift. With the use of a large-scale real-time database, we propose pre- and post-processing techniques and parallelism to further speed up the analysis. We have successfully rolled out this approach for root cause investigation purposes in a large-scale infrastructure. We also present the setup and results from multiple production use cases in this paper.
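The lift-based selection step can be sketched as follows. This hypothetical helper scores a single candidate item-set against a set of structured-log rows; in the framework described above, candidate item-sets would come from Apriori or FP-Growth rather than being supplied by hand:

```python
def lift_of_itemset(rows, failures, itemset, min_support=2):
    """Lift of an item-set toward the target failure (illustrative sketch).

    rows:     list of dicts mapping feature name -> value (one per sample)
    failures: parallel list of booleans marking the target failures
    itemset:  iterable of (feature, value) pairs
    """
    match = [all(r.get(f) == v for f, v in itemset) for r in rows]
    n_match = sum(match)
    if n_match < min_support:
        return None  # insufficient support for this item-set
    # lift = P(failure | item-set) / P(failure)
    p_fail = sum(failures) / len(rows)
    p_fail_given = sum(f for m, f in zip(match, failures) if m) / n_match
    return p_fail_given / p_fail
```

Item-sets with lift well above 1 are the ones most unique to the target failures and hence the most useful leads for root cause investigation.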

Paper: A two-dimensional propensity score matching method for longitudinal quasi-experimental studies: A focus on travel behavior and the built environment

The lack of longitudinal studies of the relationship between the built environment and travel behavior has been widely discussed in the literature. This paper discusses how standard propensity score matching estimators can be extended to enable such studies by pairing observations across two dimensions: longitudinal and cross-sectional. Researchers mimic randomized controlled trials (RCTs) by matching observations in both dimensions: finding synthetic control groups that are similar to the treatment group, and matching subjects synthetically across before-treatment and after-treatment time periods. We call this method two-dimensional propensity score matching (2DPSM). Monte Carlo evidence demonstrates its superior performance for estimating treatment effects. A near-term opportunity for such matching is identifying the impact of transportation infrastructure on travel behavior.
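The matching idea can be sketched minimally: estimate propensity scores (e.g. by logistic regression), then nearest-neighbour match treated to control units separately within each time period. This is our simplified illustration of matching in two dimensions, with hypothetical names and a caliper choice of our own; it is not the paper's full 2DPSM estimator:

```python
import numpy as np

def match_2d(ps, treated, period, caliper=0.05):
    """Nearest-neighbour matching on propensity scores `ps`, done
    separately within each time period (illustrative sketch).

    ps:      estimated propensity scores, shape (n,)
    treated: boolean array, True for treated units
    period:  array of period labels (e.g. 0 = before, 1 = after)
    Returns a list of (treated_index, control_index) pairs.
    """
    pairs = []
    for t in np.unique(period):
        in_t = period == t
        treat_idx = np.where(in_t & treated)[0]
        ctrl_idx = np.where(in_t & ~treated)[0]
        for i in treat_idx:
            if len(ctrl_idx) == 0:
                continue
            # closest control in the same period
            j = ctrl_idx[np.argmin(np.abs(ps[ctrl_idx] - ps[i]))]
            if abs(ps[j] - ps[i]) <= caliper:
                pairs.append((int(i), int(j)))
    return pairs
```

Treatment effects would then be estimated by comparing outcome changes across the matched pairs in the two periods.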

Paper: Estimating quantities conserved by virtue of scale invariance in timeseries

In contrast to the symmetries of translation in space, rotation in space, and translation in time, the known laws of physics are not universally invariant under transformation of scale. However, the action can be invariant under change of scale in the special case of a scale-free dynamical system that can be described in terms of a Lagrangian that itself scales inversely with time. Crucially, this means symmetries under change of scale can exist in dynamical systems under certain constraints. Our contribution lies in the derivation of a generalised scale-invariant Lagrangian – in the form of a power series expansion – that satisfies these constraints. This generalised Lagrangian furnishes a normal form for dynamic causal models (i.e., state space models based upon differential equations) that can be used to distinguish scale invariance (scale symmetry) from scale freeness in empirical data. We establish face validity with an analysis of simulated data and then show how scale invariance can be identified – and how the associated conserved quantities can be estimated – in neuronal timeseries.
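The condition stated above can be written compactly: under the rescaling of time t → λt, the action is invariant precisely when the Lagrangian scales inversely with time, L → λ⁻¹L, since

```latex
S = \int L \,\mathrm{d}t
\;\xrightarrow{\; t \mapsto \lambda t \;}\;
\int \lambda^{-1} L \,\mathrm{d}(\lambda t)
= \int L \,\mathrm{d}t = S .
```

By Noether's theorem, such an invariance of the action is what gives rise to the conserved quantities the paper sets out to estimate.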

Paper: Optimal two-stage testing of multiple mediators

Mediation analysis in high-dimensional settings often involves identifying potential mediators among a large number of measured variables. For this purpose, a two-step familywise error rate (FWER) procedure called ScreenMin has recently been proposed (Djordjilović et al., 2019). In ScreenMin, variables are first screened, and only those that pass the screening are tested. The proposed threshold for selection has been shown to guarantee asymptotic FWER control. In this work, we investigate the impact of the selection threshold on the finite-sample FWER. We derive the power-maximizing selection threshold and show that it is well approximated by an adaptive threshold of Wang et al. (2016). We study the performance of the proposed procedures in a simulation study, and apply them to a case-control study examining the effect of fish intake on the risk of colorectal adenoma.
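A ScreenMin-style two-step procedure can be sketched as follows: each mediator carries two component p-values (exposure-to-mediator and mediator-to-outcome), mediators are screened on the smaller of the two, and the survivors are Bonferroni-tested on the larger. This is our simplified rendering of the idea; the default screening threshold `c` shown here is an assumption, and the choice of `c` is precisely what the work above analyzes:

```python
def screen_min(p1, p2, alpha=0.05, c=None):
    """Two-step screen-then-test procedure for mediators (illustrative sketch).

    p1, p2: component p-values for each candidate mediator
    Returns indices of mediators declared significant.
    """
    m = len(p1)
    if c is None:
        c = alpha / m  # simple default; the selection threshold is the
                       # quantity whose optimal choice the paper derives
    # step 1: screen on the minimum of the two component p-values
    selected = [j for j in range(m) if min(p1[j], p2[j]) <= c]
    if not selected:
        return []
    # step 2: Bonferroni test of the maximum p-value over the selected set
    level = alpha / len(selected)
    return [j for j in selected if max(p1[j], p2[j]) <= level]
```

Because only the screened set is corrected for in step 2, the procedure can be far more powerful than a single Bonferroni correction over all m mediators.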