In many applications (e.g., anomaly detection and security systems) of smart cities, rare events dominate the importance of the total information of big data collected by Internet of Things (IoTs). That is, it is pretty crucial to explore the valuable information associated with the rare events involved in minority subsets of the voluminous amounts of data. To do so, how to effectively measure the information with importance of the small probability events from the perspective of information theory is a fundamental question. This paper first makes a survey of some theories and models with respect to importance measures and investigates the relationship between subjective or semantic importance and rare events in big data. Moreover, some applications for message processing and data analysis are discussed in the viewpoint of information measures. In addition, based on rare events detection, some open challenges related to information measures, such as smart cities, autonomous driving, and anomaly detection in IoTs, are introduced which can be considered as future research directions.
We study the independence structure of finitely exchangeable distributions over random vectors and random networks. In particular, we provide necessary and sufficient conditions for an exchangeable vector so that its elements are completely independent or completely dependent. We also provide a sufficient condition for an exchangeable vector so that its elements are marginally independent. We then generalize these results and conditions for exchangeable random networks. In this case, it is demonstrated that the situation is more complex. We show that the independence structure of exchangeable random networks lies in one of six regimes represented by undirected and bidirected independence graphs in graphical model sense. In addition, under certain additional assumptions, we provide necessary and sufficient conditions for the exchangeable network distributions to be faithful to each of these graphs.
In this work, we propose an introspection technique for deep neural networks that relies on a generative model to instigate salient editing of the input image for model interpretation. Such modification provides the fundamental interventional operation that allows us to obtain answers to counterfactual inquiries, i.e., what meaningful change can be made to the input image in order to alter the prediction. We demonstrate how to reveal interesting properties of the given classifiers by utilizing the proposed introspection approach on both the MNIST and the CelebA dataset.
Causal estimation relies on separating the variation in the outcome due to the confounders from that due to the treatment. To achieve this separation, practitioners can use external sources of randomness that only influence the treatment called instrumental variables (IVs). Traditional IV-methods rely on structural assumptions that limit the effect that the confounders can have on both outcome and treatment. To relax these assumptions we develop a new estimator called the generalized control-function method (GCFN). GCFN’s first stage called variational decoupling (VDE) recovers the residual variation in the treatment given the IV. In the second stage, GCFN regresses the outcome on the treatment and residual variation to compute the causal effect. We evaluate GCFN on simulated data and on recovering the causal effect of slave export on community trust. We show how VDE can help unify IV-estimators and non-IV-estimators.
Extreme hydrological events in the Danube river basin may severely impact human populations, aquatic organisms, and economic activity. One often characterizes the joint structure of the extreme events using the theory of multivariate and spatial extremes and its asymptotically justified models. There is interest however in cascading extreme events and whether one event causes another. In this paper, we argue that an improved understanding of the mechanism underlying severe events is achieved by combining extreme value modelling and causal discovery. We construct a causal inference method relying on the notion of the Kolmogorov complexity of extreme conditional quantiles. Tail quantities are derived using multivariate extreme value models and causal-induced asymmetries in the data are explored through the minimum description length principle. Our CausEV, for Causality for Extreme Values, approach uncovers causal relations between summer extreme river discharges in the upper Danube basin and finds significant causal links between the Danube and its Alpine tributary Lech.
Scientists construct and analyze computational models to understand the world. That understanding comes from efforts to augment, combine, and compare models of related phenomena. We propose SemanticModels.jl, a system that leverages techniques from static and dynamic program analysis to process executable versions of scientific models to perform such metamodeling tasks. By framing these metamodeling tasks as metaprogramming problems, SemanticModels.jl enables writing programs that generate and expand models. To this end, we present a category theory-based framework for defining metamodeling tasks, and extracting semantic information from model implementations, and show how this framework can be used to enhance scientific workflows in a working case study.