The Attributable Fraction (AF) Described as a Function of Disease Heritability, Prevalence and Intervention Specific Factors (AFheritability)
AFfunction() returns an estimate of the Attributable Fraction (AF) and a plot of the AF as a function of heritability, disease …
Computation and Visualization of Package Download Counts and Percentiles (packageRank)
Compute and visualize the cross-sectional and longitudinal number and rank percentile of package downloads from RStudio’s CRAN mirror.
Amazon Redshift Tools (redshiftTools)
Efficiently upload data into an Amazon Redshift database using the approach recommended by Amazon <https://aws.amazon.com/es/redshift/>. …
Spy on Your R Session (matahari)
Conveniently log everything you type into the R console. Logs are stored as tidy data frames which can then be analyzed using ‘tidyverse’ style tools.
Agglomerative Partitioning Framework for Dimension Reduction (partition)
A fast and flexible framework for agglomerative partitioning. ‘partition’ uses an approach called Direct-Measure-Reduce to create new variables that ma …
Tools and Palettes for Bivariate Thematic Mapping (biscale)
Provides a ‘ggplot2’ centric approach to bivariate mapping. This is a technique that maps two quantities simultaneously rather than the single value th …
“The mindset shift required for AI can lead to ‘cultural anxiety’ because it calls for a deep change in behaviors and ways of thinking.” Gartner (2018)
Moving Average Convergence Divergence (MACD)
MACD, short for moving average convergence/divergence, is a trading indicator used in technical analysis of stock prices, created by Gerald Appel in the late 1970s. It is supposed to reveal changes in the strength, direction, momentum, and duration of a trend in a stock’s price. The MACD indicator (or ‘oscillator’) is a collection of three time series calculated from historical price data, most often the closing price. These three series are: the MACD series proper, the ‘signal’ or ‘average’ series, and the ‘divergence’ series which is the difference between the two. The MACD series is the difference between a ‘fast’ (short period) exponential moving average (EMA), and a ‘slow’ (longer period) EMA of the price series. The average series is an EMA of the MACD series itself.
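The three series described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the 12/26/9 periods are a common convention assumed here (the text does not specify them), the EMA is seeded with the first observation (one of several possible choices), and the function names `ema` and `macd` and the price data are made up for the example.

```python
def ema(values, period):
    """Exponential moving average with smoothing factor 2 / (period + 1)."""
    alpha = 2.0 / (period + 1)
    out = []
    for v in values:
        # Seed with the first observation, then blend each new value in.
        out.append(v if not out else alpha * v + (1 - alpha) * out[-1])
    return out


def macd(prices, fast=12, slow=26, signal=9):
    """Return the MACD, signal (average), and divergence series.

    The default periods are an assumed convention, not taken from the text.
    """
    fast_ema = ema(prices, fast)
    slow_ema = ema(prices, slow)
    # MACD proper: fast EMA minus slow EMA of the price series.
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    # Signal (average) series: an EMA of the MACD series itself.
    signal_line = ema(macd_line, signal)
    # Divergence series: difference between the two.
    divergence = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, divergence


closes = [100 + 0.5 * i for i in range(60)]  # made-up, steadily rising closing prices
macd_line, signal_line, divergence = macd(closes)
print(round(macd_line[-1], 3), round(signal_line[-1], 3), round(divergence[-1], 3))
```

On a steadily rising series like this one, the fast EMA sits above the slow EMA, so the MACD series is positive; on a flat series all three series are zero.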
In statistics, Bessel’s correction, named after Friedrich Bessel, is the use of n – 1 instead of n in the formula for the sample variance and sample standard deviation, where n is the number of observations in a sample. This corrects the bias in the estimation of the population variance, and some (but not all) of the bias in the estimation of the population standard deviation, but often increases the mean squared error in these estimations. …
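As a small worked illustration of the n versus n – 1 divisor, the sketch below computes both variance estimates by hand and checks them against Python's standard `statistics` module, where `pvariance` divides by n and `variance` divides by n – 1. The sample values are made up for the example.

```python
import statistics

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up observations
n = len(sample)
mean = sum(sample) / n

# Sum of squared deviations from the sample mean.
ss = sum((x - mean) ** 2 for x in sample)

uncorrected = ss / n        # divides by n: biased estimate of the population variance
corrected = ss / (n - 1)    # Bessel's correction: divides by n - 1

print(uncorrected, statistics.pvariance(sample))
print(corrected, statistics.variance(sample))
```

The corrected estimate is always larger than the uncorrected one by a factor of n / (n – 1), which matters most for small samples.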
Orbit is a composable framework for orchestrating change processing, tracking, and synchronization across multiple data sources. Orbit is written in TypeScript and distributed on npm through the @orbit organization. Pre-built distributions are provided in several module formats and ES language levels. Orbit is isomorphic: it can run in both modern browsers and the Node.js runtime. …
Article: AI – Fear, uncertainty, and hope
Article: Towards Trans-Inclusive AI
Article: 9 Steps Toward Ethical AI
Article: Will Big Data Affect Opinion Polls?
Article: The Hitchhiker’s Guide to AI Ethics
Article: AI TRAPS: Automating Discrimination
Distributed Cooperative Logistics Platform (DCLP)
Supply chains and logistics are of growing importance in the global economy. Supply chain information systems around the world are heterogeneous, and each one can both produce and receive massive amounts of structured and unstructured data in real time, usually generated by information systems, connected objects, or manually by humans. This heterogeneity arises because the components and processes of Logistics Information Systems are developed with different modelling methods and run on many platforms; hence, decision making is difficult in such a multi-actor environment. In this paper we identify some current challenges and integration issues between separately designed Logistics Information Systems (LIS), and we propose a Distributed Cooperative Logistics Platform (DCLP) framework based on NoSQL, which facilitates real-time cooperation between stakeholders and improves the decision-making process in a multi-actor environment. We also include a case study of a Hospital Supply Chain (HSC) and a brief discussion of perspectives and future scope of work. …
Diverse Online Feature Selection
Online feature selection has been an active research area in recent years. We propose a novel diverse online feature selection method based on Determinantal Point Processes (DPP). Our model aims to provide diverse features which can be composed in either a supervised or unsupervised framework. The framework aims to promote diversity based on the kernel produced on a feature level, through at most three stages: feature sampling, local criteria and global criteria for feature selection. In the feature sampling stage, we sample the incoming stream of features using a conditional DPP. The local criteria are used to assess and select streamed features (i.e. only when they arrive); we use unsupervised scale-invariant methods to remove redundant features and, optionally, supervised methods to introduce label information to assess relevant features. Lastly, the global criteria use regularization methods to select a globally optimal subset of features. This three-stage procedure continues until there are no more features arriving or some predefined stopping condition is met. We demonstrate experimentally that this approach yields better compactness, and that it is comparable to, and in some instances outperforms, other state-of-the-art online feature selection methods. …
Pairwise Augmented GAN
We propose a novel autoencoding model called Pairwise Augmented GANs. We train a generator and an encoder jointly and in an adversarial manner. The generator network learns to sample realistic objects. In turn, the encoder network is trained at the same time to map the true data distribution to the prior in latent space. To ensure good reconstructions, we introduce an augmented adversarial reconstruction loss. Here we train a discriminator to distinguish two types of pairs: an object with its augmentation and an object with its reconstruction. We show that such an adversarial loss compares objects based on their content rather than on an exact match. We experimentally demonstrate that our model generates samples and reconstructions of a quality competitive with the state of the art on the MNIST, CIFAR10 and CelebA datasets, and achieves good quantitative results on CIFAR10. …
“The number of companies that are delivering Analytics as a Service via cloud-based platforms is increasing day-by-day.” Mithun Sridharan (September 4, 2014)
Graph kernels have attracted a lot of attention during the last decade, and have evolved into a rapidly developing branch of learning on structured data. During the past 20 years, the considerable research activity that occurred in the field resulted in the development of dozens of graph kernels, each focusing on specific structural properties of graphs. Graph kernels have proven successful in a wide range of domains, ranging from social networks to bioinformatics. The goal of this survey is to provide a unifying view of the literature on graph kernels. In particular, we present a comprehensive overview of a wide range of graph kernels. Furthermore, we perform an experimental evaluation of several of those kernels on publicly available datasets, and provide a comparative study. Finally, we discuss key applications of graph kernels, and outline some challenges that remain to be addressed. Graph Kernels: A Survey