Computed ABC Analysis (ABCanalysis)
For a given data set, the package provides a novel method of computing precise limits that divide the data into easily interpreted subsets. Closely related to the Lorenz curve, the ABC curve visualizes the data by graphically representing the cumulative distribution function. Based on an ABC analysis, the algorithm uses the ABC curve to calculate the optimal limits by exploiting the mathematical properties of the distribution of the analyzed items. Data containing positive values is divided into three disjoint subsets A, B, and C: subset A comprises the very profitable values, i.e., the largest data values (“the important few”); subset B comprises values where the profit equals the effort required to obtain it; and subset C comprises the non-profitable values, i.e., the smallest data values (“the trivial many”).
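The idea of reading set limits off the ABC curve can be illustrated with a minimal Python sketch. It is not the package's implementation: the function names abc_curve and abc_split are hypothetical, the curve is left as raw points rather than interpolated, and the choice of limits (the point nearest the ideal corner (0, 1), the point where an item's value drops below the mean, and the point nearest (break-even effort, 1)) is an assumption made only for illustration.

```python
from math import hypot


def abc_curve(values):
    """Return (data, effort, yield_) for the positive values, largest first.

    effort[i] is the fraction of items used up to item i,
    yield_[i] is the fraction of the total sum those items contribute.
    """
    data = sorted((v for v in values if v > 0), reverse=True)
    if not data:
        raise ValueError("need at least one positive value")
    total = sum(data)
    n = len(data)
    effort, yield_ = [], []
    running = 0.0
    for i, v in enumerate(data, start=1):
        running += v
        effort.append(i / n)
        yield_.append(running / total)
    return data, effort, yield_


def abc_split(values):
    """Split values into (A, B, C) using illustrative limits (assumptions)."""
    data, effort, yield_ = abc_curve(values)
    n = len(data)
    mean = sum(data) / n

    # Curve point closest to the ideal corner (effort 0, yield 1).
    pareto = min(range(n), key=lambda i: hypot(effort[i], 1.0 - yield_[i]))

    # Last item whose value is at least the mean: up to here each extra
    # item adds at least as much yield as it adds effort (slope >= 1).
    break_even = max(i for i in range(n) if data[i] >= mean)

    # Curve point closest to (break-even effort, full yield).
    bx = effort[break_even]
    submarginal = min(
        range(n), key=lambda i: hypot(effort[i] - bx, 1.0 - yield_[i])
    )

    ab = min(pareto, break_even)   # last index belonging to A
    bc = max(submarginal, ab)      # last index belonging to B
    return data[: ab + 1], data[ab + 1 : bc + 1], data[bc + 1 :]


a, b, c = abc_split([100, 60, 20, 10, 5, 3, 1, 1])
print("A:", a)
print("B:", b)
print("C:", c)
```

In this toy input, two of the eight items carry 80% of the total, so they form set A, while the long tail of small values ends up in set C.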
Extensible, Parallelizable Implementation of the Random Forest Algorithm (Rborist)
Scalable decision tree training and prediction.
Functions to Automate Downloading Geospatial Data Available from Several Federated Data Sources (FedData)
Functions to automate downloading geospatial data available from several federated data sources (mainly sources maintained by the US Federal government). Currently, the package allows for retrieval of four datasets: the National Elevation Dataset digital elevation models (1 and 1/3 arc-second; USGS); the National Hydrography Dataset (USGS); the Soil Survey Geographic (SSURGO) database from the National Cooperative Soil Survey (NCSS), which is led by the Natural Resources Conservation Service (NRCS) under the USDA; and the Global Historical Climatology Network (GHCN), coordinated by the National Climatic Data Center at NOAA. Additional data sources are in the works, including global DEM resources (ETOPO1, ETOPO5, ETOPO30, SRTM), global soils (HWSD), tree-ring records (ITRDB), MODIS satellite data products, the National Atlas (US), Natural Earth, PRISM, and WorldClim.
Exploratory Data Analysis and Manipulation of Multi-Label Data Sets (mldr)
Exploratory data analysis and manipulation functions for multi-label data sets, along with an interactive Shiny application to ease their use.
Spatio-Temporal Modeling of Large Data Using a Spectral SPDE Approach (spate)
This package provides functionality for spatio-temporal modeling of large data sets. A Gaussian process in space and time is defined through a stochastic partial differential equation (SPDE). The SPDE is solved in the spectral space, and after discretizing in time and space, a linear Gaussian state space model is obtained. When doing inference, the main computational difficulty lies in evaluating the likelihood and in sampling from the full conditional of the spectral coefficients or, equivalently, the latent space-time process. Compared with the traditional approach of using a spatio-temporal covariance function, the spectral SPDE approach is computationally advantageous. The package aims to provide tools for two different modeling approaches. First, the SPDE-based spatio-temporal model can be used as a component in a customized hierarchical Bayesian model (HBM). The functions of the package then provide parameterizations of the process part of the model as well as the computationally efficient algorithms needed for doing inference with the HBM. Alternatively, the adaptive MCMC algorithm implemented in the package can be used for inference without any additional modeling. The MCMC algorithm supports data that follow a Gaussian or a censored distribution with point mass at zero. Covariates can be included in the model through a regression term.
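To make the likelihood evaluation for a linear Gaussian state space model concrete, here is a minimal Python sketch of a Kalman filter log-likelihood for a single scalar state. It is a toy instance under stated assumptions, not the package's spectral model or API: the function name kalman_loglik and the parameters g (propagation), q (innovation variance), r (observation variance), m0 and p0 (initial mean and variance) are hypothetical, and the model assumed is x_t = g * x_{t-1} + eps_t, y_t = x_t + nu_t.

```python
from math import log, pi


def kalman_loglik(ys, g, q, r, m0=0.0, p0=1.0):
    """Log-likelihood of observations ys under a scalar state space model.

    State:       x_t = g * x_{t-1} + eps_t,  eps_t ~ N(0, q)
    Observation: y_t = x_t + nu_t,           nu_t  ~ N(0, r)
    The state before the first step has mean m0 and variance p0.
    """
    m, p = m0, p0
    loglik = 0.0
    for y in ys:
        # Prediction step: propagate mean and variance one time step.
        m_pred = g * m
        p_pred = g * g * p + q

        # Innovation (one-step-ahead forecast error) and its variance.
        e = y - m_pred
        s = p_pred + r
        loglik += -0.5 * (log(2.0 * pi * s) + e * e / s)

        # Update step: condition the state on the new observation.
        k = p_pred / s
        m = m_pred + k * e
        p = (1.0 - k) * p_pred
    return loglik


print(kalman_loglik([0.3, -0.1, 0.4], g=0.8, q=0.5, r=0.2))
```

The cost of this recursion grows linearly with the number of time points, which is the kind of structure that makes the state space form attractive compared with working from a full space-time covariance matrix.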
Univariate and Multivariate Spatial-temporal Modeling (spBayes)
Fits univariate and multivariate spatio-temporal models with Markov chain Monte Carlo (MCMC).