MatrixDS
Work on your own projects, collaborate with others, and share with the whole community on a secure cloud-based platform. …

MediChainTM
The set of distributed ledger architectures known as blockchain is best known for cryptocurrency applications such as Bitcoin and Ethereum. These permissionless blockchains are showing the potential to disrupt the financial services industry. Their broader adoption is likely to be limited by the maximum block size, the cost of the Proof of Work consensus mechanism, and the growing size of any given chain, which overwhelms most of the participating nodes. These factors have led many cryptocurrency blockchains to become centralized in the nodes with enough computing power and storage to be a dominant miner and validator. Permissioned chains operate in trusted environments and can therefore avoid the computationally expensive consensus mechanisms. However, permissioned chains are still susceptible to asset storage demands and non-standard user interfaces that will impede their adoption. This paper describes an approach to addressing these limitations: a permissioned blockchain that stores the data assets off-chain and is accessed through a standard browser and mobile app. The implementation in the Hyperledger framework is described, as is an example use in patient-centered health data management. …
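
The off-chain storage idea can be illustrated with a minimal sketch. It is not the paper's Hyperledger implementation: `OffChainStore`, `Ledger`, and `verify` are hypothetical names, and both the store and the ledger are plain in-memory objects. The only point shown is that the ledger keeps a small SHA-256 digest while the data asset itself lives elsewhere.

```python
import hashlib


class OffChainStore:
    """Hypothetical off-chain store: maps a content digest to the raw bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The SHA-256 digest of the content serves as its lookup key.
        digest = hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]


class Ledger:
    """Hypothetical append-only ledger that holds only small records."""

    def __init__(self):
        self.records = []

    def append(self, owner: str, digest: str) -> int:
        # Only the owner and the digest go on the ledger, never the asset.
        self.records.append({"owner": owner, "sha256": digest})
        return len(self.records) - 1


def verify(ledger: Ledger, store: OffChainStore, index: int) -> bool:
    """Check that the off-chain bytes still match the digest on the ledger."""
    record = ledger.records[index]
    data = store.get(record["sha256"])
    return hashlib.sha256(data).hexdigest() == record["sha256"]


store = OffChainStore()
ledger = Ledger()
digest = store.put(b"example health record")  # placeholder data
index = ledger.append("patient-1", digest)    # placeholder owner id
print(verify(ledger, store, index))
```

Under these assumptions the ledger grows by one short record per asset regardless of the asset's size, and any change to the off-chain bytes is detectable by recomputing the digest.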

Data Fusion
Data fusion is the process of integrating multiple pieces of data and knowledge that represent the same real-world object into a consistent, accurate, and useful representation. Data fusion processes are often categorized as low, intermediate, or high, depending on the processing stage at which fusion takes place. Low-level data fusion combines several sources of raw data to produce new raw data. The expectation is that the fused data is more informative and synthetic than the original inputs. For example, sensor fusion, also known as (multi-sensor) data fusion, is a subset of information fusion. …
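
As one possible illustration of low-level fusion, the sketch below combines several noisy readings of the same quantity by inverse-variance weighting. This is a standard technique chosen here as an example, not something the text above prescribes; the function name `fuse` and the assumption that each reading comes with a known, positive variance are introduced for illustration.

```python
def fuse(readings):
    """Fuse (value, variance) pairs by inverse-variance weighting.

    Assumes every variance is positive and the readings are independent.
    Returns the fused value and the variance of the fused estimate.
    """
    # More precise readings (smaller variance) get larger weights.
    weights = [1.0 / variance for _, variance in readings]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, readings)) / total
    return value, 1.0 / total


# Two hypothetical sensors measuring the same temperature.
fused_value, fused_variance = fuse([(10.0, 4.0), (12.0, 4.0)])
print(fused_value, fused_variance)
```

The fused variance is never larger than the smallest input variance, which is one concrete sense in which fused data can be more informative than any single input.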

PingAn
Geo-distributed data analysis in a cloud-edge system is emerging as a daily demand. To save time on wide-area data transfer, some tasks are dispatched to the edge clusters that satisfy data locality. However, execution in the edge clusters is less reliable due to limited resources, overload interference, and cluster-level unreachability, which makes it hard to guarantee the speed and completion of jobs. Considering both cluster heterogeneity and costly inter-cluster data fetches, we aim to make effective copies of tasks across clusters to ensure both the success and the efficiency of arriving jobs. To this end, we design PingAn, an online insurance algorithm that makes redundant cross-cluster copies of tasks and is $(1+\varepsilon)$-speed $o\left(\frac{1}{\varepsilon^2+\varepsilon}\right)$-competitive in the sum of job flowtimes. PingAn shares resources among a subset of jobs with an adjustable $\varepsilon$ fraction to fit the system load, and insures tasks following an efficiency-first, reliability-aware principle to optimize the effect of copies on job performance. Trace-driven simulations demonstrate that PingAn can reduce the average job flowtime by at least $14\%$ more than the state-of-the-art speculation mechanisms. We also build PingAn in Spark on YARN to verify its practicality and generality. Experiments show that PingAn can reduce the average job completion time by up to $40\%$ compared to the default Spark execution. …
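
The objective mentioned above, the sum of job flowtimes, can be made concrete with a small sketch. This is not the PingAn algorithm. It only assumes the usual semantics of redundant copies: a task finishes when its earliest copy finishes, a job finishes when its last task finishes, and a copy on an unreachable cluster never finishes (modeled as `math.inf`). The function names are hypothetical.

```python
import math


def job_flowtime(arrival, tasks):
    """Flowtime of one job: completion time minus arrival time.

    tasks is a list of tasks; each task is a list of finish times,
    one per copy. math.inf marks a copy that never finishes.
    """
    # A task is done as soon as its fastest copy is done.
    task_finish_times = [min(copies) for copies in tasks]
    # A job is done only when its slowest task is done.
    return max(task_finish_times) - arrival


def total_flowtime(jobs):
    """Sum of flowtimes over (arrival, tasks) pairs."""
    return sum(job_flowtime(arrival, tasks) for arrival, tasks in jobs)


# Hypothetical job: the first task's original copy sits on an unreachable cluster.
without_copy = [(0.0, [[math.inf], [3.0]])]
with_copy = [(0.0, [[math.inf, 5.0], [3.0]])]
print(total_flowtime(without_copy), total_flowtime(with_copy))
```

Under these assumptions a single extra copy turns an unbounded flowtime into a finite one, which is the intuition behind insuring tasks with cross-cluster copies.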