System G
Motivated by the need to extract knowledge and value from interconnected data, graph analytics on big data is a very active area of research in both industry and academia. To support graph analytics efficiently, a large number of in-memory graph libraries, graph processing systems and graph databases have emerged. Projects in each of these categories focus on particular aspects such as static versus dynamic graphs, offline versus online processing, small versus large graphs, etc. While there has been much progress in graph processing in the past decades, there is still a need for fast graph processing using a cluster of machines with distributed storage. In this paper, we discuss a novel distributed graph database called System G, designed for efficient graph data storage and processing on modern computing architectures. In particular, we describe a single-node graph database and a runtime and communication layer that allows us to compose a distributed graph database from multiple single-node instances. From various industry requirements, we find that fast insertions and large-volume concurrent queries are critical parts of graph databases, and we optimize our database for such features. We experimentally show the efficiency of System G for storing data and processing graph queries on state-of-the-art platforms. …

tick
tick is a statistical learning library for Python 3, with a particular emphasis on time-dependent models, such as point processes, and tools for generalized linear models and survival analysis. The core of the library is an optimization module providing model computational classes, solvers and proximal operators for regularization. tick relies on a C++ implementation and state-of-the-art optimization algorithms to provide very fast computations in a single-node, multi-core setting. Source code and documentation can be downloaded from https://…/tick
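
To make the "model, solver, proximal operator" split concrete, here is a minimal pure-Python sketch of proximal gradient descent (ISTA) for an L1-regularized least-squares problem. It does not use tick's API; the names `soft_threshold`, `least_squares_grad` and `ista`, the fixed step size, and the toy data are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def soft_threshold(x: Vector, threshold: float) -> Vector:
    """Proximal operator of threshold * ||x||_1 (coordinate-wise shrinkage)."""
    out = []
    for v in x:
        if v > threshold:
            out.append(v - threshold)
        elif v < -threshold:
            out.append(v + threshold)
        else:
            out.append(0.0)
    return out


def least_squares_grad(X: Matrix, y: Vector, w: Vector) -> Vector:
    """Gradient of (1 / (2n)) * ||X w - y||^2 with respect to w."""
    n = len(y)
    residual = [sum(xi * wi for xi, wi in zip(row, w)) - yi
                for row, yi in zip(X, y)]
    return [sum(X[i][j] * residual[i] for i in range(n)) / n
            for j in range(len(w))]


def ista(X: Matrix, y: Vector, lam: float, step: float,
         n_iter: int = 200) -> Vector:
    """Proximal gradient descent: a gradient step on the smooth loss,
    followed by the proximal operator of the L1 penalty (hypothetical solver)."""
    w = [0.0] * len(X[0])
    for _ in range(n_iter):
        grad = least_squares_grad(X, y, w)
        shifted = [wi - step * gi for wi, gi in zip(w, grad)]
        w = soft_threshold(shifted, step * lam)
    return w


if __name__ == "__main__":
    # Toy data (assumed): identity design, so each coefficient is fit separately.
    X = [[1.0, 0.0], [0.0, 1.0]]
    y = [3.0, 0.1]
    print(ista(X, y, lam=0.5, step=1.0))
```

The model supplies the gradient, the proximal operator encodes the regularizer, and the solver only combines the two, which is why swapping the penalty requires changing a single function in this sketch.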

MixTrain
There is an arms race to defend neural networks against adversarial examples. Notably, adversarially robust training and verifiably robust training are the most promising defenses. Adversarially robust training scales well but cannot provide a provable robustness guarantee for the absence of attacks. We present an Interval Attack that reveals fundamental problems with the threat model used by adversarially robust training. In contrast, verifiably robust training achieves a sound guarantee, but it is computationally expensive and sacrifices accuracy, which prevents it from being applied in practice. In this paper, we propose two novel techniques for verifiably robust training, stochastic output approximation and dynamic mixed training, to solve the aforementioned challenges. They are based on two critical insights: (1) soundness is only needed in a subset of training data; and (2) verifiable robustness and test accuracy conflict with each other after a certain point of verifiably robust training. On both MNIST and CIFAR datasets, we are able to achieve similar test accuracy and estimated robust accuracy against PGD attacks with $14\times$ less training time compared to state-of-the-art adversarially robust training techniques. In addition, we have up to 95.2% verified robust accuracy as a bonus. Also, to achieve similar verified robust accuracy, we are able to save up to $5\times$ computation time and offer a 9.2% test accuracy improvement compared to current state-of-the-art verifiably robust training techniques. …
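
As an illustration of what a sound robustness guarantee means, the sketch below propagates an L-infinity input box through an affine layer and a ReLU using plain interval arithmetic, then checks whether the true class provably wins for every input in the box. This is generic interval bound propagation, not MixTrain's stochastic output approximation or dynamic mixed training; the function names and the toy weights are assumptions.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def affine_bounds(W: Matrix, b: Vector, lower: Vector,
                  upper: Vector) -> Tuple[Vector, Vector]:
    """Sound output bounds of W x + b for all x with lower <= x <= upper."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        lo = hi = bias
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                # A negative weight flips which end of the interval is extreme.
                lo += w * u
                hi += w * l
        out_lo.append(lo)
        out_hi.append(hi)
    return out_lo, out_hi


def relu_bounds(lower: Vector, upper: Vector) -> Tuple[Vector, Vector]:
    """ReLU is monotone, so it can be applied to each bound directly."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


def is_verified_robust(logit_lower: Vector, logit_upper: Vector,
                       true_label: int) -> bool:
    """True if the true-class lower bound beats every other upper bound."""
    return all(logit_lower[true_label] > hi
               for k, hi in enumerate(logit_upper) if k != true_label)


if __name__ == "__main__":
    # Toy one-layer classifier and input (assumed values).
    W, b = [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]
    x, eps = [1.0, 0.0], 0.1
    lower = [v - eps for v in x]
    upper = [v + eps for v in x]
    lo, hi = affine_bounds(W, b, lower, upper)
    print(is_verified_robust(lo, hi, true_label=0))
```

When the check returns `True`, no perturbation inside the box can change the prediction; when it returns `False`, the bounds are merely inconclusive, since interval arithmetic over-approximates the reachable outputs.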