We've created a completely new processor that's the first to be specifically designed for machine intelligence workloads – an Intelligence Processing Unit (IPU) that will set a new pace of innovation. The IPU has been optimized to work efficiently on the extremely complex high-dimensional models needed for machine intelligence workloads. It emphasizes massively parallel, low-precision floating-point compute and provides much higher compute density than other solutions. At NIPS 2017 we shared more information about our IPU architecture, how to program the IPU with TensorFlow and Poplar, and some preliminary benchmarks.

GraphQL is the new REST

1. A query language for your API: GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools.
2. Ask for what you need, get exactly that: Send a GraphQL query to your API and get exactly what you need, nothing more and nothing less. GraphQL queries always return predictable results. Apps using GraphQL are fast and stable because they control the data they get, not the server.
3. Get many resources in a single request: GraphQL queries access not just the properties of one resource but also smoothly follow references between them. While typical REST APIs require loading from multiple URLs, GraphQL APIs get all the data your app needs in a single request. Apps using GraphQL can be quick even on slow mobile network connections.
4. Describe what's possible with a type system: GraphQL APIs are organized in terms of types and fields, not endpoints. Access the full capabilities of your data from a single endpoint. GraphQL uses types to ensure Apps only ask for what's possible and provide clear and helpful errors. Apps can use types to avoid writing manual parsing code.
5. Evolve your API without versions: Add new fields and types to your GraphQL API without impacting existing queries. Aging fields can be deprecated and hidden from tools. By using a single evolving version, GraphQL APIs give apps continuous access to new features and encourage cleaner, more maintainable server code.
6. Bring your own data and code: GraphQL creates a uniform API across your entire application without being limited by a specific storage engine. Write GraphQL APIs that leverage your existing data and code with GraphQL engines available in many languages. You provide functions for each field in the type system, and GraphQL calls them with optimal concurrency.
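
To make points 2, 3 and 6 concrete, here is a toy sketch in plain Python. It is not a GraphQL implementation and does not use any GraphQL library: the `USERS` data, the `RESOLVERS` table, the `execute` function and the dict-based query shape are all hypothetical names invented for illustration. It only shows the idea of one function per field, with the client's selection deciding which fields come back.

```python
# Toy illustration only: not GraphQL syntax and not any library's API.
# Hypothetical in-memory data standing in for "your existing data".
USERS = {
    1: {"id": 1, "name": "Ada", "email": "ada@example.com", "friend_ids": [2]},
    2: {"id": 2, "name": "Grace", "email": "grace@example.com", "friend_ids": []},
}

# One resolver function per field of the (hypothetical) User type.
RESOLVERS = {
    "id": lambda user: user["id"],
    "name": lambda user: user["name"],
    "email": lambda user: user["email"],
    # A reference to other resources, followed within the same request.
    "friends": lambda user: [USERS[i] for i in user["friend_ids"]],
}


def execute(selection, user):
    """Resolve only the fields named in `selection`.

    `selection` maps a field name to None (scalar field) or to a nested
    selection dict (a list of related users to resolve recursively).
    """
    result = {}
    for field, sub_selection in selection.items():
        if field not in RESOLVERS:
            # The "type system" rejects fields that do not exist.
            raise KeyError(f"Unknown field: {field}")
        value = RESOLVERS[field](user)
        if sub_selection is None:
            result[field] = value
        else:
            result[field] = [execute(sub_selection, item) for item in value]
    return result


# The client asks for a name and the names of friends, nothing else.
query = {"name": None, "friends": {"name": None}}
response = execute(query, USERS[1])
```

Here `response` contains only the requested fields: the `email` and `id` values are never returned, and the friend lookup happens inside the same call rather than through a second request.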

Preliminary IPU benchmarks

Graphcore’s IPU (Intelligence Processing Unit) is a new AI accelerator bringing an unprecedented level of performance to both current and future machine learning workloads. Its unique combination of massively parallel multi-tasking compute, synchronized execution within an IPU or across multiple IPUs, an innovative data exchange fabric and large amounts of on-chip SRAM gives unheard-of capabilities for both training and inference across a large range of machine learning algorithms.

Using Python to Power Spreadsheets in Data Science

Learn how Python can be used more effectively than Excel, with the Pandas package.

A Better Stats 101

Statistics encourages us to think systemically and recognize that variables normally do not operate in isolation, and that an effect usually has multiple causes. Some call this multivariate thinking. Statistics is particularly useful for uncovering the Why.

R 3.5.0 on Debian and Ubuntu: An Update

R 3.5.0 was released a few weeks ago. As it changes some (important) internals, packages installed with a previous version of R have to be rebuilt. This was known and expected, and we took several measured steps to get R binaries to everybody without breakage. The question "but how do I upgrade without breaking my system?" was asked a few times, e.g., on the r-sig-debian list as well as in this StackOverflow question.

5 Machine Learning Projects You Should Not Overlook, June 2018

Here is a new installment of 5 more machine learning or machine learning-related projects you may not yet have heard of, but may want to consider checking out!
1. Live Loss Plot
2. Parfit
3. Yellowbrick
4. textgenrnn
5. Magnitude

Improving Language Understanding by Generative Pre-Training

Natural language understanding comprises a wide range of diverse tasks such as textual entailment, question answering, semantic similarity assessment, and document classification. Although large unlabeled text corpora are abundant, labeled data for learning these specific tasks is scarce, making it challenging for discriminatively trained models to perform adequately. We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. In contrast to previous approaches, we make use of task-aware input transformations during fine-tuning to achieve effective transfer while requiring minimal changes to the model architecture. We demonstrate the effectiveness of our approach on a wide range of benchmarks for natural language understanding. Our general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, significantly improving upon the state of the art in 9 out of the 12 tasks studied. For instance, we achieve absolute improvements of 8.9% on commonsense reasoning (Stories Cloze Test), 5.7% on question answering (RACE), and 1.5% on textual entailment (MultiNLI).
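
The "task-aware input transformations" mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration based on how the paper describes them (structured inputs are flattened into a single token sequence so the pre-trained model needs little architectural change), not the authors' code. The token strings `<s>`, `$` and `<e>`, the function names, and the whitespace tokenization are placeholder assumptions.

```python
# Placeholder special tokens (assumed names, chosen for illustration).
START, DELIM, EXTRACT = "<s>", "$", "<e>"


def entailment_input(premise, hypothesis):
    """Premise and hypothesis joined into one sequence with a delimiter."""
    return [START, *premise, DELIM, *hypothesis, EXTRACT]


def similarity_inputs(text_a, text_b):
    """Similarity has no inherent ordering, so both orderings are built."""
    return [
        [START, *text_a, DELIM, *text_b, EXTRACT],
        [START, *text_b, DELIM, *text_a, EXTRACT],
    ]


def multiple_choice_inputs(context, question, answers):
    """One sequence per candidate answer; each would be scored separately."""
    return [
        [START, *context, *question, DELIM, *answer, EXTRACT]
        for answer in answers
    ]


# Whitespace splitting stands in for a real tokenizer.
premise = "a man is sleeping".split()
hypothesis = "a person rests".split()
sequence = entailment_input(premise, hypothesis)
```

Each function returns plain token lists; in a real setup these would be mapped to ids and fed to the fine-tuned language model, with a task-specific output layer reading the representation at the final token.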

The ssh Package: Secure Shell (SSH) Client for R

Have you ever needed to connect to a remote server over SSH to transfer files via SCP or to set up a secure tunnel, and wished you could do so from R itself? The new rOpenSci ssh package provides a native SSH client in R that allows you to do that and more, like running a command or script on the host while streaming stdout and stderr directly to the client. The package is based on libssh, a powerful C library implementing the SSH protocol.

Merging spatial buffers in R

June 11, 2018. I'm sure there's a better way out there, but I struggled to find a way to dissolve polygons that touched/overlapped each other (the special case being buffers). For example, using the osmdata package, we can download the polygons representing hospital buildings in Bern, Switzerland.

Customizing time and date scales in ggplot2

In the last post of this series we dealt with axis systems. In this post we are also dealing with axes, but this time we are taking a look at the position scales of dates, times and datetimes. Since we at STATWORX are often forecasting – and thus plotting – time series, this is an important issue for us. The choice of axis ticks and labels can make the message conveyed by a plot clearer. Oftentimes, some points in time are – e.g. due to their business implications – more important than others and should be easily identified. Unequivocal, yet parsimonious labeling is key to the readability of any plot. Luckily, ggplot2 enables us to do so for dates and times with almost no effort at all.

Create outstanding dashboards with the new semantic.dashboard package

We all know that Shiny is great for interactive data visualisations. But sometimes even the best attempts to fit all your graphs on just one Shiny page are not enough. In our experience, almost every project with a growing number of KPIs struggles with messy, unreadable final reports. This is where dashboards come in handy. Dashboards allow you to intuitively structure your reports by breaking them down into sections, panels and tabs, which makes it much easier for the end user to navigate through your work. shinydashboard does a decent job here. However, when you create a bunch of apps using it, you quickly realize that they all look the same and are simply boring. In this tutorial, I will show you how to take advantage of the semantic.dashboard package. This is an alternative to shinydashboard which makes use of Semantic UI. Thanks to that, you can introduce beautiful Semantic components into your app and select from many available themes.