Why GEMM is at the heart of deep learning
I spend most of my time worrying about how to make deep learning with neural networks faster and more power efficient. In practice that means focusing on a function called GEMM. It’s part of the BLAS (Basic Linear Algebra Subprograms) library that was first created in 1979, and until I started trying to optimize neural networks I’d never heard of it.
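For readers who, like the author, have not met GEMM before: it stands for GEneral Matrix-Matrix multiplication, and the BLAS routine computes C = alpha * A * B + beta * C. The sketch below is a deliberately naive pure-Python illustration of that formula. The function name `gemm` and its list-of-lists argument layout are assumptions made for this example, not the BLAS interface, and real implementations are heavily optimized.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a x b) + beta * c for matrices stored as lists of rows.

    Illustrative only: a hypothetical helper, not the BLAS calling convention.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must have shape (rows of a) x (columns of b)")

    out = []
    for i in range(m):
        out_row = []
        for j in range(n):
            # Dot product of row i of a with column j of b.
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            out_row.append(alpha * acc + beta * c[i][j])
        out.append(out_row)
    return out


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[1.0, 1.0], [1.0, 1.0]]
result = gemm(1.0, a, b, 0.5, c)
print(result)  # [[19.5, 22.5], [43.5, 50.5]]
```

The triple loop makes the cost visible: multiplying an m x k matrix by a k x n matrix takes m * n * k multiply-adds, which is why this one routine tends to dominate running time.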
R vs QGIS for sustainable transport planning
This article is about one part of GISRUK and insights gleaned from it about R, QGIS and other tools for sustainable transport planning.
Rscript as Service API
R is becoming a popular programming language in data science. Integrating R scripts with web UI pages is a challenge that many application developers face. In this blog post I will explain how to expose an R script as an API using rApache and the Apache web server. rApache is a project supporting web application development using the R statistical language and environment and the Apache web server.
From Data to Insight: Seven Tips for a Great Data Strategy
1. Brainstorming on Current and Future Goals
2. Understanding your End-to-End Business Processes
3. Doing a Data Inventory
4. Knowing which Tools and Techniques to Use
5. Legal and Policy Considerations
6. How your Approach will Work in Different Locations
7. How your Target Market will Interact with your Data
How to convert an R data.tree to JSON
I have recently published the data.tree R package to CRAN. It provides OO-style tree building, with standard tree traversal methods. Read the vignette about data.tree features if you are interested, or the one explaining how to use data.tree for classification models. I’ve been asked how to convert a data.tree to XML or JSON. So here’s the answer. Bear in mind that data.tree has not been built for this purpose, so we need to do a few extra steps and the code is not really beautiful. However, thinking about it, it’s a natural application, as JSON and XML documents are inherently trees. Also, I’m surprised how easy it was to come up with a generic answer.
Awk in 20 Minutes
Awk is a tiny programming language and a command line tool. It’s particularly appropriate for log parsing on servers, mostly because Awk operates on files, usually structured in lines of human-readable text. I say it’s useful on servers because log files, dump files, or whatever text format servers end up dumping to disk tend to grow large, and you’ll have many of them per server. If you ever get into the situation where you have to analyze gigabytes of files from 50 different servers without tools like Splunk or its equivalents, it would feel fairly bad to have to download all these files locally and then run some forensics on them. This happens to me personally when some Erlang nodes die and leave a crash dump of 700MB to 4GB behind, or on smaller individual servers (say a VPS) where I need to quickly go through logs, looking for a common pattern. In any case, Awk does more than find data (otherwise, grep or ack would be enough): it also lets you process the data and transform it.
Linkurious Enterprise democratizes graph visualization
Linkurious announces the launch of Linkurious Enterprise, the first data visualization platform for graph databases.
ID3 Classification using data.tree
This introduction provides a stylized example of the capabilities of the R package data.tree. The code for this post was written in less than an hour. This is possible because, thanks to the data.tree package, the implementation of the training algorithm follows the algorithm’s pseudocode almost line by line.
Combining several lattice charts into one
Last week I mentioned the grid.arrange function of the gridExtra package that allows me to combine graphical grid objects onto one page. The latticeExtra package provides another elegant solution for trellis (lattice) plots: the function c.trellis() or just c() combines the panels of multiple trellis objects into one.