Interactive RTutor Problemsets via RStudio Cloud

I just learned about RStudio Cloud, which lets you run RStudio instances directly from your browser. Moreover, you can set up RStudio projects that other users can copy and use themselves. RStudio Cloud is still in alpha, and registration is currently free.

The Keras 4 Step Workflow

In his book ‘Deep Learning with Python,’ Francois Chollet outlines a four-step process for developing neural networks with Keras. Let’s take a look at this process with a simple example.
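The four steps are: define your training data; define a model as a network of layers; configure the learning process with compile(); and iterate on the training data by calling fit(). A minimal sketch of that workflow using tf.keras follows; the toy data, layer sizes, and hyperparameters are illustrative choices of mine, not taken from the book:

```python
import numpy as np
from tensorflow import keras

# Step 1: define your training data (a toy binary-classification set).
X = np.random.rand(200, 8).astype("float32")
y = (X.sum(axis=1) > 4).astype("float32")

# Step 2: define the model as a stack of layers.
model = keras.Sequential([
    keras.layers.Input(shape=(8,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Step 3: configure the learning process (loss, optimizer, metrics).
model.compile(optimizer="rmsprop",
              loss="binary_crossentropy",
              metrics=["accuracy"])

# Step 4: iterate on the training data by calling fit().
model.fit(X, y, epochs=5, batch_size=32, verbose=0)

preds = model.predict(X, verbose=0)  # probabilities in [0, 1]
```

The same four calls (construct, compile, fit, predict) carry over to larger models; only the data and layer stack change.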

Bitcoin Transactions: From BigQuery to MapD

Bitcoin, cryptocurrency, and blockchain technology were hot topics in many 2017 water-cooler chats, especially after the meteoric rise in Bitcoin prices late last year. Despite Bitcoin’s popularity, very few of us, myself included, understand the inner workings of an actual Bitcoin transaction and how this growing public ledger is maintained. But wait: I work for MapD – the most powerful visual analytics engine on the planet – and extremely large data sets are perfect fodder for MapD. So I ingested the raw Bitcoin transactions into the MapD Core analytics database, along with other publicly available market data. Now we can explore the correlations between the different datasets and visualize them in MapD Immerse. Later in this blog, I show how, with a few clicks, I was able to visually drill down into the transactions during the wild ride of late 2017 through early 2018, finding all the million-dollar transactions that involve the top 100 most popular Bitcoin addresses.

The Thoughtful Programmer, A Thoughtful Citizen – An Educational Agenda for Computer and Data Science

Artificial intelligence (AI) is the science and technology of constructing intelligent agents. Roughly, these are technologies that behave in an environment in such a way that, if the behavior were performed by humans, we would call it “intelligent”. In this sense “intelligence” basically refers to instrumental conceptions of rational decision-making as found in economics and statistics: the ability to make instrumentally optimal decisions by following certain kinds of plans and inferences. In the last two decades, the combination of machine learning, statistics, control theory, and computational neuroscience with the availability of vast amounts of data and computer processing power has yielded huge advances in AI. These show up in a wide variety of domains, such as speech recognition and machine translation, autonomous vehicles and aircraft, bipedal movement, computer vision, question-and-answer systems, and ranking systems. It is now widely accepted that AI research is making rapid advances and that its societal impacts will steadily increase.
One need only consider the scale of investment: according to some estimates, the leading technology giants spent up to USD 30 billion on AI in 2016, with 90% of this spent on R&D and deployment, and 10% on AI acquisitions (Columbus 2017). And in the UK, the government has started offering salaries of well over 100,000 euros per year to computer scientists to develop machine learning to help the unemployed in their job search, predict pension fund performance, and find patterns in and sort customs and revenue documents (Buranyi 2017). It is more or less received wisdom that the potential benefits are enormous, not only economically but for human civilization itself.
In his freshly published Enlightenment Now, the Harvard cognitive psychologist Steven Pinker (2018) expounds the optimistic view that digital and nano-technologies combined with AI will make it possible for the planet to sustainably maintain a population of nine billion humans leading flourishing lives according to some basic universal humanist tenets.

Classification from scratch, neural nets 6/8

Sixth post of our series on classification from scratch. The previous one was on lasso regression, which was still based on a logistic regression model, assuming that the variable of interest Y has a Bernoulli distribution. From now on, we will discuss techniques that did not originate from those probabilistic models, even if they might still have a probabilistic interpretation. Somehow. Today, we will start with neural nets. Maybe I should start with a disclaimer: the goal is not to replicate well-designed R functions used for predictive modeling. It is simply to get a basic understanding of what’s going on.
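To make “what’s going on” concrete: a single-hidden-layer network with sigmoid activations can be trained from scratch by plain gradient descent on the cross-entropy loss in a few dozen lines. The original series is written in R; the sketch below is an analogous toy version in Python/NumPy, and the architecture, data, and learning rate are my own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: Y is Bernoulli, with P(Y=1|x) depending on x through a linear boundary.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer of h units; small random initial weights.
h = 3
W1 = rng.normal(size=(2, h)) * 0.5
b1 = np.zeros(h)
W2 = rng.normal(size=h) * 0.5
b2 = 0.0
lr = 1.0

def forward(X):
    A = sigmoid(X @ W1 + b1)   # hidden activations, shape (n, h)
    p = sigmoid(A @ W2 + b2)   # predicted P(Y=1 | x), shape (n,)
    return A, p

def loss(p, y):
    p = np.clip(p, 1e-12, 1 - 1e-12)  # guard against log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

_, p = forward(X)
loss_before = loss(p, y)

# Full-batch gradient descent with backpropagated gradients.
for _ in range(2000):
    A, p = forward(X)
    d_out = (p - y) / len(y)               # dL/dz at the output layer
    gW2 = A.T @ d_out
    gb2 = d_out.sum()
    d_hid = np.outer(d_out, W2) * A * (1 - A)  # chain rule through sigmoid
    gW1 = X.T @ d_hid
    gb1 = d_hid.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

_, p = forward(X)
loss_after = loss(p, y)
```

After training, the loss has dropped and thresholding p at 0.5 classifies most of the toy sample correctly; the point is the mechanics (forward pass, backpropagation, update), not predictive performance.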

Service Fabric: A Distributed Platform for Building Microservices in the Cloud

We describe Service Fabric (SF), Microsoft’s distributed platform for building, running, and maintaining microservice applications in the cloud. SF has been running in production for 10+ years, powering many critical services at Microsoft. This paper outlines key design philosophies in SF. We then adopt a bottom-up approach to describe low-level components in its architecture, focusing on modular use and support for strong semantics like fault-tolerance and consistency within each component of SF. We discuss lessons learned, and present experimental results from production data.

Microsoft to Acquire GitHub for $7.5B to Solidify Developer Ties

Microsoft on Monday confirmed it has agreed to acquire GitHub, the top software-development platform in the world, for $7.5 billion in stock. It’s not much of a stretch to see why, given Microsoft’s recent moves around cloud and AI development. The acquisition should help Microsoft expand its focus on developing AI, tools and services that work across devices. In its statement, Microsoft said: “Today, every company is becoming a software company and developers are at the center of digital transformation; they drive business processes and functions across organizations from customer service and HR to marketing and IT. And the choices these developers make will increasingly determine value creation and growth across every industry. GitHub is home for modern developers and the world’s most popular destination for open source projects and software innovation. The platform hosts a growing network of developers in nearly every country representing more than 1.5 million companies across healthcare, manufacturing, technology, financial services, retail and more.”

Lessons learned turning machine learning models into real products and services

1. Models degrade in accuracy as soon as they are put in production
2. The exact same model can rarely be deployed twice
3. Often, the real modeling work starts in production
4. Tools exist to help deploy, measure, and secure models

5 key drivers for getting more value from your data

1. Consolidate data into a single data lake to avoid data sprawl
2. Provide users with the appropriate level of access to data
3. Strike a balance between governance and freedom
4. Align data initiatives with business goals
5. Create a data infrastructure with the ability to scale