Neural Networks from a Bayesian Perspective

Understanding what a model doesn't know is important both for practitioners and for the end users of many machine learning applications. In our previous blog post we discussed the different types of uncertainty and explained how to use them to interpret and debug models. In this post we'll discuss different ways to obtain uncertainty estimates in deep neural networks. Let's start by looking at neural networks from a Bayesian perspective.
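
By way of illustration, here is a minimal sketch of one common way to obtain such estimates, Monte Carlo dropout: dropout is kept active at prediction time and the spread of repeated stochastic forward passes approximates the model's uncertainty. This is an assumption for illustration; the post itself may cover other methods, and the architecture below is a toy example.

```python
# A minimal sketch of Monte Carlo dropout for uncertainty estimation
# (one common approach; not necessarily the method covered in the post).
import numpy as np
import tensorflow as tf

# Hypothetical toy regression model with a dropout layer.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1),
])

x = np.random.randn(5, 10).astype("float32")  # dummy inputs

# Passing training=True keeps dropout active, so repeated forward passes
# give different predictions; their spread approximates model uncertainty.
samples = np.stack([model(x, training=True).numpy() for _ in range(100)])
mean, std = samples.mean(axis=0), samples.std(axis=0)
print(mean.squeeze(), std.squeeze())
```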


An Introduction to t-SNE with Python Example

In this post we'll give an introduction to t-SNE, a powerful dimensionality reduction and visualization technique for exploring high-dimensional data, along with a Python example.
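
As a quick taste of the kind of workflow the post walks through, here is a minimal scikit-learn sketch that projects the 64-dimensional digits dataset down to two dimensions (the post's own example may use different data and settings):

```python
# Minimal t-SNE example with scikit-learn (illustrative sketch).
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

X, y = load_digits(return_X_y=True)  # 1797 samples, 64 features each

# Reduce to 2 dimensions; perplexity controls the effective neighborhood size.
X_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

plt.scatter(X_2d[:, 0], X_2d[:, 1], c=y, cmap="tab10", s=5)
plt.title("t-SNE projection of the digits dataset")
plt.show()
```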


Basic Generalized Linear Modeling – Part 3: Exercises

In this set of exercises, we continue solving problems from the previous GLM exercise here, so the numbering starts at 9. Please make sure you have read and worked through the previous exercises before you continue practicing.


Announcing Practical Data Science with R, 2nd Edition

Manning Publications has just announced the launch of the MEAP (Manning Early Access Program) for the second edition. The MEAP lets you subscribe to draft chapters as they become available and give us feedback before the book goes to print. Drafts of the first three chapters are currently available.


Site Reliability Engineering

Site Reliability Engineering (SRE) is a discipline that incorporates aspects of software engineering and applies them to IT operations problems. Its main goals are to create ultra-scalable and highly reliable software systems. According to Ben Treynor, founder of Google’s Site Reliability Team, SRE is ‘what happens when a software engineer is tasked with what used to be called operations.’ Site Reliability Engineering was created at Google around 2003, when Ben Treynor was hired to lead a team of seven software engineers running a production environment. The team was tasked with making Google’s sites run smoothly, efficiently, and more reliably. Early on, the scale of Google’s systems forced the company to come up with new paradigms for managing very large systems while continuously introducing new features without compromising the end-user experience.

The SRE footprint at Google is now larger than 1,500 engineers. Many products have small to medium-sized SRE teams supporting them, though by no means all products do. The SRE practices honed over the years are now being adopted by other, mainly large-scale, companies: ServiceNow, Microsoft, Apple, Twitter, Facebook, Dropbox, Amazon, Target, Dell Technologies, IBM, Xero, Oracle, Zalando, Acquia, VMware and GitHub have all put together SRE teams.


Site reliability engineering (SRE): A simple overview

Get a basic understanding of site reliability engineering (SRE) and then go deeper with recommended resources.


Simplifying machine learning lifecycle management

In this episode of the Data Show, I spoke with Harish Doddi, co-founder and CEO of Datatron, a startup focused on helping companies deploy and manage machine learning models. As companies move from machine learning prototypes to products and services, tools and best practices for productionizing and managing models are just starting to emerge. Today's data science and data engineering teams work with a variety of machine learning libraries, data ingestion, and data storage technologies. Risk and compliance considerations mean that the ability to reproduce machine learning workflows is essential to meet audits in certain application domains. And as data science and data engineering teams continue to expand, tools need to enable and facilitate collaboration. As someone who specializes in helping teams turn machine learning prototypes into production-ready services, I wanted to hear what Doddi has learned while working with organizations that aspire to ‘become machine learning companies.’


It’s time to establish big data standards

Technologies for streaming, storing, and querying big data have matured to the point where the computer industry can usefully establish standards. As in other areas of engineering, standardization lets practitioners carry what they learn across a multitude of solutions and more easily combine different technologies; it also lets solution providers build on existing sub-components to deliver more compelling solutions with broader applicability, faster.


Delivering the Intelligent and Connected Enterprise

It is a new world; the Latin term for this is Mundus Novus. In this new world, industries face a series of challenging macro-trends as they transform into Intelligent and Connected Enterprises. The demands of a Millennial workforce, the relentless threat of cyberattacks, changing modes of work and complex regulatory environments are changing the ways businesses operate. Industry 4.0 has introduced new business models, created cyber-currencies and changed the nature of conflict, society and security. Organizations must use new technologies to unlock the power of information, become more intelligent and connected, and drive engagement with customers, partners and employees. OpenText Enterprise Information Management (EIM) enables the Intelligent and Connected Enterprise by combining machines (automation), artificial intelligence (AI), Application Programming Interfaces (APIs) and data management into an intelligent information core. These capabilities bring together information from both humans and machines so that it can be securely managed, stored and accessed, and mined with analytics for actionable insights.


Dash for Beginners

Dash is a Python framework for building web applications. It is built on top of Flask, Plotly.js, and React, and it enables you to build dashboards using pure Python. Dash is open source, and its apps run in the web browser. In this tutorial, we introduce the reader to Dash fundamentals and assume prior experience with Plotly.
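
To give a sense of what "pure Python" means here, below is a minimal Dash app sketch: a layout declared with `html` and `dcc` components and a single bar chart. Import paths and the run call vary by Dash version, so treat the exact names as illustrative rather than as the tutorial's own example.

```python
# Minimal Dash app sketch (import style per Dash 2.x; older versions use
# dash_core_components / dash_html_components instead).
from dash import Dash, dcc, html

app = Dash(__name__)

# The layout is a tree of Python components that Dash renders in the browser.
app.layout = html.Div([
    html.H1("Hello Dash"),
    dcc.Graph(
        figure={
            "data": [{"x": [1, 2, 3], "y": [4, 1, 2], "type": "bar"}],
            "layout": {"title": "A simple bar chart"},
        }
    ),
])

if __name__ == "__main__":
    app.run(debug=True)  # older Dash versions: app.run_server(debug=True)
```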


MXNet Tensor Basics & Simple Derivatives

This is an overview of some basic functionality of the MXNet ndarray package for creating tensor-like objects, and of the autograd package for performing automatic differentiation.
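
As a small illustration of the two packages together, here is a sketch that builds an NDArray and differentiates a simple function through it (the post itself may walk through different examples):

```python
# Minimal sketch of MXNet ndarray creation and autograd differentiation.
from mxnet import nd, autograd

x = nd.array([[1.0, 2.0], [3.0, 4.0]])   # create a 2x2 NDArray (tensor-like object)
x.attach_grad()                          # allocate space for its gradient

with autograd.record():                  # record operations for differentiation
    y = 2 * x * x                        # elementwise y = 2x^2

y.backward()                             # gradient of sum(y) w.r.t. x is 4x
print(x.grad)                            # expect [[4, 8], [12, 16]]
```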


Reinforcement Learning: The Business Use Case, Part 2

In this post, I will explore the implementation of reinforcement learning in trading. The financial industry has been exploring applications of artificial intelligence and machine learning for its use cases, but the monetary risk has prompted reluctance. Traditional algorithmic trading has evolved in recent years: high-performance computational systems now automate the tasks, but traders still build the policies that govern the choices to buy and sell. An algorithmic model for buying stocks based on a list of valuation and growth metrics might define a ‘buy’ or ‘sell’ signal that is in turn triggered by specific rules the trader has defined.
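
To make that last point concrete, here is a toy sketch of the kind of hand-written rule a trader might encode; the thresholds and field names are hypothetical and purely for illustration:

```python
# A toy rule-based trading signal (hypothetical thresholds and field names).
def signal(stock):
    """Return 'buy', 'sell', or 'hold' from simple valuation/growth rules."""
    if stock["pe_ratio"] < 15 and stock["revenue_growth"] > 0.10:
        return "buy"
    if stock["pe_ratio"] > 40 or stock["revenue_growth"] < 0.0:
        return "sell"
    return "hold"

print(signal({"pe_ratio": 12, "revenue_growth": 0.18}))  # -> 'buy'
```

Reinforcement learning, as discussed in the post, aims to learn such a policy from reward signals rather than having the trader hard-code it.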


Dealing with The Problem of Multicollinearity in R

Imagine a situation where you are asked to predict the tourism revenue for a country, let's say India. In this case, your output (dependent, or response) variable will be the total revenue earned (in USD) in a given year. But what about the independent, or predictor, variables?
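
The post itself works in R; as a rough illustration of the underlying idea, here is a Python sketch that flags multicollinearity among predictors using variance inflation factors, one standard diagnostic (the synthetic data and column names are made up for the example):

```python
# Detecting multicollinearity with variance inflation factors (illustrative
# Python sketch; the post's own workflow is in R).
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + rng.normal(scale=0.1, size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)

X = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})
X["intercept"] = 1.0                             # VIF needs an intercept column

vif = {col: variance_inflation_factor(X.values, i)
       for i, col in enumerate(X.columns)}
print(vif)   # large VIFs for x1 and x2 flag the collinearity
```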