Best Practices: Implementing DataOps with a Data Science Platform

To overcome inefficiencies commonly associated with using large amounts of data to power many applications, companies are increasingly turning to DataOps – an agile methodology that focuses on collaboration and automation in order to accelerate data science. When implemented using a data science platform, DataOps can help scale the work of your data science team, get models into production faster, and ultimately drive innovation at your company.

RStudio Connect v1.5.8

We’re pleased to announce RStudio Connect: version 1.5.8. This release enables reconnects for Shiny applications, more consistent and trustworthy editing of user information, and various LDAP improvements.

Top 10 Machine Learning with R Videos

1. How to Build a Text Mining, Machine Learning Document Classification System in R!
2. Principal Component Analysis Using R
4. Introduction to Cluster Analysis with R – an example
5. Decision Tree Classification in R
6. Support Vector Machines (SVM) Overview and Demo using R
7. Random Forest Overview and Demo in R
8. Neural Networks in R
9. R Programming Language for Machine Learning
10. Introduction to Machine Learning with R and caret

Object detection with TensorFlow

Image classification can perform some pretty amazing feats, but a large drawback of many image classification applications is that the model can only detect one class per image. With an object detection model, not only can you classify multiple classes in one image, but you can also specify exactly where each object is in the image with a bounding box framing it. The TensorFlow Models GitHub repository has a large variety of pre-trained models for various machine learning tasks, and one excellent resource is their object detection API. The object detection API makes it extremely easy to train your own object detection model for a large variety of different applications. Whether you need a high-speed model for live-stream, high-frames-per-second (fps) applications or a high-accuracy desktop model, the API makes it easy to train and export a model. This tutorial will walk through all the steps for building a custom object detection model using TensorFlow’s API.
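
To make the difference from plain classification concrete, here is a minimal Python sketch of what a detector's output might look like: several labelled boxes per image rather than one label. The `Detection` class, the helper names, and the sample values are all hypothetical and only for illustration; the `(ymin, xmin, ymax, xmax)` normalized box layout is an assumption about the output format, and no TensorFlow calls are made.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One detected object (hypothetical structure, for illustration only)."""
    label: str
    score: float
    # Assumed layout: (ymin, xmin, ymax, xmax), each normalized to [0, 1].
    box: Tuple[float, float, float, float]


def keep_confident(detections: List[Detection], min_score: float = 0.5) -> List[Detection]:
    """Drop detections whose confidence score is below min_score."""
    return [d for d in detections if d.score >= min_score]


def box_to_pixels(box, width: int, height: int) -> Tuple[int, int, int, int]:
    """Convert a normalized box to pixel corners (left, top, right, bottom)."""
    ymin, xmin, ymax, xmax = box
    return (round(xmin * width), round(ymin * height),
            round(xmax * width), round(ymax * height))


# Made-up detections for a single 640x480 image: several classes at once.
detections = [
    Detection("dog", 0.92, (0.1, 0.2, 0.5, 0.6)),
    Detection("cat", 0.81, (0.4, 0.5, 0.9, 0.95)),
    Detection("ball", 0.30, (0.7, 0.1, 0.8, 0.2)),
]

for d in keep_confident(detections):
    print(d.label, d.score, box_to_pixels(d.box, width=640, height=480))
```

A classifier would return a single label for the whole image; here each surviving detection carries its own class, score, and location.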

Key considerations for building an AI platform

The promises of AI are great, but taking the steps to build and implement AI within an organization is challenging. As companies learn to build intelligent products in real production environments, engineering teams face the complexity of the machine learning development process, from data sourcing and cleaning to feature engineering, modeling, training, deployment, and production infrastructure. Core to addressing these challenges is building an effective AI platform strategy, just as Facebook did with FBLearner Flow and Uber did with Michelangelo. Often, this task is easier said than done. Navigating the process of building a platform bears complexities of its own, particularly since the definition of “platform” is broad and inconclusive. In this post, I’ll walk through the key considerations for building an AI platform that is right for your business, and how to avoid common pitfalls.
• Who will use the platform?
• Are you solving for simplicity or flexibility?
• How do you balance product and engineering decisions?
• What is your multi-layer approach?

Create editable Microsoft Office charts from R

R has a rich and infinitely flexible graphics system, and you can easily embed R graphics into Microsoft Office documents like PowerPoint or Word. The one thing I dread hearing after delivering such a document, though, is ‘how can I tweak that graphic?’. I could change the colors or fonts or dimensions in R, of course, but sometimes people just want to tweak graphics to their hearts’ content. If you’re in that situation, you have a couple of options for using R to create Office documents with graphics, and make those graphics editable. Both options work in conjunction with the ‘officer’ package, which lets you create Word and PowerPoint documents from R.

CRAN search based on natural language processing

As of October 2017, CRAN contains more than 11,500 R packages. If you wanted to scroll through all of them, you would need about two working days, assuming 5 seconds per package and 8 hours in a working day. Since R version 3.4, we can also get a dataset with all packages, their dependencies, the package title, the description and even the installation errors that the packages have. This makes the CRAN database an excellent dataset for text mining.
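
The back-of-the-envelope estimate above can be checked with a few lines of Python (used here purely for illustration). The package count, the seconds per package, and the hours per day are the figures quoted in the paragraph; the function name is hypothetical.

```python
# Figures quoted in the text: 11,500 packages, 5 seconds each, 8-hour days.
PACKAGES = 11_500
SECONDS_PER_PACKAGE = 5
HOURS_PER_DAY = 8


def browsing_days(packages: int, seconds_each: float, hours_per_day: float) -> float:
    """Return how many working days it takes to skim every package once."""
    total_seconds = packages * seconds_each
    return total_seconds / (hours_per_day * 3600)


days = browsing_days(PACKAGES, SECONDS_PER_PACKAGE, HOURS_PER_DAY)
print(f"{PACKAGES * SECONDS_PER_PACKAGE} seconds is about {days:.1f} working days")
```

Under these assumptions the total is 57,500 seconds, or just under 16 hours of scrolling, which is why searching the package metadata as text is the more practical route.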

Demo Week: Time Series Machine Learning with timetk

We’re into the second day of Business Science Demo Week. What’s demo week? Every day this week we are demoing an R package: tidyquant (Monday), timetk (Tuesday), sweep (Wednesday), tibbletime (Thursday) and h2o (Friday)! That’s five packages in five days! We’ll give you intel on what you need to know about these packages to go from zero to hero. Second up is timetk, your toolkit for time series in R. Here we go!