Webscraping ALLSTAT

In this post, I’ll webscrape and analyse metadata of ALLSTAT emails. It’ll also be the occasion for me to take the wonderful new polite package, which helps with respectful webscraping, for a ride!
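To make "respectful webscraping" concrete, here is a minimal sketch of two habits the term usually covers: checking robots.txt before fetching a page, and honouring any crawl delay it declares. This is a Python standard-library analogue, not the polite package's own API. The robots.txt content, the example.org URLs and the bot name are hypothetical and are inlined so the sketch runs offline.

```python
from urllib import robotparser

# Hypothetical robots.txt content (an assumption for illustration),
# inlined so no network access is needed.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""


def make_policy(robots_txt):
    """Parse robots.txt text into a reusable policy object."""
    parser = robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser, user_agent, url):
    """Return True if the robots.txt rules allow this agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


def polite_delay(parser, user_agent, default=1.0):
    """Seconds to wait between requests: the declared Crawl-delay, else a default."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


policy = make_policy(ROBOTS_TXT)
bot = "my-bot"  # hypothetical user-agent name
for url in ("https://example.org/archive/", "https://example.org/private/x"):
    verdict = "allowed" if may_fetch(policy, bot, url) else "disallowed"
    print(url, verdict, "wait", polite_delay(policy, bot), "s")
```

A real scraper would sleep for the returned delay between requests and identify itself with a descriptive user-agent string.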

Comparing the Four Major AI Strategies

Now that we’ve detailed the four main AI-first strategies (Data Dominance, Vertical, Horizontal, and Systems of Intelligence), it’s time to pick. Here we provide a side-by-side comparison and our opinion on the winner(s) for your own AI-first startup.

How to design bot conversations

This is an example of running a usability test to learn about and improve your bot before you launch it. There might be other types of tests you would like to run, such as ones that check user satisfaction or brand impact. You might also use a different tool to prototype or create an alpha of your bot in order to run these tests. The most important thing is to iterate and learn.

GPU Grant Program

NVIDIA’s Academic Programs Team is dedicated to empowering and collaborating with professors and researchers at universities worldwide. We aim to inspire cutting-edge technological innovation and to find new ways of enhancing faculty research as well as the teaching and learning experience. We achieve this through a variety of initiatives and programs including:
• Small-scale GPU grants
• Graduate Fellowships
• Providing free teaching materials and GPU cloud resources through our Deep Learning Institute (DLI) Teaching Kits
• Providing access to developer forums, pre-released tools and drivers through the NVIDIA Developer Program

A package for dimensionality reduction of large data

A few weeks ago, as part of the rOpenSci Unconference, a group of us (Sean Hughes, Malisa Smith, Angela Li, Ju Kim, and Ted Laderas) decided to work on making the UMAP algorithm accessible within R. UMAP (Uniform Manifold Approximation and Projection) is a dimensionality reduction technique that lets the user reduce high-dimensional data (many columns) to a smaller number of columns for visualization purposes (github, arxiv). It is similar to both Principal Components Analysis (PCA) and t-SNE, techniques often used in the single-cell omics world (genomics, flow cytometry, proteomics) to visualize high-dimensional data. t-SNE is quite a slow algorithm; one of the advantages of UMAP is that it runs faster. Because the data frames typically fed to these algorithms can run to millions of rows, efficiency is important.

How to Use Hierarchical Bayes for Choice Modeling In R

You want an efficient way to understand how your customers make decisions. Today, more than ever, customers go through an incredibly complex decision-making process. Normally, representing this process accurately with choice models would require advanced methods and statistical techniques. In this post, I’ll show you how it can be done easily in R using Hierarchical Bayes and the package flipChoice.

General Principles of Root Cause Analysis (RCA)

1. The primary aim of root cause analysis is to identify the factors that resulted in the nature, the magnitude, the location, and the timing of the harmful outcomes (consequences) of one or more past events; to determine what behaviors, actions, inactions, or conditions need to be changed; to prevent recurrence of similar harmful outcomes; and to identify lessons that may promote the achievement of better consequences. (‘Success’ is defined as the near-certain prevention of recurrence.)
2. To be effective, root cause analysis must be performed systematically, usually as part of an investigation, with conclusions and identified root causes backed up by documented evidence. A team effort is typically required.
3. There may be more than one root cause for an event or a problem; the difficult part is therefore demonstrating the persistence and sustaining the effort required to determine them.
4. The purpose of identifying all solutions to a problem is to prevent recurrence at lowest cost in the simplest way. If there are alternatives that are equally effective, then the simplest or lowest cost approach is preferred.
5. The root causes identified will depend on the way in which the problem or event is defined. Effective problem statements and event descriptions (as failures, for example) are helpful and usually required to ensure the execution of appropriate analyses.
6. One logical way to track down root causes is to use hierarchical clustering data-mining solutions (such as graph-theory-based data mining). In that context, a root cause is defined as ‘the conditions that enable one or more causes’. Root causes can then be deduced from the higher-level groups that contain a specific cause.
7. To be effective, the analysis should establish a sequence of events or timeline for understanding the relationships between contributory (causal) factors, root cause(s) and the defined problem or event to be prevented.
8. Root cause analysis can help transform a reactive culture (one that reacts to problems) into a forward-looking culture (one that solves problems before they occur or escalate). More importantly, RCA reduces the frequency of problems occurring over time within the environment where the process is used.
9. Root cause analysis as a force for change is a threat to many cultures and environments. Threats to cultures are often met with resistance. Other forms of management support may be required to achieve effectiveness and success with root cause analysis. For example, a ‘non-punitive’ policy toward problem identifiers may be required.