OpenText File Intelligence

Essentially, all organizations need to find and manage content for information governance or regulatory compliance and also for internal or regulatory investigations. Responding in a quick and cost-effective manner is directly related to how effectively an organization can identify, collect, analyze and act on all relevant information. Since most business communications and activities take place electronically and the volume of electronic content is growing exponentially, content exists in a wide and ever-expanding variety of disparate systems and locations across the enterprise. Many of these sources are unmanaged, unorganized and continually changing. As a result, identifying, analyzing and acting on content that is relevant to responding to a regulatory request or investigation can be an extremely time-consuming and costly endeavor. Compound this information management challenge with the standards set forth by various global legal and regulatory bodies, e.g. the European General Data Protection Regulation (GDPR), US Federal Rules of Civil Procedure (FRCP), U.K. Bribery Act, Freedom of Information Act (FOIA), Jackson Reforms, etc., and it becomes apparent that, now more than ever, corporate legal and IT teams need an integrated solution supported by industry best practices to address this information management and governance challenge.

AI In Telecom: Intelligent Operations is the New Norm

The move towards an intelligent world is faster than it has ever been. This acceleration has been driven by several key stakeholders that have redefined the way we look at technology. One of the key players in this transition is Huawei. Huawei’s recent UBBF conference, held in Hangzhou on October 18-19, was a step towards raising awareness in this regard. I attended the conference in person and noted numerous takeaways that I would like to present to my readers.

On-Device Conversational Modeling with TensorFlow Lite

Earlier this year, we launched Android Wear 2.0 which featured the first ‘on-device’ machine learning technology for smart messaging. This enabled cloud-based technologies like Smart Reply, previously available in Gmail, Inbox and Allo, to be used directly within any application for the first time, including third-party messaging apps, without ever having to connect to the cloud. So you can respond to incoming chat messages on the go, directly from your smartwatch. Today, we announce TensorFlow Lite, TensorFlow’s lightweight solution for mobile and embedded devices. This framework is optimized for low-latency inference of machine learning models, with a focus on small memory footprint and fast performance. As part of the library, we have also released an on-device conversational model and a demo app that provides an example of a natural language application powered by TensorFlow Lite, in order to make it easier for developers and researchers to build new machine intelligence features powered by on-device inference. This model generates reply suggestions to input conversational chat messages, with efficient inference that can be easily plugged in to your chat application to power on-device conversational intelligence.

The Myth of Entry-level Data Science

There might not be any topic a data scientist is asked more about than “how can I get into data science.” I get it. It’s a great career and every week in the last few years there’s a new article about the unmet demand for “the best job in America.” Working on some of the most exciting new technologies like self-driving cars and AI powered chatbots is understandably appealing. And yet, it seems hard to find these jobs. If you’re baffled by this paradox, you’re not alone.

Book Review: Weapons of Math Destruction by Cathy O’Neil

According to O’Neil, Weapons of Math Destruction or WMDs can be characterized by three features: opacity, scale, and the damage the model causes. WMDs can be summarized in the following ways:
• An algorithm based on mathematical principles that implements a scoring system that evaluates people in various ways.
• A WMD is widely used in determining life-affecting circumstances like the amount of credit a person can access, job assessments, car insurance premiums, and many others.
• A common characteristic of WMDs is that they’re opaque and unaccountable in that people aren’t able to understand the process by which they are being scored and cannot complain about them if they’re wrong.
• WMDs cause destructive “feedback loops” that undermine the algorithm’s original goals, which in most cases are positive in intent.

Best Online Masters in Data Science and Analytics – a comprehensive, unbiased survey

The first comprehensive and objective survey of online Masters in Analytics / Data Science, including rankings, tuition, and duration of the education program.

Extracting Tweets With R

This article will give you a great, brief overview of extracting Tweets using R.

Mapping data using R and leaflet

The R language provides many different tools for creating maps and adding data to them. I’ve been using the leaflet package at work recently, so I thought I’d provide a short example here. Whilst searching for some data that might make a nice map, I came across this article at ABC News. It includes a table containing Australian members of parliament, their electorate and their voting intention regarding legalisation of same-sex marriage. Since I reside in New South Wales, let’s map the data for electorates in that state.

An update for MRAN

MRAN, the Microsoft R Application Network, has been migrated to a new high-performance, high-availability server, and we’ve taken the opportunity to make a few upgrades along the way. You shouldn’t notice any breaking changes (of course if you do, please let us know), but you should notice faster performance for the MRAN site and for the checkpoint package. (MRAN is also the home of daily archives of CRAN, which checkpoint relies on to deliver specific package versions for its reproducibility functions.)

Functional peace of mind

I think what I enjoy the most about functional programming is the peace of mind that comes with it. With functional programming, there’s a lot of stuff you don’t need to think about. You can write functions that are general enough so that they solve a variety of problems.
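
As a rough illustration of that idea, here is a minimal sketch in Python (the linked article may use a different language and different examples). The helper names `compose` and `map_values` are hypothetical, chosen only for this illustration. Both are pure functions: they depend only on their arguments and leave their inputs untouched, which is where much of the peace of mind comes from.

```python
from functools import reduce


def compose(*funcs):
    """Return a new function that applies funcs one after another, left to right."""
    return lambda value: reduce(lambda acc, func: func(acc), funcs, value)


def map_values(func, mapping):
    """Return a new dict with func applied to every value; the input is not modified."""
    return {key: func(value) for key, value in mapping.items()}


# The same two helpers cover unrelated tasks without any special-case code.
clean_name = compose(str.strip, str.lower)
raw_names = {"a": "  Alice ", "b": "BOB"}
cleaned_names = map_values(clean_name, raw_names)

prices = {"tea": 2.0, "coffee": 3.0}
doubled_prices = map_values(lambda price: price * 2, prices)
```

Because nothing here mutates shared state, each helper can be reasoned about and tested in isolation.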

A/B Testing Primer and the DEED framework

Short lecture on A/B testing.

The 10 Statistical Techniques Data Scientists Need to Master

Regardless of where you stand on the matter of Data Science sexiness, it’s simply impossible to ignore the continuing importance of data, and our ability to analyze, organize, and contextualize it. Drawing on their vast stores of employment data and employee feedback, Glassdoor ranked Data Scientist #1 in their 25 Best Jobs in America list. So the role is here to stay, but unquestionably, the specifics of what a Data Scientist does will evolve. With technologies like Machine Learning becoming ever more commonplace, and emerging fields like Deep Learning gaining significant traction amongst researchers and engineers – and the companies that hire them – Data Scientists continue to ride the crest of an incredible wave of innovation and technological progress.
1. Linear Regression
2. Classification
3. Resampling Methods
4. Subset Selection
5. Shrinkage
6. Dimension Reduction
7. Nonlinear Models
8. Tree-Based Methods
9. Support Vector Machines
10. Unsupervised Learning

You have created your first Linear Regression Model. Have you validated the assumptions?

Linear Regression is an excellent starting point for Machine Learning, but it is a common mistake to focus only on the p-values and R-Squared values when determining the validity of the model. Here we examine the underlying assumptions of a Linear Regression, which need to be validated before applying the model.
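
The excerpt does not list the assumptions, but the ones commonly cited for linear regression are linearity, independent errors, constant error variance and roughly normal residuals. The sketch below is an illustration in plain Python rather than the article's own method: it fits a one-variable least-squares line, computes the residuals, and calculates the Durbin-Watson statistic as one possible check on independence. The function names and the sample data are made up for this example.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def residuals(x, y, slope, intercept):
    """Observed value minus fitted value for every data point."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def durbin_watson(res):
    """Durbin-Watson statistic: ranges from 0 to 4, and values near 2
    suggest little first-order autocorrelation in the residuals."""
    numerator = sum((res[i] - res[i - 1]) ** 2 for i in range(1, len(res)))
    denominator = sum(r ** 2 for r in res)
    return numerator / denominator


# Made-up sample data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(x, y)
res = residuals(x, y, slope, intercept)
dw = durbin_watson(res)
```

A fuller check would also plot the residuals against the fitted values, where a funnel shape hints at non-constant variance and a curve hints at non-linearity, and compare the residual distribution with a normal one.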