Real-Time Speech Analytics
We’ve all heard the canned notifications when we call companies for customer service: “this call may be recorded for security or quality purposes.” Most customer service organizations today record their phone interactions with their customers. Often those recordings just sit untouched on the digital equivalent of a dusty shelf in a storage closet. The recordings are there to ensure regulatory compliance or, in rare cases, to be pulled off the shelf in case of a major dispute with a customer. In essence, the part of the notification about security rings true; the quality part, not so much.

How sure are you that large margin implies low VC dimension?
How sure are you that large margin implies low VC dimension (and good generalization error)? It is true. But even if you have taken a good course on machine learning you may not have seen the actual proof (with all of the caveats and conditions). I worked through the literature proofs over the holiday and it took a lot of notes to track what is really going on in the derivation of the support vector machine.

Fixing the visual versus fixing the story
… There is no doubt the new version brings out the data more clearly. …

Big Data, or Not Big Data: What is your question?
Before jumping on the Big Data bandwagon, I think it is important to ask whether the problem you have requires much data. That is, I think it's important to determine when Big Data is relevant to the problem at hand.

A Brief Introduction to Knowledge Management
A helpful slide set that explains the purposes, positions and roles of Knowledge Management.

Data Science, Machine Learning, and Statistics: what is in a name?
A fair complaint when seeing yet another “data science” article is to say: “this is just medical statistics” or “this is already part of bioinformatics.” We certainly label many articles as “data science” on this blog. Probably the complaint is slightly cleaner if phrased as “this is already known statistics.” But the essence of the complaint is a feeling of claiming novelty in putting old wine in new bottles. Rob Tibshirani nailed this type of distinction in his famous machine learning versus statistics glossary.

Online Algorithms in High-frequency Trading
HFT (high-frequency trading) has emerged as a powerful force in modern financial markets. Only 20 years ago, most of the trading volume occurred in exchanges such as the New York Stock Exchange, where humans dressed in brightly colored outfits would gesticulate and scream their trading intentions. Nowadays, trading occurs mostly in electronic servers in data centers, where computers communicate their trading intentions through network messages. This transition from physical exchanges to electronic platforms has been particularly profitable for HFT firms, which invested heavily in the infrastructure of this new environment.