Top 9 Data Science Use Cases in Banking

1. Fraud detection
2. Managing customer data
3. Risk modeling for investment banks
4. Personalized marketing
5. Lifetime value prediction
6. Real-time and predictive analytics
7. Customer segmentation
8. Recommendation engines
9. Customer support

Semantic Segmentation Models for Autonomous Vehicles

In a previous post, we studied several open datasets that could be used to train a model for pixel-wise semantic segmentation of urban scenes. Here, we take a look at deep learning architectures that cater specifically to time-sensitive domains like autonomous vehicles. In recent years, deep learning has surpassed traditional computer vision algorithms by learning a hierarchy of features from the training dataset itself. This eliminates the need for hand-crafted features, which is why such techniques are being explored extensively in academia and industry.

BotRNot: An R app to detect Twitter bots

Twitter’s bot problem is well documented, influencing discourse on divisive topics like politics and civil rights. But it’s getting harder and harder to spot such nefarious bots, which often borrow biographies and tweets from real (and often stolen) profiles to evade detection. (The New York Times recently published an outstanding feature on bots and follower factories.) Can we distinguish bots from real users using data science?

Top 40 New Package Picks

Here are my picks for the “Top 40” packages of the 171 new packages that made it to CRAN (and stuck) in February, organized into the following categories: Computational Methods, Data, Finance, Science, Statistics, Time Series, and Utilities.

Readings in Applied Data Science

These readings reflect my personal thoughts about applied data science, and are skewed towards topics that I think are important but are generally underappreciated. It is not a systematic attempt to survey the field. That said, if you think there’s something major that I’ve missed, please feel free to submit an issue or pull request! These readings will evolve as the quarter goes by. Many of the readings come from Practical Data Science for Stats, a joint PeerJ collection and special issue of the American Statistician. Jenny Bryan and I pulled this collection together in order to publish some of the important parts of data science that were previously unpublished. Other readings are blog posts because so much of applied data science is outside the comfort zone of traditional academic fields.

Automated Data Collection with R and mlbgameday

Opening day is on the way. Time to set up a persistent database to collect every pitch thrown in this year’s baseball season. The mlbgameday package is designed to facilitate extract, transform, and load (ETL) for MLBAM “Gameday” data. The package is optimized for parallel processing of data that may be larger than memory. https://…/mlbgameday