Cheat Sheets for AI, Neural Networks, Machine Learning, Deep Learning & Big Data

Over the past few months, I have been collecting AI cheat sheets. From time to time I share them with friends and colleagues, and recently I have been getting asked for them a lot, so I decided to organize and share the entire collection. To make things more interesting and give context, I added descriptions and/or excerpts for each major topic.


Creating Better Translations with AI

Amazon has released a dataset of nearly 400,000 English, Hebrew, Russian, Arabic, and Japanese names collected from Wikipedia articles to help AI perform more accurate translations between alphabets. Differences between alphabets, such as the use of different characters and pronunciations, can affect how well AI can translate. For example, Amazon found its AI handled English-to-Russian translations better than Arabic-to-English ones because the Latin alphabet is more similar to the Cyrillic alphabet than to the Arabic alphabet. This data could help personal assistants retrieve information across languages.


Polyaxon 0.2! Updated dashboard, integrations, private registries, better search, replication, SSO, multi volume mounts …

The new version of Polyaxon (a platform for building, training, and monitoring large-scale deep learning applications) comes with a lot of new features, bug fixes, and performance enhancements. We received a lot of feedback from our users, which helped us prioritize our work. In this blog post I will go over some of the new features we have made available since Polyaxon 0.1.


Teaching the Google Assistant to be Multilingual

Multilingual households are becoming increasingly common, with several sources indicating that multilingual speakers already outnumber their monolingual counterparts, and that this number will continue to grow. With this large and increasing population of multilingual users, it is more important than ever that Google develop products that can support multiple languages simultaneously to better serve our users.


Tips for analyzing Excel data in R

If you’re familiar with analyzing data in Excel and want to learn how to work with the same data in R, Alyssa Columbus has put together a very useful guide: How To Use R With Excel. In addition to walking you through installing and setting up R and the RStudio IDE, it provides a wealth of useful tips for working with Excel data in R.
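
As a quick taste of the workflow such a guide covers, here is a minimal sketch of pulling a spreadsheet into R with the readxl package; the file name sales.xlsx and its contents are hypothetical stand-ins, not taken from the guide itself.

```r
# A minimal sketch: read an Excel sheet into R and take a first look.
# "sales.xlsx" and its columns are hypothetical placeholders.
library(readxl)

sales <- read_excel("sales.xlsx", sheet = 1)  # read the first worksheet

str(sales)      # column types as guessed by read_excel
summary(sales)  # quick summaries, much like Excel's descriptive stats
```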


Fast R functions to get first principal components

In this post, I compare different approaches for computing the first principal components of large matrices in R.
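
To make the comparison concrete, here is a hedged sketch of two base-R routes to the first principal component, run on simulated data; posts like this typically also benchmark dedicated packages such as irlba or RSpectra, which are omitted here.

```r
# Two base-R ways to obtain the first principal component of a matrix.
# The data are simulated; a real benchmark would use much larger matrices.
set.seed(1)
X  <- matrix(rnorm(2000 * 100), nrow = 2000)
Xc <- scale(X, center = TRUE, scale = FALSE)   # center the columns

# 1) Full PCA via prcomp, keeping only the first score vector
pc1_prcomp <- prcomp(Xc, center = FALSE)$x[, 1]

# 2) SVD, asking for just the leading singular vectors
s <- svd(Xc, nu = 1, nv = 1)
pc1_svd <- s$u[, 1] * s$d[1]

# The two agree up to sign
all.equal(abs(pc1_prcomp), abs(pc1_svd))
```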


Slack and Plumber, Part One

In the previous post, we introduced plumber as a way to expose R processes and programs to external systems via REST API endpoints. In this post, we'll go further by building out an API that powers a Slack slash command, all from within R using plumber. A subsequent post will outline deploying and securing the API.
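
For a flavor of what such an API looks like, here is a minimal sketch of a plumber endpoint a Slack slash command could POST to; the route, command name, and reply text are hypothetical, and the response shape follows Slack's slash-command JSON format.

```r
# plumber.R -- a hypothetical endpoint for a Slack slash command.
# Slack POSTs form fields (including `text`) to the command's URL
# and expects a JSON message in response.
library(plumber)

#* Handle a hypothetical /status slash command
#* @post /slack/status
#* @serializer unboxedJSON
function(text = "") {
  list(
    response_type = "ephemeral",          # shown only to the requester
    text = paste("Status requested for:", text)
  )
}
```

Running plumber::plumb("plumber.R")$run(port = 8000) serves the endpoint locally; deploying and securing it is what the subsequent post covers.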


A Comprehensive Study Exploring Enterprise Data Lake Market – Key Players Involved: SAP, Microsoft, Cloudwick, SAS Institute, Informatica, Teradata

HTF MI released a new market study on the global enterprise data lake market, with 100+ market data tables, pie charts, graphs, and figures spread across its pages, along with an easy-to-understand detailed analysis. At present, the market is still establishing its presence. The research report presents a complete assessment of the market and covers future trends, current growth factors, informed opinions, facts, and industry-validated market data. The research study provides forecasts for the global enterprise data lake market through 2025*.


Machine learning: A solution to backorder problem and inventory optimisation

For any business, the worst-case scenario is running out of product inventory just when customers are ready to buy. At the same time, keeping every item in stock is a burden of its own. This trade-off has become even more problematic in recent times, as manufacturing firms are flooded with SKUs (stock keeping units) spanning product sizes, flavours, styles, and so on. To cater to personalised demand, companies are customising products by adding various features, and this makes life even more complex for every part of the supply chain. To understand the problem, let's take toothpaste as an example: there are more than 6-7 popular brands, such as Colgate, Pepsodent, Close up, Dabur, Himalaya, and Meswak, each with 4-5 different sizes ranging from 50 gm to 300 gm and 4-5 different variants such as Sensitive, Germi-check, Gumcare, and Whitening.


Pragmatic Approach to Structured Data Querying via Natural Language Interface

This post introduces a research paper that describes a practical approach to building natural language interfaces for querying structured data.


Exploratory vs Confirmatory Research

Exploratory research is the stage of the research process that aims at connecting ideas to unveil the ‘whys’ of potential cause/effect relationships. It occurs when researchers are just beginning to understand what they are actually ‘observing’ while building cause/effect models. Confirmatory research (a.k.a. hypothesis testing) is where researchers have a pretty good idea of what is going on: the researcher has a theory (or several theories), and the objective is to find out whether that theory is supported by the facts. The essence of all this is that exploratory and confirmatory research are two complementary components of the same goal: to discover relevant findings in the most efficient, reliable, replicable, and applicable manner. (PDF) Exploratory vs Confirmatory Research. Available from: https://…8525_Exploratory_vs_Confirmatory_Research [accessed Aug 31 2018].


AI Knowledge Map: How To Classify AI Technologies

I have been in the artificial intelligence space for a while and am aware that multiple classifications, distinctions, landscapes, and infographics exist to represent and track the different ways of thinking about AI. However, I am not a big fan of those categorization exercises, mainly because I tend to think that the effort of classifying dynamic data points into predetermined, fixed boxes is often not worth the benefit of having such a ‘clear’ framework (this is a generalization, of course, as such frameworks are sometimes extremely useful).