Introduction to Augmented Random Search

This article is based on a paper published in March 2018 by Horia Mania, Aurelia Guy, and Benjamin Recht from the University of California, Berkeley. The authors assert that they have built an algorithm that is at least 15 times more efficient than the fastest competing model-free methods on MuJoCo locomotion benchmarks. They named the algorithm Augmented Random Search, or ARS for short.
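
The paper's core idea is a random search directly over the weights of a linear policy: perturb the weights in a few random directions, roll out each perturbed policy, and move the weights toward the directions that increased reward. Below is a minimal sketch of that basic update, not the paper's full algorithm (which adds state normalization, scaling by the standard deviation of the rewards, and keeping only the top-performing directions); rollout_reward is a hypothetical helper that runs one episode with the given weights and returns its total reward.

    import numpy as np

    def ars_step(theta, rollout_reward, n_directions=8, nu=0.03, alpha=0.02):
        # One simplified random-search update of a linear policy's weights.
        # rollout_reward(theta) is assumed to run an episode and return its reward.
        deltas = [np.random.randn(*theta.shape) for _ in range(n_directions)]
        update = np.zeros_like(theta)
        for delta in deltas:
            r_plus = rollout_reward(theta + nu * delta)    # small step in +delta
            r_minus = rollout_reward(theta - nu * delta)   # small step in -delta
            update += (r_plus - r_minus) * delta
        return theta + (alpha / n_directions) * update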


Pump up the Volumes: Data in Docker

This article is about using data with Docker. In it, we’ll focus on Docker volumes. Check out the previous articles in the series if you haven’t yet. We covered Docker concepts, the ecosystem, Dockerfiles, slimming down images, and popular commands.
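
As a quick taste of what the article covers, here is a minimal sketch using the Docker SDK for Python (the docker package): it creates a named volume, mounts it into a throwaway container, and writes a file that outlives the container. The image and volume names are arbitrary, and a running Docker daemon is assumed.

    import docker  # Docker SDK for Python; requires a running Docker daemon

    client = docker.from_env()

    # Create a named volume and mount it into a container at /data.
    client.volumes.create(name="demo-data")
    output = client.containers.run(
        "alpine",
        ["sh", "-c", "echo hello > /data/greeting && cat /data/greeting"],
        volumes={"demo-data": {"bind": "/data", "mode": "rw"}},
        remove=True,
    )
    print(output.decode())  # the file lives in the volume, not in the container layer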


Great Books for Data Science

There is no shortage of books that promise to teach data science. Most of these books read like college textbooks with a wealth of technical material prefaced by a short conceptual introduction. While there is plenty of demand for these technical skills, really good data scientists need a mix of conceptual understanding and curiosity too before they can make the most of their tools. Each of these books approaches the concept of data science differently: some are about what data science is, some are about how professional data scientists think, some introduce perspectives that may change how you view the world, and one even contains the full history of information.
1. Weapons of Math Destruction
2. The Information
3. Everybody Lies
4. Dataclysm
5. Big Data
6. Algorithms to Live By
7. The Signal and the Noise
8. Complex Adaptive Systems
9. Antifragile/Incerto
10. Superforecasting


Learning from Graph data using Keras and Tensorflow

There is a lot of data out there that can be represented in the form of a graph in real-world applications such as citation networks, social networks (follower graphs, friend networks, …), biological networks, or telecommunications. Using features extracted from the graph can boost the performance of predictive models by relying on the information flow between neighboring nodes. However, representing graph data is not straightforward, especially if we don't intend to implement hand-crafted features, since most ML models expect a fixed-size or linear input, which is not the case for graph data. In this post we will explore some ways to deal with generic graphs and do node classification based on graph representations learned directly from data.
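
One common way to let information flow between neighboring nodes is to propagate node features over a normalized adjacency matrix before classifying each node. The sketch below illustrates that idea on a tiny made-up graph with TensorFlow/Keras; it is a simplified GCN-style pass for illustration, not the post's exact code.

    import numpy as np
    import tensorflow as tf

    # Toy graph: 4 nodes, undirected edges, 3 features per node, 2 classes.
    A = np.array([[0, 1, 1, 0],
                  [1, 0, 0, 1],
                  [1, 0, 0, 1],
                  [0, 1, 1, 0]], dtype=np.float32)
    X = np.random.rand(4, 3).astype(np.float32)   # node feature matrix
    y = np.array([0, 1, 1, 0])                    # node labels

    # Symmetrically normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2
    A_hat = A + np.eye(A.shape[0], dtype=np.float32)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    A_norm = tf.constant(d_inv_sqrt @ A_hat @ d_inv_sqrt, dtype=tf.float32)

    # Two GCN-style layers: each mixes a node's features with its neighbours'.
    w1 = tf.keras.layers.Dense(8, activation="relu")
    w2 = tf.keras.layers.Dense(2)                 # one logit per class

    def forward(x):
        h = w1(tf.matmul(A_norm, x))              # propagate, then transform
        return w2(tf.matmul(A_norm, h))

    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    opt = tf.keras.optimizers.Adam(0.01)
    x = tf.constant(X)
    for _ in range(200):
        with tf.GradientTape() as tape:
            loss = loss_fn(y, forward(x))
        variables = w1.trainable_variables + w2.trainable_variables
        opt.apply_gradients(zip(tape.gradient(loss, variables), variables))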


5-Steps and 10-Steps to Learn Machine Learning

There is a 5-step shortcut you can take to start solving machine learning problems right away. As a beginner, you can follow this path first if you want to get something done with machine learning, and then take the 10-step path to become a data scientist or a more advanced machine learning engineer.


Test Time Augmentation (TTA) and how to perform it with Keras

There are a lot of ways to improve the results of a neural network by changing the way we train it. In particular, Data Augmentation is a common practice to virtually increase the size of the training dataset, and it is also used as a regularization technique, making the model more robust to slight changes in the input data. Data Augmentation is the process of randomly applying some operations (rotation, zoom, shift, flips, …) to the input data. This way, the model is never shown the exact same example twice and has to learn more general features about the classes it has to recognize.
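
Test Time Augmentation applies the same kind of random transformations at prediction time: the model predicts on several augmented copies of each test image and the predictions are averaged. A minimal sketch with Keras' ImageDataGenerator could look like the following; the model and the image preprocessing are assumed to match whatever was used during training.

    import numpy as np
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    def tta_predict(model, image, n_augmentations=10):
        # Average the model's predictions over several randomly augmented
        # copies of one already-preprocessed HxWxC image.
        augmenter = ImageDataGenerator(rotation_range=15,
                                       width_shift_range=0.1,
                                       height_shift_range=0.1,
                                       horizontal_flip=True)
        batch = np.repeat(image[np.newaxis, ...], n_augmentations, axis=0)
        augmented = next(augmenter.flow(batch, batch_size=n_augmentations,
                                        shuffle=False))
        preds = model.predict(augmented)   # one prediction per augmented copy
        return preds.mean(axis=0)          # average over the augmentations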


Learning generic sentence representation by various NLP tasks

There are a lot of papers that introduce different ways to compute a better text representation. From character embeddings and word embeddings to sentence embeddings, I have introduced different ways to deal with it. However, most of them are trained on a single data set or problem domain. This time, Subramanian et al. explore a simple, effective multi-task learning approach for sentence embeddings. The hypothesis is that good sentence representations can be learnt from a vast amount of weakly related tasks.


Text Encoding: A Review

The key to performing any text mining operation, such as topic detection or sentiment analysis, is to transform words into numbers and sequences of words into sequences of numbers. Once we have numbers, we are back in the well-known game of data analytics, where machine learning algorithms can help us with classifying and clustering.
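
For a concrete flavour of that transformation, here is a small sketch with Keras' Tokenizer: words are mapped to integer indices, documents become padded sequences of those indices, or, alternatively, bag-of-words / TF-IDF vectors. The toy documents are made up for illustration.

    from tensorflow.keras.preprocessing.text import Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    docs = ["the cat sat on the mat",
            "the dog sat on the log",
            "cats and dogs"]

    # Build a vocabulary and map each word to an integer index.
    tokenizer = Tokenizer(num_words=50)
    tokenizer.fit_on_texts(docs)
    sequences = tokenizer.texts_to_sequences(docs)   # lists of word indices
    padded = pad_sequences(sequences, maxlen=6)      # equal-length sequences

    # Alternatively, encode each document as a single vector.
    bow = tokenizer.texts_to_matrix(docs, mode="count")
    tfidf = tokenizer.texts_to_matrix(docs, mode="tfidf")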


https://towardsdatascience.com/inference-using-em-algorithm-d71cccb647bc

The goal of this post is to explain a powerful algorithm in statistical analysis: the Expectation-Maximization (EM) algorithm. It is powerful in the sense that it can deal with missing data and unobserved features, use cases that come up frequently in many real-world applications.
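
A classic illustration is fitting a two-component Gaussian mixture where the component each point came from is the missing data. The sketch below alternates the E-step (compute each component's responsibility for each point) and the M-step (re-estimate the mixture parameters from those responsibilities) on synthetic data; it is a generic EM example, not necessarily the one used in the post.

    import numpy as np
    from scipy.stats import norm

    # Toy data from two Gaussians; the component labels are the "missing" data.
    rng = np.random.default_rng(0)
    x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1.5, 300)])

    # Initial guesses for mixture weights, means, and standard deviations.
    pi, mu, sigma = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])

    for _ in range(50):
        # E-step: responsibility of each component for each point.
        dens = np.stack([p * norm.pdf(x, m, s) for p, m, s in zip(pi, mu, sigma)])
        resp = dens / dens.sum(axis=0)

        # M-step: re-estimate the parameters from the weighted data.
        nk = resp.sum(axis=1)
        pi = nk / len(x)
        mu = (resp @ x) / nk
        sigma = np.sqrt((resp * (x - mu[:, None]) ** 2).sum(axis=1) / nk)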


So, what is AI really?

So, what is AI really? This post tries to give some guidance. The traditional way of coding a computer program was to carefully analyze the problem, separate it into simpler sub-problems, put the solution into an algorithmic form (a kind of recipe), and finally code it. Let's have a look at an example!
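
The post goes on to its own example; purely for illustration, a hand-coded program of this traditional kind might look like the hypothetical spam check below, where the programmer, not the machine, decides which features matter.

    def looks_like_spam(message: str) -> bool:
        # Hand-written rules: the "recipe" is designed entirely by a human.
        suspicious_words = {"free", "winner", "prize", "urgent"}
        words = message.lower().split()
        hits = sum(word.strip(".,!?") in suspicious_words for word in words)
        too_many_exclamations = message.count("!") >= 3
        return hits >= 2 or too_many_exclamations

    print(looks_like_spam("URGENT! You are a winner, claim your free prize!"))  # True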


RFM Analysis in R

RFM (recency, frequency, monetary) analysis is a behavior-based technique used to segment customers by examining their transaction history, such as:
• how recently a customer has purchased (recency)
• how often they purchase (frequency)
• how much the customer spends (monetary)
It is based on the marketing axiom that 80% of your business comes from 20% of your customers. RFM helps to identify customers who are more likely to respond to promotions by segmenting them into various categories.
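
The article works through the analysis in R; as a language-agnostic illustration of the same idea, here is a minimal sketch in Python/pandas on made-up transaction data: aggregate recency, frequency, and monetary value per customer, then score each dimension by quartile.

    import numpy as np
    import pandas as pd

    # Hypothetical transaction log: one row per purchase for ~40 customers.
    rng = np.random.default_rng(1)
    n = 200
    tx = pd.DataFrame({
        "customer_id": rng.integers(1, 41, size=n),
        "order_date": pd.Timestamp("2019-01-01")
                      + pd.to_timedelta(rng.integers(0, 90, size=n), unit="D"),
        "amount": rng.gamma(2.0, 30.0, size=n).round(2),
    })

    snapshot = tx["order_date"].max() + pd.Timedelta(days=1)
    rfm = tx.groupby("customer_id").agg(
        recency=("order_date", lambda d: (snapshot - d.max()).days),  # days since last purchase
        frequency=("order_date", "count"),                            # number of purchases
        monetary=("amount", "sum"),                                   # total spend
    )

    # Score each dimension from 1 (worst) to 4 (best) using quartiles; recency
    # is inverted because a smaller value (a more recent purchase) is better.
    rfm["R"] = pd.qcut(rfm["recency"].rank(method="first"), 4, labels=[4, 3, 2, 1])
    rfm["F"] = pd.qcut(rfm["frequency"].rank(method="first"), 4, labels=[1, 2, 3, 4])
    rfm["M"] = pd.qcut(rfm["monetary"].rank(method="first"), 4, labels=[1, 2, 3, 4])
    rfm["segment"] = rfm["R"].astype(str) + rfm["F"].astype(str) + rfm["M"].astype(str)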


Introducing PlaNet: A Deep Planning Network for Reinforcement Learning

Research into how artificial agents can improve their decisions over time is progressing rapidly via reinforcement learning (RL). For this technique, an agent observes a stream of sensory inputs (e.g. camera images) while choosing actions (e.g. motor commands), and sometimes receives a reward for achieving a specified goal. Model-free approaches to RL aim to directly predict good actions from the sensory observations, enabling DeepMind’s DQN to play Atari and other agents to control robots. However, this blackbox approach often requires several weeks of simulated interaction to learn through trial and error, limiting its usefulness in practice. Model-based RL, in contrast, attempts to have agents learn how the world behaves in general. Instead of directly mapping observations to actions, this allows an agent to explicitly plan ahead, to more carefully select actions by ‘imagining’ their long-term outcomes. Model-based approaches have achieved substantial successes, including AlphaGo, which imagines taking sequences of moves on a fictitious board with the known rules of the game. However, to leverage planning in unknown environments (such as controlling a robot given only pixels as input), the agent must learn the rules or dynamics from experience. Because such dynamics models in principle allow for higher efficiency and natural multi-task learning, creating models that are accurate enough for successful planning is a long-standing goal of RL.
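
To make the "planning by imagining outcomes" idea concrete, here is a deliberately simplified sketch of model-based action selection by random shooting: sample candidate action sequences, roll them out with a learned dynamics and reward model, and execute the first action of the best imagined sequence. PlaNet itself plans in a learned latent space with a more sophisticated search, so the callables below are hypothetical stand-ins.

    import numpy as np

    def plan_action(dynamics_model, reward_model, state,
                    horizon=12, n_candidates=1000, action_dim=2):
        # dynamics_model(state, action) -> next_state and
        # reward_model(state, action) -> reward are assumed to have been
        # learned from experience (hypothetical stand-ins for a world model).
        best_return, best_first_action = -np.inf, None
        for _ in range(n_candidates):
            actions = np.random.uniform(-1, 1, size=(horizon, action_dim))
            s, total = state, 0.0
            for a in actions:
                total += reward_model(s, a)   # imagined reward
                s = dynamics_model(s, a)      # imagined next state
            if total > best_return:
                best_return, best_first_action = total, actions[0]
        return best_first_action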


Rant

Rant is an all-purpose procedural text engine that is most simply described as the opposite of Regex. It has been refined to include a dizzying array of features for handling everything from the most basic of string generation tasks to advanced dialogue generation, code templating, automatic formatting, and more. The goal of the project is to enable developers of all kinds to automate repetitive writing tasks with a high degree of creative freedom.