Engineering guide to writing correct User Stories

Agile practitioners are obsessed with writing user stories, and they are indeed a powerful instrument. But in my experience, a lot of people write them wrong.

Computer Vision Tutorial: A Step-by-Step Introduction to Image Segmentation Techniques (Part 1)

What’s the first thing you do when you’re attempting to cross the road? We typically look left and right, take stock of the vehicles on the road, and make our decision. Our brain is able to analyze, in a matter of milliseconds, what kind of vehicle (car, bus, truck, auto, etc.) is coming towards us. Can machines do that? The answer was an emphatic ‘no’ till a few years back. But the rise and advancements in computer vision have changed the game. We are able to build computer vision models that can detect objects, determine their shape, predict the direction the objects will go in, and many other things. You might have guessed it – that’s the powerful technology behind self-driving cars!

A survey on association rules mining using heuristics

Association rule mining (ARM) is a commonly encountered data mining method. There are many approaches to mining frequent rules and patterns from a database, and one of them is heuristics. Many heuristic approaches have been proposed but, to the best of our knowledge, there is no comprehensive literature review of such approaches, only limited attempts. This gap needs to be filled. This paper reviews heuristic approaches to ARM and points out their most significant strengths and weaknesses. We propose eight performance metrics, such as execution time, memory consumption, completeness, and interestingness; we compare the approaches against these metrics and discuss our findings. For instance, comparison results indicate that SRmining, PMES, Ant-ARM, and MDS-H are the fastest heuristic ARM algorithms. HSBO-TS is the most complete one, while SRmining and ACS require only one database scan. In addition, we propose a parameter, named GT-Rank, for ranking heuristic ARM approaches, and based on that, ARMGA, ASC, and Kua emerge as the best approaches. We also consider ARM algorithms and their characteristics as transactions and items in a transactional database, respectively, and generate association rules that indicate research trends in this area.
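
To make the notion of an association rule concrete, here is a minimal Python sketch of the two standard measures used to score a rule, support and confidence. It does not implement any of the heuristic algorithms named above; the transactions, item names, and function names are made up for illustration.

```python
from typing import FrozenSet, List

# Hypothetical toy database: each transaction is a set of purchased items.
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]


def support(transactions: List[FrozenSet[str]], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> float:
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0.0:
        return 0.0  # the rule never fires, so its confidence is reported as 0
    return support(transactions, antecedent | consequent) / base
```

For the rule bread -> milk in this toy database, support is 2/4 and confidence is 2/3: bread appears in three transactions, two of which also contain milk.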

A Radical AI Strategy – Platformication

A new business model strategy built around intermediary platforms powered by AI/ML promises the most direct path to rapid growth, profitability, and competitive success. Adopting this approach requires a deep change in mindset and is quite different from simply adopting AI/ML to optimize your current operations.

Long-range Correlations in Time Series: Modeling, Testing, Case Study

We investigate a large class of auto-correlated, stationary time series, proposing a new statistical test to measure departure from the base model, known as Brownian motion. We also discuss a methodology to deconstruct these time series, in order to identify the root mechanism that generates the observations. The time series studied here can be discrete or continuous in time; they can have various degrees of smoothness (typically measured using the Hurst exponent) as well as long-range or short-range correlations between successive values. Applications are numerous, and we focus here on a case study arising from an interesting number theory problem. In particular, we show that one of the time series investigated in my article on randomness theory [see here, read section 4.1.(c)] is not Brownian despite appearances. This has important implications for the problem in question. Applied to finance or economics, it makes the difference between an efficient market and one that can be gamed. This article is accessible to a large audience, thanks to its tutorial style, illustrations, and easily replicable simulations. Nevertheless, we discuss modern, advanced, and state-of-the-art concepts. This is an area of active research.

Machine Learning, Natural Language Meet to Understand Intent

Machine learning and natural language capabilities will bring the power of analytics to more people through semantics.

XAI – A Data Scientist’s Mouthpiece

We outline the usefulness of Explainable AI, which allows you to explain the results of a multidimensional model, including one with a multimodal decision boundary, to a business user.

RInside Help in Testing

A problem that arises when building R interfaces to C/C++ libraries involves testing: how to go about replicating the existing C/C++ tests in R without undue effort. If the C/C++ tests are simple and small enough, they can be manually translated. However, when there are many tests, and each test initializes its own large data structures, the task becomes a chore. We faced this problem with a recent release of ECOSolveR, a solver package crucial to our larger package CVXR. Until version 0.4, we had been content with including one small test and a larger one using saved RDA files in the R package. But with our work on CVXR moving towards a version 1.0 release, we wanted to batten down the hatches.

Shiny Apps for Interactive Data Analysis

We are excited and happy to share a set of shiny apps built for interactive data analysis and teaching at Rsquared Academy. The apps are part of our R packages and presently cover the following topics:
• Descriptive Statistics
• Probability Distributions
• Hypothesis Testing
• Linear Regression
• Logistic Regression
• RFM Analysis
• Data Visualization
We would suggest that you explore the apps using sample data sets available within the app before using your own data set so that you get comfortable with the user interface.

Mixture of Variational Autoencoders – a Fusion Between MoE and VAE

An unsupervised approach to digit classification and generation.

Which Deep Learning Framework is Growing Fastest?

In September 2018, I compared all the major deep learning frameworks in terms of demand, usage, and popularity in this article. TensorFlow was the undisputed heavyweight champion of deep learning frameworks. PyTorch was the young rookie with lots of buzz. How has the landscape changed for the leading deep learning frameworks in the past six months?

Fundamental Techniques of Feature Engineering for Machine Learning

What is a feature, and why do we need to engineer it? Basically, all machine learning algorithms use some input data to create outputs. This input data comprises features, which are usually in the form of structured columns. Algorithms require features with specific characteristics to work properly. This is where the need for feature engineering arises.
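
As a small illustration of giving features the characteristics an algorithm expects, the Python sketch below rescales a numeric column to the range [0, 1] and one-hot encodes a categorical column. These are two common techniques chosen here as examples, not necessarily the ones the article covers; the function names and sample values are hypothetical.

```python
from typing import List, Sequence


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Rescale a numeric column linearly so its values lie in [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values: Sequence[str]) -> List[List[int]]:
    """Turn a categorical column into 0/1 indicator columns.

    Categories are ordered alphabetically so the output is deterministic.
    """
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]
```

With these helpers, a column of ages such as 10, 20, 30 becomes 0.0, 0.5, 1.0, and a column of colors becomes one indicator column per distinct color.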

Simple text classification skill of DeepPavlov

Suppose you have a customer support call center that should resolve your clients’ issues and answer their questions. In most cases, the number of unique entities in the pool of client requests is rather limited. This leads to a situation in which your employees face a huge number of repeated questions. In this case, you would definitely benefit from applying a natural language processing (NLP) system that is able to find a question semantically similar to your client’s question and give the corresponding answer from a predefined FAQ-like list of questions and answers.
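
The idea of matching a client question against a predefined FAQ can be sketched in a few lines of plain Python. This is not DeepPavlov's API: it is a deliberately simple stand-in that compares bag-of-words vectors with cosine similarity, so it captures word overlap rather than true semantic similarity. The FAQ entries and function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List

# Hypothetical FAQ: known question -> prepared answer.
FAQ: Dict[str, str] = {
    "How do I reset my password?": "Use the 'Forgot password' link on the login page.",
    "What are your opening hours?": "We are open 9:00 to 17:00, Monday to Friday.",
    "How can I cancel my order?": "Open 'My orders' and choose 'Cancel'.",
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def answer(question: str, faq: Dict[str, str]) -> str:
    """Return the answer attached to the most similar known question."""
    q_vec = Counter(tokenize(question))
    best = max(faq, key=lambda known: cosine(q_vec, Counter(tokenize(known))))
    return faq[best]
```

A production system would replace the word-count vectors with learned sentence embeddings and add a similarity threshold below which the request is handed to a human agent.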

Productising TensorFlow / Keras models via Tensorflow Serving

If you work in software engineering and are trying to work with deep learning models, chances are that you are using an open-source deep learning model, such as SSD or FasterRCNN, and building your application on top of it. There are many ways to do this. Typically, you would build a standalone TensorFlow or Keras application that loads your model onto the GPU, create a custom REST or gRPC interface, and write custom clients. Depending on the model you use, your server and client will vary. This is how most systems in use today work. There is a better way, TensorFlow Serving, which is an elegant way of serving your TensorFlow or Keras models. Moreover, once DL matures, or if you want to productize it, you may need some sort of verification and continuous deployment (CD) for your models as well. The TF Serving pipeline can help here too.
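
For a sense of why a standard serving layer removes the need for custom clients, here is a minimal Python sketch of a client for the TensorFlow Serving REST predict endpoint, using only the standard library. It assumes a server is already running with its REST API on port 8501 (the usual default); the host, model name, and helper function names are hypothetical, and the input shape depends entirely on your model.

```python
import json
import urllib.request
from typing import Any, List, Optional


def predict_url(host: str, model_name: str, port: int = 8501,
                version: Optional[int] = None) -> str:
    """Build the REST predict URL for a served model (optionally a fixed version)."""
    path = f"/v1/models/{model_name}"
    if version is not None:
        path += f"/versions/{version}"
    return f"http://{host}:{port}{path}:predict"


def predict_body(instances: List[Any]) -> bytes:
    """Encode inputs in the row-oriented format: {"instances": [...]}."""
    return json.dumps({"instances": instances}).encode("utf-8")


def predict(host: str, model_name: str, instances: List[Any],
            timeout: float = 10.0) -> Any:
    """POST the instances to the server and return its "predictions" field."""
    request = urllib.request.Request(
        predict_url(host, model_name),
        data=predict_body(instances),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["predictions"]
```

A call such as `predict("localhost", "my_model", [[1.0, 2.0, 3.0]])` would then work unchanged for any model served under that name, which is the point of the approach: the client depends on the serving protocol, not on the model's code.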