Polynomial regression with MShadow library tutorial

Hello, this is my second article about using modern C++ to solve machine learning problems. This time I will show how to build a model for the polynomial regression problem described in the previous article, but with another library, one that lets you use your GPU easily. For this tutorial I chose the MShadow library; you can find its documentation here. I chose it because it is actively developed and serves as the basis for MXNet, a widely used deep learning framework. It is also a header-only library with minimal dependencies, so its integration is not hard at all.

Text Mining and Sentiment Analysis – A Primer

Over the years, a crucial part of data-gathering behavior has revolved around what other people think. With the constantly growing popularity and availability of opinion-driven resources such as personal blogs and online review sites, new challenges and opportunities are emerging as people use advanced technologies to make decisions. Sentiment analysis, or opinion mining, refers to the use of computational linguistics, text analytics, and natural language processing to identify and extract information from source materials. It is considered one of the most popular applications of text analytics. At its core, sentiment analysis examines the body of a text to understand the opinion it expresses, along with other key factors such as modality and mood. The process usually works better on text with a subjective context than on text with only an objective context. This is because objective text usually states ordinary facts without expressing any emotion, feeling, or mood, whereas subjective text is usually written by a person and carries that person's moods, emotions, and feelings. Sentiment analysis is widely used, especially as part of social media analysis in any domain, be it a business, a recent movie, or a product launch, to understand how people have received it and what they think of it, based on their opinions or sentiment.

Why GDPR will Make Machine Learning not so legal

How exactly compliance with GDPR will look is not entirely clear. Just because something is required by law does not necessarily mean that every person and every organisation complies with either the letter or the spirit of the law. In a short time, GDPR-compliant data protection products, services, consulting work, and audit services will flourish around this new buzzword. Privacy policies are being updated to be more user-friendly and to address the new data regulations. The same standards apply in all vital areas, such as big data analysis and artificial intelligence. I am sure you have many questions running through your head, and I am sure I will be able to clear up many of them in my subsequent blog posts.

Classification from scratch, logistic with splines 2/8

Today, the second post of our series on classification from scratch, following the brief introduction to logistic regression.

Classification from scratch, logistic with kernels 3/8

Third post of our series on classification from scratch, following the previous post introducing smoothing techniques with (b)-splines. Here we consider kernel-based techniques. Note that here, we do not use the “logistic” model… it is purely non-parametric.
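
To make the idea of a purely non-parametric, kernel-based classifier concrete, here is a minimal sketch. It assumes a Nadaraya-Watson estimator with a Gaussian kernel on one-dimensional inputs; the estimator choice, the function names, the bandwidth, and the toy data are all illustrative assumptions and are not taken from the original post.

```python
import math


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the constant cancels in the ratio below."""
    return math.exp(-0.5 * u * u)


def kernel_probability(x0, xs, ys, bandwidth=1.0):
    """Nadaraya-Watson estimate of P(Y = 1 | X = x0) for 1-D inputs.

    Each training label is weighted by how close its x is to x0.
    No logistic (or any other parametric) model is fitted.
    """
    weights = [gaussian_kernel((x - x0) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed to zero: no information, so return 0.5.
        return 0.5
    return sum(w * y for w, y in zip(weights, ys)) / total


def kernel_classify(x0, xs, ys, bandwidth=1.0, threshold=0.5):
    """Predict class 1 when the estimated probability exceeds the threshold."""
    return 1 if kernel_probability(x0, xs, ys, bandwidth) > threshold else 0


# Hypothetical toy data: small x values are class 0, large ones class 1.
xs = [0.0, 0.5, 1.0, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]

print(kernel_classify(0.2, xs, ys))  # a point near the class-0 cluster
print(kernel_classify(3.8, xs, ys))  # a point near the class-1 cluster
```

The bandwidth plays the role of the smoothing parameter: a small value makes the prediction depend only on the nearest observations, while a large value pulls every prediction toward the overall share of class 1.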

Ecology of Metrics

Although I deal with many different types of metrics, I believe they can generally be classified as follows: 1) time use; 2) alignment; 3) production; 4) performance; 5) service; and 6) market. In this blog, I will provide some comments on each. Although I have yet to encounter any myself, I am certain that there must be textbooks on operational metrics and how to make use of them. However, I personally developed nearly all of the metrics that I use. Although I do not feel that I can freely share elaborate details, techniques, or methodologies, I recognize that young people entering the field should probably have some familiarity with metrics. I also feel that data science specifically cannot survive in a purely academic setting; it has to be highly applicable to business and real-life problems. So, to promote the industry, I would like to offer some personal insights and experiences.

Time-series data mining & applications

A time series is a sequence of data points recorded at specific points in time, most often at regular intervals (seconds, hours, days, months, etc.). Every organization generates a high volume of data every single day, be it sales figures, revenue, traffic, or operating costs. Time series data mining can generate valuable information for long-term business decisions, yet it is underutilized in most organizations.