Common Mistakes to Avoid When Learning to Code in Python

Python is one of the simplest programming languages to learn, and it's a very flexible, object-oriented language when it comes to syntax. Python created a new revolution in the coding segment. Coding is joy. Coding is fun. Coding is everything to programmers, but developers often get misled by Python's simple syntax. In this article we will discuss the most common mistakes Python programmers make.
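The excerpt above does not name the individual mistakes, so the following is only an illustration of one pitfall that is widely attributed to Python's deceptively simple syntax: using a mutable object as a default argument. The function names `append_bad` and `append_good` are hypothetical and are not taken from the article.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `bucket` shares the same list object.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # Use None as a sentinel and build a fresh list on each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first_bad = append_bad(1)
second_bad = append_bad(2)    # same list object as first_bad
first_good = append_good(1)
second_good = append_good(2)  # independent list
```

Here `first_bad` and `second_bad` end up being the same list, while the two calls to `append_good` return separate lists.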

Pandas Tutorial 1: Pandas Basics (Reading Data Files, DataFrames, Data Selection)

Pandas is one of the most popular Python libraries for Data Science and Analytics. I like to say it's the ‘SQL of Python.’ Why? Because pandas helps you to manage two-dimensional data tables in Python. Of course, it has many more features. In this pandas tutorial series, I'll show you the most important (that is, the most often used) things that you have to know as an Analyst or a Data Scientist. This is the first episode and we will start from the basics!

How to take machine learning from exploration to implementation

Recognizing the interest in ML, the Strata Data Conference program is designed to help companies adopt ML across large sections of their existing operations. Interest in machine learning (ML) has been growing steadily, and many companies and organizations are aware of the potential impact these tools and technologies can have on their underlying operations and processes. The reality is that we are still in the early phases of adoption, and a majority of companies have yet to deploy ML across their operations.

Real-time data visualization using R and data extracting from SQL Server

In the previous post, I showed how to visualize near real-time data using Python and the Dash module. Now it is time to see one of the many ways to do it in R. This time, I will not use any additional frameworks for visualization, like shiny, plotly or any others, but will simply use base R functions and the RODBC package to extract data from SQL Server. Extracting data from SQL Server while simulating inserts into a SQL Server table will primarily simulate the near real-time data. If you have followed the previous post, you will notice that I am using the same T-SQL table and query to extract real-time data.

DALEX and H2O: Machine Learning Model Interpretability And Feature Explanation

As advanced machine learning algorithms are gaining acceptance across many organizations and domains, machine learning interpretability is growing in importance to help extract insight and clarity regarding how these algorithms are performing and why one prediction is made over another. There are many methodologies to interpret machine learning results (e.g. variable importance via permutation, partial dependence plots, local interpretable model-agnostic explanations), and many machine learning R packages implement their own versions of one or more methodologies. However, some recent R packages that focus purely on ML interpretability, agnostic to any specific ML algorithm, are gaining popularity. One such package is DALEX, and this post covers what this package does (and does not do) so that you can determine if it should become part of your preferred machine learning toolbox. We implement machine learning models using H2O, a high-performance ML toolkit. Let's see how DALEX and H2O work together to get the best of both worlds with high performance and feature explainability!
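To make "variable importance via permutation" concrete, here is a minimal, model-agnostic sketch in plain Python. It is not the DALEX or H2O API (the post itself works in R). The helper names, the toy model and the data are all hypothetical. The idea is to shuffle one feature column at a time and measure how much the model's error grows.

```python
import random


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Average increase in MSE when each feature column is shuffled.

    `predict` is any callable mapping a feature tuple to a number, so the
    procedure never looks inside the model.
    """
    rng = random.Random(seed)
    baseline = mean_squared_error(targets, [predict(r) for r in rows])
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            # Rebuild each row with only this one column permuted.
            permuted = [r[:col] + (v,) + r[col + 1:] for r, v in zip(rows, shuffled)]
            loss = mean_squared_error(targets, [predict(r) for r in permuted])
            increases.append(loss - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    # Hypothetical "trained model": it uses feature 0 and ignores feature 1.
    return 3.0 * row[0]


rows = [(float(i), float((i * 7) % 5)) for i in range(10)]
targets = [toy_model(r) for r in rows]
scores = permutation_importance(toy_model, rows, targets)
```

Because the toy model ignores its second feature, shuffling that column cannot change the error, so its score is zero, while the first feature receives a positive score.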

How to add Trend Lines in R Using Plotly

When you are conducting an exploratory analysis of time-series data, you’ll need to identify trends while ignoring random fluctuations in your data. There are multiple ways to solve this common statistical problem in R by estimating trend lines. We’ll show you how in this article as well as how to visualize it using the Plotly package.
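As a rough illustration of what "estimating a trend line" can mean, the sketch below computes a trailing moving average and an ordinary least-squares line in plain Python. The article itself works in R with Plotly, so this is not the author's code. The function names and the sample series are assumptions made for this example.

```python
from statistics import mean


def moving_average(values, window):
    """Trailing moving average; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [mean(values[i:i + window]) for i in range(len(values) - window + 1)]


def linear_trend(values):
    """Least-squares line through (0, y0), (1, y1), ...; returns (slope, intercept)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(values)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Made-up series: an upward drift with some noise on top.
series = [2.0, 4.5, 3.5, 6.0, 5.5, 8.0, 7.0, 9.5]
smoothed = moving_average(series, window=3)
slope, intercept = linear_trend(series)
```

The moving average smooths out short-term fluctuations, while the fitted slope summarizes the overall direction of the series. Either result could then be drawn on top of the raw data in a plotting library.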

Where to get help with your R question?

Last time I blogged, I offered my obnoxious helpful advice for blog content and promotion. Today, let me again be the agony aunt you didn't even write to! Imagine you have an R question, i.e. a question related to how you can do something with R, and your search engine efforts haven't been too successful: where should you ask it to increase your chance of its getting answered? You could see this post as my future answer to stray suboptimal Twitter R questions, or as a more general version of Scott Chamberlain's excellent talk about how to get help related to rOpenSci software in the 2017-03-07 rOpenSci comm call. I think that the general journey to getting answers to your R questions is first trying your best to get answers locally in the documentation of R, then using a search engine, and then posting a well-formulated question somewhere. My post is aimed at helping you find that somewhere. Note that it's considered best practice to ask in one somewhere at a time, and to then move on to another somewhere if you haven't got any answer! One thing this post isn't about is how to ask humans for help, which is a topic that's covered very well in e.g. Jenny Bryan's talk about reprex in the same 2017-03-07 rOpenSci comm call, but I'll link out to useful resources I've found. This post is also not about how to ask e.g. Google for help, and I don't know of a good search engine guide yet, although e.g. ‘It can be particularly helpful to paste an error message into a search engine to find out whether others have solved a problem that you encountered.’ in https://…/help.html is true.