**10 Steps to your very own Corporate A.I project**

A non-technical guide for managers, leaders, thinkers and dreamers.

1. Formulate an Executive Strategy

2. Identify and Prioritise Ideas

4. Perform the necessary Risk Assessments

5. Choose the relevant Method & Model

6. Make a BBP decision

7. Run performance checks

8. Deploy the algorithm

9. Communicate both successes and failures

10. Tracking

Today we are going to talk about quantile regression. When we use the lm command in R we are fitting a linear regression using Ordinary Least Squares (OLS), which can be interpreted as a model for the conditional mean of y given x. However, sometimes we need to look at more than the conditional mean to understand our data, and quantile regression may be a good alternative. Instead of modelling the mean, quantile regression fits models for particular quantiles chosen by the user. The simplest case where quantile regression helps is when you have outliers in your data, because the median (the 0.5 quantile) is much less affected by extreme values than the mean. But there are other cases where quantile regression may be used, for example to identify heterogeneous effects of a variable or to make your results more robust.
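To make the outlier argument concrete, here is a minimal sketch in plain Python rather than R. It is not the code from the article: the function names and the toy data are assumptions made for illustration. It fits a single-covariate quantile regression by brute force, relying on the known property that an optimal quantile regression line can always be chosen to pass through at least two data points, and compares it with an OLS fit.

```python
from itertools import combinations


def pinball_loss(residual, tau):
    """Check (pinball) loss that quantile regression minimises."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def fit_quantile_line(xs, ys, tau):
    """Exact quantile regression for one covariate by brute force.

    Tries every line through two data points and keeps the one with the
    smallest total pinball loss. Only sensible for small samples.
    """
    points = list(zip(xs, ys))
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # a vertical line is not a valid regression line
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        loss = sum(pinball_loss(y - (intercept + slope * x), tau)
                   for x, y in points)
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two points with distinct x values")
    return best[1], best[2]


def fit_ols_line(xs, ys):
    """Ordinary least squares for one covariate (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


if __name__ == "__main__":
    # Hypothetical toy data: y = 1 + 2x, with the last point replaced by an outlier.
    xs = list(range(10))
    ys = [1 + 2 * x for x in xs]
    ys[-1] = 100
    print("median regression (intercept, slope):", fit_quantile_line(xs, ys, 0.5))
    print("OLS (intercept, slope):", fit_ols_line(xs, ys))
```

On this toy data the median regression line stays on the nine clean points, while the OLS slope is pulled strongly towards the outlier. Changing `tau` to, say, 0.1 or 0.9 fits the lower or upper part of the conditional distribution instead.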

**Statistical Sentiment-Analysis for Survey Data using Python**

Surveys play a central part in collecting client feedback on a product or service offered to the public. Are we getting too much negative feedback? Why? How can we fix the issues? What are we doing well, and what have we improved over time? What are the key issues to understand? Assessing responses from customer surveys and creating a report that answers these questions is easier said than done. It may take hours, or even days, to go through all the responses and find the root of a problem.

**Effective Way for Finding Deep Learning Papers**

Recently, I came across a great video of Prof. Andrew Ng explaining to a CS class at Stanford how one can excel in the field of artificial intelligence. I will rephrase his words below. Deep learning is evolving fast enough that, even if you have learned its foundations, you need to read research papers to stay on top of the most recent ideas when you work on specific applications.

**Applying product methodologies in data science**

What makes a great data-driven product? Fancy models? Groundbreaking ideas? The truth is that the secret sauce usually rests in successfully implementing a product methodology. In this post I carry out a retro on a recent hackathon experience, using the lean and agile methodology concepts of Minimum Viable Product, Risky Assumptions, and Spikes. I explore how these approaches can help a team quickly identify a use case, map the risks and complexity of the solutions envisioned, and iterate rapidly towards a shippable product.

**Learn how to use PySpark in under 5 minutes (Installation + Tutorial)**

I’ve found that it is a little difficult for most people to get started with Apache Spark (this post will focus on PySpark) and install it on a local machine. With this simple tutorial you’ll get there really fast! Apache Spark is a must for big data lovers, as it is a fast, easy-to-use general engine for big data processing with built-in modules for streaming, SQL, machine learning and graph processing. It is an in-demand skill for data engineers, but data scientists can also benefit from learning Spark when doing Exploratory Data Analysis (EDA), feature extraction and, of course, ML. But please remember that Spark’s potential is only truly realized when it is run on a cluster with a large number of nodes.

**Moving Towards ML: Evaluation Functions**

This week, we’re going to start taking our AI in a somewhat new direction. Right now, we’re hard-coding specific decisions for our player to make. But this week, we’ll make a more general function for evaluating different positions. Our initial results will be inferior to the AI we’ve hand-coded. But we’ll set ourselves up to have a much better AI in the future by applying machine learning. For more details on the code for this article, take a look at the evaluation-game-function branch on our GitHub repository! This article also starts our move towards machine learning related concepts, so now would be a good time to review our Haskell AI Series. You can download our Tensor Flow Guide to learn more about using Haskell and Tensor Flow!
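As a rough illustration of the idea of replacing hard-coded decisions with a general evaluation function, here is a minimal sketch in Python (the series itself is written in Haskell). Everything in it is hypothetical: the `Position` fields, the feature names and the weights are assumptions for the example, not the game state or code from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """Hypothetical game state; these fields are assumptions for illustration."""
    player_health: int
    enemies_nearby: int
    distance_to_goal: int


def features(position):
    """Turn a position into named numeric features."""
    return {
        "health": float(position.player_health),
        "enemies": float(position.enemies_nearby),
        "distance": float(position.distance_to_goal),
    }


def evaluate(position, weights):
    """Score a position as a weighted sum of its features (higher is better)."""
    return sum(weights[name] * value
               for name, value in features(position).items())


def best_move(candidates, weights):
    """Pick the move whose resulting position has the highest score.

    `candidates` maps a move name to the position it would lead to.
    """
    return max(candidates, key=lambda move: evaluate(candidates[move], weights))


if __name__ == "__main__":
    # Hand-picked weights; the point of the approach is that they could
    # later be learned from data instead.
    weights = {"health": 1.0, "enemies": -5.0, "distance": -1.0}
    candidates = {
        "move_up": Position(player_health=10, enemies_nearby=0, distance_to_goal=3),
        "move_right": Position(player_health=10, enemies_nearby=2, distance_to_goal=1),
    }
    print(best_move(candidates, weights))
```

The design choice worth noting is that the decision logic no longer encodes specific rules. It only compares scores, so improving the player becomes a matter of finding better weights.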

**A Detailed Guide to 7 Loss Functions for Machine Learning Algorithms with Python Code**

1. Squared Error Loss

2. Absolute Error Loss

3. Huber Loss

4. Binary Cross Entropy Loss

5. Hinge Loss

6. Multi-Class Cross Entropy Loss

7. Kullback-Leibler Divergence
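The seven losses listed above can be sketched in a few lines of plain Python. This is a minimal sketch using the standard textbook definitions for a single example, not the code from the guide. The function names, the `delta` default and the `eps` clipping constant are assumptions made for illustration.

```python
import math


def squared_error(y, y_hat):
    """Squared error loss for a single prediction."""
    return (y - y_hat) ** 2


def absolute_error(y, y_hat):
    """Absolute error loss for a single prediction."""
    return abs(y - y_hat)


def huber(y, y_hat, delta=1.0):
    """Huber loss: quadratic for small residuals, linear for large ones."""
    r = abs(y - y_hat)
    return 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)


def binary_cross_entropy(y, p, eps=1e-12):
    """Binary cross entropy for a label y in {0, 1} and predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def hinge(y, score):
    """Hinge loss for a label y in {-1, +1} and a raw classifier score."""
    return max(0.0, 1.0 - y * score)


def multiclass_cross_entropy(true_index, probs, eps=1e-12):
    """Cross entropy when the true class is given as an index into probs."""
    return -math.log(max(probs[true_index], eps))


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    print(huber(0.0, 3.0))                   # large residual: linear regime
    print(binary_cross_entropy(1, 0.9))      # confident and correct: small loss
    print(kl_divergence([0.7, 0.3], [0.5, 0.5]))
```

In practice these are usually averaged over a batch of examples. The per-example versions are kept here so that each definition stays easy to read.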

**Insurance Data Science: Pictures**

At the Summer School of the Swiss Association of Actuaries, in Lausanne, following the part of Jean-Philippe Boucher (UQAM) on telematic data, I will start talking about pictures this Wednesday.

**Project Euphonia’s Personalized Speech Recognition for Non-Standard Speech**

The utility of technology is dependent on its accessibility. One key component of accessibility is automatic speech recognition (ASR), which can greatly improve the ability of those with speech impairments to interact with everyday smart devices. However, ASR systems are most often trained on ‘typical’ speech, which means that underrepresented groups, such as those with speech impairments or heavy accents, don’t experience the same degree of utility. For example, amyotrophic lateral sclerosis (ALS) is a disease that can adversely affect a person’s speech: about 25% of people with ALS experience slurred speech as their first symptom. In addition, most people with ALS eventually lose the ability to walk, so being able to interact with automated devices from a distance can be very important. Yet current state-of-the-art ASR models can yield high word error rates (WER) for speakers with only a moderate speech impairment from ALS, effectively barring access to ASR-reliant technologies.
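For readers unfamiliar with the metric, word error rate is commonly defined as the word-level edit distance between the reference transcript and the recogniser output (substitutions, deletions and insertions), divided by the number of words in the reference. The following is a minimal sketch of that standard definition. The function name and the example sentences are assumptions for illustration, not material from Project Euphonia.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one deleted word and one substituted word.
    print(word_error_rate("turn on the kitchen lights", "turn on kitchen light"))
```

Because insertions count as errors, WER can exceed 1.0 when the recogniser outputs many more words than were spoken.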

**MALMEM: model averaging in linear measurement error models**

We develop model averaging estimation in the linear regression model where some covariates are subject to measurement error. The absence of the true covariates in this framework makes the calculation of the standard residual-based loss function impossible. We take advantage of the explicit form of the parameter estimators and construct a weight choice criterion. It is asymptotically equivalent to the unknown model average estimator minimizing the loss function. When the true model is not included in the set of candidate models, the method achieves optimality in terms of minimizing the relative loss, whereas, when the true model is included, the method estimates the model parameter with root n rate. Simulation results in comparison with existing Bayesian information criterion and Akaike information criterion model selection and model averaging methods strongly favour our model averaging method. The method is applied to a study on health.

We propose a novel class of dynamic shrinkage processes for Bayesian time series and regression analysis. Building on a global-local framework of prior construction, in which continuous scale mixtures of Gaussian distributions are employed for both desirable shrinkage properties and computational tractability, we model dependence between the local scale parameters. The resulting processes inherit the desirable shrinkage behaviour of popular global-local priors, such as the horseshoe prior, but provide additional localized adaptivity, which is important for modelling time series data or regression functions with local features. We construct a computationally efficient Gibbs sampling algorithm based on a Pólya-gamma scale mixture representation of the process proposed. Using dynamic shrinkage processes, we develop a Bayesian trend filtering model that produces more accurate estimates and tighter posterior credible intervals than do competing methods, and we apply the model for irregular curve fitting of minute-by-minute Twitter central processor unit usage data. In addition, we develop an adaptive time varying parameter regression model to assess the efficacy of the Fama-French five-factor asset pricing model with momentum added as a sixth factor. Our dynamic analysis of manufacturing and healthcare industry data shows that, with the exception of the market risk, no other risk factors are significant except for brief periods.

**How to Automate EDA with DataExplorer in R**

EDA (Exploratory Data Analysis) is one of the key steps in any Data Science project. The better the EDA, the better the feature engineering that can follow. From modelling to communication, EDA has many hidden benefits that are often not emphasised when Data Science is taught to beginners.