# Magister Dixit

**12**
*Wednesday*
Dec 2018

Posted in Magister Dixit

“We all tend to think of ‘Statistical Analysis’ as one big skill, but it’s not” Karen Grace-Martin (2018)


**12**
*Wednesday*
Dec 2018

Posted in Documents

**Computing the Unique Information**

Given a set of predictor variables and a response variable, how much information do the predictors have about the response, and how is this information distributed between unique, complementary, and shared components? Recent work has proposed to quantify the unique component of the decomposition as the minimum value of the conditional mutual information over a constrained set of information channels. We present an efficient iterative divergence minimization algorithm to solve this optimization problem with convergence guarantees, and we evaluate its performance against other techniques.
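To make the optimized quantity concrete, here is a small, illustrative computation (not the paper's iterative divergence-minimization algorithm) of the conditional mutual information I(X;Y|Z) from a joint probability table, using only NumPy:

```python
import numpy as np

def conditional_mutual_information(p):
    """I(X;Y|Z) in bits for a joint pmf p[x, y, z]."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    p_z = p.sum(axis=(0, 1))   # p(z)
    p_xz = p.sum(axis=1)       # p(x, z)
    p_yz = p.sum(axis=0)       # p(y, z)
    cmi = 0.0
    for x in range(p.shape[0]):
        for y in range(p.shape[1]):
            for z in range(p.shape[2]):
                pxyz = p[x, y, z]
                if pxyz > 0:
                    cmi += pxyz * np.log2(pxyz * p_z[z] / (p_xz[x, z] * p_yz[y, z]))
    return cmi

# X and Y conditionally independent given Z, so I(X;Y|Z) should vanish.
p_indep = np.zeros((2, 2, 2))
for z in range(2):
    px = np.array([0.3, 0.7]) if z == 0 else np.array([0.6, 0.4])
    py = np.array([0.5, 0.5])
    p_indep[:, :, z] = 0.5 * np.outer(px, py)
print(conditional_mutual_information(p_indep))  # ≈ 0.0
```

The paper's contribution is minimizing this quantity over a constrained set of channels; the sketch only evaluates it for a fixed distribution.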

**12**
*Wednesday*
Dec 2018

Posted in R Packages

**Management of Deterministic and Stochastic Projects**

Management problems of deterministic and stochastic projects. It obtains the duration of a project and the appropriate slack for each activity in a det …

A flexible computational framework for mixture distributions with a focus on composite models.

Isolation forest is an anomaly detection method introduced by the paper Isolation-based Anomaly Detection (Liu, Ting and Zhou <doi:10.1145/2133360.2133 …
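The principle behind the method is that anomalies are separated from the rest of the data in fewer random splits than normal points. A toy one-dimensional sketch (not the R package's implementation) makes this visible:

```python
import random

def isolation_depth(x, data, rng, depth=0, limit=12):
    """Depth at which point x is isolated by random splits (1-D case)."""
    if depth >= limit or len(data) <= 1:
        return depth
    lo, hi = min(data), max(data)
    if lo == hi:
        return depth
    split = rng.uniform(lo, hi)
    # Keep only the points that fall on the same side of the split as x.
    side = [v for v in data if (v < split) == (x < split)]
    return isolation_depth(x, side, rng, depth + 1, limit)

def anomaly_depth(x, data, n_trees=200, seed=0):
    """Average isolation depth over many random trees; lower = more anomalous."""
    rng = random.Random(seed)
    return sum(isolation_depth(x, data, rng) for _ in range(n_trees)) / n_trees

data = [1.0, 1.1, 0.9, 1.05, 0.95, 1.02, 0.98, 10.0]  # 10.0 is the outlier
# The outlier is isolated in fewer splits on average than an inlier.
print(anomaly_depth(10.0, data) < anomaly_depth(1.0, data))  # → True
```

Real implementations build trees over random features of multi-dimensional data and convert the average depth into a normalized anomaly score.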

A lightweight but powerful R interface to the ‘Azure Resource Manager’ REST API. The package exposes classes and methods for ‘OAuth’ authentication and …

**12**
*Wednesday*
Dec 2018

Posted in What is ...

**Sequential Adaptive Nonlinear Modeling of Vector Time Series (SLANTS)**

We propose a method for adaptive nonlinear sequential modeling of vector time series data. Data are modeled as a nonlinear function of past values corrupted by noise, and the underlying nonlinear function is assumed to be approximately expandable in a spline basis. We cast the modeling of data as finding a good-fit representation in the linear span of a multi-dimensional spline basis, and use a variant of l1-penalty regularization to reduce the dimensionality of the representation. Using adaptive filtering techniques, we design our online algorithm to automatically tune the underlying parameters based on the minimization of the regularized sequential prediction error. We demonstrate the generality and flexibility of the proposed approach on both synthetic and real-world datasets. Moreover, we analytically investigate the performance of our algorithm by obtaining both bounds on the prediction errors and consistency results for variable selection. …

**Artificial Stupidity**

And artificial stupidity’s most valid usage relates to examples of the obvious faultiness of AI technologies and systems. Finally within the field of computer science, artificial stupidity has one other significant application: it refers to a technique of deliberately dumbing down computer programs in order to introduce errors in their responses.

Building Safer AGI by introducing Artificial Stupidity …

**Adversarial Graphical Model (AGM)**

In many structured prediction problems, complex relationships between variables are compactly defined using graphical structures. The most prevalent graphical prediction methods—probabilistic graphical models and large margin methods—have their own distinct strengths but also possess significant drawbacks. Conditional random fields (CRFs) are Fisher consistent, but they do not permit integration of customized loss metrics into their learning process. Large-margin models, such as structured support vector machines (SSVMs), have the flexibility to incorporate customized loss metrics, but lack Fisher consistency guarantees. We present adversarial graphical models (AGM), a distributionally robust approach for constructing a predictor that performs robustly for a class of data distributions defined using a graphical structure. Our approach enjoys both the flexibility of incorporating customized loss metrics into its design as well as the statistical guarantee of Fisher consistency. We present exact learning and prediction algorithms for AGM with time complexity similar to existing graphical models and show the practical benefits of our approach with experiments. …

**12**
*Wednesday*
Dec 2018

Posted in Books
**12**
*Wednesday*
Dec 2018

Posted in Magister Dixit

“The volume and variety of data have far outstripped the capacity of manual analysis, and in some cases have exceeded the capacity of conventional databases.” Foster Provost & Tom Fawcett (2014)

**12**
*Wednesday*
Dec 2018

Posted in arXiv Papers

**Improved Knowledge Graph Embedding using Background Taxonomic Information**

Knowledge graphs are used to represent relational information in terms of triples. To enable learning about domains, embedding models, such as tensor factorization models, can be used to make predictions of new triples. Often there is background taxonomic information (in terms of subclasses and subproperties) that should also be taken into account. We show that existing fully expressive (a.k.a. universal) models cannot provably respect subclass and subproperty information. We show that minimal modifications to an existing knowledge graph completion method enables injection of taxonomic information. Moreover, we prove that our model is fully expressive, assuming a lower-bound on the size of the embeddings. Experimental results on public knowledge graphs show that despite its simplicity our approach is surprisingly effective.

**Communication-Efficient Distributed Reinforcement Learning**

This paper studies the distributed reinforcement learning (DRL) problem involving a central controller and a group of learners. Two DRL settings that find broad applications are considered: multi-agent reinforcement learning (RL) and parallel RL. In both settings, frequent information exchange between the learners and the controller are required. However, for many distributed systems, e.g., parallel machines for training deep RL algorithms, and multi-robot systems for learning the optimal coordination strategies, the overhead caused by frequent communication is not negligible and becomes the bottleneck of the overall performance. To overcome this challenge, we develop a new policy gradient method that is amenable to efficient implementation in such communication-constrained settings. By adaptively skipping the policy gradient communication, our method can reduce the communication overhead without degrading the learning accuracy. Analytically, we can establish that i) the convergence rate of our algorithm is the same as the vanilla policy gradient for the DRL tasks; and, ii) if the distributed computing units are heterogeneous in terms of their reward functions and initial state distributions, the number of communication rounds needed to achieve a targeted learning accuracy is reduced. Numerical experiments on a popular multi-agent RL benchmark corroborate the significant communication reduction of our algorithm compared to the alternatives.

**Generalized Batch Normalization: Towards Accelerating Deep Neural Networks**

Utilizing recently introduced concepts from statistics and quantitative risk management, we present a general variant of Batch Normalization (BN) that offers accelerated convergence of Neural Network training compared to conventional BN. In general, we show that the mean and standard deviation are not always the most appropriate choices for the centering and scaling procedure within the BN transformation, particularly if ReLU follows the normalization step. We present a Generalized Batch Normalization (GBN) transformation, which can utilize a variety of alternative deviation measures for scaling and statistics for centering, choices which naturally arise from the theory of generalized deviation measures and risk theory in general. When used in conjunction with the ReLU non-linearity, the underlying risk theory suggests natural, arguably optimal choices for the deviation measure and statistic. Utilizing the suggested deviation measure and statistic, we show experimentally that training is accelerated more than with conventional BN, often with improved error rates as well. Overall, we propose a more flexible BN transformation supported by a complementary theoretical framework that can potentially guide design choices.
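As a rough illustration of the idea (the median and mean absolute deviation below are example choices, not necessarily the paper's recommended deviation measure), the normalization can be written with a pluggable centering statistic and deviation measure in place of the mean and standard deviation:

```python
import numpy as np

def generalized_norm(x, center_fn=np.median, deviation_fn=None, eps=1e-5):
    """Normalize each feature (column) with a pluggable centering statistic
    and deviation measure, in the spirit of generalized BN."""
    if deviation_fn is None:
        # Mean absolute deviation around the chosen center.
        deviation_fn = lambda v, c: np.mean(np.abs(v - c), axis=0)
    c = center_fn(x, axis=0)
    d = deviation_fn(x, c)
    return (x - c) / (d + eps)

rng = np.random.default_rng(0)
batch = rng.normal(loc=3.0, scale=2.0, size=(256, 4))
out = generalized_norm(batch)
# After normalization, each column is centered at zero under the chosen statistic.
print(np.allclose(np.median(out, axis=0), 0.0, atol=1e-6))  # → True
```

Conventional BN is recovered by passing `np.mean` as the center and the standard deviation as the deviation measure.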

**No Peek: A Survey of private distributed deep learning**

We survey distributed deep learning models for training or inference without accessing raw data from clients. These methods aim to protect confidential patterns in data while still allowing servers to train models. The distributed deep learning methods of federated learning, split learning and large batch stochastic gradient descent are compared in addition to private and secure approaches of differential privacy, homomorphic encryption, oblivious transfer and garbled circuits in the context of neural networks. We study their benefits, limitations and trade-offs with regards to computational resources, data leakage and communication efficiency and also share our anticipated future trends.

**Secure Federated Transfer Learning**

Machine learning relies on the availability of a vast amount of data for training. However, in reality, most data are scattered across different organizations and cannot be easily integrated under many legal and practical constraints. In this paper, we introduce a new technique and framework, known as federated transfer learning (FTL), to improve statistical models under a data federation. The federation allows knowledge to be shared without compromising user privacy, and enables complementary knowledge to be transferred in the network. As a result, a target-domain party can build more flexible and powerful models by leveraging rich labels from a source-domain party. A secure transfer cross validation approach is also proposed to guard the FTL performance under the federation. The framework requires minimal modifications to the existing model structure and provides the same level of accuracy as the non-privacy-preserving approach. This framework is very flexible and can be effectively adapted to various secure multi-party machine learning tasks.

Time series data account for a major part of the data supply available today. Time series mining handles several tasks such as classification, clustering, query-by-content, prediction, and others. Performing data mining tasks on raw time series is inefficient as these data are high-dimensional by nature. Instead, time series are first pre-processed using several techniques before different data mining tasks can be performed on them. In general, there are two main approaches to reduce time series dimensionality. The first is what we call landmark methods, which are based on finding characteristic features in the target time series. The second is based on data transformations, which transform the time series from the original space into a reduced space, where they can be managed more efficiently. The method we present in this paper applies a third approach, as it projects a time series onto a lower-dimensional space by selecting important points in the time series. The novelty of our method is that these points are not chosen according to a geometric criterion, which is subjective in most cases, but through an optimization process. The other important characteristic of our method is that these important points are selected at the dataset level and not for each single time series. The direct advantage of this strategy is that the distance defined on the low-dimensional space lower-bounds the original distance applied to raw data. This enables us to apply the popular GEMINI algorithm. The promising results of our experiments on a wide variety of time series datasets, using different optimizers, and applied to the two major data mining tasks, validate our new method.
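The lower-bounding property that makes GEMINI applicable is easy to verify: dropping coordinates can only shrink a Euclidean distance, so index-based projection never causes false dismissals. In this sketch the shared index set is chosen at random purely for illustration, whereas the paper selects it by optimization:

```python
import numpy as np

def project(ts, idx):
    """Project a time series onto a fixed set of important-point indices."""
    return ts[list(idx)]

def dist(a, b):
    return float(np.linalg.norm(a - b))

rng = np.random.default_rng(42)
x, y = rng.normal(size=128), rng.normal(size=128)
idx = sorted(rng.choice(128, size=16, replace=False))  # dataset-level index set

d_low = dist(project(x, idx), project(y, idx))
d_full = dist(x, y)
# The 16-dimensional distance lower-bounds the 128-dimensional one.
print(d_low <= d_full)  # → True
```

Because the same index set is used for every series in the dataset, the reduced-space distance is directly comparable across the whole collection.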

**Theory of Curriculum Learning, with Convex Loss Functions**

Curriculum Learning – the idea of teaching by gradually exposing the learner to examples in a meaningful order, from easy to hard – has long been investigated in the context of machine learning. Although methods based on this concept have been empirically shown to improve the performance of several learning algorithms, no theoretical analysis has been provided even for simple cases. To address this shortfall, we start by formulating an ideal definition of difficulty score – the loss of the optimal hypothesis at a given datapoint. We analyze the possible contribution of curriculum learning based on this score in two convex problems – linear regression, and binary classification by hinge loss minimization. We show that in both cases, the expected convergence rate decreases monotonically with the ideal difficulty score, in accordance with earlier empirical results. We also prove that when the ideal difficulty score is fixed, the convergence rate is monotonically increasing with respect to the loss of the current hypothesis at each point. We discuss how these results reconcile two apparently contradictory heuristics: curriculum learning on the one hand, and hard data mining on the other.

**Closed-form Inference and Prediction in Gaussian Process State-Space Models**

We examine an analytic variational inference scheme for the Gaussian Process State Space Model (GPSSM) – a probabilistic model for system identification and time-series modelling. Our approach performs variational inference over both the system states and the transition function. We exploit Markov structure in the true posterior, as well as an inducing point approximation to achieve linear time complexity in the length of the time series. Contrary to previous approaches, no Monte Carlo sampling is required: inference is cast as a deterministic optimisation problem. In a number of experiments, we demonstrate the ability to model non-linear dynamics in the presence of both process and observation noise as well as to impute missing information (e.g. velocities from raw positions through time), to de-noise, and to estimate the underlying dimensionality of the system. Finally, we also introduce a closed-form method for multi-step prediction, and a novel criterion for assessing the quality of our approximate posterior.

Methods proposed in the literature towards continual deep learning typically operate in a task-based sequential learning setup. A sequence of tasks is learned, one at a time, with all data of the current task available but not of previous or future tasks. Task boundaries and identities are known at all times. This setup, however, is rarely encountered in practical applications. Therefore we investigate how to transform continual learning to an online setup. We develop a system that keeps on learning over time in a streaming fashion, with data distributions gradually changing and without the notion of separate tasks. To this end, we build on the work on Memory Aware Synapses, and show how this method can be made online by providing a protocol to decide i) when to update the importance weights, ii) which data to use to update them, and iii) how to accumulate the importance weights at each update step. Experimental results show the validity of the approach in the context of two applications: (self-)supervised learning of a face recognition model by watching soap series and teaching a robot to avoid collisions.

**Fast convergence rates of deep neural networks for classification**

We derive fast convergence rates of a deep neural network (DNN) classifier with the rectified linear unit (ReLU) activation function learned using the hinge loss. We consider three cases for a true model: (1) a smooth decision boundary, (2) smooth conditional class probability, and (3) the margin condition (i.e., the probability of inputs near the decision boundary is small). We show that the DNN classifier learned using the hinge loss achieves fast convergence rates for all three cases provided that the architecture (i.e., the number of layers, number of nodes, and sparsity) is carefully selected. An important implication is that DNN architectures are very flexible for use in various cases without much modification. In addition, we consider a DNN classifier learned by minimizing the cross-entropy, and show that the DNN classifier achieves a fast convergence rate under the condition that the conditional class probabilities of most data are sufficiently close to either 1 or 0. This assumption is not unusual for image recognition because human beings are extremely good at recognizing most images. To confirm our theoretical explanation, we present the results of a small numerical study conducted to compare the hinge loss and cross-entropy.

Changepoint detection methods are used in many areas of science and engineering, e.g., in the analysis of copy number variation data, to detect abnormalities in copy numbers along the genome. Despite the broad array of available tools, methodology for quantifying our uncertainty in the strength (or presence) of given changepoints, post-detection, is lacking. Post-selection inference offers a framework to fill this gap, but the most straightforward application of these methods results in low-powered tests and leaves open several important questions about practical usability. In this work, we carefully tailor post-selection inference methods towards changepoint detection, focusing as our main scientific application on copy number variation data. As for changepoint algorithms, we study binary segmentation, and two of its most popular variants, wild and circular, and the fused lasso. We implement some of the latest developments in post-selection inference theory: we use auxiliary randomization to improve power, which requires implementations of MCMC algorithms (importance sampling and hit-and-run sampling) to carry out our tests. We also provide recommendations for improving practical usability, detailed simulations, and an example analysis on array comparative genomic hybridization (CGH) data.
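For readers unfamiliar with the base algorithms, a single step of binary segmentation (without any of the post-selection inference machinery the paper contributes) can be sketched as follows; the simulated data are illustrative only:

```python
import numpy as np

def best_split(x):
    """Return the split index maximizing the reduction in squared error when
    fitting separate means to x[:k] and x[k:] (one binary-segmentation step)."""
    n = len(x)
    total = np.sum((x - x.mean()) ** 2)
    best_k, best_gain = None, -np.inf
    for k in range(1, n):
        left, right = x[:k], x[k:]
        gain = total - (np.sum((left - left.mean()) ** 2)
                        + np.sum((right - right.mean()) ** 2))
        if gain > best_gain:
            best_k, best_gain = k, gain
    return best_k, best_gain

rng = np.random.default_rng(1)
# Mean shift from 0 to 2 at position 50, with small noise.
x = np.concatenate([rng.normal(0.0, 0.1, 50), rng.normal(2.0, 0.1, 50)])
k, gain = best_split(x)
print(45 <= k <= 55)  # → True (split found at the simulated changepoint)
```

The full algorithm applies this step recursively to each segment until a stopping rule fires; quantifying uncertainty in the selected `k` is exactly the post-selection question the paper addresses.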

**Ramp-based Twin Support Vector Clustering**

Traditional plane-based clustering methods measure within-cluster and between-cluster costs with quadratic, linear or other unbounded functions, which may amplify the impact of the cost. This letter introduces a ramp cost function into plane-based clustering to propose a new clustering method, called ramp-based twin support vector clustering (RampTWSVC). RampTWSVC is more robust because of its boundedness, and thus it is easier to find the intrinsic clusters than with other plane-based clustering methods. The non-convex programming problem in RampTWSVC is solved efficiently through an alternating iteration algorithm, and its local solution can be obtained in a finite number of iterations theoretically. In addition, the nonlinear manifold-based formulation of RampTWSVC is also proposed via the kernel trick. Experimental results on several benchmark datasets show the better performance of RampTWSVC compared with other plane-based clustering methods.

**Functional Design of Computation Graph**

Representing the control flow of a computer program as a computation graph can bring many benefits in a broad variety of domains where performance is critical. This technique is a core component of most major numerical libraries (TensorFlow, PyTorch, Theano, MXNet,…) and is successfully used to speed up and optimise many computationally-intensive tasks. However, different design choices in each of these libraries lead to noticeable differences in efficiency and in the way an end user writes efficient code. In this report, we detail the implementation and features of the computation graph support in OCaml’s numerical library Owl, a recent entry in the world of scientific computing.

**Sufficient Dimension Reduction for Classification**

We propose a new sufficient dimension reduction approach designed deliberately for high-dimensional classification. This novel method is named maximal mean variance (MMV), inspired by the mean variance index first proposed by Cui, Li and Zhong (2015), which measures the dependence between a categorical random variable with multiple classes and a continuous random variable. Our method requires reasonably mild restrictions on the predicting variables and keeps the model-free advantage without the need to estimate the link function. The consistency of the MMV estimator is established under regularity conditions for both fixed and diverging dimension (p) cases and the number of the response classes can also be allowed to diverge with the sample size n. We also construct the asymptotic normality for the estimator when the dimension of the predicting vector is fixed. Furthermore, our method works pretty well when n < p. The surprising classification efficiency gain of the proposed method is demonstrated by simulation studies and real data analysis.

We consider a sequence of successively more restrictive definitions of abstraction for causal models, starting with a notion introduced by Rubenstein et al. (2017) called exact transformation that applies to probabilistic causal models, moving to a notion of uniform transformation that applies to deterministic causal models and does not allow differences to be hidden by the ‘right’ choice of distribution, and then to abstraction, where the interventions of interest are determined by the map from low-level states to high-level states, and strong abstraction, which takes more seriously all potential interventions in a model, not just the allowed interventions. We show that procedures for combining micro-variables into macro-variables are instances of our notion of strong abstraction, as are all the examples considered by Rubenstein et al.

**Regularization by architecture: A deep prior approach for inverse problems**

The present paper studies the so-called deep image prior (DIP) technique in the context of inverse problems. DIP networks have been introduced recently for applications in image processing, and first experimental results for applying DIP to inverse problems have also been reported. This paper aims at discussing different interpretations of DIP and at obtaining analytic results for specific network designs and linear operators. The main contribution is to introduce the idea of viewing these approaches as the optimization of Tikhonov functionals rather than optimizing networks. Besides theoretical results, we present numerical verifications for an academic example (integration operator) as well as for the inverse problem of magnetic particle imaging (MPI). The reconstructions obtained by deep prior networks are compared with state-of-the-art methods.

**Bayesian Layers: A Module for Neural Network Uncertainty**

We describe Bayesian Layers, a module designed for fast experimentation with neural network uncertainty. It extends neural network libraries with layers capturing uncertainty over weights (Bayesian neural nets), pre-activation units (dropout), activations (‘stochastic output layers’), and the function itself (Gaussian processes). With reversible layers, one can also propagate uncertainty from input to output such as for flow-based distributions and constant-memory backpropagation. Bayesian Layers are a drop-in replacement for other layers, maintaining core features that one typically desires for experimentation. As demonstration, we fit a 10-billion parameter ‘Bayesian Transformer’ on 512 TPUv2 cores, which replaces attention layers with their Bayesian counterpart.

**11**
*Tuesday*
Dec 2018

Posted in Documents

**Taxonomy of Big Data: A Survey**

Big Data is the most popular paradigm nowadays, and it has left almost no area untouched, for instance science, engineering, economics, business, social science, and government. Big Data is used to boost organizational performance using massive datasets. Data are assets of an organization, and these data give revenue to organizations. Therefore, Big Data is spawning everywhere to enhance organizations’ revenue, and many new technologies are emerging based on it. In this paper, we present a taxonomy of Big Data. Besides, we present in-depth insight into the Big Data paradigm.

**11**
*Tuesday*
Dec 2018

Posted in Distilled News

**Introduction to Python Metaclasses**

In this tutorial, learn what metaclasses are, how to implement them in Python, and how to create custom ones.
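As a taste of the topic, here is a minimal custom metaclass that registers every class it creates, a common plugin-registry pattern (the class names are invented for illustration):

```python
class Registry(type):
    """Metaclass that records every class created with it."""
    plugins = {}

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        mcls.plugins[name] = cls   # runs once per class definition
        return cls

class Base(metaclass=Registry):
    pass

class CsvLoader(Base):
    pass

class JsonLoader(Base):
    pass

# Every subclass was registered automatically at definition time.
print(sorted(Registry.plugins))  # → ['Base', 'CsvLoader', 'JsonLoader']
```

Because `__new__` on the metaclass runs when the `class` statement executes, no explicit registration call is needed anywhere in user code.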

Empowering you to use machine learning to get valuable insights from data.

• Implement basic ML algorithms and deep neural networks with PyTorch.

• Run everything on the browser without any set up using Google Colab.

• Learn object-oriented ML to code for products, not just tutorials.


**Exotic link functions for GLMs**

In my previous post on GLMs, I discussed power link functions. But there are many more link functions that can be used.

**How different are conventional programming and machine learning? Explained with a toy example**

Engineering has allowed us to push the limits of human capabilities. We used our understanding of nature and utilized it to serve our purposes, be it high-performance mechanical machinery or an encoded silicon chip. Computers have been by far one of the most intricate utilizations of nature’s forces put to work helping humans push the limits of their capabilities: many tasks which can be performed by computers could never be performed as quickly and efficiently by a human or a set of humans. As Steve Jobs would say, computers are like a bicycle for our minds.

You’ve estimated a GLM or a related model (GLMM, GAM, etc.) for your latest paper and, like a good researcher, you want to visualise the model and show the uncertainty in it. In general this is done using confidence intervals, typically with 95% coverage. If you remember a little bit of theory from your stats classes, you may recall that such an interval can be produced by adding to and subtracting from the fitted values 2 times their standard error. Unfortunately this only really works like this for a linear model. If I had a dollar (even a Canadian one) for every time I’ve seen someone present graphs of estimated abundance of some species where the confidence interval includes negative abundances, I’d be rich! Here, following the rule of ‘if I’m asked more than once I should write a blog post about it!’ I’m going to show a simple way to correctly compute a confidence interval for a GLM or a related model.
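The construction the post describes can be sketched with made-up fitted values for a Poisson GLM with a log link: build the interval on the link scale and back-transform it, instead of adding ±2 standard errors on the response scale.

```python
import numpy as np

# Hypothetical fitted values from a Poisson GLM with log link:
# eta is the linear predictor, se its standard error at each point.
eta = np.array([-2.0, -1.0, 0.0, 1.0])
se = np.array([0.8, 0.5, 0.3, 0.2])
mu = np.exp(eta)  # fitted abundance

# Wrong: interval on the response scale; it can dip below zero.
# (Delta-method SE of mu under a log link is approximately mu * se.)
naive_lower = mu - 2 * (mu * se)
print(naive_lower.min() < 0)  # → True (a negative "abundance")

# Right: interval on the link scale, then back-transform.
lower, upper = np.exp(eta - 2 * se), np.exp(eta + 2 * se)
print(bool((lower > 0).all()))  # → True
```

Because the inverse link (here `exp`) maps the whole real line to positive values, the back-transformed interval respects the range of the response by construction.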

**10 R functions for Linux commands and vice-versa**

This post will go through 10 different Linux commands and their R alternatives. If you’re interested in learning more R functions for working with files like some of those below, also check out this post.
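The post itself maps Linux commands to R functions, but the same correspondence can be sketched with Python's standard library (the directory and file names here are invented for illustration):

```python
import os
import shutil
import tempfile

# Work in a throwaway directory so the example is self-contained.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "data"))                     # mkdir -p data
open(os.path.join(root, "data", "a.txt"), "w").close()      # touch data/a.txt
shutil.copy(os.path.join(root, "data", "a.txt"),
            os.path.join(root, "data", "b.txt"))            # cp a.txt b.txt
listing = sorted(os.listdir(os.path.join(root, "data")))    # ls data
print(listing)  # → ['a.txt', 'b.txt']
shutil.rmtree(root)                                         # rm -rf
```

Each line's comment shows the shell command it mirrors; R's `dir.create`, `file.create`, `file.copy` and `list.files` play the same roles in the post.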

**How to tune a BigQuery ML classification model to achieve a desired precision or recall**

BigQuery provides an incredibly convenient way to train machine learning models on large, structured datasets. In an earlier article, I showed you how to train a classification model to predict flight delays.

**Implementing Defensive Design in AI Deployments**

With the upcoming launch of one of our AI products, there has been a repeating question that clients kept asking. This same question also shows up once in a while with our consulting engagements, to a lesser degree, but still demands an answer. The simple version of the question is this: How can I know that the AI is doing a good job? Now it’s easy to throw around confusion matrices and neural activation graphs to clients, but they have a much deeper question – and a very valid concern. They are not asking about the performance of the system, they are asking about its alignment to their own problems. If this model is now in charge of one or many of their business processes, how can they manage it if they cannot see its criteria for how it is executing its tasks? This touches on a combination of management fundamentals, business logic, and the ongoing evolution of the machine learning fields. The goal of bespoke AI solutions is to accelerate key processes that can alleviate the workload of the rest of the staff, or to make decisions in real time. As such, a system that cannot reliably execute a process within a trustworthy tolerance range might as well not be implemented at all.

**6 Emotionally Rewarding Data Science Projects**

The field of data science is best-suited for those who love mathematics and working with numbers. While some projects are tedious and monotonous, particularly on the entry level, there are plenty of exciting and rewarding jobs in the sector for qualified, experienced professionals. The dawn of big data and next-gen data analytics makes the field even more innovative and exciting by giving individuals access to more data than ever before. Since we are currently living in the Information Age, it only makes sense to use this data in fun, creative and rewarding ways.

**A ‘short’ introduction to model selection**

In this post I will discuss a topic central to the process of building good (supervised) machine learning models: model selection. This is not to say that model selection is the centerpiece of the data science workflow – without high-quality data, model building is vanity. Nevertheless, model selection plays a crucial role in building good machine learning models.

**A review of recent reinforcement learning applications to healthcare**

The application of machine learning to healthcare has yielded many great results. However, the vast majority of these focus on diagnosing or forecasting, and not explicitly on treatment. Although these can indirectly help at treating people (for example, diagnosis is the first step to finding treatment), in many cases, particularly where there are many available treatment options, figuring out the best treatment policy to use for a particular patient is challenging for human decision makers. Reinforcement learning has grown quite popular; however, the majority of papers focus on applying it to board or video games. RL performed well at learning the optimal policies in these (video/board game) contexts but has been relatively untested in real-world environments like healthcare. Naturally, RL is a good candidate for this purpose, however there are many barriers for it to work in practice. In this article I’m going to outline some of the more recent approaches as well as some of the barriers that still exist with the application of RL to healthcare. If this topic interests you, I will also go into more detail about some of these models at the PyData Orono Meetup on Reinforcement Learning in the Real World, which will be broadcast on Zoom this Wednesday 7-9:30 EST. This article assumes that you have a basic knowledge of reinforcement learning. If you don’t, I suggest reading one of the many articles already on Towards Data Science on the subject.

**Chatbots are cool! A framework using Python Part 1:Overview**

The bot framework is modularized which opens up an array of opportunities for the readers to design and implement their own features. Integrations can be done easily in the framework. Also, the probability for failure is minimal since it is designed to be plug and play.

Beginner: An overall idea on how the framework is developed and used for this specific project. You should be able to download the codes from Github and complete the setup successfully. This includes package installations, slack and IBM Watson account creation and setup, run one time files to generate the links and movie recommendations. You can add extra skills in IBM Watson (like a small talk which generate static responses) and see the results in slack environment.

Intermediate: You should be able to use this framework as a template to design your own chatbot for a different domain. In addition, you can extend the chatbot’s knowledge base by adding new data sources, which involves writing code to connect to different databases (Elasticsearch, SQL databases, Excel, and so on). You can also add extra NLP features to the bot and see the results in the Slack environment.

Expert: You should be able to add or extend bot features by integrating API connections for Slack and NLP. I used IBM Watson to identify the question category and to generate static responses; you can replace IBM Watson in the framework by designing your own NLP capabilities. You can also extend the bot’s integrations to other platforms (web, Skype, and so on).
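
The plug-and-play skill idea described above can be sketched in a few lines. The skill names and routing below are my own illustrative assumptions, not the article’s actual code or the IBM Watson API:

```python
# Minimal sketch of a modular, plug-and-play chatbot core.
# Skill classes and responses are hypothetical stand-ins for the
# framework's real Slack/Watson integrations.

class Skill:
    """A pluggable bot capability: decides whether it handles a message."""
    def can_handle(self, message: str) -> bool:
        raise NotImplementedError
    def respond(self, message: str) -> str:
        raise NotImplementedError

class SmallTalkSkill(Skill):
    GREETINGS = ("hi", "hello", "hey")
    def can_handle(self, message):
        return message.lower().strip() in self.GREETINGS
    def respond(self, message):
        return "Hello! Ask me for a movie recommendation."

class MovieSkill(Skill):
    def can_handle(self, message):
        return "movie" in message.lower()
    def respond(self, message):
        return "Try 'The Matrix' (1999)."  # stand-in for a real recommender

class Bot:
    """Routes each message to the first skill that claims it."""
    def __init__(self, skills):
        self.skills = list(skills)
    def add_skill(self, skill):  # plug and play: register new skills at runtime
        self.skills.append(skill)
    def handle(self, message):
        for skill in self.skills:
            if skill.can_handle(message):
                return skill.respond(message)
        return "Sorry, I don't understand yet."

bot = Bot([SmallTalkSkill(), MovieSkill()])
print(bot.handle("hi"))
print(bot.handle("recommend a movie"))
```

Swapping IBM Watson for your own NLP, as the Expert level suggests, amounts to writing a new `Skill` whose `can_handle`/`respond` call your classifier instead.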

As you have probably noticed, AI is currently a ‘hot topic’: media coverage and public discussion about AI are almost impossible to avoid. However, you may also have noticed that AI means different things to different people. For some, AI is about artificial life-forms that can surpass human intelligence; for others, almost any data processing technology can be called AI. To set the scene, so to speak, we’ll discuss what AI is, how it can be defined, and what other fields or technologies are closely related to it. Before we do so, however, we’ll highlight three applications of AI that illustrate its different aspects.

**The Hidden Dangers in Algorithmic Decision Making**

The quiet revolution of artificial intelligence looks nothing like the way movies predicted; AI seeps into our lives not by overtaking them as sentient robots, but by steadily creeping into areas of decision-making that were previously exclusive to humans. Because it is so hard to spot, you might not have even noticed how much of your life is influenced by algorithms. Picture this: this morning, you woke up, reached for your phone, and checked Facebook or Instagram, where you consumed media from a content feed created by an algorithm. Then you checked your email; only the messages that matter, of course. Everything negligible was automatically dumped into your spam or promotions folder. You may have listened to a new playlist on Spotify that was suggested to you based on the music you’d previously shown interest in. You then proceeded with your morning routine before getting in your car and using Google Maps to see how long your commute would take today. In the span of half an hour, the content you consumed, the music you listened to, and your ride to work relied on brain power other than your own: predictive modelling from algorithms. Machine learning is here. Artificial intelligence is here. We are right in the midst of the information revolution, and while it’s an incredible time and place to be in, one must be wary of the implications that come along with it. Having a machine tell you how long your commute will be, what music you should listen to, and what content you would likely engage with are all relatively harmless examples. But while you’re scrolling through your Facebook newsfeed, an algorithm somewhere is determining someone’s medical diagnosis, their parole eligibility, or their career prospects. At face value, machine learning algorithms look like a promising solution for mitigating the wicked problem that is human bias, and all the ways it can negatively impact the lives of millions of people.
The idea is that the algorithms in AI are capable of being more fair and efficient than humans ever could be. Companies, governments, organizations, and individuals worldwide are handing off decision-making for many reasons: it’s more reliable, easier, less costly, and more time-efficient. However, there are still some concerns to be aware of.

**Who Do We Blame When an AI Finally Kills Somebody?**

We’re rapidly approaching the point where AI will be so pervasive that it’s inevitable someone will be injured or killed by it. If you thought this was covered by simple product defect warranties, it’s not at all that clear. Here’s what we need to start thinking about.

**11**
*Tuesday*
Dec 2018

Posted Books

in