For my lectures on applied linear models, I wanted to illustrate that R^2 is never, on its own, a good measure of the goodness of a model, since it is quite easy to inflate. Consider the following dataset …
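A minimal sketch of the point (in Python rather than R, with hypothetical random data): in-sample R^2 can only go up as you add predictors, even predictors that are pure noise.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=(n, 1))                # one genuine predictor
y = 2 * x[:, 0] + rng.normal(size=n)       # true relationship plus noise
junk = rng.normal(size=(n, 40))            # predictors unrelated to y

r2 = []
for k in [0, 10, 20, 40]:
    Xk = np.hstack([x, junk[:, :k]])       # nested designs: same x, more junk
    r2.append(LinearRegression().fit(Xk, y).score(Xk, y))

print([round(v, 3) for v in r2])           # in-sample R^2 only increases
```

Because each design is nested in the next, ordinary least squares can never fit worse with the extra columns, so R^2 is non-decreasing even though the added predictors carry no information about y.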
R users have been enjoying the benefits of SQL query generators for quite some time, most notably using the dbplyr package. I would like to talk about some features of our own rquery query generator, concentrating on derived result re-use.
This post includes a rework of all presentations of ‘Elements of Neural Networks and Deep Learning, Parts 1-8’, since my earlier presentations had some omissions and occasional errors; I have re-recorded them all. This series of presentations does a deep dive into Deep Learning networks, starting from the fundamentals. The equations required for performing learning in an L-layer Deep Learning network are derived in detail, starting from the basics. The presentations also discuss multi-class classification, regularization techniques, and gradient descent optimization methods in deep networks. Finally, they touch on how Deep Learning networks can be tuned.
Among the factors that will leave a lasting impact on our discipline, technology is certainly one of the main vectors at play. The introduction of technological solutions at every step of the value chain has already significantly transformed Architecture. The conception of buildings has in fact already begun a slow transformation: first by leveraging new construction techniques, then by developing adequate software, and today by introducing statistical computing capabilities (including Data Science & AI). Rather than a disruption, we see here a continuity that has led Architecture through successive evolutions up to the present day. Modularity, Computational Design, Parametricism and finally Artificial Intelligence are, to us, the four intertwined steps of a slow-paced transition. Beyond the historical background, we posit that this evolution is the wireframe of a radical improvement in architectural conception.
Kaggle’s Don’t Overfit II competition presents an interesting problem. We have 20,000 rows of continuous variables, with only 250 of them belonging to the training set. The challenge is not to overfit. With such a small dataset, and an even smaller training set, this can be a difficult task! In this article, we’ll explore hyperparameter optimization as a means of preventing overfitting.
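One common version of this idea, sketched here with hypothetical stand-in data of the same shape as the competition (250 rows, many noisy features), is to cross-validate the regularization strength of a sparse model: heavier L1 penalties shrink most coefficients to zero, which is one defence against overfitting a tiny training set.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Stand-in for the competition data: 250 rows, 300 mostly-noise features
X, y = make_classification(n_samples=250, n_features=300, n_informative=10,
                           random_state=0)

# Smaller C means a stronger L1 penalty; cross-validation picks the
# strength that generalizes best rather than the one that fits best
grid = GridSearchCV(
    LogisticRegression(penalty="l1", solver="liblinear"),
    param_grid={"C": [0.01, 0.05, 0.1, 0.5, 1.0]},
    cv=StratifiedKFold(5, shuffle=True, random_state=0),
    scoring="roc_auc",
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

The key design choice is that model selection is driven entirely by out-of-fold scores, so a hyperparameter setting that merely memorizes the 250 training rows is penalized.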
Hawkes processes are a particularly interesting class of stochastic process that have been applied in diverse areas, from earthquake modelling to financial analysis. They are point processes whose defining characteristic is that they ‘self-excite’, meaning that each arrival increases the rate of future arrivals for some period of time. Hawkes processes are well established, particularly within the financial literature, yet many of the treatments are inaccessible to one not acquainted with the topic. This survey provides background, introduces the field and historical developments, and touches upon all major aspects of Hawkes processes.
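To make ‘self-excitation’ concrete, here is a small simulation sketch using Ogata’s thinning algorithm for the common exponential-kernel Hawkes process, with conditional intensity λ(t) = μ + Σ_{t_i < t} α·exp(−β(t − t_i)); the parameter values are illustrative, not from the survey.

```python
import numpy as np

def simulate_hawkes(mu, alpha, beta, horizon, rng):
    """Ogata's thinning for a Hawkes process with exponential kernel.
    Each accepted arrival raises the intensity by alpha, which then
    decays at rate beta, so arrivals cluster in time."""
    t, events = 0.0, []
    while t < horizon:
        # Between events the intensity only decays, so the current
        # intensity is a valid upper bound for the next candidate.
        lam_bar = mu + sum(alpha * np.exp(-beta * (t - ti)) for ti in events)
        t += rng.exponential(1.0 / lam_bar)
        if t >= horizon:
            break
        lam_t = mu + sum(alpha * np.exp(-beta * (t - ti)) for ti in events)
        if rng.uniform() <= lam_t / lam_bar:   # accept with prob lambda(t)/bar
            events.append(t)
    return events

rng = np.random.default_rng(1)
arrivals = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.2, horizon=100.0, rng=rng)
print(len(arrivals), "arrivals")
```

Since α/β < 1 here, the process is stationary; on average it produces μ/(1 − α/β) arrivals per unit time, visibly clustered compared with a Poisson process of the same rate.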
Marginal structural models (MSMs) are a new class of causal models for the estimation, from observational data, of the causal effect of a time-dependent exposure in the presence of time-dependent covariates that may be simultaneously confounders and intermediate variables. The parameters of an MSM can be consistently estimated using a new class of estimators: the inverse-probability-of-treatment weighted (IPTW) estimators. MSMs are an alternative to structural nested models (SNMs), the parameters of which are estimated through the method of g-estimation.
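The full MSM machinery handles time-varying treatments; as a minimal illustration of the IPTW idea only, here is the single-time-point special case on simulated data, where each subject is weighted by the inverse of the probability of the treatment they actually received given the confounder.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000
L = rng.normal(size=n)                       # confounder
A = rng.uniform(size=n) < 1 / (1 + np.exp(-1.5 * L))  # treatment depends on L
Y = 2.0 * A + 3.0 * L + rng.normal(size=n)   # true causal effect of A is 2

# Naive comparison of treated vs untreated is confounded by L
naive = Y[A].mean() - Y[~A].mean()

# IPTW: estimate propensity scores, weight by 1 / P(observed treatment | L)
ps = LogisticRegression().fit(L.reshape(-1, 1), A).predict_proba(
    L.reshape(-1, 1))[:, 1]
w = np.where(A, 1 / ps, 1 / (1 - ps))
iptw = np.average(Y[A], weights=w[A]) - np.average(Y[~A], weights=w[~A])
print(round(naive, 2), round(iptw, 2))
```

The weighting creates a pseudo-population in which treatment is independent of L, so the weighted contrast recovers the causal effect (near 2) while the naive contrast does not; in the time-varying case of the paper, the weight is a product of such terms over time.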
Artificial Intelligence and Machine Learning are going to be our biggest helpers in the coming decade! This morning, I was reading an article which reported that an AI system won against 20 lawyers, and the lawyers were actually happy that AI can take care of the repetitive part of their roles and help them work on complex topics. These lawyers were happy that AI will enable them to have more fulfilling roles. Today, I will be sharing a similar example: how to count the number of people in a crowd using Deep Learning and Computer Vision. But before we do that, let us develop a sense of how easy life is for a Crowd Counting Scientist.
It wasn’t too long ago when somebody said to me, ‘You do reports when you get to doing them.’ To me, this position is most defensible if the reports are for bookkeeping purposes. I pointed out one day that my reports are for management purposes, and for this reason timeliness is important. For instance, when one is driving a car and needs to turn at the next right, turning five lights later is hardly helpful. Timing counts. The ‘active’ process of driving requires timely information. The ‘passive’ process of putting records away might not require the same level of timeliness. When an accounting company handles financial records after the fact, in all likelihood it is too late to adjust the day-to-day routine in a way that significantly alters the unfolding of events. In this blog, I will consider a more pressing use of data.
This data is censored: all family income above $155,000 is recorded as $155,000. A further explanation of censored and truncated data can be found here. Because of this censoring, it would be incorrect to use the variable as a continuous predictor. That does not mean the data cannot be used as a predictor: it can be converted into a categorical variable. How can we determine the number of categories and the increments of income that fall into each category?
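One simple way to do the conversion, sketched here in Python with hypothetical income values: bin the continuous range below the censoring point, and give the censored $155,000 records a category of their own so they are not mixed in with genuinely observed incomes.

```python
import pandas as pd

# Hypothetical censored incomes: everything above $155,000 appears as 155000
income = pd.Series([12_000, 48_000, 76_000, 103_000,
                    155_000, 155_000, 89_000, 155_000])

# Left-closed bins up to the censoring point, plus one open-ended top bin
# that captures exactly the censored values
bins = [0, 25_000, 50_000, 75_000, 100_000, 125_000, 155_000, float("inf")]
labels = ["<25k", "25-50k", "50-75k", "75-100k",
          "100-125k", "125-155k", "155k+ (censored)"]
income_cat = pd.cut(income, bins=bins, labels=labels, right=False)
print(income_cat.value_counts().sort_index())
```

The bin edges here are illustrative; the article’s question of how many categories to use, and of what width, is exactly the choice being made in the `bins` list.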
Image recognition and classification is a rapidly growing field in the area of machine learning. In particular, object recognition is a key feature of image classification, and the commercial implications of this are vast.
It is sometimes the case that a random variable depends on another random variable. For example, on some slot machines, the number of spins of the bonus wheel depends on the number of spin/bonus icons you achieve on the slots wheel itself. If you get 1 spin icon you get to spin the wheel once; if you get 2 spin icons, you get to spin the wheel twice; and so on. If there is a certain probability of winning the bonus each time you spin the wheel, and the wins from each spin are independent, then the more times you’re allowed to spin it, the better your chance of winning the bonus. But the number of initial spin icons determines the number of times you’re allowed to spin the wheel, which in turn gives you more chances to win. The question is: what is the probability of winning the bonus each time you play? Additionally, suppose the probability distribution of the slots is known but that of the wheel is not. Given observations on the history of the games played and their results, it is possible to estimate the probability distribution of the wheel that would most likely have produced such data. We will approach this part of the problem in a Bayesian context. Below, I frame a similar problem with a wheel and coin flips to simplify the work a little.
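Both halves of the setup fit in a few lines; all the numbers below are hypothetical illustrations, not the post’s actual machine. The overall win probability follows from the law of total probability, and the unknown wheel probability gets a standard conjugate Beta update.

```python
# Hypothetical slot distribution: probability of landing n spin icons
p_icons = {0: 0.50, 1: 0.30, 2: 0.15, 3: 0.05}
p_win_per_spin = 0.2          # assumed per-spin chance the wheel pays out

# Law of total probability: with n spins you win unless all n spins miss,
# so P(bonus) = sum_n P(N = n) * (1 - (1 - p)^n)
p_bonus = sum(pn * (1 - (1 - p_win_per_spin) ** n)
              for n, pn in p_icons.items())
print(round(p_bonus, 4))

# Bayesian guess at an unknown wheel probability: a flat Beta(1, 1) prior
# updated with hypothetical observed wheel spins (7 wins in 40 spins)
wins, spins = 7, 40
posterior_mean = (1 + wins) / (2 + spins)
print(round(posterior_mean, 4))
```

With these made-up numbers the compound win probability works out to 0.1384, and the posterior mean for the wheel is (1 + 7) / (2 + 40) ≈ 0.19; the same two calculations carry over to the wheel-and-coin version of the problem.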
There are several technology and business forces in play that are going to drive new sources of customer, product and operational value. To set up this blog on the Economic Value of Data Science, let’s review some of those driving forces.
The imminent danger with Artificial Intelligence has nothing to do with machines becoming too intelligent. It has to do with machines inheriting the stupidity of people.
The open source software Qresp ‘Curation and Exploration of Reproducible Scientific Papers’ facilitates the organization, annotation and exploration of data presented in scientific papers.
AIL is a modular framework to analyse potential information leaks from unstructured data sources, such as pastes from Pastebin or similar services, or from unstructured data streams. The AIL framework is flexible and can be extended to support other functionalities for mining or processing sensitive information (e.g. data leak prevention).
In the world of data science, we define bias as the phenomenon in which a system overgeneralizes from its data and learns the wrong thing. When this happens, the usual first reaction is to point fingers at the data or the training process, saying ‘this data is bad’ or ‘I should further tune my hyperparameters.’ Sure, that could be the problem. However, before spending more time and processing power, I’d like to invite you to stop, take a step back, and think about how the data we are using came to be, and more importantly, to reason about how we are interpreting it.