When I think about artificial intelligence, I fall into the tricky habit of conflating understanding with capability. I imagine we can tell how much a machine knows by what it can produce. The interesting thing to me, however, is that machines don’t actually need to understand anything to produce a great deal computationally. Popular science blurs this idea with a lot of conversation around things like the Turing test. Movies like Ex Machina and Blade Runner give us the sense that machines can perhaps take on some understanding of the world around them. These ideas in our zeitgeist color how we perceive the capability of tools used in industry today. Through some light data science exploration, I will try to unpack one challenge still commonly faced in marketing today – summarizing content without reading all of it.
In this story, Shake-Shake Regularization (Shake-Shake), by Xavier Gastaldi from London Business School, is briefly reviewed. The motivation of this paper is that, since data augmentation is applied to the input image, it might also be possible to apply data augmentation techniques to internal representations. Prior work found that adding noise to the gradient during training helps the training and generalization of complicated neural networks. Shake-Shake regularization can be seen as an extension of this concept, where gradient noise is replaced by a form of gradient augmentation. This is a paper in the 2017 ICLR Workshop with over 10 citations, and the long version on arXiv (2017) has received over 100 citations. (Sik-Ho Tsang @ Medium)
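To make the idea concrete, here is a minimal NumPy sketch of the Shake-Shake combination rule – an illustration only, not the paper's actual implementation, which operates on convolutional residual branches inside a ResNet. Two branch outputs are mixed with a random convex weight alpha at training time and with the expectation (alpha = 0.5) at test time; in the paper a second, independent coefficient is also drawn on the backward pass, which is the "gradient augmentation" part and is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)

def shake_shake_forward(branch1_out, branch2_out, training=True):
    """Combine two residual-branch outputs with a random convex weight.

    At training time a fresh alpha ~ U(0, 1) is drawn per pass, so the
    network sees a different mix of the two branches each time; at test
    time the expectation (alpha = 0.5) is used.
    """
    alpha = rng.uniform() if training else 0.5
    return alpha * branch1_out + (1.0 - alpha) * branch2_out

# Toy branch outputs standing in for two residual branches.
x = np.ones(4)
b1, b2 = 2.0 * x, 4.0 * x
train_out = shake_shake_forward(b1, b2, training=True)   # random mix in [2, 4]
test_out = shake_shake_forward(b1, b2, training=False)   # always 3.0
```

Because alpha is resampled on every forward pass, the two branches can never rely on a fixed weighting of each other, which is what gives the stochastic regularizing effect.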
Deep learning has achieved huge success in a variety of fields in recent years, with much of it coming from its ability to automate the frequently tedious and difficult feature engineering phase by learning ‘hierarchical feature extractors’ from data. However, architecture design (i.e., the process of creating the shape and functionality of a neural network) remains a long and difficult process that has mainly been done manually, so innovation is limited and most progress has come from old algorithms that perform remarkably well with today’s computing resources and data. Another issue is that deep neural networks are mainly optimized by gradient-following algorithms (e.g., SGD, RMSProp), which are a great way to constrain the search space but are susceptible to getting trapped by local optima, saddle points, and noisy gradients, especially in dense solution areas such as reinforcement learning. This article reviews how evolutionary algorithms have been proposed and tested as a competitive alternative to address these problems.
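To illustrate the gradient-free alternative, here is a minimal evolution-strategy sketch – a toy example of my own, not a method from any particular paper. A population of candidate solutions is repeatedly scored, truncated to an elite set, and mutated with Gaussian noise; no gradients are ever computed, which is why such methods are indifferent to noisy or deceptive fitness landscapes.

```python
import numpy as np

rng = np.random.default_rng(42)

def evolve(fitness, dim=5, pop_size=20, elite=5, sigma=0.1, generations=200):
    """Minimal truncation-selection evolution strategy (minimization).

    Keeps the best `elite` candidates each generation and produces the
    next population by mutating random parents with Gaussian noise.
    All hyperparameters here are arbitrary toy choices.
    """
    pop = rng.normal(size=(pop_size, dim))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        parents = pop[np.argsort(scores)[:elite]]
        # Each child is a mutated copy of a randomly chosen elite parent.
        idx = rng.integers(elite, size=pop_size)
        pop = parents[idx] + sigma * rng.normal(size=(pop_size, dim))
    scores = np.array([fitness(ind) for ind in pop])
    return pop[np.argmin(scores)]

# Minimize a simple sphere function f(w) = sum(w^2); optimum is at 0.
best = evolve(lambda w: float(np.sum(w ** 2)))
```

Note that the fitness function is treated as a black box: the same loop would work unchanged if `fitness` were, say, the episodic return of a reinforcement learning policy.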
Which is better: Random Forest or Neural Network? This is a common question, with a very easy answer: it depends 🙂 I will try to show you when it is good to use a Random Forest and when to use a Neural Network. First of all, Random Forest (RF) and Neural Network (NN) are different types of algorithms. The RF is an ensemble of decision trees. Each decision tree in the ensemble processes the sample and predicts the output label (in the case of classification). The decision trees in the ensemble are independent: each can predict the final response on its own. The Neural Network is a network of connected neurons. The neurons cannot operate without other neurons – they are connected. Usually they are grouped in layers; the neurons in each layer process data and pass it forward to the next layer. The last layer of neurons makes the decision.
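A hypothetical side-by-side in scikit-learn (the dataset, tree count, and layer size are arbitrary choices for illustration) shows how differently the two are set up in practice: the RF works directly on raw features, while the NN is wrapped with feature scaling, which it usually needs to train well.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Random Forest: an ensemble of independent decision trees.
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Neural Network: layers of connected neurons; inputs are scaled first.
nn = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
).fit(X_tr, y_tr)

rf_acc = rf.score(X_te, y_te)
nn_acc = nn.score(X_te, y_te)
```

On a small tabular dataset like this, both typically reach similar accuracy – which is exactly the "it depends" point: the deciding factors are usually data size, feature types, and tuning effort rather than raw accuracy.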
In this article, we walk through the steps of Full Stack Deep Learning according to the FSDL course from March 2019. First, we need to set up and plan the project: in this step we define the goals, metrics, and baseline. Then we collect the data and label it with the available tools. In building the codebase, there are tools that help maintain the quality of the project, as described above. Then we do modeling, along with testing and debugging. Once the model meets the requirements, we finally cover the steps and tools for deploying the application to the desired interface and monitoring it.
Machine learning is aimed at the automatic extraction of semantic-level information from potentially raw and unstructured data. A key challenge in building intelligent systems lies in the ability to extract and fuse information from multiple sources. In the present thesis, this challenge is addressed by using representation learning, which has been one of the most important innovations in machine learning in the last decade. Representation learning is the basis for modern approaches to natural language processing and artificial neural networks, in particular deep learning, which includes popular models such as convolutional neural networks (CNN) and recurrent neural networks (RNN). It has been shown that many approaches to tensor decomposition and multi-way models can also be related to representation learning. Tensor decompositions have been applied to a variety of tasks, e.g., knowledge graph modeling and electroencephalography (EEG) data analysis. In this thesis, we focus on machine learning models based on recent representation learning techniques, which can combine information from multiple channels by exploiting their inherent multi-channel data structure.
Recently, I came across the ggalluvial package in R. This package is used primarily to visualize categorical data. As usual, I will use it with medical data from NHANES. ggalluvial is a great choice when visualizing more than two variables within the same plot.
Deep neural networks are one of the most powerful classes of machine learning models. With enough data, their accuracy at tasks such as computer vision and natural language processing (NLP) is unmatched. The one drawback many scientists will point to is that these networks are complete black boxes. We still have very little insight into how deep networks learn their target patterns so well – in particular, how all the neurons work together to achieve the final result.
Recently we have discussed the emerging concept of smart farming, which makes agriculture more efficient and effective with the help of high-precision algorithms. The mechanism that drives it is machine learning – the scientific field that gives machines the ability to learn without being explicitly programmed. It has emerged together with big data technologies and high-performance computing to create new opportunities to unravel, quantify, and understand data-intensive processes in agricultural operational environments.
Joe, a good family friend, dropped by earlier this week. As we often do, we discussed the weather (it seems to be hotter than normal already here in the Pacific Northwest), the news (mostly about how we are both taking actions to avoid the news), and our kids. Both of us have children who really enjoy playing with Legos®. And with Legos inevitably comes the intense pain of stepping on them, usually in the middle of the night or first thing in the morning on the way to make coffee. Stepping on lingering Legos seems to happen despite Joe and I both following behind our children, picking up every Lego we can find that they left behind. Joe and I keep batting around ways to decrease the chance of stepping on Legos. After some time, I suggest we might be able to use probability and statistics to estimate the probability that Legos remain after our sweeps behind the kids. Joe says he’s on board: ‘Anything – my feet cannot take any more!’ I fire up my favorite tools for estimating probabilities, and Joe and I get started on ways to estimate the likelihood that Legos missed by our children survive our pick-up sweeps.
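As a preview of the kind of estimate Joe and I are after, here is a toy Monte Carlo sketch. All of the numbers – 10 Legos on the floor, a 60% chance of spotting each one on a given sweep, two sweeps – are made up for illustration, and each Lego is assumed to be spotted independently.

```python
import random

random.seed(0)

def prob_legos_remain(n_legos=10, p_spot=0.6, sweeps=2, trials=100_000):
    """Monte Carlo estimate of P(at least one Lego survives all sweeps).

    Assumes each sweep independently spots (and removes) each remaining
    Lego with probability p_spot. All parameter values are made up.
    """
    remain = 0
    for _ in range(trials):
        # A Lego survives if it goes unspotted on every sweep.
        left = sum(1 for _ in range(n_legos)
                   if all(random.random() >= p_spot for _ in range(sweeps)))
        remain += left > 0
    return remain / trials

est = prob_legos_remain()
# Closed form for comparison: 1 - (1 - (1 - p_spot)**sweeps)**n_legos
exact = 1 - (1 - 0.4 ** 2) ** 10
```

Even with a decent per-sweep spotting rate, the closed form shows the chance of at least one survivor stays uncomfortably high – which matches both our feet's experience and motivates estimating these quantities from data rather than guessing.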
Still waiting for ML training to be over? Tired of running experiments manually? Not sure how to reproduce results? Wasting too much of your time on devops and data wrangling? It’s okay if you’re a hobbyist, but data science models are meant to be incorporated into real business applications. Businesses won’t invest in data science if they don’t see a positive ROI. This calls for the adoption of an ‘engineered’ approach – otherwise it is no more than a glorified science project with data. Engineers use microservices and automated CI/CD (continuous integration and deployment) in modern agile development. You write code, push it, and it gets tested automatically, at scale, on some cluster. If it passes the tests, it goes into some type of beta/canary testing phase, and from there into production. Kubernetes, a cloud-native cluster orchestrator, is now widely used by developers and devops teams to build an agile application-delivery environment.