While going through some of my documents, I came across a paper I co-wrote in college that I thought would be perfect to share. For context, this paper was written for one of my classes in which we had a project sponsored by a company. In my case, I worked with Matthew Bussing, Kai Nichols, and Sidney Johnson on a project sponsored by ProKarma to try to automate customer satisfaction surveys. This paper was originally published on our course website here. It has been adapted to fit this format, and the original is embedded at the end of the article. Enjoy!
A look at data integrity, lifecycle, and security in computerized analytical systems. Assurance of data security, integrity, and privacy is required by regulators and is essential for most industries. However, ensuring security is becoming harder for the bioanalytical laboratories of today. This challenge is heightened by the growing complexity and size of datasets, which have recently expanded to span multiple geographies, analytical techniques, business models, and regulatory frameworks. It has become more important than ever that laboratory and information technology managers be proactive in organizing, securing, and protecting their data. This piece discusses the issues related to data integrity and security when using typical computerized analytical systems. Although it focuses on bioanalytical and clinical laboratories, similar considerations apply to food testing, environmental testing, forensic, and other laboratories. Data integrity is defined as: ‘The degree to which a collection of data is complete, consistent, and accurate.’
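One common building block for the kind of data-integrity assurance described above is recording a cryptographic checksum when a data file is created, and re-checking it later to detect silent corruption or tampering. Here is a minimal sketch using Python's standard `hashlib` module; the function names and the idea of storing a digest alongside each file are illustrative assumptions, not something prescribed by the article.

```python
import hashlib

def file_sha256(path, chunk_size=65536):
    """Compute the SHA-256 digest of a file, reading it in chunks
    so that large instrument data files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path, expected_hex):
    """Return True if the file's current digest matches the digest
    recorded when the file was first archived."""
    return file_sha256(path) == expected_hex
```

Any later modification to the file, even a single byte, changes the digest, so a periodic `verify` pass can flag records whose contents no longer match what was originally archived.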
Graphs are mathematical structures used to model pairwise relations between objects. A graph is made up of vertices which are connected by edges. In this article, I will find the shortest path between two vertices of an undirected graph. Q-learning is a model-free reinforcement learning algorithm. The goal of Q-learning is to learn a policy, which tells an agent what action to take under what circumstances. It does not require a model of the environment, and it can handle problems with stochastic transitions and rewards without requiring adaptations.
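The idea of using Q-learning for shortest paths can be sketched in a few lines of tabular Q-learning: treat each vertex as a state, each neighbor as an action, and give a reward of -1 per move so that shorter paths score higher. The graph, rewards, and hyperparameters below are illustrative assumptions, not taken from the article.

```python
import random

# A small undirected graph as an adjacency list (assumed for illustration).
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

def q_learn_shortest_path(graph, start, goal, episodes=500,
                          alpha=0.5, gamma=0.9, eps=0.2):
    # Q[(state, action)] estimates the value of moving state -> neighbor.
    Q = {(s, a): 0.0 for s in graph for a in graph[s]}
    for _ in range(episodes):
        s, steps = start, 0
        while s != goal and steps < 100:
            # Epsilon-greedy choice among the current vertex's neighbors.
            if random.random() < eps:
                a = random.choice(graph[s])
            else:
                a = max(graph[s], key=lambda n: Q[(s, n)])
            # -1 per move encourages short paths; reaching the goal ends the walk.
            r = 0.0 if a == goal else -1.0
            best_next = 0.0 if a == goal else max(Q[(a, n)] for n in graph[a])
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s, steps = a, steps + 1
    # Read off a path by following the greedy policy.
    path, s = [start], start
    while s != goal and len(path) < 100:
        s = max(graph[s], key=lambda n: Q[(s, n)])
        path.append(s)
    return path
```

On this toy graph the learned greedy policy recovers a shortest route such as A → B → D → E. For a known, deterministic graph, classic algorithms like BFS or Dijkstra are of course far cheaper; the point of the Q-learning framing is that it needs no global model of the graph, only local transitions and rewards.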
Why true AI requires more than pattern recognition. When I hear news about ‘AI’ these days, what is often meant are methods for pattern recognition and approximations of complex functions, most importantly in the form of Machine Learning. It is true that we have seen impressive applications of Machine Learning systems in a number of different industries such as product personalization, fraud detection, credit risk modeling, insurance pricing, medical image analysis, or self-driving cars. But originally, AI is a field of research that tries to answer a much deeper question: What is the origin of intelligent behavior? Intelligent behavior is the capability of using one’s knowledge about the world to make decisions in novel situations: people act intelligently if they use what they know to get what they want. The premise of AI research is that this type of intelligence is fundamentally computational in nature, and that we can therefore find ways to replicate it in machines.
This blog is for those who want to build their own deep learning machine but fear the build process. It serves as a guide to the essential things you should look at to make sure you are set to build your own deep learning machine and don’t accidentally buy expensive hardware that later turns out to be incompatible and causes issues. But before we even start…
TSML (Time Series Machine Learning Package) is a package for time-series data processing and prediction. It combines ML libraries from Python’s ScikitLearn, R’s Caret, and Julia using a common API and allows seamless ensembling and integration of heterogeneous ML libraries to create complex models for robust time-series prediction.
Imagine a very realistic scenario in which several families from your town are planning to go on a trip together. To save on transportation costs, they wish to charter several minibuses — the fewer the better, of course. To make the transport arrangements efficient and pleasant, they wish for each family to stick together on the same minibus, and to partition the families according to geographic location, to avoid redundant detours.
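Setting aside the geographic constraint for a moment, the "fewest buses, families kept whole" sub-problem is a classic bin-packing problem. A minimal sketch of the standard first-fit-decreasing heuristic is below; the family sizes and bus capacity are made-up numbers for illustration.

```python
def pack_families(family_sizes, bus_capacity):
    """First-fit decreasing: seat each family whole while trying to use
    as few buses as possible. A greedy heuristic, not guaranteed optimal,
    but with a well-known worst-case bound close to the optimum."""
    buses = []  # each bus is the list of family sizes it carries
    for size in sorted(family_sizes, reverse=True):
        if size > bus_capacity:
            raise ValueError("a family is larger than one bus")
        # Put the family on the first bus with enough free seats...
        for bus in buses:
            if sum(bus) + size <= bus_capacity:
                bus.append(size)
                break
        else:
            # ...or charter a new bus if none fits.
            buses.append([size])
    return buses
```

For example, families of sizes 4, 3, 3, 2, 2, and 1 fit into two 8-seat minibuses, which is optimal here since 15 people cannot fit in one. Adding the geographic grouping on top turns this into a harder constrained-clustering problem, which is presumably where the article goes next.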
When dealing with customers, being able to anticipate churn is both an opportunity to improve customer service and an indicator of how well the business is performing. As the focus of the capstone project of the Udacity Data Science Nanodegree, I chose to work on churn prediction for a music streaming service called Sparkify. I present here insights on the data I used and the conclusions I drew, as well as the approach I chose to implement.
Forecasting is predicting or estimating a future event or trend. For businesses and analysts, forecasting means determining what is going to happen in the future by analyzing what happened in the past and what is going on now. Let’s think about where forecasting is applicable and why it’s required. Forecasting can be key when deciding whether to build a dam or a power generation plant in the next few years based on forecasts of future demand. I’ve used forecasting at one of my former workplaces to help schedule staff in a call center for the coming week based on call volumes on certain days and at certain times. Another area where I’ve applied forecasting is telling a business when to stock up and what to stock, based purely on demand for the product. I bet you’ve used forecasting before you read this — maybe not in the same cases as mine, but I’m sure you have. Have you ever looked at the weather forecast and realized you’re overdressed or underdressed? That is forecasting!
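One of the simplest forecasting methods that works in settings like the call-center example above is simple exponential smoothing: the next-step forecast is a weighted average of past observations, with the weights decaying geometrically so recent values count most. The function and the sample numbers below are an illustrative sketch, not the article's own method.

```python
def ses_forecast(series, alpha=0.3):
    """Simple exponential smoothing.

    alpha in (0, 1] controls how fast old observations are forgotten:
    higher alpha reacts faster to recent changes, lower alpha smooths more.
    Returns the one-step-ahead forecast."""
    level = series[0]  # initialize the smoothed level at the first value
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level
```

For a flat series the forecast simply reproduces the constant; for noisy call volumes it tracks the recent average. Real workloads with strong day-of-week patterns usually need a seasonal method (e.g. Holt-Winters) rather than this plain version.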
How AI agents cheat the system by doing exactly what they’re told, and what we can learn from them. If W. W. Jacobs had been born a century later, The Monkey’s Paw might have featured a devilish AI instead of that accursed hand. AI agents, like the titular paw, are notorious for doing what they were technically asked to do in a way that no one expected or wanted. Just as the occasional firefighter commits arson in order to play the hero and ‘save the day’ (you were already a hero, bud), and like the dog who was rewarded with steak when he saved a drowning child and so took to knocking kids into the river, AI agents will do just about anything they can to maximize their reward. This sort of behaviour, in which AI agents increase their reward using strategies that violate the spirit or intent of the rules, is called reward hacking. Often, what seems like a reasonable reward to an AI developer or policy-maker leads to hilariously disastrous results. Here we’ll explore three cases of AI agents acting naughty in pursuit of reward and what they can teach us about good reward design, both in AI and for humans.
Let’s roll back time to 2007, when the first-ever cricket T20 World Cup was organized. The world was harping about it, but the cricket associations were looking at it with caution – the commercial breaks were reduced from 99 seconds to 39 seconds. Ouch! That’s quite a reduction in revenue. But this decision was the most rewarding one in the long run, and T20 is now the highest revenue-grossing format in the history of cricket! The world will be taken by storm by the Indian Cricket Team at the T20 World Cup in 2020! Our captain Virat Kohli will once again be in the spotlight for taking critical decisions. Isn’t that a very stressful job? Especially when the hopes of millions of people rest upon you?
Meet Neo! Neo is a talented developer who loves building stuff. One fine morning, Neo decides to take a road less travelled and build a chatbot! After a couple of keyword searches and skimming through dozens of articles with titles like ‘build a chatbot in 5 mins’ and ‘chatbot from scratch’, Neo figures out the basic components to be intent detection, named entity recognition (NER), and text matching for QnA. Another 30 minutes of Google searching and Neo has collected his arsenal: state-of-the-art implementations for these three components. His arsenal has the almighty BERT for NER, ULMFiT for text classification, and RoBERTa for text matching.
DependencyTrees.jl is a Julia package for working with natural language sentences annotated with dependency structure. It provides an implementation of dependency parse trees (DependencyTree), a treebank reader, and implementations of several transition systems with oracles.