Hyperparameter tuning is like tuning your guitar. And then magic happens!
Regardless of what we think about the mysterious subject of probability, we live and breathe in a stochastic environment. From the ever-elusive quantum mechanics to our daily life (‘There is a 70% chance it will rain today’, ‘The chance of getting the job done in time is less than 30%’ … ) we use it, knowingly or unknowingly. We live in a ‘Chancy, Chancy, Chancy world’. And thus, knowing how to reason about it is one of the most important tools in anyone’s arsenal.
Analyzing Text Data in Just Two Lines of Code
Whether you are working on predicting data in an office setting or just competing in a Kaggle competition, it’s important to test out different models to find the best fit for the data you are working with. I recently had the opportunity to compete with some very smart colleagues in a private Kaggle competition predicting faulty water pumps in Tanzania. I ran the following models after doing some data cleaning and I’ll show you the results.
Suppose you are in a new town with neither a map nor GPS, and you need to reach downtown. You can try to assess your current position relative to your destination, as well as the effectiveness (value) of each direction you take. You can think of this as computing the value function. Or you can ask a local, who tells you to go straight, turn left when you see a fountain, and continue until you reach downtown. The local gave you a policy to follow. Naturally, in this case, following the given policy is much less complicated than computing the value function on your own.
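To make the contrast concrete, here is a minimal sketch under assumptions that are not in the original text: a hypothetical town with five intersections on one street (numbered 0 to 4), downtown at intersection 4, and a cost of one unit per block walked. It first estimates the value of each intersection with a simple value iteration, then shows that a ready-made policy from a "local" reaches downtown without any of that computation.

```python
# Hypothetical toy town (an assumption for illustration):
# intersections 0..4 on a single street, downtown at intersection 4.
GOAL = 4
STATES = range(5)
ACTIONS = {"left": -1, "right": +1}
STEP_COST = -1.0  # assumed cost of walking one block


def step(state, action):
    """Move one block, staying inside the town limits."""
    return min(max(state + ACTIONS[action], 0), GOAL)


def compute_values(sweeps=50):
    """Value iteration: estimate how good it is to stand at each intersection."""
    values = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        for s in STATES:
            if s == GOAL:
                continue  # downtown is terminal, its value stays 0
            values[s] = max(STEP_COST + values[step(s, a)] for a in ACTIONS)
    return values


def greedy_policy(values):
    """Turn the value estimates into directions by picking the best move."""
    return {
        s: max(ACTIONS, key=lambda a: STEP_COST + values[step(s, a)])
        for s in STATES
        if s != GOAL
    }


def follow(policy, start):
    """Walk from start to downtown by obeying the policy at every intersection."""
    state, path = start, [start]
    while state != GOAL:
        state = step(state, policy[state])
        path.append(state)
    return path


# Option 1: work out the values yourself, then derive directions from them.
values = compute_values()
derived_policy = greedy_policy(values)

# Option 2: a local simply hands you the directions.
local_policy = {s: "right" for s in STATES if s != GOAL}

print(values)
print(follow(local_policy, start=0))
```

In this toy setting both routes lead to the same directions, but the second one skips the repeated sweeps over every intersection, which is the point of the analogy.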
Data science has reached new levels of complexity and, of course, awesomeness. I’ve been doing this for years now, and what I want is for people to have a clear and easy path to do their job. I’ve been talking about data science and more for a while now, but it’s time to get our hands dirty and code together. This is the beginning of a series of articles about data science with Optimus, Spark and Python.
Let’s start this series by defining what time series are…
How to measure your model’s fairness and decide on the best fairness metrics.