Python Graph Gallery

Welcome to the Python Graph Gallery. This website displays hundreds of charts, always providing the reproducible Python code! It aims to showcase the awesome dataviz possibilities of Python and to help you benefit from them. Feel free to propose a chart or report a bug. Any feedback is highly welcome. Get in touch with the gallery by following it on Twitter or Facebook, or by subscribing to the blog.

Five Reasons Why Your Data Science Project Could Fail – And What You Can Do to Avoid It

• Losing sight of the BIG picture
• Lack of engagement with key stakeholders
• Putting the ‘How’ before the ‘Why’
• Not solving the right problem
• Hiring Data Scientists who are Unicorns

Learning git is not enough: becoming a data scientist after a science PhD

If you’re thinking of leaving post-PhD science for data science then doubtless people have told you to learn version control. They’re absolutely right. You should. But learning git is not enough. So, in the spirit of A PhD is Not Enough, a great book about careers in science, here’s some advice about moving from academia into data science after completing a PhD in a natural science. Unlike A PhD is Not Enough, however, this post is not a complete guide to a career. It’s just a collection of (hopefully non-obvious) things that have occurred to me since I made the move myself three years ago. And to be clear: none of what I say here applies to you if you have a PhD in computer science, mathematics, statistics or the humanities.

Apache Kafka and the four challenges of production machine learning systems

Machine learning has become mainstream, and suddenly businesses everywhere are looking to build systems that use it to optimize aspects of their product, processes or customer experience. The cartoon version of machine learning sounds quite easy: you feed in training data made up of examples of good and bad outcomes, and the computer automatically learns from these and spits out a model that can make similar predictions on new data not seen before. What could be easier, right?
• Challenge one: Machine learning systems use advanced analytical techniques in production software
• Challenge two: Integrating model builders and system builders
• Challenge three: The failure of QA and the importance of instrumentation
• Challenge four: Diverse data dependencies
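
To make the "cartoon version" of machine learning described above concrete, here is a minimal sketch in plain Python. It is not taken from the original post: the nearest-centroid method, the `train` and `predict` helpers, and the toy data are all assumptions chosen only to illustrate "feed in labelled examples, get a model, predict on unseen data". Production systems run into the four challenges listed above precisely because real pipelines are far messier than this.

```python
from statistics import mean


def train(examples):
    """Learn one centroid (mean feature vector) per label.

    `examples` is a list of (features, label) pairs; this helper is
    hypothetical and exists only for illustration.
    """
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    # The "model" is simply the average feature vector of each label.
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def predict(model, features):
    """Return the label whose centroid is closest to the new point."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))

    return min(model, key=lambda label: squared_distance(model[label]))


# Made-up training data: two numeric features per example, labelled
# as a good or bad outcome.
training_data = [
    ((9.0, 0.0), "good"),
    ((8.0, 1.0), "good"),
    ((1.0, 5.0), "bad"),
    ((2.0, 4.0), "bad"),
]

model = train(training_data)

# Predict on a point that was not in the training data.
prediction = predict(model, (7.0, 1.0))
print(prediction)
```

Even in this toy, the model is only as good as the examples it was fed, which hints at why data dependencies and instrumentation matter once such code runs in production.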

Database Queries With R

There are many ways to query data with R. This post shows you three of the most common ways:
1. Using DBI
2. Using dplyr syntax
3. Using R Notebooks