
You often need to provide a dataset to non-technical personnel in your company. The most common use case I’ve encountered is when a business unit wants […]
Exporting data into flat files is a very common task for a data scientist, yet over the last few years I’ve seen almost everyone fall down a particular rabbit hole when it […]
Originally published 07/04/2016 as part of the Bulletproof Technical Blog. The importance of collecting data. As a society we need to measure, plan, predict and test, and to do that we need data. […]
Background: A few months ago we were writing a Spark job to process AWS billing data. The idea was that every day we’d automatically spin up an Amazon EMR cluster which would do […]