Data Science Chronicles

Hello and welcome to what I hope will be a useful record of the experiences of a working Data Scientist. Come with me as I meander on my daily journey through the data science space.

My aim is that this blog will serve as a record of thoughts about the data industry (and probably IT in general), discussions of books, podcasts, different tool sets, strange encounters with programming languages (R – I’m looking at you…) and maybe some deep dives into particular areas as they come to mind.

In short, this is a place to tell data stories, practice writing and have a spot to record this wild ride I’m on. Straight up, this blog isn’t about telling people how good I (think I) am – in fact it’s quite the opposite.

The data science industry is challenging and exciting, but also poorly defined and exhausting. It’s a harsh mistress that sometimes stops me sleeping. It’s changing daily and I sometimes feel like there’s an Emperor’s New Clothes effect at play. No one wants to admit what we know to be true – no one person can possibly hope to stay abreast of all of this innovation.


The older I get and the more people I meet, and the more I learn and archive old stuff as its working value attenuates, the more I’m realising how much I still don’t know.

It’s a double-edged sword but one that everyone who’s chosen IT as their professional career has cut themselves on enough times to learn to live with the Savlon and band-aids.

Originally, I wasn’t going to enable comments but I’d like to try and build a bit of a community here and I think Jeff Atwood is right about blogs and comments. I don’t want this to be one-way preaching, but I also don’t want it all to devolve into the pissing matches over coding minutiae that I see on so many other tech blogs.

Let’s see how it goes.

Some of these posts will be general musings and some I have in mind will be deep dives that’ll probably require many posts to detail. I’ll talk about data visualisation, coding in R, Python, Java and Scala and also when plain old SQL is good enough to get the job done. I’m also a big fan of machine learning and statistics so have no doubt I’ll go into that at length too…:)

I’ll tag and categorise as I go as much as possible so that hopefully this blog is useful not just to “real” data scientists (and we’ll talk about what the hell that even means soon enough) but also to people just starting out in data science and trying to get a feel for what it is (and isn’t) about.

And maybe this can be a good resource for quasi- or non-technical people who are just wondering what the fuss is all about (and why it sometimes feels like they have to devote 2% of their GDP just to pay a data science team).

Lastly, this blog is a work in progress so bear with me while I front-load all the WordPress goodies. I’ll try and make it a pleasant place to visit.

So there it is.

“And if you’ve come this far, maybe you’re willing to come a little further”

– Andy Dufresne, The Shawshank Redemption