2020-11-07 The Downfall of Command and Control Data Leadership
2020-10-22 Demystifying Apache Arrow
2020-04-16 Fuzzy Matching and Deduplicating Hundreds of Millions of Records using Apache Spark
2020-02-22 Why you should open source your analytical work
2019-12-08 Understanding the Spark UI by example: sorting data
2019-12-01 Understanding the Spark UI by example: the Left Join
2019-11-15 Spark UI SQL detailed annotator
2019-11-03 Unsupervised probabilistic data matching using the Expectation Maximisation algorithm
2019-10-11 Interactive blogging with Observable Notebooks and gatsby.js
2019-08-26 Effective testing of analytical models using automated sense checks
2019-03-14 Questions Senior Leaders Should Ask Their Data Delivery Teams
2018-08-22 Why I’m backing Vega-Lite as our default tool for data visualisation
2018-08-11 Transforming analytical functions by mainstreaming data science
2020-04-17 Comparing energy usage across countries
2020-04-17 Filling the country with solar panels
2019-10-13 Carbon offsetting vs. the cost of renewable energy
2019-10-09 Flight distance calculator
2019-10-05 Energy usage ready reckoner
2019-10-05 My flights