This is a selection of projects I’ve worked on in the past related to data. The images link to the project sites. Note that this only includes public projects. A good, general thought piece of mine is Data as culture: how will we live in a data driven society?
In Show me the money I analysed all existing transactions of the three biggest peer-to-peer (P2P) platforms in the UK: Zopa, RateSetter, and Funding Circle. The data contains almost 14 million loan parts. It provided the most comprehensive snapshot of the UK P2P market at the time of publication. We gained high-profile media coverage for this story and have direct evidence of change in the peer-to-peer sector. Nice cartogram!
2013, peer-to-peer, R, analysis, visualisation, project management
A joint project between Deloitte and the Open Data Institute on how election data can give insights into voting behaviour. I enjoyed this project a lot: it was interesting, had a fast turnaround, and I got to apply various models, among them random forests.
2014, R, analysis, visualisation, elections
A short project during my time at the Open Data Institute, where I analysed more than 20,000 links to CSV files on data.gov.uk. The result: only around one third turned out to be machine-readable. A typical CSV is between 1 KB and 1 MB in size and has around eight columns. And I got to play around with Gephi.
2014, R, analysis, CSV, study
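What counts as "machine-readable" is not spelled out here, so the following is only a minimal sketch of one possible check, written in Python for illustration rather than the R used in the study. The function name `looks_machine_readable` and its rules (at least a header row plus one data row, more than one column, the same column count in every row) are assumptions, not the criteria actually applied.

```python
import csv
import io


def looks_machine_readable(text: str, min_rows: int = 2) -> bool:
    """Hypothetical heuristic, not the study's actual criteria.

    The text passes if it parses as CSV, has at least `min_rows`
    non-empty rows, has more than one column, and every row has the
    same number of columns as the first one.
    """
    try:
        # Skip completely empty lines, which csv.reader yields as [].
        rows = [row for row in csv.reader(io.StringIO(text)) if row]
    except csv.Error:
        return False
    if len(rows) < min_rows:
        return False
    width = len(rows[0])
    return width > 1 and all(len(row) == width for row in rows)


if __name__ == "__main__":
    print(looks_machine_readable("a,b\n1,2\n3,4\n"))
    print(looks_machine_readable("Report title\n\na,b,c\n1,2,3\n"))
```

A file with a title line above the real header fails this check, which is one common reason a published CSV is hard to process automatically. Note that the rule also rejects genuine single-column files, so it is deliberately strict.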
The presentation slides give a brief overview of the findings. For me it was interesting to play around with SPSS and R integration, specifically variable labels. There’s also a nice integration between Google documents and R. I was particularly pleased when I reused code that was years old, and it worked.
2014, quantified self, R, analysis, survey, presentation
There is a lot more where this came from… Here is an example: a book chapter.
I designed the methodology and enjoyed classifying case studies, applications and anecdata. For each sector, we mapped out relevant datasets and examples of real-world open data applications. We then offered three goal options to help decision- and policy-makers select datasets to release as open data.
2014, open data, report, spreadsheet, recommendation
Together with the UK Anonymisation Network and Purple Guerrilla I’ve managed and developed an online course as an introduction to anonymisation. The online learning aims to promote the decision-making framework and give data practitioners confidence when dealing with personal data.
2015, anonymisation, governance, learning, project management, spreadsheet
A high-level overview of whether and how we can evaluate and rank countries, organisations and projects, based on how well they use open data in different ways. As open data becomes more widespread and useful, so does the need for effective ways to analyse it.
2014, open data, benchmarking, report, recommendations
An example of a failed study, because we never managed to publish the results. Not everything is a success, and I hope others can relate to this. The study was titled How the UK could gain up to £387 million per year and then it got political. The upshot: I learned a lot about R, the train industry and the value of travel time savings.
2014, open data, R, rail, benchmarking, report
clusterCons. More information about the package can be found here. It has now been removed, so I’m not sure what’s going on.
row-sample, a function to make analysing a random subset easier.
ggplot2. I think I’m not allowed to release the training material, so I err on the safe side.
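For illustration, the idea behind a row-sampling helper like row-sample can be sketched as follows. This is not the original code: it is a minimal Python sketch, and the name `sample_rows` and its `seed` parameter are assumptions made for the example.

```python
import random


def sample_rows(rows, n, seed=None):
    """Return up to `n` rows drawn at random without replacement.

    Hypothetical helper for illustration only. Passing a `seed`
    makes the draw reproducible, which helps when an analysis on
    the subset needs to be rerun later.
    """
    rows = list(rows)
    if n >= len(rows):
        # Nothing to sample: the subset is the whole dataset.
        return rows
    rng = random.Random(seed)
    return rng.sample(rows, n)


if __name__ == "__main__":
    data = [{"id": i, "value": i * i} for i in range(1000)]
    subset = sample_rows(data, 5, seed=42)
    print(len(subset))
```

Working on a small random subset first keeps iteration fast on large datasets, and fixing the seed means the same subset can be drawn again.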