This article on data preparation is the third in a series of blog posts exploring value in public sector data.
We’ve listed key aspects of data preparation which, if carried out in sequence with Public Sector Data Mapping and Public Sector Data Planning, give you the ingredients to understand what data you have and to measure improvement or deterioration in it.
We need to start rationalising the systems we have and use. Do we need 150 systems, or can we consolidate effort and purpose? Some public sector organisations have mapped their systems in detail. My favourite at the moment is North Lanarkshire Council (shout out to Peter Tolland): http://www.northlanarkshire.gov.uk/index.aspx?articleid=33882
The same process can be applied to the data held in these systems. This is generally taken up in a master data management plan or project. Why have 100 systems that each capture a name and address when you can consolidate that data and then consolidate the places where it is collected? From there, you might be able to stop capturing information in some systems altogether because it is already collected elsewhere.
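As a rough illustration of that first step, here is a minimal Python sketch that takes an inventory of systems and the fields each one captures, then reports which fields are collected in more than one place. The system names, field names and the `duplicated_fields` helper are all made up for the example; a real inventory would come out of your own systems mapping.

```python
from collections import defaultdict

# Hypothetical inventory: system name -> data fields it captures.
SYSTEM_FIELDS = {
    "housing": {"name", "address", "tenancy_start"},
    "council_tax": {"name", "address", "band"},
    "libraries": {"name", "address", "card_number"},
    "planning": {"address", "application_ref"},
}


def duplicated_fields(system_fields):
    """Return {field: sorted list of systems} for fields captured by more than one system."""
    captured_by = defaultdict(list)
    for system, fields in system_fields.items():
        for field in fields:
            captured_by[field].append(system)
    return {
        field: sorted(systems)
        for field, systems in captured_by.items()
        if len(systems) > 1
    }


overlaps = duplicated_fields(SYSTEM_FIELDS)
for field, systems in sorted(overlaps.items()):
    print(f"{field}: captured in {len(systems)} systems ({', '.join(systems)})")
```

Fields that show up against several systems are the candidates for a single master record, and the systems listed against them are where you might stop collecting.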
Now, I had thought that in this climate of yearly reductions to public sector budgets we were nearing, or already past, the point where we could no longer simply cut people or services. Apparently, I was wrong. I’ve been hearing that local authorities in particular are bringing in consultants who recommend that IT departments be removed in favour of IT Management Firms. While salesmen (or women, no gender stereotypes here) make this sound amazing and the cost savings look brilliant, the reality sets in very fast and is very costly. There appear to be five stages to this process:
The first thing that happens is that all the internal knowledge is laid off and moves out into the world. People find new jobs in locations that are not local, driving talent towards the cities and away from the locality.
Second, the incoming IT Management Firm realises that almost every system within the organisation is several updates behind and generally in a position that cannot be upgraded without buying a new version of the product.
Third, the IT Firm approaches the local authority saying it needs to spend money to upgrade the systems, and the amount turns out to be an eye-watering sum. The IT Firm insists this is carried out or it cannot effectively manage the organisation’s IT.
Fourth, the local authority looks back with fondness to the time when it employed its own IT department and things didn’t cost so much, and realises it will now cost even more to get that department back and sack the IT Firm, which is nickel-and-diming it for everything it has to do.
Fifth, the local authority ends up in a worse state than when it began.
Please do not follow this path.
Capturing Domain Knowledge
We then need to start writing down the processes that revolve around data capture, committing to paper what those processes are and how they work against the backdrop of every other process in the organisation.
The neat thing about processes is that they very rarely contain personal information. Almost every aspect of how a council delivers services can be released as open data. This is not always met with jubilation on the part of local authorities, but it is necessary to build a framework that allows for incrementally improving service delivery.
Data can then be worked on and cleaned. This is a terrible activity that always needs to take place and will be a large part of any analysis. If we can commit to continuous data cleaning then our data will be in a better place when we approach it to look for insights.
There is no other way to look at data cleaning. It is a thankless task. Much like the extraction of ore from the rock, it is necessary to get at the valuable mineral. The upside is that once we have figured out what is required in cleaning we can automate the process and always have clean data prepared for analysis.
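To make the automation point concrete, here is a minimal Python sketch of what a repeatable cleaning step could look like once the rules have been worked out. The record layout, the function names and the rules themselves (tidy the name, normalise the postcode, drop exact duplicates) are assumptions for illustration only; your own rules will depend on what you find in your data.

```python
import re


def clean_name(raw):
    """Collapse repeated whitespace and apply title case.

    Title-casing is deliberately naive and will mangle names such as
    'McDonald', so treat it as a placeholder rule.
    """
    return " ".join(raw.split()).title()


def clean_postcode(raw):
    """Upper-case a UK postcode and put one space before the last three characters."""
    compact = re.sub(r"\s+", "", raw).upper()
    if len(compact) < 5:
        return compact  # too short to split; leave for manual review
    return f"{compact[:-3]} {compact[-3:]}"


def clean_all(records):
    """Clean every record and drop exact duplicates, keeping first-seen order."""
    seen = set()
    cleaned = []
    for record in records:
        tidy = {
            "name": clean_name(record["name"]),
            "postcode": clean_postcode(record["postcode"]),
        }
        key = (tidy["name"], tidy["postcode"])
        if key not in seen:
            seen.add(key)
            cleaned.append(tidy)
    return cleaned


# Made-up records showing the kind of mess the rules are meant to handle.
raw_records = [
    {"name": "  jane   smith ", "postcode": "ml1 1ab"},
    {"name": "Jane Smith", "postcode": "ML11AB"},
    {"name": "JOHN BROWN", "postcode": " ml2  7xy"},
]
cleaned_records = clean_all(raw_records)
```

Once rules like these live in code rather than in someone's head, they can be run every time new data arrives, which is what keeps the cleaning continuous rather than a one-off slog.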