Round table discussion: Data governance for the public sector
OpenSky Data Systems hosted a round table discussion on the future of data governance in the public sector, bringing together key stakeholders from across the public and private sectors.
Why is data governance important? What are the benefits of effective data governance?
Owen Harrison
We all accept the value of data, but organisations must make sure they know what data they hold, that they can access that data, that they understand what it represents, how it has been processed and who is responsible for it. All of those things are required to be present and explicit in an organisation in order for data’s potential to be reached. In a small-scale organisation, all those aspects are well known. People either know them or are within arm’s reach of someone who does. In a large-scale organisation, it’s clear to see that such awareness and understanding isn’t so apparent. In order for those aspects to be readily available, you need strong governance.
Denis Parfenov
Every single day, more than two and a half quintillion bytes of data are produced, which is by one estimate two hundred and fifty thousand times more than the printed material in the US Library of Congress. This data can be roughly split into private data and public data; private data is the data generated by citizens, with their phones, payment and travel cards; public data is non-personal and non-sensitive data produced on the citizens’ behalf at the taxpayers’ expense. Private data needs to be secure, while public data has to be open, and both require resources.
Our partners have been involved in a project to preserve the works of Alexander Kluge, a German author and film director. It was estimated that digitising Kluge’s works would require storage for two petabytes of data at a cost of $70,000-$100,000 per year; that’s the data of just one creative and productive individual out of seven billion of us. There is also the environmental cost; for example, one of Amazon’s data centres, which is under construction near Blanchardstown, will consume four per cent of Ireland’s energy. The moonshot benefit of effective data management is that we, the global community of individuals, are enabled by trusted data to address global problems collectively.
William Flanagan
The people on whose behalf the data is being controlled need to have confidence in those who are controlling the data.
The public tend to be quite harsh on public bodies when there’s an absence of data governance. For example, when there is a breach involving a social media site, the question is often “why wasn’t the regulation stronger?” A data breach involving a public sector organisation would be even more harshly viewed by citizens. The concept of data governance and the lifecycle of the data are fundamental to the overall picture of public confidence. Confidence is absolutely key, as the leaking of personal data could have quite serious consequences for the security and safety of an individual.
Aileen McHugh
In terms of the PRA, both registration, and the collection of personal data for that purpose, are mandatory. An extra duty of care therefore arises with a public register containing personal data. Some people wrongly think data governance is the same as ICT governance. We hold paper-based records going back to 1708. It’s also about the management and conservation of archives, for example, and making them accessible and available online. Equally, it’s about decision making authority for bulk data requests. If the public sector wants to innovate, we need to fully understand data governance requirements in context, to enable the emergence of appropriate space for data sharing. We need to have in place a mature stage data governance regime to retain trust and provide evidence of full regulatory and legislative compliance.
Declan Sheehan
We regard data as a corporate asset that both our organisation and the wider economy can get value from. Data governance is about putting in place the relevant rules and regulations to extract maximum value from that asset. In terms of governance, we need to ensure data quality, put in place a common business language for data, secure the data and build in privacy by design, so that we protect the citizen’s data whilst making it available to the wider economy for purposes such as innovation.
Brinsley Sheridan
Data is a national asset. As a government organisation, trust is a huge issue with the citizen. The way to get trust is accurate management of that national asset. Is it secure? Look at the Public Service Cards and MyGovID and how they have affected the perception of privacy and security. They represent best-practice privacy and security management, yet they have generated negative press, which is not a technology issue. It is key that we have accurate, secure data and that we are producing databases with proper data governance that the citizen can trust.
What are the key elements of data governance? Can you identify any opportunities to strengthen data governance?
Brinsley Sheridan
Accuracy is huge. In our own data centre, we have to improve our accuracy. It’s the accuracy of your contacts; are you dealing with individuals or companies? How do you identify them? The de-duplication of our data sets is also a huge problem. Things like EirCode have helped hugely. We are putting in a new system now to try and leverage the EirCode to stop the duplication process, but it does come down to people and companies. You need a centralised data dictionary identifying what the entities are and how you identify them in a consistent way across the board.
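As a rough illustration of the de-duplication point above, the following Python sketch groups contact records that share an entity type, a normalised name and a normalised Eircode. The record layout and field names (`id`, `entity_type`, `name`, `eircode`) are assumptions made for this example and do not describe any participant’s actual system.

```python
import re
from collections import defaultdict


def normalise_eircode(raw):
    """Upper-case and strip whitespace so 'd02 x285' and 'D02X285' compare equal."""
    return re.sub(r"\s+", "", raw or "").upper()


def normalise_name(raw):
    """Lower-case and collapse repeated whitespace in a name."""
    return " ".join((raw or "").lower().split())


def group_duplicates(records):
    """Return groups of record ids that share entity type, name and Eircode.

    Each record is a dict with hypothetical keys: id, entity_type, name, eircode.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (
            rec["entity_type"],
            normalise_name(rec["name"]),
            normalise_eircode(rec["eircode"]),
        )
        groups[key].append(rec["id"])
    # Only keys that collected more than one record are candidate duplicates.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Illustrative, made-up records.
records = [
    {"id": 1, "entity_type": "individual", "name": "Mary  Byrne", "eircode": "d02 x285"},
    {"id": 2, "entity_type": "individual", "name": "mary byrne", "eircode": "D02X285"},
    {"id": 3, "entity_type": "company", "name": "Mary Byrne", "eircode": "D02 X285"},
]
duplicates = group_duplicates(records)
```

Keeping the entity type in the key reflects the distinction drawn above between individuals and companies: the third record shares a name and Eircode with the first two but is not treated as a duplicate of them. Real matching would likely need fuzzier name comparison than this exact-match sketch.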
Declan Sheehan
There are many owners of data within an organisation. Finding a way to govern that data across an organisation has been the biggest challenge for me. My organisation has 15 different functional areas, each with its own data set. Finding a way to administer data governance across the organisation is a challenge. I can imagine that it is a greater challenge across the whole of the public sector.
William Flanagan
The important thing is the governance structure. For example, you get a lot of data sharing between central government and local authorities, but the processing of that information could be done in 31 different ways if there’s no one national data dictionary. In many cases, there’s no reason why there couldn’t be that national data dictionary. One of the difficulties would be that they’d have to adopt these standards in 31 local authorities as well as the central body, which is easier said than done. Defining standards is relatively easy but adopting them across public sector bodies is a challenge.
Aileen McHugh
What we found useful was that we revised our Governance Framework this year and for the first time it became necessary to conceptualise data governance in the context of protecting our data assets in a changing data landscape. The act of articulating our issues in a public facing strategic document greatly assisted us in capturing all the various forces at play and in visualising how to establish an appropriate way forward both in terms of innovation and sharing our data. However, more guidance from the centre is required to assist all public bodies in this process.
Denis Parfenov
There is an acronym, FAIR – making data findable, accessible, interoperable and reusable. Time is very important. On average, a digital object has a lifespan of just 18 months and a hard drive lasts between five and 10 years. We need to think decades ahead at the very minimum. At present data disappears much quicker than we are able to preserve it. Trust is paramount; the authenticity and persistence of data have to be assured algorithmically for future users of data.
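One common way to check algorithmically that preserved data has not changed is a fixity check: record a cryptographic hash when the object is archived and recompute it later. The Python sketch below uses SHA-256 from the standard library; the function names are hypothetical, and this is only one possible approach, not a technique described in the discussion.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(data: bytes, recorded_fingerprint: str) -> bool:
    """Check the data against a fingerprint recorded when it was archived."""
    return fingerprint(data) == recorded_fingerprint


# Illustrative content only.
original = b"title register extract, archived copy"
recorded = fingerprint(original)           # stored alongside the archive copy
still_intact = is_unchanged(original, recorded)
tampered = is_unchanged(original + b" (edited)", recorded)
```

A matching hash shows only that the bytes are the same as when the fingerprint was taken. Showing who produced the data would need something more, such as a digital signature, and persistence still depends on how and where the copies are stored.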
Owen Harrison
From a broad public service perspective, the key elements of data governance start with a broad vision of what we want from data and how it can serve the public. From that vision, you need a series of standards and rules to make sure that vision gets manifested – like data dictionaries, systems for use, processes to follow. Once they are acknowledged as needed, we need an authority who has the power to bring about those rules. The recent Data Sharing and Governance Act established that who: the Minister for Public Expenditure and Reform has the authority to bring about rules and procedures governing public data on a regulatory footing. After you set a vision and the who, you need compliance. Compliance can come from anything from systems that, for example, restrict data processing to auditing initiatives. The Act provides for a Data Governance Board, which has the powers to monitor compliance and report to the Minister.
How do we improve data sharing across government?
William Flanagan
We need to champion data sharing across government. One way to do that would be to shine a light on case studies where sharing has been done already. There was a really good example in the NTA in the taxi regulation space. Taxi regulation is an ecosystem of stakeholders across the public sector and beyond. Data sharing among a multifarious collection of stakeholders is at the root of all of that; the governance is what underpins the ability to do all of that. In that case, it’s an enabler that gives the operational benefit to an ecosystem that otherwise would be much more expensive to the NTA. The same goes for the PRA, where external organisations need access to its records to perform title checks on properties. Data sharing is fundamental to these examples and the governance must be there as an enabler. The technology is usually relatively straightforward but having robust data sharing agreements is fundamental to success.
Declan Sheehan
We obviously put those systems in place some time ago and invested a lot of time in putting in place a varied and wide collection of data sharing agreements. It would be very useful if there was a standard data sharing agreement that we could use for these organisations, and for future organisations that we want to share data with. It would also be very beneficial if there was a standard tool or technology for sharing data between organisations in the public sector.
Owen Harrison
All that makes complete sense. Our department is actually obliged to produce a data sharing agreement template so that is on the way. To stimulate sharing, the first thing you need is legal clarity as to whether you can share data. That was the primary purpose of the Data Sharing and Governance Act from a sharing perspective. The idea was to make it explicit how you go about sharing data between public service bodies and in what cases you can share if you don’t have an alternative form of legislation that gives you explicit rights to share. After that, there are a number of elements to stimulate sharing, such as data cataloguing. We can’t share what we don’t know we have. We need to pursue technological solutions to encourage interoperability; where we want to get everybody to is API-based integration. Then, we need to address stewardship. We need each organisation, if it has a lot of data, to start appointing a Chief Data Officer and ensuring that is an explicit role within the organisation. All this ties in with the prospect of centrally released guidance that would support that.
Denis Parfenov
Technology is pretty easy; incentivising people to do the right things is much more difficult and it takes a lot of time and effort. Firstly, it has to be legally mandated that data is available at agreed time intervals and under a service level agreement. In order to invest time and money utilising the data, one needs to have legally binding assurances that it will be accessible in the future under a pre-agreed standard. Secondly, in countries like Ireland, with relatively small populations, it can be difficult to make the business case for data re-use. Data has to be seen as a component of public infrastructure, similar to public roads, public parks and public libraries, all of which serve a purpose and are all designed, built and maintained by the State. The data has to be seen as a digital public infrastructure. Applications built on top of public data will change in the same way we change bicycles or vehicles, but at the same time the road remains the same and you know it will exist tomorrow.
Aileen McHugh
I agree in relation to legal mandate. The Statistics Act is an example of clarity and authority which details exactly what’s required of public bodies, even in relation to envisioning how and what data might be collected by organisations in the national interest. I think we need greater clarity about the new legislation on data sharing. We also need more case studies and scenario testing. Public sector bodies are very diverse, and it would be useful to learn about more unusual stories, rather than narratives solely in relation to the larger government departments. In addition, beneath the Data Governance Board there could be a Data Governance Officers Network where common issues could be aired. This forum would be representative of all data holding organisations. In the PRA we have developed a specific governance and compliance model. We do have a CIO, but data governance forms part of oversight and is distinct from ICT and all other support functions.
Brinsley Sheridan
If you look at data sharing, you’re typically talking about data coming out of databases and going onto the Government open data portal, and this is typically presented in the form of .csv files and the like. But if you’re looking at data sharing in general, you have to look at all the files people have that are not traditional data. We went through an exercise of identifying all of this as part of GDPR and it put discipline on us. We are an organisation of 100 people and we had millions of files. We had data stewards appointed in all five functional areas, but what can’t be underestimated is the job of actually classifying and categorising the files into a taxonomy. There’s the question of whether that taxonomy should be central or organisation specific. I think the point we’re making about how different we are at the ground level stands. There are parts of it that can be centralised and then whatever other classification is needed should be up to the periphery function or business. We have put together a taxonomy but haven’t implemented it yet because the job is going to be huge. We culled all our old data before GDPR but to get value from those files you have to classify them, archive them and have a data retention policy. The effort that good governance requires can’t be underestimated.
How important is open data? How can we leverage open data to facilitate innovation in public services?
Denis Parfenov
We see open data as a public library of the 21st Century, where open data can be defined as non-personal data produced on the citizens’ behalf and at the taxpayers’ expense and available for use and re-use to anyone for any purpose. Ireland has made excellent progress in opening up data sets, but to make a difference in someone’s life, the data needs to be converted into information; information has to become knowledge; knowledge needs to be taken into account while making evidence-based decisions and the decisions must enable actions. It takes money, energy and human efforts to manage data, so it has to make differences in people’s lives, from things as simple as informing about train schedule times to enhancing our collective intelligence by enabling scientists to work collaboratively across the globe on addressing global issues.
Owen Harrison
The obvious aspect is the sparking of innovation and making a positive impact in a real sense for people. There’s also the element of trust and building trust with the public. Without trust, we can’t deliver the services and innovations we want to. Open data is a pillar of establishing trust in that people can interrogate the policy decisions made by seeing the same data Government sees and can assess the quality of decision making. If organisations are instilling an open data culture, it’s a reflection of data maturity within that organisation. On the issue of trust, a recent survey found that 67 per cent of people in Ireland trust the Government with their data. A world leader on this is Norway at 70 per cent, so we’re not doing badly.
Aileen McHugh
A good example of data sharing relates to mapping or what is now geospatial data. We all originally used OSi paper maps and then this moved into the digital environment. For my organisation this is base data. Other source data is inputted including title documents and maps lodged by applicants for registration. This is how we produce our title information which ultimately becomes publicly accessible knowledge in relation to property rights. Open data becomes easier but entails greater risk in the digital environment and that’s why data governance has emerged as such an important issue.
William Flanagan
The entire lifecycle of this data is being funded by the taxpayer, so by the time it’s made public they can see where their money went. I think open data is a really good way to demonstrate value to the public. Strategically, it can be a tool to stimulate economic activity too; for example, Transport for London’s strategy is to build APIs, rather than apps, and put them out there to the marketplace. Someone out there is going to unlock the latent value in that data. Security of open data is also an important consideration because technology has moved on. What was thought to be unhackable before is being found to not be so. For example, it would be crucial that open data is strongly anonymised, so as to prevent personal identifiers being reverse engineered via machine learning or AI engines.
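One widely used sanity check before releasing a data set is k-anonymity: every combination of quasi-identifying fields should be shared by at least k records, so that no small group stands out. The Python sketch below is a minimal illustration under assumed column names (`age_band`, `area`, `rent`) and made-up rows. It is not a method described by the participants, and passing this check alone does not guarantee that re-identification is impossible.

```python
from collections import Counter


def smallest_group(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(counts.values()) if counts else 0


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every quasi-identifier combination appears in at least k rows."""
    return smallest_group(rows, quasi_identifiers) >= k


# Illustrative, made-up rows.
rows = [
    {"age_band": "30-39", "area": "Dublin 15", "rent": 1800},
    {"age_band": "30-39", "area": "Dublin 15", "rent": 1750},
    {"age_band": "40-49", "area": "Dublin 15", "rent": 1900},
]
fine_grained = is_k_anonymous(rows, ["age_band", "area"], k=2)
coarse = is_k_anonymous(rows, ["area"], k=2)
```

In this example the single record in the 40-49 age band makes the fine-grained release fail the check, while publishing by area alone passes. Coarsening or suppressing fields in this way is a typical response when the check fails.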
Brinsley Sheridan
I think if all public service bodies asked the question of how important open data is in someone’s life, they would start to have a clearer understanding of when to release data. Take our rent index, for example: there’s a housing crisis in the country and, all of a sudden, a data set that had low visibility previously becomes front and centre. We had to share our data set, anonymised, with the ESRI because they’re doing hedonic regression analysis. What makes that compelling is that, with close to 80 per cent of our addresses EirCoded, we can use other data sets to show the local electoral area boundaries. We can now give very accurate data as to what the average rent of a three-bed semi-detached is within an electoral area just by sharing correctly with a number of bodies.
Declan Sheehan
There are huge advantages to cross-sector data sharing. The NTA are huge consumers of data, particularly in our transport models. We are also producers of data and we’re beginning to use other data sets to enhance our data. We are applying machine learning techniques to these to identify very useful insights that we would never have seen through traditional data analytics. This is one of the big advantages to open data.
What one area should we focus on to improve data governance in the future?
Brinsley Sheridan
Accuracy. All data, public or private, must be representative, accurate, secure and relevant. I really believe that MyGovID is a fantastic idea, but the way that it has been presented hasn’t been great. I want to ensure I know an entity’s identity so my data is accurate, so that I can give the public the service they deserve and at the same time inform government policy to make proper decisions. There’s an education process there for the public in general to accept the need to grasp something like this.
Declan Sheehan
It would be really good to have a central data governance forum put in place across public sector bodies that would help us all move to a more consistent place when it comes to managing data going forward. I would also like to see some standard data sharing agreements put in place to make it easier to transfer and share data.
Aileen McHugh
Greater clarity on the accountability and legal aspects is needed along with a central network of Data Governance officers so that all our voices are heard when policy is being made. Consistency is also important in aligning the interacting and competing legislation, policies and values, some as drivers and others as restraints, on further development of data sharing across government.
William Flanagan
We need to sell the idea of data governance as a suite of tools that’s going to improve the management of data across the public sector. What we’re asking for is organisations to invest in governance because you get all the benefits from that. Inevitably, people will ask why they should invest money and resources in data governance. We have to help people recognise the value of putting data governance in place and champion data governance at the individual organisational level.
Denis Parfenov
Data management technologies evolve constantly, and algorithms know no borders. The example that springs to mind is the social credit system in China, where the Chinese Government combines criminal records, medical records, social media and everything else to assign a social credit score to 1.4 billion of its citizens. Sooner rather than later, the world will be fuelled by data algorithms that will be used everywhere in the world, by state bodies, banks and insurance companies, to determine, for instance, our mortgage rates and insurance premiums. We need to make sure that it’s not only big enterprises that benefit from utilising data, but that we, individuals around the world, also benefit from the data-driven economies and societies of the future.
Owen Harrison
We need to implement the Public Service Data Strategy, which sets out 31 actions to create a coherent and mature data ecosystem across the Public Service. MyGovID adoption, one of the strategy actions, is a perfect example of a key enabler for accurate data processing and protecting people’s data. It’s a very successful project, growing by about 5,000 verified accounts per week. It’s an exemplar data project that all public service bodies should use to make sure they’re protecting people’s data. The Department of Employment Affairs and Social Protection did a recent survey and 87 per cent of people think the data collected should be reused between public service bodies so they don’t have to provide it again. People get it, they trust us with their data, they see the value in sharing — so let’s get on and do it.
The Participants
William Flanagan
William Flanagan is the Commercial & Technology Officer of OpenSky, of which he is also a co-founder. He is also the executive sponsor of OpenSky’s key accounts. With 20 years of experience in technology to his name, his work has spanned a wide and varied group of domains, including the health, environmental, transport, housing and financial sectors.
Owen Harrison
Owen Harrison is a Principal Officer in the Office of the Government CIO in the Department of Public Expenditure and Reform. He has responsibility for the implementation of the Public Service Digital, Data, and Shared Applications strategy under the Public Service ICT Strategy. He has a BSc in computer science from DCU, and a PhD in computer science from Trinity College Dublin.
Aileen McHugh
Dr. Aileen McHugh is Head of Operations at the Property Registration Authority (PRA) with responsibility for HR, Finance, ICT, Corporate Services, Casework and Customer Service, Spatial Data and Mapping, Quality, Governance and Compliance. Aileen previously spent over 17 years as a civil servant in the Houses of the Oireachtas, which included working with several parliamentary committees. Aileen holds a primary degree in Public Administration, a Masters in the Management of Change and a Doctorate in Business Administration.
Denis Parfenov
Denis Parfenov is an entrepreneur, open data advocate, founder of the Data Management Hub and co-founder of Open Knowledge Ireland. He is also a member of the Creative Commons Global Network, a member of the Open Data Governance Board and the Open Knowledge Ambassador for Ireland.
Declan Sheehan
Declan is the Chief Information Officer in the National Transport Authority. He oversees all of the technology and information projects in the Authority, which include the Leap Card family of systems, the travel information systems, the taxi regulation systems, Rural Transport systems, Business Intelligence and Data Analytics and the Corporate systems. Declan was previously senior programme manager on the Leap Card system.
Brinsley Sheridan
Brinsley has 30 years’ experience in enabling and driving leading edge ICT projects in the commercial semi-state and public sectors. His most notable positions are Head of IT in Bord na Mona, where he was responsible for 120 system implementations across 47 locations in Ireland, the UK, France and the USA serving 600 users, and his current role as Head of ICT in the Residential Tenancies Board.