ARN

NSW Data Analytics Centre aims to create a standard on de-identified data

Privacy, data handling and use are top concerns when it comes to sharing data sets

Issues over handling and privacy are top of mind when it comes to sharing data sets among government agencies, which is fuelling the need for a new standard.

While privacy may be perceived as the top concern among government agencies when sharing data, NSW Data Analytics Centre CEO Ian Oppermann said the real ‘unvoiced’ concerns related to the consequences of using the data and how it was being used and handled.

“Will I lose control over my actions?” he said. “If you see these insights, what does it say about the job we've been doing so far? Will we be embarrassed by what happens?

“Do you have the expertise to analyse the insights rather than decide and let the data speak for itself? Do you need context? What are the consequences of taking action based on those insights? And how far will you use this insight in terms of an all automated process, how much do you need the person in the loop?”

Oppermann said that over the past three years the NSW Data Analytics Centre had been working with the Australian Computer Society, Standards Australia, CSIRO’s Data61, legal firms, privacy advocates and the Australian Bureau of Statistics, among other government bodies, to work through what can be measured in linked de-identified data sets.

“What does that do in terms of the measure of personal information? And can we put a measure on it? And as we link more and more data sets together, how far does that payload of information grow? And what becomes the risk of identification? It turns out to be a very, very subtle and very complex activity,” Oppermann said. 

“We've been working on this measure of personal information, to at least take that conversation into a quantifiable space to understand when we reach that threshold of a reasonable likelihood of identification.”

Oppermann said he would like to take these measures to the organisations that developed standards such as the ISO/IEC 27000 series of cyber security standards, build on the work done to date, and look at developing a standard for de-identified data, something he thinks is about two years away.

“We've also tried to work out the different environments, the different protections we need to put in place when thinking about how we look after our data,” he said. 

The NSW Data Analytics Centre has just turned four years old. It looks at challenges facing a range of industries and verticals, from compulsory third-party insurance and risky response times to transport optimisation, as well as serious issues such as domestic and family violence and out-of-home care reform. To date, it has been involved in about 50 projects.

"Our philosophy is that data is a way of seeing the world. Every data set is incomplete, is imperfect and doesn't give full coverage, but it gives you a point of light," he said. "If we think about the impact of a piece of infrastructure, it's not just transport, but it also has implications on the workforce, education and even health."

"We deliberately link together hundreds and thousands of data sets of a great variety to get an insight into a problem we're looking at...linking many different data sets means we're constantly focused on privacy. Without getting privacy right, it means we're not doing the right thing with the data sets that we're linking together."