Making personal information safe for reuse

A range of methods can be applied to personal information to make it safe for reuse.

When using personal information for research, evaluation and other such activities, it’s important to reduce the risk that people, households, and organisations are identified without their permission.

Different methods can be used to make personal information safe for such reuse. Each method reduces the risk of re-identification to a different degree.

An agency’s data does not exist in a vacuum. Even after 1 of the following methods has been applied, it may still be possible to identify an individual by combining the agency’s data with data available elsewhere.

AboutMyInfo, a Harvard University project, shows how easy or hard it can be to identify someone based only on their birthdate and zip code. While using the tool requires a US zip code, there are 4 illustrative samples provided.

Aboutmyinfo.org  — identity samples
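
To illustrate why a few ordinary details can be enough to identify someone, the sketch below counts how many records share each combination of birthdate and postcode. The records and the function name `unique_combinations` are made up for illustration and are not part of the AboutMyInfo tool.

```python
from collections import Counter

# Hypothetical records holding only (birthdate, postcode).
# No names are present, yet some combinations occur only once.
records = [
    ("1980-03-14", "6011"),
    ("1980-03-14", "6011"),
    ("1975-11-02", "6011"),
    ("1992-07-30", "6021"),
]


def unique_combinations(rows):
    """Return the combinations that appear exactly once in rows."""
    counts = Counter(rows)
    return [combo for combo, n in counts.items() if n == 1]


# Each combination listed here points to a single record, so anyone who
# knows that birthdate and postcode from another source could single it out.
at_risk = unique_combinations(records)
```

In this made-up data, 2 of the 4 records are unique on birthdate and postcode alone.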

How personal information can be made safe for reuse

The following methods can be used to reduce the amount of identifiable personal information contained in the individual client data that’s collected and used as part of providing public services. Individual client data is also called raw data, microdata, or transactional data.

Methods to use

The methods below are listed from the least likelihood to the greatest likelihood of an individual being identified.

Confidentialisation

The statistical methods used to protect against confidential information being disclosed to people who are not authorised to have access to it, in a way that could identify an individual, household or organisation.

The statistical methods used provide a level of protection against identification that cannot be obtained from de-identification.
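
As a rough illustration, the sketch below applies one simple technique, suppressing small counts, to a table of made-up counts. The function name `suppress_small_cells` and the threshold of 3 are assumptions chosen for the example. Confidentialisation in practice uses rules set by the agency and often combines several techniques.

```python
def suppress_small_cells(counts, threshold=3):
    """Replace any count below the threshold with None (suppressed).

    The threshold is an assumption for this sketch, not an official rule.
    """
    return {
        group: (count if count >= threshold else None)
        for group, count in counts.items()
    }


# Hypothetical counts of people in each age range.
counts_by_age_range = {"20-29": 12, "30-39": 2, "40-49": 7}

published = suppress_small_cells(counts_by_age_range)
```

Suppression on its own may not be enough. If a total is published alongside the table, a single suppressed count can be worked out by subtraction.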

Aggregation

Data combined from several measurements but without the additional use of statistical methods to protect against re-identification.
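
A minimal sketch of aggregation, assuming made-up records and 10-year age ranges, might look like this. The helper name `age_range` and the range width are choices made for the example only.

```python
from collections import Counter


def age_range(age, width=10):
    """Return the age range an age falls into, for example '20-29'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


# Hypothetical de-identified records.
people = [
    {"age": 23, "hair_colour": "brown"},
    {"age": 27, "hair_colour": "black"},
    {"age": 31, "hair_colour": "brown"},
    {"age": 38, "hair_colour": "red"},
    {"age": 45, "hair_colour": "brown"},
]

# Individual rows are replaced by a count of people in each age range.
counts_by_age_range = Counter(age_range(person["age"]) for person in people)
```

The group with a count of 1 shows why aggregation alone does not protect against re-identification: that count still describes a single person.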

De-identification

The process of removing information from microdata to reduce risk of spontaneous recognition. It typically includes removing names, exact dates of birth or death, and exact addresses.
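
The sketch below shows one way de-identification could look for a single made-up record: the name and exact address are dropped, and the exact date of birth is replaced with an age in whole years. The field names and the functions `age_on` and `de_identify` are assumptions for the example.

```python
from datetime import date


def age_on(date_of_birth, reference):
    """Return age in whole years on the reference date."""
    years = reference.year - date_of_birth.year
    # Subtract a year if the birthday has not yet happened in the reference year.
    if (reference.month, reference.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def de_identify(record, reference):
    """Keep only fields that are not direct identifiers."""
    return {
        "age": age_on(record["date_of_birth"], reference),
        "hair_colour": record["hair_colour"],
    }


# Hypothetical individual client record.
record = {
    "name": "Jane Example",
    "date_of_birth": date(1980, 3, 14),
    "address": "1 Example Street",
    "hair_colour": "brown",
}

de_identified = de_identify(record, reference=date(2022, 10, 4))
```

Building the output from a list of allowed fields, rather than deleting known identifiers, means a new identifying field added to the source later is not passed through by accident.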

Anonymisation — A term most commonly used to refer to data from which direct identifiers have been removed (de-identified data) but is sometimes used to refer to confidentialised data. Due to this confusion it’s not used in this diagram.

Pseudonymisation

The process of replacing direct identifiers with different ones in microdata to reduce risk of spontaneous recognition.
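
As an illustration, the sketch below replaces a name with a pseudonym derived from a keyed hash (HMAC-SHA256). This is one possible approach, not the only one. The key, the field names, the function name `pseudonymise`, and the 12-character pseudonym length are assumptions for the example.

```python
import hashlib
import hmac

# Assumption: in real use the key is stored securely and kept apart
# from the pseudonymised data. This value is a placeholder.
SECRET_KEY = b"replace-with-a-securely-stored-key"


def pseudonymise(identifier, key=SECRET_KEY):
    """Return a stable pseudonym for a direct identifier."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    # Shortened for readability. A shorter value has a higher chance of collisions.
    return digest[:12]


# Hypothetical individual client record.
record = {
    "name": "Jane Example",
    "date_of_birth": "1980-03-14",
    "hair_colour": "brown",
}

# The name is replaced. The other fields are kept unchanged, so the
# record can still be recognised from them.
pseudonymised = {
    "pseudonym": pseudonymise(record["name"]),
    "date_of_birth": record["date_of_birth"],
    "hair_colour": record["hair_colour"],
}
```

The same name always produces the same pseudonym under the same key, so records about one person can still be linked. Anyone holding the key can recompute the pseudonym for a known name, which is why the key needs to be protected.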

Figure 1. How personal information can be made safe for reuse

This diagram shows 5 processes that can be used to make information safe for reuse. Examples of identifiable personal information, like age range, eye and hair colour, height, weight, and name, are given for each case.

Detailed description of graph

This diagram shows 5 methods of holding data about individuals. Each method in the list reduces the likelihood of individuals being identified further than the method before it.

The 5 methods are: individual client data, pseudonymisation, de-identification, aggregation and confidentialisation.

Here is a description of the data captured by each of these methods, starting with the method most likely to identify an individual.

  1. Individual client data: This method includes information about the person’s name, date of birth and hair colour.
  2. Pseudonymisation: This method assigns the person a different name or another way to identify them, such as assigning them a group of letters, like XYZ, along with their date of birth and hair colour.
  3. De-identification: This method removes the name and replaces the date of birth with their age.
  4. Aggregation: This method combines individual data from a group of people into age, hair and height ranges.
  5. Confidentialisation: This method applies statistical techniques to the data to prevent individual identification. For example, it may include the age range, and give the average height of the group, but then remove data about hair colour because an individual could be identified by it (as determined by statistical analysis).