Statistical or Scientific De-Identification Fact Sheet

posted on Mon, Feb 11 2019 12:32 pm by Network for Public Health Law

“Statistical or scientific de-identification” is an important tool that helps public health negotiate its dual and sometimes conflicting missions: maintaining the privacy of the information it collects, and sharing that information broadly with the community in a legal and privacy-protective manner. Prescriptive methods specify exactly which direct and indirect identifiers must be removed from the data set. Statistical or scientific de-identification instead removes direct identifiers, such as name and Social Security number, and then balances the utility of keeping indirect identifiers, such as dates and geographies, against the risk of re-identification. Because that balance can be struck in more than one way, the approach yields multiple solutions and provides flexibility: the expert, in consultation with the data steward, determines which method(s) to apply to the indirect identifiers in the data set.
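The trade-off described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not a method taken from the fact sheet: the field names, the made-up records, the choice to coarsen dates to a year and ZIP codes to three digits, and the use of the smallest matching group size as a risk measure (the idea behind what is often called k-anonymity) are all hypothetical. In practice the expert and the data steward select the methods and the acceptable level of risk.

```python
from collections import Counter

# Hypothetical field names, chosen for illustration only.
DIRECT_IDENTIFIERS = {"name", "ssn"}
INDIRECT_IDENTIFIERS = ("birth_date", "zip_code")


def drop_direct_identifiers(record):
    """Return a copy of the record without any direct identifiers."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def generalize(record):
    """Coarsen indirect identifiers: birth year only, 3-digit ZIP prefix."""
    out = dict(record)
    out["birth_date"] = record["birth_date"][:4]  # "1980-03-14" -> "1980"
    out["zip_code"] = record["zip_code"][:3]      # "55401" -> "554"
    return out


def smallest_group_size(records):
    """Size of the smallest group sharing the same indirect identifiers.

    A value of 1 means at least one record is unique on those fields,
    which makes that individual easier to single out.
    """
    groups = Counter(tuple(r[f] for f in INDIRECT_IDENTIFIERS) for r in records)
    return min(groups.values())


# Made-up records; none of these values come from real data.
records = [
    {"name": "A", "ssn": "000-00-0001", "birth_date": "1980-03-14",
     "zip_code": "55401", "lab_value": 5.1},
    {"name": "B", "ssn": "000-00-0002", "birth_date": "1980-11-02",
     "zip_code": "55404", "lab_value": 6.3},
    {"name": "C", "ssn": "000-00-0003", "birth_date": "1975-06-30",
     "zip_code": "55105", "lab_value": 4.8},
    {"name": "D", "ssn": "000-00-0004", "birth_date": "1975-01-09",
     "zip_code": "55116", "lab_value": 7.0},
]

stripped = [drop_direct_identifiers(r) for r in records]
risk_before = smallest_group_size(stripped)

generalized = [generalize(r) for r in stripped]
risk_after = smallest_group_size(generalized)

print("smallest group before generalization:", risk_before)
print("smallest group after generalization:", risk_after)
```

In this toy data set every record is unique on exact birth date and ZIP code, while after coarsening each record shares its birth year and ZIP prefix with one other record. The lab values are untouched, but the dates and geographies are less precise, which is the utility cost that has to be weighed against the lower re-identification risk.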

De-identification provides public health with many benefits:

  • If data are de-identified at the point of collection, the risk of a privacy breach while the data are retained is significantly decreased.
  • When data are de-identified prior to sharing, the technical and policy controls required may be reduced.
  • De-identification affords public health the ability to share data widely with communities and others.

This fact sheet is intended for privacy officers, public health practitioners, data managers, and their attorneys, to provide awareness of these methods. See the Resources document, which is part of this toolkit, for technical resources.

This fact sheet provides an overview of statistical or scientific de-identification methods for structured data, such as lab values and patient demographics, where the data are entered using pre-defined fields within the record. It is not intended for de-identification of unstructured data, such as narrative reports or multimedia. Additionally, details of methods for creating synthetic data or data enclaves are beyond the scope of this fact sheet.

View/download the Fact Sheet.