NAACCR | North American Association of Central Cancer Registries
NAACCR Record Uniqueness
At-a-glance
Application: Tools
Level: Intermediate, Advanced
Registry Users: Steward, Research/Surveillance
Customer Users: Research
Link:
http://www.naaccr.org/index.asp?Col_SectionKey=11&Col_ContentID=463#RecUniq
Overview:
The NAACCR Record Uniqueness Program is a risk-assessment tool. It implements the k-anonymity measure of risk in microdata files [Steel PM, Disclosure Risk Assessment for Microdata, 2004]. For k=1, the measure is the number of records that are unique for a given set of key variables. This is useful for assessing the risk of revealing additional information about a known cancer patient. (It does not assess the risk of identifying a previously unknown cancer patient, because it does not estimate the number of unique records in the population.) For k>1, the measure is the number of records whose combination of key values is shared by k or fewer records on the file. This is useful for estimating the number of table cells that might be subject to suppression if the microdata file is aggregated by the key values.
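As a rough illustration of the measure described above, the following sketch counts records whose key-value combination occurs k or fewer times in a file. It is not the NAACCR program's code; the function name `count_at_risk`, the field names, and the toy records are all hypothetical, and it assumes that "k or fewer" counts the record itself (so that k=1 yields the unique records).

```python
from collections import Counter

def count_at_risk(records, keys, k=1):
    """Count records whose key-value combination occurs k or fewer times.

    Assumption: the count includes the record itself, so k=1 gives the
    number of unique records for the chosen keys.
    """
    combos = Counter(tuple(rec[key] for key in keys) for rec in records)
    # Each combination with n <= k occurrences contributes all n of its records.
    return sum(n for n in combos.values() if n <= k)

# Hypothetical toy records; field names and values are illustrative only.
records = [
    {"age_group": "60-64", "sex": "F", "county": "001", "site": "C50"},
    {"age_group": "60-64", "sex": "F", "county": "001", "site": "C50"},
    {"age_group": "60-64", "sex": "M", "county": "001", "site": "C61"},
    {"age_group": "70-74", "sex": "F", "county": "003", "site": "C18"},
    {"age_group": "70-74", "sex": "F", "county": "003", "site": "C34"},
]
keys = ["age_group", "sex", "county", "site"]

unique = count_at_risk(records, keys, k=1)       # records unique on the keys
at_most_two = count_at_risk(records, keys, k=2)  # records in groups of size <= 2
print(unique, at_most_two)
```

In this toy file, three records are unique on the four keys, and all five fall in groups of two or fewer.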
The documentation includes recommended threshold values for the percentage of unique records on a file, implying that a file is safe as long as the percentage of unique records is below these thresholds. These thresholds are, at best, general guidelines and should be used with caution. There is some risk whenever the percentage of unique records is greater than zero, and the size of that risk depends on many factors, including the size of the sample and the resources available to the intruder. The threshold values may be either too low or too high for a given file and intrusion scenario. The record uniqueness values produced by this program are best used as relative values when comparing the tradeoff between risk and utility for various combinations of keys or key recoding options.
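Using uniqueness as a relative value might look like the sketch below, which compares the percentage of unique records before and after a key is recoded. The helper names `percent_unique` and `recode_age`, the 10-year banding, and the toy records are assumptions made for illustration; they are not taken from the program or its documentation.

```python
from collections import Counter

def percent_unique(records, keys):
    """Percentage of records that are unique on the given keys."""
    if not records:
        return 0.0
    combos = Counter(tuple(rec[key] for key in keys) for rec in records)
    uniques = sum(1 for n in combos.values() if n == 1)
    return 100.0 * uniques / len(records)

def recode_age(record):
    """Hypothetical recode: collapse single years of age into 10-year bands."""
    out = dict(record)
    low = (record["age"] // 10) * 10
    out["age"] = f"{low}-{low + 9}"
    return out

# Hypothetical toy records; field names and values are illustrative only.
records = [
    {"age": 61, "sex": "F", "county": "001"},
    {"age": 63, "sex": "F", "county": "001"},
    {"age": 67, "sex": "M", "county": "001"},
    {"age": 72, "sex": "F", "county": "003"},
    {"age": 78, "sex": "F", "county": "003"},
    {"age": 78, "sex": "F", "county": "003"},
]
keys = ["age", "sex", "county"]

options = {
    "single year of age": records,
    "10-year age bands": [recode_age(rec) for rec in records],
}
results = {name: percent_unique(recs, keys) for name, recs in options.items()}
for name, pct in results.items():
    print(f"{name}: {pct:.1f}% unique")
```

The two percentages are meaningful mainly in comparison with each other: the coarser age coding lowers uniqueness at the cost of analytic detail, and neither figure on its own shows that a file is safe.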
In addition to the number of unique records (or records with k or fewer common key values), the program generates an estimate of the relative contribution of each key variable to the total. This can be useful for identifying which key might yield the most benefit from recoding or removal.
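The source does not say how the program estimates each key's contribution. One simple way to approximate the idea, shown here purely as an assumption, is a leave-one-out comparison: drop each key in turn and see how far the count of unique records falls. The function names and toy records are hypothetical.

```python
from collections import Counter

def count_unique(records, keys):
    """Number of records that are unique on the given keys."""
    combos = Counter(tuple(rec[key] for key in keys) for rec in records)
    return sum(1 for n in combos.values() if n == 1)

def leave_one_out(records, keys):
    """Reduction in unique records when each key is removed in turn.

    Assumption: this is an illustrative stand-in, not the method used
    by the NAACCR Record Uniqueness Program.
    """
    total = count_unique(records, keys)
    return {
        key: total - count_unique(records, [k for k in keys if k != key])
        for key in keys
    }

# Hypothetical toy records; field names and values are illustrative only.
records = [
    {"age_group": "60-64", "sex": "F", "county": "001", "site": "C50"},
    {"age_group": "60-64", "sex": "F", "county": "001", "site": "C50"},
    {"age_group": "60-64", "sex": "M", "county": "001", "site": "C61"},
    {"age_group": "70-74", "sex": "F", "county": "003", "site": "C18"},
    {"age_group": "70-74", "sex": "F", "county": "003", "site": "C34"},
]
keys = ["age_group", "sex", "county", "site"]

reductions = leave_one_out(records, keys)
for key, drop in sorted(reductions.items(), key=lambda item: -item[1]):
    print(f"{key}: removes {drop} unique record(s)")
```

In this toy file, removing the site key eliminates two of the three unique records, while removing any other key changes nothing, so site would be the first candidate for recoding or removal.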
There are two versions: one is a Windows executable suitable for analyzing a moderately sized input file, the other a SAS macro for larger files. At the time of this writing, the SAS macro version works correctly under Linux but reports five times more unique records than are on the file when run under Windows. This bug has been reported to the developer and a revised version is being tested.
[See also the μ-Argus program in the Computational Aspects of Statistical Confidentiality (CASC) Project entry]