Utility-Preserving Data Anonymization for Data Publishing
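- Subject (keywords) Database, Data Privacy
- Publisher Korea University Graduate School
- Advisor 정연돈
- Year of publication 2020
- Degree conferral date 2020. 8
- Degree type Doctoral
- Department Graduate School, Department of Computer Science (College of Informatics)
- Pages 101 p
- UCI I804:11009-000000232157
- DOI 10.23186/korea.000000232157.11009.0001171
- Language of text English
- Submitted original 000046048476
Abstract/Summary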
Data privacy aims to make personal data usable without risking privacy leakage. To preserve both privacy and utility, personal data should be anonymized before being published. In this thesis, we propose novel utility-preserving anonymization methods for privacy-preserving data publishing. First, we propose an anonymization method for a syntactic privacy model. We present a novel utility preservation model, called h-ceiling, which restricts the generalization boundary. The proposed method satisfies both k-anonymity and h-ceiling by using counterfeit records and a catalog of them. In addition, we devise an algorithm that derives the optimal result. We conduct experiments on real datasets to evaluate our method. Next, we design a differentially private anonymization method for a semantic privacy model, differential privacy. To satisfy differential privacy, the proposed method comprises three steps: (1) generating candidates for data perturbation, (2) scoring the utility of all candidates, and (3) choosing the result based on the scores. We adopt three techniques for data perturbation: generalization, suppression, and insertion. We also present an efficient algorithm for the proposed method. The performance of the proposed method is evaluated through various experiments on real data.
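The three-step pattern in the abstract (generate candidates, score them, choose based on the scores) resembles the standard exponential mechanism from the differential privacy literature. The abstract does not say which selection mechanism the thesis uses, so the following is only a minimal sketch under that assumption. The function name, the candidate labels, and the utility values are hypothetical and serve only as illustration.

```python
import math
import random
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def exponential_mechanism(
    candidates: Sequence[T],
    utility: Callable[[T], float],
    epsilon: float,
    sensitivity: float,
    rng: Optional[random.Random] = None,
) -> T:
    """Pick one candidate with probability proportional to
    exp(epsilon * utility / (2 * sensitivity)).

    Assumption: this is the textbook exponential mechanism, not
    necessarily the selection rule used in the thesis.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    rng = rng or random.Random()

    # Step 2: score every candidate.
    scores = [utility(c) for c in candidates]

    # Subtract the maximum score before exponentiating so the weights
    # stay in [0, 1]; this does not change the sampling distribution.
    max_score = max(scores)
    weights = [
        math.exp(epsilon * (s - max_score) / (2.0 * sensitivity))
        for s in scores
    ]

    # Step 3: sample one candidate according to the weights.
    return rng.choices(candidates, weights=weights, k=1)[0]


# Step 1 (hypothetical): candidate perturbations and made-up utility scores.
example_scores = {"generalization": 0.0, "suppression": 1.0, "insertion": 100.0}
chosen = exponential_mechanism(
    list(example_scores), example_scores.get, epsilon=100.0, sensitivity=1.0
)
```

With a large epsilon the highest-scoring candidate is chosen almost surely; smaller values of epsilon spread the probability more evenly across candidates, which trades utility for stronger privacy.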
Table of Contents
1. Introduction
1.1 Introduction to Data Privacy
1.2 Motivation
1.3 Contributions of the Thesis
1.4 Organization of the Thesis
2. Utility-Preserving Data Anonymization using syntactic privacy model
2.1 Introduction
2.2 Preliminaries
2.2.1 Generalization, Suppression, and Relocation
2.2.2 Data Utility
2.3 The Proposed Method
2.3.1 Basic Concepts
2.3.2 h-ceiling
2.3.3 Insertion of Counterfeit Records
2.3.4 Catalog of Counterfeit Records
2.3.5 Quality Metric
2.3.6 Implementation of Anonymization Algorithm
2.4 Experiments
2.4.1 LM
2.4.2 RCE
2.4.3 Query Error Rate
2.4.4 Real World Analysis
2.5 Chapter Summary
3. Utility-Preserving Data Anonymization using semantic privacy model
3.1 Introduction
3.2 Preliminaries
3.3 The Proposed Method
3.4 Experiments
3.5 Chapter Summary
4. Conclusions and Future Work
4.1 Conclusions
4.2 Future Work

