Privacy-Preserving k-means Clustering of Encrypted Data
암호화된 데이터에 대한 프라이버시를 보존하는 k-means 클러스터링 기법
- 주제(키워드) Privacy-preserving clustering , Fully homomorphic encryption , k-means clustering
- 발행기관 고려대학교 정보보호대학원
- 지도교수 이동훈
- 발행년도 2019
- 학위수여년월 2019. 2
- 학위구분 석사
- 학과 정보보호대학원 정보보호학과
- 원문페이지 52 p
- 실제URI http://www.dcollection.net/handler/korea/000000083500
- UCI I804:11009-000000083500
- DOI 10.23186/korea.000000083500.11009.0000823
- 본문언어 영어
- 제출원본 000045978825
초록/요약
The k-means clustering algorithm groups input data with the number of groups represented by variable k. In fact, this algorithm is particularly useful in market segmentation and medical research, suggesting its wide applicability. In this paper, we propose a privacy-preserving clustering algorithm that is appropriate for outsourced encrypted data, while exposing no information about the input data itself. Notably, our proposed model facilitates encryption of all data, which is a large advantage over existing privacy-preserving clustering algorithms which rely on multi-party computation over plaintext data stored on several servers. Our approach compares homomorphically encrypted ciphertexts to measure the distance between input data. Finally, we theoretically prove that our scheme guarantees the security of input data during computation, and also evaluate our communication and computation complexity in detail.
more목차
1. Introduction 1
1.1 Motivation 1
1.2 Related Work 3
1.3 Contribution 5
2. Preliminaries 8
2.1 k-Means Clustering Algorithm 8
2.2 Lagrange Interpolating Polynomial 9
2.3 Ring Learning with Errors Problem 9
2.4 Fully Homomorphic Encryption 10
2.4.1 Integer-wise FHE 11
2.4.2 Bit-wise FHE 11
2.5 Sorting FHE Data 12
2.5.1 Integer-wise Comparison of FHE Data 12
2.5.2 Bit-wise Comparison of FHE Data 12
3. System Model and Threat Model 15
3.1 System Model 15
3.2 Threat Models 18
4. User Privacy-Preserving k-means Clustering Protocol(UP-kCP) 20
4.1 Preparation Phase 21
4.2 Evaluation Phase 22
4.3 Recomputing Mean Phase 28
4.4 Revealing Phase 31
5. Analysis 32
5.1 Security Analysis 32
5.1.1 Security Analysis of Secure Minimum Protocol 32
5.1.2 Security Analysis of Secure Average Protocol 33
5.1.3 Security Analysis of Revealing Mean Protocol 34
5.2 Comparison and Analysis of Communication and Computation Overhead 35
6. Conclusion 37

