Differentially private generative decomposed adversarial network for vertically partitioned data sharing. (English) Zbl 07834432

Summary: This paper considers the problem of differentially private sharing of vertically partitioned data. In particular, with the assistance of a semi-honest curator, the involved parties (i.e., data owners), each holding records on different features of the same set of individuals, collectively generate a shared dataset while satisfying differential privacy. Motivated by the strength of the generative adversarial network (GAN) in data synthesis, we present a differentially private generative decomposed adversarial network (DPGDAN) approach for vertically partitioned data sharing. In DPGDAN, the discriminator of the original GAN is decomposed into several local discriminators and two relational discriminators, i.e., a real relational discriminator and a fake relational discriminator. Each party is assigned a local discriminator, while the curator holds the generator and the two relational discriminators. By combining the sanitized feedback from the local discriminators with the outputs of the relational discriminators, the curator can update the generator to approximate the distribution of the integrated dataset without compromising any party’s privacy. Moreover, to improve the utility of the shared data, we design a snapshot aggregation method that aggregates the synthetic records produced by the generators obtained during training. Furthermore, we show that DPGDAN satisfies \((\epsilon, \delta)\)-differential privacy. Extensive experiments validate the effectiveness of DPGDAN.
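The summary does not state how the local discriminators' feedback is sanitized or how snapshots are combined, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes sanitization by L2 clipping followed by Gaussian noise (a common choice for \((\epsilon, \delta)\)-differential privacy) and assumes snapshot aggregation means pooling equal numbers of synthetic records from each saved generator. All function names and parameters (`clip_l2`, `sanitize_feedback`, `aggregate_snapshots`, `noise_multiplier`) are hypothetical.

```python
import math
import random


def clip_l2(vec, clip_norm):
    """Scale vec so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in vec]


def sanitize_feedback(vec, clip_norm, noise_multiplier, rng):
    """Assumed sanitizer: clip, then add Gaussian noise with
    standard deviation noise_multiplier * clip_norm per coordinate."""
    clipped = clip_l2(vec, clip_norm)
    sigma = noise_multiplier * clip_norm
    return [x + rng.gauss(0.0, sigma) for x in clipped]


def aggregate_snapshots(snapshots, n_records, rng):
    """Assumed aggregation: pool an equal number of synthetic
    records from each generator snapshot."""
    per_snapshot = n_records // len(snapshots)
    pooled = []
    for generate in snapshots:
        pooled.extend(generate(rng) for _ in range(per_snapshot))
    return pooled


rng = random.Random(0)

# Feedback from one party's local discriminator (toy values).
feedback = [3.0, 4.0]
clipped = clip_l2(feedback, clip_norm=1.0)
noisy = sanitize_feedback(feedback, clip_norm=1.0, noise_multiplier=1.1, rng=rng)

# Toy stand-ins for generators saved at three points during training.
snapshots = [
    (lambda r, m=m: (r.gauss(m, 1.0), r.gauss(-m, 1.0)))
    for m in (0.0, 0.5, 1.0)
]
shared = aggregate_snapshots(snapshots, n_records=300, rng=rng)
```

In this sketch the privacy level would depend on `noise_multiplier`, the clipping bound, and the number of training rounds; the accounting that yields a concrete \((\epsilon, \delta)\) is not given in the summary and is not reproduced here.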

MSC:

68-XX Computer science
91-XX Game theory, economics, finance, and other social and behavioral sciences
