×

Edge differentially private estimation in the \(\beta\)-model via jittering and method of moments. (English) Zbl 1539.62065

Summary: A standing challenge in data privacy is the trade-off between the level of privacy and the efficiency of statistical inference. Here, we conduct an in-depth study of this trade-off for parameter estimation in the \(\beta\)-model [S. Chatterjee et al., Ann. Appl. Probab. 21, No. 4, 1400–1435 (2011; Zbl 1234.05206)] for edge differentially private network data released via jittering [V. Karwa et al., J. R. Stat. Soc., Ser. C, Appl. Stat. 66, No. 3, 481–500 (2017; doi:10.1111/rssc.12185)]. Unlike most previous approaches based on maximum likelihood estimation for this network model, we proceed via the method of moments. This choice facilitates our exploration of a substantially broader range of privacy levels – corresponding to stricter privacy – than has been to date. Over this new range, we discover our proposed estimator for the parameters exhibits an interesting phase transition, with both its convergence rate and asymptotic variance following one of three different regimes of behavior depending on the level of privacy. Because identification of the operable regime is difficult, if not impossible in practice, we devise a novel adaptive bootstrap procedure to construct uniform inference across different phases. In fact, leveraging this bootstrap we are able to provide for simultaneous inference of all parameters in the \(\beta\)-model (i.e., equal to the number of nodes), which, to our best knowledge, is the first result of its kind. Numerical experiments confirm the competitive and reliable finite sample performance of the proposed inference methods, next to a comparable maximum likelihood method, as well as significant advantages in terms of computational speed and memory.

MSC:

62F12 Asymptotic properties of parametric estimators
68P27 Privacy of data
91D30 Social networks; opinion dynamics

Citations:

Zbl 1234.05206

Software:

PrivateLR

References:

[1] Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B 57 289-300. MathSciNet: MR1325392 · Zbl 0809.62014
[2] BLOCKI, J., BLUM, A., DATTA, A. and SHEFFET, O. (2013). Differentially private data analysis of social networks via restricted sensitivity. In ITCS’13—Proceedings of the 2013 ACM Conference on Innovations in Theoretical Computer Science 87-96. ACM, New York. MathSciNet: MR3385388 · Zbl 1361.68078
[3] CHANG, J., CHEN, X. and WU, M. (2024). Central limit theorems for high dimensional dependent data. Bernoulli 30 712-742. Digital Object Identifier: 10.3150/23-bej1614 Google Scholar: Lookup Link MathSciNet: MR4665595 · Zbl 1530.60022 · doi:10.3150/23-bej1614
[4] CHANG, J., HU, Q., KOLACZYK, E. D., YAO, Q. and YI, F. (2024). Supplement to “Edge differentially private estimation in the \(β\)-model via jittering and method of moments.” https://doi.org/10.1214/24-AOS2365SUPP
[5] CHANG, J., KOLACZYK, E. D. and YAO, Q. (2020). Discussion of ‘Network cross-validation by edge sampling’ [4108931]. Biometrika 107 277-280. Digital Object Identifier: 10.1093/biomet/asaa017 Google Scholar: Lookup Link MathSciNet: MR4496202 · Zbl 1441.62044 · doi:10.1093/biomet/asaa017
[6] CHANG, J., KOLACZYK, E. D. and YAO, Q. (2022). Estimation of subgraph densities in noisy networks. J. Amer. Statist. Assoc. 117 361-374. Digital Object Identifier: 10.1080/01621459.2020.1778482 Google Scholar: Lookup Link MathSciNet: MR4399091 · Zbl 1506.62301 · doi:10.1080/01621459.2020.1778482
[7] CHANG, J., QIU, Y., YAO, Q. and ZOU, T. (2018). Confidence regions for entries of a large precision matrix. J. Econometrics 206 57-82. Digital Object Identifier: 10.1016/j.jeconom.2018.03.020 Google Scholar: Lookup Link MathSciNet: MR3840783 · Zbl 1398.62068 · doi:10.1016/j.jeconom.2018.03.020
[8] CHANG, J., ZHENG, C., ZHOU, W.-X. and ZHOU, W. (2017). Simulation-based hypothesis testing of high dimensional means under covariance heterogeneity. Biometrics 73 1300-1310. Digital Object Identifier: 10.1111/biom.12695 Google Scholar: Lookup Link MathSciNet: MR3744543 · Zbl 1405.62162 · doi:10.1111/biom.12695
[9] Chatterjee, S., Diaconis, P. and Sly, A. (2011). Random graphs with a given degree sequence. Ann. Appl. Probab. 21 1400-1435. Digital Object Identifier: 10.1214/10-AAP728 Google Scholar: Lookup Link MathSciNet: MR2857452 · Zbl 1234.05206 · doi:10.1214/10-AAP728
[10] CHEN, M., KATO, K. and LENG, C. (2021). Analysis of networks via the sparse \(β\)-model. J. R. Stat. Soc. Ser. B. Stat. Methodol. 83 887-910. Digital Object Identifier: 10.1111/rssb.12444 Google Scholar: Lookup Link MathSciNet: MR4349121 · Zbl 07555285 · doi:10.1111/rssb.12444
[11] Chernozhukov, V., Chetverikov, D. and Kato, K. (2013). Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors. Ann. Statist. 41 2786-2819. Digital Object Identifier: 10.1214/13-AOS1161 Google Scholar: Lookup Link MathSciNet: MR3161448 · Zbl 1292.62030 · doi:10.1214/13-AOS1161
[12] CHERNOZHUKOV, V., CHETVERIKOV, D. and KATO, K. (2017). Central limit theorems and bootstrap in high dimensions. Ann. Probab. 45 2309-2352. Digital Object Identifier: 10.1214/16-AOP1113 Google Scholar: Lookup Link MathSciNet: MR3693963 · Zbl 1377.60040 · doi:10.1214/16-AOP1113
[13] DE LA PEÑA, V. H. and MONTGOMERY-SMITH, S. J. (1995). Decoupling inequalities for the tail probabilities of multivariate \(U\)-statistics. Ann. Probab. 23 806-816. MathSciNet: MR1334173 · Zbl 0827.60014
[14] DWORK, C. (2006). Differential privacy. In 33rd International Colloquium on Automata, Languages and Programming. Lecture Notes. Comput. Sci. 4052 1-12. Springer, Berlin. Digital Object Identifier: 10.1007/11787006_1 Google Scholar: Lookup Link MathSciNet: MR2307219 · Zbl 1133.68330 · doi:10.1007/11787006_1
[15] Dwork, C., McSherry, F., Nissim, K. and Smith, A. (2006). Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography. Lecture Notes in Computer Science 3876 265-284. Springer, Berlin. Digital Object Identifier: 10.1007/11681878_14 Google Scholar: Lookup Link MathSciNet: MR2241676 · Zbl 1112.94027 · doi:10.1007/11681878_14
[16] FAN, Y., ZHANG, H. and YAN, T. (2020). Asymptotic theory for differentially private generalized \(β\)-models with parameters increasing. Stat. Interface 13 385-398. Digital Object Identifier: 10.4310/SII.2020.v13.n3.a8 Google Scholar: Lookup Link MathSciNet: MR4091804 · Zbl 07214271 · doi:10.4310/SII.2020.v13.n3.a8
[17] GINÉ, E., LATAŁA, R. and ZINN, J. (2000). Exponential and moment inequalities for \(U\)-statistics. In High Dimensional Probability, II (Seattle, WA, 1999). Progress in Probability 47 13-38. Birkhäuser, Boston, MA. MathSciNet: MR1857312 · Zbl 0969.60024
[18] HENNIG, C. (2007). Cluster-wise assessment of cluster stability. Comput. Statist. Data Anal. 52 258-271. Digital Object Identifier: 10.1016/j.csda.2006.11.025 Google Scholar: Lookup Link MathSciNet: MR2409980 · Zbl 1452.62447 · doi:10.1016/j.csda.2006.11.025
[19] JIANG, H., PEI, J., YU, D., YU, J., GONG, B. and CHENG, X. (2020). Applications of differential privacy in social network analysis: A survey. IEEE Trans. Knowl. Data Eng. 35 108-127.
[20] KARWA, V., KRIVITSKY, P. N. and SLAVKOVIĆ, A. B. (2017). Sharing social network data: Differentially private estimation of exponential family random-graph models. J. R. Stat. Soc. Ser. C. Appl. Stat. 66 481-500. Digital Object Identifier: 10.1111/rssc.12185 Google Scholar: Lookup Link MathSciNet: MR3632338 · doi:10.1111/rssc.12185
[21] KARWA, V. and SLAVKOVIĆ, A. (2016). Inference using noisy degrees: Differentially private \(β\)-model and synthetic graphs. Ann. Statist. 44 87-112. Digital Object Identifier: 10.1214/15-AOS1358 Google Scholar: Lookup Link MathSciNet: MR3449763 · Zbl 1331.62114 · doi:10.1214/15-AOS1358
[22] MUKHERJEE, R., MUKHERJEE, S. and SEN, S. (2018). Detection thresholds for the \(β\)-model on sparse graphs. Ann. Statist. 46 1288-1317. Digital Object Identifier: 10.1214/17-AOS1585 Google Scholar: Lookup Link MathSciNet: MR3798004 · Zbl 1392.62131 · doi:10.1214/17-AOS1585
[23] NISSIM, K., RASKHODNIKOVA, S. and SMITH, A. (2007). Smooth sensitivity and sampling in private data analysis. In STOC’07—Proceedings of the 39th Annual ACM Symposium on Theory of Computing 75-84. ACM, New York. Digital Object Identifier: 10.1145/1250790.1250803 Google Scholar: Lookup Link MathSciNet: MR2402430 · Zbl 1232.68039 · doi:10.1145/1250790.1250803
[24] RINALDO, A., PETROVIĆ, S. and FIENBERG, S. E. (2013). Maximum likelihood estimation in the \(β\)-model. Ann. Statist. 41 1085-1110. Digital Object Identifier: 10.1214/12-AOS1078 Google Scholar: Lookup Link MathSciNet: MR3113804 · Zbl 1292.62052 · doi:10.1214/12-AOS1078
[25] STEIN, S. and LENG, C. (2020). A sparse \(β\)-model with covariates for networks. Preprint. Available at arXiv:2010.13604.
[26] Wasserman, L. and Zhou, S. (2010). A statistical framework for differential privacy. J. Amer. Statist. Assoc. 105 375-389. Digital Object Identifier: 10.1198/jasa.2009.tm08651 Google Scholar: Lookup Link MathSciNet: MR2656057 · Zbl 1364.62011 · doi:10.1198/jasa.2009.tm08651
[27] YAN, T. and XU, J. (2013). A central limit theorem in the \(β\)-model for undirected random graphs with a diverging number of vertices. Biometrika 100 519-524. Digital Object Identifier: 10.1093/biomet/ass084 Google Scholar: Lookup Link MathSciNet: MR3068452 · Zbl 1452.62214 · doi:10.1093/biomet/ass084
[28] ZHANG, Y., WANG, Q., ZHANG, Y., YAN, T. and LUO, J. (2021). L-2 regularized maximum likelihood for \(β\)-model in large and sparse networks. Preprint. Available at arXiv:2110.11856.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.