×

A data-dependent approach for high-dimensional (robust) Wasserstein alignment. (English) Zbl 07887809

MSC:

68Wxx Algorithms in computer science

Software:

node2vec; k-means++; EMD

References:

[1] Aboagye, Prince Osei, Zheng, Yan, Yeh, Chin-Chia Michael, Wang, Junpeng, Zhang, Wei, Wang, Liang, Yang, Hao, and Phillips, Jeff M.. 2022. Normalization of language embeddings for cross-lingual alignment. In Proceedings of the 10th International Conference on Learning Representations (ICLR’22). OpenReview.net.
[2] Agarwal, Pankaj K., Fox, Kyle, Panigrahi, Debmalya, Varadarajan, Kasturi R., and Xiao, Allen. 2017. Faster algorithms for the geometric transportation problem. In Proceedings of the 33rd International Symposium on Computational Geometry (SoCG’17). 7:1-7:16. · Zbl 1432.68481
[3] Agarwal, Pankaj K. and Varadarajan, Kasturi R.. 2004. A near-linear constant-factor approximation for euclidean bipartite matching? In Proceedings of the 20th ACM Symposium on Computational Geometry. 247-252. · Zbl 1374.68628
[4] Ahuja, Ravindra K., Magnanti, Thomas L., and Orlin, James B.. 1993. Network Flows: Theory, Algorithms, and Applications. Prentice Hall. · Zbl 1201.90001
[5] Altschuler, Jason, Weed, Jonathan, and Rigollet, Philippe. 2017. Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration. In Proceedings of the Annual Conference on Neural Information Processing Systems. 1964-1974.
[6] Alvarez-Melis, David and Jaakkola, Tommi S.. 2018. Gromov-wasserstein alignment of word embedding spaces. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Riloff, Ellen, Chiang, David, Hockenmaier, Julia, and Tsujii, Jun’ichi (Eds.). Association for Computational Linguistics, 1881-1890.
[7] Andoni, Alexandr, Ba, Khanh Do, Indyk, Piotr, and Woodruff, David. 2009. Efficient sketches for earth-mover distance, with applications. In Proceedings of the 50th Annual IEEE Symposium on Foundations of Computer Science (FOCS’09). IEEE, 324-330. · Zbl 1292.68160
[8] Andoni, Alexandr, Nikolov, Aleksandar, Onak, Krzysztof, and Yaroslavtsev, Grigory. 2014. Parallel algorithms for geometric graph problems. In Proceedings of the Symposium on Theory of Computing (STOC’14). 574-583. · Zbl 1315.05133
[9] Andoni, Alexandr, Stein, Clifford, and Zhong, Peilin. 2020. Parallel approximate undirected shortest paths via low hop emulators. In Proccedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing (STOC’20), Makarychev, Konstantin, Makarychev, Yury, Tulsiani, Madhur, Kamath, Gautam, and Chuzhoy, Julia (Eds.). ACM, 322-335. · Zbl 07298251
[10] Arthur, David and Vassilvitskii, Sergei. 2007. K-means++ the advantages of careful seeding. In Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms. 1027-1035. · Zbl 1302.68273
[11] Belkin, Mikhail. 2003. Problems of Learning on Manifolds. The University of Chicago. · Zbl 1514.65011
[12] Ben-David, Shai, Blitzer, John, Crammer, Koby, Kulesza, Alex, Pereira, Fernando, and Vaughan, Jennifer Wortman. 2010. A theory of learning from different domains. Mach. Learn.79, 1-2 (2010), 151-175. · Zbl 1470.68081
[13] Besl, P. J. and McKay, Neil D.. 1992. A method for registration of 3-D shapes. IEEE Trans. Pattern Anal. Mach. Intell.14, 2 (1992), 239-256.
[14] Beugnot, Gaspard, Genevay, Aude, Greenewald, Kristjan, and Solomon, Justin. 2021. Improving approximate optimal transport distances using quantization. In Uncertainty in Artificial Intelligence. PMLR, 290-300.
[15] Blitzer, John, Crammer, Koby, Kulesza, Alex, Pereira, Fernando, and Wortman, Jennifer. 2007. Learning bounds for domain adaptation. In Proceedings of the 21st Annual Conference on Neural Information Processing Systems. 129-136. · Zbl 1470.68081
[16] Cabello, Sergio, Giannopoulos, Panos, Knauer, Christian, and Rote, Günter. 2008. Matching point sets with respect to the Earth Mover’s Distance. Comput. Geom.39, 2 (2008), 118-133. · Zbl 1129.65008
[17] Cao, Xudong, Wei, Yichen, Wen, Fang, and Sun, Jian. 2014. Face alignment by explicit shape regression. Int. J. Comput. Vis.107, 2 (2014), 177-190.
[18] Chapel, Laetitia, Flamary, Rémi, Wu, Haoran, Févotte, Cédric, and Gasso, Gilles. 2021. Unbalanced optimal transport through non-negative penalized linear regression. In Proceedings of the Annual Conference on Neural Information Processing Systems (NeurIPS’21), Ranzato, Marc’Aurelio, Beygelzimer, Alina, Dauphin, Yann N., Liang, Percy, and Vaughan, Jennifer Wortman (Eds.). 23270-23282.
[19] Li Chen, Rasmus Kyng, Yang P. Liu, Richard Peng, Maximilian Probst Gutenberg, and Sushant Sachdeva. 2022. Maximum flow and minimum-cost flow in almost-linear time. In IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS’22). IEEE, 612-623.
[20] Chen, Xi, Jayaram, Rajesh, Levi, Amit, and Waingarten, Erik. 2022. New streaming algorithms for high dimensional EMD and MST. In Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing. 222-233. · Zbl 07774335
[21] Cohen, Scott and Guibas, Leonidas. 1999. The earth mover’s distance under transformation sets. In Proceedings of the 7th IEEE International Conference on Computer Vision. 1.
[22] Cornea, Nicu D., Demirci, M. Fatih, Silver, Deborah, Dickinson, S. J., and Kantor, P. B.. 2005. 3D object retrieval using many-to-many matching of curve skeletons. In Proceedings of the International Conference on Shape Modeling and Applications. IEEE, 366-371.
[23] Courty, Nicolas, Flamary, Rémi, Tuia, Devis, and Rakotomamonjy, Alain. 2016. Optimal transport for domain adaptation. IEEE Trans. Pattern Anal. Mach. Intell.39, 9 (2016), 1853-1865.
[24] Courty, Nicolas, Flamary, Rémi, Tuia, Devis, and Rakotomamonjy, Alain. 2017. Optimal transport for domain adaptation. IEEE Trans. Pattern Anal. Mach. Intell.39, 9 (2017), 1853-1865.
[25] Cuturi, Marco. 2013. Sinkhorn distances: Lightspeed computation of optimal transport. In Proceedings of the 27th Annual Conference on Neural Information Processing Systems.2292-2300.
[26] Dasgupta, Sanjoy and Sinha, Kaushik. 2013. Randomized partition trees for exact nearest neighbor search. In Proceedinsg of the Conference on Learning Theory. 317-337. · Zbl 1311.68148
[27] Dev, Sunipa, Hassan, Safia, and Phillips, Jeff M.. 2021. Closed form word embedding alignment. Knowl. Inf. Syst.63, 3 (2021), 565-588.
[28] Ding, Hu, Chen, Tan, Yang, Fan, and Wang, Mingyue. 2021. A data-dependent algorithm for querying earth mover’s distance with low doubling dimensions. In Proceedings of the SIAM International Conference on Data Mining (SDM’21), Demeniconi, Carlotta and Davidson, Ian (Eds.). SIAM, 630-638.
[29] Ding, Hu and Xu, Jinhui. 2017. FPTAS for minimizing the earth mover’s distance under rigid transformations and related problems. Algorithmica78, 3 (2017), 741-770. · Zbl 1378.68193
[30] Ding, Hu and Ye, Mingquan. 2019. On geometric alignment in low doubling dimension. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 33. 1460-1467.
[31] Dan Feldman. 2020. Core-sets: An updated survey. WIREs Data Min. Knowl. Discov. 10, 1 (2020), 23-44. · Zbl 1436.62707
[32] Kyle Fox and Jiashuai Lu. 2022. A near-linear time approximation scheme for geometric transportation with arbitrary supplies and spread. J. Comput. Geom. 13, 1 (2022), 204-225. · Zbl 07692357
[33] Goldberg, Andrew V. and Tarjan, Robert Endre. 1989. Finding minimum-cost circulations by canceling negative cycles. J. ACM36, 4 (1989), 873-886. · Zbl 0697.68063
[34] Gong, Boqing, Shi, Yuan, Sha, Fei, and Grauman, Kristen. 2012. Geodesic flow kernel for unsupervised domain adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2066-2073.
[35] Gonzalez, Teofilo F.. 1985. Clustering to minimize the maximum intercluster distance. Theor. Comput. Sci.38 (1985), 293-306. · Zbl 0567.62048
[36] Grave, Edouard, Joulin, Armand, and Berthet, Quentin. 2019. Unsupervised alignment of embeddings with wasserstein procrustes. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics. PMLR, 1880-1890.
[37] Grover, Aditya and Leskovec, Jure. 2016. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 855-864.
[38] Ham, Jihun, Lee, Daniel D., and Saul, Lawrence K.. 2005. Semisupervised alignment of manifolds. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS’05). 120-127.
[39] Har-Peled, Sariel and Mendel, Manor. 2006. Fast construction of nets in low-dimensional metrics and their applications. SIAM J. Comput.35, 5 (2006), 1148-1184. · Zbl 1100.68014
[40] Indyk, Piotr. 2007. A near linear time constant factor approximation for Euclidean bichromatic matching (cost). In Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics, 39-42. · Zbl 1302.68290
[41] Indyk, Piotr and Thaper, N.. 2003. Fast color image retrieval via embeddings. In Workshop on Statistical and Computational Theories of Vision (at ICCV’03).
[42] Jin, Kun, Liu, Chaoyue, and Xia, Cathy. 2021. Two-sided wasserstein procrustes analysis. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI’21). 3515-3521.
[43] Jubran, Ibrahim, Maalouf, Alaa, Kimmel, Ron, and Feldman, Dan. 2021. Provably approximated ICP. . Retrieved from https://arxiv.org/abs/2101.03588.
[44] Karger, David R. and Ruhl, Matthias. 2002. Finding nearest neighbors in growth-restricted metrics. In Proceedings of the 34th Annual ACM Symposium on Theory of Computing. ACM, 741-750. · Zbl 1192.68750
[45] Khesin, Andrey Boris, Nikolov, Aleksandar, and Paramonov, Dmitry. 2019. Preconditioning for the geometric transportation problem. In Proceedings of the 35th International Symposium on Computational Geometry. 15:1-15:14. · Zbl 1507.68326
[46] Kim, Wan Kyu and Marcotte, Edward M.. 2008. Age-dependent evolution of the yeast protein interaction network suggests a limited role of gene duplication and divergence. PLoS Comput. Biol.4, 11 (2008), e1000232.
[47] Klein, Oliver and Veltkamp, Remco C.. 2005. Approximation algorithms for computing the earth mover’s distance under transformations. In International Symposium on Algorithms and Computation. Springer, 1019-1028. · Zbl 1175.68561
[48] Krauthgamer, Robert and Lee, James R.. 2004. Navigating nets: Simple algorithms for proximity search. In Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics, 798-807. · Zbl 1318.68071
[49] Kusner, Matt, Sun, Yu, Kolkin, Nicholas, and Weinberger, Kilian. 2015. From word embeddings to document distances. In Proceedings of the International Conference on Machine Learning. 957-966.
[50] Laakso, Tomi J.. 2002. Plane with \(A_{\infty }\) -weighten metric not Bilipschitz embeddable to \(R^n\) .Bull. Lond. Math. Soc.34, 6 (2002), 667-676. · Zbl 1029.30014
[51] Gall, François Le. 2012. Faster algorithms for rectangular matrix multiplication. In Proceedings of the IEEE 53rd Annual Symposium on Foundations of Computer Science. IEEE, 514-523.
[52] Lee, Yin Tat and Sidford, Aaron. 2014. Path finding methods for linear programming: Solving linear programs in Õ(vrank) iterations and faster algorithms for maximum flow. In Proceedings of the 55th IEEE Annual Symposium on Foundations of Computer Science. 424-433.
[53] Li, Shi. 2010. On constant factor approximation for earth mover distance over doubling metrics. CoRRabs/1002.4034.
[54] Liu, Yangwei, Ding, Hu, Chen, Danyang, and Xu, Jinhui. 2017. Novel geometric approach for global alignment of PPI networks. In Proceedings of the 31st AAAI Conference on Artificial Intelligence. 31-37.
[55] Lloyd, Stuart. 1982. Least squares quantization in PCM. IEEE Trans. Inf. Theory28, 2 (1982), 129-137. · Zbl 0504.94015
[56] Malod-Dognin, Noël, Ban, Kristina, and Pržulj, Nataša. 2017. Unified alignment of protein-protein interaction networks. Sci. Rep.7, 1 (2017), 953.
[57] Maltoni, Davide, Maio, Dario, Jain, Anil K., and Prabhakar, Salil. 2009. Handbook of Fingerprint Recognition. Springer Science & Business Media. · Zbl 1027.68114
[58] Mikolov, Tomas, Le, Quoc V., and Sutskever, Ilya. 2013. Exploiting similarities among languages for machine translation. arXiv:1309.4168. Retrieved from https://arxiv.org/abs/1309.4168.
[59] Mukherjee, Debarghya, Guha, Aritra, Solomon, Justin M., Sun, Yuekai, and Yurochkin, Mikhail. 2021. Outlier-robust optimal transport. In International Conference on Machine Learning. PMLR, 7850-7860.
[60] Munteanu, Alexander and Schwiegelshohn, Chris. 2018. Coresets-methods and history: A theoreticians design pattern for approximation and streaming algorithms. Künstl. Intell.32, 1 (2018), 37-53.
[61] Nasser, Soliman, Jubran, Ibrahim, and Feldman, Dan. 2015. Low-cost and faster tracking systems using core-sets for pose-estimation. CoRRabs/1511.09120 (2015).
[62] Orlin, James B.. 1988. A faster strongly polynominal minimum cost flow algorithm. In Proceedings of the 20th Annual ACM Symposium on Theory of Computing. 377-387.
[63] Orlin, James B.. 1997. A polynomial time primal network simplex algorithm for minimum cost flows. Math. Program.78, 2 (1997), 109-129. · Zbl 0888.90058
[64] Orlin, James B., Plotkin, Serge A., and Tardos, Éva. 1993. Polynomial dual network simplex algorithms. Math. Program.60, 1-3 (1993), 255-276. · Zbl 0784.90097
[65] Pan, Sinno Jialin and Yang, Qiang. 2010. A survey on transfer learning. IEEE Trans. Knowl. Data Eng.22, 10 (2010), 1345-1359.
[66] Perrot, Michaël, Courty, Nicolas, Flamary, Rémi, and Habrard, Amaury. 2016. Mapping estimation for discrete optimal transport. In Advances in Neural Information Processing Systems, 29.
[67] Rubner, Yossi, Tomasi, Carlo, and Guibas, Leonidas J.. 2000. The earth mover’s distance as a metric for image retrieval. Int. J. Comput. Vis.40, 2 (2000), 99-121. · Zbl 1012.68705
[68] Mohammad, Ebrahim Sahraeian Sayed and Yoon, Byung-Jun. 2012. A network synthesis model for generating protein interaction network families. PloS One7 (August2012).
[69] Schönemann, Peter H.. 1966. A generalized solution of the orthogonal procrustes problem. Psychometrika31, 1 (1966), 1-10. · Zbl 0147.19401
[70] Sharathkumar, R. and Agarwal, Pankaj K.. 2012. Algorithms for the transportation problem in geometric settings. In Proceedings of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA’12). 306-317. · Zbl 1423.90023
[71] Sharathkumar, R. and Agarwal, Pankaj K.. 2012. A near-linear time \(\epsilon \) -approximation algorithm for geometric bipartite matching. In Proceedings of the 44th Symposium on Theory of Computing Conference (STOC’12). 385-394. · Zbl 1286.05140
[72] Sherman, Jonah. 2017. Generalized preconditioning and undirected minimum-cost flow. In Proceedings of the 28th Annual ACM-SIAM Symposium on Discrete Algorithms. 772-780. · Zbl 1422.90063
[73] Solé, Ricard V., Pastor-Satorras, Romualdo, Smith, Eric, and Kepler, Thomas B.. 2002. A model of large-scale proteome evolution. Adv, Complex Syst,5, 01 (2002), 43-54. · Zbl 1020.92024
[74] Talwar, Kunal. 2004. Bypassing the embedding: Algorithms for low dimensional metrics. In Proceedings of the 36th Annual ACM Symposium on Theory of Computing. 281-290. · Zbl 1192.68918
[75] Tardos, Éva. 1985. A strongly polynomial minimum cost circulation algorithm. Combinatorica5, 3 (1985), 247-256. · Zbl 0596.90030
[76] Todorovic, Sinisa and Ahuja, Narendra. 2008. Region-based hierarchical image matching. Int. J. Comput. Vis.78, 1 (2008), 47-66.
[77] Vaidya, Pravin M.. 1989. Geometry helps in matching. SIAM J. Comput.18, 6 (1989), 1201-1225. · Zbl 0692.68042
[78] Varadarajan, Kasturi R. and Agarwal, Pankaj K.. 1999. Approximation algorithms for bipartite and non-bipartite matching in the plane. In Proceedings of the 10th Annual ACM-SIAM Symposium on Discrete Algorithms. 805-814. · Zbl 0929.65036
[79] Vázquez, Alexei, Flammini, Alessandro, Maritan, Amos, and Vespignani, Alessandro. 2003. Modeling of protein interaction networks. Complexus1, 1 (2003), 38-44.
[80] Villani, Cédric. 2008. Topics in optimal transportation. Am. Math. Soc.58 (2008). · Zbl 1106.90001
[81] Wahba, Grace. 1965. A least squares estimate of satellite attitude. SIAM Rev.7, 3 (1965), 409-409.
[82] C. Wang, P. Krafft, S. Mahadevan, Y. Ma, and Y. Fu. 2011. Manifold alignment. In Manifold Learning: Theory and Applications, CRC Press. 95-120.
[83] Yin, Zekun, Lan, Haidong, Tan, Guangming, Lu, Mian, Vasilakos, Athanasios V., and Liu, Weiguo. 2017. Computing platforms for big biological data analytics: Perspectives and challenges. Comput. Struct. Biotechnol. J.15 (2017), 403-411.
[84] Youn, Hyejin, Sutton, Logan, Smith, Eric, Moore, Cristopher, Wilkins, Jon F., Maddieson, Ian, Croft, William, and Bhattacharya, Tanmoy. 2016. On the universal structure of human lexical semantics. Proc. Natl. Acad. Sci. U.S.A.113, 7 (2016), 1766-1771.
[85] Zhang, Meng, Liu, Yang, Luan, Huanbo, and Sun, Maosong. 2017. Earth mover’s distance minimization for unsupervised bilingual lexicon induction. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2017). 1934-1945.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.