Kybernetika 59 no. 2, 234-253, 2023

TPM: Transition probability matrix -- Graph structural feature based embedding

Sarmad N. Mohammed and Semra GündüçDOI: 10.14736/kyb-2023-2-0234

Abstract:

In this work, Transition Probability Matrix (TPM) is proposed as a new method for extracting the features of nodes in the graph. The proposed method uses random walks to capture the connectivity structure of a node's close neighborhood. The information obtained from random walks is converted to anonymous walks to extract the topological features of nodes. In the embedding process of nodes, anonymous walks are used since they capture the topological similarities of connectivities better than random walks. Therefore the obtained embedding vectors have richer information about the underlying connectivity structure. The method is applied to node classification and link prediction tasks. The performance of the proposed algorithm is superior to the state-of-the-art algorithms in the recent literature. Moreover, the extracted information about the connectivity structure of similar networks is used to link prediction and node classification tasks for a completely new graph.

Keywords:

graph representation learning, feature learning, link prediction, node classification, anonymous random walk

Classification:

05C82, 94C15

References:

  1. R. Albert and A. Barabási: Statistical mechanics of complex networks. Rev. Mod. Phys. 74 (2002), 47-97.   DOI:10.1103/RevModPhys.74.47
  2. A. Barabási: Network science. Phil. Trans. R. Soc. A. 371 (2013), 20120375.   DOI:10.1098/rsta.2012.0375
  3. A. Barabási and E. Bonabeau: Scale-free networks. Scientif. Amer. 288 (2003), 60-69.   DOI:10.1038/scientificamerican0303-60
  4. P. W. Battaglia, J. B. Hamrick, V. Bapst, A. Sanchez-Gonzalez, V. Zambaldi and et al.: Relational inductive biases, deep learning, and graph networks. ArXiv, 2018.   https://arxiv.org/abs/1806.01261
  5. S. Bhagat, G. Cormode and S. Muthukrishnan: Node classification in social networks. In: Social Network Data Analytics (C. Aggarwal, ed.), Springer, Boston 2011.   DOI:10.1007/978-1-4419-8462-3\_5
  6. D. Buffelli and F. Vandin: The impact of global structural information in graph neural networks applications. Data 7 (2022), 10.   DOI:10.3390/data7010010
  7. P. Cetin and S. Emrah Amrahov: A new overlapping community detection algorithm based on similarity of neighbors in complex networks. Kybernetika 58 (2022), 277-300.   DOI:10.14736/kyb-2022-2-0277
  8. P. Cetin and Ş. E. Amrahov: A new network-based community detection algorithm for disjoint communities. Turk. J. Electr. Eng. Co. 30 (2022), 2190-2205.   DOI:10.55730/1300-0632.3933
  9. H. Cherifi, G. Palla, B. K. Szymanski and X. Lu: On community structure in complex networks: challenges and opportunities. Appl. Netw. Sci. 4 (2019), 1-35.   DOI:10.1007/s41109-019-0238-9
  10. K. Chi, G. Yin, Y. Dong and H. Dong: Link prediction in dynamic networks based on the attraction force between nodes. Knowledge-Based Systems 181 (2019), 0950-7051.   DOI:10.1016/j.knosys.2019.05.035
  11. S. Fortunato: Community detection in graphs. Phys. Rep. 486 (2010), 75-174.   DOI:10.1016/j.physrep.2009.11.002
  12. C. L. Giles, K. D. Bollacker and S. Lawrence: CiteSeer: An automatic citation indexing system. In: Proc. Third ACM conference on Digital libraries, 1998, pp. 89-98.   CrossRef
  13. M. Girvan and M. E. J. Newman: Community structure in social and biological networks. Proc. Nat. Acad. Sci. 99 (2002), 7821-7826.   DOI:10.1073/pnas.122653799
  14. A. Grover and J. Leskovec: node2vec: Scalable feature learning for networks. In: Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York 2016, pp. 855-864.   DOI:10.1145/2939672.2939754
  15. W. Hamilton, Z. Ying and J. Leskovec: Inductive representation learning on large graphs. In: 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach 2017.   CrossRef
  16. X. Han, L. Wang, C. Cui, J. Ma and S. Zhang: Linking multiple online identities in criminal investigations: A spectral co-clustering framework. IEEE Trans. Inf. Forensics Security 12 (2017), 2242-2255.   DOI:10.1109/TIFS.2017.2704906
  17. J. A. Hanley and B. J. McNeil: The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143 (1982), 29-36.   DOI:10.1148/radiology.143.1.7063747
  18. S. Ivanov and E. Burnaev: Anonymous walk embeddings. In: Proc. 35th International Conference on Machine Learning, 2018, pp. 2186-2195.   CrossRef
  19. T. Khafaei, A. Tavakoli Taraghi, M. Hosseinzadeh and A. Rezaee: Tracing temporal communities and event prediction in dynamic social networks. Soc. Netw. Anal. Min. 9 (2019), 1-11.   DOI:10.1007/s13278-019-0604-8
  20. T. N. Kipf and M. Welling: Semi-supervised classification with graph convolutional networks. https://arxiv.org/abs/1609.02907, 2016.   CrossRef
  21. G. Kossinets and D. J. Watts: Empirical analysis of an evolving social network. Science 311 (2006), 88-90.   DOI:10.1126/science.1116869
  22. A. Lancichinetti, S. Fortunato and F. Radicchi: Benchmark graphs for testing community detection algorithms. Phys. Rev. E 78 (2008), 046110.   DOI:10.1103/PhysRevE.78.046110
  23. D. Liben-Nowell and J. Kleinberg: The link-prediction problem for social networks. In: Proc. twelfth international conference on Information and knowledge management, New Orleans 2003, pp. 556-559.   DOI:10.1145/956863.956972
  24. V. Martínez, F. Berzal and J. Cubero: A survey of link prediction in complex networks. ACM Comput. Surv. 49 (2016), 1-33.   DOI:10.1145/3012704
  25. A. K. McCallum, K. Nigam, J. Rennie and K. Seymore: Automating the construction of internet portals with machine learning. Inform, Retrieval 3 (2000), 127-163.   DOI:10.1023/A:1009953814988
  26. A. Mele: A structural model of homophily and clustering in social networks. J. Bus. Econom. Statist. 40 (2022), 1377-1389.   DOI:10.1080/07350015.2021.1930013
  27. S. Micali and Z. A. Zhu: Reconstructing markov processes from independent and anonymous experiments. Discrete Appl. Math. 200 (2016), 108-122.   DOI:10.1016/j.dam.2015.06.035
  28. S. N. Mohammed and S. Gündüç: Degree-based random walk approach for graph embedding. Turk. J. Electr. Eng. Co. 30 (2022), 1868-1881.   DOI:10.55730/1300-0632.3910
  29. B. Molokwu, S. B. Shuvo, N. C. Kar and Z. Kobti: Node classification and link prediction in social graphs using RLVECN. In: 32nd International Conference on Scientific and Statistical Database Management, Vienna 2020, pp. 1-10.   DOI:10.1145/3400903.3400928
  30. G. Palla, I. Derényi, I. Farkas and T. Vicsek: Uncovering the overlapping community structure of complex networks in nature and society. Nature 435 (2005), 814-818.   DOI:10.1038/nature03607
  31. M. E. G. Pavlopoulou, G. Tzortzis, D. Vogiatzis and G. Paliouras: Predicting the evolution of communities in social networks using structural and temporal features. In: 12th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP), Bratislava 2017, pp. 40-45.   DOI:10.1109/SMAP.2017.8022665
  32. B. Perozzi, R. Al-Rfou and S. Skiena: Deepwalk: Online learning of social representations. In: Proc. 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York 2014, pp. 701-710.   DOI:10.1145/2623330.2623732
  33. P. Sen, G. Namata, M. Bilgic, L. Getoor, B. Galligher and T. Eliassi-Rad: Collective classification in network data. AI Magazine 29 (2008), 93-93.   DOI:10.1609/aimag.v29i3.2157
  34. K. Sun, L. Wang, B. Xu, W. Zhao, S. W. Teng and F. Xia: Network representation learning: From traditional feature learning to deep learning. IEEE Access 8 (2020), 205600-205617.   DOI:10.1109/ACCESS.2020.3037118
  35. R. Tamassia: Handbook of Graph Drawing and Visualization. (First edition. Chapman and Hall/CRC, New York 2013.   DOI:10.1201/b15385
  36. J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan and Q. Mei: Line: Large-scale information network embedding. In: Proceedings of the 24th International Conference on World Wide Web, Florence, Italy, 2015, pp. 1067-1077.   DOI:10.1145/2736277.2741093
  37. P. Velickovic, G. Cucurull, A. Casanova, A. Romero, P. Lio and Y. Bengio: Graph attention networks. Stat. 20 (2017), 10-48550.   CrossRef
  38. O. Vencálek and D. Hlubinka: A depth-based modification of the k-nearest neighbour method. Kybernetika 57 (2021), 15-37.   DOI:10.14736/kyb-2021-1-0015
  39. Y. Xie, P. Jin, M. Gong, C. Zhang and B. Yu: Multi-task network representation learning. Front. Neurosci. 14 (2020), 1.   DOI:10.3389/fnins.2020.00001
  40. M. Xu: Understanding graph embedding methods and their applications. SIAM Rev. 63 (2021), 825-853.   DOI:10.1137/20M1386062
  41. M. J. Zaki and W. Meira: Data Mining and Analysis: Fundamental Concepts and Algorithms. Cambridge University Press, 2014.   CrossRef
  42. Z. Zhang, P. Cui and W. Zhu: Deep learning on graphs: A survey. IEEE Trans. Knowl. Data Eng. 34 (2020), 249-270.   DOI:10.1109/TKDE.2020.2981333
  43. X. M. Zhang, L. Liang, L. Liu and M. J. Tang: Graph neural networks and their current applications in bioinformatics. Front, Genetics 12 (2021), 690049.   DOI:10.3389/fgene.2021.690049