Kybernetika 61 no. 6, 789-816, 2025

Honeycomb graphs for parametric identification of correlation classes in multidimensional datasets

Adam Dudáš and Tomáš Peregrín

DOI: 10.14736/kyb-2025-6-0789

Abstract:

In the process of gaining knowledge from large sets of data, correlation analysis, one of the most significant methods of descriptive statistics, is applied to determine direct functional relationships between pairs of attributes. Even though the results of correlation analysis are expressed as a crisp correlation coefficient with values in the interval $[-1,1]$, human interpretation of these values is conventionally vague and relies on linguistic classes of correlation to describe the strength of the relationship between a pair of attributes. However, this interpretative vagueness, and the correlation classes themselves, are not commonly employed in decision-making processes. Therefore, this work focuses on the design and implementation of so-called Honeycomb Graphs, a visualization method based on graphical models for parametric identification of correlation classes in multidimensional datasets. After implementing the proposed visualization technique, two case studies on benchmark datasets are conducted, and the model is evaluated from both qualitative and quantitative points of view. The results of these studies show that the method supports interactive exploration of correlation analysis while adhering to qualitative and quantitative standards of scientific visualization, and that it has high potential for use in feature selection tasks, making it a valuable tool for predictive analysis and data exploration.
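
The idea of mapping a crisp coefficient from $[-1,1]$ to a linguistic correlation class can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not draw a Honeycomb Graph. The class labels, the cut-off values in `DEFAULT_CLASSES`, and the function names `pearson`, `correlation_class` and `classify_pairs` are assumptions introduced here for illustration. The cut-offs are passed as a parameter, which loosely mirrors the parametric identification of classes described above.

```python
from itertools import combinations
from math import sqrt

# Hypothetical class boundaries on |r|, listed from strongest to weakest.
# These cut-offs are an assumption for illustration, not values from the paper.
DEFAULT_CLASSES = [
    (0.7, "strong"),
    (0.4, "moderate"),
    (0.1, "weak"),
    (0.0, "negligible"),
]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def correlation_class(r, classes=DEFAULT_CLASSES):
    """Return the label of the first class whose lower bound |r| reaches."""
    strength = abs(r)
    for lower_bound, label in classes:
        if strength >= lower_bound:
            return label
    return classes[-1][1]


def classify_pairs(data, classes=DEFAULT_CLASSES):
    """Map every attribute pair to (rounded coefficient, class label).

    `data` is a dict of attribute name -> list of numeric values.
    """
    result = {}
    for a, b in combinations(data, 2):
        r = pearson(data[a], data[b])
        result[(a, b)] = (round(r, 3), correlation_class(r, classes))
    return result
```

Calling `classify_pairs` on a dictionary of columns returns one entry per attribute pair. Supplying a different `classes` list changes how the same coefficients are labelled, without recomputing the correlations.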

Keywords:

visualization, big data analysis, correlation analysis, correlation classes

Classification:

68P99, 62H20
