Uncovering Latent Biological Function Associations through Gene Set Embeddings
- Posted
- Server: bioRxiv
- DOI: 10.1101/2024.10.10.617577
The complexity of biological systems has increasingly been unraveled through computational methods, with biological network analysis now focusing on the construction and exploration of well-defined interaction networks. Traditional graph-theoretical approaches have been instrumental in mapping key biological processes using high-confidence interaction data. However, these methods often struggle with incomplete and/or heterogeneous datasets. In this study, we extend beyond conventional bipartite models by integrating attribute-driven knowledge from the Molecular Signatures Database (MSigDB) using the node2vec algorithm. Our approach explores biological relationships in an unsupervised manner and uncovers potential associations between genes and biological terms through network connectivity analysis. By embedding both human and mouse data into a shared vector space, we validate our findings across species, which further supports the robustness of our method. This integrative framework reveals both expected and novel biological insights, offering a comprehensive perspective that complements traditional biological network analysis and paves the way for a deeper understanding of complex biological processes and diseases.
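To make the idea of running node2vec over a gene-term network concrete, the following is a minimal sketch, not the authors' pipeline. It builds a bipartite graph from a few toy gene sets and generates second-order biased random walks in the style of node2vec. The gene sets, the term names, the function names, and the parameter values (`p`, `q`, `num_walks`, `walk_length`) are all illustrative assumptions. In a full node2vec workflow, walks like these would then be passed to a skip-gram model to obtain the embeddings; that step is omitted here.

```python
import random
from collections import defaultdict

# Toy gene sets (hypothetical stand-ins for MSigDB-style term -> genes mappings).
GENE_SETS = {
    "TERM_APOPTOSIS": ["TP53", "BAX", "CASP3"],
    "TERM_CELL_CYCLE": ["TP53", "CDK1", "CCNB1"],
    "TERM_DNA_REPAIR": ["BRCA1", "TP53"],
}


def build_bipartite_graph(gene_sets):
    """Return an undirected adjacency map linking each term to its genes."""
    adj = defaultdict(set)
    for term, genes in gene_sets.items():
        for gene in genes:
            adj[term].add(gene)
            adj[gene].add(term)
    return adj


def node2vec_walk(adj, start, length, p=1.0, q=1.0, rng=random):
    """Generate one biased walk using return parameter p and in-out parameter q."""
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = sorted(adj[cur])  # sorted for reproducibility
        if not neighbors:
            break
        if len(walk) == 1:
            # The first step has no previous node, so sample uniformly.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)  # return to the previous node
            elif nxt in adj[prev]:
                # Candidate is also adjacent to prev; this never happens in a
                # strictly bipartite graph, but is kept for generality.
                weights.append(1.0)
            else:
                weights.append(1.0 / q)  # move further away from prev
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


def generate_walks(adj, num_walks=5, walk_length=8, p=1.0, q=1.0, seed=0):
    """Start num_walks walks from every node, visiting nodes in shuffled order."""
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        nodes = sorted(adj)
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(node2vec_walk(adj, node, walk_length, p, q, rng))
    return walks


graph = build_bipartite_graph(GENE_SETS)
walks = generate_walks(graph)
```

Because genes and terms live in the same graph, each walk alternates between the two node types, so a downstream skip-gram model would place genes and terms in one shared vector space. Lower values of `q` push walks outward toward more distant nodes, while lower values of `p` keep them close to where they started.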