Identifying Protein Complexes in Protein-Protein Interaction Data Using Graph Convolutional Network

Nazar Zaki, Harsh Singh, Elfadil A. Mohamed

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Protein complexes are groups of two or more polypeptide chains that bind to form noncovalent networks of protein interactions. Over the past decade, researchers have developed a number of computational methods for identifying protein complexes and their members from these interaction networks. Although most existing methods identify functional protein complexes from protein-protein interaction (PPI) networks reasonably well, the applicability of advanced graph network methods has not yet been adequately investigated. This paper proposes several graph convolutional network (GCN) methods to improve the detection of protein complexes. We first formulate protein complex detection as a node classification problem. We then develop a Neural Overlapping Community Detection (NOCD) model that clusters the nodes (proteins) using a complex affiliation matrix. We also use a representation learning approach that combines a multi-class GCN feature extractor (to obtain the nodes' features) with a mean shift clustering algorithm (to perform the clustering). To improve the efficiency of the multi-class GCN, we convert dense-dense matrix operations into dense-sparse or sparse-sparse matrix operations, which reduces space and time complexity and significantly improves the scalability of the existing GCN. Finally, we apply clustering aggregation to find the best protein complexes. A grid search is then performed over the complexes detected by three well-known protein complex detection methods, namely ClusterONE, CMC, and PEWCC, with the help of the Meta-Clustering Algorithm (MCLA) and the Hybrid Bipartite Graph Formulation (HBGF). We test the proposed GCN-based methods on several publicly available datasets and find that they perform significantly better than previous state-of-the-art methods. The code and data are freely available at https://github.com/Analystharsh/GCN_complex_detection.
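
The abstract's point about replacing dense-dense products with dense-sparse ones can be illustrated with a small sketch. This is not the authors' implementation (their code is in the repository linked above and may differ substantially). It assumes the standard GCN propagation rule ReLU(Â X W), with Â the symmetrically normalized adjacency matrix with self-loops, and it stores Â as an adjacency list so that aggregation only touches existing edges. All function names, the toy graph, and the weights are hypothetical, and pure Python is used only to keep the example self-contained.

```python
from math import sqrt


def normalized_sparse_adjacency(n, edges):
    """Build D^-1/2 (A + I) D^-1/2 as {row: [(col, weight), ...]}.

    Only nonzero entries are stored, so memory grows with the number
    of interactions rather than with n * n.
    """
    neighbors = {i: {i} for i in range(n)}  # self-loops (the "+ I" term)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    degree = {i: len(neighbors[i]) for i in range(n)}
    return {
        i: [(j, 1.0 / sqrt(degree[i] * degree[j])) for j in sorted(neighbors[i])]
        for i in range(n)
    }


def sparse_dense_matmul(sparse_rows, dense):
    """Multiply a sparse matrix (adjacency-list form) by a dense matrix.

    The work is proportional to (stored entries) * (feature width),
    instead of n * n * (feature width) for a dense-dense product.
    """
    width = len(dense[0])
    out = []
    for i in range(len(sparse_rows)):
        row = [0.0] * width
        for j, w in sparse_rows[i]:
            for k in range(width):
                row[k] += w * dense[j][k]
        out.append(row)
    return out


def dense_matmul(a, b):
    """Plain dense product, used for the small feature-by-weight step."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(sparse_adj, features, weights):
    """One propagation step: ReLU(A_hat @ X @ W)."""
    aggregated = sparse_dense_matmul(sparse_adj, features)
    projected = dense_matmul(aggregated, weights)
    return [[max(0.0, x) for x in row] for row in projected]


# Hypothetical toy PPI graph: proteins 0-2 form a triangle, 3-4 a pair.
edges = [(0, 1), (1, 2), (0, 2), (3, 4)]
num_proteins = 5

# One-hot node features and an arbitrary 5x2 weight matrix (illustrative only).
features = [[1.0 if i == j else 0.0 for j in range(num_proteins)]
            for i in range(num_proteins)]
weights = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.6], [0.2, 0.2], [0.7, -0.1]]

adj = normalized_sparse_adjacency(num_proteins, edges)
hidden = gcn_layer(adj, features, weights)
```

The rows of `hidden` are the kind of node embeddings that a clustering step such as mean shift could then group into candidate complexes. In practice one would likely use a sparse-matrix or sparse-tensor library rather than Python lists.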

Original language: English
Pages (from-to): 123717-123726
Number of pages: 10
Journal: IEEE Access
Volume: 9
Publication status: Published - 2021

Keywords

  • Protein complex detection
  • graph convolutional network (GCN)
  • hybrid bipartite graph formulation (HBGF)
  • meta-clustering algorithm (MCLA)
  • neural overlapping community detection (NOCD)
  • protein-protein interaction (PPI)

ASJC Scopus subject areas

  • Computer Science(all)
  • Materials Science(all)
  • Engineering(all)
