Introduction
Understanding peptide-protein interactions is vital for decoding cellular signaling and developing targeted therapies. However, the complexity of multi-molecular associations and the diversity of non-covalent interactions make accurate prediction and site-specific annotation challenging. Here, we propose KGIPA, a knowledge-guided pragmatic analysis framework that brings pragmatic concepts from natural language into the life sciences to capture the influence of biological environments on non-covalent interactions. KGIPA integrates intra- and extra-linguistic contextual information to combine multimodal single-molecule features and build residue-level interaction maps. It also uses biological prior knowledge to coordinate the various non-covalent interaction types. Benchmark tests demonstrate that KGIPA outperforms state-of-the-art methods in evaluating molecular binding, including the prediction of protein and peptide binding residues and of residue-pair interactions. Furthermore, KGIPA performs strongly in peptide-protein binding affinity prediction and peptide virtual screening. Wet-lab experiments validate its reliability, revealing high consistency between predicted and experimentally measured binding behaviors. These results highlight KGIPA’s potential to accelerate peptide drug discovery and establish pragmatic analysis as an effective paradigm for decoding the molecular language of interactions.
Figure 1. The model architecture of KGIPA. KGIPA is a neural network model designed for pragmatic analysis of biological sequences. It consists of two main parts: intra-linguistic and extra-linguistic contextual representation.
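To make the notion of a residue-level interaction map concrete, the following minimal sketch scores every peptide-protein residue pair and then reads per-residue binding scores off the resulting map. It is an illustration only and not the KGIPA architecture: the function names `interaction_map` and `binding_residue_scores`, the dot-product-plus-sigmoid scoring, the max-pooling step, and the toy feature vectors are all assumptions introduced here.

```python
import math


def interaction_map(pep_feats, prot_feats):
    """Return an m x n map of pairwise scores in (0, 1).

    pep_feats:  m per-residue feature vectors for the peptide.
    prot_feats: n per-residue feature vectors for the protein.
    Scoring rule (an assumption for illustration): sigmoid of the dot product.
    """
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    return [
        [sigmoid(sum(a * b for a, b in zip(p, q))) for q in prot_feats]
        for p in pep_feats
    ]


def binding_residue_scores(pair_map):
    """Max-pool the pair map along each axis to get per-residue scores."""
    pep_scores = [max(row) for row in pair_map]          # one per peptide residue
    prot_scores = [max(col) for col in zip(*pair_map)]   # one per protein residue
    return pep_scores, prot_scores


# Toy input: a 2-residue peptide and a 3-residue protein with 2-d features.
pep = [[1.0, 0.0], [0.0, -1.0]]
prot = [[2.0, 0.0], [0.0, 0.0], [0.0, 3.0]]

pair_map = interaction_map(pep, prot)
pep_scores, prot_scores = binding_residue_scores(pair_map)
```

In this toy setup, a single pairwise map yields both residue-pair scores and per-molecule binding-residue scores, which mirrors the three evaluation targets named in the text (protein binding residues, peptide binding residues, and residue-pair interactions) without claiming how KGIPA itself derives them.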