Standalone Package Download
1. Download
You can obtain the KGIPA standalone package either by downloading the ZIP file or cloning the GitHub repository: https://github.com/ShutaoChen97/KGIPA
To clone the repository, run the following command:
git clone git@github.com:ShutaoChen97/KGIPA.git
cd KGIPA/
2. Installation
2.1 Create Conda Environment
conda create -n kgipa python=3.10
conda activate kgipa
2.2 Requirements
We recommend installing the environment using the provided environment.yaml file to ensure compatibility:
conda env update -f environment.yaml --prune
If this approach fails or Conda is not available, you can manually install the main dependencies as listed below:
- python 3.10
- biopython 1.84
- huggingface-hub 0.26.1
- numpy 2.1.2
- transformers 4.46.0
- tokenizers 0.20.1
- sentencepiece 0.2.0
- torch 2.5.0+cpu
- torchaudio 2.5.0+cpu
- torchvision 0.20.0+cpu
- torch-geometric 2.6.1
- shap 0.48.0
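After a manual install, it can help to compare what is actually present in the environment with the pinned versions listed above. The following is a minimal sketch that uses only the standard library; the helper names `strip_local` and `check_pins` are illustrative and not part of KGIPA, and the local build tag (such as `+cpu`) is ignored when comparing.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the dependency list above (PyPI distribution names).
PINNED = {
    "biopython": "1.84",
    "huggingface-hub": "0.26.1",
    "numpy": "2.1.2",
    "transformers": "4.46.0",
    "tokenizers": "0.20.1",
    "sentencepiece": "0.2.0",
    "torch": "2.5.0",
    "torchaudio": "2.5.0",
    "torchvision": "0.20.0",
    "torch-geometric": "2.6.1",
    "shap": "0.48.0",
}


def strip_local(ver):
    """Drop a local build tag such as '+cpu' or '+cu118'."""
    return ver.split("+", 1)[0]


def check_pins(pinned=PINNED):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed is None or strip_local(installed) != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_pins().items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result from `check_pins()` means every listed package matches its pin. A mismatch is not necessarily an error, but it is a reasonable first thing to check if the predictor fails to start.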
Note: If you have an available GPU, you can install the accelerated version of KGIPA using the corresponding CUDA toolkit. Change the URL below to reflect your version of the CUDA toolkit (cu118 for CUDA 11.6/11.8, cu121 for CUDA 12.1). Do not provide a number greater than your installed CUDA toolkit version. For more information on other CUDA versions, see the PyTorch installation documentation.
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118
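To confirm which PyTorch build ended up in the environment after running the command above, a short check such as the following can be used. This is a sketch, not part of KGIPA: the function name `describe_torch_backend` is made up for illustration, and it relies only on `torch.__version__`, `torch.version.cuda`, and the `torch.cuda` query functions.

```python
def describe_torch_backend():
    """Summarise the installed PyTorch build in one line."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"

    cuda_build = torch.version.cuda  # None when the wheel has no CUDA support
    if cuda_build is None:
        return f"torch {torch.__version__}: build without CUDA support"
    if torch.cuda.is_available():
        device = torch.cuda.get_device_name(0)
        return f"torch {torch.__version__}: CUDA {cuda_build}, GPU {device}"
    return (
        f"torch {torch.__version__}: built for CUDA {cuda_build}, "
        "but no usable GPU was detected"
    )


print(describe_torch_backend())
```

If the last message appears, the wheel was built for CUDA but the driver or GPU is not usable; in that case the CPU wheels from the dependency list are the safer choice.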
2.3 Tools
KGIPA relies on the following feature extraction tools and databases. For details on installation and usage, please refer to the KGIPA GitHub repository.
- SCRATCH-1D 1.2
- IUPred2A
- ncbi-blast 2.13.0
- ProtT5
- trRosetta
Databases and model:
| Database / Model | Description | Download |
|---|---|---|
| nrdb90 | NCBI BLAST sequence database | Download |
| uniclust30_2018_08 | HHsuite sequence database | Download |
| model_res2net_202108 | Pre-trained network models of trRosetta | Download |
2.4 Install KGIPA
Finally, configure the default paths of the tools and databases in conf.py.
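Before running the predictor, it may be worth verifying that every path you entered actually exists. The sketch below assumes nothing about the variable names inside conf.py: the labels and locations in `configured` are hypothetical placeholders, and `find_missing_paths` is an illustrative helper, so copy in the values you set yourself.

```python
from pathlib import Path


def find_missing_paths(paths):
    """Return the entries of `paths` (label -> path) that do not exist on disk."""
    return {
        label: location
        for label, location in paths.items()
        if not Path(location).expanduser().exists()
    }


if __name__ == "__main__":
    # Hypothetical labels and locations; replace them with your conf.py values.
    configured = {
        "nrdb90": "/path/to/nrdb90",
        "uniclust30_2018_08": "/path/to/uniclust30_2018_08",
        "model_res2net_202108": "/path/to/model_res2net_202108",
    }
    for label, location in find_missing_paths(configured).items():
        print(f"missing: {label} -> {location}")
```

Note that BLAST databases are usually referenced by a file prefix rather than a directory, so for such entries you may need to check the containing directory instead.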
3. Usage
To predict peptide-protein binary interaction and peptide-protein-specific binding residues, follow these steps:
- Replace the default peptide sequence in example/Peptide_Seq.fasta and the protein sequence in example/Protein_Seq.fasta with your own sequences (FASTA format).
- Run the predictor:
conda activate kgipa
python run_predictor.py -uip example
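The two steps above can also be scripted. The sketch below writes the two FASTA files into an input directory and returns the predictor command from this section as an argument list. The helper names, the header IDs (`peptide`, `protein`), and the restriction to the 20 standard amino acids are assumptions made for illustration; KGIPA itself may accept or require something different.

```python
import sys
from pathlib import Path

# Assumption: only the 20 standard amino acids are accepted.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def write_fasta(path, seq_id, sequence):
    """Validate a sequence and write it as a single-record FASTA file."""
    sequence = "".join(sequence.split()).upper()  # remove whitespace, normalise case
    invalid = sorted(set(sequence) - AMINO_ACIDS)
    if not sequence or invalid:
        raise ValueError(f"invalid sequence for {seq_id}: {invalid or 'empty'}")
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(f">{seq_id}\n{sequence}\n")
    return path


def prepare_inputs(input_dir, peptide, protein):
    """Write both input files and return the predictor command as a list."""
    input_dir = Path(input_dir)
    write_fasta(input_dir / "Peptide_Seq.fasta", "peptide", peptide)
    write_fasta(input_dir / "Protein_Seq.fasta", "protein", protein)
    return [sys.executable, "run_predictor.py", "-uip", str(input_dir)]
```

From the repository root, with the kgipa environment active, `cmd = prepare_inputs("example", my_peptide, my_protein)` followed by `subprocess.run(cmd, check=True)` would then launch the prediction, where `my_peptide` and `my_protein` are your own sequence strings.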
If you want to retrain KGIPA on your own dataset, the original KGIPA model is defined in model.py.
KGIPA is implemented in PyTorch, so the model can be imported from model.py and instantiated directly in your own training code.
4. Problem Feedback
If you have questions on how to use KGIPA, feel free to raise them in the discussions section. If you identify any potential bugs, please report them in the issue tracker. In addition, if you have any further questions about KGIPA, you can contact us directly at stchen@bliulab.net.