
NOXclass Help

Introduction
Interface Properties
Classification Method
Input
Output
Notes
References

Introduction

NOXclass is a classifier that identifies protein-protein interaction types (biological obligate, biological non-obligate, and crystal packing). It is implemented using a support vector machine (SVM) algorithm [1].

Figure 1: Protein-protein Interaction (PDB: 1LFD chain A&B)

Interface Properties

To discriminate the three types of interactions, we utilize the following interface properties.


Classification Method

We employed a support vector machine [7,8] to classify the three types of interactions. In general, an SVM is a supervised learning algorithm for binary classification of data. For more than two classes of data, multi-class techniques are required. These techniques include the "one-against-one" and "one-against-all" approaches [9]. In these approaches, several binary SVM classifiers are constructed and the appropriate class is determined using a majority voting scheme. An alternative approach is a multi-stage classifier that separates the data progressively: the classification is performed in several stages, and in each stage one class of data is separated from the rest.
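The one-against-one voting scheme described above can be sketched as follows. This is only an illustration, not the NOXclass implementation: the function name `one_against_one`, the feature key `interface_area`, and the threshold rules standing in for trained binary SVMs are all hypothetical.

```python
from collections import Counter
from itertools import combinations

CLASSES = ["obligate", "non-obligate", "crystal packing"]


def one_against_one(features, binary_classifiers):
    """Return the class that wins the most pairwise votes.

    binary_classifiers maps each pair (class_a, class_b) to a callable
    that takes the features and returns the name of the winning class.
    """
    votes = Counter()
    for pair in combinations(CLASSES, 2):
        winner = binary_classifiers[pair](features)
        votes[winner] += 1
    # Ties are broken by the order of CLASSES (an arbitrary choice here).
    return max(CLASSES, key=lambda c: votes[c])


# Hypothetical stand-ins for trained binary SVMs. The thresholds are
# made up for illustration and carry no biological meaning.
TOY_CLASSIFIERS = {
    ("obligate", "non-obligate"):
        lambda f: "obligate" if f["interface_area"] > 1000 else "non-obligate",
    ("obligate", "crystal packing"):
        lambda f: "obligate" if f["interface_area"] > 500 else "crystal packing",
    ("non-obligate", "crystal packing"):
        lambda f: "non-obligate" if f["interface_area"] > 500 else "crystal packing",
}

if __name__ == "__main__":
    print(one_against_one({"interface_area": 1500.0}, TOY_CLASSIFIERS))
```

With three classes, three pairwise classifiers are consulted, and a class needs two votes to win outright.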

Schematic plots of the multi-class SVM and the multi-stage SVM are given in Figure 2 and Figure 3, respectively.

Figure 2: Multi-Class SVM

When using multi-class SVM, NOXclass calculates three posterior probability values for the query interaction. Each of them is the probability that the query interaction belongs to one of the three interaction types (obligate, non-obligate, and crystal packing). See section Output for an example.

Figure 3: Multi-Stage SVM

When using multi-stage SVM, NOXclass calculates two posterior probability values in each stage for the query interaction. In the first stage, the two values are the probabilities that the query interaction belongs to biological interaction and crystal packing contact. In the second stage, they are the probabilities that the query interaction belongs to the two biological interaction types (obligate and non-obligate). See section Output for an example.
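The two-stage decision can be illustrated with a short sketch. The function `multi_stage_predict` and its arguments are hypothetical, and the decision rule (pick the larger posterior in stage one, and consult stage two only if the interaction looks biological) is an assumption made for illustration. The probabilities would come from the trained SVMs; here they are passed in directly.

```python
def multi_stage_predict(p_biological, p_obligate):
    """Combine the posteriors of a two-stage classifier into one label.

    p_biological: stage-1 posterior that the interaction is biological
                  (crystal packing gets the complement).
    p_obligate:   stage-2 posterior that a biological interaction is
                  obligate (non-obligate gets the complement).
    Returns (label, stage1_probabilities, stage2_probabilities).
    """
    stage1 = {
        "biological": p_biological,
        "crystal packing": 1.0 - p_biological,
    }
    stage2 = {
        "obligate": p_obligate,
        "non-obligate": 1.0 - p_obligate,
    }
    # Assumed rule: stage 2 only decides the label when stage 1 does not
    # favour crystal packing (an exact tie falls through to stage 2).
    if stage1["crystal packing"] > stage1["biological"]:
        return "crystal packing", stage1, stage2
    label = max(stage2, key=stage2.get)
    return label, stage1, stage2


if __name__ == "__main__":
    label, stage1, stage2 = multi_stage_predict(0.9, 0.3)
    print(label, stage1, stage2)
```

Unlike the multi-class output, the two pairs of values each sum to one separately, so the stage-two values should be read as conditional on the interaction being biological.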

NOXclass uses LIBSVM [10] as its SVM package.

Input

Users should provide NOXclass with two protein protomers, that is, the two polypeptide chains of the protein complex. These can be specified either as a PDB identifier with two chain names, or as two PDB files containing the structure models of the two protomers.
Users can choose the interface properties and the classification method used in the prediction. In our statistical study, the best performing classifier is the one that uses three interface properties (interface area, interface area ratio, and area-based amino acid composition of the protein-protein interface) together with the multi-stage SVM method. This is the configuration presented on the main page of NOXclass.
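As a rough illustration of the first two properties, the sketch below computes an interface area and an interface area ratio from solvent-accessible surface areas (ASA). The formulas are an assumption based on one common convention (half of the surface buried on complex formation, divided by the surface area of the smaller protomer). The exact definitions used by NOXclass are given in [1] and are not restated on this page. The function names and input values are hypothetical.

```python
def interface_area(asa_a, asa_b, asa_complex):
    """Half of the accessible surface buried when A and B form a complex.

    All areas are in square angstroms. Assumed convention, see lead-in.
    """
    return (asa_a + asa_b - asa_complex) / 2.0


def interface_area_ratio(asa_a, asa_b, asa_complex):
    """Interface area relative to the surface of the smaller protomer.

    Assumed convention, see lead-in.
    """
    return interface_area(asa_a, asa_b, asa_complex) / min(asa_a, asa_b)


if __name__ == "__main__":
    # Made-up ASA values for two protomers and their complex.
    print(interface_area(6000.0, 4000.0, 8400.0))
    print(interface_area_ratio(6000.0, 4000.0, 8400.0))
```

In practice the ASA values would be computed by a surface program such as NACCESS [2], once for each protomer alone and once for the complex.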

Output

1. Sample prediction results using three interface properties and multi-class SVM method.

Figure 4: Sample prediction output using the multi-class SVM

2. Sample prediction results using three interface properties and multi-stage SVM method.

Figure 5: Sample prediction output using the multi-stage SVM

Notes

  1. An example of a protein complex is given in Figure 1;
  2. For NMR structures, please upload only one model;
  3. To upload your own structure files, only the two PDB files for the two protomers are needed. The concatenation of these two files will be taken as the structure file for the complex.
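The concatenation mentioned in note 3 might look like the sketch below. How NOXclass itself handles record terminators is not documented here, so dropping the END record of each input and appending a single END to the result is an assumption. The function name `concatenate_protomers` is hypothetical.

```python
def concatenate_protomers(pdb_text_a, pdb_text_b):
    """Join the text of two PDB files into one structure for the complex.

    Assumption: END records of the inputs are dropped and one END record
    is appended, so the result contains a single terminator.
    """
    lines = []
    for text in (pdb_text_a, pdb_text_b):
        for line in text.splitlines():
            # END may be padded with spaces; ENDMDL is left untouched.
            if line.strip() == "END":
                continue
            lines.append(line)
    lines.append("END")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    protomer_a = "ATOM      1  N   MET A   1\nEND\n"
    protomer_b = "ATOM      1  N   SER B   1\nEND\n"
    print(concatenate_protomers(protomer_a, protomer_b), end="")
```

The two protomers should carry distinct chain identifiers so that they remain distinguishable in the combined file.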

References

[1] Zhu H, Domingues FS, Sommer I, Lengauer T. NOXclass: prediction of protein-protein interaction types. BMC Bioinformatics 2006, 7:27
[2] Hubbard S, Thornton J. 'NACCESS', Computer Program, Department of Biochemistry and Molecular Biology, University College London. 1993.
[3] Ofran Y, Rost B. Analysing six types of protein-protein interfaces. J Mol Biol. 2003 Jan 10;325(2):377-87.
[4] Laskowski RA. SURFNET: a program for visualizing molecular surfaces, cavities, and intermolecular interactions. J Mol Graph. 1995 Oct;13(5):323-30, 307-8.
[5] Bahadur RP, Chakrabarti P, Rodier F, Janin J. A dissection of specific and non-specific protein-protein interfaces. J Mol Biol. 2004 Feb 27;336(4):943-55.
[6] Armon A, Graur D, Ben-Tal N. ConSurf: an algorithmic tool for the identification of functional regions in proteins by surface mapping of phylogenetic information. J Mol Biol. 2001 Mar 16;307(1):447-63.
[7] Vapnik V. The nature of statistical learning theory. New York: Springer 1995.
[8] Vapnik V. Statistical Learning Theory. New York: Wiley 1998.
[9] Hsu C, Lin C. A comparison of methods for multi-class support vector machines. IEEE Transactions on Neural Networks, 2002, 13(2):415-425.
[10] Chang CC, Lin CJ. LIBSVM -- A Library for Support Vector Machines. http://www.csie.ntu.edu.tw/~cjlin/libsvm/.

If you have any problems or suggestions, please contact Zhu, Hongbo <hzhu@mpi-sb.mpg.de>