The algorithm of metaPocket 2.0

The metaPocket 2.0 algorithm

There are three steps in the metaPocket 2.0 procedure: calling the base methods, generating meta-pocket sites, and mapping binding residues. The whole procedure is illustrated in Figure 1. In the first step, the given protein structure is sent to eight predictors (LIGSITEcs, PASS, Q-SiteFinder, SURFNET, Fpocket, GHECOM, ConCavity and POCASA) to identify pocket sites on its surface. All predictors are called in parallel to save running time. In the second step, the predicted sites are combined. Because each predictor ranks its pocket sites with its own scoring function, the raw scores cannot be compared directly. To make them comparable, a z-score is calculated for each pocket site separately within each predictor. Only the top three pocket sites from each predictor are then considered further, giving a total of 24 pocket sites. These sites are clustered according to their spatial similarity, and the resulting clusters are ranked by their total z-score. The final meta-pocket sites are the mass centers of the final clusters. The purpose of the third step is to identify functional residues around each identified meta-pocket site, which could be the potential ligand binding sites on the protein surface.
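The second step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the server's implementation: the input layout, the function names, the single-linkage clustering rule, the 8 angstrom distance cutoff, the "higher raw score is better" convention and the unweighted mass center are all hypothetical choices made for the example.

```python
import math
from statistics import mean, pstdev

def z_scores(scores):
    """Convert one predictor's raw scores to z-scores (population std)."""
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]

def top_sites(predictions, top_n=3):
    """predictions: {predictor: [(xyz, raw_score), ...]}.
    Assumes a higher raw score means a better site.
    Returns (predictor, xyz, z) for the top_n sites of each predictor."""
    selected = []
    for name, sites in predictions.items():
        zs = z_scores([score for _, score in sites])
        ranked = sorted(zip(sites, zs), key=lambda pair: pair[1], reverse=True)
        for (xyz, _), z in ranked[:top_n]:
            selected.append((name, xyz, z))
    return selected

def cluster_sites(selected, cutoff=8.0):
    """Single-linkage clustering (an assumed rule): a site joins every
    cluster that has a member within `cutoff` angstrom, merging them."""
    clusters = []
    for site in selected:
        merged, untouched = [site], []
        for cluster in clusters:
            if any(math.dist(site[1], other[1]) <= cutoff for other in cluster):
                merged.extend(cluster)
            else:
                untouched.append(cluster)
        clusters = untouched + [merged]
    return clusters

def rank_metapockets(clusters):
    """Rank clusters by total z-score; report each cluster's mass center."""
    ranked = []
    for cluster in clusters:
        total_z = sum(z for _, _, z in cluster)
        center = tuple(mean(xyz[i] for _, xyz, _ in cluster) for i in range(3))
        members = sorted({name for name, _, _ in cluster})
        ranked.append((total_z, center, members))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked
```

With eight predictors and `top_n=3`, `top_sites` returns the 24 candidate sites mentioned above, and `rank_metapockets(cluster_sites(...))` yields the ranked meta-pocket sites together with the predictors that contributed to each one.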


Figure 1. The illustration of the metaPocket 2.0 procedure.

For a detailed description, please refer to our publication(s):
Bingding Huang (2009), metaPocket: a meta approach to improve protein ligand binding site prediction, Omics, 13(4), 325-330.

The LIGSITEcs algorithm

The given protein structure is projected onto a 3D grid. LIGSITEcs then scans along 7 directions to count the number of surface-solvent-surface events (Figure 2).
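The directional scan can be illustrated with a toy example. The sketch below is a simplification and not the LIGSITEcs code: it treats a set of occupied grid cells as the protein (the real method works on the Connolly surface), it takes the 7 directions to be the three grid axes plus the four cubic diagonals, and the names `enclosure_count`, `pocket_points`, `max_steps` and `min_events` are hypothetical.

```python
from itertools import product

# The 7 scan directions: 3 grid axes plus the 4 cubic diagonals.
DIRECTIONS = [(1, 0, 0), (0, 1, 0), (0, 0, 1),
              (1, 1, 1), (1, 1, -1), (1, -1, 1), (1, -1, -1)]

def hits_protein(protein, start, step, max_steps):
    """Walk from `start` along `step`; True if an occupied cell is reached."""
    x, y, z = start
    for _ in range(max_steps):
        x, y, z = x + step[0], y + step[1], z + step[2]
        if (x, y, z) in protein:
            return True
    return False

def enclosure_count(protein, point, max_steps=10):
    """Number of directions (0 to 7) in which the point is enclosed,
    i.e. an occupied cell is found in both senses of the direction."""
    count = 0
    for d in DIRECTIONS:
        opposite = (-d[0], -d[1], -d[2])
        if (hits_protein(protein, point, d, max_steps)
                and hits_protein(protein, point, opposite, max_steps)):
            count += 1
    return count

def pocket_points(protein, candidates, min_events=5, max_steps=10):
    """Solvent grid points enclosed in at least `min_events` directions.
    The threshold value is an assumption chosen for this example."""
    return [p for p in candidates
            if p not in protein
            and enclosure_count(protein, p, max_steps) >= min_events]
```

For a hollow cubic shell, every interior grid point is enclosed in all 7 directions, while points outside the shell are enclosed in none, so only the interior points are reported as pocket points.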
Reference: Bingding Huang and Michael Schroeder (2006), LIGSITEcsc: predicting protein binding sites using the Connolly surface and degree of conservation, BMC Structural Biology, 6:19.
The LIGSITEcsc server is available here.


Figure 2. The illustration of LIGSITEcs procedure.

The PASS algorithm

In PASS, the protein is coated with layers of probes (Figure 3). The probes are then clustered to find active site points (ASPs).
Reference: Brady G, Stouten P: Fast prediction and visualization of protein binding pockets with PASS. J Comput Aided Mol Des 2000, 14:383-401.
PASS is available here.
In order to use metaPocket, please obtain a license for PASS according to the instructions on the above page.


Figure 3. The illustration of PASS procedure.

The Q-SiteFinder algorithm

In Q-SiteFinder, the protein surface is coated with a layer of methyl (-CH3) probes to calculate van der Waals interaction energies between the protein and probes. Probes with favorable interaction energies are retained and clusters of these probes are ranked based on the number of probes in a cluster. The largest or energetically most favorable cluster is then ranked first and considered as a potential ligand-binding site.
Reference: Laurie A, Jackson R: Q-SiteFinder: an energy-based method for the prediction of protein-ligand binding sites. Bioinformatics 2005, 21:1908-1916.
The Q-SiteFinder server is at http://www.modelling.leeds.ac.uk/qsitefinder/.
In order to use metaPocket, please obtain a license for Q-SiteFinder according to the instructions on the above page.

The SURFNET algorithm

A sphere is placed between two given atoms so that the atoms lie on opposite sides of the sphere's surface. If the sphere contains any other atoms, it is reduced in size until it contains none. Only spheres with a radius of 1 to 4 angstrom are kept (Figure 4). The result of this procedure is a number of separate groups of interpenetrating spheres, called gap regions, both inside the protein and on its surface, which correspond to the protein's cavities and clefts.
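The sphere placement for a single atom pair can be sketched as follows. This is a minimal illustration, not the SURFNET program: atoms are modelled as hard spheres given as `(x, y, z, radius)` tuples, the sphere is kept centered midway between the two atom surfaces while it shrinks, and the function name `gap_sphere` and its parameters are hypothetical.

```python
import math

def gap_sphere(a, b, atoms, r_min=1.0, r_max=4.0):
    """Return (center, radius) of the gap sphere between atoms a and b,
    or None if no sphere with r_min <= radius <= r_max fits.
    Each atom is a tuple (x, y, z, atom_radius)."""
    d = math.dist(a[:3], b[:3])
    gap = d - a[3] - b[3]
    if gap <= 0:
        return None  # the two atoms touch or overlap, no room for a sphere
    # Center midway between the two atom surfaces, on the line a -> b.
    t = (a[3] + gap / 2) / d
    center = tuple(a[i] + t * (b[i] - a[i]) for i in range(3))
    radius = gap / 2
    # Shrink the sphere until it no longer overlaps any other atom.
    for other in atoms:
        if other is a or other is b:
            continue
        allowed = math.dist(center, other[:3]) - other[3]
        radius = min(radius, allowed)
    if radius < r_min or radius > r_max:
        return None
    return center, radius
```

Running this over all atom pairs and collecting the surviving spheres would give the interpenetrating groups described above; grouping the spheres into gap regions is left out of the sketch.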
Reference: Laskowski R: SURFNET: a program for visualizing molecular surfaces, cavities and intermolecular interactions. J Mol Graph 1995, 13:323-330.
SURFNET is available here.
In order to use metaPocket, please obtain a license for SURFNET according to the instructions on the above page.


Figure 4. The illustration of SURFNET procedure.

The Fpocket algorithm

The fpocket core can be summarized in three major steps (Figure 5). In the first step, the whole ensemble of alpha spheres is determined from the protein structure, and Fpocket returns a pre-filtered collection of spheres. The second step identifies clusters of spheres that lie close together in order to identify pockets, and removes clusters of little interest. The final step calculates properties from the atoms of each pocket in order to score it.

Reference: Vincent Le Guilloux, Peter Schmidtke and Pierre Tuffery, "Fpocket: An open source platform for ligand pocket detection", BMC Bioinformatics, 2009, 10:168.
Fpocket is available here.
In order to use metaPocket, please obtain a license for Fpocket according to the instructions on the above page.


Figure 5. The illustration of Fpocket procedure.

The GHECOM algorithm

First, the protein is projected onto a 3D grid with a grid width of 0.8 angstrom. Then 17 large probes with radii of 2.0, 2.5, 3.0, 3.5, ..., 10 angstrom and one small probe S with a radius of 1.87 angstrom are initialized. Finally, the multiscale dilation ( ID(X) ), the multiscale closing, or multiscale molecular volume ( IC(X) ), and the multiscale pocket ( IP(X) ) are calculated. The multiscale pocket regions are the binding sites (Figure 6).
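The idea of finding pockets with morphological operations can be shown on a small 2D grid. The sketch below is a deliberately simplified illustration and not GHECOM's definition of ID(X), IC(X) or IP(X): it works in 2D with radii measured in grid cells, ignores the small probe S, and simply takes the pocket for one large probe to be the closing of the protein by that probe minus the protein itself. All function names are hypothetical.

```python
def probe_radii(start=2.0, stop=10.0, step=0.5):
    """Radii of the large probes: 2.0, 2.5, ..., 10.0 (17 values)."""
    n = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(n)]

def disk(radius):
    """Grid offsets covered by a probe of the given radius (in cells)."""
    r = int(radius)
    return [(dx, dy) for dx in range(-r, r + 1) for dy in range(-r, r + 1)
            if dx * dx + dy * dy <= radius * radius]

def dilate(cells, offsets):
    """Morphological dilation of a set of grid cells."""
    return {(x + dx, y + dy) for (x, y) in cells for (dx, dy) in offsets}

def erode(cells, offsets):
    """Morphological erosion of a set of grid cells."""
    return {(x, y) for (x, y) in cells
            if all((x + dx, y + dy) in cells for (dx, dy) in offsets)}

def closing(cells, radius):
    """Closing = dilation followed by erosion with the same probe.
    It fills every region that the probe cannot enter."""
    probe = disk(radius)
    return erode(dilate(cells, probe), probe)

def pocket(cells, radius):
    """Simplified pocket: cells filled by the closing that are not protein."""
    return closing(cells, radius) - cells
```

For a U-shaped set of cells, the bottom of the groove is reported as pocket because the probe cannot reach it, while the mouth of the groove is not. Repeating the calculation for each value returned by `probe_radii()` gives a multiscale picture in the spirit of the description above.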

Reference: Kawabata T. (2010) Detection of multi-scale pockets on protein surfaces using mathematical morphology. Proteins, 78, 1195-1211.
GHECOM is available here.
In order to use metaPocket, please obtain a license for GHECOM according to the instructions on the above page.


Figure 6. The illustration of GHECOM procedure.

The ConCavity algorithm

ConCavity proceeds in three conceptual steps (Figure 7): grid creation, pocket extraction, and residue mapping. First, the structural and evolutionary properties of a given protein are used to create a regular 3D grid surrounding the protein in which the score associated with each grid point represents an estimated likelihood that it overlaps a bound ligand atom (A). Second, groups of contiguous, high-scoring grid points are clustered to extract pockets that adhere to given shape and size constraints (B). Finally, every protein residue is scored with an estimate of how likely it is to bind to a ligand based on its proximity to extracted pockets (C).
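The second and third steps can be sketched on a precomputed score grid. This is an illustration under assumptions, not ConCavity's implementation: pockets are taken to be 6-connected components of grid points above a fixed threshold, the only size constraint is a minimum point count, and a residue's score is the highest pocket grid score within a distance cutoff. The function names, parameters and residue labels are hypothetical.

```python
import math
from collections import deque

# Face-sharing neighbors of a grid point (6-connectivity).
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def extract_pockets(grid_scores, threshold, min_points=2):
    """grid_scores: {(i, j, k): score}. Returns a list of pockets, each a
    set of contiguous grid points whose score is at least `threshold`."""
    high = {p for p, s in grid_scores.items() if s >= threshold}
    seen, pockets = set(), []
    for start in sorted(high):
        if start in seen:
            continue
        seen.add(start)
        pocket, queue = set(), deque([start])
        while queue:  # breadth-first flood fill over high-scoring points
            i, j, k = queue.popleft()
            pocket.add((i, j, k))
            for di, dj, dk in NEIGHBORS:
                n = (i + di, j + dj, k + dk)
                if n in high and n not in seen:
                    seen.add(n)
                    queue.append(n)
        if len(pocket) >= min_points:  # assumed size constraint
            pockets.append(pocket)
    return pockets

def score_residues(residues, pockets, grid_scores, cutoff=1.5):
    """residues: {label: (x, y, z)} in grid units. A residue's score is the
    best grid score among pocket points within `cutoff` (assumed rule)."""
    scores = {}
    for label, xyz in residues.items():
        nearby = [grid_scores[p] for pocket in pockets for p in pocket
                  if math.dist(xyz, p) <= cutoff]
        scores[label] = max(nearby, default=0.0)
    return scores
```

Residues far from every extracted pocket receive a score of 0.0 in this sketch, so sorting the returned dictionary by value gives a simple ranking of candidate binding residues.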

Reference: Capra JA, Laskowski RA, Thornton JM, Singh M, and Funkhouser TA (2009) Predicting Protein Ligand Binding Sites by Combining Evolutionary Sequence Conservation and 3D Structure. PLoS Comput Biol, 5(12).
ConCavity is available here.
In order to use metaPocket, please obtain a license for ConCavity according to the instructions on the above page.


Figure 7. The illustration of ConCavity procedure.

The POCASA algorithm

POCASA uses probe spheres to identify a probe surface layer on the protein (Figure 8). The regions between the protein surface and the probe surface are the pocket sites. By changing the size of the probe sphere, pocket sites of different sizes can be detected. POCASA is also a grid-based method: protein atoms are used to fill a 3D grid system, with a different radius used for different atoms. The grid system is divided into a set of slices of equal size, and a probe scans from the origin through the slices. Once it encounters a protein grid point, it rolls over the protein surface to build the probe surface outside the protein surface, as described above.

Reference: Yu J, Zhou Y, Tanaka I, Yao M (2010) Roll: a new algorithm for the detection of protein pockets and cavities with a rolling probe sphere. Bioinformatics 26: 46-52.
POCASA is available here.
In order to use metaPocket, please obtain a license for POCASA according to the instructions on the above page.


Figure 8. The illustration of POCASA procedure.





We thank the authors of PASS, Q-SiteFinder, SURFNET, Fpocket, GHECOM, ConCavity and POCASA for making their tools public. If you have your own tool for identifying ligand binding sites and would like it to be included in the metaPocket server, please contact Dr. Bingding Huang.




If this server is useful for your work, please cite:

Zengming Zhang, Yu Li, Biaoyang Lin, Michael Schroeder and Bingding Huang (2011), Identification of cavities on protein surface using multiple computational approaches for drug binding site prediction. Bioinformatics, 27 (15): 2083-2088.

Bingding Huang (2009), metaPocket: a meta approach to improve protein ligand binding site prediction, Omics, 13(4), 325-330.

Contact us

Bingding Huang, Email: bhuang@biotec.tu-dresden.de
Zengming Zhang, Email: zmzhang@mail.systemsbiozju.org

Report Bugs

If you find any bugs in this server, please help us improve metaPocket by reporting them to Zengming Zhang. Any help will be greatly appreciated!

Acknowledgement

Funding from the Klaus Tschira Foundation, MOST China (grant no: 2008DFA11320) and the EU 7th Framework Marie Curie Actions IRSES project (grant no: 247097) is gratefully acknowledged!