The original publication of hPRINT:
Large-scale de novo prediction of physical protein-protein association.
Elefsinioti A, Sarac OS, Hegele A, Plake C, Hubner NC, Poser I, Sarov M, Hyman A, Mann M, Schroeder M, Stelzl U, Beyer A. Mol Cell Proteomics. 2011
The search panel on the home page of hPRINT allows the user to enter a set of genes. After pressing 'Send', the database returns all interactions of the query genes with any other genes in the database. This default behavior can be changed by clicking on 'Advanced search'. The query form accepts gene symbols as well as Ensembl and Entrez gene IDs. Gene names can be separated by any whitespace or by commas, and they can be listed on one or many lines.
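As an illustration of the accepted input format, the following minimal sketch splits a query string on whitespace and commas. It is not the code used by hPRINT; the function name `parse_gene_query` is hypothetical.

```python
import re


def parse_gene_query(text):
    """Split a free-text query into identifiers.

    Identifiers may be separated by commas and/or any whitespace,
    including line breaks. Hypothetical helper, not part of hPRINT.
    """
    tokens = re.split(r"[\s,]+", text.strip())
    # Drop empty strings left by leading separators or empty input.
    return [token for token in tokens if token]


if __name__ == "__main__":
    query = "TP53, BRCA1\nENSG00000141510\t672"
    print(parse_gene_query(query))
```

The sketch does not check whether a token is a gene symbol, an Ensembl ID or an Entrez ID; it only shows how mixed separators can be handled.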
Pressing "Advanced search" opens a more detailed query form with the following options:
- Physical interactions: show only interactions that have experimental evidence.
- Score Cutoffs: set a cutoff for up to four types of evidence so that only interactions with higher values appear.
- Combine Cutoffs: cutoffs can be combined with an AND or an OR operator, i.e. all criteria or any of the criteria has to be fulfilled, respectively.
- Combine gene identifiers: if the operator is set to AND, only interactions among the query genes will be returned. Otherwise, interactions involving at least one of the query genes will be returned.
- Adjust rows: define how many rows should be shown per result page.
- On the top left of the results page the user can choose between the "extended" and the "condensed" table form. The extended version displays all evidence columns, while the condensed version displays only the summary scores that integrate the various types of evidence, i.e. the RF functional, RF physical and Bayesian scores.
- On the top left of the results page there is another button that allows the user to download the results table. There are two options: download either the table that is currently displayed or the whole result set. The download file is a tab-separated text file, which can be imported into many other programs such as spreadsheet programs or network analysis software such as Cytoscape.
- Clicking on the interaction ID will show more details about the genes participating in that interaction.
- Several interactions can be selected using the check boxes on the left. Clicking 'Detailed View' (at the bottom of the page) will show details for all selected interactions.
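The AND/OR options of the advanced search can be illustrated with a small sketch. It is only an approximation of the described behavior, not hPRINT's implementation: the field names `gene_a`, `gene_b`, `rf_physical` and `bayesian` are hypothetical, and a score is assumed to pass only if it is strictly higher than its cutoff.

```python
def passes_cutoffs(row, cutoffs, operator="AND"):
    """Check the score cutoffs for one interaction.

    cutoffs maps a (hypothetical) column name to its threshold.
    AND: every score must exceed its cutoff; OR: at least one must.
    """
    if not cutoffs:
        return True  # no cutoffs set, nothing to filter on
    checks = [row.get(column, 0.0) > threshold
              for column, threshold in cutoffs.items()]
    if operator == "AND":
        return all(checks)
    if operator == "OR":
        return any(checks)
    raise ValueError("operator must be 'AND' or 'OR'")


def matches_genes(row, query_genes, operator="OR"):
    """AND: both partners are query genes; OR: at least one is."""
    hits = (row["gene_a"] in query_genes, row["gene_b"] in query_genes)
    if operator == "AND":
        return all(hits)
    if operator == "OR":
        return any(hits)
    raise ValueError("operator must be 'AND' or 'OR'")


def filter_interactions(rows, query_genes, gene_operator, cutoffs, cutoff_operator):
    """Keep the rows that satisfy both the gene and the cutoff criteria."""
    return [row for row in rows
            if matches_genes(row, query_genes, gene_operator)
            and passes_cutoffs(row, cutoffs, cutoff_operator)]
```

With the gene operator set to AND, a query for TP53 and MDM2 would keep only the TP53-MDM2 interaction; with OR, every interaction involving either gene is kept, subject to the cutoffs.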
Description of evidence columns
Gene fusion (STRING). Based on observed fusion of the two candidate interactors in one gene in a different species. (Derived from STRING score).
Genomic neighborhood (STRING). Scores genes that occur repeatedly in close neighborhood in genome sequences. (Derived from STRING score).
Phylogenetic profile (STRING). Based on presence or absence of linked proteins across species. (Derived from STRING score).
Coexpression (STRING). Based on co-expression in the same or in other species (transferred by homology). (Derived from STRING score).
Experimental (STRING). Based on a list of protein interaction datasets, gathered from other databases reporting experimentally tested interactions. (Derived from STRING score).
Database (STRING). Based on other curated databases. (Derived from STRING score).
Text mining (STRING). The text mining score is extracted from the abstracts of scientific literature. (Derived from STRING score).
STRING. This score is derived from the "combined score" reported in STRING. It
HiMAP. Gene-gene associations reported in the HiMAP database.
Bioverse. Gene-gene associations reported in the Bioverse database.
Text mining: cellular component. Gene association with a GO cellular component is predicted based on co-occurrences in the literature database PubMed. Gene-Gene scores are based on association with a common GO cellular component.
Text mining: molecular function. Gene association with a GO molecular function is predicted based on co-occurrences in the literature database PubMed. Gene-gene scores are based on association with a common GO molecular function.
Text mining: biological process. Gene association with a GO biological process is predicted based on co-occurrences in the literature database PubMed. Gene-gene scores are based on association with a common GO biological process.
Text mining: disease. Gene-gene association scores sharing a common disease annotation. Evidence of a gene associated with a disease is based on co-occurrences in the literature database PubMed.
KEGG binary. Binary interactions reported in KEGG.
KEGG complex. Protein complex association reported in KEGG.
KEGG functional. Pathway co-membership based on KEGG.
HPRD in vivo. Binary interactions that are validated via in vivo experiments (based on HPRD).
HPRD not in vivo. Binary interactions that are tested in vitro or via yeast two-hybrid experiments (based on HPRD).
IntAct. Direct interactions or physical associations reported in IntAct.
CORUM. Protein complex association reported in CORUM (core set complexes for Homo sapiens).
CRG high confidence. High confidence score selected from the paper of Bossi et al., 2009.
Domain pairs. Protein sequence-based prediction of matching domain pairs in the two proteins. Profile HMMs are constructed using structural information about protein interaction interfaces (see Henschel et al. 2007).
Network feature: clustering coefficient. Average clustering coefficient of the two genes. Clustering coefficients are computed based on the functional network's topology.
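A minimal sketch of this feature, assuming the standard unweighted local clustering coefficient (the fraction of a node's neighbor pairs that are themselves connected). Whether hPRINT uses a weighted variant is not stated here; the function names and the adjacency-set representation are illustrative assumptions.

```python
from itertools import combinations


def clustering_coefficient(adj, node):
    """Unweighted local clustering coefficient of one node.

    adj maps each node to the set of its neighbors (undirected graph).
    """
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no neighbor pairs
    # Count the pairs of neighbors that are connected to each other.
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj, u, v):
    """Average clustering coefficient of the two genes of an edge."""
    return (clustering_coefficient(adj, u) + clustering_coefficient(adj, v)) / 2.0
```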
Network feature: minimum spanning tree. Based on whether or not the respective edge is part of the minimum spanning tree in the functional network. Given a connected, undirected graph, a spanning tree of that graph is a subgraph which is a tree and connects all the vertices together. A minimum spanning tree is then a spanning tree with weight less than or equal to the weight of every other spanning tree.
Network feature: extended minimum spanning tree. A minimum spanning tree (MST) is extended such that if an edge is not in the MST and the weight of the shortest path in the MST between the vertices that the edge connects is larger than the weight of the edge itself, then the edge is added to the extended MST.
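The two spanning tree features can be sketched as follows. This is an illustration under stated assumptions, not the code used to build hPRINT: the MST is computed with Kruskal's algorithm (the algorithm actually used is not specified here), edge weights are treated as costs, and all function names are hypothetical.

```python
from collections import defaultdict


def kruskal_mst(nodes, edges):
    """Return the MST edges of a connected, undirected graph.

    edges is a list of (u, v, weight) tuples.
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # edge joins two components
            parent[root_u] = root_v
            mst.append((u, v, w))
    return mst


def tree_path_weight(adj, src, dst):
    """Total weight of the unique path between src and dst in a tree."""
    stack = [(src, None, 0.0)]
    while stack:
        node, prev, dist = stack.pop()
        if node == dst:
            return dist
        for nxt, w in adj[node]:
            if nxt != prev:
                stack.append((nxt, node, dist + w))
    raise ValueError("dst is not reachable from src")


def extended_mst(nodes, edges):
    """Return (mst_edges, extra_edges) following the rule described above."""
    mst = kruskal_mst(nodes, edges)
    adj = defaultdict(list)
    for u, v, w in mst:
        adj[u].append((v, w))
        adj[v].append((u, w))
    in_mst = set(mst)
    # Add a non-MST edge if the MST path between its ends is heavier.
    extra = [(u, v, w) for u, v, w in edges
             if (u, v, w) not in in_mst and tree_path_weight(adj, u, v) > w]
    return mst, extra
```

For a chain A-B-C-D with unit weights plus an edge A-D of weight 2, the MST path from A to D has weight 3, so A-D would be added to the extended MST.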
Network feature: neighborhood ratio. Ratio of the number of common neighbors of the two vertices connected by the edge to the total number of neighbors of those two vertices (in the functional network).
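One possible reading of this ratio is sketched below. It assumes that the "total number of neighbors" is the size of the union of the two neighbor sets and that the two vertices themselves are not counted; neither detail is stated in the description, so both are assumptions, as is the function name.

```python
def neighborhood_ratio(adj, u, v):
    """Shared neighbors of u and v divided by all of their neighbors.

    adj maps each node to the set of its neighbors. The endpoints
    u and v are excluded from both sets (an assumption).
    """
    neighbors_u = adj[u] - {u, v}
    neighbors_v = adj[v] - {u, v}
    all_neighbors = neighbors_u | neighbors_v
    if not all_neighbors:
        return 0.0  # the edge has no surrounding neighborhood
    return len(neighbors_u & neighbors_v) / len(all_neighbors)
```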
Network feature: shortest path to weight ratio. Ratio of the weight of the shortest path between two vertices to the weight of the edge connecting them.
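A minimal sketch of this ratio, using Dijkstra's algorithm for the shortest path. It assumes positive edge weights interpreted as costs and lets the shortest path use the direct edge itself, so the ratio is at most 1; the description does not say whether the direct edge is excluded. Function names are hypothetical.

```python
import heapq


def shortest_path_weight(adj, src, dst):
    """Dijkstra's algorithm; adj maps node -> {neighbor: weight}."""
    best = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == dst:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # outdated heap entry
        for nxt, weight in adj[node].items():
            candidate = dist + weight
            if candidate < best.get(nxt, float("inf")):
                best[nxt] = candidate
                heapq.heappush(heap, (candidate, nxt))
    return float("inf")


def path_to_weight_ratio(adj, u, v):
    """Weight of the shortest u-v path divided by the weight of edge (u, v)."""
    return shortest_path_weight(adj, u, v) / adj[u][v]
```

A ratio well below 1 indicates that an indirect route between the two genes is much shorter than the direct edge.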
Network feature: local betweenness. Ratio of shortest paths going through the edge to the number of possible pairs of vertices in the local neighborhood of the edge (based on functional network).
Network feature: betweenness. Ratio of shortest paths going through the edge to the number of possible pairs of vertices in the graph (based on functional network).
Random Forest physical. Random Forest score for physical interactions. Only non-experimental evidence was used to compute Random Forest scores.
Random Forest functional. Random Forest score for functional interactions. Only non-experimental evidence was used to compute Random Forest scores.
Bayesian physical. Bayesian score for physical interactions. The Bayesian score was computed using the Random Forest physical score as input evidence, combined with the experimental evidence.